
Science in China Series A: Mathematics Nov., 2009, Vol. 52, No. 11, 2362–2388 www.scichina.com math.scichina.com www.springer.com/scp www.springerlink.com

Corrected explicit-implicit domain decomposition algorithms for two-dimensional semilinear parabolic equations

LIAO HongLin$^{1,2\dagger}$, SHI HanSheng$^{2}$ & SUN ZhiZhong$^{1}$

1 Department of Mathematics, Southeast University, Nanjing 210096, China

2 Department of Applied Mathematics and Physics, Institute of Sciences, PLAUST, Nanjing 211101, China (email: [email protected], [email protected])

Abstract

Corrected explicit-implicit domain decomposition (CEIDD) algorithms are studied for parallel approximation of semilinear parabolic problems on distributed memory processors. It is natural to divide the spatial domain into some smaller parallel strips and cells using the simplest straightline interface (SI). By using the Leray-Schauder fixed-point theorem and the discrete energy method, it is shown that the resulting CEIDD-SI algorithm is uniquely solvable, unconditionally stable and convergent. The CEIDD-SI method always suffers from the globalization of data communication when interior boundaries cross into each other inside the domain. To overcome this disadvantage, a composite interface (CI) that consists of straight segments and zigzag fractions is suggested. The corresponding CEIDD-CI algorithm is proven to be solvable, stable and convergent. Numerical experiments are presented to support the theoretical results.

Keywords: semilinear parabolic equation, explicit-implicit domain decomposition method, Leray-Schauder fixed-point theorem, discrete energy method, convergence and stability

MSC(2000): 65M06, 65M12, 65M55, 68Y05

Received September 3, 2007; accepted November 21, 2008; published online September 2, 2009. DOI: 10.1007/s11425-009-0040-8. † Corresponding author. This work was supported by National Natural Science Foundation of China (Grant No. 10871044).

Citation: Liao H L, Shi H S, Sun Z Z. Corrected explicit-implicit domain decomposition algorithms for two-dimensional semilinear parabolic equations. Sci China Ser A, 2009, 52(11): 2362–2388, DOI: 10.1007/s11425-009-0040-8

1 Introduction

Domain decomposition is a powerful technique for devising parallel PDE algorithms. There is a rich literature on domain decomposition methods [1, 2] for solving both time-independent and time-dependent problems. For parabolic problems, explicit difference schemes are naturally parallel and easy to implement on parallel machines; however, they always require small time steps because of stability constraints. Implicit schemes usually do not have stability constraints and are preferable for numerical solution of nonlinear evolution equations. Nonetheless, they are not inherently parallel and cannot be applied directly on a parallel system because at each time level essentially an elliptic type of problem needs to be solved. A natural approach to parallelizing implicit schemes is to take advantage of explicit-implicit hybrid schemes since the problem considered here is time-dependent. The intrinsic parallel difference schemes, so named by Zhou et al. [3–6], are a class of explicit-implicit alternating difference schemes. In their works, some parameterized alternating difference schemes with intrinsic parallelism were constructed. There is a certain arbitrariness in choosing the parameters in these general schemes, so that concrete intrinsic parallel difference schemes, such as the alternating group explicit scheme [7], the alternating block explicit-implicit scheme [8], and the block ADI scheme [9], can be obtained. These general schemes allow possibly efficient computation in a parallel computing system because those schemes are unconditionally stable and can use large time steps. It was pointed out in [10, 11] that these alternating schemes cannot be implemented directly by making use of the existing sequential codes.

An alternative approach to parallelizing the implicit schemes is to decompose the computational domain into some smaller non-overlapping subregions, and apply the implicit scheme on different subregions concurrently using the interface information obtained from previous time steps. Kuznetsov [12] proposed an explicit-implicit hybrid scheme, where the backward Euler scheme is used inside each subdomain while the forward Euler scheme is applied to obtain the interface values on the new time level. Once the interface values are available, the whole problem is decoupled completely and parallelization is achieved. We note that the additional information (numerical interface condition) on the boundaries between subdomains is usually not part of the original mathematical model and the physical problem, so different ways to generate the numerical boundary condition lead to various explicit-implicit domain decomposition (EIDD) methods, see [12–15] for related discussions. These EIDD methods are globally non-iterative, algorithmically simple, and efficient in both computation and communication at each time level; however, they always suffer from temporal step-size restrictions.
Recently, a new technique called implicit correction in [10, 11, 16–18] or stabilization in [19] is adopted to improve the parallel efficiency of EIDD methods by easing the time step-size restriction. The idea is to replace the interface predictor values with the new solutions computed by some implicit correction scheme, once the subdomain solutions are available at each time level. By adding the correction step to EIDD methods, the CEIDD (also called SEIDD in [19] or EPIC in [20]) algorithms always exhibit much better numerical stability. It was noted in [19] that the added correction step is a communication-cost-free process; however, the improvement on stability with negligible cost of data communication is always at the price of the flexibility of domain partitioning and consequently the scalability of parallel solver.

Figure 1 The noncrossover straight interfaces divide Ω into p × 1 subdomains Ω1, Ω2, ..., Ωp.

Figure 2 The crossover straight interfaces divide Ω into p1 × p2 subdomains Ω11, Ω21, ..., Ωp1p2.

To derive a scalable parallel solver using domain decomposition, it is natural to divide the original domain into smaller strips (slabs) and cells by straight-line interfaces of noncrossover and crossover types, as shown in Figures 1 and 2 respectively. It was noted in [18] that, compared with partitioning by noncrossover boundaries for parallel simulations on given processors, the decomposition by crossover interfaces reduces the cost of data communication without degrading the accuracy of the parallel solution, provided it does not suffer from the globalization of data transfer. However, owing to the additional correction step using the implicit Euler scheme, the globalization occurs when straight interior boundaries cross each other inside the spatial domain. To circumvent this problem, Yuan et al. [10, 11, 17] made use of the interface predictors (linear extrapolation of interface solutions at the previous two time levels) and applied some partially implicit schemes, rather than the fully implicit scheme, to update the interface solutions around the cross-points. The resulting method is proven to be unconditionally stable and second-order convergent. In the previous work [18], a different remedy was suggested in which the straight-line interior boundaries were tailored into special zigzag-line interfaces (ZI) without modifying the interface corrector scheme. However, the configuration of the zigzag-line interface is somewhat complex, so that the resulting CEIDD-ZI algorithm is not easy to incorporate into existing industrial matrix solvers and the ease of parallel implementation is affected.

We continue the development of CEIDD algorithms for parabolic equations. The corrected version of Kuznetsov's EIDD method is also considered. That is to say, at each time level, one gets predictive values using the forward Euler scheme on the artificial interior boundary, and then obtains subdomain solutions using the backward Euler scheme. Once the subdomain solutions are available, the interface solutions are recomputed by the fully implicit scheme. In this article, a semilinear parabolic equation is considered.
The simplest straight-line interface (SI) is used for domain decomposition. The resulting CEIDD-SI algorithm (CEIDD method based on straight-line interface) is proven to be uniquely solvable and unconditionally stable, see Theorems 2.1 and 2.2. In particular, an improved error estimate is achieved, see Theorem 2.3. To avoid the globalization of data communication when interior boundaries cross each other inside the domain, a special composite-line interface (CI) that consists of straight segments and zigzag fractions is suggested. It is shown that the corresponding CEIDD-CI method is a scalable parallel solver in the sense that it has a higher parallel efficiency compared with the existing CEIDD approaches, see Proposition 3.1 and Theorems 3.1–3.3.

We consider the initial-boundary value problem of the semilinear parabolic equation

$$u_t = \Delta u + f(u_x, u_y, u, x, y, t), \quad (x,y,t)\in\Omega\times(0,T], \tag{1.1}$$

$$u(x,y,t) = u_b(x,y,t), \quad (x,y,t)\in\partial\Omega\times(0,T], \tag{1.2}$$

$$u(x,y,0) = u^0(x,y), \quad (x,y)\in\bar\Omega, \tag{1.3}$$

where the domain $\Omega=(0,1)^2$, $\partial\Omega$ is the boundary and $\bar\Omega=\Omega\cup\partial\Omega$. We suppose the following assumptions are fulfilled.

(A1) The nonlinear term $f=f(w,v,u,x,y,t)$ is continuous with respect to $(w,v,u,x,y,t)\in\mathbb{R}^3\times\Omega\times[0,T]$ and globally Lipschitz continuous with respect to $(w,v,u)\in\mathbb{R}^3$. There exists a positive constant $c_L$ such that $|f(w,v,u,x,y,t)|\le c_L(|w|+|v|+|u|)+|\bar f(x,y,t)|$, where $\bar f(x,y,t)\equiv f(0,0,0,x,y,t)$.

(A2) The initial and boundary data are smooth and satisfy $u^0(x,y)=u_b(x,y,0)$ for $(x,y)\in\partial\Omega$.

(A3) The analytic problem (1.1)–(1.3) admits a unique


smooth solution $u(x,y,t)\in C^{(4,2)}(\Omega\times[0,T])$.

For positive integer $K$, let $\tau=T/K$, $t_k=k\tau$, $0\le k\le K$. The time domain $[0,T]$ is covered by $\Omega_\tau=\{t_k\,|\,0\le k\le K\}$. For any mesh function $w_\tau=\{w^k\,|\,0\le k\le K\}$ on $\Omega_\tau$, denote $\partial_t w^k=(w^k-w^{k-1})/\tau$. The uniform partition of $\Omega$ is considered although different spacings for different spatial variables can be examined in a similar way. Let $h=1/N$ for positive integer $N$, $x_i=ih$ and $y_j=jh$, $0\le i,j\le N$. The grid $\Omega_h=\{(x_i,y_j)\,|\,1\le i,j\le N-1\}$, the boundary $\partial\Omega_h=\{(x_i,y_j)\,|\,i=0 \text{ or } i=N \text{ or } j=0 \text{ or } j=N\}$, and the closure $\bar\Omega_h=\Omega_h\cup\partial\Omega_h$. Given grid function $v_h=\{v_{ij}\,|\,(x_i,y_j)\in\bar\Omega_h\}$ on the grid $\bar\Omega_h$, let

$$\delta_x v_{i-\frac12,j}=\frac{v_{ij}-v_{i-1,j}}{h},\qquad D_x v_{ij}=\frac{v_{i+1,j}-v_{i-1,j}}{2h},$$

$$\Delta_x v_{ij}=\frac{v_{i-1,j}-2v_{ij}+v_{i+1,j}}{h^2},\qquad E_x v_{ij}=\frac{v_{i-1,j}+v_{ij}+v_{i+1,j}}{3}.$$

Notations $\delta_y v_{i,j-\frac12}$, $D_y v_{ij}$, $\Delta_y v_{ij}$ and $E_y v_{ij}$ are defined similarly. Furthermore,

$$\Delta_h v_{ij}=\Delta_x v_{ij}+\Delta_y v_{ij},\qquad E_{xy}v_{ij}=\frac{E_x v_{ij}+E_y v_{ij}}{2}.$$

We also denote

$$\|v\|=\sqrt{h^2\sum_{i=1}^{N-1}\sum_{j=1}^{N-1}|v_{ij}|^2},\qquad \|\delta_x v\|=\sqrt{h^2\sum_{i=1}^{N}\sum_{j=1}^{N-1}|\delta_x v_{i-\frac12,j}|^2},$$

$$\|\delta_y v\|=\sqrt{h^2\sum_{i=1}^{N-1}\sum_{j=1}^{N}|\delta_y v_{i,j-\frac12}|^2},\qquad |v|_1=\sqrt{\|\delta_x v\|^2+\|\delta_y v\|^2}.$$
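The difference operators and norms introduced above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names (`make_grid`, `D_x`, `lap_h`, `E_xy`, `l2_norm`, `h1_seminorm`) and the nested-list storage `v[i][j]` are assumptions made here and do not come from the paper.

```python
import math

def make_grid(N, func):
    """Sample func on the closed grid; v[i][j] holds func(x_i, y_j) with h = 1/N."""
    h = 1.0 / N
    return [[func(i * h, j * h) for j in range(N + 1)] for i in range(N + 1)]

def D_x(v, i, j, h):
    """Centered first difference in x."""
    return (v[i + 1][j] - v[i - 1][j]) / (2 * h)

def D_y(v, i, j, h):
    """Centered first difference in y."""
    return (v[i][j + 1] - v[i][j - 1]) / (2 * h)

def lap_h(v, i, j, h):
    """Five-point discrete Laplacian, Delta_x + Delta_y."""
    return (v[i - 1][j] + v[i + 1][j] + v[i][j - 1] + v[i][j + 1]
            - 4 * v[i][j]) / h**2

def E_xy(v, i, j):
    """Average of the three-point means in x and in y."""
    e_x = (v[i - 1][j] + v[i][j] + v[i + 1][j]) / 3
    e_y = (v[i][j - 1] + v[i][j] + v[i][j + 1]) / 3
    return (e_x + e_y) / 2

def l2_norm(v, h):
    """Discrete L2 norm over the interior nodes."""
    N = len(v) - 1
    return math.sqrt(h**2 * sum(v[i][j]**2
                                for i in range(1, N) for j in range(1, N)))

def h1_seminorm(v, h):
    """Discrete H1 seminorm built from the one-sided differences delta_x, delta_y."""
    N = len(v) - 1
    sx = sum(((v[i][j] - v[i - 1][j]) / h)**2
             for i in range(1, N + 1) for j in range(1, N))
    sy = sum(((v[i][j] - v[i][j - 1]) / h)**2
             for i in range(1, N) for j in range(1, N + 1))
    return math.sqrt(h**2 * (sx + sy))
```

For a quadratic such as x² + y² the five-point Laplacian returns 4 up to rounding, which gives a quick sanity check of the indexing.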

Throughout this article $c$ denotes a generic positive constant, not necessarily the same at different occurrences, which may depend on the solution and the given data but is independent of the time-step size $\tau$ and the grid spacing $h$. In the theoretical analysis of the parallel methods, the $\varepsilon$-inequality

$$2ab\le\varepsilon a^2+\frac{1}{\varepsilon}b^2,\qquad\forall\,a,b\in\mathbb{R},\ \varepsilon>0,$$

will be applied frequently with different choices of $\varepsilon$ at different occurrences. Also, the following lemmas are needed in our analysis.

Lemma 1.1. For any mesh function $w_\tau=\{w^k\,|\,0\le k\le K\}$, it holds that

$$2w^k\,\partial_t w^k=\partial_t[(w^k)^2]+\tau(\partial_t w^k)^2,\qquad 2w^{k-1}\,\partial_t w^k=\partial_t[(w^k)^2]-\tau(\partial_t w^k)^2.$$

Lemma 1.2 [21]. For any grid function $v_h$ that vanishes on $\partial\Omega_h$, $\|v\|\le\frac{1}{2\sqrt3}|v|_1$.

Lemma 1.3. Assume that grid functions $\{W_{ij},V_{ij}\,|\,(x_i,y_j)\in\bar\Omega_h\}$ satisfy $W_{i0}=W_{iN}=0$ and $V_{i0}=V_{iN}=0$. For some fixed integer $r$ $(0<r<N)$, $W_{ir}=0$. Then

$$-h^2\sum_{\substack{j=1\\ j\ne r}}^{N-1}W_{ij}(\Delta_y V_{ij})+(W_{i,r-1}+W_{i,r+1})V_{ir}=-h^2\sum_{\substack{j=1\\ j\ne r}}^{N-1}V_{ij}(\Delta_y W_{ij}).$$
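The identities in Lemmas 1.1 and 1.3 are purely algebraic, so they can be spot-checked numerically. The sketch below is a hedged illustration only: the helper names `lemma_1_1_residuals` and `lemma_1_3_sides` are hypothetical, and the fixed index i of Lemma 1.3 is suppressed so that `W` and `V` are plain lists indexed by j.

```python
def lemma_1_1_residuals(w_prev, w_curr, tau):
    """Return (residual of first identity, residual of second identity) in Lemma 1.1."""
    dt_w = (w_curr - w_prev) / tau            # backward difference quotient of w
    dt_w2 = (w_curr**2 - w_prev**2) / tau     # backward difference quotient of w^2
    res1 = 2 * w_curr * dt_w - (dt_w2 + tau * dt_w**2)
    res2 = 2 * w_prev * dt_w - (dt_w2 - tau * dt_w**2)
    return res1, res2

def lemma_1_3_sides(W, V, r, h):
    """Return (left side, right side) of the summation identity in Lemma 1.3.

    W and V are lists of length N + 1 with W[0] = W[N] = V[0] = V[N] = 0
    and W[r] = 0.
    """
    N = len(W) - 1

    def d2(v, j):
        # second difference quotient in the y direction
        return (v[j - 1] - 2 * v[j] + v[j + 1]) / h**2

    idx = [j for j in range(1, N) if j != r]
    lhs = -h**2 * sum(W[j] * d2(V, j) for j in idx) + (W[r - 1] + W[r + 1]) * V[r]
    rhs = -h**2 * sum(V[j] * d2(W, j) for j in idx)
    return lhs, rhs
```

Both residuals of Lemma 1.1 vanish up to rounding for any pair of values, and the two sides of Lemma 1.3 agree whenever the stated zero conditions on `W` and `V` hold.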

This article is organized as follows. The next section presents the CEIDD-SI algorithm and a rigorous study of its solvability, convergence and stability. Section 3 gives the construction of the composite-line interface and the theoretical analysis of the CEIDD-CI approach. Section 4 is devoted to examining numerically the stability and accuracy of the parallel algorithms for solving two nonlinear problems. Finally, some comments are presented in the concluding section.

2 The CEIDD method based on straight-line interface

2.1 Presentation of CEIDD-SI algorithm

Now we describe the CEIDD procedure for parallel approximation of the semilinear problem (1.1)–(1.3). Assume that $\Omega_h$ is divided into two subregions, $\Omega_{11}$ and $\Omega_{21}$, by a straight-line boundary $\gamma_x=\{(x_s,y_j)\,|\,1\le j\le N-1\}$ for some integer $s$ $(1<s<N-1)$, see Figure 3. The general version of the CEIDD algorithm on $p\times1$ subdomains $(p\ge2)$ can be considered in a similar way (the reader can refer to [18]).

Figure 3 The straight interface γx divides Ωh into 2×1 subdomains.

Figure 4 The straight interface γ = γx ∪ γy divides Ωh into 2×2 subdomains.

For simplicity, we use the notation

$$f^k_{ij}(u)=f(D_x u^k_{ij},\,D_y u^k_{ij},\,E_{xy}u^k_{ij},\,x_i,\,y_j,\,t_k),\qquad (x_i,y_j)\in\Omega_h,\ 0\le k\le K.$$

The initial condition reads

$$u^0_{ij}=u^0(x_i,y_j)\quad\text{on domain }\bar\Omega_h. \tag{2.1}$$

Given the solution $\{u^{k-1}_{ij}\,|\,(x_i,y_j)\in\Omega_h\}$ at time level $t_{k-1}$, the CEIDD algorithm computes $\{u^k_{ij}\,|\,(x_i,y_j)\in\bar\Omega_h\}$ by a three-stage process. First, it obtains the interface predictor $\tilde u^k_{sj}$ by the explicit Euler scheme,

$$\frac{\tilde u^k_{ij}-u^{k-1}_{ij}}{\tau}=\Delta_h u^{k-1}_{ij}+f^{k-1}_{ij}(u)\quad\text{on interior boundary }\gamma_x. \tag{2.2}$$

Secondly, two smaller subregion problems are solved in parallel by some stable scheme, with $\tilde u^k_{sj}$ serving as interior boundary data of Dirichlet type. Here, we employ the implicit Euler scheme,

$$\begin{cases}\partial_t u^k_{ij}=\Delta_h u^k_{ij}+f^k_{ij}(u), & \text{on subdomains }\Omega_{11}\text{ and }\Omega_{21},\\ u^k_{ij}=\tilde u^k_{ij}, & \text{on interior boundary }\gamma_x,\\ u^k_{ij}=u_b(x_i,y_j,t_k), & \text{on exterior boundary }\partial\Omega_h.\end{cases} \tag{2.3}$$

An appropriate elliptic solver will be applied to obtain subdomain solutions. In the third stage, the predictor value $\tilde u^k_{sj}$ is thrown away and the interface solution $u^k_{sj}$ is recomputed by the implicit Euler scheme

$$\partial_t u^k_{ij}=\Delta_h u^k_{ij}+f^k_{ij}(u)\quad\text{on interior boundary }\gamma_x. \tag{2.4}$$
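The three stages (2.2)–(2.4) are easiest to see in a reduced setting. The following is a minimal sketch, assuming a one-dimensional analogue that is not treated in the paper: the model problem u_t = u_xx − u on (0, 1) with zero boundary values, a single interface node x_s, and the reaction term −u handled implicitly. The function names `thomas`, `ceidd_step` and `run` are hypothetical. In one dimension the interface is a single node, so the corrector stage reduces to one scalar equation rather than an elliptic solve along a grid line.

```python
import math

def thomas(a, b, c, d):
    """Solve a tridiagonal system (a: sub-, b: main, c: super-diagonal, d: rhs)."""
    n = len(d)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def ceidd_step(u, s, tau, h):
    """One corrected explicit-implicit step for u_t = u_xx - u, u = 0 at both ends."""
    N = len(u) - 1
    mu = tau / h**2

    # Stage 1: explicit (forward Euler) predictor at the interface node s.
    u_pred = u[s] + tau * ((u[s - 1] - 2 * u[s] + u[s + 1]) / h**2 - u[s])

    # Stage 2: backward Euler on each subdomain with Dirichlet data u_pred.
    # The two solves are independent and could run on separate processors.
    def solve(lo, hi, left, right):
        n = hi - lo + 1
        rhs = [u[i] for i in range(lo, hi + 1)]
        rhs[0] += mu * left
        rhs[-1] += mu * right
        return thomas([-mu] * n, [1 + tau + 2 * mu] * n, [-mu] * n, rhs)

    new = [0.0] * (N + 1)
    new[1:s] = solve(1, s - 1, 0.0, u_pred)
    new[s + 1:N] = solve(s + 1, N - 1, u_pred, 0.0)

    # Stage 3: discard the predictor and recompute the interface value implicitly.
    new[s] = (u[s] + mu * (new[s - 1] + new[s + 1])) / (1 + tau + 2 * mu)
    return new

def run(N=20, tau=0.0025, steps=40):
    """Return the maximum nodal error against exp(-(pi^2 + 1) t) sin(pi x)."""
    h = 1.0 / N
    u = [math.sin(math.pi * i * h) for i in range(N + 1)]
    u[0] = u[N] = 0.0
    for _ in range(steps):
        u = ceidd_step(u, N // 2, tau, h)
    t_end = tau * steps
    decay = math.exp(-(math.pi**2 + 1) * t_end)
    return max(abs(u[i] - decay * math.sin(math.pi * i * h)) for i in range(N + 1))
```

With the default parameters the ratio τ/h² equals 1, which is above the usual forward Euler limit of 1/2 for the heat equation; the corrector stage is what is expected to keep the computation well behaved in this regime.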

Due to the usage of the straight-line interface, this approach is called the CEIDD-SI method. This method is easy to incorporate into existing industrial solvers, although the computation of interface solutions along the grid line γx would need an elliptic solver. From the domain decomposition point of view, this approach is elementary in the two-dimensional setting. The general case of p × 1 subdomains and extensions to three space dimensions are straightforward. Similarly, we construct another inner interface γy = {(xi, yr) | 1 ≤ i ≤ N − 1} for some integer r (1 < r < N − 1). However, a difficulty arises when the straight interfaces γx and γy cross each other at a cross-point (xs, yr) inside the computational domain, see Figure 4. Interfaces of crossover type globalize the data transfer needed for computing the interface solutions and consequently increase the parallel run time on distributed memory machines. Specifically, when the implicit Euler scheme based on the five-point star stencil {(i − 1, j), (i, j − 1), (i, j), (i, j + 1), (i + 1, j)} is applied on the straight interfaces γ = γx ∪ γy, the interface solutions must be computed simultaneously by one elliptic solver on one processor (see the "◦" points in Figure 4, also see Figure 2). That is to say, once the subdomain solutions are available, the data at the points near the inner interfaces must be transmitted from all other machines to this one. Thus, the CEIDD-SI algorithm on straight interfaces of crossover type introduces the globalization of data communication. The next section is devoted to resolving this trouble by tailoring the configuration of the straight-line interfaces. Before doing so, we address the theoretical considerations for the CEIDD-SI approach (2.1)–(2.4). The stability, accuracy and scalability of the CEIDD-SI method for solving linear and nonlinear problems were well investigated numerically, see e.g. [16, 19, 20]; however, to our knowledge, the theoretical analysis of the method has not been published.

2.2 Solvability, stability and convergence of CEIDD-SI algorithm

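The theoretical analysis of the CEIDD-SI method is based on the following lemma.

Lemma 2.1. Let $0<\lambda\le1$, and let the grid function $\{w^k_{ij}\,|\,(x_i,y_j)\in\bar\Omega_h\}$ be the solution of the following $\lambda$-parameterized nonlinear difference equations:

$$\tilde w^k_{sj}=w^{k-1}_{sj}+\lambda\tau\Delta_h w^{k-1}_{sj}+\lambda\tau G^{k-1}_{sj}(w)+\lambda\tau\tilde g^{k-1}_{sj},\quad 1\le j\le N-1, \tag{2.5}$$

$$w^k_{ij}=0,\quad (x_i,y_j)\in\partial\Omega_h, \tag{2.6}$$

$$\partial_t w^k_{ij}=\lambda\Delta_h w^k_{ij}+\lambda G^k_{ij}(w)+\lambda g^k_{ij},\quad 1\le i\le s-2,\ 1\le j\le N-1, \tag{2.7}$$

$$\partial_t w^k_{s-1,j}=\lambda\Delta_h w^k_{s-1,j}+\lambda\widetilde G^k_{s-1,j}(w)+\lambda g^k_{s-1,j}+\lambda(\tilde w^k_{sj}-w^k_{sj})/h^2,\quad 1\le j\le N-1, \tag{2.8}$$

$$\partial_t w^k_{sj}=\lambda\Delta_h w^k_{sj}+\lambda G^k_{sj}(w)+\lambda g^k_{sj},\quad 1\le j\le N-1, \tag{2.9}$$

$$\partial_t w^k_{s+1,j}=\lambda\Delta_h w^k_{s+1,j}+\lambda\widetilde G^k_{s+1,j}(w)+\lambda g^k_{s+1,j}+\lambda(\tilde w^k_{sj}-w^k_{sj})/h^2,\quad 1\le j\le N-1, \tag{2.10}$$

$$\partial_t w^k_{ij}=\lambda\Delta_h w^k_{ij}+\lambda G^k_{ij}(w)+\lambda g^k_{ij},\quad s+2\le i\le N-1,\ 1\le j\le N-1, \tag{2.11}$$

where the nonlinear functions $G^k_{ij}(w)$, $\widetilde G^k_{ij}(w)$ satisfy the following inequalities:

$$|G^k_{ij}(w)|\le c_L(|D_x w^k_{ij}|+|D_y w^k_{ij}|+|E_{xy}w^k_{ij}|),\quad i\ne s-1,s+1,\ 1\le j\le N-1,$$

$$|\widetilde G^k_{ij}(w)|\le c_L(|D_x w^k_{ij}|+|D_y w^k_{ij}|+|E_{xy}w^k_{ij}|)+\frac{c_L}{2h}(1+h/3)\,|\tilde w^k_{sj}-w^k_{sj}|,\quad i=s-1,s+1,\ 1\le j\le N-1.$$

Furthermore, denote

$$E^k_\lambda=|w^k|_1^2+2\lambda^2\tau^2\sum_{j=1}^{N-1}(\chi^k_{sj}(w))^2+\lambda^2\tau^2h^2\sum_{j=1}^{N}(\delta_y\chi^k_{s,j-\frac12}(w))^2,$$

where

$$\chi^k_{sj}(w)\equiv\begin{cases}\Delta_h w^k_{sj}+G^k_{sj}(w), & 1\le j\le N-1,\\ 0, & j=0,N.\end{cases}$$

Then, there exists a positive constant $\tau_0$ such that, when $\tau\le\tau_0$,

$$E^k_\lambda\le(1+4c\tau)E^{k-1}_\lambda+4\tau(\|g^k\|^2+2\lambda\|g^k\|_\gamma^2),$$

where the constant $c$ is dependent on $c_L$, and

$$\|g^k\|_\gamma=\sqrt{h^2\sum_{j=1}^{N-1}(g^k_{sj}-\tilde g^{k-1}_{sj})^2+3\tau\sum_{j=1}^{N-1}(g^k_{sj}-\tilde g^{k-1}_{sj})^2+10\tau^2h^{-2}\sum_{j=1}^{N-1}(g^k_{sj}-\tilde g^{k-1}_{sj})^2}.$$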

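Proof. For clarity, denote $G^k_{ij}(w)$, $\widetilde G^k_{ij}(w)$ and $\chi^k_{sj}(w)$ by $G^k_{ij}$, $\widetilde G^k_{ij}$ and $\chi^k_{sj}$. From the difference equations (2.5) and (2.9), one gets

$$\tilde w^k_{sj}-w^k_{sj}=-\lambda\tau^2\partial_t\chi^k_{sj}-\lambda\tau\hat g^k_{sj},\quad 1\le j\le N-1.$$

Complementarily, we define $\tilde g^{k-1}_{s0}=\tilde g^{k-1}_{sN}=0$, $g^k_{s0}=g^k_{sN}=0$ such that

$$\hat g^k_{sj}\equiv g^k_{sj}-\tilde g^{k-1}_{sj}=0,\quad j=0,N.$$

The definition of $\chi^k_{sj}$ implies that the discrepancy $(\tilde w^k_{sj}-w^k_{sj})$ vanishes for $j=0,N$ and then

$$\tilde w^k_{sj}-w^k_{sj}=-\lambda\tau^2\partial_t\chi^k_{sj}-\lambda\tau\hat g^k_{sj},\quad 0\le j\le N. \tag{2.12}$$

Furthermore, one has

$$\partial_t w^k_{sj}=\lambda\chi^k_{sj}+\lambda g^k_{sj},\quad 1\le j\le N-1. \tag{2.13}$$

Under the assumptions on $G^k_{ij}$ and $\widetilde G^k_{ij}$, we use the $\varepsilon$-inequality, Lemma 1.2 and the equality (2.12) to get

$$\begin{aligned}
A\equiv{}&2\lambda h^2\sum_{j=1}^{N-1}\Big[\sum_{i=1}^{s-2}(\partial_t w^k_{ij})G^k_{ij}+(\partial_t w^k_{s-1,j})\widetilde G^k_{s-1,j}+(\partial_t w^k_{sj})G^k_{sj}+(\partial_t w^k_{s+1,j})\widetilde G^k_{s+1,j}+\sum_{i=s+2}^{N-1}(\partial_t w^k_{ij})G^k_{ij}\Big]\\
&+2\lambda h^2\sum_{i=1}^{N-1}\sum_{j=1}^{N-1}(\partial_t w^k_{ij})g^k_{ij}\\
\le{}&\frac{\lambda}{2}\|\partial_t w^k\|^2+2\lambda\|g^k\|^2+\frac{\lambda}{4}\|\partial_t w^k\|^2+c\lambda|w^k|_1^2+c\lambda h(1+h)\sum_{j=1}^{N-1}|\tilde w^k_{sj}-w^k_{sj}|\,(|\partial_t w^k_{s-1,j}|+|\partial_t w^k_{s+1,j}|)\\
\le{}&\frac{3\lambda}{4}\|\partial_t w^k\|^2+2\lambda\|g^k\|^2+c\lambda|w^k|_1^2+c\lambda^2\tau^2h(1+h)\sum_{j=1}^{N-1}|\partial_t\chi^k_{sj}|\,(|\partial_t w^k_{s-1,j}|+|\partial_t w^k_{s+1,j}|)\\
&+c\lambda^2\tau h(1+h)\sum_{j=1}^{N-1}|\hat g^k_{sj}|\,(|\partial_t w^k_{s-1,j}|+|\partial_t w^k_{s+1,j}|)\\
\le{}&\frac{3\lambda}{4}\|\partial_t w^k\|^2+2\lambda\|g^k\|^2+c\lambda|w^k|_1^2+c\lambda\tau\|\partial_t w^k\|^2+4c\lambda\tau^2|\partial_t w^k|_1^2\\
&+\lambda^3\tau^2\sum_{j=1}^{N-1}(\tau+h^2)(\partial_t\chi^k_{sj})^2+2\lambda^3\sum_{j=1}^{N-1}(\tau+h^2)(\hat g^k_{sj})^2.
\end{aligned}$$

Suppose that the time-step size is sufficiently small, say $\tau\le1/(4c)$. It follows that

$$A\le\lambda\|\partial_t w^k\|^2+\lambda\tau|\partial_t w^k|_1^2+c\lambda|w^k|_1^2+2\lambda\|g^k\|^2+\lambda^3\tau^2\sum_{j=1}^{N-1}(\tau+h^2)(\partial_t\chi^k_{sj})^2+2\lambda^3\sum_{j=1}^{N-1}(\tau+h^2)(\hat g^k_{sj})^2. \tag{2.14}$$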

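We also derive that

$$\begin{aligned}
B\equiv{}&2\lambda\sum_{j=1}^{N-1}(\tilde w^k_{sj}-w^k_{sj})(\partial_t w^k_{s-1,j}+\partial_t w^k_{s+1,j})\\
={}&2\lambda\sum_{j=1}^{N-1}(\tilde w^k_{sj}-w^k_{sj})(h^2\partial_t\Delta_h w^k_{sj}+2\partial_t w^k_{sj}-h^2\partial_t\Delta_y w^k_{sj})\\
={}&2\lambda\sum_{j=1}^{N-1}(\tilde w^k_{sj}-w^k_{sj})(h^2\partial_t\Delta_h w^k_{sj}+2\partial_t w^k_{sj})-2\lambda h^2\sum_{j=1}^{N-1}(\tilde w^k_{sj}-w^k_{sj})(\partial_t\Delta_y w^k_{sj})\\
\equiv{}&B_1+B_2.
\end{aligned} \tag{2.15}$$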

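From the definition of $\chi^k_{sj}$, we have

$$\begin{aligned}
B_1&=-2\lambda^2\sum_{j=1}^{N-1}(\tau^2\partial_t\chi^k_{sj}+\tau\hat g^k_{sj})(h^2\partial_t\Delta_h w^k_{sj}+2\partial_t w^k_{sj})\\
&=-2\lambda^2\sum_{j=1}^{N-1}(\tau^2\partial_t\chi^k_{sj}+\tau\hat g^k_{sj})(h^2\partial_t\chi^k_{sj}+2\partial_t w^k_{sj}-h^2\partial_t G^k_{sj}).
\end{aligned}$$

The upper bound of $B_1$ is evaluated term by term. The first term is

$$B_{11}=-2\lambda^2\tau^2h^2\sum_{j=1}^{N-1}(\partial_t\chi^k_{sj})^2.$$

Applying equality (2.13), Lemma 1.1 and the $\varepsilon$-inequality, one has

$$\begin{aligned}
B_{12}&=-4\lambda^2\tau^2\sum_{j=1}^{N-1}(\partial_t\chi^k_{sj})(\partial_t w^k_{sj})=-4\lambda^3\tau^2\sum_{j=1}^{N-1}(\partial_t\chi^k_{sj})(\chi^k_{sj}+g^k_{sj})\\
&=-2\lambda^3\tau^2\sum_{j=1}^{N-1}\partial_t(\chi^k_{sj})^2-2\lambda^3\tau^3\sum_{j=1}^{N-1}(\partial_t\chi^k_{sj})^2-4\lambda^3\tau^2\sum_{j=1}^{N-1}(\partial_t\chi^k_{sj})g^k_{sj}\\
&\le-2\lambda^3\tau^2\sum_{j=1}^{N-1}\partial_t(\chi^k_{sj})^2-\lambda^3\tau^3\sum_{j=1}^{N-1}(\partial_t\chi^k_{sj})^2+4\lambda^3\tau\sum_{j=1}^{N-1}(g^k_{sj})^2.
\end{aligned}$$

With the help of the $\varepsilon$-inequality and Lemma 1.2, we can get

$$\begin{aligned}
B_{13}&=2\lambda^2\tau^2h^2\sum_{j=1}^{N-1}(\partial_t\chi^k_{sj})(\partial_t G^k_{sj})=2\lambda^2\tau h^2\sum_{j=1}^{N-1}(\partial_t\chi^k_{sj})(G^k_{sj}-G^{k-1}_{sj})\\
&\le\lambda^3\tau^2h^2\sum_{j=1}^{N-1}(\partial_t\chi^k_{sj})^2+\lambda h^2\sum_{j=1}^{N-1}(G^k_{sj}-G^{k-1}_{sj})^2\\
&\le\lambda^3\tau^2h^2\sum_{j=1}^{N-1}(\partial_t\chi^k_{sj})^2+c\lambda(|w^k|_1^2+|w^{k-1}|_1^2).
\end{aligned}$$

Similarly,

$$B_{14}=-2\lambda^2\tau h^2\sum_{j=1}^{N-1}\hat g^k_{sj}(\partial_t\chi^k_{sj})\le\lambda^2\tau^2h^2\sum_{j=1}^{N-1}(\partial_t\chi^k_{sj})^2+\lambda^2h^2\sum_{j=1}^{N-1}(\hat g^k_{sj})^2,$$

$$B_{15}=-4\lambda^2\tau\sum_{j=1}^{N-1}\hat g^k_{sj}(\partial_t w^k_{sj})\le\frac{\lambda h^2}{2}\sum_{j=1}^{N-1}(\partial_t w^k_{sj})^2+8\lambda^3\tau^2h^{-2}\sum_{j=1}^{N-1}(\hat g^k_{sj})^2,$$

$$B_{16}=2\lambda^2\tau h^2\sum_{j=1}^{N-1}\hat g^k_{sj}(\partial_t G^k_{sj})\le c\lambda(|w^k|_1^2+|w^{k-1}|_1^2)+\lambda^3h^2\sum_{j=1}^{N-1}(\hat g^k_{sj})^2.$$

Therefore, collecting terms, we get

$$\begin{aligned}
B_1=\sum_{\ell=1}^{6}B_{1\ell}\le{}&\frac{\lambda}{2}\|\partial_t w^k\|^2-2\lambda^3\tau^2\sum_{j=1}^{N-1}\partial_t(\chi^k_{sj})^2+c\lambda(|w^k|_1^2+|w^{k-1}|_1^2)\\
&-(2-\lambda)\lambda^2\tau^2h^2\sum_{j=1}^{N-1}(\partial_t\chi^k_{sj})^2-\lambda^3\tau^3\sum_{j=1}^{N-1}(\partial_t\chi^k_{sj})^2\\
&+4\lambda^3\tau\sum_{j=1}^{N-1}(g^k_{sj})^2+(1+\lambda)\lambda^2h^2\sum_{j=1}^{N-1}(\hat g^k_{sj})^2+8\lambda^3\tau^2h^{-2}\sum_{j=1}^{N-1}(\hat g^k_{sj})^2.
\end{aligned} \tag{2.16}$$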

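Now we consider the estimate of $B_2$. Using the discrete Green formula and the equalities (2.12)–(2.13), one has

$$\begin{aligned}
B_2&=-2\lambda h^2\sum_{j=1}^{N-1}(\tilde w^k_{sj}-w^k_{sj})(\partial_t\Delta_y w^k_{sj})=-2\lambda h^2\sum_{j=1}^{N-1}[\Delta_y(\tilde w^k_{sj}-w^k_{sj})](\partial_t w^k_{sj})\\
&=2\lambda^2\tau^2h^2\sum_{j=1}^{N-1}(\partial_t\Delta_y\chi^k_{sj})(\partial_t w^k_{sj})+2\lambda^2\tau h^2\sum_{j=1}^{N-1}(\Delta_y\hat g^k_{sj})(\partial_t w^k_{sj})\\
&=2\lambda^3\tau^2h^2\sum_{j=1}^{N-1}(\partial_t\Delta_y\chi^k_{sj})\chi^k_{sj}+2\lambda^3\tau^2h^2\sum_{j=1}^{N-1}(\partial_t\Delta_y\chi^k_{sj})g^k_{sj}+2\lambda^2\tau h^2\sum_{j=1}^{N-1}(\Delta_y\hat g^k_{sj})(\partial_t w^k_{sj})\\
&\equiv B_{21}+B_{22}+B_{23},
\end{aligned}$$

where

$$B_{21}=-2\lambda^3\tau^2h^2\sum_{j=1}^{N}(\partial_t\delta_y\chi^k_{s,j-\frac12})(\delta_y\chi^k_{s,j-\frac12})=-\lambda^3\tau^2h^2\sum_{j=1}^{N}\partial_t(\delta_y\chi^k_{s,j-\frac12})^2-\lambda^3\tau^3h^2\sum_{j=1}^{N}(\partial_t\delta_y\chi^k_{s,j-\frac12})^2,$$

$$B_{22}\le\frac14\lambda^3\tau^3h^4\sum_{j=1}^{N-1}(\partial_t\Delta_y\chi^k_{sj})^2+4\lambda^3\tau\sum_{j=1}^{N-1}(g^k_{sj})^2\le\lambda^3\tau^3h^2\sum_{j=1}^{N}(\partial_t\delta_y\chi^k_{s,j-\frac12})^2+4\lambda^3\tau\sum_{j=1}^{N-1}(g^k_{sj})^2,$$

$$B_{23}\le2\lambda^3\tau^2h^2\sum_{j=1}^{N-1}(\Delta_y\hat g^k_{sj})^2+\frac{\lambda h^2}{2}\sum_{j=1}^{N-1}(\partial_t w^k_{sj})^2\le32\lambda^3\tau^2h^{-2}\sum_{j=1}^{N-1}(\hat g^k_{sj})^2+\frac{\lambda}{2}\|\partial_t w^k\|^2.$$

Collecting terms, one gets

$$B_2\le\frac{\lambda}{2}\|\partial_t w^k\|^2-\lambda^3\tau^2h^2\sum_{j=1}^{N}\partial_t(\delta_y\chi^k_{s,j-\frac12})^2+4\lambda^3\tau\sum_{j=1}^{N-1}(g^k_{sj})^2+32\lambda^3\tau^2h^{-2}\sum_{j=1}^{N-1}(\hat g^k_{sj})^2. \tag{2.17}$$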

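Inserting the inequalities (2.16) and (2.17) into (2.15), we get

$$\begin{aligned}
B\le{}&\lambda\|\partial_t w^k\|^2-2\lambda^3\tau^2\sum_{j=1}^{N-1}\partial_t(\chi^k_{sj})^2-\lambda^3\tau^2h^2\sum_{j=1}^{N}\partial_t(\delta_y\chi^k_{s,j-\frac12})^2\\
&+c\lambda(|w^k|_1^2+|w^{k-1}|_1^2)-(2-\lambda)\lambda^2\tau^2h^2\sum_{j=1}^{N-1}(\partial_t\chi^k_{sj})^2-\lambda^3\tau^3\sum_{j=1}^{N-1}(\partial_t\chi^k_{sj})^2\\
&+8\lambda^3\tau\sum_{j=1}^{N-1}(g^k_{sj})^2+(1+\lambda)\lambda^2h^2\sum_{j=1}^{N-1}(\hat g^k_{sj})^2+40\lambda^3\tau^2h^{-2}\sum_{j=1}^{N-1}(\hat g^k_{sj})^2.
\end{aligned} \tag{2.18}$$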

Multiplying the equations (2.7)–(2.11) by $2h^2\partial_t w^k_{ij}$, $2h^2\partial_t w^k_{s-1,j}$, $2h^2\partial_t w^k_{sj}$, $2h^2\partial_t w^k_{s+1,j}$ and $2h^2\partial_t w^k_{ij}$, respectively, summing over $i,j$ for $(x_i,y_j)\in\Omega_h$ and then adding up the resulting equalities, we get

$$2\|\partial_t w^k\|^2+\lambda\,\partial_t(|w^k|_1^2)+\lambda\tau|\partial_t w^k|_1^2=A+B, \tag{2.19}$$

where the boundary condition (2.6), the discrete Green formula and Lemma 1.1 have been used. Inserting the inequalities (2.14) and (2.18) into (2.19), we obtain

$$\partial_t E^k_\lambda\le c(|w^k|_1^2+|w^{k-1}|_1^2)+2\|g^k\|^2+4\lambda\|g^k\|_\gamma^2\le c(E^k_\lambda+E^{k-1}_\lambda)+2\|g^k\|^2+4\lambda\|g^k\|_\gamma^2,$$

or

$$(1-c\tau)E^k_\lambda\le(1+c\tau)E^{k-1}_\lambda+2\tau\|g^k\|^2+4\lambda\tau\|g^k\|_\gamma^2.$$

Taking $\tau_0=1/(2c)$ and $\tau\le\tau_0$, the claimed estimate follows immediately. The proof of Lemma 2.1 is completed.

Given the discrete solution $\{u^{k-1}_{ij}\,|\,(x_i,y_j)\in\Omega_h\}$ at the time level $t_{k-1}$, the finite difference schemes (2.2)–(2.4) can be regarded as a system of nonlinear equations with respect to the unknown variables $\{u^k_{ij}\,|\,(x_i,y_j)\in\bar\Omega_h\}$, so that the existence of a solution is not evident. By using the Leray-Schauder fixed-point theorem (see [21, p. 27] or [22, p. 280]), we prove the existence of the discrete solutions of the CEIDD-SI procedure (2.1)–(2.4). Let

$$U^k_{ij}=u(x_i,y_j,t_k),\qquad e^k_{ij}=U^k_{ij}-u^k_{ij},\quad (x_i,y_j)\in\bar\Omega_h;\qquad \tilde e^k_{ij}=U^k_{ij}-\tilde u^k_{ij},\quad (x_i,y_j)\in\gamma_x;$$

for $0\le k\le K$. Then the error system of the CEIDD-SI procedure (2.1)–(2.4) reads

$$\frac{\tilde e^k_{ij}-e^{k-1}_{ij}}{\tau}=\Delta_h e^{k-1}_{ij}+f^{k-1}_{ij}(U)-f^{k-1}_{ij}(u)+\tilde r^{k-1}_{ij},\quad\text{on }\gamma_x,\ 1\le k\le K, \tag{2.20}$$

$$\begin{cases}\partial_t e^k_{ij}=\Delta_h e^k_{ij}+f^k_{ij}(U)-f^k_{ij}(u)+r^k_{ij}, & \text{on }\Omega_{11}\cup\Omega_{21},\\ e^k_{ij}=\tilde e^k_{ij}, & \text{on }\gamma_x,\\ e^k_{ij}=0, & \text{on }\partial\Omega_h,\end{cases}\qquad 1\le k\le K, \tag{2.21}$$

$$\partial_t e^k_{ij}=\Delta_h e^k_{ij}+f^k_{ij}(U)-f^k_{ij}(u)+r^k_{ij},\quad\text{on }\gamma_x,\ 1\le k\le K, \tag{2.22}$$

$$e^0_{ij}=0,\quad\text{on }\bar\Omega_h, \tag{2.23}$$

where $r^k_{ij}$ and $\tilde r^{k-1}_{ij}$ are truncation error functions. If the solution $\{e^k_{ij}\,|\,(x_i,y_j)\in\bar\Omega_h,\ 0\le k\le K\}$ of the above error system (2.20)–(2.23) exists, our assumption (A3) implies that the CEIDD-SI method (2.1)–(2.4) has at least one solution.

Denote $e^k=\{e^k_{ij}\,|\,(x_i,y_j)\in\bar\Omega_h\}$ and $z=\{z_{ij}\,|\,(x_i,y_j)\in\bar\Omega_h\}$. For $0\le\lambda\le1$, we construct a $\lambda$-parameterized mapping $T_\lambda:\mathbb{R}^{(N+1)^2}\to\mathbb{R}^{(N+1)^2}$ of the Euclidean space $\mathbb{R}^{(N+1)^2}$ into itself. $e^k=T_\lambda(z)$ is defined by

$$e^k_{ij}=e^{k-1}_{ij}+\lambda\tau\Delta_h z_{ij}+\lambda\tau\sigma^k_{ij}(z)+\lambda\tau r^k_{ij},\quad (x_i,y_j)\in\Omega_h,\qquad e^k_{ij}=0,\quad (x_i,y_j)\in\partial\Omega_h,$$

where

$$\sigma^k_{ij}(z)=f^k_{ij}(U)-f^k_{ij}(U-z),\quad i\ne s-1,s+1,\ 1\le j\le N-1,$$

$$\sigma^k_{s-1,j}(z)=f^k_{s-1,j}(U)-\tilde f^k_{s-1,j}(U-z)+\frac{\tilde z_{sj}-z_{sj}}{h^2},\quad 1\le j\le N-1,$$

$$\sigma^k_{s+1,j}(z)=f^k_{s+1,j}(U)-\tilde f^k_{s+1,j}(U-z)+\frac{\tilde z_{sj}-z_{sj}}{h^2},\quad 1\le j\le N-1,$$

and

$$\tilde f^k_{s-1,j}(z)=f\Big(\frac{\tilde z_{sj}-z_{s-2,j}}{2h},\ D_y z_{s-1,j},\ \frac{E_y z_{s-1,j}}{2}+\frac{z_{s-2,j}+z_{s-1,j}+\tilde z_{sj}}{6},\ x_{s-1},\ y_j,\ t_k\Big),\quad 1\le j\le N-1,$$

$$\tilde f^k_{s+1,j}(z)=f\Big(\frac{z_{s+2,j}-\tilde z_{sj}}{2h},\ D_y z_{s+1,j},\ \frac{E_y z_{s+1,j}}{2}+\frac{z_{s+2,j}+z_{s+1,j}+\tilde z_{sj}}{6},\ x_{s+1},\ y_j,\ t_k\Big),\quad 1\le j\le N-1,$$

$$\tilde z_{sj}=e^{k-1}_{sj}+\lambda\tau\Delta_h e^{k-1}_{sj}+\lambda\tau[f^{k-1}_{sj}(U)-f^{k-1}_{sj}(u)]+\lambda\tau\tilde r^{k-1}_{sj},\quad 1\le j\le N-1.$$

Thus $e^k=T_\lambda(z)\in\mathbb{R}^{(N+1)^2}$ for any $z\in\mathbb{R}^{(N+1)^2}$ and parameter $\lambda\in[0,1]$. (C1) This defines a continuous mapping $T_\lambda$ of the Euclidean space $\mathbb{R}^{(N+1)^2}$ into itself with a parameter $\lambda$. (C2) When $\lambda=0$, the image of the mapping $T_0$ is a fixed point $T_0(z)=e^{k-1}$ in $\mathbb{R}^{(N+1)^2}$. Then, for the nonlinear system

$$e^k=T_\lambda(e^k), \tag{2.24}$$

the Leray-Schauder fixed-point theorem for continuous mappings says that it has at least one solution $e^k\in\mathbb{R}^{(N+1)^2}$ for any $0\le\lambda\le1$ if (C3) all possible solutions of (2.24) are uniformly bounded with respect to the parameter $0\le\lambda\le1$. Thanks to (C2), in order to prove the existence of the solution for the nonlinear system (2.20)–(2.22), or $e^k=T_1(e^k)$, it is sufficient to prove the uniform boundedness of all possible fixed points of the mapping $T_\lambda:\mathbb{R}^{(N+1)^2}\to\mathbb{R}^{(N+1)^2}$ with respect to the parameter $\lambda\in(0,1]$. It is easy to check that

$$|f^k_{ij}(U)-f^k_{ij}(U-e)|\le c_L(|D_x e^k_{ij}|+|D_y e^k_{ij}|+|E_{xy}e^k_{ij}|),\quad i\ne s-1,s+1,$$

$$|f^k_{ij}(U)-\tilde f^k_{ij}(U-e)|\le c_L(|D_x e^k_{ij}|+|D_y e^k_{ij}|+|E_{xy}e^k_{ij}|)+\frac{c_L}{2h}(1+h/3)\,|\tilde e^k_{sj}-e^k_{sj}|,\quad i=s-1,s+1.$$

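Thus we use Lemma 2.1 to find that, for small time-step size $\tau\le\tau_0$,

$$|e^k|_1^2\le(1+4c\tau)\Big[|e^{k-1}|_1^2+2\tau^2\sum_{j=1}^{N-1}(\chi^{k-1}_{sj}(e))^2+\tau^2h^2\sum_{j=1}^{N}(\delta_y\chi^{k-1}_{s,j-\frac12}(e))^2\Big]+4\tau(\|r^k\|^2+2\|r^k\|_\gamma^2),$$

where

$$\|r^k\|_\gamma=\sqrt{h^2\sum_{j=1}^{N-1}(r^k_{sj}-\tilde r^{k-1}_{sj})^2+3\tau\sum_{j=1}^{N-1}(r^k_{sj}-\tilde r^{k-1}_{sj})^2+10\tau^2h^{-2}\sum_{j=1}^{N-1}(r^k_{sj}-\tilde r^{k-1}_{sj})^2}.$$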

It means that $|e^k|_1$ is uniformly bounded with respect to $0<\lambda\le1$. Thanks to Lemma 1.2, $\|T_\lambda(e^k)\|=\|e^k\|$ is also uniformly bounded for given time-step $\tau$ and spacing $h$. Therefore, the system (2.20)–(2.22) has a solution $\{e^k_{ij}\,|\,(x_i,y_j)\in\bar\Omega_h\}$. By the principle of induction, we know that the error system (2.20)–(2.23) has one solution $\{e^k_{ij}\,|\,(x_i,y_j)\in\bar\Omega_h,\ 0\le k\le K\}$. Thus assumption (A3) yields the following result.

Lemma 2.2. Suppose that (A1)–(A3) are satisfied and the time-step $\tau$ is sufficiently small. Then the CEIDD-SI algorithm (2.1)–(2.4) has at least one solution $\{u^k_{ij}\,|\,(x_i,y_j)\in\bar\Omega_h,\ 0\le k\le K\}$.

To verify the uniqueness, suppose that grid functions $\{v^k_{ij},\tilde v^k_{ij}\,|\,(x_i,y_j)\in\bar\Omega_h,\ 0\le k\le K\}$ satisfy the following CEIDD-SI algorithm:

$$\frac{\tilde v^k_{ij}-v^{k-1}_{ij}}{\tau}=\Delta_h v^{k-1}_{ij}+f^{k-1}_{ij}(v)+\tilde\rho^{k-1}_{ij},\quad\text{on }\gamma_x,\ 1\le k\le K,$$

$$\begin{cases}\partial_t v^k_{ij}=\Delta_h v^k_{ij}+f^k_{ij}(v)+\rho^k_{ij}, & \text{on }\Omega_{11}\cup\Omega_{21},\\ v^k_{ij}=\tilde v^k_{ij}, & \text{on }\gamma_x,\\ v^k_{ij}=u_b(x_i,y_j,t_k), & \text{on }\partial\Omega_h,\end{cases}\qquad 1\le k\le K,$$

$$\partial_t v^k_{ij}=\Delta_h v^k_{ij}+f^k_{ij}(v)+\rho^k_{ij},\quad\text{on }\gamma_x,\ 1\le k\le K,$$

$$v^0_{ij}=u^0(x_i,y_j)+\varphi_{ij},\quad\text{on }\bar\Omega_h.$$

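Let $\varepsilon^k_{ij}=v^k_{ij}-u^k_{ij}$, $(x_i,y_j)\in\bar\Omega_h$ and $\tilde\varepsilon^k_{ij}=\tilde v^k_{ij}-\tilde u^k_{ij}$, $(x_i,y_j)\in\gamma_x$ for $0\le k\le K$. Subtracting the difference procedure (2.2)–(2.4) and (2.1) from the above system, one has

$$\frac{\tilde\varepsilon^k_{ij}-\varepsilon^{k-1}_{ij}}{\tau}=\Delta_h\varepsilon^{k-1}_{ij}+f^{k-1}_{ij}(v)-f^{k-1}_{ij}(u)+\tilde\rho^{k-1}_{ij},\quad\text{on }\gamma_x,\ 1\le k\le K, \tag{2.25}$$

$$\begin{cases}\partial_t\varepsilon^k_{ij}=\Delta_h\varepsilon^k_{ij}+f^k_{ij}(v)-f^k_{ij}(u)+\rho^k_{ij}, & \text{on }\Omega_{11}\cup\Omega_{21},\\ \varepsilon^k_{ij}=\tilde\varepsilon^k_{ij}, & \text{on }\gamma_x,\\ \varepsilon^k_{ij}=0, & \text{on }\partial\Omega_h,\end{cases}\qquad 1\le k\le K, \tag{2.26}$$

$$\partial_t\varepsilon^k_{ij}=\Delta_h\varepsilon^k_{ij}+f^k_{ij}(v)-f^k_{ij}(u)+\rho^k_{ij},\quad\text{on }\gamma_x,\ 1\le k\le K, \tag{2.27}$$

$$\varepsilon^0_{ij}=\varphi_{ij},\quad\text{on }\bar\Omega_h. \tag{2.28}$$

In the difference system (2.25)–(2.27), we set

$$\varepsilon^{k-1}_{ij}=0\ (\text{or } v^{k-1}_{ij}=u^{k-1}_{ij}),\quad \rho^k_{ij}=0,\quad (x_i,y_j)\in\Omega_h;\qquad \tilde\rho^{k-1}_{ij}=0,\quad (x_i,y_j)\in\gamma_x.$$

The grid function $\{v^k_{ij}\,|\,(x_i,y_j)\in\bar\Omega_h\}$ is then another solution at the time level $t_k$ of the CEIDD-SI procedure (2.2)–(2.4). To prove the uniqueness of the numerical solution, it is sufficient to show that the finite difference scheme (2.25)–(2.27) has no nontrivial solution. In this case,

$$f^{k-1}_{sj}(v)-f^{k-1}_{sj}(u)=f^{k-1}_{sj}(u)-f^{k-1}_{sj}(u)=0,\quad 1\le j\le N-1.$$

Then, by using Lemma 2.1, we have

$$|\varepsilon^k|_1^2+2\tau^2\sum_{j=1}^{N-1}(\chi^k_{sj}(\varepsilon))^2+\tau^2h^2\sum_{j=1}^{N}(\delta_y\chi^k_{s,j-\frac12}(\varepsilon))^2\le0.$$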


It implies that $|\varepsilon^k|_1=0$. Then the zero boundary condition yields $\varepsilon^k_{ij}=0$ for $(x_i,y_j)\in\bar\Omega_h$. We see that, given data $\{u^{k-1}_{ij}\,|\,(x_i,y_j)\in\Omega_h\}$ at the time level $t_{k-1}$, the CEIDD-SI procedure (2.2)–(2.4) generates a unique solution $\{u^k_{ij}\,|\,(x_i,y_j)\in\bar\Omega_h\}$ at the time level $t_k$. Therefore, the principle of induction and Lemma 2.2 lead to the following result.

Theorem 2.1. Suppose that the assumptions (A1)–(A3) are satisfied and the time-step $\tau$ is sufficiently small. Then the CEIDD-SI algorithm (2.1)–(2.4) is uniquely solvable.

Focusing on the difference system (2.25)–(2.28), we apply Lemma 2.1 to get

$$\begin{aligned}
&|\varepsilon^k|_1^2+2\tau^2\sum_{j=1}^{N-1}(\chi^k_{sj}(\varepsilon))^2+\tau^2h^2\sum_{j=1}^{N}(\delta_y\chi^k_{s,j-\frac12}(\varepsilon))^2\\
&\quad\le(1+4c\tau)\Big[|\varepsilon^{k-1}|_1^2+2\tau^2\sum_{j=1}^{N-1}(\chi^{k-1}_{sj}(\varepsilon))^2+\tau^2h^2\sum_{j=1}^{N}(\delta_y\chi^{k-1}_{s,j-\frac12}(\varepsilon))^2\Big]+4\tau(\|\rho^k\|^2+2\|\rho^k\|_\gamma^2),\quad 1\le k\le K.
\end{aligned}$$

The well-known discrete Gronwall inequality yields

$$\begin{aligned}
&|\varepsilon^k|_1^2+2\tau^2\sum_{j=1}^{N-1}(\chi^k_{sj}(\varepsilon))^2+\tau^2h^2\sum_{j=1}^{N}(\delta_y\chi^k_{s,j-\frac12}(\varepsilon))^2\\
&\quad\le e^{4ck\tau}\Big[|\varepsilon^0|_1^2+2\tau^2\sum_{j=1}^{N-1}(\chi^0_{sj}(\varepsilon))^2+\tau^2h^2\sum_{j=1}^{N}(\delta_y\chi^0_{s,j-\frac12}(\varepsilon))^2+4\tau\sum_{\ell=1}^{k}(\|\rho^\ell\|^2+2\|\rho^\ell\|_\gamma^2)\Big]
\end{aligned}$$

for $1\le k\le K$. We arrive at the following theorem.

Theorem 2.2. Under the assumptions (A1)–(A3), the CEIDD-SI algorithm (2.1)–(2.4) is unconditionally stable with respect to the initial and external data in the $H^1$ norm (seminorm).

Now we examine the accuracy of the solution. Focusing on the error system (2.20)–(2.23), we apply Lemma 2.1 and the discrete Gronwall inequality to find that

$$\begin{aligned}
&|e^k|_1^2+2\tau^2\sum_{j=1}^{N-1}(\chi^k_{sj}(e))^2+\tau^2h^2\sum_{j=1}^{N}(\delta_y\chi^k_{s,j-\frac12}(e))^2\\
&\quad\le e^{4ck\tau}\Big[|e^0|_1^2+2\tau^2\sum_{j=1}^{N-1}(\chi^0_{sj}(e))^2+\tau^2h^2\sum_{j=1}^{N}(\delta_y\chi^0_{s,j-\frac12}(e))^2+4\tau\sum_{\ell=1}^{k}(\|r^\ell\|^2+2\|r^\ell\|_\gamma^2)\Big]
\end{aligned}$$

for $1\le k\le K$. Note that the truncation errors satisfy $|r^k_{ij}|\le c(\tau+h^2)$ and $|\tilde r^k_{sj}|\le c(\tau+h^2)$. It follows that

$$\begin{aligned}
|e^k|_1&\le c\sqrt{1+(h+3\tau h^{-1}+10\tau^2h^{-3})}\,(\tau+h^2)\le c\sqrt{1+(h^{1/2}+\tau h^{-3/2})^2}\,(\tau+h^2)\\
&\le c(1+h^{1/2}+\tau h^{-3/2})(\tau+h^2)\le c(\tau+h^2+\tau^2h^{-3/2}),\quad 1\le k\le K.
\end{aligned}$$

Thus we have the following result.

Theorem 2.3. Under the assumptions (A1)–(A3), the solution of the CEIDD-SI algorithm (2.1)–(2.4) converges to the exact solution with an order of $O(\tau+h^2+\tau^2h^{-3/2})$ in the $H^1$ norm (seminorm), provided the spatial length $h$ is sufficiently small and the time-step size $\tau=O(h^{3/4+\epsilon})$ for $0<\epsilon\le5/4$.
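The role of the step-size condition in Theorem 2.3 can be read off from the error bound by a short calculation. As a worked check, take $\tau=h^{3/4+\epsilon}$ exactly (the implied constant is set to one here purely for illustration):

```latex
\tau^2 h^{-3/2} = h^{3/2+2\epsilon}\,h^{-3/2} = h^{2\epsilon},
\qquad\text{so}\qquad
\tau + h^2 + \tau^2 h^{-3/2} = h^{3/4+\epsilon} + h^2 + h^{2\epsilon}.
```

Every exponent on the right-hand side is positive for $\epsilon>0$, so the bound tends to zero as $h\to0$. For $0<\epsilon\le3/4$ the slowest term is $h^{2\epsilon}$, while for $3/4\le\epsilon\le5/4$ it is $h^{3/4+\epsilon}$; at the endpoint $\epsilon=5/4$ one has $\tau=h^2$ and the bound becomes $O(h^2)$.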


Compared with the error estimate $|e^k|_1 = O(\tau + h^2 + \tau h^{-1/2})$ obtained in our previous study[18], an improved rate of convergence is observed when the time-step size $\tau = O(h^{1+\epsilon})$ for $0 < \epsilon \le 1$. We note that $L^2$ estimates on the stability and convergence of the CEIDD-SI method can also be proved by combining the present analysis with the techniques in [18].

As for the general case of $p \times 1$ subdomains, we consider the interior interfaces ($1 \le \alpha \le p-1$),
$$
\gamma_x = \bigcup_{\alpha=1}^{p-1} \{(x_{i_\alpha}, y_j) \mid 4 \le i_\alpha + 2 \le i_{\alpha+1} \le N-2,\ 1 \le j \le N-1\},
$$
such that $\Omega_h$ is decomposed into $p \times 1$ non-overlapping subdomains, see Figure 1. We conclude that, under the assumptions (A1)–(A3), the $p \times 1$-subdomain version of the CEIDD-SI method is uniquely solvable, stable, and convergent with an order of $O(\tau + h^2 + \sqrt{p-1}\,\tau^2 h^{-3/2})$, provided the time-step size $\tau = O(p^{-1/4} h^{3/4+\epsilon})$ for $0 < \epsilon \le 5/4$.

3 The CEIDD method based on composite interface

3.1 Presentation of CEIDD-CI algorithm

The construction of the composite interface is motivated by the design of the zigzag-line interface (ZI), see [18]. We recall the zigzag-line interior boundaries as follows. For fixed integers $s, r$ ($2 < s, r < N-1$) satisfying $\mathrm{mod}(s, 2) = \mathrm{mod}(r, 2)$, let $\alpha(j) = s - \mathrm{mod}(j, 2)$ and $\beta(i) = r - \mathrm{mod}(i, 2)$. The zigzag-line interior boundaries are $Z_x = \{(x_{\alpha(j)}, y_j) \mid 1 \le j \le N-1\}$, $Z_y = \{(x_i, y_{\beta(i)}) \mid 1 \le i \le N-1\}$ and $Z = Z_x \cup Z_y$. By using the zigzag interfaces, see Figures 5 and 6, the domain $\Omega_h$ is divided into some non-overlapping subregions. We observe that the backward Euler scheme (at the implicit correction step) at any interface point uses only the adjoining subdomain solutions and does not involve the other interface points. The point-to-point correlativity of interface solutions is thus broken, and data transfer across the interface is localized. In other words, from the data communication point of view, the zigzag-line interior boundaries make the implicit scheme behave in the manner of explicit schemes. Since explicit schemes are intrinsically parallel methods with local communication, the zigzag-line interfaces improve the flexibility of domain partitioning without globalizing the data communication between subdomains.

Figure 5 The zigzag interface $Z_x$ divides $\Omega_h$ into 2×1 subdomains

Figure 6 The zigzag interfaces $Z$ divide $\Omega_h$ into 2×2 subdomains
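The decoupling property of the zigzag interfaces can be checked on a small grid. The sketch below builds $Z_x$ and $Z_y$ from $\alpha(j) = s - \mathrm{mod}(j,2)$ and $\beta(i) = r - \mathrm{mod}(i,2)$ as index pairs $(i, j)$ and tests whether any two interface points are neighbours in the five-point stencil. The function names and the test grid ($N = 8$, $s = r = 4$) are illustrative assumptions, not taken from the paper.

```python
def zigzag_interfaces(N, s, r):
    """Index pairs (i, j) of the zigzag lines Z_x and Z_y on an (N+1) x (N+1) grid."""
    if s % 2 != r % 2:
        raise ValueError("s and r must have the same parity")
    zx = [(s - j % 2, j) for j in range(1, N)]   # alpha(j) = s - mod(j, 2)
    zy = [(i, r - i % 2) for i in range(1, N)]   # beta(i)  = r - mod(i, 2)
    return zx, zy

def has_adjacent_pair(points):
    """True if two points are horizontal or vertical neighbours (five-point stencil)."""
    pts = set(points)
    return any((i + 1, j) in pts or (i, j + 1) in pts for i, j in pts)

zx, zy = zigzag_interfaces(N=8, s=4, r=4)
straight = [(4, j) for j in range(1, 8)]          # a straight-line interface for comparison
print("zigzag coupled:  ", has_adjacent_pair(zx + zy))
print("straight coupled:", has_adjacent_pair(straight))
```

Every point of $Z$ has $i + j$ of the same parity as $s$, while five-point neighbours always have opposite parity, so no interface unknown appears in the stencil of another interface unknown. A straight line does not have this property.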

However, the point we want to note is that, due to the complex configuration of the zigzag-line interface, a little more effort is needed to carry out the resulting CEIDD-ZI algorithm on parallel machines using the existing matrix solvers. To overcome this disadvantage, we suggest a composite interface which consists of straight segments and zigzag fractions. Given some fixed integers $s$ ($2 < s < N-1$) and $r$ ($2 < r < N-1$), let
$$
\sigma = \sigma(j) = \begin{cases} s, & j \ne r, \\ s-1, & j = r, \end{cases}
\qquad\text{and}\qquad
\varsigma = \varsigma(i) = \begin{cases} r, & i \ne s, \\ r-1, & i = s. \end{cases}
\tag{3.1}
$$
We construct the following composite (straight-zigzag-straight) lines, as shown in Figures 7 and 8: $\Gamma_x = \{(x_\sigma, y_j) \mid 1 \le j \le N-1\}$, $\Gamma_y = \{(x_i, y_\varsigma) \mid 1 \le i \le N-1\}$ and $\Gamma = \Gamma_x \cup \Gamma_y$. By using these composite lines, $\Omega_h$ is divided into some smaller non-overlapping subregions. The corresponding CEIDD-CI algorithm can be presented in an obvious manner, i.e.,
$$
\frac{\tilde u^k_{ij} - u^{k-1}_{ij}}{\tau} = \Delta_h u^{k-1}_{ij} + f^{k-1}_{ij}(u), \quad \text{on } \Gamma,\ 1 \le k \le K, \tag{3.2}
$$
$$
\begin{cases}
\partial_t u^k_{ij} = \Delta_h u^k_{ij} + f^k_{ij}(u), & \text{on subdomains}, \\
u^k_{ij} = \tilde u^k_{ij}, & \text{on } \Gamma, \\
u^k_{ij} = u_b(x_i, y_j, t_k), & \text{on } \partial\Omega_h,
\end{cases}
\qquad 1 \le k \le K, \tag{3.3}
$$
$$
\partial_t u^k_{ij} = \Delta_h u^k_{ij} + f^k_{ij}(u), \quad \text{on } \Gamma,\ 1 \le k \le K, \tag{3.4}
$$
$$
u^0_{ij} = u^0(x_i, y_j), \quad \text{on } \bar\Omega_h. \tag{3.5}
$$
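The three stages (3.2)–(3.4) can be sketched in a few lines for a simplified setting. The code below is an illustrative sketch only, under several assumptions that are not from the paper: the right-hand side is a known source $g$ (the linear heat equation, so no Newton iteration is needed), boundary values do not change in time, $\partial_t$ is the backward difference quotient, and the implicit systems are solved by plain Gauss-Seidel sweeps rather than the authors' solver. All function names are hypothetical.

```python
import math

def laplacian(u, i, j, h):
    """Five-point discrete Laplacian of the grid function u at node (i, j)."""
    return (u[i-1][j] + u[i+1][j] + u[i][j-1] + u[i][j+1] - 4.0 * u[i][j]) / h**2

def composite_interface_x(N, s, r):
    """Single composite line Gamma_x of (3.1): column s, shifted to s-1 at row j = r."""
    return {(s - 1 if j == r else s, j) for j in range(1, N)}

def backward_euler_on(points, u_new, u_old, g_new, h, tau, sweeps=400):
    """Gauss-Seidel sweeps for u - tau*Lap(u) = u_old + tau*g_new on `points`.

    Entries of u_new outside `points` are treated as fixed data.
    """
    c = tau / h**2
    for _ in range(sweeps):
        for i, j in points:
            nb = u_new[i-1][j] + u_new[i+1][j] + u_new[i][j-1] + u_new[i][j+1]
            u_new[i][j] = (u_old[i][j] + tau * g_new[i][j] + c * nb) / (1.0 + 4.0 * c)

def ceidd_ci_step(u_old, g_old, g_new, gamma, h, tau):
    """One time step: explicit predictor on Gamma, implicit subdomain solve, implicit correction."""
    N = len(u_old) - 1
    interior = [(i, j) for i in range(1, N) for j in range(1, N)]
    subdomain_pts = [p for p in interior if p not in gamma]
    u_new = [row[:] for row in u_old]              # boundary values carried over (assumption)
    for i, j in gamma:                             # stage 1, cf. (3.2)
        u_new[i][j] = u_old[i][j] + tau * (laplacian(u_old, i, j, h) + g_old[i][j])
    backward_euler_on(subdomain_pts, u_new, u_old, g_new, h, tau)   # stage 2, cf. (3.3)
    backward_euler_on(sorted(gamma), u_new, u_old, g_new, h, tau)   # stage 3, cf. (3.4)
    return u_new

N, s, r = 8, 4, 4
h, tau = 1.0 / N, 0.01
gamma = composite_interface_x(N, s, r)
u0 = [[math.sin(math.pi * i * h) * math.sin(math.pi * j * h) for j in range(N + 1)]
      for i in range(N + 1)]
zero = [[0.0] * (N + 1) for _ in range(N + 1)]
u1 = ceidd_ci_step(u0, zero, zero, gamma, h, tau)
print("peak before:", max(max(row) for row in u0), "peak after:", max(max(row) for row in u1))
```

In stage 2 the interface values act as Dirichlet data, so the two subdomains separated by $\Gamma_x$ decouple and could be handled by different processors; the single loop here merely visits them one after the other.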

Compared with the CEIDD-SI approach using the crossover straight interfaces (see Figure 4), only a tiny modification in the treatment of the cross-point $(x_s, y_r)$ is made; however, it overcomes the main shortcoming in data communication. To make the argument more transparent, we consider the CEIDD-CI procedure using the single interface $\Gamma_x$, see Figure 7. Once the subdomain solutions are available, the interface solution $u^k_{s-1,r}$ can be computed first by the implicit Euler scheme without involving the other interface solutions. The computation of the remaining interface solutions is then divided into two parts, $u^k_{sj}$ for $1 \le j \le r-1$ and for $r+1 \le j \le N-1$ respectively, and each part can be carried out concurrently on different machines. Thus, the point-to-point correlativity of interface solutions introduced by the implicit Euler scheme is locally dissolved.

Figure 7 The composite interface $\Gamma_x$ divides $\Omega_h$ into 2×1 subdomains

Figure 8 The composite interfaces $\Gamma$ divide $\Omega_h$ into 2×2 subdomains

Now consider the crossover case, see Figure 8. We can divide the composite interfaces $\Gamma$ into four parts and assign each part to one processor. The interface correction step (the third stage) of the parallel method only requires data on adjoining subregions, so that the localization of communication together with an improved flexibility of domain partitioning is achieved. The parallel time spent in this step is markedly reduced because the smaller parts of the interface are treated simultaneously on parallel machines. Moreover, a quantitative analysis (cf. [18]) would show that domain partitioning with the crossover composite interfaces has a lower cost of data communication than decomposition with the non-crossover straight-line interfaces on given parallel computing machines. We note that it is easy to implement the CEIDD-CI method by using the existing implicit codes built on regular spatial domains. Observe that the smaller subregions are always rectangles, except for the subdomain $\Omega_{22}$ which contains the original cross-point $(x_s, y_r)$. However, this point causes no difficulty, since it is an isolated point and the solution there can be computed independently. From the above arguments and the analysis given in the next subsection, one has the following proposition.

Proposition 3.1. The CEIDD-CI algorithm is a scalable parallel parabolic solver in the sense that it has higher parallel efficiency including flexibility of domain partitioning, localization of data communication (vs. the CEIDD-SI method) and facility of parallel implementation (vs. the CEIDD-ZI method) without degrading the stability and accuracy of numerical solutions.

3.2 Solvability, stability and convergence

The theoretical analysis of the CEIDD-CI method is based on the following lemma, in which we denote
$$
\sum_{j \ne r} = \sum_{j=1}^{r-1} + \sum_{j=r+1}^{N-1}, \qquad \sum_{i \ne s} = \sum_{i=1}^{s-1} + \sum_{i=s+1}^{N-1}.
$$

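Lemma 3.1. Let $0 < \lambda \le 1$, let $\{v^k_{ij} \mid (x_i, y_j) \in \bar\Omega_h\}$ be a specific grid function such as the exact solution $\{U^k_{ij} \mid (x_i, y_j) \in \bar\Omega_h\}$, and let the grid function $\{w^k_{ij} \mid (x_i, y_j) \in \bar\Omega_h\}$ be the solution of the following $\lambda$-parameterized nonlinear difference equations:
$$
\frac{\tilde w^k_{ij} - w^{k-1}_{ij}}{\tau} = \lambda \Delta_h w^{k-1}_{ij} + \lambda G^{k-1}_{ij}(w) + \lambda \tilde g^{k-1}_{ij}, \quad \text{on } \Gamma,\ 1 \le k \le K, \tag{3.6}
$$
$$
\begin{cases}
\partial_t w^k_{ij} = \lambda \Delta_h w^k_{ij} + \lambda G^k_{ij}(w) + \lambda g^k_{ij}, & \text{on subdomains}, \\
w^k_{ij} = \tilde w^k_{ij}, & \text{on } \Gamma, \\
w^k_{ij} = 0, & \text{on } \partial\Omega_h,
\end{cases}
\qquad 1 \le k \le K, \tag{3.7}
$$
$$
\partial_t w^k_{ij} = \lambda \Delta_h w^k_{ij} + \lambda G^k_{ij}(w) + \lambda g^k_{ij}, \quad \text{on } \Gamma,\ 1 \le k \le K, \tag{3.8}
$$
where $G^k_{ij}(w) = f^k_{ij}(v) - f^k_{ij}(v - w)$. Furthermore, denote
$$
F^k_\lambda = |w^k|_1^2 + 2\lambda^2\tau^2 \sum_{j \ne r} \big(\chi^k_{sj}(w)\big)^2 + \lambda^2\tau^2 h^2 \sum_{j=1}^{N} \big(\delta_y \chi^k_{s,j-\frac12}(w)\big)^2
+ 2\lambda^2\tau^2 \sum_{i \ne s} \big(\chi^k_{ir}(w)\big)^2 + \lambda^2\tau^2 h^2 \sum_{i=1}^{N} \big(\delta_x \chi^k_{i-\frac12,r}(w)\big)^2,
$$
where
$$
\chi^k_{ij}(w) \equiv \begin{cases} \Delta_h w^k_{ij} + G^k_{ij}(w), & (x_i, y_j) \in \Gamma, \\ 0, & \text{otherwise}. \end{cases}
$$
Then, there exists a positive constant $\tau_0$ such that, when $\tau \le \tau_0$,
$$
F^k_\lambda \le (1 + 4c\tau) F^{k-1}_\lambda + 4\tau\big(\|g^k\|^2 + 2\lambda\|g^k\|^2_{\Gamma_x} + 2\lambda\|g^k\|^2_{\Gamma_y}\big),
$$
where the constant $c$ is dependent on $c_L$, and
$$
\|g^k\|^2_{\Gamma_x} = h^2 \sum_{j \ne r} \big(g^k_{sj} - \tilde g^{k-1}_{sj}\big)^2 + 3\tau \sum_{j \ne r} \big(g^k_{sj}\big)^2 + 20\tau^2 h^{-2} \sum_{j \ne r} \big(g^k_{sj} - \tilde g^{k-1}_{sj}\big)^2,
$$
$$
\|g^k\|^2_{\Gamma_y} = h^2 \sum_{i \ne s} \big(g^k_{ir} - \tilde g^{k-1}_{ir}\big)^2 + 3\tau \sum_{i \ne s} \big(g^k_{ir}\big)^2 + 20\tau^2 h^{-2} \sum_{i \ne s} \big(g^k_{ir} - \tilde g^{k-1}_{ir}\big)^2.
$$

Proof. For clarity, let
$$
W^k_{ij} \equiv \begin{cases} (\tilde w^k_{ij} - w^k_{ij})/h^2, & (x_i, y_j) \in \Gamma, \\ 0, & \text{otherwise}, \end{cases}
\qquad\text{and}\qquad
\hat g^k_{ij} \equiv \begin{cases} g^k_{ij} - \tilde g^{k-1}_{ij}, & (x_i, y_j) \in \Gamma, \\ 0, & \text{otherwise}. \end{cases}
$$
Using equations (3.6) and (3.8), the definitions of $\chi^k_{ij}$, $W^k_{ij}$ and $\hat g^k_{ij}$, and the zero-valued boundary condition, one has
$$
h^2 W^k_{ij} = -\lambda\tau^2 \partial_t \chi^k_{ij} - \lambda\tau \hat g^k_{ij}, \qquad (x_i, y_j) \in \bar\Omega_h, \tag{3.9}
$$
$$
\partial_t w^k_{ij} = \lambda \chi^k_{ij} + \lambda g^k_{ij}, \qquad (x_i, y_j) \in \Gamma. \tag{3.10}
$$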
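Identity (3.9) can be verified in two lines. The following worked step is a sketch that assumes $\partial_t$ denotes the backward difference quotient, $\partial_t v^k = (v^k - v^{k-1})/\tau$, which is the reading consistent with (3.6) and (3.8); the notation follows the definitions of $\chi$, $W$ and $\hat g$ above.

```latex
% On \Gamma, write the predictor (3.6) and the corrector (3.8) as one-step updates:
\tilde w^k_{ij} = w^{k-1}_{ij} + \lambda\tau\,\chi^{k-1}_{ij} + \lambda\tau\,\tilde g^{k-1}_{ij},
\qquad
w^k_{ij} = w^{k-1}_{ij} + \lambda\tau\,\chi^{k}_{ij} + \lambda\tau\,g^{k}_{ij}.
% Subtracting the second from the first and using the definitions of W and \hat g:
h^2 W^k_{ij} = \tilde w^k_{ij} - w^k_{ij}
  = -\lambda\tau\big(\chi^k_{ij} - \chi^{k-1}_{ij}\big) - \lambda\tau\big(g^k_{ij} - \tilde g^{k-1}_{ij}\big)
  = -\lambda\tau^2\,\partial_t\chi^k_{ij} - \lambda\tau\,\hat g^k_{ij}.
% Off \Gamma both sides vanish by the zero extensions of W, \chi and \hat g.
```

The second update is just (3.8) rewritten, which is also identity (3.10).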

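(I) The case of the single interface $\Gamma_x$ (see Figure 7) is considered first. In this case, we can write the system (3.6)–(3.8) as the following point-related difference equations
\begin{align}
& \tilde w^k_{\sigma j} = w^{k-1}_{\sigma j} + \lambda\tau \Delta_h w^{k-1}_{\sigma j} + \lambda\tau G^{k-1}_{\sigma j} + \lambda\tau \tilde g^{k-1}_{\sigma j}, && 1 \le j \le N-1, \tag{3.11}\\
& w^k_{ij} = 0, && (x_i, y_j) \in \partial\Omega_h, \tag{3.12}\\
& \partial_t w^k_{ij} = \lambda\Delta_h w^k_{ij} + \lambda G^k_{ij} + \lambda g^k_{ij}, && 1 \le i \le s-3,\ 1 \le j \le N-1, \tag{3.13}\\
& \partial_t w^k_{s-2,j} = \lambda\Delta_h w^k_{s-2,j} + \lambda G^k_{s-2,j} + \lambda g^k_{s-2,j}, && 1 \le j \le r-1, \tag{3.14}\\
& \partial_t w^k_{s-2,r} = \lambda\Delta_h w^k_{s-2,r} + \lambda \widehat G^k_{s-2,r} + \lambda W^k_{s-1,r} + \lambda g^k_{s-2,r}, \tag{3.15}\\
& \partial_t w^k_{s-2,j} = \lambda\Delta_h w^k_{s-2,j} + \lambda G^k_{s-2,j} + \lambda g^k_{s-2,j}, && r+1 \le j \le N-1, \tag{3.16}\\
& \partial_t w^k_{s-1,j} = \lambda\Delta_h w^k_{s-1,j} + \lambda \widehat G^k_{s-1,j} + \lambda W^k_{sj} + \lambda g^k_{s-1,j}, && 1 \le j \le r-2, \tag{3.17}\\
& \partial_t w^k_{s-1,r-1} = \lambda\Delta_h w^k_{s-1,r-1} + \lambda \widehat G^k_{s-1,r-1} + \lambda W^k_{s,r-1} + \lambda W^k_{s-1,r} + \lambda g^k_{s-1,r-1}, \tag{3.18}\\
& \partial_t w^k_{s-1,r} = \lambda\Delta_h w^k_{s-1,r} + \lambda G^k_{s-1,r} + \lambda g^k_{s-1,r}, \tag{3.19}\\
& \partial_t w^k_{s-1,r+1} = \lambda\Delta_h w^k_{s-1,r+1} + \lambda \widehat G^k_{s-1,r+1} + \lambda W^k_{s,r+1} + \lambda W^k_{s-1,r} + \lambda g^k_{s-1,r+1}, \tag{3.20}\\
& \partial_t w^k_{s-1,j} = \lambda\Delta_h w^k_{s-1,j} + \lambda \widehat G^k_{s-1,j} + \lambda W^k_{sj} + \lambda g^k_{s-1,j}, && r+2 \le j \le N-1, \tag{3.21}\\
& \partial_t w^k_{sj} = \lambda\Delta_h w^k_{sj} + \lambda G^k_{sj} + \lambda g^k_{sj}, && 1 \le j \le r-1, \tag{3.22}\\
& \partial_t w^k_{sr} = \lambda\Delta_h w^k_{sr} + \lambda \widehat G^k_{sr} + \lambda W^k_{s-1,r} + \lambda W^k_{s,r-1} + \lambda W^k_{s,r+1} + \lambda g^k_{sr}, \tag{3.23}\\
& \partial_t w^k_{sj} = \lambda\Delta_h w^k_{sj} + \lambda G^k_{sj} + \lambda g^k_{sj}, && r+1 \le j \le N-1, \tag{3.24}\\
& \partial_t w^k_{s+1,j} = \lambda\Delta_h w^k_{s+1,j} + \lambda \widehat G^k_{s+1,j} + \lambda W^k_{sj} + \lambda g^k_{s+1,j}, && 1 \le j \le r-1, \tag{3.25}\\
& \partial_t w^k_{s+1,r} = \lambda\Delta_h w^k_{s+1,r} + \lambda G^k_{s+1,r} + \lambda g^k_{s+1,r}, \tag{3.26}\\
& \partial_t w^k_{s+1,j} = \lambda\Delta_h w^k_{s+1,j} + \lambda \widehat G^k_{s+1,j} + \lambda W^k_{sj} + \lambda g^k_{s+1,j}, && r+1 \le j \le N-1, \tag{3.27}\\
& \partial_t w^k_{ij} = \lambda\Delta_h w^k_{ij} + \lambda G^k_{ij} + \lambda g^k_{ij}, && s+2 \le i \le N-1,\ 1 \le j \le N-1, \tag{3.28}
\end{align}
where $\widehat G^k_{ij}$ is defined as $G^k_{ij}$ but replacing the interface variable $w^k_{\sigma j}$ with the predictive value $\tilde w^k_{\sigma j}$. Obviously, one has
$$
|G^k_{ij}| \le c_L\big(|D_x w^k_{ij}| + |D_y w^k_{ij}| + |E_{xy} w^k_{ij}|\big), \qquad i \ne \sigma-1,\ \sigma+1,\ 1 \le j \le N.
$$
Since the predictive value $\tilde w^k_{\sigma j}$ is utilized in the schemes (3.15), (3.17)–(3.18), (3.20)–(3.21), (3.23), (3.25) and (3.27), we can get
\begin{align*}
|\widehat G^k_{s-2,r}| &\le c_L\big(|D_x w^k_{s-2,r}| + |D_y w^k_{s-2,r}| + |E_{xy} w^k_{s-2,r}|\big) + \tfrac{h}{2} c_L (1 + h/3)\,|W^k_{s-1,r}|, \\
|\widehat G^k_{s-1,j}| &\le c_L\big(|D_x w^k_{s-1,j}| + |D_y w^k_{s-1,j}| + |E_{xy} w^k_{s-1,j}|\big) + \tfrac{h}{2} c_L (1 + h/3)\,|W^k_{sj}|, \qquad 1 \le j \le r-2, \\
|\widehat G^k_{s-1,r-1}| &\le c_L\big(|D_x w^k_{s-1,r-1}| + |D_y w^k_{s-1,r-1}| + |E_{xy} w^k_{s-1,r-1}|\big) + \tfrac{h}{2} c_L (1 + h/3)\big(|W^k_{s-1,r}| + |W^k_{s,r-1}|\big), \\
|\widehat G^k_{s-1,r+1}| &\le c_L\big(|D_x w^k_{s-1,r+1}| + |D_y w^k_{s-1,r+1}| + |E_{xy} w^k_{s-1,r+1}|\big) + \tfrac{h}{2} c_L (1 + h/3)\big(|W^k_{s-1,r}| + |W^k_{s,r+1}|\big), \\
|\widehat G^k_{s-1,j}| &\le c_L\big(|D_x w^k_{s-1,j}| + |D_y w^k_{s-1,j}| + |E_{xy} w^k_{s-1,j}|\big) + \tfrac{h}{2} c_L (1 + h/3)\,|W^k_{sj}|, \qquad r+2 \le j \le N-1, \\
|\widehat G^k_{sr}| &\le c_L\big(|D_x w^k_{sr}| + |D_y w^k_{sr}| + |E_{xy} w^k_{sr}|\big) + \tfrac{h}{2} c_L (1 + h/3)\big(|W^k_{s-1,r}| + |W^k_{s,r-1}| + |W^k_{s,r+1}|\big), \\
|\widehat G^k_{s+1,j}| &\le c_L\big(|D_x w^k_{s+1,j}| + |D_y w^k_{s+1,j}| + |E_{xy} w^k_{s+1,j}|\big) + \tfrac{h}{2} c_L (1 + h/3)\,|W^k_{sj}|, \qquad 1 \le j \le N-1,\ j \ne r.
\end{align*}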

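Applying the $\varepsilon$-inequality, Lemma 1.2 and the equality (3.9), we get
\begin{align*}
P \equiv{}& 2\lambda h^2 \Big[ \sum_{j=1}^{N-1}\sum_{i=1}^{s-3} (\partial_t w^k_{ij}) G^k_{ij}
+ \sum_{j=1}^{r-1} (\partial_t w^k_{s-2,j}) G^k_{s-2,j} + (\partial_t w^k_{s-2,r}) \widehat G^k_{s-2,r} + \sum_{j=r+1}^{N-1} (\partial_t w^k_{s-2,j}) G^k_{s-2,j} \\
& + \sum_{j=1}^{r-1} (\partial_t w^k_{s-1,j}) \widehat G^k_{s-1,j} + (\partial_t w^k_{s-1,r}) G^k_{s-1,r} + \sum_{j=r+1}^{N-1} (\partial_t w^k_{s-1,j}) \widehat G^k_{s-1,j} \\
& + \sum_{j=1}^{r-1} (\partial_t w^k_{sj}) G^k_{sj} + (\partial_t w^k_{sr}) \widehat G^k_{sr} + \sum_{j=r+1}^{N-1} (\partial_t w^k_{sj}) G^k_{sj} \\
& + \sum_{j=1}^{r-1} (\partial_t w^k_{s+1,j}) \widehat G^k_{s+1,j} + (\partial_t w^k_{s+1,r}) G^k_{s+1,r} + \sum_{j=r+1}^{N-1} (\partial_t w^k_{s+1,j}) \widehat G^k_{s+1,j}
+ \sum_{j=1}^{N-1}\sum_{i=s+2}^{N-1} (\partial_t w^k_{ij}) G^k_{ij} \Big]
+ 2\lambda h^2 \sum_{i=1}^{N-1}\sum_{j=1}^{N-1} (\partial_t w^k_{ij}) g^k_{ij} \\
\le{}& \frac{\lambda}{4}\|\partial_t w^k\|^2 + c\lambda |w^k|_1^2 + c\lambda h^3 (1+h) \sum_{j \ne r} |W^k_{sj}|\big(|\partial_t w^k_{s-1,j}| + |\partial_t w^k_{s+1,j}|\big) \\
& + c\lambda h^3 (1+h) |W^k_{s-1,r}|\big(|\partial_t w^k_{s-1,r-1}| + |\partial_t w^k_{s-1,r+1}|\big) + \frac{\lambda}{2}\|\partial_t w^k\|^2 + 2\lambda\|g^k\|^2 \\
={}& \frac{\lambda}{4}\|\partial_t w^k\|^2 + c\lambda |w^k|_1^2 + c\lambda h^3 (1+h) \sum_{j \ne r} |W^k_{sj}|\big(|\partial_t w^k_{s-1,j}| + |\partial_t w^k_{s+1,j}|\big) \\
& + c\lambda h^3 (1+h) \sum_{i \ne s} |W^k_{ir}|\big(|\partial_t w^k_{i,r-1}| + |\partial_t w^k_{i,r+1}|\big) + \frac{\lambda}{2}\|\partial_t w^k\|^2 + 2\lambda\|g^k\|^2 \\
={}& \frac{\lambda}{4}\|\partial_t w^k\|^2 + c\lambda |w^k|_1^2 + c\lambda^2 h (1+h) \sum_{j \ne r} \big|\tau^2 \partial_t \chi^k_{sj} + \tau \hat g^k_{sj}\big|\big(|\partial_t w^k_{s-1,j}| + |\partial_t w^k_{s+1,j}|\big) \\
& + c\lambda^2 h (1+h) \sum_{i \ne s} \big|\tau^2 \partial_t \chi^k_{ir} + \tau \hat g^k_{ir}\big|\big(|\partial_t w^k_{i,r-1}| + |\partial_t w^k_{i,r+1}|\big) + \frac{\lambda}{2}\|\partial_t w^k\|^2 + 2\lambda\|g^k\|^2 \\
\le{}& \frac{3\lambda}{4}\|\partial_t w^k\|^2 + c\lambda |w^k|_1^2 + c\lambda\tau\|\partial_t w^k\|^2 + 4c\lambda\tau^2 |\partial_t w^k|_1^2 + 2\lambda\|g^k\|^2 \\
& + \lambda^3\tau^2 \sum_{j \ne r} (\tau + h^2)(\partial_t \chi^k_{sj})^2 + 2\lambda^3 \sum_{j \ne r} (\tau + h^2)(\hat g^k_{sj})^2
+ \lambda^3\tau^2 \sum_{i \ne s} (\tau + h^2)(\partial_t \chi^k_{ir})^2 + 2\lambda^3 \sum_{i \ne s} (\tau + h^2)(\hat g^k_{ir})^2.
\end{align*}
Suppose that the time-step size is sufficiently small, or $\tau \le 1/(4c)$. It follows that
\begin{align*}
P \le{}& \lambda\|\partial_t w^k\|^2 + \lambda\tau |\partial_t w^k|_1^2 + c\lambda |w^k|_1^2 + 2\lambda\|g^k\|^2
+ \lambda^3\tau^2 \sum_{j \ne r} (\tau + h^2)(\partial_t \chi^k_{sj})^2 + 2\lambda^3 \sum_{j \ne r} (\tau + h^2)(\hat g^k_{sj})^2 \\
& + \lambda^3\tau^2 \sum_{i \ne s} (\tau + h^2)(\partial_t \chi^k_{ir})^2 + 2\lambda^3 \sum_{i \ne s} (\tau + h^2)(\hat g^k_{ir})^2. \tag{3.29}
\end{align*}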

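We also derive that
\begin{align*}
Q \equiv{}& 2\lambda h^2 \sum_{j \ne r} W^k_{sj}\big(\partial_t w^k_{s-1,j} + \partial_t w^k_{s+1,j}\big) + 2\lambda h^2 \big(W^k_{s,r-1} + W^k_{s,r+1}\big)(\partial_t w^k_{sr}) \\
& + 2\lambda h^2 W^k_{s-1,r}\big(\partial_t w^k_{s-1,r-1} + \partial_t w^k_{s-1,r+1}\big) + 2\lambda h^2 W^k_{s-1,r}\big(\partial_t w^k_{s-2,r} + \partial_t w^k_{sr}\big) \\
={}& 2\lambda h^2 \sum_{j \ne r} W^k_{sj}\big(h^2 \partial_t \Delta_h w^k_{sj} + 2\partial_t w^k_{sj}\big) + 2\lambda h^2 W^k_{s-1,r}\big(h^2 \partial_t \Delta_h w^k_{s-1,r} + 2\partial_t w^k_{s-1,r}\big) \\
& - 2\lambda h^4 \sum_{j \ne r} W^k_{sj}(\partial_t \Delta_y w^k_{sj}) + 2\lambda h^2 \big(W^k_{s,r-1} + W^k_{s,r+1}\big)(\partial_t w^k_{sr}) + 4\lambda h^2 W^k_{s-1,r}(\partial_t w^k_{s-1,r}) \\
={}& 2\lambda h^2 \sum_{j \ne r} W^k_{sj}\big(h^2 \partial_t \Delta_h w^k_{sj} + 2\partial_t w^k_{sj}\big) + 2\lambda h^2 \sum_{i \ne s} W^k_{ir}\big(h^2 \partial_t \Delta_h w^k_{ir} + 2\partial_t w^k_{ir}\big) \\
& - 2\lambda h^4 \sum_{j \ne r} (\Delta_y W^k_{sj})(\partial_t w^k_{sj}) - 2\lambda h^4 \sum_{i \ne s} (\Delta_x W^k_{ir})(\partial_t w^k_{ir})
\equiv Q_1 + \overline{Q}_1 + Q_2 + \overline{Q}_2, \tag{3.30}
\end{align*}
where Lemma 1.3 is applied in the last equality. The treatments of $Q_1$ and $\overline{Q}_1$ are quite similar to those of $B_1$ in the proof of Lemma 2.1. Thus we have
\begin{align*}
Q_1 \le{}& \frac{\lambda}{4}\|\partial_t w^k\|^2 - 2\lambda^3\tau^2 \sum_{j \ne r} \partial_t (\chi^k_{sj})^2 + c\lambda\big(|w^k|_1^2 + |w^{k-1}|_1^2\big)
- (2-\lambda)\lambda^2\tau^2 h^2 \sum_{j \ne r} (\partial_t \chi^k_{sj})^2 - \lambda^3\tau^3 \sum_{j \ne r} (\partial_t \chi^k_{sj})^2 \\
& + 4\lambda^3\tau \sum_{j \ne r} (g^k_{sj})^2 + (1+\lambda)\lambda^2 h^2 \sum_{j \ne r} (\hat g^k_{sj})^2 + 16\lambda^3\tau^2 h^{-2} \sum_{j \ne r} (\hat g^k_{sj})^2, \tag{3.31}
\end{align*}
\begin{align*}
\overline{Q}_1 \le{}& \frac{\lambda}{4}\|\partial_t w^k\|^2 - 2\lambda^3\tau^2 \sum_{i \ne s} \partial_t (\chi^k_{ir})^2 + c\lambda\big(|w^k|_1^2 + |w^{k-1}|_1^2\big)
- (2-\lambda)\lambda^2\tau^2 h^2 \sum_{i \ne s} (\partial_t \chi^k_{ir})^2 - \lambda^3\tau^3 \sum_{i \ne s} (\partial_t \chi^k_{ir})^2 \\
& + 4\lambda^3\tau \sum_{i \ne s} (g^k_{ir})^2 + (1+\lambda)\lambda^2 h^2 \sum_{i \ne s} (\hat g^k_{ir})^2 + 16\lambda^3\tau^2 h^{-2} \sum_{i \ne s} (\hat g^k_{ir})^2. \tag{3.32}
\end{align*}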
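The second equality in (3.30) rests on an elementary property of the five-point stencil. The worked step below is a sketch that assumes the usual definitions $\Delta_h = \Delta_x + \Delta_y$ and $\Delta_x v_{ij} = (v_{i-1,j} - 2v_{ij} + v_{i+1,j})/h^2$, applied with $v = \partial_t w^k$.

```latex
% Two horizontal neighbours of an interface point (x_s, y_j), j \ne r:
v_{s-1,j} + v_{s+1,j} = h^2\Delta_x v_{sj} + 2v_{sj}
                      = h^2\Delta_h v_{sj} - h^2\Delta_y v_{sj} + 2v_{sj}.
% All four neighbours of the shifted interface point (x_{s-1}, y_r):
v_{s-1,r-1} + v_{s-1,r+1} + v_{s-2,r} + v_{s,r} = h^2\Delta_h v_{s-1,r} + 4v_{s-1,r}
  = \big(h^2\Delta_h v_{s-1,r} + 2v_{s-1,r}\big) + 2v_{s-1,r}.
```

Multiplying the first line by $2\lambda h^2 W^k_{sj}$ and the second by $2\lambda h^2 W^k_{s-1,r}$ reproduces the terms $-2\lambda h^4 \sum_{j \ne r} W^k_{sj}(\partial_t \Delta_y w^k_{sj})$ and $4\lambda h^2 W^k_{s-1,r}(\partial_t w^k_{s-1,r})$ in the middle expression of (3.30).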

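We apply the equalities (3.9)–(3.10) to find that
\begin{align*}
Q_2 ={}& 2\lambda^2\tau^2 h^2 \sum_{j \ne r} (\partial_t \Delta_y \chi^k_{sj})(\partial_t w^k_{sj}) + 2\lambda^2\tau h^2 \sum_{j \ne r} (\Delta_y \hat g^k_{sj})(\partial_t w^k_{sj}) \\
={}& 2\lambda^3\tau^2 h^2 \sum_{j \ne r} (\partial_t \Delta_y \chi^k_{sj})\chi^k_{sj} + 2\lambda^3\tau^2 h^2 \sum_{j \ne r} (\partial_t \Delta_y \chi^k_{sj}) g^k_{sj} + 2\lambda^2\tau h^2 \sum_{j \ne r} (\Delta_y \hat g^k_{sj})(\partial_t w^k_{sj}) \\
={}& 2\lambda^3\tau^2 h^2 \sum_{j=1}^{N-1} (\partial_t \Delta_y \chi^k_{sj})\chi^k_{sj} + 2\lambda^3\tau^2 h^2 \sum_{j \ne r} (\partial_t \Delta_y \chi^k_{sj}) g^k_{sj} + 2\lambda^2\tau h^2 \sum_{j \ne r} (\Delta_y \hat g^k_{sj})(\partial_t w^k_{sj}).
\end{align*}
Following the treatment of $B_2$ in the proof of Lemma 2.1, we get
$$
Q_2 \le \frac{\lambda}{4}\|\partial_t w^k\|^2 - \lambda^3\tau^2 h^2 \sum_{j=1}^{N} \partial_t \big(\delta_y \chi^k_{s,j-\frac12}\big)^2 + 4\lambda^3\tau \sum_{j \ne r} (g^k_{sj})^2 + 64\lambda^3\tau^2 h^{-2} \sum_{j \ne r} (\hat g^k_{sj})^2. \tag{3.33}
$$
Similarly,
$$
\overline{Q}_2 \le \frac{\lambda}{4}\|\partial_t w^k\|^2 - \lambda^3\tau^2 h^2 \sum_{i=1}^{N} \partial_t \big(\delta_x \chi^k_{i-\frac12,r}\big)^2 + 4\lambda^3\tau \sum_{i \ne s} (g^k_{ir})^2 + 64\lambda^3\tau^2 h^{-2} \sum_{i \ne s} (\hat g^k_{ir})^2. \tag{3.34}
$$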

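Inserting the inequalities (3.31)–(3.32) and (3.33)–(3.34) into (3.30), we get
\begin{align*}
Q \le{}& \lambda\|\partial_t w^k\|^2 + 2c\lambda\big(|w^k|_1^2 + |w^{k-1}|_1^2\big)
- 2\lambda^3\tau^2 \sum_{j \ne r} \partial_t (\chi^k_{sj})^2 - \lambda^3\tau^2 h^2 \sum_{j=1}^{N} \partial_t \big(\delta_y \chi^k_{s,j-\frac12}\big)^2 \\
& - 2\lambda^3\tau^2 \sum_{i \ne s} \partial_t (\chi^k_{ir})^2 - \lambda^3\tau^2 h^2 \sum_{i=1}^{N} \partial_t \big(\delta_x \chi^k_{i-\frac12,r}\big)^2 \\
& - (2-\lambda)\lambda^2\tau^2 h^2 \sum_{j \ne r} (\partial_t \chi^k_{sj})^2 - \lambda^3\tau^3 \sum_{j \ne r} (\partial_t \chi^k_{sj})^2
- (2-\lambda)\lambda^2\tau^2 h^2 \sum_{i \ne s} (\partial_t \chi^k_{ir})^2 - \lambda^3\tau^3 \sum_{i \ne s} (\partial_t \chi^k_{ir})^2 \\
& + 8\lambda^3\tau \sum_{j \ne r} (g^k_{sj})^2 + (1+\lambda)\lambda^2 h^2 \sum_{j \ne r} (\hat g^k_{sj})^2 + 80\lambda^3\tau^2 h^{-2} \sum_{j \ne r} (\hat g^k_{sj})^2 \\
& + 8\lambda^3\tau \sum_{i \ne s} (g^k_{ir})^2 + (1+\lambda)\lambda^2 h^2 \sum_{i \ne s} (\hat g^k_{ir})^2 + 80\lambda^3\tau^2 h^{-2} \sum_{i \ne s} (\hat g^k_{ir})^2. \tag{3.35}
\end{align*}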

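Multiplying the equations (3.13)–(3.28) by $2h^2\partial_t w^k_{ij}$, $2h^2\partial_t w^k_{s-2,j}$, $2h^2\partial_t w^k_{s-2,r}$, $2h^2\partial_t w^k_{s-2,j}$, $2h^2\partial_t w^k_{s-1,j}$, $2h^2\partial_t w^k_{s-1,r-1}$, $2h^2\partial_t w^k_{s-1,r}$, $2h^2\partial_t w^k_{s-1,r+1}$, $2h^2\partial_t w^k_{s-1,j}$, $2h^2\partial_t w^k_{sj}$, $2h^2\partial_t w^k_{sr}$, $2h^2\partial_t w^k_{sj}$, $2h^2\partial_t w^k_{s+1,j}$, $2h^2\partial_t w^k_{s+1,r}$, $2h^2\partial_t w^k_{s+1,j}$ and $2h^2\partial_t w^k_{ij}$ respectively, summing over $i, j$ for $(x_i, y_j) \in \Omega_h$ and then adding up the resulting equalities, we get
$$
2\|\partial_t w^k\|^2 + \lambda\,\partial_t\big(|w^k|_1^2\big) + \lambda\tau |\partial_t w^k|_1^2 = P + Q, \tag{3.36}
$$
where the boundary condition (3.12), the discrete Green formula and Lemma 1.1 have been used. Inserting the inequalities (3.29) and (3.35) into (3.36), we obtain
$$
\partial_t F^k_\lambda \le 2c\big(|w^k|_1^2 + |w^{k-1}|_1^2\big) + 2\|g^k\|^2 + 4\lambda\|g^k\|^2_{\Gamma_x} + 4\lambda\|g^k\|^2_{\Gamma_y}
\le 2c\big(F^k_\lambda + F^{k-1}_\lambda\big) + 2\|g^k\|^2 + 4\lambda\|g^k\|^2_{\Gamma_x} + 4\lambda\|g^k\|^2_{\Gamma_y},
$$
or
$$
(1 - 2c\tau) F^k_\lambda \le (1 + 2c\tau) F^{k-1}_\lambda + 2\tau\|g^k\|^2 + 4\lambda\tau\|g^k\|^2_{\Gamma_x} + 4\lambda\tau\|g^k\|^2_{\Gamma_y}.
$$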

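Taking $\tau_0 = 1/(4c)$ and $\tau \le \tau_0$, the claimed estimate follows immediately.

(II) The treatment of the crossover interfaces $\Gamma = \Gamma_x \cup \Gamma_y$ (see Figure 8) is similar to the proof for the single-interface case, but needs a rather lengthy description. We see no difficulty in getting that
$$
2\|\partial_t w^k\|^2 + \lambda\,\partial_t\big(|w^k|_1^2\big) + \lambda\tau |\partial_t w^k|_1^2 = \overline{P} + \overline{Q}, \tag{3.37}
$$
where $\overline{P}$ is defined analogously to the term $P$, and
\begin{align*}
\overline{Q} \equiv{}& 2\lambda h^2 \sum_{j \ne r} W^k_{sj}\big(\partial_t w^k_{s-1,j} + \partial_t w^k_{s+1,j}\big) + 2\lambda h^2 \big(W^k_{s,r-1} + W^k_{s,r+1}\big)(\partial_t w^k_{sr}) \\
& + 2\lambda h^2 \sum_{i \ne s} W^k_{ir}\big(\partial_t w^k_{i,r-1} + \partial_t w^k_{i,r+1}\big) + 2\lambda h^2 \big(W^k_{s-1,r} + W^k_{s+1,r}\big)(\partial_t w^k_{sr}).
\end{align*}
It is easy to see that
\begin{align*}
\overline{P} \le{}& \lambda\|\partial_t w^k\|^2 + \lambda\tau |\partial_t w^k|_1^2 + c\lambda |w^k|_1^2 + 2\lambda\|g^k\|^2
+ \lambda^3\tau^2 \sum_{j \ne r} (\tau + h^2)(\partial_t \chi^k_{sj})^2 + 2\lambda^3 \sum_{j \ne r} (\tau + h^2)(\hat g^k_{sj})^2 \\
& + \lambda^3\tau^2 \sum_{i \ne s} (\tau + h^2)(\partial_t \chi^k_{ir})^2 + 2\lambda^3 \sum_{i \ne s} (\tau + h^2)(\hat g^k_{ir})^2. \tag{3.38}
\end{align*}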

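With the help of Lemma 1.3, it follows that
\begin{align*}
\overline{Q} ={}& 2\lambda h^2 \sum_{j \ne r} W^k_{sj}\big(h^2 \partial_t \Delta_h w^k_{sj} + 2\partial_t w^k_{sj}\big) - 2\lambda h^4 \sum_{j \ne r} W^k_{sj}(\partial_t \Delta_y w^k_{sj}) \\
& + 2\lambda h^2 \big(W^k_{s,r-1} + W^k_{s,r+1}\big)\partial_t w^k_{sr} + 2\lambda h^2 \big(W^k_{s-1,r} + W^k_{s+1,r}\big)\partial_t w^k_{sr} \\
& - 2\lambda h^4 \sum_{i \ne s} W^k_{ir}(\partial_t \Delta_x w^k_{ir}) + 2\lambda h^2 \sum_{i \ne s} W^k_{ir}\big(h^2 \partial_t \Delta_h w^k_{ir} + 2\partial_t w^k_{ir}\big) \\
={}& 2\lambda h^2 \sum_{j \ne r} W^k_{sj}\big(h^2 \partial_t \Delta_h w^k_{sj} + 2\partial_t w^k_{sj}\big) - 2\lambda h^4 \sum_{j \ne r} (\Delta_y W^k_{sj})(\partial_t w^k_{sj}) \\
& - 2\lambda h^4 \sum_{i \ne s} (\Delta_x W^k_{ir})(\partial_t w^k_{ir}) + 2\lambda h^2 \sum_{i \ne s} W^k_{ir}\big(h^2 \partial_t \Delta_h w^k_{ir} + 2\partial_t w^k_{ir}\big).
\end{align*}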

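We have
\begin{align*}
\overline{Q} \le{}& \lambda\|\partial_t w^k\|^2 + 2c\lambda\big(|w^k|_1^2 + |w^{k-1}|_1^2\big)
- 2\lambda^3\tau^2 \sum_{j \ne r} \partial_t (\chi^k_{sj})^2 - \lambda^3\tau^2 h^2 \sum_{j=1}^{N} \partial_t \big(\delta_y \chi^k_{s,j-\frac12}\big)^2 \\
& - 2\lambda^3\tau^2 \sum_{i \ne s} \partial_t (\chi^k_{ir})^2 - \lambda^3\tau^2 h^2 \sum_{i=1}^{N} \partial_t \big(\delta_x \chi^k_{i-\frac12,r}\big)^2 \\
& - (2-\lambda)\lambda^2\tau^2 h^2 \sum_{j \ne r} (\partial_t \chi^k_{sj})^2 - \lambda^3\tau^3 \sum_{j \ne r} (\partial_t \chi^k_{sj})^2
- (2-\lambda)\lambda^2\tau^2 h^2 \sum_{i \ne s} (\partial_t \chi^k_{ir})^2 - \lambda^3\tau^3 \sum_{i \ne s} (\partial_t \chi^k_{ir})^2 \\
& + 8\lambda^3\tau \sum_{j \ne r} (g^k_{sj})^2 + (1+\lambda)\lambda^2 h^2 \sum_{j \ne r} (\hat g^k_{sj})^2 + 80\lambda^3\tau^2 h^{-2} \sum_{j \ne r} (\hat g^k_{sj})^2 \\
& + 8\lambda^3\tau \sum_{i \ne s} (g^k_{ir})^2 + (1+\lambda)\lambda^2 h^2 \sum_{i \ne s} (\hat g^k_{ir})^2 + 80\lambda^3\tau^2 h^{-2} \sum_{i \ne s} (\hat g^k_{ir})^2. \tag{3.39}
\end{align*}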

Inserting the inequalities (3.38) and (3.39) into (3.37), we obtain
$$
(1 - 2c\tau) F^k_\lambda \le (1 + 2c\tau) F^{k-1}_\lambda + 2\tau\|g^k\|^2 + 4\lambda\tau\|g^k\|^2_{\Gamma_x} + 4\lambda\tau\|g^k\|^2_{\Gamma_y}.
$$
The claimed estimate follows immediately by setting $\tau_0 = 1/(4c)$. This completes the proof.

With the help of Lemma 3.1, arguments similar to those of the previous section yield the following theorems on the solvability, stability and convergence of the CEIDD-CI algorithm (3.2)–(3.5).

Theorem 3.1. Suppose that the assumptions (A1)–(A3) are satisfied and the time-step $\tau$ is sufficiently small. Then the CEIDD-CI algorithm (3.2)–(3.5) is uniquely solvable.

Theorem 3.2. Under the assumptions (A1)–(A3), the CEIDD-CI algorithm (3.2)–(3.5) is unconditionally stable with respect to the initial and external data in the $H^1$ norm (seminorm).

Theorem 3.3. Under the assumptions (A1)–(A3), the solution of the CEIDD-CI algorithm (3.2)–(3.5) converges to the exact solution with an order of $O(\tau + h^2 + \tau^2 h^{-3/2})$ in the $H^1$ norm (seminorm), provided the spatial length $h$ is sufficiently small and the time-step size $\tau = O(h^{3/4+\epsilon})$ for $0 < \epsilon \le 5/4$.

As for the general case, assume that the sequences $\{s_\sigma \mid 1 \le \sigma \le p_1 - 1\}$ and $\{r_\varsigma \mid 1 \le \varsigma \le p_2 - 1\}$ satisfy $4 \le s_\sigma + 2 \le s_{\sigma+1} \le N-2$ and $4 \le r_\varsigma + 2 \le r_{\varsigma+1} \le N-2$. We construct the composite interior interfaces $\Gamma = \Gamma_x \cup \Gamma_y$, where
$$
\Gamma_x = \bigcup_{\sigma=1}^{p_1-1} \{(x_{i_\sigma}, y_j) \mid i_\sigma = s_\sigma - 1 + \mathrm{sgn}(|j - r_\varsigma|),\ 1 \le j \le N-1\},
$$
$$
\Gamma_y = \bigcup_{\varsigma=1}^{p_2-1} \{(x_i, y_{j_\varsigma}) \mid j_\varsigma = r_\varsigma - 1 + \mathrm{sgn}(|i - s_\sigma|),\ 1 \le i \le N-1\}.
$$
Then the domain $\Omega_h$ is decomposed into $p_1 \times p_2$ non-overlapping subdomains, see Figure 9. Under the assumptions (A1)–(A3), the $p_1 \times p_2$-subdomain version of the CEIDD-CI procedure is uniquely solvable, stable and convergent with an order of $O(\tau + h^2 + \sqrt{p_1 + p_2 - 2}\,\tau^2 h^{-3/2})$, provided the time step $\tau = O((p_1 + p_2)^{-1/4} h^{3/4+\epsilon})$ for $0 < \epsilon \le 5/4$.

Figure 9 The composite interfaces $\Gamma$ divide $\Omega_h$ into $p_1 \times p_2$ subdomains
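The general construction can be sketched as index sets. The code below reads the $\mathrm{sgn}$ formula as "use column $s_\sigma$, shifted to $s_\sigma - 1$ on every row $j$ that equals some $r_\varsigma$" (and symmetrically for $\Gamma_y$); this reading, the function name and the sample sequences are assumptions made for illustration.

```python
def composite_interfaces(N, s_seq, r_seq):
    """Index pairs (i, j) of Gamma_x and Gamma_y for a p1 x p2 partition.

    s_seq holds the p1 - 1 column indices s_sigma, r_seq the p2 - 1 row indices r_varsigma.
    """
    s_set, r_set = set(s_seq), set(r_seq)
    # i_sigma = s - 1 + sgn(|j - r|): equal to s - 1 on a crossing row, s elsewhere
    gamma_x = {(s - 1 if j in r_set else s, j) for s in s_seq for j in range(1, N)}
    # j_varsigma = r - 1 + sgn(|i - s|): equal to r - 1 on a crossing column, r elsewhere
    gamma_y = {(i, r - 1 if i in s_set else r) for r in r_seq for i in range(1, N)}
    return gamma_x, gamma_y

N = 16
s_seq, r_seq = [4, 8, 12], [5, 10]      # sample data giving 4 x 3 subdomains
gamma_x, gamma_y = composite_interfaces(N, s_seq, r_seq)
gamma = gamma_x | gamma_y
print(len(gamma_x), len(gamma_y), len(gamma))
print(all((s, r) not in gamma for s in s_seq for r in r_seq))
```

With this reading each nominal cross-point $(x_{s_\sigma}, y_{r_\varsigma})$ is left out of $\Gamma$, while its neighbours $(x_{s_\sigma - 1}, y_{r_\varsigma})$ and $(x_{s_\sigma}, y_{r_\varsigma - 1})$ belong to both $\Gamma_x$ and $\Gamma_y$, which matches the 2×2 case defined by (3.1).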


4 Numerical experiments

In this section, we present some experimental results to examine the stability and accuracy of the CEIDD-CI procedure. The following two nonlinear models are considered:

Model 1. the convection-diffusion (Burgers) problem, $u_t = \Delta u - uu_x - uu_y + g$;
Model 2. the reaction-diffusion (Fisher) system, $u_t = \Delta u + u(1-u)(u - 1/4) + g$;

where $g = g(x, y, t)$ is the exterior force. In all cases, Newton's method with line search is applied to solve the nonlinear difference equations. The linear problem is solved by a preconditioned Bi-CGSTAB method, with an incomplete LU factorization as the preconditioner.

An important aspect of the experiments reported here is to investigate the numerical behavior of the CEIDD-CI algorithm as it relates to strategies of domain partitioning. So, in each set of tests, several scenarios were considered: (i) the backward Euler scheme (listed as Euler in the tables) on the nonpartitioned domain; (ii) the CEIDD-CI method on $p_1 \times p_2$ subdomains (listed as $p_1 \times p_2$ subdomains); (iii) the corrected explicit hopscotch method (CEH, see [18]). We take the backward Euler scheme on the entire nonpartitioned domain as the benchmark for our comparisons since it is the most stable and accurate method, and view the CEH scheme as the negative reference since it has the worst accuracy, $O(\tau + \tau^2 h^{-2})$, among our parallel algorithms.

In the first group of tests, numerical stability was examined with the homogeneous initial-boundary conditions $u^0(x, y) = 0$ and $u_b(x, y, t) = 0$, and the external force $g(x, y, t) = \exp(\pi t)$. In Tables 1 and 2, discrete solutions are obtained by setting $h = 1/64$ and doubling the time step, starting from the minimal size $\tau = 1/480$. The tables list the $L^2$ norm $\|u_h\|$ of the numerical solution $u_h$ at $T = 0.5$. The solutions of the CEIDD-CI algorithm on different domain decompositions, as well as those of the CEH scheme, are bounded, even when $\tau$ is large relative to the spacing $h$. Experimentally, the data in Tables 1 and 2 suggest that the CEIDD-CI algorithm is stable with respect to the discrete mesh and the subdomain partitioning.

Next, we study the numerical error of the CEIDD-CI solution. Choosing the initial-boundary conditions and the forcing function $g = g(x, y, t)$ accordingly, we have the exact solutions:

Model 1. $u(x, y, t) = 1/(1 + e^{\xi})$ with $\xi = (x + y - t)/2$;
Model 2. $u(x, y, t) = e^{\eta}/(1 + e^{\eta})$ with $\eta = (x + y + t/2)/2$.

In Tables 3 and 4, the solution $u$ is approximated on successively halved grids with the coarsest grid $h = 1/16$. The tables list the $L^2$ norm $\|e_h\|$ of the error $e_h = U_h - u_h$ at $T = 0.5$ with the spacing-dependent temporal steps $\tau = 8h^2$, $\tau \approx 4h^{3/2}$ and $\tau = h$. The experimental rate of convergence, in $h$, is computed by observing that $\|e_h\| \approx c h^q$ and doing a least squares fit to determine $q$.

We observe that, on the coarser grids, the implicit Euler solutions have smaller errors than the domain decomposition solutions. As the mesh is further refined, the errors of the CEIDD-CI solutions decrease dramatically, so that on the finest mesh these errors are quantitatively comparable to the fully implicit errors. This is not mysterious; heuristically, one would expect the CEIDD-CI solution to approximate the backward Euler solution as $\tau$ and $h$ approach zero. Since the errors drop so rapidly in the domain decomposition cases, a better rate of convergence than predicted is seen.
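The least squares fit for the experimental rate can be reproduced as follows. This is a minimal sketch, not the authors' script: it fits a straight line to $(\log h, \log \|e_h\|)$ and returns the slope $q$; the function name is hypothetical, and the sample errors are the Euler and 2×2 columns of Table 3 for $\tau = 8h^2$.

```python
import math

def experimental_rate(hs, errors):
    """Least squares slope q of log(error) against log(h), i.e. error ~ c * h^q."""
    xs = [math.log(h) for h in hs]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

hs = [1 / 16, 1 / 32, 1 / 64]
euler = [5.16e-06, 1.28e-06, 3.18e-07]       # Table 3, Euler, tau = 8 h^2
two_by_two = [3.05e-05, 3.04e-06, 2.35e-07]  # Table 3, 2 x 2 subdomains, tau = 8 h^2
print(f"Euler: q = {experimental_rate(hs, euler):.2f}")
print(f"2 x 2: q = {experimental_rate(hs, two_by_two):.2f}")
```

For three grids whose spacings are halved each time, the fitted slope coincides with the slope between the coarsest and finest grid, so the middle entry does not influence $q$.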

Table 1 Stability to the external force for Model 1

τ                   1/480     1/240     1/120     1/60      1/30
Euler               0.1715    0.1715    0.1717    0.1720    0.1726
2 × 2 subdomains    0.1713    0.1709    0.1694    0.1563    0.1779
2 × 4 subdomains    0.1712    0.1706    0.1682    0.1482    0.2015
4 × 4 subdomains    0.1711    0.1703    0.1667    0.1453    0.2324
CEH                 0.1689    0.1527    0.1733    0.3064    0.3912

Table 2 Stability to the external force for Model 2

τ                   1/480     1/240     1/120     1/60      1/30
Euler               0.1712    0.1713    0.1714    0.1717    0.1723
2 × 2 subdomains    0.1710    0.1707    0.1691    0.1558    0.1774
2 × 4 subdomains    0.1709    0.1703    0.1679    0.1477    0.2012
4 × 4 subdomains    0.1709    0.1700    0.1664    0.1447    0.2322
CEH                 0.1686    0.1523    0.1729    0.3063    0.3912

Table 3 Convergence in h for Model 1

τ        h        Euler       2 × 2 Subdomains   2 × 4 Subdomains   4 × 4 Subdomains   CEH
1/32     1/16     5.16e-06    3.05e-05           6.29e-05           1.05e-04           3.02e-04
1/128    1/32     1.28e-06    3.04e-06           6.00e-06           9.03e-06           3.72e-05
1/512    1/64     3.18e-07    2.35e-07           5.88e-07           9.62e-07           1.05e-05
Experimental rate 2.01        3.51               3.37               3.38               2.42

1/16     1/16     1.08e-05    2.13e-04           3.67e-04           5.05e-04           8.08e-04
1/46     1/32     3.72e-06    2.90e-05           6.07e-05           1.03e-04           6.03e-04
1/128    1/64     1.34e-06    7.30e-06           1.32e-05           1.89e-05           3.42e-04
Experimental rate 1.51        2.43               2.40               2.37               0.62

1/16     1/16     1.08e-05    2.13e-04           3.67e-04           5.05e-04           8.08e-04
1/32     1/32     5.40e-06    9.73e-05           1.93e-04           2.95e-04           8.74e-04
1/64     1/64     2.70e-06    2.98e-05           6.44e-05           1.11e-04           9.07e-04
Experimental rate 0.99        1.42               1.26               1.09               -0.08

Table 4 Convergence in h for Model 2

τ        h        Euler       2 × 2 Subdomains   2 × 4 Subdomains   4 × 4 Subdomains   CEH
1/32     1/16     2.57e-06    1.43e-05           2.80e-05           4.40e-05           1.09e-04
1/128    1/32     6.47e-07    1.50e-06           2.95e-06           4.43e-06           1.96e-05
1/512    1/64     1.62e-07    1.14e-07           2.91e-07           4.76e-07           5.22e-06
Experimental rate 2.00        3.49               3.29               3.26               2.19

1/16     1/16     5.23e-06    7.51e-05           1.26e-04           1.68e-04           2.59e-04
1/46     1/32     1.84e-06    1.44e-05           2.80e-05           4.42e-05           2.01e-04
1/128    1/64     6.67e-07    3.62e-06           6.54e-06           9.41e-06           1.24e-04
Experimental rate 1.49        2.19               2.13               2.08               0.53

1/16     1/16     5.23e-06    7.51e-05           1.26e-04           1.68e-04           2.59e-04
1/32     1/32     2.66e-06    3.90e-05           7.24e-05           1.06e-04           2.79e-04
1/64     1/64     1.34e-06    1.54e-05           3.00e-05           4.77e-05           2.88e-04
Experimental rate 0.98        1.14               1.03               0.91               -0.08


5 Concluding remarks

A class of CEIDD algorithms is studied for parallel approximation of two-dimensional semilinear parabolic problems. Various interior boundaries and strategies of data partitioning were considered at the discretization level. The basic CEIDD-SI approach has the advantage of easy parallel implementation. With the five-point star stencils for spatial discretization, the CEIDD-CI procedure is a scalable parabolic solver in the sense that it has good efficiency including flexibility of load balance, localization of data communication and facility of parallel implementation.

Moreover, the scalable CEIDD-CI algorithm can also be extended to three-dimensional parabolic equations. Considering the sequences $\{s_\sigma \mid 1 \le \sigma \le p_1 - 1\}$, $\{r_\varsigma \mid 1 \le \varsigma \le p_2 - 1\}$ and $\{q_\kappa \mid 1 \le \kappa \le p_3 - 1\}$, we construct the composite-plane interfaces $\Gamma = \Gamma_x \cup \Gamma_y \cup \Gamma_z$ with
$$
\Gamma_x = \bigcup_{\sigma=1}^{p_1-1} \{(x_{i_\sigma}, y_j, z_l) \mid i_\sigma = s_\sigma - 1 + \mathrm{sgn}(|j - r_\varsigma| + |l - q_\kappa|)\},
$$
$$
\Gamma_y = \bigcup_{\varsigma=1}^{p_2-1} \{(x_i, y_{j_\varsigma}, z_l) \mid j_\varsigma = r_\varsigma - 1 + \mathrm{sgn}(|i - s_\sigma| + |l - q_\kappa|)\},
$$
$$
\Gamma_z = \bigcup_{\kappa=1}^{p_3-1} \{(x_i, y_j, z_{l_\kappa}) \mid l_\kappa = q_\kappa - 1 + \mathrm{sgn}(|i - s_\sigma| + |j - r_\varsigma|)\}.
$$
Then the interfaces $\Gamma$ divide the three-dimensional grid $\Omega_h$ into $p_1 \times p_2 \times p_3$ non-overlapping subdomains. On coarse-grain parallel machines, the CEIDD-CI algorithm is convergent with an order of $O(\tau + h^2 + (p_1 + p_2 + p_3 - 3)^{1/2} \tau^2 h^{-3/2})$.

By combining the techniques developed in [18] with the present analysis, it is not difficult to verify that the theoretical results of the CEIDD algorithms are valid for the parabolic equation with variable coefficients,
$$
\frac{\partial u}{\partial t} = \frac{\partial}{\partial x}\Big(a_1(x, y, t)\frac{\partial u}{\partial x}\Big) + \frac{\partial}{\partial y}\Big(a_2(x, y, t)\frac{\partial u}{\partial y}\Big) + f\Big(\frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, u, x, y, t\Big).
$$
But, up to now, we are not able to prove the unconditional stability of our algorithms for parallel approximation of the quasilinear equations,
$$
\frac{\partial u}{\partial t} = \frac{\partial}{\partial x}\Big(a_1(u)\frac{\partial u}{\partial x}\Big) + \frac{\partial}{\partial y}\Big(a_2(u)\frac{\partial u}{\partial y}\Big) + f(x, y, t).
$$
The CEIDD method considered here is devoted to parallelizing the implicit Euler scheme. It seems that, by employing proper explicit schemes to compute interface predictor values, we can apply the CEIDD approach to parallelize higher-order implicit time-stepping schemes such as the well-known Crank-Nicolson scheme. Future work is planned to extend our approach to higher-order methods.

Acknowledgements We would like to express our thanks to the reviewer whose valuable comments and suggestions helped greatly to improve this article.

References

1 Bjørstad P, Luskin M. Parallel Solution of Partial Differential Equations. New York: Springer-Verlag, 2000

2 Toselli A, Widlund O. Domain Decomposition Methods: Algorithms and Theory. New York: Springer-Verlag, 2005
3 Yuan G W, Shen L J, Zhou Y L. Unconditional stability of alternating difference schemes with intrinsic parallelism for two-dimensional parabolic systems. Numerical Methods for Partial Differential Equations, 15: 625–636 (1999)
4 Zhou Y L. General finite difference schemes with intrinsic parallelism for nonlinear parabolic systems. Science in China Series A: Mathematics, 40: 357–365 (1997)
5 Zhou Y L, Shen L J, Yuan G W. Some practical difference schemes with intrinsic parallelism for nonlinear parabolic systems. Chinese Journal of Numerical Mathematics and Applications, 19(3): 46–57 (1997)
6 Zhou Y L, Yuan G W. General difference schemes with intrinsic parallelism for semilinear parabolic systems of divergence type. Journal of Computational Mathematics, 17: 337–352 (1999)
7 Lu J F, Zhang B L. Alternating group explicit (AGE) method for solving the two-dimensional convection diffusion equation. Annual Report of LCP, Beijing: IAPCM, 1995, 105–119
8 Zhang B L, Su X. Alternating block explicit-implicit method for two-dimensional diffusion equation. International Journal of Computer Mathematics, 38: 241–255 (1991)
9 Zhang B L. Difference graphs of block ADI method. SIAM Journal on Numerical Analysis, 38: 742–752 (2000)
10 Sheng Z Q, Yuan G W, Hang X D. Unconditional stability of parallel difference schemes with second order accuracy for parabolic equation. Applied Mathematics and Computation, 184: 1015–1031 (2007)
11 Yuan G W, Hang X D, Sheng Z Q. Parallel difference schemes with interface extrapolation terms for quasi-linear parabolic systems. Science in China Series A: Mathematics, 50(2): 253–275 (2007)
12 Kuznetsov Y. New algorithms for approximate realization of implicit difference scheme. Soviet Journal of Numerical Analysis and Mathematical Modelling, 3: 99–114 (1988)
13 Dawson C N, Du Q, Dupont T F. A finite difference domain decomposition algorithm for numerical solution of the heat equation. Mathematics of Computation, 57(195): 63–71 (1991)
14 Dawson C N, Dupont T F. Explicit/implicit, conservative domain decomposition procedures for parabolic problems based on block-centered finite differences. SIAM Journal on Numerical Analysis, 31: 1045–1061 (1994)
15 Du Q, Mu M, Wu Z N. Efficient parallel algorithms for parabolic problems. SIAM Journal on Numerical Analysis, 39(5): 1469–1487 (2001)
16 Lapin A, Pieskä J. On the parallel domain decomposition algorithms for time-dependent problems. Lobachevskii Journal of Mathematics, 10: 27–44 (2002)
17 Yuan G W, Sheng Z Q, Hang X D. The unconditional stability of parallel difference schemes with second order convergence for nonlinear parabolic system. Journal of Partial Differential Equations, 20: 45–64 (2007)
18 Shi H S, Liao H L. Unconditional stability of corrected explicit-implicit domain decomposition algorithms for parallel approximation of heat equations. SIAM Journal on Numerical Analysis, 44(4): 1584–1611 (2006)
19 Zhuang Y, Sun X H. Stabilized explicit-implicit domain decomposition methods for the numerical solution of parabolic equations. SIAM Journal on Scientific Computing, 24(1): 335–358 (2002)
20 Rivera W, Zhu J P, Huddleston D. An efficient parallel algorithm with application to computational fluid dynamics. Computers & Mathematics with Applications, 45: 165–188 (2003)
21 Zhou Y L. Applications of Discrete Functional Analysis to the Finite Difference Method. Beijing: International Academic Publishers, 1990
22 Gilbarg D, Trudinger N S. Elliptic Partial Differential Equations of Second Order. Reprint of the 1998 edition. New York: Springer, 2001
