DECAY BOUNDS FOR FUNCTIONS OF BANDED NON-HERMITIAN MATRICES∗
arXiv:1605.01595v1 [math.NA] 5 May 2016
STEFANO POZZA† AND VALERIA SIMONCINI‡
Abstract. The derivation of a-priori decay bounds for the entries of functions of banded matrices is of interest in a variety of applications. While decay bounds for functions of Hermitian banded matrices have been known for some time, the non-Hermitian case is an especially challenging setting. By using Faber polynomial series we first explore the bounds obtainable by extending results for Hermitian matrices to banded non-Hermitian (not necessarily diagonalizable) matrices. We then describe the limitations of this approach in capturing the true off-diagonal decay whenever analyticity constraints hold for the given function. Hence, we use rational functions to derive new bounds that more accurately describe the decay pattern of Cauchy-Stieltjes functions of non-Hermitian banded matrices. Numerical experiments illustrate the quality of all new bounds.

Key words. Banded matrices. Faber polynomials. Faber-Dzhrbashyan rational functions. Decay bounds. Matrix functions.

AMS subject classifications. 15A16, 65F50, 30E10, 65E05.
1. Introduction. Matrix functions have arisen as a reliable and computationally attractive tool for solving a large variety of application problems; we refer the reader to [19] for a thorough discussion and references. The analysis of their properties and structure has recently attracted the interest of many practitioners. In particular, for a given square banded matrix $A$, the entries of the matrix function $f(A)$, for a sufficiently regular function $f$, are characterized by a (typically exponential) decay pattern as they move away from the main diagonal. This phenomenon has been known for a long time, and it is at the basis of approximations and estimation strategies in many fields, from signal processing to quantum dynamics and multivariate statistics; see, e.g., [3, 4, 7] and their references. The interest in a-priori estimates that can accurately predict the decay rate of matrix functions has significantly grown in the past decades, and it has mainly focused on Hermitian matrices [13, 16, 25, 5, 32, 7, 12, 9]; the inverse and exponential functions have been given particular attention, due to their relevance in numerical analysis and other fields. Upper bounds usually take the form
\[ |(f(A))_{k,\ell}| \le c\,\rho^{|k-\ell|}, \tag{1.1} \]
where $\rho \in (0,1)$ and $c$ both depend on the spectral properties of $A$ and on the domain of $f$, while $\rho$ also strongly depends on the bandwidth of $A$. The analysis of the decay pattern for banded non-Hermitian $A$ is significantly harder, especially for non-normal matrices. In [6] Benzi and Razouk addressed this challenging case for diagonalizable matrices. They developed a bound of the type (1.1), where $c$ also contains the eigenvector matrix condition number. In [22] the authors derive several qualitative bounds, mostly under the assumption that $A$ is diagonally dominant. The exponential function provides a special setting, which has been explored in [20] and very recently by Wang in his PhD thesis [30]. In these last three works, and also in our approach, bounds on the decay pattern of banded non-Hermitian matrices are derived that avoid the explicit reference to the possibly large condition number of the eigenvector matrix. Specialized off-diagonal decay results have been obtained for certain normal matrices, see, e.g., [18, 12], and for analytic functions of banded matrices over $C^*$-algebras [3].

∗ Version of May 4, 2016. This work was partially supported by the FARB12SIMO grant, Università di Bologna.
† Dipartimento di Matematica, Università di Bologna, Piazza di Porta San Donato 5, I-40127 Bologna, Italy ([email protected])
‡ Dipartimento di Matematica, Università di Bologna, Piazza di Porta San Donato 5, I-40127 Bologna, Italy ([email protected]), and IMATI-CNR, Pavia.

Starting with the pioneering work [13], most estimates for the decay behavior of the entries have relied on Chebyshev and Faber polynomials as a technical tool, mainly for two reasons. Firstly, polynomials of banded matrices are again banded matrices, although the bandwidth increases with the polynomial degree. Secondly, sufficiently regular matrix functions can be written in terms of Chebyshev and Faber series, whose polynomial truncations enjoy nice approximation properties for a large class of matrices, from which an accurate description of the matrix function entries can be deduced.

In the first part of this paper we follow a similar reasoning: we use Faber polynomials to obtain new bounds for functions that are analytic on the field of values of $A$, where $A$ is a general non-Hermitian (not necessarily diagonalizable or diagonally dominant) matrix; see section 2 for the definition of field of values. The new estimates are able to capture the true decay pattern of matrix functions for a large class of functions, such as strictly monotonic functions and Laplace-Stieltjes functions. We then illustrate the limitations of this way of proceeding: the quality of the determined bounds degrades whenever the field of values reaches the analyticity region boundary of the considered function. This problem is associated with the convergence properties of Faber polynomials, and it does not reflect the actual off-diagonal decay pattern of the given matrix functions.
In the second part of the paper we thus explore the use of Faber-Dzhrbashyan rational functions that allow faster converging series expansions, and we adapt complex analysis tools for their treatment; while doing so, we derive new characterizations of series expansions by means of Faber-Dzhrbashyan rational functions. The new technique allows us to obtain more favourable bounds than those obtainable with Faber polynomials, for the class of Markov functions. Throughout the paper numerical experiments illustrate the quality of the new bounds. Our decay results reveal a novel application of these powerful rational functions, which have recently been used in the convergence analysis of rational Krylov subspaces for matrix function and matrix equation approximations; see, e.g., [2, 15, 21].

The paper is organized as follows. Section 2 introduces some basic definitions and properties. In section 3 we use Faber polynomials to give a bound that can be adapted to approximate the entries of several matrix functions; as a sample we consider the functions $A^{-1/2}$ and $e^A$. Then we use the result for the exponential function to obtain bounds for strictly monotonic functions of banded matrices (Subsection 3.1) and of Kronecker sums of banded matrices (Subsection 3.2). In section 4 we use the Faber-Dzhrbashyan rational functions to obtain a better bound for some cases in which the one given by Faber polynomials is too pessimistic. We conclude with some remarks in section 5 and with the proofs of some technical lemmas in Appendix A.

All our numerical experiments were performed using Matlab (R2013b) [23]. In all our experiments, the computation of the field of values employed the code in [10].

2. Preliminaries. We begin by recalling the definition of matrix function and some of its properties. Matrix functions can be defined in several ways (see [19, section 1]). For our presentation it is helpful to introduce the definition that employs the Cauchy integral formula.
Definition 2.1. Let $A \in \mathbb{C}^{n\times n}$ and let $f$ be an analytic function on some open set $\Omega \subset \mathbb{C}$. Then
\[ f(A) = \frac{1}{2\pi i}\int_{\Gamma} f(z)\,(zI - A)^{-1}\,dz, \]
with $\Gamma \subset \Omega$ a system of Jordan curves encircling each eigenvalue of $A$ exactly once, with mathematical positive orientation.

When $f$ is analytic, Definition 2.1 is equivalent to other common definitions; see [27, section 2.3]. In what follows we will also consider certain matrix functions defined by integral measure transforms, e.g., completely monotonic functions and Markov functions. We will introduce these definitions in later sections.

For $v \in \mathbb{C}^n$ we denote with $\|v\|$ the Euclidean vector norm, and for any matrix $A \in \mathbb{C}^{n\times n}$, with $\|A\|$ the induced matrix norm, that is, $\|A\| = \sup_{\|v\|=1}\|Av\|$. $\mathbb{C}^+$ denotes the open right-half complex plane. Moreover, we recall that the field of values of $A$ is defined as the set $W(A) = \{v^*Av \mid v \in \mathbb{C}^n, \|v\| = 1\}$, where $v^*$ is the conjugate transpose of $v$. We remark that the field of values of a matrix is a bounded convex subset of $\mathbb{C}$.

In the following sections we need a bound on the norm of a matrix function. As proved by Crouzeix in [11], if $A$ is a matrix with field of values $W(A)$, then for any function $f$ in the Banach algebra of the functions analytic in the interior of $W(A)$ it holds
\[ \|f(A)\| \le C \sup_{w\in W(A)} |f(w)|, \tag{2.1} \]
with $C = 11.08$; Crouzeix also conjectured that the smaller value $C = 2$ suffices. Notice that for some functions it does hold that $C = 2$.

In the following we approximate the field of values of a matrix $A$ with a subset $E \subset \mathbb{C}$, such that $W(A) \subseteq E$. Unless explicitly stated, $E$ does not need to be symmetric with respect to the real axis. If $E$ is a continuum (i.e., a non-empty, compact and connected subset of $\mathbb{C}$) with a connected complement, then by Riemann's mapping theorem there exists a function $\phi$ that maps the exterior of $E$ conformally onto the exterior of the unit disk $\{|z| \le 1\}$. Subsets of the kind of $E$, the associated conformal maps $\phi$, and their inverses $\psi$ play a key role in the definition of Faber polynomials and Faber-Dzhrbashyan functions, which are a main tool in our analysis. For this reason from now on the notations $E$, $\phi$, and $\psi$ will be reserved to the objects defined above.

For the sake of simplicity and without loss of generality, our numerical experiments will use Toeplitz matrices, which are constant along their diagonals. These matrices allow us to explore a large variety of spectral scenarios and non-normality properties, while providing a fully replicable experimental framework.

The $(k,\ell)$ element of a matrix $A$ will be denoted by $(A)_{k,\ell}$. The set of banded matrices is defined as follows.

Definition 2.2. The notation $B_n(\beta,\gamma)$ defines the set of banded matrices $A \in \mathbb{C}^{n\times n}$ with upper bandwidth $\beta \ge 0$ and lower bandwidth $\gamma \ge 0$, i.e., $(A)_{k,\ell} = 0$ for $\ell - k > \beta$ or $k - \ell > \gamma$.

We observe that if $A \in B_n(\beta,\gamma)$ with $\beta, \gamma \ne 0$, then for
\[ \xi := \begin{cases} \lceil (\ell-k)/\beta \rceil, & \text{if } k < \ell, \\ \lceil (k-\ell)/\gamma \rceil, & \text{if } k \ge \ell, \end{cases} \tag{2.2} \]
it holds that
\[ (A^m)_{k,\ell} = 0 \quad \text{for every } m < \xi. \tag{2.3} \]
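The vanishing pattern (2.3) can be checked on a small example. The following Python sketch is only illustrative: the helper names `xi`, `banded`, `matmul` and `check_power_pattern`, as well as the all-ones test matrix, are assumptions introduced here and do not come from the paper.

```python
def xi(k, l, beta, gamma):
    """Index distance (2.2) for 1-based row k and column l (exact integer ceiling)."""
    if k < l:
        return -(-(l - k) // beta)
    return -(-(k - l) // gamma)


def banded(n, beta, gamma):
    """Hypothetical test matrix: every entry inside the band equals 1."""
    return [[1 if -gamma <= l - k <= beta else 0 for l in range(n)]
            for k in range(n)]


def matmul(X, Y):
    """Plain dense matrix product for square lists of lists."""
    n = len(X)
    return [[sum(X[i][p] * Y[p][j] for p in range(n)) for j in range(n)]
            for i in range(n)]


def check_power_pattern(n, beta, gamma):
    """Return True if (A^m)_{k,l} == 0 whenever m < xi(k, l), for m = 0..n-1."""
    A = banded(n, beta, gamma)
    P = [[int(i == j) for j in range(n)] for i in range(n)]  # A^0
    for m in range(n):
        for k in range(n):
            for l in range(n):
                if m < xi(k + 1, l + 1, beta, gamma) and P[k][l] != 0:
                    return False
        P = matmul(P, A)  # move on to A^(m+1)
    return True


print(check_power_pattern(12, 2, 1))
```

Integer arithmetic is used on purpose, so that an entry counted as zero is exactly zero and not merely small.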
This characterization of banded matrices is a classical fundamental tool to prove the decay property of matrix functions, as sufficiently regular functions can be expanded in power series. Since we are interested in nontrivial banded matrices, in the following we shall assume that both $\beta$ and $\gamma$ are nonzero.

Remark 2.3. All our decay bounds describe the influence of the upper and lower bandwidths on the off-diagonal entry decay pattern. Numerical evidence indicates that the decay rate may also be influenced by the magnitude of the matrix nonzero entries, and in particular by the different magnitude of the elements in the upper and lower parts of the matrix. This property may determine a different decay rate for the two sides of the main diagonal, even for equal bandwidths $\beta$ and $\gamma$. This asymmetry in the entry magnitude is partially accounted for by the shape of the field of values, and thus by our bounds. As a result, however, our estimates capture the slowest off-diagonal decay between the two sides.

Relation (2.3) can be extended to the case of a sparse, not necessarily banded, matrix $A$. Indeed, following the approach in [6] we can define the graph $G(A)$ describing the nonzero pattern of $A$, i.e., $G(A)$ is such that the vertex set of $G(A)$ consists of the indexes of the matrix $1, \ldots, n$ and an edge $(k,\ell)$ is part of the graph if and only if $A_{k,\ell} \ne 0$. It thus follows that our analysis still holds if $\xi$ is replaced by $d(k,\ell)$, the geodesic distance, i.e., the length of the shortest path between the nodes $\ell$ and $k$.

3. Decay bounds for analytic functions by Faber polynomials expansion. Faber polynomials extend the theory of power series to sets different from the disk, and can be effectively used to bound the entries of matrix functions. Let $E$ be a continuum with connected complement, and let us consider the associated conformal map $\phi$ satisfying the following conditions
\[ \phi(\infty) = \infty, \qquad \lim_{z\to\infty} \frac{\phi(z)}{z} = 1. \]
Hence, $\phi$ can be expressed by a Laurent expansion
\[ \phi(z) = z + a_0 + \frac{a_1}{z} + \frac{a_2}{z^2} + \cdots. \]
Furthermore, for every $n > 0$ we have
\[ (\phi(z))^n = z^n + a^{(n)}_{n-1} z^{n-1} + \cdots + a^{(n)}_0 + \frac{a^{(n)}_{-1}}{z} + \frac{a^{(n)}_{-2}}{z^2} + \cdots. \]
Then, the Faber polynomial for the domain $E$ is defined by (see, e.g., [28])
\[ \Phi_n(z) = z^n + a^{(n)}_{n-1} z^{n-1} + \cdots + a^{(n)}_0, \quad \text{for } n \ge 0. \]
Assume that $\Gamma$, the boundary of $E$, is a regular analytic curve. If $f$ is analytic on $E$ then it can be expanded in a series of Faber polynomials for $E$, that is,
\[ f(z) = \sum_{j=0}^{\infty} f_j \Phi_j(z), \quad \text{for } z \in E; \]
see [28, Theorem 2, p. 52]. Moreover, if the spectrum of $A$ is contained in $E$ and $f$ is a function analytic in $E$, then the matrix function $f(A)$ can be expanded as follows (see, e.g., [28, p. 272])
\[ f(A) = \sum_{j=0}^{\infty} f_j \Phi_j(A). \]
By properly extending the set $E$, this expansion allows us to establish a first intermediate result.

Theorem 3.1. Let $A \in B_n(\beta,\gamma)$ with field of values contained in a continuum $E$ with connected complement. Moreover, let $f(z) = \sum_{j=0}^{\infty} f_j\Phi_j(z)$ be the Faber expansion of $f$, in which $\Phi_j$ are Faber polynomials for $E$. Then
\[ |(f(A))_{k,\ell}| \le 2\sum_{j=\xi}^{\infty} |f_j|, \]
for $k \ne \ell$ and $\xi$ defined by (2.2).

Proof. Let us consider super-diagonal elements $(f(A))_{k,\ell}$ (for the sub-diagonal elements the proof is the same). By (2.3), $e_k^T p_{m-1}(A) e_\ell = 0$ for every polynomial $p_{m-1}$ of degree at most $m-1$, where $m = \xi$ is the smallest integer such that $m \ge |\ell-k|/\beta$. Hence, we get
\[ (f(A))_{k,\ell} = \sum_{j=0}^{\infty} f_j (\Phi_j(A))_{k,\ell} = \sum_{j=m}^{\infty} f_j (\Phi_j(A))_{k,\ell}, \]
and thus $|(f(A))_{k,\ell}| = |e_k^T f(A) e_\ell| \le \sum_{j=m}^{\infty} |f_j|\,\|\Phi_j(A)\|$. We conclude the proof using the inequality $\|\Phi_j(A)\| \le 2$ proved in [1].

The approach used in the proof of Theorem 3.1, based on the expansion of $f(A)$ in a series of polynomials of $A$, will be useful again in section 4 where we will consider rational functions instead of Faber polynomials. Notice that in [30, Theorem 3.8] a similar bound for the exponential function is derived in a different way. In [22] an analogous result is discussed, although our presentation is more complete and the proof different. By using Theorem 3.1 we can give general decay bounds for a large class of matrix functions.

Theorem 3.2. Let $A \in B_n(\beta,\gamma)$ with field of values contained in a continuum $E$ with connected complement whose boundary is a regular analytic curve $\Gamma$. Moreover, let $\phi$ be the conformal mapping of $E$, $\psi$ be its inverse and $G_\tau = \{w : |\phi(w)| < \tau\}$. Let us assume that $\tau > 1$, $f$ is analytic in $G_\tau$ and $f$ is bounded on $\Gamma_\tau$, the boundary of $G_\tau$. Then
\[ |(f(A))_{k,\ell}| \le 2\,\frac{\tau}{\tau-1}\,\max_{|z|=\tau}|f(\psi(z))| \left(\frac{1}{\tau}\right)^{\xi}, \]
with $\xi$ defined by (2.2).
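Proof. From Theorem 3.1, $|(f(A))_{k,\ell}| \le 2\sum_{j=\xi}^{\infty}|f_j|$, where the Faber coefficients $f_j$ are given by (see, e.g., [28, chapter III, Theorem 1])
\[ f_j = \frac{1}{2\pi i}\int_{|z|=\tau} \frac{f(\psi(z))}{z^{j+1}}\,dz. \]
Hence, $|f_j| \le \frac{1}{\tau^{j}}\max_{|z|=\tau}|f(\psi(z))|$. Therefore,
\[ |(f(A))_{k,\ell}| \le 2\max_{|z|=\tau}|f(\psi(z))| \sum_{j=\xi}^{\infty}\left(\frac{1}{\tau}\right)^{j} = 2\max_{|z|=\tau}|f(\psi(z))| \left(\frac{1}{\tau}\right)^{\xi}\sum_{j=0}^{\infty}\left(\frac{1}{\tau}\right)^{j} = 2\,\frac{\tau}{\tau-1}\,\max_{|z|=\tau}|f(\psi(z))| \left(\frac{1}{\tau}\right)^{\xi}. \]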
We notice the similarity of this theorem with the one given in [17, Corollary 2.2] on the approximation error of analytic functions in terms of partial Faber series.

Remark 3.3. The bound of Theorem 3.2 has similarities with the one obtained by Benzi and Razouk in [6, Theorem 3.5]. Consider a diagonalizable matrix $A \in B_n(\beta,\gamma)$ whose spectrum is contained in a continuum $F$ with connected complement, and a function $f$ analytic in $\{\psi(z) : |z| < \tau\}$, with $\tau > 1$ and $\psi$ the inverse of the conformal map of $F$. Moreover let $\kappa(X) = \|X\|\,\|X^{-1}\|$ be the spectral condition number of the matrix $X$ of eigenvectors of $A$. For a sufficiently large $\xi$ and for every $\varepsilon > 0$ we can rewrite the bound in Theorem 3.5 of [6] as
\[ |(f(A))_{k,\ell}| \lesssim \frac{3}{2}\,\kappa(X)\,\frac{1}{1-(q+\varepsilon)}\,\max_{|z| \,[\ldots]}|f(\psi(z))|\,(q+\varepsilon)^{\xi}, \tag{3.1} \]
[...] $(q+\varepsilon)^{-1}$. In this bound $F$ needs to contain the spectrum of $A$, so it may be smaller than the set $E$ we considered in Theorem 3.2 (which must contain $W(A)$). Hence, the value of $\tau$ may be allowed to be greater than the one in our bound. On the other hand, the bound (3.1) contains the factor $\kappa(X)$ which can be enormous. When $A$ is a normal matrix, i.e., $AA^* = A^*A$, the two bounds have a similar rate. Indeed, in this case the convex hull of the spectrum is equal to the field of values and $\kappa(X) = 1$. However, in the non-normal case the two bounds can significantly differ. In particular, $\kappa(X)$ can be huge even when $W(A)$ is not much bigger than the spectrum. This can consistently be appreciated in our numerical experiments, where we report $\kappa(X)$ for completeness. For this reason, the new bound of Theorem 3.2 turns out to always be more descriptive than (3.1).

The choice of $\tau$ in Theorem 3.2, and thus the sharpness of the derived estimate, depends on the trade-off between the possibly large size of $f$ on the given region and the exponential decay of $(1/\tau)^{\xi}$, and it is thus problem dependent. As an example, we apply Theorem 3.2 to the approximation of two functions: the exponential and the inverse square root.
Corollary 3.4. Let $A \in B_n(\beta,\gamma)$ with field of values contained in a closed set $E$ whose boundary is a horizontal ellipse with semi-axes $a \ge b > 0$ and center $c = c_1 + ic_2 \in \mathbb{C}$, $c_1, c_2 \in \mathbb{R}$. Then
\[ |(e^{A})_{k,\ell}| \le 2e^{c_1}\,\frac{\xi+\sqrt{\xi^2+a^2-b^2}}{\xi+\sqrt{\xi^2+a^2-b^2}-(a+b)} \left(\frac{a+b}{\xi}\,\frac{e^{q(\xi)}}{1+\sqrt{1+(a^2-b^2)/\xi^2}}\right)^{\xi}, \]
for $\xi > b$, with
\[ q(\xi) = 1 + \frac{a^2-b^2}{\xi^2+\xi\sqrt{\xi^2+a^2-b^2}}, \]
and $\xi$ as in (2.2).
Before we prove this result, we notice that for $\xi$ large enough, the decay rate is of the form $((a+b)/(2\xi))^{\xi}$, that is, the decay is super-exponential.

Proof. Let $\rho = \sqrt{a^2-b^2}$ be the distance between the foci and the center, and $R = (a+b)/\rho$. Then a conformal map for $E$ is
\[ \phi(w) = \frac{w - c - \sqrt{(w-c)^2-\rho^2}}{\rho R}, \]
and its inverse is
\[ \psi(z) = \frac{\rho}{2}\left(Rz + \frac{1}{Rz}\right) + c, \tag{3.2} \]
see, e.g., [28, chapter II, Example 3]. Notice that
\[ \max_{|z|=\tau}|e^{\psi(z)}| = \max_{|z|=\tau} e^{\Re(\psi(z))} = e^{\frac{\rho}{2}\left(R\tau+\frac{1}{R\tau}\right)+c_1}. \]
Hence by Theorem 3.2 we get
\[ |(e^{A})_{k,\ell}| \le 2\,\frac{\tau}{\tau-1}\,e^{c_1}\,e^{\frac{\rho}{2}\left(R\tau+\frac{1}{R\tau}\right)}\left(\frac{1}{\tau}\right)^{\xi}. \]
The optimal value of $\tau > 1$ that minimizes $e^{\frac{\rho}{2}\left(R\tau+\frac{1}{R\tau}\right)}\left(\frac{1}{\tau}\right)^{\xi}$ is
\[ \tau = \frac{\xi+\sqrt{\xi^2+\rho^2}}{\rho R}. \]
Moreover the condition $\tau > 1$ is satisfied if and only if $\xi > \frac{\rho}{2}\left(R - \frac{1}{R}\right) = b$. Finally, noticing that
\[ \Re\,\psi\!\left(\frac{\xi+\sqrt{\xi^2+\rho^2}}{\rho R}\right) - c_1 = \frac{1}{2}\left(\xi+\sqrt{\xi^2+\rho^2}+\frac{\rho^2}{\xi+\sqrt{\xi^2+\rho^2}}\right) = \xi q(\xi), \]
and collecting $\xi$, the proof is completed.

In the Hermitian case, this bound is similar to bounds available in the literature. Indeed, let $A \in B_n(\beta,\gamma)$ be Hermitian and such that $W(A) \subset [-2a, 0]$, $a > 0$. The following bound was obtained in [7],
\[ |(e^{A})_{k,\ell}| \le 20\,\frac{e^{-a/2}}{a}\left(\frac{ea}{2\xi}\right)^{\xi}, \quad \xi \ge a. \]
The bound in Corollary 3.4 has a similar decay rate. Indeed we can let $b \to 0$ in the bound, thus obtaining
\[ |(e^{A})_{k,\ell}| \le 2e^{-a}\,\frac{\xi+\sqrt{\xi^2+a^2}}{\xi+\sqrt{\xi^2+a^2}-a}\left(\frac{a}{\xi}\,\frac{e^{q(\xi)}}{1+\sqrt{1+a^2/\xi^2}}\right)^{\xi}, \quad \text{for } \xi > 0, \]
with $q(\xi) = 1 + \frac{a^2}{\xi^2+\xi\sqrt{\xi^2+a^2}}$. For $\xi \gg a$ the quantity in parentheses behaves like $(ea)/(2\xi)$. This is consistent with the fact that both estimates are based on Faber polynomial approximation.
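The bound of Corollary 3.4 is easy to evaluate. The sketch below compares it with the entries of $e^A$ for a small tridiagonal Toeplitz matrix. Everything specific in it is an assumption made for illustration: the test matrix is hypothetical (it is not one of the matrices of Example 3.5), the exponential is computed by a truncated Taylor series rather than by Matlab's expm, and the enclosing ellipse is obtained from the standard fact that the field of values of a Toeplitz matrix lies in the convex hull of its symbol curve, here $2e^{it} + e^{-it}$, i.e., an ellipse with $a = 3$, $b = 1$, $c = 0$.

```python
import math


def exp_bound(xi, a, b, c1):
    """Right-hand side of the bound in Corollary 3.4 (valid only for xi > b)."""
    rho2 = a * a - b * b                       # squared focal distance
    s = math.sqrt(xi * xi + rho2)
    q = 1.0 + rho2 / (xi * xi + xi * s)
    prefactor = 2.0 * math.exp(c1) * (xi + s) / (xi + s - (a + b))
    base = (a + b) / xi * math.exp(q) / (1.0 + math.sqrt(1.0 + rho2 / xi ** 2))
    return prefactor * base ** xi


def expm_taylor(A, terms=60):
    """Truncated Taylor series for exp(A); adequate for matrices of small norm."""
    n = len(A)
    E = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    T = [row[:] for row in E]                  # current term A^m / m!
    for m in range(1, terms):
        T = [[sum(T[i][p] * A[p][j] for p in range(n)) / m for j in range(n)]
             for i in range(n)]
        E = [[E[i][j] + T[i][j] for j in range(n)] for i in range(n)]
    return E


def tridiag_toeplitz(n, lower, diag, upper):
    """Tridiagonal Toeplitz matrix in B_n(1, 1)."""
    return [[diag if i == j else upper if j == i + 1 else lower if j == i - 1 else 0.0
             for j in range(n)] for i in range(n)]


# Hypothetical test case: beta = gamma = 1, hence xi = |k - l|.
n = 20
A = tridiag_toeplitz(n, 1.0, 0.0, 2.0)
a, b, c1 = 3.0, 1.0, 0.0
E = expm_taylor(A)
bound_holds = all(abs(E[k][l]) <= exp_bound(abs(k - l), a, b, c1)
                  for k in range(n) for l in range(n) if abs(k - l) > b)
print(bound_holds)
```

Only entries with $\xi > b$ are compared, since the corollary is stated under that restriction.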
Example 3.5. Figure 3.1 illustrates the quality of the bound in Corollary 3.4 for two different matrices. The top plots refer to A ∈ B200 (1, 1) with Toeplitz structure, A = Toeplitz(−i, i, −2), where the underlined element is on the diagonal, while the previous (resp. subsequent) values denote the lower (resp. upper) diagonal entries. The bottom plots refer to A ∈ B100 (1, 2), A = Toeplitz(i, 3i, −i, −i). The left plots report the field of values of A (colored area), its eigenvalues (“×”), and the ellipse used in the bound (dashed line). The right plots show the elements¹ of the t-th column of $e^A$ (black solid line), and the corresponding bound from Corollary 3.4 (“×”). In both examples the estimate is able to correctly capture the true (super-exponential) decay rate of the elements. For the two matrices, the condition number κ of the eigenvector matrix is given by κ = 4.0e+29 (top) and κ = 5.5e+13 (bottom) (see Remark 3.3).

¹ The computation of $e^A$ was performed with the Matlab function expm.

Fig. 3.1: Example 3.5. Left: field of values of A (colored region), its eigenvalues (“×”), and ellipse used in the bound (dashed line). Right (log-scale): the t-th column of $e^A$ (solid line) and estimate of Corollary 3.4 (“×”). Top: A = Toeplitz(−i, i, −2) ∈ $\mathbb{C}^{n\times n}$, n = 200, and t = 127. Bottom: A = Toeplitz(i, 3i, −i, −i) ∈ $\mathbb{C}^{n\times n}$, n = 100, and t = 67.

Theorem 3.2 can be used to obtain bounds for many other matrix functions. An interesting example is the inverse square root of a matrix, which is uniquely defined if and only if the spectrum of the matrix is in the positive part of the complex plane; see [19, chapter 1]. This property has crucial effects on the quality of the approximation, as the subsequent experiment shows.

Corollary 3.6. Let $A \in B_n(\beta,\gamma)$ with field of values contained in a closed set $E \subset \mathbb{C}^+$, whose boundary is a horizontal ellipse with semi-axes $a \ge b > 0$ and center $c = c_1 + ic_2 \in \mathbb{C}$, $c_1, c_2 \in \mathbb{R}$. Then,
\[ |(A^{-1/2})_{k,\ell}| \le 2\,q_2(a,b,c)\left(\frac{a+b}{c_1}\,\frac{1}{1+\sqrt{1-(a^2-b^2)/c_1^2}}\right)^{\xi}, \]
with $\xi$ defined by (2.2) and
\[ q_2(a,b,c) = \frac{\sqrt{c_1+\sqrt{c_1^2-(a^2-b^2)}}}{c_1+\sqrt{c_1^2-(a^2-b^2)}-(a+b)}. \]
Proof. The inverse $\psi$ of the conformal map is given by (3.2). Since $\min_{|z|=\tau}|\psi(z)| = \rho(\tau R - (\tau R)^{-1})/2 + c_1$, from Theorem 3.2 we obtain
\[ |(A^{-1/2})_{k,\ell}| \le 2\,\frac{\tau}{\tau-1}\sqrt{\frac{2}{\rho(\tau R-(\tau R)^{-1})+2c_1}}\left(\frac{1}{\tau}\right)^{\xi}. \]
We can minimize the value of the bound by increasing the value of $\tau$ as much as possible. The condition $\Re(\psi(z)) > 0$ for $|z| = \tau$ must hold, i.e.,
\[ \frac{\rho}{2}\left(\tau R + \frac{1}{\tau R}\right) < c_1. \]
Thus, $\tau$ [...] $R$. (3.6)
Example 3.10. Figure 3.3 shows the behavior of the bound (3.6) for the matrix A = Toeplitz(0.8, 3, −1, −3) ∈ B200 (2, 1), with eigenvector condition number κ = 3.3e+43, and the 127-th column of the matrix function $A^{-1}(I - e^{-A})$, computed in Matlab as F = A\(eye(n)-expm(-A)). The integral was numerically estimated by the Matlab function quadgk. The description of the two plots is as in the previous examples. As for the other experiments associated with the exponential, the bound provides a quite good approximation of the true decay slope. We remark that this bound does not require that the circle (dashed line in the left plot) contain a disk in which the function $(1 - e^{-x})/x$ is analytic.

3.2. Bound for functions of Kronecker sums of matrices. As done in the recent literature (see, e.g., [7] and references therein), the peculiar oscillating decay of functions of Kronecker sums of banded matrices can be captured by exploiting the properties of the exponential function, when the Kronecker structure occurs.

Definition 3.11. Let $A_1$ and $A_2$ be two complex $n \times n$ matrices. The matrix $\mathcal{A} \in \mathbb{C}^{n^2\times n^2}$ is the Kronecker sum of $A_1$ and $A_2$ if
\[ \mathcal{A} = A_1 \oplus A_2 = A_1 \otimes I + I \otimes A_2. \]
Fig. 3.3: Example 3.10. Matrix A = Toeplitz(0.8, 3, −1, −3) ∈ B200 (2, 1) and function $f(x) = (1 - e^{-x})/x$. Left: W (A) (colored area), eigenvalues (“×”), and circle used in the bound (dashed line). Right (log-scale): entries of the 127-th column of f (A) (solid line) and bound (3.6) (“×”).
The definition can be extended to three or more matrices, e.g.,
\[ \mathcal{A} = A_1 \oplus A_2 \oplus A_3 = A_1 \otimes I \otimes I + I \otimes A_2 \otimes I + I \otimes I \otimes A_3. \]
The Kronecker sum of two matrices satisfies (see, e.g., [19, Theorem 10.9])
\[ e^{A_1 \oplus A_2} = e^{A_1} \otimes e^{A_2}. \tag{3.7} \]
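Identity (3.7) can be confirmed numerically on a tiny example. The sketch below is an illustration under stated assumptions: the two $3 \times 3$ matrices are hypothetical, the helpers `kron`, `kron_sum` and `expm_taylor` are names introduced here, and the exponential is approximated by a truncated Taylor series, which is adequate only because the matrices have small norm.

```python
def expm_taylor(A, terms=60):
    """Truncated Taylor series for exp(A); adequate for matrices of small norm."""
    n = len(A)
    E = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    T = [row[:] for row in E]                  # current term A^m / m!
    for m in range(1, terms):
        T = [[sum(T[i][p] * A[p][j] for p in range(n)) / m for j in range(n)]
             for i in range(n)]
        E = [[E[i][j] + T[i][j] for j in range(n)] for i in range(n)]
    return E


def kron(X, Y):
    """Kronecker product: entry (i*m + p, j*m + q) equals X[i][j] * Y[p][q]."""
    n, m = len(X), len(Y)
    return [[X[r // m][c // m] * Y[r % m][c % m] for c in range(n * m)]
            for r in range(n * m)]


def kron_sum(A1, A2):
    """Kronecker sum A1 (+) A2 = A1 (x) I + I (x) A2 for square matrices of equal size."""
    n = len(A1)
    eye = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    left, right = kron(A1, eye), kron(eye, A2)
    return [[left[i][j] + right[i][j] for j in range(n * n)] for i in range(n * n)]


# Hypothetical banded test matrices (the second one is complex).
A1 = [[0.0, 1.0, 0.0], [0.5, 0.0, 1.0], [0.0, 0.5, 0.0]]
A2 = [[1.0, 2j, 0.0], [-1.0, 1.0, 2j], [0.0, -1.0, 1.0]]

lhs = expm_taylor(kron_sum(A1, A2))
rhs = kron(expm_taylor(A1), expm_taylor(A2))
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(9) for j in range(9))
print(max_diff)
```

The printed difference should be at the level of rounding errors. The identity rests on the fact that $A_1 \otimes I$ and $I \otimes A_2$ commute.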
Functions of Kronecker sums of two banded matrices exhibit the typical decay away from the main diagonal, together with a refined decay associated with the bandwidth of the single matrices $A_1$, $A_2$, giving rise to local “oscillations”. This behavior was characterized in [7, 9] for Hermitian positive definite matrices and a large class of functions. Thanks to the bounds in Theorem 3.9, we can generalize these results to non-Hermitian matrices, for completely monotonic functions.

It is useful to express the column and row indexes of an $n^2 \times n^2$ matrix $\mathcal{A} = A_1 \oplus A_2$ using the lexicographic ordering. Let $k = (k_1, k_2)$, $\ell = (\ell_1, \ell_2)$. Then $\mathcal{A}_{k,\ell}$ corresponds to the element in the $((k_2-1)n + k_1)$-th row and $((\ell_2-1)n + \ell_1)$-th column, with $k_1, k_2, \ell_1, \ell_2 \in \{1, \ldots, n\}$. Therefore (see, e.g., [7, proof of Theorem 6.1])
\[ \left(e^{A_1\oplus A_2}\right)_{k,\ell} = \left(e^{A_1}\otimes e^{A_2}\right)_{k,\ell} = \left(e^{A_1}\right)_{k_2,\ell_2}\left(e^{A_2}\right)_{k_1,\ell_1}. \]
Let $f(x) = \int_0^{\infty} e^{-tx}\,d\mu(t)$ be a completely monotonic function. Then
\[ \begin{aligned} \left|(f(A_1\oplus A_2))_{k,\ell}\right| &= \left|\int_0^{\infty}\left(e^{-t(A_1\oplus A_2)}\right)_{k,\ell}\,d\mu(t)\right| = \left|\int_0^{\infty}\left(e^{-tA_1}\right)_{k_2,\ell_2}\left(e^{-tA_2}\right)_{k_1,\ell_1}\,d\mu(t)\right| \\ &\le \left(\int_0^{\infty}\left|\left(e^{-tA_1}\right)_{k_2,\ell_2}\right|^2 d\mu(t)\right)^{1/2}\left(\int_0^{\infty}\left|\left(e^{-tA_2}\right)_{k_1,\ell_1}\right|^2 d\mu(t)\right)^{1/2}. \end{aligned} \]
Let $A_j \in B_n(\beta_j,\gamma_j)$. We define $\xi_j = \lceil(\ell_j-k_j)/\beta_j\rceil$ if $k_j < \ell_j$, or $\xi_j = \lceil(k_j-\ell_j)/\gamma_j\rceil$ if $k_j > \ell_j$, for $j = 1, 2$. Then, as in the proof of Theorem 3.9, we can bound the two last integrals as follows
\[ \int_0^{\infty}\left|\left(e^{-tA_1}\right)_{k_2,\ell_2}\right|^2 d\mu(t) \le \left(2\xi_2\,\frac{(eR)^{\xi_2}}{(\xi_2)^{\xi_2}}\right)^{2}\int_0^{\xi_2/R}\frac{e^{-2tc_1}\,t^{2\xi_2}}{(\xi_2-Rt)^2}\,d\mu(t) + \int_{\xi_2/R}^{\infty}\left|\left(e^{-tA_1}\right)_{k_2,\ell_2}\right|^2 d\mu(t). \]

Fig. 3.4: Example 3.12. Matrix $\mathcal{A} = A \oplus A$. Top: A = Toeplitz(−0.1, 4, 0.9i) ∈ B30 (1, 1). Bottom: A = Toeplitz(−1, 4, 1, 0.5) ∈ B30 (2, 1). Left: W (A) (colored area), eigenvalues (“×”), and ellipse used in the bound (dashed line). Right (log-scale): the 300-th column of $\mathcal{A}^{-1}(I - e^{-\mathcal{A}})$ (solid line) and the values of the bound (3.8) (“×”).
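As an example, consider the function $f(x) = (1 - e^{-x})/x$. From (3.6) we obtain
\[ \left|(f(A\oplus A))_{k,\ell}\right| \le 4\,\xi_1\xi_2\,\frac{(eR)^{\xi_1+\xi_2}}{(\xi_1)^{\xi_1}(\xi_2)^{\xi_2}}\,\bigl(I(\xi_1)I(\xi_2)\bigr)^{1/2}, \quad \text{for } \xi_1, \xi_2 > R, \tag{3.8} \]
with
\[ I(\xi) = \int_0^{1}\frac{e^{-2tc_1}\,t^{2\xi}}{(\xi-Rt)^2}\,dt. \]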
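The right-hand side of (3.8) only requires a one-dimensional quadrature. The sketch below evaluates it in Python with a composite Simpson rule instead of Matlab's quadgk. The function names `integral_I` and `kron_bound` are hypothetical, and we assume (the defining passage is not reproduced here) that $R$ and $c_1$ are the radius and the real part of the center of the disk enclosing the field of values.

```python
import math


def integral_I(xi, R, c1, panels=2000):
    """Composite Simpson approximation of
    I(xi) = int_0^1 exp(-2 t c1) t^(2 xi) / (xi - R t)^2 dt, for xi > R."""
    if xi <= R:
        raise ValueError("the integrand is singular on [0, 1] unless xi > R")
    if panels % 2:
        panels += 1                            # Simpson's rule needs an even count

    def g(t):
        return math.exp(-2.0 * t * c1) * t ** (2 * xi) / (xi - R * t) ** 2

    h = 1.0 / panels
    total = g(0.0) + g(1.0)
    for m in range(1, panels):
        total += (4.0 if m % 2 else 2.0) * g(m * h)
    return total * h / 3.0


def kron_bound(xi1, xi2, R, c1):
    """Right-hand side of (3.8) for f(x) = (1 - exp(-x)) / x."""
    prefactor = 4.0 * xi1 * xi2 * (math.e * R) ** (xi1 + xi2) / (xi1 ** xi1 * xi2 ** xi2)
    return prefactor * math.sqrt(integral_I(xi1, R, c1) * integral_I(xi2, R, c1))


# Hypothetical parameters, chosen only to show the decay in xi_1 and xi_2.
for xi1, xi2 in [(3, 3), (5, 5), (10, 10), (10, 3)]:
    print(xi1, xi2, kron_bound(xi1, xi2, 2.0, 1.0))
```

Because the bound depends on $\xi_1$ and $\xi_2$ separately, it reproduces the local oscillations mentioned above: entries with the same distance from the main diagonal can have very different pairs $(\xi_1, \xi_2)$.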
Example 3.12. Figure 3.4 illustrates the quality of the bound in (3.8) for $f(x) = (1 - e^{-x})/x$ and $\mathcal{A} = A \oplus A$. We consider A = Toeplitz(−0.1, 4, 0.9i) ∈ B30 (1, 1) (κ = 7.8e+13, top) and A = Toeplitz(−1, 4, 1, 0.5) ∈ B30 (2, 1) (κ = 79.0, bottom), so that $\mathcal{A}$ has dimension 900. The Matlab function quadgk was used to numerically evaluate the integral in (3.8) for the two matrices. The matrix function $\mathcal{A}^{-1}(I - e^{-\mathcal{A}})$ was computed in Matlab as F = A\(eye(n)-expm(-A)). The description of the two plots is as in the previous examples. The bound given by inequality (3.8) is able to predict the local and the global decay rate of the matrix function elements.

4. Decay bounds using rational functions expansion. Example 3.7 illustrates that when the field of values of a matrix is close to the boundary of the function domain the bound associated with Theorem 3.2 may not be able to capture the correct slope of the matrix entries decay. We have found that the use of Faber-Dzhrbashyan (FD) rational functions in the derivation of the decay bounds significantly mitigates this problem. To be able to use FD functions in this section we restrict our attention to Cauchy-Stieltjes (or Markov) functions of matrices, which can be written as
\[ f(w) = \int_F \frac{d\mu(x)}{x-w}, \quad w \in \mathbb{C}\setminus F, \tag{4.1} \]
where $\mu$ is a (complex) measure supported on a closed set $F$. In particular, we are interested in the case in which $\mu$ is a positive (real) measure with support contained in $F = [x_1, x_2]$, and $-\infty \le x_1 < x_2 < +\infty$. Many application problems make use of Cauchy-Stieltjes functions, see, e.g., [14, 29], and for this reason they have received a lot of attention in the recent literature. Examples of Cauchy-Stieltjes functions include
\[ w^{-1/2} = \int_{-\infty}^{0}\frac{1}{w-x}\,\frac{1}{\pi\sqrt{-x}}\,dx, \qquad \frac{\log(1+w)}{w} = \int_{-\infty}^{-1}\frac{1}{w-x}\,\frac{1}{(-x)}\,dx, \]
and
\[ \frac{e^{-t\sqrt{w}}-1}{w} = \int_{-\infty}^{0}\frac{1}{w-x}\,\frac{\sin(t\sqrt{-x})}{-\pi x}\,dx. \]
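As a quick consistency check of the first representation, one can evaluate the integral in closed form. The derivation below is a sketch for real $w > 0$ only; it uses the substitution $x = -u^2$, $u > 0$, which is our choice and is not taken from the paper.

```latex
\[
\int_{-\infty}^{0} \frac{1}{w-x}\,\frac{1}{\pi\sqrt{-x}}\,dx
  \;=\; \int_{0}^{\infty} \frac{1}{w+u^{2}}\,\frac{1}{\pi u}\,2u\,du
  \;=\; \frac{2}{\pi}\int_{0}^{\infty}\frac{du}{w+u^{2}}
  \;=\; \frac{2}{\pi}\cdot\frac{\pi}{2\sqrt{w}}
  \;=\; w^{-1/2}.
\]
```

The same substitution turns the third example into an integral over $u \in (0, \infty)$ involving $\sin(tu)$.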
Notice that Cauchy-Stieltjes functions with a positive measure $\mu$ and $x_1 = -\infty$, $x_2 = 0$ are strictly completely monotonic functions (see, e.g., [24]). The converse is not true; for example, $e^{-x}$ is strictly completely monotonic but it is not a Cauchy-Stieltjes function.

Let us consider the sequence of Faber-Dzhrbashyan rational functions $\{M_j(w)\}$; we refer the reader to [28, chapter XIII, section 3] and the references therein for a more detailed introduction to these functions. Let $E$ be a bounded continuum with a connected complement and $\phi$ be the associated conformal map introduced in section 3. Moreover, let $\theta_j = \phi(\omega_j)$ be a sequence of points outside the unit disk. Let us define the Takenaka-Malmquist system of functions
\[ \varphi_0(z) = \frac{\sqrt{|\theta_0|^2-1}}{\theta_0-z}, \qquad \varphi_j(z) = \frac{\sqrt{|\theta_j|^2-1}}{\theta_j-z}\prod_{k=0}^{j-1}\frac{1-\bar\theta_k z}{\theta_k-z}\,\frac{\theta_k}{|\theta_k|}, \quad j = 1, 2, \ldots. \]
The function $M_j(w)$ is defined as the sum of all the principal part and constant terms in the Laurent expansion of the function $\varphi_j(\phi(w))$, and can be represented in the form
\[ M_j(w) = \frac{Q_j(w)}{(w-\omega_0)\cdots(w-\omega_j)}, \]
with $Q_j(w)$ a polynomial. Moreover, if $M_j$ is analytic in $G_\tau = \{\psi(z) : |z| < \tau\}$ then
\[ M_j(w) = \frac{1}{2\pi i}\int_{\Gamma_\tau}\frac{\varphi_j(\phi(\eta))}{\eta-w}\,d\eta, \quad w \in G_\tau, \tag{4.2} \]
with $\psi$ the inverse of the conformal map and $\Gamma_\tau$ the boundary of $G_\tau$. The function $M_j$ can thus be expanded into a series of Faber polynomials. We prove the following upper bound for the coefficients of this series. To the best of our knowledge this result is new, and it is of interest beyond our technical use.

Theorem 4.1. Let $M_j$ be a Faber-Dzhrbashyan rational function for the set $E$, with one real pole $\omega$ of multiplicity $j+1$. Moreover, let $\{\Phi_n\}$ be the sequence of Faber polynomials for the set $E$. Then $M_j(w) = \sum_{n=0}^{\infty} f_{n,j}\Phi_n(w)$ for $w \in E$, where
\[ |f_{n,j}| \le 2\pi\sqrt{\theta^2-1}\;C_j\,(1+\theta^2)^j\,|\theta|^{-(n+j+1)}, \quad \text{for } j \le n, \]
with $\theta = \phi(\omega)$ and
\[ C_j = \frac{\left(j+\left\lfloor\frac{1+\sqrt{4j+1}}{2}\right\rfloor\right)!}{j!\left(\left\lfloor\frac{1+\sqrt{4j+1}}{2}\right\rfloor!\right)^2}. \tag{4.3} \]
Proof. Consider the Faber polynomials Φn and the uniform converging expansion ∞ X Φn (w) ψ ′ (z) , = ψ(z) − w n=0 z n+1
w ∈ E,
|z| > 1;
see [28, page 39, eq. (1)]. Noticing that Mj (w) can be given by equation (4.2), for w ∈ E and 1 < τ < |θ| we get Z Z 1 ϕj (φ(η)) ψ ′ (z) 1 dη = dz ϕj (z) Mj (w) = 2πi Γτ η − w 2πi |z|=τ ψ(z) − w Z ∞ X Φn (w) 1 dz = ϕj (z) 2πi |z|=τ z n+1 n=0 Z ∞ ∞ X X 1 ϕj (z) = Φn (w) dz = fn,j Φn (w), 2πi |z|=τ z n+1 n=0 n=0 with fn,j =
Z
|z|=τ
ϕj (z) dz. z n+1
To conclude we need to derive a bound for |fn,j |, when j ≤ n. Let g be a sufficiently regular function. We consider the Cauchy formula Z g(z) 1 dz, g(a) = 2πi |z|=τ (z − a) where a lies outside of |z| = τ . Differentiating n times we obtain Z g(z) n! (n) dz. g (a) = 2πi |z|=τ (z − a)n+1
16
S. Pozza and V. Simoncini
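We can apply this well known formula to our coefficient $f_{n,j}$ for $a = 0$, so that
\[
f_{n,j} = \int_{|z|=\tau} \frac{\varphi_j(z)}{z^{n+1}}\,dz = \frac{2\pi i}{n!}\,\varphi_j^{(n)}(0),
\]
where
\[
\varphi_j(z) = (-1)^j\,\frac{\sqrt{\theta^2-1}}{\theta - z}\left(\frac{1-\theta z}{\theta - z}\right)^{j}.
\]
We then need to work out a more explicit form for $\varphi_j^{(n)}(0)$. Since $|z/\theta| < 1$ we write
\[
\frac{(-1)^j \varphi_j(z)}{\sqrt{\theta^2-1}} = (1-\theta z)^j (\theta - z)^{-(j+1)}
 = \theta^{-(j+1)} (1-\theta z)^j \left(1 - \frac{z}{\theta}\right)^{-(j+1)}
 =: \theta^{-(j+1)} h_1(z) h_2(z),
\]
so that
\[
\frac{(-1)^j \varphi_j^{(n)}(0)}{\sqrt{\theta^2-1}}
 = \theta^{-(j+1)} \sum_{s=0}^{n} \binom{n}{s} h_1^{(s)}(0)\, h_2^{(n-s)}(0)
 = \theta^{-(j+1)} \sum_{s=0}^{j} \binom{n}{s} h_1^{(s)}(0)\, h_2^{(n-s)}(0),
\]
where we truncate the sum since $h_1$ is a polynomial of degree $j$. For the polynomial we have $h_1^{(s)}(z)|_{z=0} = \frac{j!}{(j-s)!}(-\theta)^s$, for $s \le j$. For the rational function we have
\[
h_2(z) = \sum_{m=0}^{\infty} \binom{-(j+1)}{m} \left(-\frac{z}{\theta}\right)^{m}
       = \sum_{m=0}^{\infty} \binom{-(j+1)}{m} \frac{z^m}{(-\theta)^m}
       \equiv \sum_{m=0}^{\infty} h_2^{(m)}(0)\, z^m .
\]
Collecting all derivatives we obtain
\begin{align*}
\frac{(-1)^j \varphi_j^{(n)}(0)}{\sqrt{\theta^2-1}}
 &= \theta^{-(j+1)} \sum_{s=0}^{j} \binom{n}{s} h_1^{(s)}(0)\, h_2^{(n-s)}(0) \\
 &= \theta^{-(j+1)} \sum_{s=0}^{j} \frac{n!}{s!(n-s)!}\,\frac{j!}{(j-s)!}\,(-\theta)^s \binom{-(j+1)}{n-s} (-\theta)^{-(n-s)} \\
 &= \sum_{s=0}^{j} \frac{n!}{s!(n-s)!}\,\frac{j!}{(j-s)!}\,(-1)^s\,\theta^{-n+2s-j-1}\,\frac{(-1)^{n-s-1}(j+n-s)!}{(n-s)!\,j!} \\
 &= (-1)^{n-1} \sum_{s=0}^{j} \frac{n!\,(j+n-s)!}{s!(n-s)!(j-s)!(n-s)!}\,\theta^{-n+2s-j-1}.
\end{align*}
We can rewrite the coefficients of this last sum as
\[
\frac{n!\,(j+n-s)!}{s!(j-s)!((n-s)!)^2}
 = n!\,\frac{(j+n-s)!}{j!((n-s)!)^2}\,\frac{j!}{s!(j-s)!}
 = n!\,\frac{(j+n-s)!}{j!((n-s)!)^2}\binom{j}{s}
 \le n!\, C_j \binom{j}{s},
\]
where the last inequality is proved in Lemma A.1, Appendix A. Therefore,
\begin{align*}
|f_{n,j}| = \frac{2\pi}{n!}\,|\varphi_j^{(n)}(0)|
 &\le 2\pi \sqrt{\theta^2-1}\; C_j \sum_{s=0}^{j} \binom{j}{s} |\theta|^{-n+2s-j-1} \\
 &\le 2\pi \sqrt{\theta^2-1}\; C_j\, |\theta|^{-(n+j+1)} (1+\theta^2)^j .
\end{align*}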
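The last step of the proof rests on the binomial theorem applied to $\sum_{s=0}^{j}\binom{j}{s}|\theta|^{2s}$. The following sketch checks this step numerically; the function names and the sample values of $\theta$, $n$, $j$ are illustrative assumptions and do not come from the paper.

```python
from math import comb

def binomial_sum(theta: float, n: int, j: int) -> float:
    """Sum over s = 0..j of binom(j, s) * |theta|^(-n + 2s - j - 1)."""
    t = abs(theta)
    return sum(comb(j, s) * t ** (-n + 2 * s - j - 1) for s in range(j + 1))

def closed_form(theta: float, n: int, j: int) -> float:
    """|theta|^(-(n + j + 1)) * (1 + theta^2)^j, the binomial-theorem closed form."""
    t = abs(theta)
    return t ** (-(n + j + 1)) * (1.0 + t * t) ** j

# Hypothetical sample values with |theta| > 1 and j <= n.
theta, n, j = -2.5, 12, 4
lhs = binomial_sum(theta, n, j)
rhs = closed_form(theta, n, j)
print(lhs, rhs)
```

Up to rounding the two values agree, so the final inequality in the proof holds with equality for real $\theta$.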
The following lemmas introduce two upper bounds for $|(M_j(A))_{k,\ell}|$ that will be used in the main theorem of this section.

Lemma 4.2. Let $A \in B_n(\beta,\gamma)$ with field of values contained in a continuum $E$ with connected complement. Let $M_j(w)$ be a Faber-Dzhrbashyan rational function for $E$ with pole $\omega$ of multiplicity $j+1$. Finally, let $k \ne \ell$, and $\xi$ defined by (2.2). Then for $1 < \tau < |\theta|$ and $j \le \xi$
\[
|(M_j(A))_{k,\ell}| \le 4\pi\, \frac{|\theta|\sqrt{\theta^2-1}}{|\theta|-1}\,(1+\theta^2)^j\, C_j \left(\frac{1}{|\theta|}\right)^{\xi+j+1},
\]
with $\theta = \phi(\omega)$ and $C_j$ as in (4.3).

Proof. From Theorem 3.1 we know that $|(M_j(A))_{k,\ell}| \le 2\sum_{n=\xi}^{\infty} |f_{n,j}|$. Substituting the bound of Theorem 4.1 for $j \le n$ gives
\begin{align*}
|(M_j(A))_{k,\ell}|
 &\le 4\pi \sum_{n=\xi}^{\infty} \sqrt{\theta^2-1}\; C_j\,(1+\theta^2)^j\, |\theta|^{-(n+j+1)} \\
 &\le 4\pi \sqrt{\theta^2-1}\; C_j\,(1+\theta^2)^j \sum_{n=\xi}^{\infty} |\theta|^{-(n+j+1)} \\
 &\le 4\pi\, \frac{|\theta|\sqrt{\theta^2-1}}{|\theta|-1}\,(1+\theta^2)^j\, C_j \left(\frac{1}{|\theta|}\right)^{\xi+j+1}.
\end{align*}
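The proof reduces to the geometric tail $\sum_{n\ge\xi}|\theta|^{-(n+j+1)} = \frac{|\theta|}{|\theta|-1}|\theta|^{-(\xi+j+1)}$. A minimal sketch, assuming a real pole image $\theta$ with $|\theta|>1$ and a user-supplied value for $C_j$, evaluates the right-hand side of Lemma 4.2 and compares the tail with a truncated sum. All names are hypothetical.

```python
from math import pi, sqrt

def tail_truncated(theta: float, j: int, xi: int, terms: int = 200) -> float:
    """Truncated version of sum_{n >= xi} |theta|^(-(n + j + 1))."""
    t = abs(theta)
    return sum(t ** (-(n + j + 1)) for n in range(xi, xi + terms))

def tail_closed_form(theta: float, j: int, xi: int) -> float:
    """Closed form |theta| / (|theta| - 1) * |theta|^(-(xi + j + 1))."""
    t = abs(theta)
    return t / (t - 1.0) * t ** (-(xi + j + 1))

def lemma_4_2_bound(theta: float, j: int, xi: int, C_j: float) -> float:
    """Right-hand side of Lemma 4.2; C_j must be provided by the caller."""
    t = abs(theta)
    prefactor = 4.0 * pi * t * sqrt(t * t - 1.0) / (t - 1.0)
    return prefactor * (1.0 + t * t) ** j * C_j * t ** (-(xi + j + 1))

# Hypothetical sample values.
theta, j, C_j = -2.0, 3, 5.0
print(tail_truncated(theta, j, 10), tail_closed_form(theta, j, 10))
print([lemma_4_2_bound(theta, j, xi, C_j) for xi in (10, 11, 12)])
```

Each unit increase of $\xi$ multiplies the bound by $1/|\theta|$, which is the exponential off-diagonal decay that the lemma encodes.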
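Lemma 4.3. Let $A \in B_n(\beta,\gamma)$ with field of values $W(A)$ and let $M_j(w)$ be a Faber-Dzhrbashyan rational function in a continuum $E$ with connected complement. If $W(A) \subset E$ then
\[
\|M_j(A)\| \le \frac{12}{2\pi}\,\frac{\ell(\Gamma)}{d(W(A),\Gamma)}\,\frac{\sqrt{|\theta_j|^2-1}}{|\theta_j|-1},
\]
with $\Gamma = \partial E$, $\ell(\Gamma)$ the length of $\Gamma$, $d(W(A),\Gamma)$ the distance between $W(A)$ and $\Gamma$, and $\theta_j$ the last pole in the sequence defining the system of $\varphi_j$.

Proof. To prove the statement we apply the bound in (2.1) to $\|M_j(A)\|$, that is, $\|f(A)\| \le \mathcal{C} \sup_{w\in W(A)} |f(w)|$ with $f = M_j$ and $\mathcal{C} = 11.08$. We are thus left with bounding $\sup_{w\in W(A)} |M_j(w)|$. For every $\tau > 1$ for which $\varphi_j$ is analytic in $\{|z| < \tau\}$, equation (4.2) implies
\[
\sup_{w\in W(A)} |M_j(w)| \le \frac{1}{2\pi} \sup_{w\in W(A)} \int_{\Gamma_\tau} \frac{|\varphi_j(\phi(\eta))|}{|\eta - w|}\,d\eta
 \le \frac{\ell(\Gamma_\tau)}{2\pi\, d(W(A),\partial E)}\, \max_{|z|=\tau} |\varphi_j(z)|,
\]
with $\ell(\Gamma_\tau)$ the length of $\Gamma_\tau$. Finally,
\[
\max_{|z|=\tau} |\varphi_j(z)| \le \frac{\sqrt{|\theta_j|^2-1}}{|\theta_j|-\tau}\, \max_{|z|=\tau} \left| \prod_{k=0}^{j-1} \frac{1-\bar\theta_k z}{\theta_k - z}\,\frac{\bar\theta_k}{|\theta_k|} \right| .
\]
Notice that
\[
\prod_{k=0}^{j-1} \frac{1-\bar\theta_k z}{\theta_k - z}\,\frac{\bar\theta_k}{|\theta_k|}
\]
is the Blaschke product, hence its absolute value is 1 on the unit circle. Thus letting $\tau \to 1^+$ we conclude the proof.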
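The key fact used above is that a Blaschke product with poles outside the unit disk has modulus 1 on the unit circle. The sketch below checks this numerically. It assumes factors of the form $(1-\bar\theta_k z)/(\theta_k - z)$ multiplied by a unimodular constant, and the pole values are hypothetical.

```python
import cmath
from math import pi

def blaschke_product(z: complex, poles) -> complex:
    """Product over k of (1 - conj(theta_k) z) / (theta_k - z) * conj(theta_k) / |theta_k|."""
    result = 1.0 + 0.0j
    for theta in poles:
        theta = complex(theta)
        factor = (1.0 - theta.conjugate() * z) / (theta - z)
        # The extra constant has modulus 1, so it does not affect |product|.
        result *= factor * theta.conjugate() / abs(theta)
    return result

# Hypothetical poles, all with |theta_k| > 1.
poles = [complex(-2.0, 0.5), complex(-3.0, -1.0), complex(-1.5, 0.0)]
moduli = [abs(blaschke_product(cmath.exp(2j * pi * k / 16), poles)) for k in range(16)]
print(max(abs(m - 1.0) for m in moduli))
```

The printed deviation from 1 is at the level of rounding error, consistent with $|1-\bar\theta z| = |\theta - z|$ for $|z| = 1$.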
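We are thus ready to state the main result of this section.

Theorem 4.4. Let $A \in B_n(\beta,\gamma)$ with field of values contained in a continuum $E$ symmetric with respect to the real axis and with connected complement. Moreover, let
\[
f(w) = \int_{x_1}^{x_2} \frac{d\mu(x)}{x - w}
\]
be a Cauchy-Stieltjes (or Markov) function, with $\mu$ a positive measure with support contained in $[x_1, x_2]$, and $-\infty \le x_1 < x_2 < \min(\Re(E))$. Then for $0 \le m \le \xi$ ($\xi$ defined in (2.2))
\[
|(f(A))_{k,\ell}| \le K_1 \left(\frac{1}{\rho_{opt}}\right)^{m+1} + K_2\, C_m\,(m+1) \left(\frac{1+\theta_{opt}^2}{\rho_{opt}|\theta_{opt}|}\right)^{m} \left(\frac{1}{|\theta_{opt}|}\right)^{\xi+1},
\]
with $K_1$, $K_2$ coefficients depending on $f$, $E$ and $\theta_{opt}$, $C_m$ as in Theorem 4.1, and
\[
\theta_{opt} = \frac{1 - \phi(x_1)\rho_{opt}}{\phi(x_1) - \rho_{opt}}, \qquad
\rho_{opt} = \frac{1}{\kappa} + \sqrt{\frac{1}{\kappa^2} - 1}, \qquad
\kappa = \frac{\phi(x_2) - \phi(x_1)}{\phi(x_2)\phi(x_1) - 1} \in (0,1),
\]
where $\phi$ is the conformal map for $E$.

Proof. Consider the sequence of Faber-Dzhrbashyan rational functions $\{M_j(w)\}$ for $E$ with constant real poles $\omega_j = \omega \notin E$ for $j = 0, 1, \ldots$. The following relation holds
\[
\frac{\psi'(z)}{\psi(z) - w} = \frac{1}{z} \sum_{j=0}^{\infty} \varphi_j\!\left(\frac{1}{\bar z}\right) M_j(w), \qquad w \in G, \quad |z| > 1, \tag{4.4}
\]
where $\psi$ is the inverse of $\phi$ (the conformal mapping of $E$) and $G$ is the interior of $E$; see [28, p. 259] and the references therein. Noticing that $\phi(x_1) < \phi(x_2) < -1$, from relation (4.4) for $w \in G$ we derive
\begin{align*}
f(w) = \int_{x_1}^{x_2} \frac{d\mu(x)}{x - w}
 &= \int_{\phi(x_1)}^{\phi(x_2)} \frac{\psi'(z)}{\psi(z) - w}\, d\mu(\psi(z)) \\
 &= \int_{\phi(x_1)}^{\phi(x_2)} \frac{1}{z} \sum_{j=0}^{\infty} \varphi_j\!\left(\frac{1}{\bar z}\right) M_j(w)\, d\mu(\psi(z))
  = \sum_{j=0}^{\infty} a_j M_j(w),
\end{align*}
where the coefficients of the expansion are given by
\[
a_j = \int_{\phi(x_1)}^{\phi(x_2)} \frac{1}{z}\, \varphi_j\!\left(\frac{1}{\bar z}\right) d\mu(\psi(z)).
\]
Hence,
\[
|a_j| \le \max_{[\phi(x_1),\phi(x_2)]} \left|\varphi_j\!\left(\frac{1}{\bar z}\right)\right| \int_{\phi(x_1)}^{\phi(x_2)} \frac{1}{|z|}\, d|\mu(\psi(z))| .
\]
Moreover,
\[
\max_{[\phi(x_1),\phi(x_2)]} \left|\varphi_j\!\left(\frac{1}{\bar z}\right)\right| \le \frac{\sqrt{|\theta|^2-1}}{|\theta - 1/\phi(x_1)|}\, \max_{[\phi(x_1),\phi(x_2)]} \left|\frac{\theta - z}{1 - \theta z}\right|^{j} .
\]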
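As we have already noticed, the argument of the maximum is the absolute value of the Blaschke product. Beckermann and Reichel in [2, Corollary 6.4] show that under our hypotheses the optimal value $\theta_{opt}$ that minimizes
\[
\max_{[\phi(x_1),\phi(x_2)]} \left|\frac{\theta - z}{1 - \theta z}\right|
\]
is given by
\[
\theta_{opt} = \frac{1 - \phi(x_1)\rho_{opt}}{\phi(x_1) - \rho_{opt}}, \qquad
\rho_{opt} = \frac{1}{\kappa} + \sqrt{\frac{1}{\kappa^2} - 1}, \qquad
\kappa = \frac{\phi(x_2) - \phi(x_1)}{\phi(x_2)\phi(x_1) - 1} \in (0,1).
\]
Moreover, $\max_{[\phi(x_1),\phi(x_2)]} \left|\frac{\theta_{opt} - z}{1 - \theta_{opt} z}\right|^{j} = (\rho_{opt})^{-j}$. Hence we get the upper bound
\[
|a_j| \le C_E^f\, \frac{\sqrt{|\theta_{opt}|^2-1}}{|\theta_{opt} - 1/\phi(x_1)|} \left(\frac{1}{\rho_{opt}}\right)^{j}, \tag{4.5}
\]
where
\[
C_E^f = \int_{\phi(x_1)}^{\phi(x_2)} \frac{1}{|z|}\, d|\mu(\psi(z))| .
\]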
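For an integer $m$ with $0 \le m \le \xi$ we can split the series in two parts,
\begin{align*}
|f(A)_{k,\ell}| &\le \sum_{j=0}^{\infty} |a_j|\, |(M_j(A))_{k,\ell}| \\
 &\le \sum_{j=0}^{m} |a_j|\, |(M_j(A))_{k,\ell}| + \sum_{j=m+1}^{\infty} |a_j|\, |(M_j(A))_{k,\ell}| . \tag{4.6}
\end{align*}
Using Lemma 4.3 the second term in (4.6) can be bounded as follows
\begin{align*}
\sum_{j=m+1}^{\infty} |a_j|\, |(M_j(A))_{k,\ell}| &\le \sum_{j=m+1}^{\infty} |a_j|\, \|M_j(A)\| \\
 &\le \frac{12}{2\pi}\,\frac{\ell(\Gamma)}{d(W(A),\partial E)}\,\frac{\sqrt{|\theta|^2-1}}{|\theta|-1} \sum_{j=m+1}^{\infty} |a_j| .
\end{align*}
Furthermore, fixing $\theta = \theta_{opt}$, from (4.5) we obtain
\[
\sum_{j=m+1}^{\infty} |a_j|\, |(M_j(A))_{k,\ell}| \le K_1 \left(\frac{1}{\rho_{opt}}\right)^{m+1},
\]
where
\[
K_1 = C_E^f\, \frac{12}{2\pi}\,\frac{\ell(\Gamma)}{d(W(A),\partial E)}\,\frac{|\theta_{opt}|+1}{|\theta_{opt} - 1/\phi(x_1)|}\,\frac{\rho_{opt}}{\rho_{opt}-1} .
\]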
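To bound the first term in (4.6) we use Lemma 4.2 and inequality (4.5): for $1 < \tau < |\theta|$ and for $m \le \xi$ we obtain
\begin{align*}
\sum_{j=0}^{m} |a_j|\, |(M_j(A))_{k,\ell}|
 &\le 4\pi\, \frac{|\theta|\sqrt{\theta^2-1}}{|\theta|-1} \sum_{j=0}^{m} |a_j|\,(1+\theta^2)^j\, C_j \left(\frac{1}{|\theta|}\right)^{\xi+j+1} \\
 &\le 4\pi\, C_m\, \frac{|\theta|\sqrt{\theta^2-1}}{|\theta|-1} \sum_{j=0}^{m} |a_j|\,(1+\theta^2)^j \left(\frac{1}{|\theta|}\right)^{\xi+j+1},
\end{align*}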
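where the last inequality holds since $C_j$ is an increasing function of $j$ (see Lemma A.2 in the appendix). Fixing $\theta = \theta_{opt}$ and using again (4.5) we obtain
\begin{align*}
\sum_{j=0}^{m} |a_j|\,(1+\theta_{opt}^2)^j \left(\frac{1}{|\theta_{opt}|}\right)^{\xi+j+1}
 &\le C_E^f\, \frac{\sqrt{|\theta_{opt}|^2-1}}{|\theta_{opt} - 1/\phi(x_1)|} \sum_{j=0}^{m} \left(\frac{1}{\rho_{opt}}\right)^{j} (1+\theta_{opt}^2)^j \left(\frac{1}{|\theta_{opt}|}\right)^{\xi+j+1} \\
 &\le C_E^f\, \frac{\sqrt{|\theta_{opt}|^2-1}}{|\theta_{opt} - 1/\phi(x_1)|} \left(\frac{1}{|\theta_{opt}|}\right)^{\xi+1} \sum_{j=0}^{m} \left(\frac{1+\theta_{opt}^2}{\rho_{opt}|\theta_{opt}|}\right)^{j} \\
 &\le C_E^f\, \frac{\sqrt{|\theta_{opt}|^2-1}}{|\theta_{opt} - 1/\phi(x_1)|} \left(\frac{1}{|\theta_{opt}|}\right)^{\xi+1} (m+1) \left(\frac{1+\theta_{opt}^2}{\rho_{opt}|\theta_{opt}|}\right)^{m} .
\end{align*}
Hence we have
\[
\sum_{j=0}^{m} |a_j|\, |(M_j(A))_{k,\ell}| \le K_2\, C_m\,(m+1) \left(\frac{1+\theta_{opt}^2}{\rho_{opt}|\theta_{opt}|}\right)^{m} \left(\frac{1}{|\theta_{opt}|}\right)^{\xi+1},
\]
with $K_2 = 4\pi\, C_E^f\, \frac{|\theta_{opt}|(|\theta_{opt}|+1)}{|\theta_{opt} - 1/\phi(x_1)|}$. Thus, for $m \le \xi$
\[
|f(A)_{k,\ell}| \le K_1 \left(\frac{1}{\rho_{opt}}\right)^{m+1} + K_2\, C_m\,(m+1) \left(\frac{1+\theta_{opt}^2}{\rho_{opt}|\theta_{opt}|}\right)^{m} \left(\frac{1}{|\theta_{opt}|}\right)^{\xi+1} .
\]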
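The parameters $\kappa$, $\rho_{opt}$ and $\theta_{opt}$ are explicit functions of $\phi(x_1)$ and $\phi(x_2)$, so the bound can be evaluated directly once $K_1$, $K_2$ and $C_m$ are available. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, $K_1$, $K_2$, $C_m$ are passed in by the caller, and the sample values $\phi(x_1) = -3$, $\phi(x_2) = -2$ are made up. It also checks that the Blaschke factor attains the value $1/\rho_{opt}$ at both endpoints of $[\phi(x_1), \phi(x_2)]$.

```python
from math import sqrt

def optimal_parameters(phi_x1: float, phi_x2: float):
    """Return (kappa, rho_opt, theta_opt) for finite phi(x1) < phi(x2) < -1."""
    kappa = (phi_x2 - phi_x1) / (phi_x2 * phi_x1 - 1.0)
    rho_opt = 1.0 / kappa + sqrt(1.0 / kappa**2 - 1.0)
    theta_opt = (1.0 - phi_x1 * rho_opt) / (phi_x1 - rho_opt)
    return kappa, rho_opt, theta_opt

def blaschke_modulus(theta: float, z: float) -> float:
    """|(theta - z) / (1 - theta z)| for real theta and z."""
    return abs((theta - z) / (1.0 - theta * z))

def theorem_4_4_bound(K1, K2, C_m, m, xi, rho_opt, theta_opt):
    """Right-hand side of Theorem 4.4; K1, K2 and C_m are caller-supplied."""
    t = abs(theta_opt)
    first = K1 * rho_opt ** (-(m + 1))
    second = K2 * C_m * (m + 1) * ((1.0 + t * t) / (rho_opt * t)) ** m * t ** (-(xi + 1))
    return first + second

# Hypothetical values of phi(x1) and phi(x2).
kappa, rho_opt, theta_opt = optimal_parameters(-3.0, -2.0)
grid = [-3.0 + k / 100.0 for k in range(101)]
worst = max(blaschke_modulus(theta_opt, z) for z in grid)
print(kappa, rho_opt, theta_opt, worst, 1.0 / rho_opt)
```

Since the theorem leaves $m$ free, one simple use of `theorem_4_4_bound` is to evaluate it for every integer $m$ in $0, \ldots, \xi$ and keep the smallest value.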
Notice that for $\phi(x_1) = -\infty$ we get $\rho_{opt} = -\theta_{opt}$. To obtain a descriptive bound we have restricted the Faber-Dzhrbashyan rational function to have a single pole of maximum multiplicity. The use of distinct poles may lead to even more accurate decay bounds, but the resulting functions would be considerably harder to treat.

The result of Theorem 4.4 is very general, as the integer $m$ is not specified. We next determine $m \le \xi$ such that the upper bound of the above theorem is minimized. Let us consider the quantity
\[
L(m) := K_1 \left(\frac{1}{\rho}\right)^{m+1} + K_2\, C_\xi\,(\xi+1) \left(\frac{1+\theta^2}{\rho|\theta|}\right)^{m} \left(\frac{1}{|\theta|}\right)^{\xi+1},
\]
with $m \in \mathbb{R}$. Differentiating in $m$ we obtain
\[
L'(m) = -K_1\, \frac{\log(\rho)}{\rho} \left(\frac{1}{\rho}\right)^{m} + K_2\, C_\xi\,(\xi+1)\, \log\!\left(\frac{1+\theta^2}{\rho|\theta|}\right) \left(\frac{1+\theta^2}{\rho|\theta|}\right)^{m} \left(\frac{1}{|\theta|}\right)^{\xi+1},
\]
and $L'(m) = 0$ is satisfied for
\[
\tilde m = (\xi+1)\, \frac{\log(|\theta|)}{\log(|\theta| + |\theta|^{-1})} + \frac{\log(K_1 \log(\rho)) - \log\!\left(K_2\, C_\xi\,(\xi+1)\,\rho\, \log\!\left(\frac{1+|\theta|^2}{\rho|\theta|}\right)\right)}{\log(|\theta| + |\theta|^{-1})} .
\]
Notice that $\tilde m \le \xi + 1$ for $\xi$ large enough. Then we set $m = \lfloor \min\{\max\{0, \tilde m\}, \xi\} \rfloor$. We remark that, since we fix the value $C_\xi(\xi+1)$, this is an approximation of the optimal value for $m$.

The left plot of Figure 4.1 numerically shows that $C_j$ in (4.3) can be bounded by the quantity $250(1.12)^j$. In summary, when $\phi(x_1) = -\infty$ the asymptotic off-diagonal decay can be approximated as
\[
|(f(A))_{k,\ell}| \le O\!\left(\left(\frac{1.12}{\rho_{opt}}\right)^{m}\right),
\]
with $m \approx \xi$ and $\rho_{opt}$ big enough so that the factor $(m+1)$ is negligible.

Fig. 4.1: Left. Comparison of $C_j$ in (4.3) and $250(1.12)^j$. Right. Comparison of $\varrho_F$ and $1.12\varrho_{FD}$ for $-\phi(0) \in [1, 10]$.

We next specialize this bound to the case of the inverse square root defined on an ellipse. This result will allow us to make a comparison, at least in asymptotic terms, with respect to Corollary 3.6, which followed from the use of Faber polynomials.

Corollary 4.5. Let $A \in B_n(\beta,\gamma)$ with field of values contained in a closed set $E \subset \mathbb{C}^+$, whose boundary is a horizontal ellipse with semiaxes $a \ge b > 0$ and center $c \in \mathbb{R}$, so that $c > a$. With the notation of Theorem 4.4, it holds that $\theta_{opt} = -\rho_{opt}$ with $\rho_{opt} = -\phi(0) + \sqrt{(\phi(0))^2 - 1}$, and
\[
\phi(0) = -\frac{c + \sqrt{c^2 - (a^2 - b^2)}}{a + b} . \tag{4.7}
\]
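The quantities in Corollary 4.5 are easy to evaluate for a concrete ellipse. The sketch below assumes the usual exterior map of an ellipse, $\psi(z) = c + \frac{a+b}{2}z + \frac{a-b}{2}z^{-1}$, which is not written out in this section and is used here only to check that (4.7) gives the preimage of the origin. The function names and the sample ellipse are hypothetical.

```python
from math import sqrt

def phi_at_zero(a: float, b: float, c: float) -> float:
    """phi(0) from (4.7) for a horizontal ellipse with semiaxes a >= b > 0, center c > a."""
    return -(c + sqrt(c * c - (a * a - b * b))) / (a + b)

def psi(z: float, a: float, b: float, c: float) -> float:
    """Assumed exterior map of the ellipse (inverse of phi)."""
    return c + 0.5 * (a + b) * z + 0.5 * (a - b) / z

def rho_opt_ellipse(a: float, b: float, c: float) -> float:
    """rho_opt = -phi(0) + sqrt(phi(0)^2 - 1), so that theta_opt = -rho_opt."""
    p = phi_at_zero(a, b, c)
    return -p + sqrt(p * p - 1.0)

# Hypothetical ellipse: semiaxes 2 and 1, centered at 3.
a, b, c = 2.0, 1.0, 3.0
p0 = phi_at_zero(a, b, c)
print(p0, psi(p0, a, b, c), rho_opt_ellipse(a, b, c), 1.0 / rho_opt_ellipse(a, b, c))
```

The second printed value is zero up to rounding, confirming that $\psi(\phi(0)) = 0$ under the assumed map; the last value is the rate $1/\rho_{opt}$.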
Proof. The result follows from Theorem 4.4 with
\[
w^{-1/2} = \int_{-\infty}^{0} \frac{1}{w - x}\, \frac{1}{\pi\sqrt{-x}}\, dx .
\]
Let $\varrho_{FD} := \frac{1}{\rho_{opt}}$ and let
\[
\varrho_F := \frac{a+b}{c}\, \frac{1}{1 + \sqrt{1 - (a^2 - b^2)/c^2}}
\]
be the asymptotic decay rate in Corollary 3.6. Comparing $\varrho_F$ with the expression for $\phi(0)$ in (4.7) we see that $\varrho_F = -\frac{1}{\phi(0)}$. As a consequence, we obtain
\[
\varrho_F - 1.12\,\varrho_{FD} = \frac{1}{-\phi(0)} - \frac{1.12}{-\phi(0) + \sqrt{\phi(0)^2 - 1}} \ge 0
\]
for $\phi(0) < -1.0073$. The behavior of $\varrho_F$ and $1.12\varrho_{FD}$ for $-\phi(0) \in [1, 10]$ is reported in the right plot of Figure 4.1. Clearly, $1.12\varrho_{FD}$ is significantly smaller than $\varrho_F$ for all considered values of $\phi(0)$, also for $-\phi(0)$ very close to 1, which occurs for instance when $c \approx a$ (that is, the ellipse almost intersects the imaginary axis). It thus follows that asymptotically, and for $m = \xi$ in Theorem 4.4, the decay rate predicted by the rational function estimate is higher than that predicted by the Faber polynomial bound. The following example experimentally confirms this argument.

Example 4.6. In Figure 4.2 we compare the bound in Theorem 4.4 with the bound in Corollary 3.6 for $f(x) = x^{-1/2}$, with $x \in \mathbb{C}^+$, and for two $200 \times 200$ matrices A = Toeplitz(i, −0.5i, −i, −i) + αI, with α = 3.5, 7 (eigenvector condition numbers κ = 5.1e+19 and 1.8e+19, respectively). The description of the left plots is as in the previous examples. The right plots report on the decay of the 147-th column of $A^{-1/2}$ computed via the Matlab command F=eye(n)/sqrtm(A) (solid line), the corresponding bound from Corollary 3.6 (“×”), and the corresponding bound from Theorem 4.4 (“◦”). The two matrices differ by a translation of 3.5I, hence one field of values is closer to the origin (part of the function domain boundary). In both cases the bound in Theorem 4.4 captures the true decay rate of the entries, performing better than the one in Corollary 3.6. The improved accuracy can be particularly appreciated for α = 3.5, as the field of values approaches the origin. For the computation of the bound in Theorem 4.4 we set $C_E^f = 1$, $d(W(A), \partial E) = 0.1$, $\ell(\Gamma) = 2\pi a$, with $a \ge b$.

Fig. 4.2: Example 4.6. Top: matrix A = Toeplitz(i, 3.5 − 0.5i, −i, −i) ∈ $B_{200}(2, 1)$. Bottom: matrix A = Toeplitz(i, 7 − 0.5i, −i, −i) ∈ $B_{200}(2, 1)$. Left: W(A) (colored area), its eigenvalues (“×”), and ellipses used in the bounds (dashed line). Right (log-scale): elements of the 147-th column of $A^{-1/2}$ (solid line), bound from Corollary 3.6 (“×”), and bound from Theorem 4.4 (“◦”).

5. Conclusions. We have demonstrated that for a large class of functions sharp bounds on the off-diagonal decay pattern of functions of non-normal matrices can be obtained. Different proof strategies have been adopted, to comply with the analyticity properties of the considered functions, and the spectral properties of the given matrices. To this end, both polynomials and rational functions have been employed. The former functions have been used to specialize general estimates to strictly monotonic functions. The latter functions have been shown to be suitable for dealing with Markov functions. As expected, our bounds are also influenced by the dependence between the predicted decay rate and the shape and dimension of the set enclosing
the field of values of A. The closer E is to the field of values, the sharper the bound. As already mentioned, decay estimates can be used in a variety of applications, and in particular in numerical linear algebra. We plan to investigate their use in the convergence analysis of projection-type iterative system solvers [26], where the banded structure arises naturally in certain matrices that are generated during the iteration; see, e.g., [30, 31] for results in this direction.

Appendix A. In this Appendix we report the proofs of some technical lemmas.

Lemma A.1. Let
\[
C(n, j, s) = \frac{(j + n - s)!}{j!\,((n - s)!)^2} .
\]
For $s \le j \le n$ it holds
\[
C(n, j, s) \le C_j := \frac{\left(j + \left\lfloor \frac{1+\sqrt{4j+1}}{2} \right\rfloor\right)!}{j!\,\left(\left\lfloor \frac{1+\sqrt{4j+1}}{2} \right\rfloor !\right)^2} .
\]
Proof. We look for the values $s$ for which $C(n, j, s) \le C(n, j, s+1)$, i.e.,
\[
\frac{(j + n - s)!}{j!\,((n - s)!)^2} \le \frac{(j + n - s - 1)!}{j!\,((n - s - 1)!)^2}
\quad\Leftrightarrow\quad
\frac{j + n - s}{(n - s)^2} \le 1 .
\]
This is equivalent to $s^2 - (2n - 1)s + n^2 - n - j \ge 0$. This second order equation has roots
\[
s_1 = \frac{2n - 1 - \sqrt{4j+1}}{2}, \qquad s_2 = \frac{2n - 1 + \sqrt{4j+1}}{2},
\]
with $s_1 \le n - 1 < n \le s_2$. Therefore, $C(n, j, s)$ is an increasing function of $s$ up to $s = s_1$. Finally we show that $C(n, j, \lfloor s_1 \rfloor) \le C(n, j, \lceil s_1 \rceil)$, that is
\[
\frac{\left(j + \left\lceil \frac{1+\sqrt{4j+1}}{2} \right\rceil\right)!}{j!\,\left(\left\lceil \frac{1+\sqrt{4j+1}}{2} \right\rceil !\right)^2}
\le
\frac{\left(j + \left\lfloor \frac{1+\sqrt{4j+1}}{2} \right\rfloor\right)!}{j!\,\left(\left\lfloor \frac{1+\sqrt{4j+1}}{2} \right\rfloor !\right)^2} .
\]
Noticing that $\left\lceil \frac{1+\sqrt{4j+1}}{2} \right\rceil = \left\lfloor \frac{1+\sqrt{4j+1}}{2} \right\rfloor + 1$ we obtain the condition
\[
j + \left\lfloor \frac{1+\sqrt{4j+1}}{2} \right\rfloor + 1 \le \left(\left\lfloor \frac{1+\sqrt{4j+1}}{2} \right\rfloor + 1\right)^2,
\]
equivalent to $\left\lfloor \frac{1+\sqrt{4j+1}}{2} \right\rfloor^2 + \left\lfloor \frac{1+\sqrt{4j+1}}{2} \right\rfloor - j \ge 0$. This inequality is true for every $j \ge 0$ since the left-hand expression of the previous inequality is greater than
\[
\left(\frac{1+\sqrt{4j+1}}{2} - 1\right)^2 + \left(\frac{1+\sqrt{4j+1}}{2} - 1\right) - j = 0 .
\]
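Lemma A.1 can be checked by brute force for small indices using exact rational arithmetic. The sketch below is an illustration only: the helper names and the search ranges are assumptions, and the check covers finitely many triples $(n, j, s)$ with $s \le j \le n$.

```python
from fractions import Fraction
from math import factorial, isqrt

def alpha(j: int) -> int:
    """floor((1 + sqrt(4j + 1)) / 2), computed exactly with an integer square root."""
    return (1 + isqrt(4 * j + 1)) // 2

def C(n: int, j: int, s: int) -> Fraction:
    """C(n, j, s) = (j + n - s)! / (j! ((n - s)!)^2)."""
    return Fraction(factorial(j + n - s), factorial(j) * factorial(n - s) ** 2)

def C_bound(j: int) -> Fraction:
    """C_j as defined in Lemma A.1."""
    a = alpha(j)
    return Fraction(factorial(j + a), factorial(j) * factorial(a) ** 2)

def lemma_holds(j_max: int, extra: int = 6) -> bool:
    """Check C(n, j, s) <= C_j for 0 <= s <= j <= n <= j + extra and j <= j_max."""
    for j in range(j_max + 1):
        for n in range(j, j + extra + 1):
            for s in range(j + 1):
                if C(n, j, s) > C_bound(j):
                    return False
    return True

print([C_bound(j) for j in range(4)])
print(lemma_holds(12))
```

Using `Fraction` avoids any floating-point doubt, since $C(n, j, s)$ is in general not an integer.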
Lemma A.2. The parameter Cj defined in Lemma A.1 is a monotonically increasing function for j ≥ 0.
Proof. We want to prove that
\[
C_j \le C_{j+1} \quad\Leftrightarrow\quad \frac{C_{j+1}}{C_j} \ge 1 \tag{A.1}
\]
for $j = 0, 1, \ldots$. We first notice that (A.1) is trivial for $j = 0$. Therefore, let $j > 0$. Let $\alpha_j := \left\lfloor \frac{1+\sqrt{4j+1}}{2} \right\rfloor$. Noticing that
\[
\alpha_{j+1} = \left\lfloor \frac{1+\sqrt{4(j+1)+1}}{2} \right\rfloor \le \left\lfloor \frac{1 + 2 + \sqrt{4j+1}}{2} \right\rfloor = 1 + \left\lfloor \frac{1+\sqrt{4j+1}}{2} \right\rfloor = 1 + \alpha_j,
\]
then either $\alpha_{j+1} = \alpha_j$ or $\alpha_{j+1} = \alpha_j + 1$. In the first case, from (A.1) after some simplifications we get the inequality $\frac{j+1+\alpha_j}{j+1} \ge 1$, which is true for every $j \ge 0$. In the second case, that is for $\alpha_{j+1} = \alpha_j + 1$, after some simplifications inequality (A.1) becomes
\[
\frac{(j + 2 + \alpha_j)(j + 1 + \alpha_j)}{(j+1)(1+\alpha_j)^2} \ge 1
\]
or, equivalently,
\[
j^2 - j\alpha_j^2 + 2j + \alpha_j + 1 \ge 0 . \tag{A.2}
\]
We notice that for $j = 1, 2, \ldots$
\[
\frac{1+\sqrt{4(j+1)+1}}{2} - \frac{1+\sqrt{4j+1}}{2} < \frac{1}{2} .
\]
Indeed, $\sqrt{4j+5} < \sqrt{4j+1} + 1$ if and only if $j > 5/16$. Therefore, since $\frac{1+\sqrt{4(j+1)+1}}{2} \ge \alpha_j + 1$ we get
\[
\frac{1+\sqrt{4j+1}}{2} > \alpha_j + \frac{1}{2} .
\]
Hence, the left-hand side of (A.2) is greater than or equal to
\[
j^2 - j\left(\frac{1+\sqrt{4j+1}}{2} - \frac{1}{2}\right)^2 + 2j + \alpha_j + 1,
\]
which is nonnegative if and only if $\frac{7}{4}j + \alpha_j + 1 \ge 0$, which holds for every $j > 0$.

REFERENCES
[1] B. Beckermann. Image numérique, GMRES et polynômes de Faber. C. R. Acad. Sci. Paris, Ser. I, 340(11):855–860, 2005.
[2] B. Beckermann and L. Reichel. Error estimates and evaluation of matrix functions via the Faber transform. SIAM J. Numer. Anal., 47(5):3849–3883, Jan 2009.
[3] M. Benzi and P. Boito. Decay properties for functions of matrices over C∗-algebras. Linear Algebra and its Applications, 456:174–198, Sep 2014.
[4] M. Benzi, P. Boito, and N. Razouk. Decay properties of spectral projectors with applications to electronic structure. SIAM Review, 55(1):3–64, Feb 2013.
[5] M. Benzi and G. H. Golub. Bounds for the entries of matrix functions with applications to preconditioning. BIT Numerical Mathematics, 39(3):417–438, 1999.
[6] M. Benzi and N. Razouk. Decay bounds and O(n) algorithms for approximating functions of sparse matrices. Electronic Transactions on Numerical Analysis, 28:16–39, 2007.
[7] M. Benzi and V. Simoncini. Decay bounds for functions of Hermitian matrices with banded or Kronecker structure. SIAM Journal on Matrix Analysis and Applications, 36(3):1263–1282, Jan 2015.
[8] S. Bernstein. Sur les fonctions absolument monotones. Acta Mathematica, 52(1):1–66, 1929.
[9] C. Canuto, V. Simoncini, and M. Verani. On the decay of the inverse of matrices that are sum of Kronecker products. Linear Algebra and its Applications, 452:21–39, Jul 2014.
[10] C. C. Cowen and E. Harel. An effective algorithm for computing the numerical range. https://www.math.iupui.edu/ ccowen/Downloads/33NumRange.html, 1995.
[11] M. Crouzeix. Numerical range and functional calculus in Hilbert space. Journal of Functional Analysis, 244(2):668–690, Mar 2007.
[12] N. Del Buono, L. Lopez, and R. Peluso. Computation of the exponential of large sparse skew-symmetric matrices. SIAM Journal on Scientific Computing, 27(1):278–293, Jan 2005.
[13] S. G. Demko. Inverses of band matrices and local convergence of spline projections. SIAM Journal on Numerical Analysis, 14(4):616–619, 1977.
[14] V. Druskin and L. Knizhnerman. Extended Krylov subspaces: approximation of the matrix square root and related functions. SIAM J. Matrix Anal. Appl., 19(3):755–771 (electronic), 1998.
[15] V. Druskin, L. Knizhnerman, and V. Simoncini. Analysis of the rational Krylov subspace and ADI methods for solving the Lyapunov equation. SIAM J. Numer. Anal., 49:1875–1898, 2011.
[16] V. Eijkhout and B. Polman. Decay rates of inverses of banded M-matrices that are near to Toeplitz matrices. Linear Algebra and its Applications, 109:247–277, 1988.
[17] S. W. Ellacott. Computation of Faber series with application to numerical polynomial approximation in the complex plane. Mathematics of Computation, 40(162):575–587, April 1983.
[18] R. Freund. On polynomial approximations to $f_a(z) = (z - a)^{-1}$ with complex a and some applications to certain non-hermitian matrices. Approx. Theory and Appl., 5:15–31, 1989.
[19] N. J. Higham. Functions of Matrices: Theory and Computation. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008.
[20] A. Iserles. How large is the exponential of a banded matrix? New Zealand Journal of Mathematics, 29:177–192, 2000.
[21] L. Knizhnerman and V. Simoncini. Convergence analysis of the Extended Krylov Subspace Method for the Lyapunov equation. Numerische Mathematik, 118(3):567–586, 2011.
[22] N. Mastronardi, M. K.-P. Ng, and E. E. Tyrtyshnikov. Decay in functions of multiband matrices. SIAM Journal on Matrix Analysis and Applications, 31(5):2721–2737, Jan 2010.
[23] The MathWorks, Inc. MATLAB 7, r2013b edition, 2013.
[24] M. Merkle. Analytic Number Theory, Approximation Theory, and Special Functions: In Honor of Hari M. Srivastava, chapter Completely Monotone Functions: A Digest, pages 347–364. Springer New York, New York, NY, 2014.
[25] G. Meurant. A review on the inverse of symmetric tridiagonal and block tridiagonal matrices. SIAM Journal on Matrix Analysis and Applications, 13(3):707–728, 1992.
[26] S. Pozza and V. Simoncini. On the convergence of Krylov subspace solvers and decay pattern of banded matrices, 2016. In preparation.
[27] R. F. Rinehart. The equivalence of definitions of a matric function. The American Mathematical Monthly, 62(6):395–414, 1955.
[28] P. K. Suetin. Series of Faber polynomials. Gordon and Breach Science Publishers, 1998. Translated from the 1984 Russian original by E. V. Pankratiev [E. V. Pankrat'ev].
[29] J. van den Eshof, A. Frommer, T. Lippert, K. Schilling, and H. van der Vorst. Numerical methods for the QCD overlap operator. I: Sign-function and error bounds. Comput. Phys. Commun., 146(2):203–224, 2002.
[30] H. Wang. The Krylov Subspace Methods for the Computation of Matrix Exponentials. PhD thesis, Department of Mathematics, University of Kentucky, 2015.
[31] H. Wang and Q. Ye. Error Bounds for the Krylov Subspace Methods for Computations of Matrix Exponentials. ArXiv e-prints, Mar 2016. arXiv:1603.07358.
[32] Q. Ye. Error bounds for the Lanczos methods for approximating matrix exponentials. SIAM J. Numer. Anal., 51(1):68–87, Jan 2013.