Open Problems and Surveys of Contemporary Mathematics, SMM 6, pp. 133–193

© Higher Education Press and International Press, Beijing–Boston

Structure-Preserving Doubling Algorithms — Rediscovery, Redevelopment and Applications

Eric King-wah Chu∗, Wen-Wei Lin†

Abstract. In this paper, we attempt to tell the interesting story of the recent rediscovery and revival of the once faded and almost forgotten doubling algorithm for the discrete-time algebraic Riccati equation. Armed with new insight and theoretical redevelopment, the method, now called the structure-preserving doubling algorithm (SDA), is first linked to palindromic eigenvalue problems. Generalizations were then developed for continuous-time, nonsymmetric and PCP algebraic Riccati equations and nonlinear matrix equations, for a vast array of applications in vibration analysis, surface acoustic wave simulation, optimal control, stochastic systems, transport theory, time-delayed systems, nano research, etc. We shall present some key theoretical results associated with selected important applications, and a summary of the recent discovery that the SDA can be adapted to solve large-scale problems of dimension n with an O(n) computational complexity and memory requirement.

2000 Mathematics Subject Classification: 15A18, 15A22, 15A24, 15A30, 65F15, 65F50

Keywords and Phrases: algebraic Riccati equation, continuous-time system, discrete-time system, doubling algorithm, eigenvalue, invariant subspace, large-scale problem, Lyapunov equation, matrix pencil, nano research, Newton's method, nonlinear matrix equation, palindromic eigenvalue problem, singular system, Stein equation, surface acoustic wave, time-delayed system, vibration

∗ School of Mathematical Sciences, Building 28, Monash University 3800. Email: [email protected]
† Department of Applied Mathematics, Chiao Tung University, Hsinchu 300. Email: [email protected]


Contents

1 Introduction                                                        134
2 What is new?                                                        135
   2.1 SDA 1 and SSF 1 for DAREs; Palindromic eigenvalue problems     136
   2.2 SDA 2 and SSF 2 for NMEs                                       137
   2.3 Advantages of SDA                                              137
3 Applications                                                        138
   3.1 Vibration of fast trains                                       138
   3.2 SAW simulation                                                 146
   3.3 Nano research                                                  150
   3.4 Sensitivity of time-delayed systems                            165
   3.5 PDA                                                            171
   3.6 Large-scale problems                                           186
4 Conclusions                                                         188

1 Introduction

Our story starts from the discrete-time linear system
$$x_{j+1} = A x_j + B u_j,$$
with the state $x \in \mathbb{R}^n$, the control $u \in \mathbb{R}^m$, and the system and control matrices $A \in \mathbb{R}^{n\times n}$ and $B \in \mathbb{R}^{n\times m}$, respectively. In optimal control with an infinite time horizon, we choose $u_j$ such that
$$\min_{u_j} J = \frac{1}{2} \sum_{j=1}^{\infty} \left( x_j^\top H x_j + u_j^\top R u_j \right),$$
with both $H \in \mathbb{R}^{n\times n}$ and $R \in \mathbb{R}^{m\times m}$ being symmetric. With the stabilizing solution $X$ from the discrete algebraic Riccati equation (DARE)
$$X = A^\top X (I + GX)^{-1} A + H, \qquad G = B R^{-1} B^\top, \tag{1.1}$$
the optimal control has the form $u_j = -(R + B^\top X B)^{-1} B^\top X A x_j$. One way to solve the DARE (1.1) is to use the structure-preserving doubling algorithm (SDA):
$$\begin{aligned}
A_{k+1} &= A_k (I_n + G_k H_k)^{-1} A_k, \\
G_{k+1} &= G_k + A_k G_k (I_n + H_k G_k)^{-1} A_k^\top, \\
H_{k+1} &= H_k + A_k^\top (I_n + H_k G_k)^{-1} H_k A_k;
\end{aligned} \tag{1.2}$$


with $A_0 = A$, $H_0 = H$ and $G_0 = G$. A second type of structure-preserving doubling algorithm (SDA 2) will be introduced later, so the iteration in (1.2) is referred to as the SDA of type 1, or SDA 1. Under favourable conditions, the SDA 1 converges quadratically, with $A_k \to 0$, $H_k \to X$ and $G_k \to Y$ (the adjoint solution) as $k \to \infty$.

The SDA 1 was well known in the seventies, but faded from view until its revival by Chu, Fan, Lin and Wang in 2004 [17] (actually for the more general periodic DARE). The origin of the SDA is, interestingly, unknown. In [2, p. 159], we have the following quotation by Anderson and Moore:

   "Doubling algorithms have been part of the folklore associated with Riccati equations in linear systems problems for some time. We are unable to give any original reference containing material close to that presented here. However, more recent references include
   [31] Bierman, G.J., Steady-state covariance computation for discrete linear systems, Proc. 1971 JACC, paper No. 8-C3.
   [32] Friedlander, B., T. Kailath and L. Ljung, Scattering theory and linear least squares estimation – II. Discrete-time problems, J. Franklin Inst., Vol. 301, Nos. 1 and 2, 1976, pp. 71–82.
   [33] Anderson, B.D.O., Second-order convergent algorithms for the steady-state Riccati equations, Int. J. Control, Vol. 28, No. 2, pp. 295–306, 1978."

Many have attributed the SDA to Anderson (and his 1978 paper above), but that is incorrect according to the quotation above. In this paper, we shall discuss what is new in the recent rediscovery and redevelopment of the SDA in Section 2. The algorithms and theoretical results for selected important applications (vibration analysis of fast trains, surface acoustic wave simulation, Green's function computation in nano research, sensitivity analysis of time-delayed systems, and the linear palindromic eigenvalue problem $A^\ast x = \lambda A x$, with applications such as the optimal control of singular descriptor linear systems) occupy most of Section 3. A brief summary of the recent generalizations of the SDA for large-scale problems is contained in Section 3.6, before the conclusions in Section 4. To limit the length of the paper, we omit some minor results and all numerical experiments; readers should consult the quoted references for details. For an earlier brief survey of the SDA, see [19].
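To make the recurrence (1.2) concrete, here is a minimal dense-matrix sketch. The helper name `sda1` and the random test problem are ours, not from the paper; the iteration is stopped once $\|A_k\|$ is negligible.

```python
import numpy as np

def sda1(A, G, H, k_max=30, tol=1e-14):
    """SDA of type 1, eq. (1.2), for the DARE X = A^T X (I + G X)^{-1} A + H."""
    n = A.shape[0]
    Ak, Gk, Hk = A.copy(), G.copy(), H.copy()
    for _ in range(k_max):
        W = np.linalg.inv(np.eye(n) + Gk @ Hk)   # (I + G_k H_k)^{-1}
        V = np.linalg.inv(np.eye(n) + Hk @ Gk)   # (I + H_k G_k)^{-1}
        Ak, Gk, Hk = (Ak @ W @ Ak,
                      Gk + Ak @ Gk @ V @ Ak.T,
                      Hk + Ak.T @ V @ Hk @ Ak)
        if np.linalg.norm(Ak) < tol:             # A_k -> 0 at convergence
            break
    return Hk, Gk                                # H_k -> X, G_k -> Y

# a random stabilizable/detectable instance of the DARE (1.1), with R = I
rng = np.random.default_rng(0)
n, m = 4, 2
A = rng.standard_normal((n, n))
A *= 0.9 / np.linalg.norm(A, 2)                  # keep the test well behaved
B = rng.standard_normal((n, m))
G = B @ B.T                                      # G = B R^{-1} B^T with R = I
C = rng.standard_normal((n, n))
H = C @ C.T + np.eye(n)                          # symmetric positive definite
X, Y = sda1(A, G, H)
res = X - (A.T @ X @ np.linalg.inv(np.eye(n) + G @ X) @ A + H)
print(np.linalg.norm(res))                       # residual of (1.1), tiny
```

In practice, the quadratic convergence means fewer than ten doubling steps reduce the DARE residual to machine precision.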

2 What is new?

We shall sketch the big picture of how the SDA has been revived and redeveloped, before giving the details for selected applications in the next section. We shall describe how the SDA 1 (1.2) has been generalized for several seemingly very different problems. So how are these generalizations of the SDA and the associated array of diverse applications linked?

2.1 SDA 1 and SSF 1 for DAREs; Palindromic eigenvalue problems

The original SDA, borrowed from the physical phenomenon of scattering, also acted as a convergence acceleration technique. Consequently, only a small set of techniques could be employed in its analysis, limiting its understanding, growth and applicability. The first major modern step is to link the DARE (1.1) to an equivalent eigenvalue problem:
$$\mathcal{M} Z = \mathcal{L} Z \Lambda, \qquad \mathcal{M} \equiv \begin{bmatrix} A & 0 \\ -H & I_n \end{bmatrix}, \quad \mathcal{L} \equiv \begin{bmatrix} I_n & G \\ 0 & A^\top \end{bmatrix}, \quad Z = \begin{bmatrix} I_n \\ X \end{bmatrix}. \tag{2.1}$$
The eigenvalue problem (2.1) is described as symplectic because the pencil $(\mathcal{M}, \mathcal{L})$ satisfies
$$\mathcal{M} J \mathcal{M}^\top = \mathcal{L} J \mathcal{L}^\top, \qquad J \equiv \begin{bmatrix} 0 & I_n \\ -I_n & 0 \end{bmatrix}.$$

The canonical form $(\mathcal{M}, \mathcal{L})$ in (2.1) is called a standard symplectic form of type 1 (SSF 1). It is easy to show that the eigenvalues of $(\mathcal{M}, \mathcal{L})$ occur in reciprocal pairs; i.e., with $\sigma(\cdot)$ denoting the spectrum of matrices, matrix pencils or polynomials, $\lambda \in \sigma(\mathcal{M}, \mathcal{L})$ implies $\lambda \in \sigma(\mathcal{L}, \mathcal{M})$. To obtain the stabilizing solution $X$ for the DARE (1.1), the more general "doubling" action on the matrix pencil $(\mathcal{M}, \mathcal{L})$ is applied: find $[\widetilde{\mathcal{M}}, \widetilde{\mathcal{L}}]$ such that
$$[\widetilde{\mathcal{M}}, \widetilde{\mathcal{L}}] \begin{bmatrix} \mathcal{L} \\ -\mathcal{M} \end{bmatrix} = 0, \tag{2.2}$$
which is equivalent to $\widetilde{\mathcal{M}}\mathcal{L} = \widetilde{\mathcal{L}}\mathcal{M}$, and then define
$$\left(\widehat{\mathcal{M}}, \widehat{\mathcal{L}}\right) \equiv \left(\widetilde{\mathcal{M}}\mathcal{M}, \widetilde{\mathcal{L}}\mathcal{L}\right).$$
The doubling action $(\mathcal{M}, \mathcal{L}) \to (\widehat{\mathcal{M}}, \widehat{\mathcal{L}})$ preserves symplecticity. More importantly, the stronger SSF 1 form of $(\mathcal{M}, \mathcal{L})$ is passed onto $(\widehat{\mathcal{M}}, \widehat{\mathcal{L}})$ when $[\widetilde{\mathcal{M}}, \widetilde{\mathcal{L}}]$ in (2.2) is selected as in [17], where some generalized elementary row-block operations are applied. The formulae in (1.2) are then retrieved. Assuming $\mathcal{M}x = \lambda\mathcal{L}x$, after the doubling action we have
$$\widehat{\mathcal{M}}x = \widetilde{\mathcal{M}}\mathcal{M}x = \lambda\widetilde{\mathcal{M}}\mathcal{L}x = \lambda\widetilde{\mathcal{L}}\mathcal{M}x = \lambda^2\widetilde{\mathcal{L}}\mathcal{L}x = \lambda^2\widehat{\mathcal{L}}x.$$

In other words, the power of $\lambda$ is doubled, hence the term "doubling". Assuming no unimodular eigenvalues exist, a few doubling actions reduce the problem to finding the null space of the converged matrix pencil. In the vibration analysis of fast trains and the simulation of surface acoustic waves (SAW), we are required to solve the palindromic eigenvalue problem
$$\left( \lambda^2 A_{d_1}^\top + \lambda A_{d_0} + A_{d_1} \right) x = 0, \qquad A_{d_0}^\top = A_{d_0}, \quad x \neq 0. \tag{2.3}$$
After a deflation process, (2.3) can be written in SSF 1 form, and the SDA 1 can then be applied for its solution; see [18] or Section 3.1 for the details.
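The mechanics above can be checked numerically on a small random instance. In the sketch below (ours, not from the paper), the generic choice $\widetilde{\mathcal{M}} = \mathcal{M}\mathcal{L}^{-1}$, $\widetilde{\mathcal{L}} = I$ satisfies (2.2) whenever $\mathcal{L}$ is invertible; unlike the choice in [17], it does not preserve the SSF 1 structure, but it suffices to exhibit the squaring of the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, 1)); G = B @ B.T     # G symmetric
C = rng.standard_normal((n, n)); H = C @ C.T     # H symmetric
I = np.eye(n); Z = np.zeros((n, n))

# SSF-1 pencil (2.1) and the matrix J
M = np.block([[A, Z], [-H, I]])
L = np.block([[I, G], [Z, A.T]])
J = np.block([[Z, I], [-I, Z]])

# (i) T-symplecticity: M J M^T = L J L^T
assert np.allclose(M @ J @ M.T, L @ J @ L.T)

# (ii) eigenvalues occur in reciprocal pairs: sorted moduli pair up to 1
lam = np.linalg.eigvals(np.linalg.solve(L, M))
r = np.sort(np.abs(lam))
assert np.allclose(r * r[::-1], 1.0)

# (iii) one doubling step with Mtilde = M L^{-1}, Ltilde = I squares the
#       spectrum: Mhat = Mtilde M = M L^{-1} M, Lhat = Ltilde L = L
Mhat = M @ np.linalg.solve(L, M)
Lhat = L
lam2 = np.linalg.eigvals(np.linalg.solve(Lhat, Mhat))
assert np.allclose(np.sort(np.abs(lam2)), np.sort(np.abs(lam ** 2)))
print("symplecticity, reciprocal pairing and doubling verified")
```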

2.2 SDA 2 and SSF 2 for NMEs

Alternatively, and more efficiently, the palindromic eigenvalue problem (2.3) can be solved via the solvent approach, with the factorization
$$\lambda^2 A_{d_1}^\top + \lambda A_{d_0} + A_{d_1} = (\lambda Y - A_{d_1}) Y^{-1} (\lambda A_{d_1} - Y)^\top,$$
provided that the nonlinear matrix equation (NME)
$$Y + A Y^{-1} A^\top = Q \tag{2.4}$$
holds, with $Q = -A_{d_0}$ and $A = A_{d_1}$. This can be shown to be equivalent to the eigenvalue problem
$$\mathcal{M} Z = \mathcal{L} Z \Lambda, \qquad \mathcal{M} \equiv \begin{bmatrix} A^\top & 0 \\ Q & -I_n \end{bmatrix}, \quad \mathcal{L} \equiv \begin{bmatrix} 0 & I_n \\ A & 0 \end{bmatrix}, \quad Z \equiv \begin{bmatrix} I_n \\ Y \end{bmatrix}. \tag{2.5}$$
Again, the matrix pencil $(\mathcal{M}, \mathcal{L})$ in (2.5) is symplectic and is called a standard symplectic form of type 2 (SSF 2). A specially designed doubling action can then be applied to $(\mathcal{M}, \mathcal{L})$, producing the SDA 2 for the solution of the NME (2.4):
$$\begin{aligned}
A_{k+1} &= A_k (Q_k - P_k)^{-1} A_k, \\
P_{k+1} &= P_k + A_k (Q_k - P_k)^{-1} A_k^\top, \\
Q_{k+1} &= Q_k - A_k^\top (Q_k - P_k)^{-1} A_k;
\end{aligned} \tag{2.6}$$
with $A_0 = A$, $P_0 = 0$ and $Q_0 = Q$. Under favourable conditions, we have $A_k \to 0$ and $Q_k \to Y$, the stabilizing solution of (2.4), as $k \to \infty$.
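As with SDA 1, the recurrence (2.6) is a few lines of dense linear algebra. The sketch below (the helper name `sda2` and the synthetic test are ours) builds $Q$ from a known solution; a symmetric $A$ is used so that the variants $AY^{-1}A^\top$ and $A^\top Y^{-1}A$ of the NME coincide, and $\|A\|_2 \le 0.3$ together with $Y_{\mathrm{true}} \succeq I$ guarantees that $Y_{\mathrm{true}}$ is the stabilizing solution the SDA seeks.

```python
import numpy as np

def sda2(A, Q, k_max=30, tol=1e-14):
    """SDA of type 2, eq. (2.6), for the NME  Y + A Y^{-1} A^T = Q."""
    Ak, Pk, Qk = A.copy(), np.zeros_like(A), Q.copy()
    for _ in range(k_max):
        W = np.linalg.inv(Qk - Pk)
        Ak, Pk, Qk = (Ak @ W @ Ak,
                      Pk + Ak @ W @ Ak.T,
                      Qk - Ak.T @ W @ Ak)
        if np.linalg.norm(Ak) < tol:             # A_k -> 0 at convergence
            break
    return Qk                                    # Q_k -> stabilizing solution

rng = np.random.default_rng(5)
n = 3
A = rng.standard_normal((n, n)); A = A + A.T     # symmetric A
A *= 0.3 / np.linalg.norm(A, 2)                  # small norm => stability
C = rng.standard_normal((n, n))
Ytrue = np.eye(n) + C @ C.T                      # known solution, Ytrue >= I
Q = Ytrue + A @ np.linalg.solve(Ytrue, A.T)      # so (2.4) holds at Ytrue
Y = sda2(A, Q)
res = np.linalg.norm(Y + A @ np.linalg.solve(Y, A.T) - Q)
print(res)                                       # residual of (2.4), tiny
```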

2.3 Advantages of SDA

The first advantage of the SDA lies in its wide applicability. From the brief discussion in Sections 2.1–2.2, we now have most of the ingredients for all the generalizations of the SDA. For other matrix equations, the continuous-time algebraic Riccati equation (CARE) can be treated after a Cayley transform, and the Stein and Lyapunov equations are special cases without nonlinear terms. Further generalizations have also been found for the nonsymmetric algebraic Riccati equation (NARE).

Algebraic Riccati equations arise in many important applications, including total least squares problems with or without symmetric constraints [26], the spectral factorization of rational matrix functions [30], linear and nonlinear optimal control [8], contractive rational matrix functions [55], the structured complex stability radius [39], transport theory [20, 52, 58], the Wiener-Hopf factorization of Markov chains [83], and the optimal solutions of linear differential systems [55]. Symmetric algebraic Riccati equations have been the topic of extensive research; the theory, applications and numerical solution of these equations are the subject of [13, 15, 17, 19], as well as the monographs [55, 70]. For NAREs, see [9, 20, 32, 36, 58]. Importantly, different applications possess different structures and solution requirements, throwing up new theoretical and computational challenges.
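A tiny illustration of the "special cases without nonlinear terms": setting $G = 0$ in (1.2) leaves $G_k = 0$ for all $k$ and collapses the SDA 1 to the squared Smith iteration $A_{k+1} = A_k^2$, $H_{k+1} = H_k + A_k^\top H_k A_k$ for the Stein equation $X = A^\top X A + H$. The sketch below (random data, ours) scales $A$ so that $\rho(A) < 1$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
A *= 0.8 / np.linalg.norm(A, 2)        # ||A||_2 = 0.8 ensures rho(A) < 1
C = rng.standard_normal((n, n))
H = C @ C.T
Ak, Hk = A.copy(), H.copy()
for _ in range(20):                    # A_k = A^(2^k) -> 0 very quickly
    Hk = Hk + Ak.T @ Hk @ Ak           # H_k accumulates sum of (A^T)^j H A^j
    Ak = Ak @ Ak
res = np.linalg.norm(Hk - (A.T @ Hk @ A + H))
print(res)                             # Stein-equation residual, tiny
```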


The palindromic quadratic eigenvalue problem can be generalized to the PCP-palindromic eigenvalue problem, for the sensitivity analysis of time-delayed systems. Furthermore, an SDA was discovered for the linear palindromic eigenvalue problem $A^\ast x = \lambda A x$, from a specific doubling action for $(A^\ast, A)$. When the problems are large, doubling can also be adapted, under additional sparsity and low-rank assumptions. In addition to the palindromic eigenvalue problems and the NMEs mentioned before, the SDA can now be applied to a wide range of interesting problems from important applications. This is the biggest advantage of the SDA over other methods. In fact, there is no comparable unifying suite of methods for such a large collection of problems and applications.

The next advantage of the SDA lies in its simplicity. With our new understanding of the method in terms of symplectic pencils, proofs can be constructed using established tools such as the Kronecker canonical form. The convergence of the SDA is now well understood, with fast quadratic convergence when the corresponding pencil $(\mathcal{M}, \mathcal{L})$ has no unimodular eigenvalues. When unimodular eigenvalues with even partial multiplicities exist, linear convergence with factor 1/2 can still be achieved.

Lastly, for large-scale problems, additional new insight into the SDA has been obtained. When the size n of the problem grows, the iterates in the SDA differ from the previous ones by diminishing low-rank updates. With this new insight, and under some minor additional assumptions, we can show, for the first time, that the SDA can be implemented with an O(n) computational complexity and memory requirement. Consequently, the SDA is a viable alternative for large-scale problems; for details, see Section 3.6 and [60, 61, 62].

3 Applications

3.1 Vibration of fast trains

Quoted from [18], we present here a study of the resonance phenomena of the track under high-frequency excitation forces. Research in this area contributes not only to the safety of the operations of high-speed trains, but also to new designs of train bridges, embedded rail structures (ERS) and train suspension systems. Recently, the dynamic response of the vehicle-rails-bridge interaction system under different train speeds was studied by Wu, Yang and Yau [84], and a procedure for designing an optimal ERS was proposed by Markine, de Man, Jovanovic and Esveld [69]. An accurate numerical estimation of the resonance frequencies of the rail plays an important role in both works. However, as mentioned by Ipsen [46], classic finite element packages fail to deliver correct resonance frequencies for such problems. Here, we compare the method proposed by Mackey, Mackey, Mehl and Mehrmann [67] with the generalized SDA methods proposed by Chu, Fan, Lin and Wang [17] and Lin and Xu [64], in solving the palindromic eigenvalue problems arising from the spectral modal analysis of the resonance of the rail under a periodic excitation force.

We assume that the rail sections between consecutive sleeper bays are identical, distances between consecutive wheels are the same and the wheel loads are


equal. Figure 3.1 shows an example of the rail section we consider here.

Figure 3.1 A 3D rail model.

Based on our assumptions, we model the rail under cargo wheel loads by a section of rail between two sleepers. The external force is assumed to be periodic, and the displacements of the two boundary cross sections of the modelled rail are assumed to have a ratio $\kappa$, which depends on the excitation frequency of the external force. In the following, we consider the rail as a three-dimensional isotropic elastic solid, and a 3D finite element model of the solid with linear isoparametric tetrahedron elements is introduced. From the element virtual work principle, the equilibrium state of the solid element $e$ under external body forces satisfies the following equation:
$$\int_e (\delta\epsilon)^\top C \epsilon \, dV + \int_e (\delta q)^\top \rho \ddot{q} \, dV = \int_e (\delta q)^\top f \, dV. \tag{3.1}$$
Here, $\rho$ is the mass density, $f$ is the time-dependent body force, $q = [u, v, w]^\top$ is the displacement vector,
$$\epsilon = \left[ \frac{\partial u}{\partial x},\ \frac{\partial v}{\partial y},\ \frac{\partial w}{\partial z},\ \frac{\partial u}{\partial y} + \frac{\partial v}{\partial x},\ \frac{\partial w}{\partial x} + \frac{\partial u}{\partial z},\ \frac{\partial v}{\partial z} + \frac{\partial w}{\partial y} \right]^\top,$$
and $\delta q$ and $\delta\epsilon$ are the virtual displacement and the corresponding virtual strain vectors, and
$$C = \frac{E}{(1+\upsilon)(1-2\upsilon)}\, \mathrm{diag}(C_1, C_2)$$
is the well-known strain-stress relationship, where $E$ is the Young's modulus, $\upsilon$ is the Poisson ratio and
$$C_1 = \begin{bmatrix} 1-\upsilon & \upsilon & \upsilon \\ \upsilon & 1-\upsilon & \upsilon \\ \upsilon & \upsilon & 1-\upsilon \end{bmatrix}, \qquad C_2 = \frac{1-2\upsilon}{2}\, I_3.$$
Let $\phi_i$ and $[u_i, v_i, w_i]^\top$ ($i = 1, \ldots, 4$) be the linear nodal basis function and the nodal displacement vector associated with the $i$-th node of the element $e$, respectively, and let $X_e = [X_1^\top, X_2^\top, X_3^\top, X_4^\top]^\top$, $B_e = [B_1, B_2, B_3, B_4]$ and $N_e = [N_1, N_2, N_3, N_4]$, where
$$X_i = \begin{bmatrix} u_i \\ v_i \\ w_i \end{bmatrix}, \quad N_i = \begin{bmatrix} \phi_i & 0 & 0 \\ 0 & \phi_i & 0 \\ 0 & 0 & \phi_i \end{bmatrix}, \quad B_i = \begin{bmatrix} \frac{\partial\phi_i}{\partial x} & 0 & 0 \\ 0 & \frac{\partial\phi_i}{\partial y} & 0 \\ 0 & 0 & \frac{\partial\phi_i}{\partial z} \\ \frac{\partial\phi_i}{\partial y} & \frac{\partial\phi_i}{\partial x} & 0 \\ \frac{\partial\phi_i}{\partial z} & 0 & \frac{\partial\phi_i}{\partial x} \\ 0 & \frac{\partial\phi_i}{\partial z} & \frac{\partial\phi_i}{\partial y} \end{bmatrix}.$$

Equation (3.1) can now be discretized into the following linear equations:
$$\sum_e \left( \int_e B_e^\top C B_e \, dV \right) X_e + \rho \sum_e \left( \int_e N_e^\top N_e \, dV \right) \ddot{X}_e = \sum_e \left( \int_e N_e^\top N_e \, dV \right) F_e, \tag{3.2}$$
where $F_e = [F_1^\top, F_2^\top, F_3^\top, F_4^\top]^\top$ and $F_i$ ($i = 1, \ldots, 4$) is the $i$-th nodal force vector acting on element $e$. In the following, we denote $K = \sum_e \int_e B_e^\top C B_e \, dV$ and $M = \rho \sum_e \int_e N_e^\top N_e \, dV$. Equation (3.2) can now be written as
$$K X + M \ddot{X} = \rho^{-1} M F.$$

When considering the dynamic response of the solid, dissipative forces such as friction have to be considered. Their effect is introduced in the form of the so-called viscous damping $D\dot{X}$, where $D$ is the damping matrix. For lack of better alternatives, the proportional damping proposed by Strutt (Lord Rayleigh) [79] is employed, where $D$ is a linear combination of $K$ and $M$. The equation of motion involving viscous damping can now be written as
$$K X + D \dot{X} + M \ddot{X} = \rho^{-1} M F.$$
Due to the given boundary conditions on a uniform mesh, $K$, $D$ and $M$ have the following form:
$$\begin{bmatrix}
G_{11} & G_{12} & 0 & \cdots & 0 & \frac{1}{\kappa} G_{m,m+1} \\
G_{12}^\top & G_{22} & G_{23} & 0 & & 0 \\
0 & \ddots & \ddots & \ddots & \ddots & \vdots \\
\vdots & & \ddots & \ddots & \ddots & 0 \\
0 & & 0 & G_{m-2,m-1}^\top & G_{m-1,m-1} & G_{m-1,m} \\
\kappa G_{m,m+1} & 0 & \cdots & 0 & G_{m-1,m}^\top & G_{m,m}
\end{bmatrix}$$
with $G_{ii} \in \mathbb{C}^{n_i\times n_i}$ for $i = 1, \ldots, m$. Furthermore, from the spectral modal analysis, one considers $X = \hat{X}e^{i\omega t}$, where $\omega$ is the frequency of the external excitation force and $\hat{X}$ is the corresponding eigenmode. Consequently, we arrive at the palindromic eigenvalue problem
$$\kappa^{-1}\mathcal{P}(\kappa)\hat{X} = 0, \qquad \mathcal{P}(\kappa) \equiv \kappa^2 A_1 + \kappa A_0 + A_1^\top,$$


where $A_0, A_1 \in \mathbb{C}^{n\times n}$ with $n = n_1 + \cdots + n_m$ and
$$[A_1]_{ij} = \begin{cases} K_{m,m+1} + i\omega D_{m,m+1} - \omega^2 M_{m,m+1} & \text{if } i = m \text{ and } j = 1, \\ 0 & \text{otherwise}, \end{cases} \tag{3.3}$$
$$[A_0]_{ij} = \begin{cases} K_{i,j} + i\omega D_{i,j} - \omega^2 M_{i,j} & \text{if } i-1 \le j \le i+1, \\ 0 & \text{otherwise}. \end{cases}$$

3.1.1 Deflation

We shall consider the deflation of zero and infinite eigenvalues in this section; for the deflation of $\lambda = \pm 1$, consult [67]. From their definitions in (3.3), $A_1, A_0 \in \mathbb{C}^{n\times n}$ can be partitioned as
$$A_1 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ L & 0 & 0 \end{bmatrix}, \qquad A_0 = \begin{bmatrix} C_{11} & C_{12} & 0 \\ C_{12}^\top & C_{22} & C_{23} \\ 0 & C_{23}^\top & C_{33} \end{bmatrix},$$
where $L \in \mathbb{C}^{n_m\times n_1}$, $C_{11} = C_{11}^\top \in \mathbb{C}^{n_1\times n_1}$, $C_{33} = C_{33}^\top \in \mathbb{C}^{n_m\times n_m}$ and $C_{22} = C_{22}^\top \in \mathbb{C}^{\ell\times\ell}$ with $\ell = n - n_1 - n_m$. Assume that $C_{22}$ is nonsingular. We have observed that this assumption is generically valid in the numerical examples we have encountered; otherwise, the preprocessing procedure in [67] should be applied. Let
$$\Theta = \begin{bmatrix} I_{n_1} & -C_{12}C_{22}^{-1} & 0 \\ 0 & I_\ell & 0 \\ 0 & -C_{23}^\top C_{22}^{-1} & I_{n_m} \end{bmatrix}, \qquad \Pi = \begin{bmatrix} I_{n_1} & 0 & 0 \\ 0 & 0 & I_{n_m} \\ 0 & I_\ell & 0 \end{bmatrix}.$$
Then $\mathcal{P}(\lambda)$ can be transformed to the following form:
$$\Pi\Theta\mathcal{P}(\lambda)\Theta^\top\Pi^\top = \begin{bmatrix} \lambda(C_{11} - C_{12}C_{22}^{-1}C_{12}^\top) & L^\top - \lambda C_{12}C_{22}^{-1}C_{23} & 0 \\ \lambda(\lambda L - C_{23}^\top C_{22}^{-1}C_{12}^\top) & \lambda(C_{33} - C_{23}^\top C_{22}^{-1}C_{23}) & 0 \\ 0 & 0 & \lambda C_{22} \end{bmatrix} = \mathrm{diag}(I_{n_1}, \lambda I_{n_m}, I_\ell) \begin{bmatrix} S(\lambda) & 0 \\ 0 & \lambda C_{22} \end{bmatrix},$$
where
$$S(\lambda) = \begin{bmatrix} \lambda\tilde{C}_{11} & L^\top - \lambda\tilde{C}_{12} \\ \lambda L - \tilde{C}_{12}^\top & \tilde{C}_{22} \end{bmatrix},$$
with $\tilde{C}_{11} \equiv C_{11} - C_{12}C_{22}^{-1}C_{12}^\top$, $\tilde{C}_{12} \equiv C_{12}C_{22}^{-1}C_{23}$ and $\tilde{C}_{22} \equiv C_{33} - C_{23}^\top C_{22}^{-1}C_{23}$.

3.1.2 Structure-Preserving Algorithms

SDA 1. After swapping the row-blocks, the pencil $S(\lambda)$ is equivalent to
$$\lambda \begin{bmatrix} L & 0 \\ \tilde{C}_{11} & -\tilde{C}_{12} \end{bmatrix} + \begin{bmatrix} -\tilde{C}_{12}^\top & \tilde{C}_{22} \\ 0 & L^\top \end{bmatrix},$$


which is in a generalized standard symplectic form (GSSF) [44]. The structure-preserving doubling algorithm (SDA 1) in [44] can then be applied to solve the corresponding eigenvalue problem. In terms of accuracy and speed of convergence, SDA 1 behaves similarly to SDA 2 below; however, the operation count of SDA 1 doubles that of SDA 2, so we shall not discuss SDA 1 further. Nevertheless, SDA 1 remains a weapon in reserve against difficult palindromic eigenvalue problems, when some of the assumptions for SDA 2 are not satisfied.

Assume that $\tilde{C}_{22}$ is invertible. Define a new $\lambda$-matrix $\tilde{S}(\lambda)$ as follows:
$$\tilde{S}(\lambda) \equiv \begin{bmatrix} I_{n_1} & -L^\top\tilde{C}_{22}^{-1} \\ 0 & I_{n_m} \end{bmatrix} S(\lambda) \begin{bmatrix} I_{n_1} & 0 \\ \tilde{C}_{22}^{-1}\tilde{C}_{12}^\top & I_{n_m} \end{bmatrix} = \begin{bmatrix} \lambda(\tilde{C}_{11} - L^\top\tilde{C}_{22}^{-1}L - \tilde{C}_{12}\tilde{C}_{22}^{-1}\tilde{C}_{12}^\top) + L^\top\tilde{C}_{22}^{-1}\tilde{C}_{12}^\top & -\lambda\tilde{C}_{12} \\ \lambda L & \tilde{C}_{22} \end{bmatrix},$$
and let $[\tilde{x}^\top, \tilde{y}^\top]^\top$ be an eigenvector of $\tilde{S}(\lambda)$; i.e.,
$$\lambda\left[ (\tilde{C}_{11} - L^\top\tilde{C}_{22}^{-1}L - \tilde{C}_{12}\tilde{C}_{22}^{-1}\tilde{C}_{12}^\top)\tilde{x} - \tilde{C}_{12}\tilde{y} \right] + L^\top\tilde{C}_{22}^{-1}\tilde{C}_{12}^\top\tilde{x} = 0, \tag{3.4a}$$
$$\lambda L\tilde{x} + \tilde{C}_{22}\tilde{y} = 0. \tag{3.4b}$$
Since $\tilde{C}_{22}$ is invertible, from (3.4b) $\tilde{y}$ can be represented as
$$\tilde{y} = -\lambda\tilde{C}_{22}^{-1}L\tilde{x}. \tag{3.5}$$
Substituting (3.5) into (3.4a), we obtain the following new, smaller palindromic quadratic eigenvalue problem:
$$\mathcal{P}_d(\lambda)\tilde{x} \equiv (\lambda^2 A_{d_1} + \lambda A_{d_0} + A_{d_1}^\top)\tilde{x} = 0, \tag{3.6}$$
where
$$A_{d_1} = \tilde{C}_{12}\tilde{C}_{22}^{-1}L, \qquad A_{d_0} = \tilde{C}_{11} - L^\top\tilde{C}_{22}^{-1}L - \tilde{C}_{12}\tilde{C}_{22}^{-1}\tilde{C}_{12}^\top.$$

The SDA 1 can then be applied to the palindromic eigenvalue problem (3.6), which is of the same form as (2.3), written in the SSF 1 as in (2.1).

SDA 2. Suppose that $X$ is nonsingular. Rewrite $\mathcal{P}_d(\lambda)$ in (3.6) as
$$\mathcal{P}_d(\lambda) = (\lambda A_{d_1} - X)X^{-1}(\lambda X - A_{d_1}^\top) + \lambda(A_{d_1}X^{-1}A_{d_1}^\top + X + A_{d_0}).$$
It follows that $\mathcal{P}_d(\lambda)$ can be factorized (or "square-rooted") as
$$\mathcal{P}_d(\lambda) = (\lambda A_{d_1} - X)X^{-1}(\lambda X - A_{d_1}^\top)$$
for some nonsingular $X$ if and only if $X$ satisfies the following nonlinear matrix equation (NME):
$$A_{d_1}X^{-1}A_{d_1}^\top + X + A_{d_0} = 0. \tag{3.7}$$
We can easily prove the following lemma on the existence of solutions of the NME.


Lemma 3.1.1. Let $(\Lambda_1 \oplus \Lambda_2, [Y_1, Y_2])$ be an eigenpair of $\mathcal{P}_d(\lambda)$, in the sense that
$$A_{d_1}Y_i\Lambda_i^2 + A_{d_0}Y_i\Lambda_i + A_{d_1}^\top Y_i = 0 \qquad (i = 1, 2),$$
where $Y_i \in \mathbb{C}^{n_1\times n_1}$ for $i = 1, 2$. Suppose that $A_{d_1}$ and $Y_i$ ($i = 1, 2$) are invertible. Then the corresponding NME (3.7) has the solutions $X = A_{d_1}^\top Y_i\Lambda_i^{-1}Y_i^{-1}$ ($i = 1, 2$).

Evidently, there are many solutions to the NME, each facilitating the factorization of $\mathcal{P}_d(\lambda)$ we aim for. Assume that there are no eigenvalues on the unit circle. Consequently, we can partition the spectrum into $\Lambda_s \oplus \Lambda_s^{-1}$, with $\Lambda_s$ containing the stable eigenvalues (inside the unit circle). The SDA will seek the stable solution $X_s \equiv A_{d_1}^\top Y_s\Lambda_s^{-1}Y_s^{-1}$, where $Y_s$ contains the eigenvectors corresponding to $\Lambda_s$. Note that $X_s$ is unique, as it is independent of the order of the eigenvalues in $\Lambda_s$. The structure-preserving doubling algorithm SDA 2 (2.6) in [64] can then be applied to solve the NME, and subsequently the palindromic eigenvalue problem.

In the SDA 2 for fast trains, we assume the invertibility of the matrices $Q_k - P_k$. This is the case for large values of $k$, as indicated by Corollary 3.1.4; a more involved analysis [34] removes the assumption. Finally, equations similar to (3.7), for real matrices or with the transpose replaced by the Hermitian, have been studied, but the existing results do not apply. The lack of positivity of the matrices, or of an associated inner product, makes our NME difficult to study.

3.1.3 Convergence of Algorithms

The behaviour of the structure-preserving doubling algorithms is well documented in [15, 17, 36, 44, 45, 64]. However, these results are mostly written for real problems with real variables, and have to be modified for our situation. Following the development in [64], let $\mathcal{M} - \lambda\mathcal{L} \in \mathbb{C}^{2n\times 2n}$ be a T-symplectic pencil, in the sense that
$$\mathcal{M} J \mathcal{M}^\top = \mathcal{L} J \mathcal{L}^\top, \qquad J = \begin{bmatrix} 0 & I \\ -I & 0 \end{bmatrix}. \tag{3.8}$$
Define the nonempty null set
$$\mathrm{N}(\mathcal{M}, \mathcal{L}) \equiv \left\{ [\mathcal{M}_\ast, \mathcal{L}_\ast] : \mathcal{M}_\ast, \mathcal{L}_\ast \in \mathbb{C}^{2n\times 2n},\ \mathrm{rank}\,[\mathcal{M}_\ast, \mathcal{L}_\ast] = 2n,\ [\mathcal{M}_\ast, \mathcal{L}_\ast]\begin{bmatrix} \mathcal{L} \\ -\mathcal{M} \end{bmatrix} = 0 \right\}.$$
For any given $[\mathcal{M}_\ast, \mathcal{L}_\ast] \in \mathrm{N}(\mathcal{M}, \mathcal{L})$, define $\widehat{\mathcal{M}} = \mathcal{M}_\ast\mathcal{M}$ and $\widehat{\mathcal{L}} = \mathcal{L}_\ast\mathcal{L}$. For the doubling transformation $\mathcal{M} - \lambda\mathcal{L} \to \widehat{\mathcal{M}} - \lambda\widehat{\mathcal{L}}$, we have the following adaptation of [64, Theorem 2.1].

Theorem 3.1.2. Let $\widehat{\mathcal{M}} - \lambda\widehat{\mathcal{L}}$ be a doubling transformation of a T-symplectic pencil $\mathcal{M} - \lambda\mathcal{L}$. Then we have:

(a) The pencil $\widehat{\mathcal{M}} - \lambda\widehat{\mathcal{L}}$ is still T-symplectic.

(b) If $\mathcal{M}\begin{bmatrix} U \\ V \end{bmatrix} = \mathcal{L}\begin{bmatrix} U \\ V \end{bmatrix}S$, where $U, V \in \mathbb{C}^{n\times m}$ and $S \in \mathbb{C}^{m\times m}$, then
$$\widehat{\mathcal{M}}\begin{bmatrix} U \\ V \end{bmatrix} = \widehat{\mathcal{L}}\begin{bmatrix} U \\ V \end{bmatrix}S^2.$$

144

Eric King-wah Chu, Wen-Wei Lin

(c) If the pencil $\mathcal{M} - \lambda\mathcal{L}$ has the Kronecker canonical form
$$W\mathcal{M}Z = \begin{bmatrix} J_r & 0 \\ 0 & I_{2n-r} \end{bmatrix}, \qquad W\mathcal{L}Z = \begin{bmatrix} I_r & 0 \\ 0 & N_{2n-r} \end{bmatrix}, \tag{3.9}$$
where $W, Z$ are nonsingular, $J_r$ is a Jordan matrix corresponding to the finite eigenvalues of $\mathcal{M} - \lambda\mathcal{L}$ and $N_{2n-r}$ is a nilpotent Jordan matrix corresponding to the infinite eigenvalues of $\mathcal{M} - \lambda\mathcal{L}$, then there exists a nonsingular matrix $\widehat{W}$ such that
$$\widehat{W}\widehat{\mathcal{M}}Z = \begin{bmatrix} J_r^2 & 0 \\ 0 & I_{2n-r} \end{bmatrix}, \qquad \widehat{W}\widehat{\mathcal{L}}Z = \begin{bmatrix} I_r & 0 \\ 0 & N_{2n-r}^2 \end{bmatrix}. \tag{3.10}$$

Proof. (a) Since $[\mathcal{M}_\ast, \mathcal{L}_\ast] \in \mathrm{N}(\mathcal{M}, \mathcal{L})$ implies that $\mathcal{M}_\ast\mathcal{L} = \mathcal{L}_\ast\mathcal{M}$, it follows from (3.8) that
$$\widehat{\mathcal{M}} J \widehat{\mathcal{M}}^\top = \mathcal{M}_\ast\mathcal{M} J \mathcal{M}^\top\mathcal{M}_\ast^\top = \mathcal{M}_\ast\mathcal{L} J \mathcal{L}^\top\mathcal{M}_\ast^\top = (\mathcal{L}_\ast\mathcal{M}) J (\mathcal{M}^\top\mathcal{L}_\ast^\top) = \mathcal{L}_\ast\mathcal{L} J \mathcal{L}^\top\mathcal{L}_\ast^\top = \widehat{\mathcal{L}} J \widehat{\mathcal{L}}^\top,$$
implying that $\widehat{\mathcal{M}} - \lambda\widehat{\mathcal{L}}$ is T-symplectic.

(b) Again using $\mathcal{M}_\ast\mathcal{L} = \mathcal{L}_\ast\mathcal{M}$, we have
$$\widehat{\mathcal{M}}\begin{bmatrix} U \\ V \end{bmatrix} = \mathcal{M}_\ast\mathcal{M}\begin{bmatrix} U \\ V \end{bmatrix} = \mathcal{M}_\ast\mathcal{L}\begin{bmatrix} U \\ V \end{bmatrix}S = \mathcal{L}_\ast\mathcal{M}\begin{bmatrix} U \\ V \end{bmatrix}S = \mathcal{L}_\ast\mathcal{L}\begin{bmatrix} U \\ V \end{bmatrix}S^2 = \widehat{\mathcal{L}}\begin{bmatrix} U \\ V \end{bmatrix}S^2.$$

(c) Let
$$\widetilde{\mathcal{M}}_\ast = \begin{bmatrix} J_r & 0 \\ 0 & I_{2n-r} \end{bmatrix} W, \qquad \widetilde{\mathcal{L}}_\ast = \begin{bmatrix} I_r & 0 \\ 0 & N_{2n-r} \end{bmatrix} W.$$
We then have
$$\widetilde{\mathcal{M}}_\ast\mathcal{L}Z = \begin{bmatrix} J_r & 0 \\ 0 & I_{2n-r} \end{bmatrix} W\mathcal{L}Z = \begin{bmatrix} J_r & 0 \\ 0 & I_{2n-r} \end{bmatrix}\begin{bmatrix} I_r & 0 \\ 0 & N_{2n-r} \end{bmatrix} = \begin{bmatrix} J_r & 0 \\ 0 & N_{2n-r} \end{bmatrix},$$
$$\widetilde{\mathcal{L}}_\ast\mathcal{M}Z = \begin{bmatrix} I_r & 0 \\ 0 & N_{2n-r} \end{bmatrix} W\mathcal{M}Z = \begin{bmatrix} I_r & 0 \\ 0 & N_{2n-r} \end{bmatrix}\begin{bmatrix} J_r & 0 \\ 0 & I_{2n-r} \end{bmatrix} = \begin{bmatrix} J_r & 0 \\ 0 & N_{2n-r} \end{bmatrix}.$$
As $Z$ is nonsingular, this implies that $[\widetilde{\mathcal{M}}_\ast, \widetilde{\mathcal{L}}_\ast] \in \mathrm{N}(\mathcal{M}, \mathcal{L})$. Notice that (3.9) implies that $\mathcal{M} - \lambda\mathcal{L}$ is regular and $\mathrm{rank}\begin{bmatrix} \mathcal{L} \\ -\mathcal{M} \end{bmatrix} = 2n$. Thus $[\mathcal{M}_\ast, \mathcal{L}_\ast]$ and $[\widetilde{\mathcal{M}}_\ast, \widetilde{\mathcal{L}}_\ast]$ form two different bases of the null set $\mathrm{N}(\mathcal{M}, \mathcal{L})$. Consequently, there exists a nonsingular matrix $\widetilde{W}$ such that $[\widetilde{\mathcal{M}}_\ast, \widetilde{\mathcal{L}}_\ast] = \widetilde{W}[\mathcal{M}_\ast, \mathcal{L}_\ast]$. It can easily be verified that $\widetilde{W}$ satisfies (3.10).


It is easy to verify that the NME (3.7) has a symmetric nonsingular solution $X$ if and only if $X$ satisfies
$$\mathcal{M}\begin{bmatrix} I \\ X \end{bmatrix} = \mathcal{L}\begin{bmatrix} I \\ X \end{bmatrix}S$$
for some $S \in \mathbb{C}^{n\times n}$, where
$$\mathcal{M} \equiv \begin{bmatrix} A_{d_1}^\top & 0 \\ -A_{d_0} & -I \end{bmatrix}, \qquad \mathcal{L} \equiv \begin{bmatrix} 0 & I \\ A_{d_1} & 0 \end{bmatrix}.$$
Note that $\mathcal{M} - \lambda\mathcal{L}$ is in the second standard symplectic form (SSF 2) [64]. For the convergence of the SDA, we have the following adaptation of [64, Theorem 4.1].

Theorem 3.1.3. Let $X$ be a symmetric invertible solution of (3.7) and let $S = X^{-1}A_{d_1}^\top$. Then the matrix sequences $\{R_k\}$, $\{Q_k\}$ and $\{P_k\}$ generated by the SDA 2 satisfy

(a) $R_k = (X - P_k)S^{2^k}$;
(b) $Q_k - P_k = (X - P_k) + R_k^\top(X - P_k)^{-1}R_k$;
(c) $Q_k - X = (S^\top)^{2^k}(X - P_k)S^{2^k}$;

provided that all the required inverses of $Q_k - P_k$ exist.

Proof. We shall apply mathematical induction. With $R_0 = A_{d_1}^\top$, $Q_0 = -A_{d_0}$ and $P_0 = 0$, denote
$$\mathcal{M}_k \equiv \begin{bmatrix} R_k & 0 \\ Q_k & -I \end{bmatrix}, \qquad \mathcal{L}_k \equiv \begin{bmatrix} -P_k & I \\ R_k^\top & 0 \end{bmatrix}.$$
For $k = 1$, the NME in (3.7) implies the invertibility of
$$\begin{bmatrix} X & R_0 \\ R_0^\top & Q_0 \end{bmatrix} = \begin{bmatrix} I & 0 \\ R_0^\top X^{-1} & I \end{bmatrix}\begin{bmatrix} X & 0 \\ 0 & X \end{bmatrix}\begin{bmatrix} I & X^{-1}R_0 \\ 0 & I \end{bmatrix}.$$
Further computation yields
$$\begin{bmatrix} I & -R_0 Q_0^{-1} \\ 0 & I \end{bmatrix}\begin{bmatrix} X & R_0 \\ R_0^\top & Q_0 \end{bmatrix}\begin{bmatrix} I & 0 \\ -Q_0^{-1}R_0^\top & I \end{bmatrix} = \begin{bmatrix} X - R_0 Q_0^{-1}R_0^\top & 0 \\ 0 & Q_0 \end{bmatrix}.$$
Consequently,
$$X - P_1 = X - R_0 Q_0^{-1}R_0^\top \tag{3.11}$$
is invertible, as required in (b). From (3.7), it is easy to verify that $X$ satisfies
$$\mathcal{M}_0\begin{bmatrix} I \\ X \end{bmatrix} = \mathcal{L}_0\begin{bmatrix} I \\ X \end{bmatrix}S$$
with $S = X^{-1}R_0$. Since $\mathcal{M}_1 - \lambda\mathcal{L}_1$ is a doubling transformation of $\mathcal{M}_0 - \lambda\mathcal{L}_0$, part (b) in Theorem 3.1.2 implies
$$\mathcal{M}_1\begin{bmatrix} I \\ X \end{bmatrix} = \mathcal{L}_1\begin{bmatrix} I \\ X \end{bmatrix}S^2.$$


The blocks in the above equation yield
$$R_1 = (X - P_1)S^2, \qquad Q_1 - X = R_1^\top S^2.$$
Together with (3.11), these imply the invertibility of
$$Q_1 - P_1 = (X - P_1) + R_1^\top(X - P_1)^{-1}R_1, \qquad Q_1 - X = (S^\top)^2(X - P_1)S^2. \tag{3.12}$$
We have proved the theorem for $k = 1$. With the theorem holding for all positive integers up to $k$, we shall prove the case for $k + 1$. Since $Q_k - P_k$ is assumed to be invertible, it follows that $R_{k+1}$, $P_{k+1}$ and $Q_{k+1}$ are well defined. Similarly to the proof of (3.11), the equality in (b) implies the invertibility of
$$X - P_{k+1} = (X - P_k) - R_k(Q_k - P_k)^{-1}R_k^\top.$$
On the other hand, since $\mathcal{M}_{j+1} - \lambda\mathcal{L}_{j+1}$ is a doubling transformation of $\mathcal{M}_j - \lambda\mathcal{L}_j$ for $j = 0, 1, \ldots, k$, repeated application of part (b) in Theorem 3.1.2 implies
$$\mathcal{M}_{k+1}\begin{bmatrix} I \\ X \end{bmatrix} = \mathcal{L}_{k+1}\begin{bmatrix} I \\ X \end{bmatrix}S^{2^{k+1}}.$$
This, following the same argument leading to (3.12), implies the invertibility of
$$Q_{k+1} - P_{k+1} = (X - P_{k+1}) + R_{k+1}^\top(X - P_{k+1})^{-1}R_{k+1}, \qquad Q_{k+1} - X = (S^\top)^{2^{k+1}}(X - P_{k+1})S^{2^{k+1}}.$$
This completes the proof for the $k + 1$ case, and the induction argument.

Convergence to the unique symmetric stable solution $X_s$, which the SDA seeks, is summarized in the following corollary.

Corollary 3.1.4. When $S$ is stable, $R_k \to 0$ and $Q_k \to X$ quadratically as $k \to \infty$.
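Theorem 3.1.3 and Corollary 3.1.4 can be exercised on a synthetic problem: we construct $A_{d_0}$ from a known symmetric solution $X_{\mathrm{true}}$ of (3.7), run the SDA 2 with $R_0 = A_{d_1}^\top$, $Q_0 = -A_{d_0}$ and $P_0 = 0$ (as in the proof above), and check that $Q_k$ converges to a solution whose solvent factor $S = X^{-1}A_{d_1}^\top$ is stable. The example (ours, not from the paper) scales $A_{d_1}$ so that stability is guaranteed a priori.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3
Ad1 = rng.standard_normal((n, n))
Ad1 *= 0.3 / np.linalg.norm(Ad1, 2)              # makes rho(S) <= 0.3 below
C = rng.standard_normal((n, n))
Xtrue = np.eye(n) + C @ C.T                      # symmetric, Xtrue >= I
Ad0 = -(Xtrue + Ad1 @ np.linalg.solve(Xtrue, Ad1.T))  # (3.7) holds at Xtrue

# SDA 2 with R_0 = Ad1^T, Q_0 = -Ad0, P_0 = 0, as in Theorem 3.1.3
Rk, Pk, Qk = Ad1.T.copy(), np.zeros((n, n)), -Ad0.copy()
for _ in range(20):
    W = np.linalg.inv(Qk - Pk)
    Rk, Pk, Qk = Rk @ W @ Rk, Pk + Rk @ W @ Rk.T, Qk - Rk.T @ W @ Rk
X = Qk
res = np.linalg.norm(Ad1 @ np.linalg.solve(X, Ad1.T) + X + Ad0)
S = np.linalg.solve(X, Ad1.T)                    # solvent factor X^{-1} Ad1^T
rho = max(abs(np.linalg.eigvals(S)))
print(res, rho)                                  # tiny residual, rho < 1
```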

3.2 SAW simulation

Quoted from [43], we consider the generalized eigenvalue problem (GEP) of the form
$$\left( \begin{bmatrix} M_1 & G \\ F^\top & 0 \end{bmatrix} + \lambda \begin{bmatrix} 0 & F \\ G^\top & M_2 \end{bmatrix} \right)\begin{bmatrix} \psi_i \\ \psi_\ell \end{bmatrix} = 0, \tag{3.13}$$
where $M_1^\top = M_1 \in \mathbb{C}^{n\times n}$, $M_2^\top = M_2 \in \mathbb{C}^{m\times m}$, and $F, G \in \mathbb{C}^{n\times m}$ with $m \ll n$. If $M_1$ and $M_2$ are nonsingular, then (3.13) can be reduced to the T-palindromic quadratic eigenvalue problem (TPQEP)
$$\mathcal{P}(\lambda)x \equiv (\lambda^2 A_1^\top + \lambda A_0 + A_1)x = 0, \tag{3.14}$$
where
$$x = \psi_\ell, \qquad \psi_i = -M_1^{-1}(\lambda F + G)\psi_\ell, \qquad A_1 = F^\top M_1^{-1}G, \qquad A_0 = F^\top M_1^{-1}F + G^\top M_1^{-1}G - M_2;$$


or
$$x = \psi_i, \qquad \psi_\ell = -\lambda^{-1}M_2^{-1}(F^\top + \lambda G^\top)\psi_i, \qquad A_1 = GM_2^{-1}F^\top, \qquad A_0 = FM_2^{-1}F^\top + GM_2^{-1}G^\top - M_1. \tag{3.15}$$
By taking the transpose of $\mathcal{P}(\lambda)$ in (3.14) and multiplying it by $1/\lambda^2$, it is easily seen that the eigenvalues of $\mathcal{P}(\lambda)$ appear in reciprocal pairs $(\lambda, 1/\lambda)$ (including 0 and $\infty$). Since the nullity of $A_1 = GM_2^{-1}F^\top$ in (3.15) is larger than or equal to $n - m$, $\mathcal{P}(\lambda)$ in (3.14), with $A_0$ and $A_1$ defined in (3.15), has $n - m$ trivial zero and infinite eigenvalues, which are of no interest. We are only interested in finding the $2m$ ($\ll 2n$) nontrivial eigenpairs of $\mathcal{P}(\lambda)$.

The GEP (3.13) can be solved by traditional methods such as the QZ and Arnoldi methods, but there is no guarantee that the computed eigenvalues split equally inside and outside the unit circle [42]. For solving the TPQEP (3.14) with small and dense matrices $A_0$ and $A_1$, some pioneering works [66, 67] preserve the reciprocity of the eigenvalues, based on a good linearization which transforms (3.14) into the form $\lambda Z^\top + Z$. Some structure-preserving methods [76, 77] were proposed for solving $(\lambda Z^\top + Z)u = 0$. A structure-preserving doubling algorithm for solving (3.14) was developed in [18], via the computation of a solvent of a nonlinear matrix equation associated with (3.14). Another structure-preserving algorithm, based on the $(\mathbb{S} + \mathbb{S}^{-1})$-transform [63] and Patel's approach [74], was developed in [41]. For problems with large and sparse matrices $A_0$ and $A_1$, a structure-preserving algorithm using the $(\mathbb{S} + \mathbb{S}^{-1})$-transform and the implicitly restarted shift-and-invert Arnoldi method was also developed in [41], for computing eigenvalues in a specified region of interest. An accurate and efficient eigensolver which preserves the reciprocal relationship of the associated eigenpairs is needed.

In this section, we compare the accuracy and computational costs of the above-mentioned algorithms for computing reciprocal eigenpairs in a SAW device [86]. The SAW filter plays an important role in telecommunication filters [12, 73], sensor technologies [3], etc.
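The reduction from (3.13) to (3.14) can be checked on a toy example. The sketch below (ours) takes square blocks, $m = n = 2$, so that plain dense linear algebra suffices and no trivial zero or infinite eigenvalues arise; every eigenvalue of the GEP should then annihilate $\mathcal{P}(\lambda)$.

```python
import numpy as np

rng = np.random.default_rng(4)
n = m = 2
M1 = rng.standard_normal((n, n)); M1 = M1 + M1.T      # M1 symmetric
M2 = rng.standard_normal((m, m)); M2 = M2 + M2.T      # M2 symmetric
F = rng.standard_normal((n, m))
G = rng.standard_normal((n, m))

# GEP (3.13): (K + lam * N) [psi_i; psi_l] = 0
K = np.block([[M1, G], [F.T, np.zeros((m, m))]])
N = np.block([[np.zeros((n, n)), F], [G.T, M2]])
lams = np.linalg.eigvals(-np.linalg.solve(N, K))

# TPQEP (3.14): P(lam) = lam^2 A1^T + lam A0 + A1
M1inv = np.linalg.inv(M1)
A1 = F.T @ M1inv @ G
A0 = F.T @ M1inv @ F + G.T @ M1inv @ G - M2
def P(lam):
    return lam ** 2 * A1.T + lam * A0 + A1

# relative smallest singular value of P at each GEP eigenvalue
resids = []
for lam in lams:
    s = np.linalg.svd(P(lam), compute_uv=False)
    resids.append(s[-1] / s[0])
print(max(resids))                                    # tiny
```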
These filters are built on the physical property of piezoelectric materials that electrical charges induce mechanical deformations, and vice versa. The main component (or cell) of a SAW filter is composed of a piezoelectric substrate and the input and output interdigital transducers (IDT). An input electrical signal at the input IDT produces a surface acoustic wave, traveling through periodically arranged electrodes, and the output IDT picks up the output electrical signal. Depending on the material properties of the piezoelectric substrate (PZT) and the metallic electrodes, and the gap length between the electrodes, frequencies in a desired range can be stopped or filtered off. In the filter design, it is important to know the stop band width and the center frequency $f_c$ of the filter, where $f_c = v_s/\lambda_s$, with $v_s$ and $\lambda_s$ the wave velocity and wavelength of the incident wave. The center frequency and stop band width can be determined by experiments or computation. In the computational approach, the dispersion diagram needs to be generated, in which a GEP of the form (3.13), associated with each frequency in the search range, has to be solved [42]. We next introduce a finite element model for a simple SAW resonator. For more finite element simulations of piezoelectric devices in two dimensions (2D) and


three dimensions (3D), one can refer to the works by Allik, Koshiba, Lerch, and Buchner and Mohamed [1, 11, 54, 56]. The four structure-preserving algorithms developed in [18, 41] can then be applied to solve the TPQEP (3.14) and the GEP (3.13) resulting from our FEM model.

3.2.1 Surface Wave Propagation

To model the wave propagation in a SAW device, we assume that a large number of electrodes are placed, equally spaced, along a straight line on the PZT substrate. According to the Floquet-Bloch theory, one can reduce the problem to a single-cell domain with one electrode by assuming that the wave ψ is quasi-periodic, of the form ψ(x1, x2) = ψp(x1, x2)e(α+ıβ)x1,

ψp (x1 + p, x2 ) = ψp (x1 , x2 ),

where x1 is the wave propagation direction, p is the length of the unit cell (i.e. the periodic interval), α and β are the attenuation and phase shift along the wave propagation direction, respectively.

Figure 3.2. A 2D single-cell domain of an LSAW resonator and boundary conditions.

Let Ω denote the piezoelectric substrate with a single IDT as shown in Fig. 3.2, and let Γℓ and Γr denote the left and right boundary segments of Ω, respectively. For general anisotropic PZT substrates, under the assumption of linear piezoelectric coupling, the elastic and electric fields interact according to the general material constitutive law

T = cES − e⊤E,  D = eS + εSE,  (3.16)

where vectors T , S, D and E are the mechanical stress, strain, dielectric displacement and the electric field, respectively, and the matrices cE , εS and e are the elasticity constant, dielectric constant and piezoelectric constant matrices measured at constant electric and constant strain fields at constant temperature. By applying the virtual work principle to (3.16), the equilibrium state satisfies the


following equation:

∫Ω (δS)⊤[cES + e⊤(∇φ)] dV + ∫Ω (∇δφ)⊤[eS − εS(∇φ)] dV + ∫Ω (δu)⊤ρü dV
  = ∫Γℓ∪Γr [(δu)⊤(T · n) + (δφ)⊤(D · n)] dA,  (3.17)

where ρ is the mass density, u = [u1, u2, u3]⊤ is the displacement vector, φ is the electric potential satisfying ∇φ = E, S = [∂u1/∂x, ∂u2/∂y, ∂u3/∂z, ∂u2/∂z + ∂u3/∂y, ∂u3/∂x + ∂u1/∂z, ∂u1/∂y + ∂u2/∂x]⊤ is the strain vector, and δu, δφ, δS are the virtual displacement, potential and strain vectors, respectively. Let ψ = [u⊤, φ]⊤, and let the subscripts i, ℓ and r refer to the nodal point indices in the interior, on the left boundary and on the right boundary of the domain Ω, respectively. Using the periodic boundary conditions proposed by Buchner [11], Tr · nr = −γTℓ · nℓ and Dr · nr = −γDℓ · nℓ with γ = e−(α+ıβ), the finite element discretization of (3.17) on the domain Ω [42] can be written in the matrix form

C(ω)ψ ≡ [K − ω²M + ıω(κ1K + κ2M)]ψ = 0,  (3.18)

where κ1, κ2 > 0 are the viscous damping and mass damping parameters, respectively. Ordering the nodal unknowns ψ according to the subscripts ℓ, i and r, the matrices K and M and the vector ψ can be partitioned as

K = [Kℓℓ, Kiℓ⊤, 0; Kiℓ, Kii, Kir; 0, Kir⊤, Krr],  M = [Mℓℓ, Miℓ⊤, 0; Miℓ, Mii, Mir; 0, Mir⊤, Mrr],

where Kii, Mii ∈ Rn×n, Kℓℓ, Krr, Mℓℓ, Mrr ∈ Rm×m, Kiℓ, Kir, Miℓ, Mir ∈ Rn×m, and ψ = [ψℓ⊤, ψi⊤, ψr⊤]⊤ with ψi ∈ Cn and ψℓ, ψr ∈ Cm (m ≪ n). Obviously, the matrix C(ω) in (3.18) can be partitioned in the same way:

C(ω) ≡ C ≡ [Cℓℓ, Ciℓ⊤, 0; Ciℓ, Cii, Cir; 0, Cir⊤, Crr].

By setting ψr = λψℓ, (3.18) leads to the generalized eigenvalue problem

([Cii, Ciℓ; Cir⊤, 0] − λ [0, Cir; Ciℓ⊤, Cbb]) [ψi; ψℓ] = 0,

where Cbb := Cℓℓ + Crr. Since the viscosity is small for the PZT substrates and metals in SAW devices, the attenuation factor α of surface waves is close to zero. As a result, the propagation factors λ generally lie near the unit circle, hereafter denoted by U. Furthermore, for frequencies ω in the stop band, the phase shift β is close to π when the periodic interval p (i.e., the domain width here) equals half of the incident wavelength λs. Therefore, we are interested in finding the eigenvalues λ close to U, especially those near −1 in the complex plane.

3.3 Nano research

Following [33], we shall study the nonlinear matrix equation X + A⊤X−1A = Q + iηI, where A ∈ Rn×n, Q = Q⊤ ∈ Rn×n and η ≥ 0. The equation arises from the nonequilibrium Green's function approach for treating quantum transport in nanodevices, where the system Hamiltonian is a semi-infinite or bi-infinite real symmetric matrix with special structures [4, 25, 51, 53, 81]. A first systematic mathematical study of the equation has already been undertaken in [35]. For the bi-infinite case, the Green's function corresponding to the scattering region, GS ∈ Cns×ns, in which nano scientists are interested, satisfies the relation [24, 51]

GS = [(E + i0+)I − HS − CL,S⊤GL,SCL,S − DS,RGS,RDS,R⊤]−1,

where E is the energy, a real number that may be negative, HS ∈ Rns×ns is the Hamiltonian for the scattering region, CL,S ∈ Rnℓ×ns and DS,R ∈ Rns×nr represent the coupling with the scattering region for the left lead and the right lead, respectively, and GL,S ∈ Cnℓ×nℓ and GS,R ∈ Cnr×nr are special solutions of the matrix equations

GL,S = [(E + i0+)I − BL − AL⊤GL,SAL]−1,  (3.19)
GS,R = [(E + i0+)I − BR − ARGS,RAR⊤]−1,  (3.20)

with AL, BL = BL⊤ ∈ Rnℓ×nℓ and AR, BR = BR⊤ ∈ Rnr×nr. Since (3.19) and (3.20) are of the same type, we only need to study (3.19), and we simplify the notation nℓ to n. In nano research, one is mainly interested in the values of E for which GL,S in (3.19) has a nonzero imaginary part [53]. For each fixed E, we replace "0+" in (3.19) by a sufficiently small positive number η and consider the matrix equation

X = [(E + iη)I − BL − AL⊤XAL]−1.  (3.21)

It is shown in [35] that the required special solution GL,S of (3.19) is given by GL,S = limη→0+ GL,S(η), with X = GL,S(η) being the unique complex symmetric solution of (3.21) such that ρ(GL,S(η)AL) < 1, where ρ(·) denotes the spectral radius. Thus GL,S is a special complex symmetric solution of X = [EI − BL − AL⊤XAL]−1 with ρ(GL,SAL) ≤ 1.

The question as to when GL,S has a nonzero imaginary part is answered in the following result from [35], where T denotes the unit circle.

Theorem 3.3.1. For λ ∈ T, let the eigenvalues of ψL(λ) = BL + λAL + λ−1AL⊤ be µL,1(λ) ≤ · · · ≤ µL,n(λ). Let

∆L,i = [min|λ|=1 µL,i(λ), max|λ|=1 µL,i(λ)],


and ∆L = ∆L,1 ∪ · · · ∪ ∆L,n. Then GL,S is a real symmetric matrix if E ∉ ∆L. When E ∈ ∆L, the quadratic pencil λ²AL⊤ − λ(EI − BL) + AL has eigenvalues on T. If all these eigenvalues on T are simple and nonreal, then GL,S has a nonzero imaginary part.

By replacing X in (3.21) with X−1, we get the equation

X + A⊤X−1A = Qη,  (3.22)

where A = AL and Qη = Q + iηI with Q = EI − BL . So Q is a real symmetric matrix dependent on the parameter E and is usually indefinite. For η > 0, we need the stabilizing solution X of (3.22), which is the solution with ρ(X −1 A) < 1, and then GL,S (η) = X −1 . When η = 0 and E ∈ ∆L , it follows from Theorem 3.3.1 that the required solution X = G−1 L,S of (3.22) is only weakly stabilizing, in the sense that ρ(X −1 A) = 1. One way to approximate GL,S is to take a very small η > 0 and compute GL,S (η). It is proved in [35] that the sequence {Xk } from the basic fixed-point iteration (FPI) Xk+1 = Qη − A⊤ Xk−1 A, with X0 = Qη , converges to GL,S (η)−1 . It follows that the sequence {Yk } from the basic FPI Yk+1 = (Qη − A⊤ Yk A)−1 ,

with Y0 = Qη−1, converges to GL,S(η). However, the convergence is very slow for E ∈ ∆L, since ρ(GL,S(η)A) ≈ 1 for η close to 0. It is also shown in [35] that a doubling algorithm (DA) can be used to compute the desired solution X = GL,S(η)−1 of (3.22) efficiently for each fixed value of E. However, in practice the desired solution needs to be computed for many different values of E. Since the DA is not a correction method, it cannot use the solution obtained for one E value as an initial approximation for the exact solution at a nearby E value. To compute the solutions corresponding to many E values, it may be more efficient to use a modified FPI together with the DA. Indeed, it is suggested in [81] that the following modified FPI be used to approximate GL,S(η):

Yk+1 = (1/2)Yk + (1/2)(Qη − A⊤YkA)−1.

A variant of this FPI is given in [35] to approximate GL,S(η)−1:

Xk+1 = (1/2)Xk + (1/2)(Qη − A⊤Xk−1A),

which requires less computational work per iteration. However, the convergence analysis of these two modified FPIs has been an open problem, even for the special initial matrices Y0 = Qη−1 and X0 = Qη, respectively. Our first contribution in this section is a proof of convergence (to the desired solutions) of these two modified FPIs and their generalizations, for many choices of initial matrices. Consequently, these methods can be used as correction methods.
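For concreteness, here is a small numpy sketch of the basic FPI and of the modified FPI for X, run on randomly generated (hypothetical) data; the matrices and the value of η are illustrative only, not the test problems of [35] or [81]:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n))
Q = rng.standard_normal((n, n)); Q = Q + Q.T      # real symmetric, possibly indefinite
eta = 0.05
Qeta = Q + 1j * eta * np.eye(n)

def fpi(step, X0, tol=1e-11, kmax=50000):
    """Iterate X_{k+1} = step(X_k) until the update norm drops below tol."""
    X = X0
    for k in range(1, kmax + 1):
        Xn = step(X)
        if np.linalg.norm(Xn - X) < tol:
            return Xn, k
        X = Xn
    return X, kmax

basic = lambda X: Qeta - A.T @ np.linalg.inv(X) @ A            # X_{k+1} = Qη − AᵀX_k⁻¹A
modified = lambda X: 0.5 * X + 0.5 * basic(X)                   # averaged variant

Xb, kb = fpi(basic, Qeta.copy())
Xm, km = fpi(modified, Qeta.copy())

res = np.linalg.norm(Xb + A.T @ np.linalg.inv(Xb) @ A - Qeta)   # residual of (3.22)
rho = np.max(np.abs(np.linalg.eigvals(np.linalg.solve(Xb, A))))
print(f"basic: {kb} its, modified: {km} its, residual {res:.1e}, rho(X^-1 A) = {rho:.3f}")
```

Both iterations start from X0 = Qη; the limit is complex symmetric with ρ(X−1A) < 1, which the printed quantities let one confirm.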


In this process we will show that the unique stabilizing solution X = GL,S(η)−1 of (3.22) is also the unique solution of (3.22) with a positive definite imaginary part. It follows that the imaginary part XI of the matrix GL,S−1 is positive semi-definite. Our second contribution in this section is a determination of the rank of XI in terms of the number of eigenvalues on T of the quadratic pencil λ²A⊤ − λQ + A. Our third contribution is a structure-preserving algorithm that is applied directly to (3.22) with η = 0. In doing so, we work in real arithmetic most of the time.

3.3.1 Rank of Im(GL,S)

Equation (3.22) has a unique stabilizing solution Xη = GL,S(η)−1 for any η > 0, so

Xη + A⊤Xη−1A = Qη  (3.23)

with ρ(Xη−1A) < 1. We also know that Xη is complex symmetric. Write Xη = Xη,R + iXη,I with Xη,R = Xη,R⊤, Xη,I = Xη,I⊤ ∈ Rn×n. We know from the previous section that Im(Xη) = Xη,I > 0. Let ϕη(λ) = λA⊤ + λ−1A − Qη.

By (3.23) we have the factorization ϕη(λ) = (λ−1I − Sη⊤)Xη(−λI + Sη), where Sη = Xη−1A. Let X = limη→0+ Xη = GL,S−1. Then

X + A⊤X−1A = Q  (3.24)

with ρ(X−1A) ≤ 1 and Im(X) ≥ 0. Note that ϕ0(λ) = λA⊤ + λ−1A − Q has the factorization

ϕ0(λ) = (λ−1I − S⊤)X(−λI + S),  (3.25)

where S = X−1A. In particular, ϕ0(λ) is regular, i.e., its determinant is not identically zero. In this section we will determine the rank of Im(X), which is the same as the rank of Im(GL,S) since Im(GL,S) = Im(X−1) = −X−1Im(X)X−∗. Let

M = [A, 0; Q, −I],  L = [0, I; A⊤, 0].  (3.26)

Then the pencil M − λL, also denoted by (M, L), is a linearization of the quadratic matrix polynomial P(λ) = λϕ0(λ) = λ²A⊤ − λQ + A. It is easy to check that y and z are the right and left eigenvectors, respectively, corresponding to an eigenvalue λ of P(λ) if and only if

[y; Qy − λA⊤y]  and  [z; −λz]  (3.27)

are the right and left eigenvectors of (M, L), respectively.


Theorem 3.3.2. Suppose that λ0 is a semi-simple eigenvalue of ϕ0(λ) on the unit circle T with multiplicity m0, and that Y ∈ Cn×m0 forms an orthonormal basis of the right eigenvectors corresponding to λ0. Then, for η > 0 sufficiently small,

λj,η = λ0 − (λ0/dj)η + O(η²),  yj,η = Yξj + O(η)  (j = 1, . . . , m0)  (3.28)

are perturbed eigenvalues and associated eigenvectors of ϕη(λ), where dj and ξj, j = 1, . . . , m0, are the eigenvalues and right eigenvectors of the nonsingular Hermitian matrix iY∗(2λ0A⊤ − Q)Y.

Proof. Since P(λ0)Y = λ0ϕ0(λ0)Y = 0 with Y∗Y = Im0 and |λ0| = 1, we have

0∗ = (P(λ0)Y)∗ = (1/λ0²) Y∗(λ0²A⊤ − λ0Q + A).

It follows that Y forms an orthonormal basis for the left eigenvectors of P(λ) corresponding to λ0. From (3.27), we obtain that the column vectors of

YR = [Y; QY − λ0A⊤Y]  and  YL = [Y; −λ0Y]

form bases of the right and left eigenspaces of M − λL corresponding to λ0, respectively. Since λ0 is semi-simple, the matrix

[Y∗, −λ0Y∗] L [Y; QY − λ0A⊤Y] = −Y∗(2λ0A⊤ − Q)Y = −Y∗P′(λ0)Y

is nonsingular. Let

ỸR = −YR(Y∗P′(λ0)Y)−1,  ỸL = YL.

Then we have

ỸL∗ L ỸR = Im0,  ỸL∗ M ỸR = λ0Im0.  (3.29)

For η > 0 sufficiently small, we consider the perturbed equation of P(λ),

P(λ) − λiηI = λ²A⊤ − λ(Q + iηI) + A = λϕη(λ).

Let Mη = [A, 0; Q + iηI, −I]. Then Mη − λL is a linearization of λϕη(λ). By (3.29) and [78, Chapter VI, Theorem 2.12] there are ŶR and ŶL such that [ỸR, ŶR] and [ỸL, ŶL] are nonsingular and

[ỸL, ŶL]∗ M [ỸR, ŶR] = [λ0Im0, 0; 0, M̂],  [ỸL, ŶL]∗ L [ỸR, ŶR] = [Im0, 0; 0, L̂].


Then, by [78, Chapter VI, Theorem 2.15], the column vectors of ỸR + O(η) span the right eigenspace of (Mη, L) corresponding to (λ0Im0 + E11 + O(η²), Im0), where

E11 = ỸL∗ [0, 0; iηI, 0] ỸR = λ0Y∗(iηI)Y(Y∗P′(λ0)Y)−1 = −λ0η(iY∗(2λ0A⊤ − Q)Y)−1.  (3.30)

The matrix iY∗(2λ0A⊤ − Q)Y in (3.30) is Hermitian since

iY∗(2λ0A⊤ − Q)Y = iY∗ϕ0(λ0)Y + iλ0Y∗A⊤Y − iλ̄0Y∗AY = iλ0Y∗A⊤Y + (iλ0Y∗A⊤Y)∗.  (3.31)

Let dj and ξj, for j = 1, . . . , m0, be the eigenvalues and associated eigenvectors of iY∗(2λ0A⊤ − Q)Y. Then, for each j ∈ {1, 2, . . . , m0}, the perturbed eigenvalue λj,η and the associated eigenvector ζj,η of Mη − λL with λj,η|η=0 = λ0 can be expressed as

λj,η = λ0 − (λ0/dj)η + O(η²),  ζj,η = YRξj + O(η).  (3.32)

The second equation in (3.28) follows from (3.32).

Lemma 3.3.3. Suppose that

Zη − Rη∗ZηRη = ηWη, for η > 0,  (3.33)

where Wη ∈ Cm×m is positive definite, Rη = eiθIm + ηEη with θ ∈ [0, 2π] fixed, and Eη ∈ Cm×m is uniformly bounded such that ρ(Rη) < 1. Then Zη is positive definite. Furthermore, if Zη converges to Z0 and Wη converges to a positive definite matrix W0 as η → 0+, then Z0 is also positive definite.

Proof. Since ρ(Rη) < 1 and ηWη is positive definite, it is well known that Zη is uniquely determined by (3.33) and is positive definite. Since Eη is bounded, we have from (3.33) that

ηWη = Zη − (e−iθIm + ηEη∗)Zη(eiθIm + ηEη) = −ηeiθEη∗Zη − ηe−iθZηEη + O(η²).

This implies that

Wη = −eiθEη∗Zη − e−iθZηEη + O(η).  (3.34)

If Zη converges to Z0 as η → 0+, then Z0 is positive semi-definite. To prove that Z0 is positive definite, it suffices to show that Z0 is nonsingular. Suppose that x ∈ Cm is such that Z0x = 0. Then Zηx → 0 and x∗Zη → 0 as η → 0+. Multiplying (3.34) by x∗ and x from the left and right, respectively, we have

x∗Wηx = −eiθx∗Eη∗Zηx − e−iθx∗ZηEηx + O(η) → 0, as η → 0+.

Since Wη converges to the positive definite matrix W0, this forces x = 0. Hence Z0 is nonsingular and therefore positive definite.
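The first claim of Lemma 3.3.3 is easy to check numerically. In the sketch below the data (θ, Eη, Wη and η) are arbitrary choices satisfying the hypotheses, and the Stein equation (3.33) is solved with SciPy's discrete Lyapunov solver:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

rng = np.random.default_rng(2)
m, eta, theta = 3, 1e-2, 0.3
E = -np.eye(m)                                   # bounded perturbation giving ρ(Rη) < 1
R = np.exp(1j * theta) * np.eye(m) + eta * E
W = rng.standard_normal((m, m)); W = W @ W.T + m * np.eye(m)   # positive definite

assert np.max(np.abs(np.linalg.eigvals(R))) < 1
# Z − Rᴴ Z R = ηW  is the discrete Lyapunov equation  Z = (Rᴴ) Z (Rᴴ)ᴴ + ηW
Z = solve_discrete_lyapunov(R.conj().T, eta * W)

print("residual:", np.linalg.norm(Z - R.conj().T @ Z @ R - eta * W))
print("min eigenvalue of Z:", float(np.min(np.linalg.eigvalsh(Z))))
```

The solver's convention is A X Aᴴ − X + Q = 0, so passing A = Rᴴ and Q = ηW yields exactly (3.33); the smallest eigenvalue of the computed Z comes out positive, as the lemma predicts.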


Theorem 3.3.4. The number of eigenvalues (counting multiplicities) of ϕ0(λ) on T must be even, say 2m. Let X = limη→0+ Xη be invertible, and write X = XR + iXI with XR = XR⊤, XI = XI⊤ ∈ Rn×n. Then
(a) rank(XI) ≤ m;
(b) rank(XI) = m, if all eigenvalues of ϕ0(λ) on T are semi-simple and kSη − Sk2 = O(η) for η > 0 sufficiently small;
(c) rank(XI) = m, if all eigenvalues of ϕ0(λ) on T are semi-simple and each unimodular eigenvalue of multiplicity mj is perturbed to mj eigenvalues (of ϕη(λ)) inside the unit circle or to mj eigenvalues outside the unit circle.

Proof. Consider the real quadratic pencil P(λ) = λϕ0(λ) = λ²A⊤ − λQ + A. So P(λ) and ϕ0(λ) have the same eigenvalues on T. If λ0 ≠ ±1 is an eigenvalue of P(λ) on T with multiplicity m0, then so is λ̄0. Thus the total number of nonreal eigenvalues of P(λ) on T must be even. Now the quadratic pencil Pη(λ) = λ²A⊤ − λ(Q + iηI) + A is ⊤-palindromic, and it has no eigenvalues on T for any η ≠ 0 [35]. If 1 (or −1) is an eigenvalue of P(λ) with multiplicity r and Q in P(λ) is perturbed to Q + iηI, then half of these r eigenvalues are perturbed to the inside of T and the other half to the outside of T. This means that r must be even. Thus the total number of eigenvalues of ϕ0(λ) on T is also even; we denote it by 2m.

(a) By Xη + A⊤Xη−1A = Qη we have

i(Qη∗ − Qη) = i(Xη∗ − Xη) − iA⊤(Xη−1 − Xη−∗)A = i(Xη∗ − Xη) − (Xη−1A)∗ i(Xη∗ − Xη)(Xη−1A).

Thus

Kη − Sη∗KηSη = 2ηI,  (3.35)

where Kη = i(Xη∗ − Xη) = 2Xη,I. Note that the eigenvalues of Sη = Xη−1A are the eigenvalues of Pη(λ) inside T. Since X = limη→0+ Xη is invertible, we have S = X−1A = limη→0+ Sη. Let

S = V0 [R0,1, 0; 0, R0,2] V0−1  (3.36)

be a spectral resolution of S, where R0,1 ∈ Cm×m and R0,2 ∈ C(n−m)×(n−m) are upper triangular with σ(R0,1) ⊆ T and σ(R0,2) ⊆ D ≡ {λ ∈ C : |λ| < 1}, and V0 = [V0,1, V0,2] with V0,1 ∈ Cn×m and V0,2 ∈ Cn×(n−m) having unit column vectors. It follows from [78, Chapter V, Theorem 2.8] that there is a nonsingular matrix Vη = [Vη,1, Vη,2] with Vη,1 ∈ Cn×m and Vη,2 ∈ Cn×(n−m) such that

Sη = Vη [Rη,1, 0; 0, Rη,2] Vη−1,  (3.37)

and Rη,1 → R0,1, Rη,2 → R0,2, and Vη → V0 as η → 0+. From (3.35) and (3.37) we have

Vη∗KηVη − [Rη,1∗, 0; 0, Rη,2∗] (Vη∗KηVη) [Rη,1, 0; 0, Rη,2] = 2ηVη∗Vη.  (3.38)


Let

Hη = Vη∗KηVη = [Hη,1, Hη,3; Hη,3∗, Hη,2],  Vη∗Vη = [Wη,1, Wη,3; Wη,3∗, Wη,2].  (3.39)

Then (3.38) becomes

Hη,1 − Rη,1∗Hη,1Rη,1 = 2ηWη,1,  (3.40a)
Hη,2 − Rη,2∗Hη,2Rη,2 = 2ηWη,2,  (3.40b)
Hη,3 − Rη,1∗Hη,3Rη,2 = 2ηWη,3.  (3.40c)

As η → 0+, Rη,1 → R0,1 with ρ(R0,1) = 1, Rη,2 → R0,2 with ρ(R0,2) < 1, and Wη,2 and Wη,3 are bounded. So we have Hη,2 → 0 from (3.40b) and Hη,3 → 0 from (3.40c). It follows from (3.39) that Kη = 2Xη,I converges to K0 = 2XI with rank(XI) ≤ m.

(b) Suppose that the eigenvalues of ϕ0(λ) on T are semi-simple and kSη − Sk2 = O(η) for η > 0 sufficiently small. We will show that Hη,1 in (3.40a) converges to H0,1 with rank(H0,1) = m. Let λ1, . . . , λr ∈ T be the distinct semi-simple eigenvalues of S with multiplicities m1, . . . , mr, respectively. Then (3.36) can be written as

S = V0 [D0,1, 0; 0, R0,2] V0−1,

where D0,1 = diag{λ1Im1, . . . , λrImr}, V0 = [V0,λ1, . . . , V0,λr, V0,2] and m1 + · · · + mr = m. Now Sη = S + (Sη − S) with kSη − Sk2 = O(η). By repeated application of [78, Chapter V, Theorem 2.8] there is a nonsingular matrix Vη = [Vη,λ1, . . . , Vη,λr, Vη,2] ∈ Cn×n such that

Sη = Vη [D0,1 + ηEη,1, 0; 0, R0,2 + ηEη,2] Vη−1

and Vη → V0 as η → 0+, where Eη,1 = diag{E1m1,η, . . . , E1mr,η} with E1mj,η ∈ Cmj×mj and Eη,2 ∈ C(n−m)×(n−m) such that kE1mj,ηk2 = O(1) for j = 1, . . . , r and kEη,2k2 = O(1). Equation (3.40a) can then be written as

Hη,1 − (D0,1 + ηEη,1)∗Hη,1(D0,1 + ηEη,1) = 2ηWη,1.  (3.41)

Since D0,1 + ηEη,1 is a block diagonal matrix and all eigenvalues of its jth diagonal block converge to λj, with the λj distinct numbers on T, we have

Hη,1 = diag{H1m1,η, . . . , H1mr,η} + O(η),

where diag{H1m1,η, . . . , H1mr,η} is the block diagonal part of Hη,1. Then (3.41) gives

H1mj,η − (λjImj + ηE1mj,η)∗H1mj,η(λjImj + ηE1mj,η) = 2ηW1mj,η  (j = 1, . . . , r),

where W1mj,η is the jth diagonal block of Wη,1. Since Wη,1 is positive definite and converges to a positive definite matrix, the W1mj,η, j = 1, . . . , r, are also positive definite and converge to positive definite matrices. For η > 0, we have ρ(λj Imj +


ηE1mj,η) < 1 for j = 1, . . . , r, since ρ(Sη) < 1. By the assumption that Xη converges to X, H1mj,η converges to H1mj,0 for j = 1, . . . , r. From Lemma 3.3.3, we obtain that H1mj,0 is positive definite for j = 1, . . . , r. Hence Hη,1 converges to H0,1 with rank(H0,1) = m. It follows from (3.39) that Kη = 2Xη,I converges to K0 = 2XI with rank(XI) = m.

(c) It suffices to show that kSη − Sk2 = O(η) for η > 0 sufficiently small. Since X is a solution of X + A⊤X−1A = Q, we have



M [I; X] = L [I; X] S,

where the pencil (M, L) is defined in (3.26). Under the condition in (c), the column space of [I; X] is a simple eigenspace of (M, L), in the terminology of [78]. It follows from [78, Chapter VI, Theorems 2.12 and 2.15] that

[A, 0; Q + iηI, −I] [I + ηFη,1; X + ηFη,2] = [0, I; A⊤, 0] [I + ηFη,1; X + ηFη,2] (S + ηEη),

where Fη,1, Fη,2, Eη ∈ Cn×n with max{kFη,1k2, kFη,2k2, kEηk2} ≤ c for η > 0 sufficiently small and some c > 0. It is easily seen that

Xη = (X + ηFη,2)(I + ηFη,1)−1,  Sη = Xη−1A = (I + ηFη,1)(S + ηEη)(I + ηFη,1)−1.

It follows that kSη − Sk2 = O(η) for η > 0 sufficiently small.

Remark 3.3.1. Without the additional conditions in Theorem 3.3.4 (b) or (c), rank(XI) could be much smaller than m. Consider the example with A = In and Q = 2In. Then ϕ0(λ) has all 2n eigenvalues at 1, with partial multiplicities 2. So m = n, but it is easy to see that rank(XI) = 0. For this example, we have kSη − Sk2 = O(η1/2) for η > 0 sufficiently small. We also know that the 2n eigenvalues of ϕ0(λ) at 1 are perturbed to n eigenvalues inside the unit circle and n eigenvalues outside the unit circle.

Corollary 3.3.5. If ϕ0(λ) has no eigenvalues on T, then X is real symmetric. Furthermore, In(X) = In(−ϕ0(1)), where In(W) denotes the inertia of a matrix W.

Proof. From Theorem 3.3.4, it is easy to see that X is a real symmetric matrix. Since X is real, S = X−1A is a real matrix. By setting λ = 1 in (3.25) we get ϕ0(1) = −(I − S⊤)X(I − S). Hence, In(X) = In(−ϕ0(1)).

Corollary 3.3.6. If all eigenvalues of ϕ0(λ) are on T and simple, then XI is positive definite.

Proof. This follows immediately from Theorem 3.3.4 (c).

3.3.2 A structure-preserving algorithm

As explained in [35] and also in this section, the required solution X = GL,S−1 is a particular weakly stabilizing solution of (3.24) and is given by X = limη→0+ Xη, where Xη is the unique stabilizing solution of (3.22). We will call this particular solution the weakly stabilizing solution of (3.24). It can be approximated by Xη for a small η. For a fixed η > 0, Xη can be computed efficiently by the doubling algorithm studied in [35], for all energy values. In this section we will develop a structure-preserving algorithm that can, in most cases, find the weakly stabilizing solution of (3.24) more efficiently and more accurately than the doubling algorithm, by working on (3.24) directly.

Consider the pencil (M, L) given by (3.26). The simple relation

M [I; X] = L [I; X] X−1A

shows that the weakly stabilizing solution of (3.24) is obtained by X = X2X1−1, where the columns of [X1; X2] form a basis for the invariant subspace of (M, L) corresponding to its eigenvalues inside T and those of its eigenvalues on T that would be perturbed to the inside of T when Q is replaced by Qη with η > 0.

We now assume that all unimodular eigenvalues λ ≠ ±1 of (M, L) are semi-simple and that the eigenvalues ±1 (if they exist) have partial multiplicities 2. This assumption seems to hold generically. Under this assumption, for computing the weakly stabilizing solution we need to include all linearly independent eigenvectors associated with the eigenvalues ±1, and use Theorem 3.3.2 to determine which half of the unimodular eigenvalues λ ≠ ±1 should be included to compute the required invariant subspace. We may use the QZ algorithm to determine this invariant subspace, but it is better to exploit the structure of the pencil (M, L). We will use the same approach as in [41] to develop a structure-preserving algorithm (SA) to find a basis for the desired invariant subspace of (M, L) and then compute the weakly stabilizing solution of (3.24).
The algorithm is still based on the (S + S−1)-transform in [63] and Patel's algorithm in [74], but some new issues need to be addressed here. It is well known that (M, L) is a symplectic pair, i.e., (M, L) satisfies MJM⊤ = LJL⊤, where J = [0, I; −I, 0]. Furthermore, the eigenvalues of (M, L) form reciprocal pairs (ν, 1/ν), where we allow ν = 0, ∞. We define the (S + S−1)-transform [63] of (M, L) by

K := MJL⊤J + LJM⊤J = [Q, A − A⊤; A⊤ − A, Q],
N := LJL⊤J = [A, 0; 0, A⊤].

Then K and N are both skew-Hamiltonian, i.e., KJ = JK⊤ and NJ = JN⊤. The relationship between the eigenvalues of (M, L) and (K, N), and their Kronecker
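These formulas are straightforward to verify numerically. The sketch below builds (M, L) as in (3.26) from random (hypothetical) A and symmetric Q, forms the transform, and checks the block formulas, the skew-Hamiltonian property, and the eigenvalue map γ = λ + 1/λ:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3
A = rng.standard_normal((n, n))
Q = rng.standard_normal((n, n)); Q = Q + Q.T
I, O = np.eye(n), np.zeros((n, n))

M = np.block([[A, O], [Q, -I]])          # linearization (3.26) of λ²A⊤ − λQ + A
L = np.block([[O, I], [A.T, O]])
J = np.block([[O, I], [-I, O]])

K = M @ J @ L.T @ J + L @ J @ M.T @ J    # (S + S⁻¹)-transform
N = L @ J @ L.T @ J

# the closed-form blocks of K and N
assert np.allclose(K, np.block([[Q, A - A.T], [A.T - A, Q]]))
assert np.allclose(N, np.block([[A, O], [O, A.T]]))
# both are skew-Hamiltonian: KJ = JK⊤, NJ = JN⊤
assert np.allclose(K @ J, J @ K.T) and np.allclose(N @ J, J @ N.T)

# each reciprocal pair (λ, 1/λ) of (M, L) becomes a double eigenvalue γ = λ + 1/λ of (K, N)
lam = np.linalg.eigvals(np.linalg.solve(L, M))
gam = np.linalg.eigvals(np.linalg.solve(N, K))
for g in gam:
    assert np.min(np.abs(lam + 1.0 / lam - g)) < 1e-6
print("(S + S^-1)-transform verified")
```

The eigenvalue check implicitly assumes A (hence L and N) is invertible, which holds generically for random data.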


structures have been studied in [63, Theorem 3.2]. We will first extend that result to allow unimodular eigenvalues of (M, L). The following preliminary result is needed.

Lemma 3.3.7. Let Nr(λ) := λIr + Nr, where Nr is the nilpotent matrix with Nr(i, i+1) = 1, i = 1, . . . , r − 1, and zeros elsewhere. Let ∼ denote the equivalence of two matrix pairs. Then

(a) for λ ≠ 0, ±1, (Nr(λ) + Nr(λ)−1, Ir) ∼ (Nr(λ + 1/λ), Ir);
(b) (Nr² + Ir, Nr) ∼ (I, Nr).

Proof. (a) Since λ ≠ 0, one can show that Nr(λ)−1 ≡ [tj−i] and Nr(λ) + Nr(λ)−1 ≡ [sj−i] are Toeplitz upper triangular with tk = (−1)kλ−(k+1), for k = 0, 1, . . . , r − 1, as well as s0 = λ + 1/λ, s1 = 1 − λ−2 and sk = tk for k = 2, . . . , r − 1. Since λ ≠ ±1, s1 = 1 − λ−2 is nonzero. It follows that (Nr(λ) + Nr(λ)−1, Ir) ∼ (Nr(λ + 1/λ), Ir).

(b) (I + Nr², Nr) ∼ (Ir, Nr(I + Nr²)−1) ∼ (Ir, Nr − Nr³ + Nr⁵ − · · ·) ∼ (Ir, Nr).

Theorem 3.3.8. Suppose that (M, L) has the eigenvalues {±1} with partial multiplicities 2. Let γ = λ + 1/λ (λ = 0, ∞ permitted). Then λ and 1/λ are eigenvalues of (M, L) if and only if γ is a double eigenvalue of (K, N). Furthermore, for λ ≠ ±1 (i.e., γ ≠ ±2), γ, γ, λ and 1/λ have the same sizes of Jordan blocks, i.e., they have the same partial multiplicities; for λ = ±1, γ = ±2 are semi-simple eigenvalues of (K, N).

Proof. By the results on the Kronecker canonical form of a symplectic pencil (see [65]) and by our assumption, there are nonsingular matrices X and Y such that

YMX = [J, D; 0, In],  YLX = [In, 0; 0, J],  (3.43)

where J = J1 ⊕ Js ⊕ J0, J1 = Ip ⊕ (−Iq), Js is the direct sum of the Jordan blocks corresponding to nonzero eigenvalues λj of (M, L) with |λj| < 1 or λj = eiθj, Im(λj) > 0, J0 is the direct sum of the nilpotent blocks corresponding to zero eigenvalues, and D = Ip ⊕ Iq ⊕ 0n−r with r = p + q. Let X−1JX−⊤ = [X1, X2; −X2⊤, X3], where Xi ∈ Cn×n, i = 1, 2, 3, so that X1⊤ = −X1 and X3⊤ = −X3. Using (3.43) in MJM⊤ = LJL⊤ we get

JX1J⊤ − DX2⊤J⊤ + JX2D + DX3D = X1,  (3.44a)
JX2 + DX3 = X2J⊤,  (3.44b)

JX3J⊤ = X3.  (3.44c)

Let Js,0 = Js ⊕ J0. We partition X3 and X2 as X3 = [X3,1, X3,2; −X3,2⊤, X3,3] and X2 = [X2,1, X2,2; X2,3, X2,4], respectively, where X2,1, X3,1 ∈ Cr×r. Comparing the diagonal blocks in (3.44b) we get

J1X2,1 + X3,1 = X2,1J1⊤,  Js,0X2,4 = X2,4Js,0⊤.  (3.45)


From (3.45) we see that X3,1 has the form X3,1 = [0p, ω; −ω⊤, 0q]. From (3.44c) we have [Ip ⊕ (−Iq)]X3,1[Ip ⊕ (−Iq)] = X3,1. It follows that ω = 0 and thus X3,1 = 0. From (3.44c) we also have J1X3,2Js,0⊤ = X3,2 and Js,0X3,3Js,0⊤ = X3,3, from which we get X3,2 = 0 and X3,3 = 0. So we have X3 = 0. Then (3.44b) becomes JX2 = X2J⊤, from which we get

X2,1 = ηp ⊕ ηq,  X2,2 = X2,3 = 0,

(3.46)

where ηp ∈ Cp×p and ηq ∈ Cq×q . Moreover, X2,1 and X2,4 are nonsingular by the nonsingularity of X−1 JX−⊤ . Substituting (3.44b) into (3.44a) we get X1 = JX1 J ⊤ − DJX2⊤ + X2 J ⊤ D ≡ JX1 J ⊤ + V, (3.47)   V1 0 where V = X2 J ⊤ D − DJX2⊤ = with V1 = (ηp − ηp⊤ ) ⊕ (ηq⊤ − ηq ) by 0 0   X1,1 X1,2 (3.46). Partition X1 = with X1,1 ∈ Cr×r . From the equations for ⊤ −X1,2 X1,3 the (1, 2) and (2, 2) blocks of (3.47) we get X1,2 = 0 and X1,3 = 0, respectively. Furthermore, from the (1, 1) block in (3.47) we get X1,1 = ξp ⊕ ξq with ξp ∈ Cp×p ⊤ and ξq ∈ Cq×q , and we also get ηp = ηp⊤ and ηq = ηq⊤ , i.e., X2,1 = X2,1 . From (3.43), (3.45) and Lemma 3.3.7 (b) we now have eq.

eq.

(K, N) ∼ (MJL⊤ + LJM⊤ , LJL⊤ ) ∼    2 (J1 X1,1 +X1,1 J1 ) ⊕ 0n−r (J12 +Ir )X2,1 ⊕ (Js,0 +In−r )X2,4 , ⊤ 2 ⊤ X2,1 (J12 +Ir ) ⊕ (X2,4 ((Js,0 ) +In−r )) 0n   X1,1 ⊕ 0n−r J1 X2,1 ⊕ Js,0 X2,4 ⊤ ⊤ X2,1 J1 ⊕ X2,4 Js,0 0n   eq. 2ξp ⊕ (−2ξq ) 2Ir ∼ ⊕ (Js + Js−1 ) ⊕ I ⊕ (Js + Js−1 ) ⊕ I, 2Ir 0r    ξp ⊕ ξq Ip ⊕ (−Iq ) ⊕ I ⊕ J0 ⊕ I ⊕ J0 Ip ⊕ (−Iq ) 0r  eq. ∼ 2I2p ⊕ (−2I2q ) ⊕ (Js +Js−1 ) ⊕ I ⊕ (Js +Js−1 ) ⊕ I, I2p ⊕ I2q ⊕ I ⊕ J0 ⊕ I ⊕ J0 .

The proof is completed by using Lemma 3.3.7 (a).

It is helpful to keep in mind that the transform λ → γ achieves the following: {0, ∞} → ∞, T → [−2, 2], R \ {±1, 0} → R \ [−2, 2], C \ (R ∪ T) → C \ R. By Theorem 3.3.8 and our assumption on the unimodular eigenvalues of (M, L), all eigenvalues of (K, N) in [−2, 2] are semi-simple. Based on Patel approach [74] we first reduce (K, N) to a block triangular matrix pair     K11 K12 N11 N12 ⊤ U⊤ KZ = , U NZ = (3.48) ⊤ ⊤ , 0 K11 0 N11

Structure-Preserving Doubling Algorithms

161

where K11 and N11 ∈ Rn×n are quasi-upper and upper triangular, respectively, K12 and N12 are skew symmetric, U and Z ∈ R2n×2n are orthogonal satisfying U⊤ JZ = J. From (3.48) we see that the pair (K11 , N11 ) contains half of eigenvalues of (K, N). We then reduce (K11 , N11 ) to the quasi-upper and upper block diagonal matrix pair b ⊤ K11 Z b = diag{Im0 , Γ1 , Γ2 , . . . , Γr }, U

b ⊤ N11 Z b = diag{Γ0 , Im1 , Im2 , . . . , Imr } U (3.49) by solving some suitable Sylvester equations, where m0 + m1 + · · · + mr = n, Γ0 is nilpotent, Γ1 = diag{g1 , . . . , gm1 } with gi ∈ [−2, 2], and σ(Γj ) = {γj } ⊆ R \ [−2, 2] or σ(Γj ) = {γj , γ j } ⊆ C \ R with σ(Γj ) ∩ σ(Γi ) = ∅, i 6= j, i, j = 2, . . . , r. Partition b = [Z b0 , Z b1 , . . . , Z br ] with Z bi ∈ Rn×mi according to the block sizes of (3.49). It Z holds that

b0 Γ0 = N11 Z b0 , K11 Zbj = N11 Z bj Γj , j = 1, . . . , r. K11 Z (3.50)   bj Z It follows that Z forms a basis for an invariant subspace of (K, N) for 0   b1 Z j = 0, 1, . . . , r. In particular, the columns of Z are real eigenvectors of 0 (K, N) corresponding to real eigenvalues in [−2, 2]. We then need to get a suitable invariant subspace of (M, L) from each of these invariant subspaces for (K, N). We start with two lemmas about solving the quadratic equation γ = λ + 1/λ in the matrix form. Lemma 3.3.9. Given a real quasi-upper triangular matrix   γ11 · · · γ1m  ..  , Γs =  ... . 

(3.51)

0 · · · γmm

where γii is 1 × 1 or 2 × 2 block with σ(γii ) ⊆ C \ [−2, 2], i = 1, . . . , m. Then the quadratic matrix equation Λ2s − Γs Λs + I = 0 (3.52)

of Λs is uniquely solvable with Λs being real quasi-upper triangular with the same block form as Γs in (3.51) and σ(Λs ) ⊆ D ≡ {λ ∈ C| |λ| < 1}. Proof. Let

 λ11 · · · λ1m  ..  Λs =  ... .  0 · · · λmm 

have the same block form as Γs . We first solve the diagonal blocks {λii }m i=1 of Λs from the quadratic equation λ2 − γii λ + I[i] = 0, where [i] denotes the size of γii . Note that the scalar equation λ2 − γλ + 1 = 0

(3.53)

162

Eric King-wah Chu, Wen-Wei Lin

has no solutions on T for γ ∈ C \ [−2, 2]. So it always has one solution inside T and the other outside T. For i = 1, . . . , m, if γii ∈ R \ [−2, 2], then λii ∈ (−1, 1) is uniquely solved from (3.53) with γ = γii . If γii ∈ R2×2 with γii z = γz for z 6= 0 and γ ∈ C \ R, then z] diag {γ, γ} [z, z]−1 and the required solution is λii =  γ ii = [z, −1 [z, z] diag λ, λ [z, z] ∈ R2×2 , where λ ∈ D is uniquely solved from (3.53). For j > i, comparing the (i, j) block on both sides of (3.52) and using λii − γii = −λ−1 ii , we get λij λjj −

λ−1 ii λij

= γij λjj +

j−1 X

ℓ=i+1

(γiℓ − λiℓ )λℓj .

Since σ(λ−1 ii ) ∩ σ(λjj ) = ∅, i, j = 1, . . . , m, the strictly upper triangular part of Λs can be determined by the following recursive formula. For d = 1, . . . , m − 1, For i = 1, . . . , m − d, j = i + d, −1 A := λ⊤ jj ⊗ I[i] − I[j] ⊗ λii , Pj−1 b := γij λjj + ℓ=i+1 (γiℓ − λiℓ )λℓj , λij = vec−1 (A−1 vec(b)), end i, end d. Here ⊗ denotes the Kronecker product, vec is the operation of stacking the columns of a matrix into a vector, and vec−1 is its inverse operation. Lemma 3.3.10. Given a nilpotent matrix Γ0 = [γij ] ∈ Re×e . The quadratic matrix equation Γ0 Λ20 − Λ0 + Γ0 = 0 in Λ0 = [λij ] ∈ Re×e

(3.54)

with Λ0 being nilpotent is uniquely solvable. Proof. From (3.54) the matrix Λ0 is uniquely determined by λi,i+j = γi,i+j , i = 1, . . . , e − 2, j = 1, 2, λe−1,e = γe−1,e and For j = 3, . . . , e, For i = 1, . . . , eP − j + 1, bi,i+j−1 = i+j−2 λi,ℓ λℓ,i+j−1 , λ ℓ=i+1 end i, For i = 1, . . . , e − j, P b λi,i+j = γi,i+j + i+j−2 ℓ=i+1 γi,ℓ λℓ,i+j , end i, end j.

Theorem 3.3.11. Let Zs form a basis for an invariant subspace of (K, N) corresponding to Γs with σ(Γs) ⊆ C \ [−2, 2], i.e., KZs = NZsΓs. Suppose that Λs solves Γs = Λs + Λs^{-1} as in Lemma 3.3.9 with σ(Λs) ⊆ D \ {0}. If the columns of J(L⊤JZsΛs − M⊤JZs) are linearly independent, then they form a basis for a stable invariant subspace of (M, L) corresponding to Λs.

Proof. Since KZs − NZsΓs = (MJL⊤ + LJM⊤)JZs − LJL⊤JZs(Λs + Λs^{-1}) = 0, we have

MJ(L⊤JZsΛs − M⊤JZs) = LJ(L⊤JZsΛs − M⊤JZs)Λs.

Remark 3.3.2. Let

Zs = [Z1; Z2],  [X1; X2] = J(L⊤JZsΛs − M⊤JZs),

where each of X1, X2, Z1 and Z2 has n rows. Then direct computation gives X1 = Z2Λs − Z1. But it is more convenient to get X2 = QX1 − A⊤X1Λs from M[X1; X2] = L[X1; X2]Λs.

We now explain how we can get eigenvectors of (M, L) from eigenvectors of (K, N) corresponding to eigenvalues in [−2, 2].

Theorem 3.3.12. Let v = [v1; v2] be a real eigenvector of (K, N) corresponding to an eigenvalue γ ∈ [−2, 2]. Let λ be a solution of (3.53) and let u1 = λv2 − v1, u2 = Qv1 − λA⊤v1. Then u = [u1; u2] is an eigenvector of (M, L) corresponding to the eigenvalue λ if u ≠ 0. Moreover, we indeed have u ≠ 0 for each γ ∈ (−2, 2).

Proof. The first part is proved by direct computation as in the proof of Theorem 3.3.11. For the second part, we simply note that λ is not real when γ ∈ (−2, 2), and then u1 ≠ 0 since v is a nonzero real vector.

Remark 3.3.3. When γ ∈ (−2, 2) is an eigenvalue of (K, N) with multiplicity 2k (or an eigenvalue of (K11, N11) with multiplicity k), we can use Theorem 3.3.12 to get k eigenvectors of (M, L) corresponding to the eigenvalue λ, from the k linearly independent eigenvectors of (K, N) we have already obtained. However, there is no guarantee that the k eigenvectors of (M, L) so obtained are also linearly independent.

When A is singular, (M, L) has eigenvalues at 0 and ∞. The following result will then be needed.

Theorem 3.3.13. Let Z∞ ∈ R^{2n×m} span an infinite invariant subspace of (K, N) corresponding to (I, Γ0), where Γ0 is nilpotent, i.e., NZ∞ = KZ∞Γ0. Suppose that Λ0 solves Γ0Λ0² − Λ0 + Γ0 = 0 as in Lemma 3.3.10 with Λ0 being nilpotent. If the columns of J(L⊤JZ∞Λ0 − M⊤JZ∞) are linearly independent, then they form a basis for a zero invariant subspace of (M, L) corresponding to (Λ0, I).


Proof. Since Γ0 = Λ0(I + Λ0²)^{-1}, we have

NZ∞(I + Λ0²) = MJM⊤JZ∞ + LJL⊤JZ∞Λ0² = KZ∞Λ0 = LJM⊤JZ∞Λ0 + MJL⊤JZ∞Λ0,

and then

MJ(L⊤JZ∞Λ0 − M⊤JZ∞) = LJ(L⊤JZ∞Λ0 − M⊤JZ∞)Λ0.

We can now present a structure-preserving algorithm (SA) for the computation of the weakly stabilizing solution of (3.24). Here ek denotes the kth column of I.

Algorithm 3.3.1 (SA).
Input: A ∈ R^{n×n}, Q = Q⊤ ∈ R^{n×n}.
Output: The weakly stabilizing solution X of X + A⊤X^{-1}A = Q.
Step 1: Form the matrix pair (K, N) as in (3.42);
Step 2: Reduce (K, N) as in (3.48): K ← U⊤KZ = [K11, K12; 0, K11⊤], N ← U⊤NZ = [N11, N12; 0, N11⊤], where K11, N11 ∈ R^{n×n} are quasi-upper and upper triangular, respectively, and U, Z are orthogonal satisfying U⊤JZ = J (see the pseudo-code in [41, Appendix]);
Step 3: Compute eigenmatrix pairs K11Ẑ0Γ0 = N11Ẑ0, K11Ẑj = N11ẐjΓj of (K11, N11), j = 1, …, r, as in (3.50) by solving suitable Sylvester equations to get (3.49);
Step 4: Z∞ = Z[Ẑ0; 0] ≡ [Z∞,1; Z∞,2], Zj = Z[Ẑj; 0] ≡ [Zj,1; Zj,2], j = 1, …, r;
Step 5-1: Use Lemma 3.3.10 to solve for the nilpotent Λ0 in Γ0Λ0² − Λ0 + Γ0 = 0; compute X∞,1 = Z∞,2Λ0 − Z∞,1, X∞,2 = QX∞,1 − A⊤X∞,1Λ0; set X1 ← X∞,1, X2 ← X∞,2 (by Theorem 3.3.13);
Step 5-2: For k = 1, …, m1: solve λ = e^{iθ} with Im(λ) ≥ 0 from (3.53) with γ = gk; compute x1 = z2e^{iθ} − z1, x2 = Qx1 − e^{iθ}A⊤x1, where z1 = Z1,1ek, z2 = Z1,2ek; if Im(e^{iθ}x1∗A⊤x1) > 0, then x1 ← x̄1, x2 ← x̄2; set X1 ← [X1 | x1], X2 ← [X2 | x2] (by Theorems 3.3.12 and 3.3.2); end k;
Step 5-3: For j = 2, …, r: use Lemma 3.3.9 to solve Λj = Λs from (3.52) with Γs = Γj; compute Xj,1 = Zj,2Λj − Zj,1, Xj,2 = QXj,1 − A⊤Xj,1Λj; set X1 ← [X1 | Xj,1], X2 ← [X2 | Xj,2] (by Theorem 3.3.11); end j;
Step 6: Compute X = X2X1^{-1}.


To find the weakly stabilizing solution of (3.24), the total number of flops is roughly 120n3 for SA, and 440n3 for QZ. Note that a flop denotes a multiplication or an addition in real arithmetic. In Step 5-2 of SA we have assumed that the nonsingular Hermitian matrix iY ∗ (2λ0 A⊤ − Q)Y in Theorem 3.3.2 (which is the matrix in (??)) is definite. This assumption is the same as the assumption in Theorem 3.3.4 (iii). We could have used Theorem 3.3.2 to choose the right eigenvectors when the Hermitian matrix is not definite, but this will increase computational work. We have thus chosen to use the simpler Step 5-2. We can always check in the end whether the imaginary part of the computed X is (numerically) positive semidefinite. If not, we can use a more complicated Step 5-2 according to Theorem 3.3.2 to re-compute X.

3.4

Sensitivity of time-delayed systems

Quoted from [59], we consider the numerical solution of the quadratic eigenvalue problem (QEP)

Q(λ)x ≡ (λ²B + λC + A)x = 0,  (3.55)

where A, B, C ∈ C^{n×n} satisfy P B̄P = εA, P ĀP = εB, P C̄P = εC with ε = ±1, P ∈ C^{n×n} being an involutory matrix (i.e., P² = In), and B̄ denoting the complex conjugate of B. In our application to the time-delay system (3.57), the matrix P is a real involutory matrix (more details later). The scalar λ ∈ C and the vector x ∈ C^n \ {0} are the eigenvalue and the associated eigenvector of the quadratic pencil Q(λ), and the pair (λ, x) is called an eigenpair of Q(λ). We shall propose a structure-preserving doubling algorithm (SDA; [16, 17, 18, 19, 64]) for the solution of (3.55). We define the conjugate and the reverse of a quadratic pencil, respectively, by

Q̄(λ) ≡ λ²B̄ + λC̄ + Ā,  rev(Q(λ)) ≡ λ²A + λC + B.

It is easy to see that the QEP in (3.55) satisfies

Q(λ) = ε P rev(Q̄(λ)) P.  (3.56)

From (3.56), we see that (λ, x) is an eigenpair of Q(λ) if and only if (1/λ̄, P x̄) is also an eigenpair of Q(λ) [28]. As in [67], a quadratic pencil Q(λ) in (3.55) is said to be palindromic if Q(λ) = rev(Q(λ)). Similar to the terminology of (⋆, ε)-palindromic QEPs (⋆ = H or ⊤) in [67], the QEP (3.55) is referred to as a

P-Conjugate-P Palindromic QEP (PCP PQEP) when ε = +1, or an
anti-P-Conjugate-P Palindromic QEP (−PCP PQEP) when ε = −1.

PCP PQEPs as in (3.55) were first proposed in [47] for the stability analysis of retarded time-delayed systems (TDS), and generalized in [28] to the more general neutral TDS (see [5, 10, 27, 37, 38, 71, 72, 87] and the references therein). Consider a neutral linear time-delayed system with m constant delays h0 = 0 < h1 < ⋯ < hm:

Σ_{k=0}^m Dk ẋ(t − hk) = Σ_{k=0}^m Ak x(t − hk) (t > 0);  x(t) = ϕ(t) (t ∈ [−hm, 0]),  (3.57)

with x : [−hm, ∞) → R^N, ϕ ∈ C¹([−hm, 0]), and Ak, Dk ∈ R^{N×N} (k = 0, 1, …, m). When D0 = IN and Dk = 0 (k > 0), we have the retarded time-delayed systems. The stability of the TDS (3.57) can be determined by its characteristic equation

(s Σ_{k=0}^m Dk e^{−hk s} − Σ_{k=0}^m Ak e^{−hk s}) v = 0 (v ≠ 0),  (3.58)

with eigenpairs (s, v) from the nonlinear eigenvalue problem. A TDS is said to be critical if and only if some eigenvalue s is purely imaginary. The set of all points (h1, …, hm) in the delay-parameter space for which the TDS (3.57) is critical is called the critical curves (m = 2) or surfaces (m > 2). Under certain continuity assumptions, the boundary of the stability domain of a TDS is a subset of the critical curves/surfaces. Consequently, purely imaginary eigenvalues of (3.58) are of great interest. See [47] and the references therein for approaches to compute critical surfaces. In [28], the following parameterization of critical surfaces gives rise to an associated PCP PQEP. Detailed discussions on palindromic linearizations, a Schur-like canonical form and other useful results can also be found in [28].

For a given eigenpair (s, v) of (3.58) with ‖v‖2 = 1, a point (h1, …, hm) in the delay-parameter space is critical if and only if there exist ϕk ∈ [−π, π] (k = 1, …, m − 1) and ω ∈ R such that

hk = (ϕk + 2pkπ)/ω, pk ∈ Z (k = 1, …, m − 1);  hm = (−Arg z + 2pmπ)/ω, pm ∈ Z,

giving rise to the QEP

(z²E + zF + G)u = 0,  (3.59)

where the unimodular eigenvalue z = e^{−iωhm}, the corresponding eigenvector u = vec(vv∗) = v ⊗ v̄,

ω = (−i/(w∗w)) w∗(Am z + Σ_{k=0}^{m−1} Ak e^{−iϕk}) v,  w = (Dm z + Σ_{k=0}^{m−1} Dk e^{−iϕk}) v,

and

E = (Σ_{k=0}^{m−1} Dk e^{iϕk}) ⊗ Am + (Σ_{k=0}^{m−1} Ak e^{iϕk}) ⊗ Dm ∈ C^{N²×N²},

F = (Σ_{k=0}^{m−1} Dk e^{iϕk}) ⊗ (Σ_{k=0}^{m−1} Ak e^{−iϕk}) + (Σ_{k=0}^{m−1} Ak e^{iϕk}) ⊗ (Σ_{k=0}^{m−1} Dk e^{−iϕk}) + Dm ⊗ Am + Am ⊗ Dm ∈ C^{N²×N²},

G = Dm ⊗ (Σ_{k=0}^{m−1} Ak e^{−iϕk}) + Am ⊗ (Σ_{k=0}^{m−1} Dk e^{−iϕk}) ∈ C^{N²×N²}.
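The PCP structure of E, F and G rests on the vec-permutation (perfect shuffle) matrix P used below. A small numpy check of its defining properties (the size N = 3 and the helper name are illustrative):

```python
import numpy as np

def shuffle(N):
    """Vec-permutation matrix P = sum_{i,j} E_ij kron E_ij^T, which satisfies
    P^2 = I and M1 kron M2 = P @ (M2 kron M1) @ P for all N x N matrices."""
    P = np.zeros((N * N, N * N))
    I = np.eye(N)
    for i in range(N):
        for j in range(N):
            Eij = np.outer(I[:, i], I[:, j])  # e_i e_j^T
            P += np.kron(Eij, Eij.T)
    return P

N = 3
P = shuffle(N)
assert np.allclose(P @ P, np.eye(N * N))   # involutory: P^{-1} = P^T = P
rng = np.random.default_rng(1)
M1, M2 = rng.standard_normal((2, N, N))
assert np.allclose(np.kron(M1, M2), P @ np.kron(M2, M1) @ P)
```

P is the classical commutation matrix K_{N,N}; the loop construction is O(N⁴) and meant only to make the definition concrete.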


Here ⊗ denotes the usual Kronecker product. As M1 ⊗ M2 and M2 ⊗ M1 both contain the products of the elements of M1 and M2, at different positions, we can find an involutory matrix P ∈ R^{N²×N²} (P^{-1} = P⊤ = P) such that M1 ⊗ M2 = P(M2 ⊗ M1)P [40, Corollary 4.3.10]. Here, P = Σ_{i,j=1}^N Eij ⊗ Eij⊤, where Eij = ei ej⊤ ∈ R^{N×N} and ej is the jth column of IN. Consequently, from the structures in E, F and G, we can easily show that (3.59) is a PCP PQEP because E = P ḠP and F = P F̄P.

Remark 3.4.1. In [14], it has been proved that unimodular eigenvalues occur quite often for PCP PQEPs (and other palindromic eigenvalue problems with eigenvalue pairs (λ, ε/λ)). These eigenvalues can stay unimodular under perturbation, and thus are numerically stable to compute. Furthermore, the probability of having too many of them is low and multiple unimodular eigenvalues are rare. This makes our problem of computing the unimodular eigenvalues z of (3.59) well-posed, unlike for complex ⊤-palindromic eigenvalue problems.

Before proceeding further, we would like to point out some recent developments in the numerical solution of palindromic eigenvalue problems. The QEP (3.55) is said to be ⋆-palindromic if B⋆ = A and C⋆ = C, with ⋆ = ⊤ or H. The train vibration problem and the associated ⊤-palindromic QEP were discussed in [46, 66], and structure-preserving palindromic linearizations for (3.55) were suggested in [67]. An SDA [19] and a backward-stable generalized (Arnoldi-)Patel algorithm [41] were proposed for the ⊤-palindromic QEP, and can be extended to H-palindromic QEPs. For other approaches to ⋆-palindromic QEPs, see [68, 76, 77]. In [16], the SDA was generalized for the g-palindromic QEPs, which do not include the PCP PQEP. For further information on the numerical solution of palindromic EVPs, see [18]. For a general survey of matrix polynomials, the associated eigenvalue problems and their applications, see [31, 80].

3.4.1

SDA Algorithm for εPCP PQEP

For a given PCP PQEP (3.55), we define

M = [A, 0; −C, −I],  L = [D, I; B, 0].  (3.60)

With D = 0 in (3.60), we have

[A, 0; −C, −I][x1; x2] = λ[0, I; B, 0][x1; x2],

leading to

Ax1 = λx2,  −Cx1 − x2 = λBx1.  (3.61)

Multiplying the second equation in (3.61) by λ and substituting the first, we obtain (λ²B + λC + A)x1 = 0.


We have shown that the pencil M − λL is a linearization of the PCP PQEP (3.55) with D = 0. Based on the SDA proposed in [64], we develop a new SDA for solving the PCP PQEP. For the pencil M − λL defined in (3.60), we compute

M∗ = [−AK^{-1}, 0; BK^{-1}, I],  L∗ = [I, AK^{-1}; 0, −BK^{-1}],

where K ≡ C − D is assumed to be invertible. It is easy to check that M∗L = L∗M. Direct calculation gives rise to

M̂ = M∗M = [Â, 0; −Ĉ, −I],  L̂ = L∗L = [D̂, I; B̂, 0],  (3.62)

where

Â ≡ −AK^{-1}A,  B̂ ≡ −BK^{-1}B,  (3.63a)

Ĉ ≡ C − BK^{-1}A,  K̂ ≡ K − (BK^{-1}A + AK^{-1}B),  D̂ = Ĉ − K̂.  (3.63b)
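One doubling step (3.62)-(3.63) can be sketched directly; the checks below verify M∗L = L∗M and that the recurrences reproduce M̂ = M∗M and L̂ = L∗L on random data (all names and data are illustrative):

```python
import numpy as np

def sda_step(A, B, C, D):
    """One doubling step (3.63); assumes K = C - D is invertible (no safeguards)."""
    K = C - D
    KiA, KiB = np.linalg.solve(K, A), np.linalg.solve(K, B)
    A_h, B_h = -A @ KiA, -B @ KiB
    C_h = C - B @ KiA
    K_h = K - (B @ KiA + A @ KiB)
    return A_h, B_h, C_h, C_h - K_h          # D_hat = C_hat - K_hat

rng = np.random.default_rng(3)
n = 3
A, B, C, D = rng.standard_normal((4, n, n))
A_h, B_h, C_h, D_h = sda_step(A, B, C, D)

Z, I = np.zeros((n, n)), np.eye(n)
Ki = np.linalg.inv(C - D)
M = np.block([[A, Z], [-C, -I]])
L = np.block([[D, I], [B, Z]])
Ms = np.block([[-A @ Ki, Z], [B @ Ki, I]])
Ls = np.block([[I, A @ Ki], [Z, -B @ Ki]])
assert np.allclose(Ms @ L, Ls @ M)                            # M* L = L* M
assert np.allclose(Ms @ M, np.block([[A_h, Z], [-C_h, -I]]))  # M_hat structure
assert np.allclose(Ls @ L, np.block([[D_h, I], [B_h, Z]]))    # L_hat structure
```

Each step costs a handful of n × n solves and multiplications, which is the source of the SDA's O(n³)-per-iteration complexity.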

Theorem 3.4.1. The pencil M̂ − λL̂ has the doubling property, i.e., if

M[X1; X2] = L[X1; X2]S,

where X1, X2 ∈ C^{n×m} and S ∈ C^{m×m}, then

M̂[X1; X2] = L̂[X1; X2]S².

Proof. From (3.62) and the relation M∗L = L∗M, we have

M̂[X1; X2] = M∗M[X1; X2] = M∗L[X1; X2]S = L∗M[X1; X2]S = L∗L[X1; X2]S² = L̂[X1; X2]S².

The iteration in (3.63) is structure-preserving for the εPCP PQEP, as shown in the following theorem.

Theorem 3.4.2. For the pencil M − λL given in (3.60), suppose that P ĀP = εB, P B̄P = εA, P K̄P = εK, where K = C − D. Then it holds that

P Â̄P = εB̂,  P B̂̄P = εÂ,  P K̂̄P = εK̂,  (3.64)

where Â, B̂, K̂ are defined in (3.63).


Proof. From (3.63b) and (3.64), we have

P K̂̄P = P K̄P − (P B̄P(P K̄P)^{-1}P ĀP + P ĀP(P K̄P)^{-1}P B̄P) = εK − ε³(AK^{-1}B + BK^{-1}A) = εK̂.

Similarly, from (3.63a) and (3.64), we have

P Â̄P = −P ĀP(P K̄P)^{-1}P ĀP = −ε³BK^{-1}B = εB̂,

and then P B̂̄P = εÂ.

The SDA for computing a basis of the stable invariant subspace of (M, L) can then be constructed, as summarized in Theorems 3.4.1 and 3.4.2.

3.4.2

Convergence of SDA

Consider the matrix pair (M, L) as in (3.60). In order to ensure that the SDA converges to a basis of the stable invariant subspace of (M, L), we suppose that all eigenvalues of (M, L) on the unit circle are semisimple (generically, multiple unimodular eigenvalues are rare; see [14]). From the Kronecker canonical form [28, 29], there exist nonsingular matrices Q and Z such that

QMZ = [J1, 0; 0, In] ≡ JM,  QLZ = [In, 0; 0, J2] ≡ JL,  (3.65)

where

J1 = Ω1 ⊕ Js,  J2 = Ω2 ⊕ εJ̄s,  Ω1 = diag{e^{iω1}, …, e^{iωℓ}},  Ω2 = diag{e^{iωℓ+1}, …, e^{iω2ℓ}},

and Js is the stable Jordan block of size m (i.e., ρ(Js) < 1) with m = n − ℓ. Here ρ denotes the spectral radius and ⊕ the direct sum of matrices. Since JM and JL commute with each other, it follows from (3.65) that

MZJL = Q^{-1}JLJM = LZJM.  (3.66)

Let {(Mk, Lk)}_{k=0}^∞ be the sequence with

Mk = [Ak, 0; −Ck, −In],  Lk = [Dk, In; Bk, 0],  (3.67)

where Ak, Bk, Ck and Dk are generated by the SDA. With M0 = M and L0 = L, it follows from (3.66) and Theorem 3.4.1 that

Mk Z JL^{2^k} = Lk Z JM^{2^k}.  (3.68)


Theorem 3.4.3. Let (M, L) be given in (3.60) with all its unimodular eigenvalues being semisimple. Write Z in (3.65) in the form

Z = [Z1, Z3; Z2, Z4],  Zi ∈ C^{n×n}, i = 1, …, 4.  (3.69)

Suppose that the sequence {Ak, Bk, Ck, Dk}_{k=1}^∞ generated by the SDA is well defined and {Ak}_{k=1}^∞ is uniformly bounded in k. If Z1 and Z3 are invertible, we have

(a) AkZ12 = O(ρ(Js)^{2^k}) → 0 as k → ∞;
(b) CkZ12 = −Z22 + O(ρ(Js)^{2^k}) → −Z22 as k → ∞;
(c) DkZ32 = −Z42 + O(ρ(J̄s)^{2^k}) → −Z42 as k → ∞,

where Z12 = Z1(:, ℓ+1 : n), Z22 = Z2(:, ℓ+1 : n), Z32 = Z3(:, ℓ+1 : n), Z42 = Z4(:, ℓ+1 : n) and ρ(·) is the spectral radius.

Proof. Substituting (Mk, Lk) from (3.67), JM and JL in (3.65) as well as Z in (3.69) into (3.68), we have

AkZ1 = (DkZ1 + Z2)(Ω1^{2^k} ⊕ Js^{2^k}),  (3.70a)

AkZ3(Ω2^{2^k} ⊕ J̄s^{2^k}) = DkZ3 + Z4,  (3.70b)

−(CkZ1 + Z2) = BkZ1(Ω1^{2^k} ⊕ Js^{2^k}),  (3.70c)

−(CkZ3 + Z4)(Ω2^{2^k} ⊕ J̄s^{2^k}) = BkZ3.  (3.70d)

From (3.70b), it follows that

Dk = −Z4Z3^{-1} + AkZ3(Ω2^{2^k} ⊕ J̄s^{2^k})Z3^{-1}.  (3.71)

Substituting (3.71) into (3.70a), we get

Ak[I − Z3(Ω2^{2^k} ⊕ J̄s^{2^k})Z3^{-1} Z1(Ω1^{2^k} ⊕ Js^{2^k})Z1^{-1}] = (−Z4Z3^{-1}Z1 + Z2)(Ω1^{2^k} ⊕ Js^{2^k})Z1^{-1}.

Since Ω1^{2^k} and Ω2^{2^k} are uniformly bounded (independent of k), and ρ(Js) < 1, we then have

Ak[I − (z3Ω2^{2^k}ω3⊤)(z1Ω1^{2^k}ω1⊤) + O(ρ(Js)^{2^k})] = (−Z4Z3^{-1}z1 + z2)Ω1^{2^k}ω1⊤ + O(ρ(Js)^{2^k}),

where

z1 = Z1(:, 1 : ℓ),  z2 = Z2(:, 1 : ℓ),  z3 = Z3(:, 1 : ℓ),  ω1⊤ = Z1^{-1}(1 : ℓ, :),  ω3⊤ = Z3^{-1}(1 : ℓ, :).

By the assumption that Ak is uniformly bounded in k, we have

Ak = akω1⊤ + O(ρ(Js)^{2^k})

for some suitable ak ∈ C^{n×ℓ} with ‖ak‖ uniformly bounded in k. Thus we have

AkZ12 = O(ρ(Js)^{2^k}) → 0, as k → ∞,

where Z12 = Z1(:, ℓ+1 : n) satisfies ω1⊤Z12 = 0. This proves (a). Since Bk = εP ĀkP by Theorem 3.4.2, from (3.70c) we obtain

Ck = −Z2Z1^{-1} + ckω1⊤ + O(ρ(Js)^{2^k})

for some suitable ck ∈ C^{n×ℓ} with ‖ck‖ uniformly bounded in k. Thus, we have

CkZ12 = −Z2Z1^{-1}Z12 + ckω1⊤Z12 + O(ρ(Js)^{2^k}) = −Z2[0; In−ℓ] + O(ρ(Js)^{2^k}) → −Z22

as k → ∞, where Z22 = Z2(:, ℓ+1 : n). This proves (b). Similarly, from (3.71), we see, as k → ∞, that

DkZ32 = −Z4Z3^{-1}Z32 + dkω3⊤Z32 + O(ρ(J̄s)^{2^k}) → −Z42,

where Z32 ≡ Z3(:, ℓ+1 : n), Z42 ≡ Z4(:, ℓ+1 : n) and dk ∈ C^{n×ℓ} is some suitable matrix which is uniformly bounded in k. This proves (c).

3.5

PDA

Quoted from [57], we develop the palindromic doubling algorithm (PDA) for the numerical solution of the palindromic generalized eigenvalue problem (PGEP) A∗ x = λAx,

(3.72)

where A is a real or complex N × N matrix, and λ ∈ C and x ∈ C^N \ {0} are an eigenvalue and a corresponding eigenvector of (3.72), respectively. Here, the symbol ∗ = ⊤ (transpose) or H (conjugate transpose). The pencil A∗ − λA and the pair (A∗, A) are usually called a palindromic linear pencil and a palindromic matrix pair, respectively. It is easily seen that the eigenvalues of (3.72) satisfy the reciprocal property, i.e., they appear in pairs {λ, 1/λ∗}.

PGEPs with complex coefficient matrices were first suggested as "good" linearizations [66, 67] of palindromic polynomial/quadratic matrix pencils, arising from the study of vibration analysis [46, 86]. A PGEP with real coefficient matrices can also be shown to be equivalent to the generalized continuous/discrete-time algebraic Riccati equations associated with the continuous/discrete-time linear-quadratic optimal control problems (see [85] for details). The standard approach for solving the PGEP is to compute its generalized Schur form (e.g., by qz in MATLAB), ignoring the symmetric or palindromic structure in (A∗, A). However, the reciprocal property of the eigenvalues of (3.72) is generally not preserved in computation, producing large numerical errors [68]. Recently,


a QR-like algorithm [76] and a hybrid method [68] (which combines a Jacobi-type method with Laub's trick) were proposed for the PGEP. The QR-like algorithm generally requires O(N⁴) flops and the hybrid method requires O(N³ log N) flops. Alternatively, for methods of cubic complexity, a URV-decomposition-based structured method [77] and a structure-preserving algorithm [41] for PGEPs were proposed, producing eigenvalues which are paired to working precision. Unfortunately for PGEPs, these more efficient methods require the transformation of the PGEP to the quadratic form (µ²A∗ + µ·0 + A)x = 0, leading to operations on larger 2N × 2N matrices. The PDA is a more direct, and thus more efficient, algorithm for the PGEP.

The purpose of this section is to develop the PDA for solving the PGEP structurally. We establish quadratic convergence, and linear convergence with rate 1/2, of the PDA, respectively, when (A∗, A) has no unimodular eigenvalues and when it has unimodular eigenvalues with partial multiplicities two. In application to discrete-time optimal control problems, we especially develop a new algorithm combined with the PDA (as in Algorithm 3.3.1) for solving the optimal control of singular descriptor linear systems. To our knowledge, the associated generalized discrete-time algebraic Riccati equation (GDARE) has not previously been solved successfully in a structure-preserving manner. We shall denote the open left-half plane and the open unit disk by C− and D1, respectively.

3.5.1

Palindromic doubling algorithm

For a given palindromic matrix pair (A∗, A), we shall develop a doubling algorithm for solving the associated PGEP which preserves the palindromic structure at each iterative step. Suppose −1 ∉ σ(A∗, A) (this assumption can be removed later; see Remark 3.5.1). We then have

A∗(A∗ + A)^{-1}A = ((A∗ + A) − A)(A∗ + A)^{-1}((A∗ + A) − A∗) = (I − A(A∗ + A)^{-1})((A∗ + A) − A∗) = A(A∗ + A)^{-1}A∗.  (3.73)

From (3.73), it is easily seen that

[A(A∗ + A)^{-1}, A∗(A∗ + A)^{-1}] [A∗; −A] = 0.

We now define the doubling transform A → Â by

Â = A(A∗ + A)^{-1}A.  (3.74)

Theorem 3.5.1. The matrix pair (Â∗, Â) has the doubling property; i.e., if

A∗U = AUS,  (3.75)

where U ∈ C^{N×ℓ} and S ∈ C^{ℓ×ℓ}, then

Â∗U = ÂUS².  (3.76)


Proof. Multiplying both sides of (3.75) by A∗(A∗ + A)^{-1}, and using (3.73) and (3.75), we obtain (3.76).

From Theorem 3.5.1, we see that the doubling transform (3.74) preserves the palindromic structure. So, for a palindromic matrix pair (A∗0, A0) with A0 ∈ C^{N×N}, we can develop the PDA to generate the sequence {(A∗k, Ak)}, provided no breakdown occurs in the iterative process.

Algorithm 3.5.1 (PDA). Given A0 ∈ C^{N×N} and τ (a small tolerance),
for k = 0, 1, 2, …, compute

Ak+1 = Ak(A∗k + Ak)^{-1}Ak;  (3.77)

if dist(Null(Ak+1), Null(Ak)) < τ, then stop;
end for
Here, Null(·) denotes the null space of the given matrix and dist(·, ·) denotes the distance between two subspaces.

To develop the PDA further, denote

Ak = Hk + Kk,  (3.78)

where

Hk = ½(A∗k + Ak) = H∗k,  Kk = ½(Ak − A∗k) = −K∗k

are the ∗-symmetric and ∗-anti-symmetric parts of Ak, respectively. Then the iteration (3.77) can be rewritten as

Ak+1 = Hk+1 + Kk+1 = ½(Hk + Kk)Hk^{-1}(Hk + Kk) = ½(Hk + KkHk^{-1}Kk) + Kk.

The iteration (3.77) in the PDA can thus be simplified to

Hk+1 = ½(Hk + K0Hk^{-1}K0),  Kk+1 = Kk = ⋯ = K0.
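One step of (3.77) and its eigenvalue-squaring (doubling) effect, per Theorem 3.5.1, can be checked numerically. A minimal numpy sketch for ∗ = H with a random complex matrix (all data illustrative):

```python
import numpy as np

def pda_step(A):
    """One palindromic doubling step (3.77); assumes A^* + A is invertible."""
    return A @ np.linalg.solve(A.conj().T + A, A)

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
# Eigenvalues of the pencil (A^H, A) are those of A^{-1} A^H.
lam = np.linalg.eigvals(np.linalg.solve(A, A.conj().T))
A1 = pda_step(A)
lam1 = np.linalg.eigvals(np.linalg.solve(A1, A1.conj().T))
# One doubling step squares the spectrum of the pencil.
assert np.allclose(np.sort_complex(lam ** 2), np.sort_complex(lam1), atol=1e-6)
```

Squaring drives eigenvalues inside the unit circle to 0 and those outside to ∞, which is why the null spaces of the iterates eventually capture the stable/unstable invariant subspaces.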

3.5.2

Convergence of PDA

Let A0 ∈ C^{N×N}. Suppose the eigenvalue "1" of (A∗0, A0) (if it exists) has partial multiplicity one or two, and the other unimodular eigenvalues of (A∗0, A0) (if they exist) have exactly partial multiplicity two. By the theorem of the Kronecker canonical form there are nonsingular matrices Q and Z such that

QA∗0Z = [J0 ⊕ Ω0, 0ℓ,ℓ̃ ⊕ Ir; 0ñ,n, Iℓ̃ ⊕ Ω0] ≡ C0,  QA0Z = [In, 0n,ñ; 0ñ,n, J̃∗0 ⊕ Ir] ≡ D0,  (3.79)

where Ω0 = diag(e^{iω1}, …, e^{iωr}), J0 ∈ C^{ℓ×ℓ} consists of stable Jordan blocks (i.e., ρ(J0) < 1, where ρ(·) is the spectral radius) and J̃∗0 = J0 ⊕ Im, with n = ℓ + r, ℓ̃ = ℓ + m, ñ = n + m = ℓ + r + m and N = 2n + m. Here "⊕" denotes the direct sum of matrices.

Since C0D0 = D0C0, from (3.79) we have that A∗0ZD0 = A0ZC0. From Theorem 3.5.1 and the steps in the PDA, it follows that

A∗kZD0^{2^k} = AkZC0^{2^k}.  (3.80)

Substituting (3.79) into (3.80), we get

A∗kZ [In, 0n,ñ; 0ñ,n, (J̃∗0)^{2^k} ⊕ Ir] = AkZ [J0^{2^k} ⊕ Ω0^{2^k}, 0ℓ,ℓ̃ ⊕ Γk; 0ñ,n, Iℓ̃ ⊕ Ω0^{2^k}],  (3.81)

where Γk = 2^k Ω0^{2^k − 1}. On the other hand, we can interchange the roles of A∗0 and A0 by considering the pair (A0, A∗0), which has the same Kronecker structure as (A∗0, A0). Therefore, there are nonsingular P and Y such that

P A0Y = [J∗0 ⊕ Ω∗0, 0ℓ,ℓ̃ ⊕ Ir; 0ñ,n, Iℓ̃ ⊕ Ω∗0] ≡ E0,  P A∗0Y = [In, 0n,ñ; 0ñ,n, J̃0 ⊕ Ir] ≡ F0.  (3.82)

Since E0F0 = F0E0, we deduce that A0Y F0 = A∗0Y E0. Using similar arguments as in (3.80)-(3.81), we obtain

AkY [In, 0n,ñ; 0ñ,n, J̃0^{2^k} ⊕ Ir] = A∗kY [(J∗0)^{2^k} ⊕ (Ω∗0)^{2^k}, 0ℓ,ℓ̃ ⊕ Γ∗k; 0ñ,n, Iℓ̃ ⊕ (Ω∗0)^{2^k}].  (3.83)

We partition Ak, Hk and K0 in (3.78) into four sub-blocks as in

Ak = [Ak1, Ak3; Ak2, Ak4],  Hk = [Hk1, H∗k2; Hk2, Hk4],  K0 = [K01, −K∗02; K02, K04],  (3.84)

where Ak1, Hk1, K01 ∈ C^{n×n}; A∗k2, Ak3, H∗k2, K∗02 ∈ C^{n×ñ} and Ak4, Hk4, K04 ∈ C^{ñ×ñ}. From (3.78) and (3.84), we also have

Ak1 = Hk1 + K01,  Ak2 = Hk2 + K02,  Ak3 = H∗k2 − K∗02,  Ak4 = Hk4 + K04.

Furthermore, we partition Z in (3.81) and Y in (3.83) as in

Z = [Z1, Z3; Z2, Z4],  Y = [Y1, Y3; Y2, Y4],  (3.85)

where Z1, Y1 ∈ C^{n×n}; Z∗2, Z3, Y∗2, Y3 ∈ C^{n×ñ} and Z4, Y4 ∈ C^{ñ×ñ}. For convenience, we denote

Zi,a ≡ Zi(:, 1 : ℓ),  Yi,a ≡ Yi(:, 1 : ℓ),  i = 3, 4;
Zi,b ≡ Zi(:, ℓ+1 : n),  Yi,b ≡ Yi(:, ℓ+1 : n),  i = 1, 2.


Theorem 3.5.2. Let A0 ∈ C^{N×N}. Suppose that the eigenvalue "1" of (A∗0, A0) (if it exists) has partial multiplicity one or two, that the other unimodular eigenvalues of (A∗0, A0) (if they exist) have exactly partial multiplicity two, and that (3.79) and (3.82) hold with ñ ≤ 2ℓ (i.e., r + m ≤ ℓ). Suppose that Z1 and Y1 in (3.85) are invertible, and that W ≡ [ΦZ3,a − Z4,a | ΨY3,a − Y4,a] ∈ C^{ñ×2ℓ} is of full row rank, where Φ ≡ Z2Z1^{-1} and Ψ ≡ Y2Y1^{-1}. If −1 ∉ ∪_{j=1}^r {e^{2^k iωj} : k ≥ 0}, then the sequence {(A∗k, Ak)} generated by the PDA is well defined and satisfies

A∗k[Z1; Z2] → 0,  Ak[Y1; Y2] → 0

linearly as k → ∞ with convergence rate 1/2, where span[Z1; Z2] and span[Y3; Y4] form the weakly stable and the unstable invariant subspaces of (A∗0, A0) corresponding to (J0 ⊕ Ω0, In) and (In, J∗0 ⊕ Ω∗0), respectively.

Proof. Since −1 ∉ ∪_{j=1}^r {e^{2^k iωj} : k ≥ 0}, from (3.77) we see that −1 ∉ σ(A∗k, Ak); thus A∗k + Ak is invertible for all k. From (3.84), (3.81) and (3.85), we have

A∗k1Z1 + A∗k2Z2 = Ak1Z1(J0^{2^k} ⊕ Ω0^{2^k}) + Ak3Z2(J0^{2^k} ⊕ Ω0^{2^k}),  (3.86)

A∗k3Z1 + A∗k4Z2 = Ak2Z1(J0^{2^k} ⊕ Ω0^{2^k}) + Ak4Z2(J0^{2^k} ⊕ Ω0^{2^k}),  (3.87)

A∗k1Z3((J̃∗0)^{2^k} ⊕ Ir) + A∗k2Z4((J̃∗0)^{2^k} ⊕ Ir) = Ak1[Z1(0ℓ,ℓ̃ ⊕ Γk) + Z3(Iℓ̃ ⊕ Ω0^{2^k})] + Ak3[Z2(0ℓ,ℓ̃ ⊕ Γk) + Z4(Iℓ̃ ⊕ Ω0^{2^k})],  (3.88)

A∗k3Z3((J̃∗0)^{2^k} ⊕ Ir) + A∗k4Z4((J̃∗0)^{2^k} ⊕ Ir) = Ak2[Z1(0ℓ,ℓ̃ ⊕ Γk) + Z3(Iℓ̃ ⊕ Ω0^{2^k})] + Ak4[Z2(0ℓ,ℓ̃ ⊕ Γk) + Z4(Iℓ̃ ⊕ Ω0^{2^k})].  (3.89)

Post-multiplying (3.88) by 0ℓ,ℓ̃ ⊕ Γk^{-1}Ω0^{2^k}, we get

A∗k1Z3(0ℓ,ℓ̃ ⊕ Γk^{-1}Ω0^{2^k}) + A∗k2Z4(0ℓ,ℓ̃ ⊕ Γk^{-1}Ω0^{2^k}) = (Ak1Z1 + Ak3Z2)(0ℓ ⊕ Ω0^{2^k}) + (Ak1Z3 + Ak3Z4)(0ℓ,ℓ̃ ⊕ Ω0^{2^k}Γk^{-1}Ω0^{2^k}).  (3.90)

Substituting (3.90) into (3.86) and using Ω0^{2^k}Γk^{-1}Ω0^{2^k} = 2^{-k}Ω0^{2^k + 1}, we have

A∗k1Z1 + A∗k2Z2 = (Ak1Z1 + Ak3Z2)(J0^{2^k} ⊕ 0r) + (Ak1Z1 + Ak3Z2)(0ℓ ⊕ Ω0^{2^k}) = (Ak1Z1 + Ak3Z2)(J0^{2^k} ⊕ 0r) + (A∗k1Z3 + A∗k2Z4)(0ℓ,ℓ̃ ⊕ Γk^{-1}Ω0^{2^k}) − (Ak1Z3 + Ak3Z4)(0ℓ,ℓ̃ ⊕ 2^{-k}Ω0^{2^k + 1}).  (3.91)


Using (3.84) and re-arranging (3.91), we get

Hk1{Z1[In − (J0^{2^k} ⊕ 0r)] − Z3[0ℓ,ℓ̃ ⊕ 2^{-k}Ω0(Ir − Ω0^{2^k})]} + H∗k2{Z2[In − (J0^{2^k} ⊕ 0r)] − Z4[0ℓ,ℓ̃ ⊕ 2^{-k}Ω0(Ir − Ω0^{2^k})]} = K01{Z1[In + (J0^{2^k} ⊕ 0r)] − Z3[0ℓ,ℓ̃ ⊕ 2^{-k}Ω0(Ir + Ω0^{2^k})]} − K∗02{Z2[In + (J0^{2^k} ⊕ 0r)] − Z4[0ℓ,ℓ̃ ⊕ 2^{-k}Ω0(Ir + Ω0^{2^k})]}.  (3.92)

Denote

ǫk ≡ max{ρ(J0)^{2^k}, 2^{-k}} → 0, as k → ∞.  (3.93)

Since ‖Ω0^{2^k}‖ is bounded and Z1 is invertible, by letting Φ ≡ Z2Z1^{-1}, (3.92) can be simplified to

Hk1 = −H∗k2(Φ + O(ǫk)) + K01 − K∗02Φ + O(ǫk).  (3.94)

Post-multiplying (3.88) by Iℓ̃ ⊕ Γk^{-1}, we have

A∗k1Z3((J̃∗0)^{2^k} ⊕ Γk^{-1}) + A∗k2Z4((J̃∗0)^{2^k} ⊕ Γk^{-1}) = Ak1[Z1(0ℓ,ℓ̃ ⊕ Ir) + Z3(Iℓ̃ ⊕ Ω0^{2^k}Γk^{-1})] + Ak3[Z2(0ℓ,ℓ̃ ⊕ Ir) + Z4(Iℓ̃ ⊕ Ω0^{2^k}Γk^{-1})].  (3.95)

From (3.84) and (3.93), (3.95) becomes

Hk1[Z3(Iℓ ⊕ 0m+r) + Z1(0ℓ,ℓ̃ ⊕ Ir) + O(ǫk)] + H∗k2[Z4(Iℓ ⊕ 0m+r) + Z2(0ℓ,ℓ̃ ⊕ Ir) + O(ǫk)] = −K01[Z3(Iℓ ⊕ 2Im ⊕ 0r) + Z1(0ℓ,ℓ̃ ⊕ Ir) + O(ǫk)] + K∗02[Z4(Iℓ ⊕ 2Im ⊕ 0r) + Z2(0ℓ,ℓ̃ ⊕ Ir) + O(ǫk)].  (3.96)

Substituting (3.94) into (3.96) we get

H∗k2{(Φ + O(ǫk))[Z3(Iℓ ⊕ 0m+r) + Z1(0ℓ,ℓ̃ ⊕ Ir) + O(ǫk)] − Z4(Iℓ ⊕ 0m+r) − Z2(0ℓ,ℓ̃ ⊕ Ir) + O(ǫk)} = O(1).

Since ΦZ1,b = Z2,b, it holds that

H∗k2([ΦZ3,a − Z4,a] + O(ǫk)) = O(1) ∈ C^{n×ℓ}.  (3.97)

On the other hand, from (3.84), (3.83) and (3.85), we have

Ak1Y1 + Ak3Y2 = A∗k1Y1((J∗0)^{2^k} ⊕ (Ω∗0)^{2^k}) + A∗k2Y2((J∗0)^{2^k} ⊕ (Ω∗0)^{2^k}),  (3.98)

Ak2Y1 + Ak4Y2 = A∗k3Y1((J∗0)^{2^k} ⊕ (Ω∗0)^{2^k}) + A∗k4Y2((J∗0)^{2^k} ⊕ (Ω∗0)^{2^k}),  (3.99)


k k k Ak1 Y3 (J˜02 ⊕ Ir ) + Ak3 Y4 (J˜02 ⊕ Ir ) =(A∗k1 Y3 + A∗k2 Y4 )(Iℓ˜ ⊕ (Ω∗0 )2 )

k Ak2 Y3 (J˜02

⊕ Ir ) +

k Ak4 Y4 (J˜02

+ (A∗k1 Y1 + A∗k2 Y2 )(0ℓ,ℓ˜ ⊕ Γ∗k ),

⊕ Ir )

k =(A∗k3 Y3 + A∗k4 Y4 )(Iℓ˜ ⊕ (Ω∗0 )2 ) + (A∗k3 Y1 + A∗k4 Y2 )(0ℓ,ℓ˜ ⊕ Γ∗k ).

(3.100) (3.101)

k

∗ −1 As in (3.90)–(3.92), post-multiplying (3.100) by 0ℓ,ℓ (Ω∗0 )2 and substi˜ ⊕ (Γk ) tuting it into (3.98), we have k

−k ∗ Ω0 ) Ak1 Y1 + Ak3 Y2 =(A∗k1 Y1 + A∗k2 Y2 )((J0∗ )2 ⊕ 0r ) + (A∗k1 Y3 + A∗k3 Y4 )(0ℓ,ℓ ˜ ⊕2 k

−k − (A∗k1 Y3 + A∗k2 Y4 )(0ℓ,ℓ (Ω∗0 )2 ˜ ⊕2

+1

).

(3.102)

From (3.84) and (3.93), (3.102) becomes o n k k −k ∗ Hk1 Y1 [In − ((J0∗ )2 ⊕ 0r )] − Y3 [0ℓ,ℓ Ω0 (Ir − (Ω∗0 )2 )] ˜ ⊕2 n o k k ∗ −k ∗ + Hk2 Y2 [In − ((J0∗ )2 ⊕ 0r )] − Y4 [0ℓ,ℓ Ω0 (Ir − (Ω∗0 )2 )] ˜ ⊕2 o n k k −k ∗ Ω0 (Ir + (Ω∗0 )2 )] = − K01 Y1 [In + ((J0∗ )2 ⊕ 0r )] − Y3 [0ℓ,ℓ ˜ ⊕2 n o k k ∗ −k ∗ + K02 Y2 [In + ((J0∗ )2 ⊕ 0r )] − Y4 [0ℓ,ℓ Ω0 (Ir + (Ω∗0 )2 )] , ˜ ⊕2

(3.103)

and then

∗ ∗ Hk1 = −Hk2 (Ψ + O(ǫk )) − K01 + K02 Ψ + O(ǫk ),

have

Ψ = Y2 Y1−1 .

Post-multiplying (3.100) by Iℓ˜ ⊕ (Γ∗k )−1 and substituting (3.103) into it, we

n ∗ Hk2 Y4 (Iℓ ⊕ 0m+r ) + Y2 (0ℓ,ℓ˜ ⊕ Ir ) + O(ǫk ) − (Ψ + O(ǫk ))[Y3 (Iℓ ⊕ 0m+r ) o +Y1 (0ℓ,ℓ˜ ⊕ Ir ) + O(ǫk )] = O(1).

Since ΨY1,b = Y2,b , it holds that

∗ Hk2 ([ΨY3,a − Y4,a ] + O(ǫk )) = O(1) ∈ Cn˜ ×ℓ .

(3.104)

Combining (3.97) and (3.104) we get ∗ Hk2 ([ΦZ3,a − Z4,a |ΨY3,a − Y4,a ] + O(ǫk )) = O(1) ∈ Cn˜ ×2ℓ .

(3.105)

By the assumption that W ≡ [ΦZ3,a −Z4,a |ΨY3,a −Y4,a ] ∈ Cn˜ ×2ℓ is of full row rank, ∗ it follows that kHk2 k is uniformly bounded on k. Consequently, (3.94) implies that kHk1 k, and in turn kAk1 k and kA∗k2 k, are uniformly bounded on k. From (3.91), it follows that A∗k1 Z1 + A∗k2 Z2 = O(ǫk ) → 0,

as k → ∞.

(3.106)


Applying a similar argument as in (3.90)-(3.92) to (3.99) and (3.101), we deduce that

Hk4 = −Hk2(Y2Y1^{-1} + O(ǫk)) + K04 − K02Y2Y1^{-1} + O(ǫk).

Thus, (3.105) implies that ‖Hk4‖, and in turn ‖Ak4‖, are uniformly bounded in k. To show A∗k3Z1 + A∗k4Z2 = O(ǫk), we post-multiply (3.89) by 0ℓ,ℓ̃ ⊕ Γk^{-1}Ω0^{2^k} and obtain

A∗k3Z3(0ℓ,ℓ̃ ⊕ Γk^{-1}Ω0^{2^k}) + A∗k4Z4(0ℓ,ℓ̃ ⊕ Γk^{-1}Ω0^{2^k}) = (Ak2Z1 + Ak4Z2)(0ℓ ⊕ Ω0^{2^k}) + (Ak2Z3 + Ak4Z4)(0ℓ,ℓ̃ ⊕ 2^{-k}Ω0^{2^k + 1}).  (3.107)

Substituting (3.107) into (3.87), as in (3.91) we have

A∗k3Z1 + A∗k4Z2 = (Ak2Z1 + Ak4Z2)(J0^{2^k} ⊕ 0r) + (A∗k3Z3 + A∗k4Z4)(0ℓ,ℓ̃ ⊕ 2^{-k}Ω0) − (Ak2Z3 + Ak4Z4)(0ℓ,ℓ̃ ⊕ 2^{-k}Ω0^{2^k + 1}) = O(ǫk) → 0, as k → ∞.  (3.108)

Combining (3.106) and (3.108), we have shown that

[A∗k1, A∗k2; A∗k3, A∗k4][Z1; Z2] = O(ǫk) → 0, as k → ∞.

Similarly, as in (3.90)-(3.91), from (3.99) and (3.101) we have

Ak2Y1 + Ak4Y2 = (A∗k3Y1 + A∗k4Y2)((J∗0)^{2^k} ⊕ 0r) + (Ak2Y3 + Ak4Y4)(0ℓ,ℓ̃ ⊕ 2^{-k}Ω∗0) − (A∗k3Y3 + A∗k4Y4)(0ℓ,ℓ̃ ⊕ 2^{-k}(Ω∗0)^{2^k + 1}) = O(ǫk).  (3.109)

Using the boundedness of ‖Aki‖, i = 1, …, 4, and combining (3.102) and (3.109), we have shown that

[Ak1, Ak3; Ak2, Ak4][Y1; Y2] = O(ǫk) → 0, as k → ∞.

The convergence rate is 1/2 because 2^{-k} dominates ρ(J0)^{2^k} in (3.93) for sufficiently large values of k.

Remark 3.5.1. Consider the assumption −1 ∉ U ≡ ∪_{j=1}^r {e^{2^k iωj} : k ≥ 0} in Theorem 3.5.2. Since U is a countable set (possibly dense on the unit circle only when r → ∞), there exists some −e^{iθ0} ∉ U. With A∗new ≡ e^{-iθ0/2}A∗0, we have

A∗new + Anew = e^{-iθ0/2}A∗0 + e^{iθ0/2}A0 = e^{-iθ0/2}(A∗0 + e^{iθ0}A0)

being invertible. It is unclear how the "optimal" θ0 can be found.

Theorem 3.5.3. Suppose (A∗0, A0) has no unimodular eigenvalues. Then the sequence {(A∗k, Ak)} generated by the PDA satisfies

A∗k[Z1; Z2] → 0,  Ak[Y1; Y2] → 0

quadratically, as k → ∞, with convergence rate ρ(J0).


Proof. Since (A∗0, A0) has no unimodular eigenvalues, Theorem 3.5.2 implies that (A∗k, Ak) has no unimodular eigenvalues and A∗k + Ak is invertible. So the PDA is well defined. From (3.84), (3.81) and (3.85), we have

A∗k1Z1 + A∗k2Z2 = Ak1Z1J0^{2^k} + Ak3Z2J0^{2^k},  (3.110)

A∗k3Z1 + A∗k4Z2 = Ak2Z1J0^{2^k} + Ak4Z2J0^{2^k}.  (3.111)

From (3.84), it holds that

Hk1Z1 + H∗k2Z2 = (K01Z1 − K∗02Z2)(I + J0^{2^k})(I − J0^{2^k})^{-1}.

Therefore, ‖Hk1Z1 + H∗k2Z2‖ ≤ O(1), and this implies

‖(Hk1 + K01)Z1 + (H∗k2 − K∗02)Z2‖ = ‖Ak1Z1 + Ak3Z2‖ ≤ O(1).  (3.112)

From (3.110) and (3.112), we have

A∗k1Z1 + A∗k2Z2 = O(ρ(J0)^{2^k}) → 0, as k → ∞.

Similarly, from (3.111), we obtain

Hk2Z1 + Hk4Z2 = (K02Z1 + K04Z2)(I + J0^{2^k})(I − J0^{2^k})^{-1},

which is uniformly bounded in k. This implies

A∗k3Z1 + A∗k4Z2 = O(ρ(J0)^{2^k}) → 0, as k → ∞.

This shows that A∗k[Z1; Z2] → 0 quadratically, with convergence rate ρ(J0). Similarly, from (3.84), (3.83) and (3.85), we can also show that Ak[Y1; Y2] → 0 quadratically, with rate ρ(J0).

3.5.3

Applications of PDA

In this section, we apply the PDA to find all the eigenpairs of a general PGEP, and to compute the c-/d-stabilizing solutions of generalized continuous-/discrete-time algebraic Riccati equations (GCARE/GDARE). In particular, we apply Algorithm 3.5.2 to the computation of the d-semi-stabilizing solution of GDAREs arising in the optimal control of singular descriptor linear systems. To our knowledge, Algorithm 3.5.2 is the first structure-preserving algorithm for solving GDAREs associated with singular descriptor systems. The operation count or complexity depends on the details of the individual applications, and on whether efficiency can be squeezed from their finer structures. For the PDA itself, it suffices to say that the algorithm is of O(N^3) complexity per iteration. In addition, for problems without unimodular eigenvalues, the convergence

180

Eric King-wah Chu, Wen-Wei Lin

is quadratic and typically fewer than ten iterations are required for convergence to machine accuracy.

PGEP

We can apply the PDA to solve the PGEP A_0^* x = λ A_0 x, where A_0 ∈ C^{2n×2n}. First, we apply the PDA to A_0 until convergence to A_k. Then we compute bases Z_s, Y_s ∈ C^{2n×n} for the right and left null spaces of A_k^*, respectively, satisfying

A_k^* Z_s = 0,  Y_s^* A_k^* = 0.

This implies that there are S and T ∈ C^{n×n} with ρ(S) ≤ 1 and ρ(T) ≤ 1 such that

A_0^* Z_s = A_0 Z_s S,  A_0 Y_s = A_0^* Y_s T.    (3.113)

From (3.113), S and T can be computed by

S = (Y_s^* A_0 Z_s)^{−1} (Y_s^* A_0^* Z_s) ≡ S_1^{−1} S_2,  T = (Z_s^* A_0^* Y_s)^{−1} (Z_s^* A_0 Y_s) ≡ S_1^{−*} S_2^*.

Rewrite the second equation of (3.113) as

A_0 (Y_s S_1^{−*}) = A_0^* (Y_s S_1^{−*}) S_2^* S_1^{−*} = A_0^* (Y_s S_1^{−*}) S^*.

Compute S g_j = λ_j g_j and S^* h_j = λ_j^* h_j, as well as z_j = Z_s g_j and y_j = (Y_s S_1^{−*}) h_j, for j = 1, …, n. It then holds that

A_0^* z_j = λ_j A_0 z_j,  λ_j^* A_0^* y_j = A_0 y_j,  j = 1, …, n.

GCARE

Here we are interested in finding the c-stabilizing solution of the generalized continuous-time algebraic Riccati equation (GCARE)

A_c^⊤ X_c E_c + E_c^⊤ X_c A_c − (N_c + E_c^⊤ X_c B_c) R_c^{−1} (N_c + E_c^⊤ X_c B_c)^⊤ + M_c = 0,    (3.114)

which solves the continuous-time linear-quadratic control problem

min_u (1/2) ∫_0^∞ [x; u]^⊤ [M_c, N_c; N_c^⊤, R_c] [x; u] dt

subject to the descriptor linear system

E_c ẋ = A_c x + B_c u,  x(0) = x_0,    (3.115)

where E_c, A_c, M_c = M_c^⊤, X_c = X_c^⊤ ∈ R^{n×n}, B_c, N_c ∈ R^{n×m} and R_c = R_c^⊤ ∈ R^{m×m}, with E_c and R_c being nonsingular. Furthermore, the c-stabilizing closed-loop matrix pencil of (3.115) is given by A_c + B_c K_c − λE_c with σ(A_c + B_c K_c, E_c) ⊆ C^−, where K_c ≡ −R_c^{−1}(B_c^⊤ X_c E_c + N_c^⊤).

Let

M_c − λL_c ≡ [0, A_c, B_c; A_c^⊤, M_c, N_c; B_c^⊤, N_c^⊤, R_c] − λ [0, E_c, 0; −E_c^⊤, 0, 0; 0, 0, 0].    (3.116)

One common approach to solve (3.114) is to compute the n-dimensional, c-stable invariant subspace U_c of the symmetric/skew-symmetric pencil M_c − λL_c corresponding to the eigenvalue matrix pair (S_c, E_c) with σ(S_c, E_c) ⊆ C^−, where U_c is the column space of U_c ∈ R^{(2n+m)×n} which satisfies M_c U_c E_c = L_c U_c S_c. We assume that the matrix pencil M_c − λL_c has no eigenvalues on the imaginary axis. The generalized eigenvalues of (M_c, L_c) can then be arranged as

λ_1, …, λ_n, −λ̄_1, …, −λ̄_n, ∞, …, ∞ (m copies),

where λ_i ∈ C^−, for 1 ≤ i ≤ n. The m trivial infinite eigenvalues come from the nonsingularity of R_c. With

U_c = [X_c E_c; I_n; K_c]  (block rows of sizes n, n and m),

X_c is the c-stabilizing solution of GCARE (3.114) and K_c is the optimal controller for (3.115) [85].

In order to utilize the PDA to compute an orthogonal basis V = [V_1^⊤, V_2^⊤, V_3^⊤]^⊤ for U_c, with V_1, V_2 ∈ R^{n×n}, we consider the Cayley transformation

A_0^⊤ − λA_0 = (M_c + L_c) − λ(M_c − L_c),

where

A_0 = M_c − L_c = [0, A_c − E_c, B_c; A_c^⊤ + E_c^⊤, M_c, N_c; B_c^⊤, N_c^⊤, R_c].
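The structure used above can be checked directly in a minimal NumPy sketch (random illustrative data): M_c is symmetric, L_c is skew-symmetric, and hence A_0^⊤ = M_c + L_c for A_0 = M_c − L_c:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 2
Ec = rng.standard_normal((n, n)) + n * np.eye(n)          # nonsingular E_c
Ac, Bc = rng.standard_normal((n, n)), rng.standard_normal((n, m))
Nc = rng.standard_normal((n, m))
Mc_blk = rng.standard_normal((n, n)); Mc_blk = Mc_blk + Mc_blk.T  # symmetric M_c block
Rc = np.eye(m)                                            # symmetric nonsingular R_c

Z = np.zeros
M = np.block([[Z((n, n)), Ac, Bc],
              [Ac.T, Mc_blk, Nc],
              [Bc.T, Nc.T, Rc]])
L = np.block([[Z((n, n)), Ec, Z((n, m))],
              [-Ec.T, Z((n, n)), Z((n, m))],
              [Z((m, n)), Z((m, n)), Z((m, m))]])

A0 = M - L
print(np.allclose(M, M.T), np.allclose(L, -L.T), np.allclose(A0.T, M + L))
```

So the Cayley pair (M_c + L_c, M_c − L_c) is indeed of the palindromic form (A_0^⊤, A_0).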

Then the c-stabilizing solution X_c for GCARE (3.114) can be obtained by X_c = V_1 V_2^{−1} E_c^{−1}.

GDARE

Here we are interested in finding the d-semi-stabilizing solution of the generalized discrete-time algebraic Riccati equation (GDARE)

A_d^⊤ X_d A_d − E_d^⊤ X_d E_d − (N_d + A_d^⊤ X_d B_d)(R_d + B_d^⊤ X_d B_d)^{−1}(N_d + A_d^⊤ X_d B_d)^⊤ + M_d = 0,    (3.117)

which solves the discrete-time linear-quadratic control problem

min_{u_k} (1/2) Σ_{k=0}^∞ [x_k; u_k]^⊤ [M_d, N_d; N_d^⊤, R_d] [x_k; u_k]


subject to the singular descriptor linear system

E_d x_{k+1} = A_d x_k + B_d u_k,    (3.118)

with initial state x_0, where E_d, A_d, M_d = M_d^⊤, X_d = X_d^⊤ ∈ R^{n×n}, B_d, N_d ∈ R^{n×m} and R_d = R_d^⊤ ∈ R^{m×m}, with E_d being singular. Furthermore, the d-semi-stabilizing closed-loop matrix pencil of (3.118) is given by A_d + B_d K_d − λE_d with σ(A_d + B_d K_d, E_d) ⊆ D_1 ∪ {∞}, where K_d ≡ −(R_d + B_d^⊤ X_d B_d)^{−1}(B_d^⊤ X_d A_d + N_d^⊤). Let

M_d − λL_d ≡ [0, A_d, B_d; E_d^⊤, M_d, N_d; 0, N_d^⊤, R_d] − λ [0, E_d, 0; A_d^⊤, 0, 0; B_d^⊤, 0, 0].

One common approach to solve (3.117) is to compute the n-dimensional, d-semi-stable invariant subspace U_d of the matrix pencil M_d − λL_d corresponding to the eigenvalue matrix pair (S_d, E_d) with σ(S_d, E_d) ⊆ D_1 ∪ {∞}, where U_d is the column space of U_d ∈ R^{(2n+m)×n} which satisfies M_d U_d E_d = L_d U_d S_d. With

U_d = [X_d E_d; I_n; K_d]  (block rows of sizes n, n and m),

X_d is the d-semi-stabilizing solution of GDARE (3.117) and K_d is the optimal controller for (3.118) [85]. Assume that the matrix pencil M_d − λL_d has no eigenvalues on the unit circle, r_d = nullity(E_d) and ind_∞(A_d, E_d) ≤ 1. From [85] we see that

σ(M_d, L_d) = σ(A_d + B_d K_d, E_d) ∪ σ(E_d^⊤, (A_d + B_d K_d)^⊤) ∪ σ(0_m, I_m).    (3.119)

So, the generalized eigenvalues of (M_d, L_d) corresponding to (3.119) can be arranged as

{λ_1, …, λ_{n−r_d}, ∞, …, ∞ (r_d copies)} ∪ {0, …, 0 (r_d copies), λ_1^{−1}, …, λ_{n−r_d}^{−1}} ∪ {∞, …, ∞ (m copies)},    (3.120)

where λ_i ∈ D_1 (possibly zero), i = 1, …, n − r_d. For convenience, we apply the convention that 0 and ∞ are mutually reciprocal. The r_d infinite and r_d zero eigenvalues in (3.120) come from the assumption r_d = nullity(E_d). The last m trivial infinite eigenvalues come from the last m columns of L_d. In fact, (A_d + B_d K_d, E_d) is an eigenvalue matrix pair associated with the d-semi-stable invariant subspace U_d.

We now introduce an elegant transformation between the coefficient matrices of the GDARE (3.117) and GCARE (3.114), proposed in [85]. We define

f_W : (E_d, A_d, B_d, M_d, N_d, R_d) → (E_c, A_c, B_c, M_c, N_c, R_c),


where the matrices E_c, A_c, B_c, M_c, N_c, R_c satisfy

[E_c, 0; A_c, B_c] = χ [E_d, 0; A_d, B_d] W_d^⊤ = (1/√2) [A_d + E_d, B_d; A_d − E_d, B_d] W_d^⊤,    (3.121)
[M_c, N_c; N_c^⊤, R_c] = W_d [M_d, N_d; N_d^⊤, R_d] W_d^⊤,

in which χ = (1/√2) [I_n, I_n; −I_n, I_n], and [A_d + E_d, B_d] = [H, 0] W_d is the QR-type factorization with W_d being orthogonal and H being lower triangular. By the important property of f_W in [85], it is assumed that rank[A_d + E_d, B_d] = n and (M_d, L_d) has no eigenvalue −1. Thus the coefficient matrix tuple (E_c, A_c, B_c, M_c, N_c, R_c) corresponds to a GCARE (3.114) with E_c and R_c being nonsingular. Furthermore, the GDARE (3.117) and the GCARE (3.114) share the same stabilizing solution, i.e., X_d = X_c. We construct (M_c, L_c) from (E_c, A_c, B_c, M_c, N_c, R_c) as in (3.116), which satisfies

M_c + L_c = W^{−1} M_d W,    (3.122)

where W ≡ diag(√2 I_n, W_d^⊤). Let

(A_0^⊤, A_0) = (M_c + L_c, M_c − L_c)    (3.123)

be the Cayley transformation of (M_c, L_c). From (3.121) and (3.122), we see that the eigenvalues λ_d ∈ σ(M_d, L_d), µ ∈ σ(M_c, L_c) and λ ∈ σ(A_0^⊤, A_0) satisfy the relationship in Table 1, in which µ = (λ − 1)(λ + 1)^{−1}. From Table 1, we see that the key property of the transformation λ_d → λ is that it maps the m trivial infinite eigenvalues to m trivial eigenvalues −1, while preserving the other eigenvalues (including the nontrivial ∞) unchanged.

Table 1. Correspondence among λ_d, µ and λ

λ_d :  0 < |λ_d| < 1   |λ_d| > 1   0    ∞    m trivial ∞
µ   :  Re(µ) < 0       Re(µ) > 0   −1   1    m trivial ∞
λ   :  λ = λ_d         λ = λ_d     0    ∞    m trivial −1

In the following, we use the PDA and the special structure of (M_d, L_d) for the computation of the d-semi-stabilizing solution X_d of GDARE (3.117). Firstly, we apply the PDA to the matrix A_0 until convergence to A_k. Then we compute the orthogonal bases N_r and N_ℓ ∈ R^{(2n+m)×n} for the right and left null spaces of A_k^⊤; i.e.,

A_k^⊤ N_r = 0,  A_k N_ℓ = 0,    (3.124)

which form orthogonal bases for the d-stable invariant subspaces of (A_0^⊤, A_0) and (A_0, A_0^⊤), respectively. We then compute the QR-factorization A_0 N_r = Q_1 R_1, where Q_1 is orthogonal and R_1 is upper triangular. Next compute

S = Q_s^⊤ A_0^⊤ N_r,  T = Q_s^⊤ A_0 N_r,


where Q_s = Q_1(:, 1:n). We see that (S, T) forms the d-stable eigenvalue matrix pair of (A_0^⊤, A_0) associated with N_r, and T is clearly nonsingular.

We would like to separate the invariant subspaces of (A_0^⊤, A_0) corresponding to the zero and nonzero d-stable eigenvalues. Let G = T^{−1} S. By Van Dooren's algorithm [82], there is an orthogonal matrix Φ ∈ R^{n×n} such that

Φ^⊤ G Φ = [G_{11}, G_{12}; 0, G_{22}],

where G_{11} ∈ R^{s×s} with σ(G_{11}) = {λ ∈ σ(A_0^⊤, A_0) : 0 < |λ| < 1} and G_{22} ∈ R^{(n−s)×(n−s)} with σ(G_{22}) = {0}. Since σ(G_{11}) ∩ σ(G_{22}) = ∅, there is a Ψ = [I_s, Ψ_{12}; 0, I_{n−s}] such that

Ψ^{−1} Φ^⊤ G Φ Ψ = [G_{11}, 0; 0, G_{22}],

where Ψ_{12} solves the Sylvester equation G_{11} Ψ_{12} − Ψ_{12} G_{22} = G_{12} uniquely. Then

V_0 = N_r Φ Ψ(:, s+1 : n),  V̂_s = N_r Φ Ψ(:, 1 : s)    (3.125)

span the invariant subspaces of (A_0^⊤, A_0) corresponding to the zero and nonzero d-stable eigenvalues, respectively.

Let ζ̂_ℓ span the left null space of E_d. Then ζ_ℓ = [ζ̂_ℓ^⊤, 0]^⊤ ∈ R^{(2n+m)×r_d} contains the r_d eigenvectors of (M_d, L_d) corresponding to the trivial zeros. From the transformation (3.122), we see that W^{−1} ζ_ℓ contains the r_d eigenvectors of (A_0^⊤, A_0) corresponding to the trivial zeros. Now, we want to extract W^{−1} ζ_ℓ from span{V_0}. Compute the QR-factorization [W^{−1} ζ_ℓ, V_0] = Q_0 R_0, where Q_0 is orthogonal and R_0 is upper triangular. Let

V̂_0 = Q_0(:, r_d + 1 : n − s),    (3.126)

which forms the eigenvectors of (A_0^⊤, A_0) corresponding to the zero eigenvalues of (S_d, E_d).

We next find the invariant subspace U_∞ of (A_0^⊤, A_0) corresponding to the infinite eigenvalues. Compute the QR-factorization A_0 N_ℓ = Q_∞ R_∞, where Q_∞ is orthogonal and R_∞ is upper triangular. Let

N_∞ = N_ℓ Q_∞(:, s+1 : n) ≡ [N_{∞,1}; N_{∞,2}; N_{∞,3}]  (block rows of sizes n, n and m),
V_s = [V̂_s, V̂_0] ≡ [V_{s1}; V_{s2}; V_{s3}]  (block rows of sizes n, n and m).

From the Cayley transform, there is a full rank matrix Z ∈ R^{(n−s)×r_d} so that

V = [V_{s1}, N_{∞,1} Z; V_{s2}, N_{∞,2} Z; V_{s3}, N_{∞,3} Z] ≡ [V_1; V_2; V_3]    (3.127)


is a basis of an invariant subspace of (A_0^⊤, A_0), satisfying span{V} = span{[X_c E_c; I_n; K_c]}. To determine Z, (3.127) and the fact X_c = X_c^⊤ imply

[V_{s2}, N_{∞,2} Z]^⊤ E_c^⊤ [V_{s1}, N_{∞,1} Z] = [V_{s1}, N_{∞,1} Z]^⊤ E_c [V_{s2}, N_{∞,2} Z].

That is,

V_{s2}^⊤ E_c^⊤ V_{s1} = V_{s1}^⊤ E_c V_{s2},    (3.128a)
Z^⊤ N_{∞,2}^⊤ E_c^⊤ V_{s1} = Z^⊤ N_{∞,1}^⊤ E_c V_{s2},    (3.128b)
V_{s2}^⊤ E_c^⊤ N_{∞,1} Z = V_{s1}^⊤ E_c N_{∞,2} Z,    (3.128c)
Z^⊤ N_{∞,2}^⊤ E_c^⊤ N_{∞,1} Z = Z^⊤ N_{∞,1}^⊤ E_c N_{∞,2} Z.    (3.128d)

Since E_c is nonsingular, from the isotropic property of [V_{s1}; V_{s2}] and [N_{∞,1}; N_{∞,2}], (3.128a) and (3.128d) hold automatically. Since (3.128b) is the transpose of (3.128c), the matrix Z is found by computing a basis of Null(V_{s2}^⊤ E_c^⊤ N_{∞,1} − V_{s1}^⊤ E_c N_{∞,2}). Finally, the d-semi-stabilizing solution X_d for GDARE (3.117) can be obtained by

X_d = X_c = V_1 V_2^{−1} E_c^{−1}.    (3.129)
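The determination of Z reduces to a standard SVD-based null-space computation; below is a hedged sketch, with an illustrative coefficient matrix K standing in for V_{s2}^⊤ E_c^⊤ N_{∞,1} − V_{s1}^⊤ E_c N_{∞,2} (sizes hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)
p, q = 3, 5                      # more columns than rows: nontrivial null space
K = rng.standard_normal((p, q))

_, sv, Vh = np.linalg.svd(K)
rank = int((sv > 1e-12 * sv[0]).sum())
Z = Vh[rank:].conj().T           # orthonormal basis of Null(K)

print(np.linalg.norm(K @ Z) < 1e-10, Z.shape[1] == q - rank)
```

The SVD gives an orthonormal basis, which keeps the subsequent formation of V in (3.127) well-conditioned.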

We summarize the computational steps (3.123)–(3.129) for X_d in Algorithm 3.5.2.

Algorithm 3.5.2. (for GDARE (3.117))
Input: E_d, A_d, B_d, M_d, N_d, R_d; τ (a small tolerance).
Output: the d-semi-stabilizing solution X_d of (3.117).
1. Construct A_0 via (3.123).
2. Apply the PDA to (A_0^⊤, A_0) until dist(Null(A_k), Null(A_{k−1})) < τ.
3. Compute bases N_r, N_ℓ for the right and left null spaces of A_k^⊤ as in (3.124).
4. Compute bases V_0, V̂_s for the d-stable invariant subspaces of (A_0^⊤, A_0) as in (3.125).
5. Compute the eigenvectors V̂_0 of (A_0^⊤, A_0) corresponding to zeros as in (3.126).
6. Determine Z by (3.128c).
7. Compute X_d = V_1 V_2^{−1} E_c^{−1} as in (3.129).

3.5.4 Additional remarks

We have developed the palindromic doubling algorithm (PDA) for solving the palindromic generalized eigenvalue problem (PGEP) A∗ x = λAx structurally. We prove quadratic convergence and linear convergence with rate 1/2 of the PDA, when (A∗ , A) has no unimodular eigenvalues and has unimodular eigenvalues


with partial multiplicities two (one or two for the eigenvalue 1), respectively. Algorithm 3.5.2 is specially developed for the computation of the d-semi-stabilizing solution of the generalized discrete-time algebraic Riccati equation (GDARE) for singular descriptor linear systems. It is the first structure-preserving algorithm for singular descriptor systems. Our numerical experience indicates that the PDA is not necessarily better than other specialist algorithms (if they exist) for solving the original problem without linearizing the associated palindromic matrix polynomials; such specialist algorithms may be able to better utilize the finer structures of the original problems. Our numerical examples showed selected applications for which the PDA was better, or for which no specialist structure-preserving algorithms exist. For future work, research will be conducted on how the finer structures can be fully utilized in individual applications. For a general PGEP without finer structures, the PDA is the only structure-preserving algorithm which performs reasonably efficiently. Consequently, the "good" vibrations from "good" linearizations [66, 67] can always be computed using the PDA, in the absence of better methods. Of course, numerical solutions from the PDA or other methods may be refined using the finer structures in the original problems, if feasible.

3.6 Large-scale problems

As an example, we shall only sketch the results from [60] on large-scale DAREs. For large-scale CAREs, NAREs, NMEs and Stein/Lyapunov equations, please consult [21, 22, 61, 62]. Let the system matrix A be large and sparse, possibly with band structure. The discrete-time algebraic Riccati equation (DARE) in (1.1),

D(X) ≡ −X + A^⊤ X (I + GX)^{−1} A + H = 0, with the low-ranked G = B R^{−1} B^⊤ and H = C T^{−1} C^⊤,    (3.130)

where B ∈ R^{n×m}, C ∈ R^{n×l} and m, l ≪ n, arises often in linear-quadratic optimal control problems. The solution of DAREs and CAREs has been an extremely active area of research; see, e.g., [23, 55, 70]. The usual solution methods, such as the Schur vector method, symplectic SR methods, the matrix sign function, the matrix disk function or the doubling method, have not made (full) use of the sparsity and structure in A, G and H. Requiring in general O(n^3) flops and workspace of size O(n^2), these methods are obviously inappropriate for the large-scale problems we are interested in here.

For control problems for parabolic PDEs and the balancing-based model order reduction of large linear systems, large-scale CAREs, DAREs, Lyapunov and Stein equations have to be solved [6, 7], [48]–[50]. As stated in [7], "the basic observation on which all methods for solving such kinds of matrix equations are based, is that often the (numerical) rank of the solution is very small compared to its actual dimension and therefore it allows for a good approximation via low rank solution factors". Importantly, without solving the corresponding algebraic Riccati equations, alternative solutions to the optimal control problem require the deflating subspaces of the corresponding Hamiltonian matrices or (generalized) symplectic pencils, which are prohibitively expensive to compute. Benner and his co-authors have done much work on large-scale algebraic Riccati equations; see [6, 75] and the references therein. They built their methods on (inexact) Newton's methods, with inner iterations for the associated Lyapunov and Stein equations. We shall adapt the structure-preserving doubling algorithm (SDA) [15, 17, 64], making use of the sparsity in A and the (numerically) low-ranked structures in G, H and X. For other applications of the SDA, see [19]. We shall concentrate on the numerical solution of large-scale DAREs in this section. The results for large-scale CAREs and Stein/Lyapunov equations will be presented elsewhere [61, 62].

3.6.1 Structure-preserving doubling algorithm for DAREs

The structure-preserving doubling algorithm (SDA) [17], assuming (I + GH)^{−1} exists, has the form in (1.2). We can apply the Sherman–Morrison–Woodbury formula (SMWF) to (I + GH)^{−1} and make use of the low-ranked forms of G and H in (3.130). It is then obvious that (I + GH)^{−1} is a low-rank update of the identity matrix, and from (1.2) that Ĝ, Ĥ and Â are low-rank updates of the corresponding previous iterates. So we can organize the SDA in the low-rank update form

A_k = A_{k−1}^2 − D_k^{(1)} S_k^{−1} [D_k^{(2)}]^⊤,  G_k = B_k R_k^{−1} B_k^⊤,  H_k = C_k T_k^{−1} C_k^⊤.
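Since (1.2) is not reproduced in this section, the sketch below uses the commonly published dense form of the SDA for the DARE (3.130), on a small stable toy problem (data illustrative); the low-rank variant above replaces these dense updates by factored ones:

```python
import numpy as np

# Toy DARE: X = A^T X (I + G X)^{-1} A + H, with A strictly stable, G, H >= 0.
n = 3
A = np.array([[0.5, 0.1, 0.0],
              [0.0, 0.3, 0.1],
              [0.0, 0.0, 0.2]])
b = np.array([[1.0], [0.5], [0.0]]); G = 0.1 * b @ b.T
c = np.array([[0.2], [1.0], [0.4]]); H = c @ c.T

Ak, Gk, Hk = A.copy(), G.copy(), H.copy()
I = np.eye(n)
for _ in range(20):
    W = np.linalg.inv(I + Gk @ Hk)            # doubling step (dense form)
    Ak, Gk, Hk = (Ak @ W @ Ak,
                  Gk + Ak @ W @ Gk @ Ak.T,
                  Hk + Ak.T @ Hk @ W @ Ak)

X = Hk                                        # H_k converges to the solution
R = -X + A.T @ X @ np.linalg.solve(I + G @ X, A) + H
print(np.linalg.norm(R))                      # DARE residual D(X)
```

With A_k = O(|λ|^{2^k}) → 0 for some |λ| < 1, the updates to G_k and H_k diminish quadratically, which is exactly what the low-rank organization exploits.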

The previous attempts at generalizing the SDA to large-scale problems were abandoned mainly because of the difficulties presented by the computation of A_k, which fills up as the iteration proceeds. There are different ways of handling this problem, depending on the structure of A, H and G. Under favourable assumptions on the band-width of A, we may just compute A_k explicitly, with the growth in its exponentially growing band-width outrun by the fast convergence of the SDA. Alternatively, we may assume that multiplications of A and A^⊤ with arbitrary vectors are of O(n) computational complexity. For given vectors u and v, we can then compute A_k u and A_k^⊤ v recursively, in terms of A_{k−1} and eventually A_0 = A. Assuming that n is large and the convergence of the SDA is fast, we obtain the important result that the SDA is of O(n) computational complexity and memory requirement.

Note that X has been observed to be numerically low-ranked. Under suitable assumptions, the convergence of the SDA [17] implies A_k = O(|λ|^{2^k}) → 0, for some |λ| < 1. We then see that B_{k+1} and C_{k+1} equal, respectively, the sums of B_k and C_k and the diminishing components A_k B_k and A_k^⊤ C_k. Thus the observation about the numerical rank of X is shown to be true. Obviously, as the SDA converges, increasingly smaller but fatter low-ranked components are added to B_k and C_k. Apparently, the growth in the sizes and ranks of these iterates is potentially exponential. To reduce the dimensions of B_k,

C_k, D_k^{(1)} and D_k^{(2)}, we may compress their columns by the QR decomposition with column pivoting.
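The recursive matrix-vector products A_k u described above can be sketched with matrix-free closures; all names and data are illustrative, and only one doubling level is shown:

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 50, 3

def make_next(apply_prev, D1, Sinv, D2):
    # A_k u = A_{k-1}(A_{k-1} u) - D1 S^{-1} (D2^T u): only matvecs with the
    # previous iterate and the thin factors are needed, never A_k itself.
    def apply_k(u):
        return apply_prev(apply_prev(u)) - D1 @ (Sinv @ (D2.T @ u))
    return apply_k

A = rng.standard_normal((n, n))
D1, D2 = rng.standard_normal((n, r)), rng.standard_normal((n, r))
Sinv = np.linalg.inv(rng.standard_normal((r, r)) + 5 * np.eye(r))

apply1 = make_next(lambda u: A @ u, D1, Sinv, D2)   # matvec with A_1
A1 = A @ A - D1 @ Sinv @ D2.T                        # explicit A_1, for checking
u = rng.standard_normal(n)
print(np.linalg.norm(apply1(u) - A1 @ u))            # agreement to rounding error
```

When A itself admits O(n) matvecs (sparse or banded), each such closure costs O(n) per level, which is the source of the O(n) complexity claimed above.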

4 Conclusions

We have told the story of the (structure-preserving) doubling algorithms: how they were "folk-lore", presumably popular until the seventies of the last century, then forgotten, and rediscovered and redeveloped in the past decade. Since the paper [17] by Chu, Fan, Lin and Wang in 2004, there have been approximately fifty published papers and reports by our extended group of authors based at Monash in Australia, and in Beijing and Hsinchu. Other authors are starting to realize the importance, power and usefulness of the doubling algorithms, and our papers are widely cited. In addition, others are starting their research in the area, and there have been a handful of papers from other authors since 2010, on the method's theoretical aspects as well as applications. It has been an interesting and rewarding journey, and it is continuing.

Acknowledgements

This research work is partially supported by the National Science Council and the National Centre for Theoretical Sciences. Part of the work was completed when the first author visited the National Centre for Theoretical Sciences at the Chiao Tung University, Hsinchu. This manuscript was written when the first author visited Southeast University, Nanjing, China. We would like to acknowledge the support from these institutions.

References

[1] h. allik and t. hughes. Finite element method for piezoelectric vibration, Int. J. Numer. Methods Eng., 2 (1970) 151–157.
[2] b.d.o. anderson and j.b. moore. Optimal Filtering, Prentice-Hall, Englewood Cliffs, NJ, 1979.
[3] m.b. angel, m.i. rocha-gaso, m.i. carmen, and a.v. antonio. Surface generated acoustic wave biosensors for detection of pathogens: a review, Sensors, 9 (2009) 5740–5769.
[4] i. appelbaum, t. wang, j.d. joannopoulos, and v. narayanamurti. Ballistic hot-electron transport in nanoscale semiconductor heterostructures: Exact self-energy of a three-dimensional periodic tight-binding Hamiltonian, Physical Review B, 69 (2004), Article Number 165301, 6 pp.
[5] r. bellman and k.l. cooke. Differential Difference Equations, Academic Press, New York, 1963.
[6] p. benner. Editorial of special issue on "Large-Scale Matrix Equations of Special Type", Numer. Lin. Alg. Appl., 15 (2008) 747–754.
[7] p. benner and j. saak. A Galerkin-Newton-ADI method for solving large-scale algebraic Riccati equations, DFG Priority Programme 1253 "Optimization with Partial Differential Equations", Preprint SPP1253-090, January 2010.
[8] d.s. bernstein and w.m. haddad. LQG control with an H∞ performance bound: a Riccati equation approach, IEEE Trans. Automat. Control, 34 (1989) 293–305.
[9] d.a. bini, b. iannazzo, and f. poloni. A fast Newton's method for a nonsymmetric algebraic Riccati equation, SIAM J. Matrix Anal. Appl., 30 (2008) 276–290.
[10] e.k. boukas and z.k. liu. Deterministic and Stochastic Time Delay Systems, Birkhäuser, Boston, 2002.
[11] m. buchner, w. ruile, a. dietz, and r. dill. FEM analysis of the reflection coefficient of SAWs in an infinite periodic array, in Proc. IEEE Ultrason. Symp., 371–375, 1991.
[12] c.k. campbell. Surface Acoustic Wave Devices for Mobile and Wireless Communications, Academic Press, New York, 1998.
[13] c.-y. chiang, e.k.-w. chu, c.-h. guo, t.-m. huang, w.-w. lin, and s.-f. xu. Convergence analysis of the doubling algorithm for several nonlinear matrix equations in the critical case, SIAM J. Matrix Anal. Applic., 31 (2009) 227–247.
[14] c.-y. chiang, e.k.-w. chu, and t.-m. huang. Unimodular eigenvalues for palindromic eigenvalue problems, Technical report, NCTS Preprints in Mathematics, Tsing Hua University, Hsinchu, 2008.
[15] e.k.-w. chu, h.-y. fan, and w.-w. lin. A structure-preserving doubling algorithm for continuous-time algebraic Riccati equations, Lin. Alg. Applic., 396 (2005) 55–80.
[16] e.k.-w. chu, t.-m. huang, and w.-w. lin. Structured doubling algorithms for solving g-palindromic quadratic eigenvalue problems, Technical report, NCTS Preprints in Mathematics, Tsing Hua University, Hsinchu, 2008.
[17] e.k.-w. chu, h.-y. fan, w.-w. lin, and c.-s. wang. Structure-preserving algorithms for periodic discrete-time algebraic Riccati equations, Int. J. Control, 77 (2004) 767–788.
[18] e.k.-w. chu, t.-m. hwang, w.-w. lin, and c.-t. wu. Vibration of fast trains, palindromic eigenvalue problems and structure-preserving doubling algorithms, J. Comput. Appl. Math., 219 (2008) 237–252.
[19] e.k.-w. chu, t.-m. huang, w.-w. lin, and c.-t. wu. Palindromic eigenvalue problems: a brief survey, Taiwanese J. Math., 14 (2010) 743–779.
[20] e.k.-w. chu, t. li, j. juang, and w.-w. lin. Solution of a nonsymmetric algebraic Riccati equation from a one-dimensional multi-state transport model, IMA J. Numer. Anal., 31 (2011) 1453–1467.
[21] e.k.-w. chu, c.-y. weng, y.-c. kuo, and w.-w. lin. Solving large-scale nonlinear matrix equations by doubling, Technical Report, NCTS, Chiao Tung University, Taiwan, 2011.
[22] e.k.-w. chu, c.-y. weng, t. li, and w.-w. lin. Solving large-scale nonsymmetric algebraic Riccati equations by doubling, Technical Report, NCTS, Chiao Tung University, Hsinchu, 2011.
[23] b. datta. Numerical Methods for Linear Control Systems, Elsevier Academic Press, Boston, 2004.
[24] s. datta. Electronic Transport in Mesoscopic Systems, Cambridge University Press, Cambridge, 1995.
[25] s. datta. Nanoscale device modeling: the Green's function method, Superlattices and Microstructures, 28 (2000) 253–278.
[26] b. de moor and j. david. Total linear least squares and the algebraic Riccati equations, Systems Control Lett., 18 (1992) 329–337.
[27] o. diekmann, s.a. van gils, s.m. verduyn lunel, and h.-o. walther. Delay Equations: Functional-, Complex-, and Nonlinear Analysis, Applied Mathematical Sciences 110, Springer-Verlag, New York, 1995.
[28] h. fassbender, n. mackey, d.s. mackey, and c. schröder. A structured polynomial eigenproblem arising in the analysis of time delay systems and related polynomial eigenproblems, Technical report, TU Braunschweig, 2007.
[29] f.r. gantmacher. The Theory of Matrices, Vol. II, Chelsea Publishing Company, New York, 1977.
[30] i. gohberg and m.a. kaashoek. An inverse spectral problem for rational matrix functions and minimal divisibility, Integral Equations and Operator Theory, 10 (1987) 437–465.
[31] i. gohberg, p. lancaster, and l. rodman. Matrix Polynomials, Academic Press, New York, 1982.
[32] c.-h. guo. A new class of nonsymmetric algebraic Riccati equations, Lin. Alg. Applic., 426 (2007) 636–649.
[33] c.-h. guo, y.-c. kuo, and w.-w. lin. On a nonlinear matrix equation arising in nano research, SIAM J. Matrix Anal. Applic., 33 (2012) 235–262.
[34] c.-h. guo and w.-w. lin. Solving a structured quadratic eigenvalue problem by a structure-preserving doubling algorithm, SIAM J. Matrix Anal. Applic., 31 (2010) 2784–2801.
[35] c.-h. guo and w.-w. lin. The matrix equation X + A^⊤X^{−1}A = Q and its application in nano research, SIAM J. Sci. Comput., 32 (2010) 3020–3038.
[36] x.-x. guo, w.-w. lin, and s.-f. xu. A structure-preserving doubling algorithm for nonsymmetric algebraic Riccati equations, Numer. Math., 103 (2006) 393–412.
[37] k. gu, v. kharitonov, and j. chen. Stability of Time-Delay Systems, Control Engineering, Birkhäuser, Boston, 2003.
[38] j. hale and s.m. verduyn lunel. Introduction to Functional Differential Equations, Springer-Verlag, New York, 1993.
[39] d. hinrichsen, b. kelb, and a. linnemann. An algorithm for the computation of the structured complex stability radius, Automatica, 25 (1989) 771–775.
[40] r.a. horn and c.r. johnson. Topics in Matrix Analysis, Cambridge University Press, Cambridge, 1995.
[41] t.-m. huang, w.-w. lin, and j. qian. Structure-preserving algorithms for palindromic quadratic eigenvalue problems arising from vibration of fast trains, SIAM J. Matrix Anal. Applic., 30 (2009) 1566–1592.
[42] t.-m. huang, w.-w. lin, and c.-t. wu. Structure-preserving Arnoldi-type algorithms for solving palindromic quadratic eigenvalue problems in leaky surface wave propagation, NCTS Preprints in Mathematics, No. 2011-2001, Tsing Hua University, 2011.
[43] t.-m. huang, c.-t. wu, and t. li. Numerical study of structure-preserving algorithms for surface acoustic wave simulation, Technical Report, NCTS, Chiao Tung University, Hsinchu, 2011.
[44] t.-m. hwang, e.k.-w. chu, and w.-w. lin. A generalized structure-preserving doubling algorithm for generalized discrete-time algebraic Riccati equations, Int. J. Control, 78 (2005) 1063–1075.
[45] t.-m. hwang and w.-w. lin. Structured doubling algorithms for weak Hermitian solutions of algebraic Riccati equations, Technical report, NCTS Preprints in Mathematics, Tsing Hua University, Hsinchu, 2006-7-009, 2006.
[46] i.c.f. ipsen. Accurate eigenvalues for fast trains, SIAM News, 37, 2004.
[47] e. jarlebring. The Spectrum of Delay-Differential Equations: Computational Analysis, PhD thesis, Institute of Computational Mathematics, TU Braunschweig, Germany, 2008.
[48] k. jbilou. Block Krylov subspace methods for large continuous-time algebraic Riccati equations, Numer. Algorithms, 34 (2003) 339–353.
[49] k. jbilou. An Arnoldi based algorithm for large algebraic Riccati equations, Appl. Math. Lett., 19 (2006) 437–444.
[50] k. jbilou and a. riquet. Projection methods for large Lyapunov matrix equations, Lin. Alg. Applic., 415 (2006) 344–358.
[51] d.l. john and d.l. pulfrey. Green's function calculations for semi-infinite carbon nanotubes, Physica Status Solidi B – Basic Solid State Physics, 243 (2006) 442–448.
[52] j. juang. Existence of algebraic matrix Riccati equations arising in transport theory, Lin. Alg. Applic., 230 (1995) 89–100.
[53] a. kletsov, y. dahnovsky, and j.v. ortiz. Surface Green's function calculations: A nonrecursive scheme with an infinite number of principal layers, J. Chemical Physics, 126 (2007), Article Number 134105.
[54] m. koshiba, s. mitobe, and m. suzuki. Finite-element solution of periodic waveguides for acoustic waves, IEEE Trans. Ultrason. Ferroelectr. Freq. Control, 34 (1987) 472–477.
[55] p. lancaster and l. rodman. Algebraic Riccati Equations, Clarendon Press, Oxford, 1995.
[56] r. lerch. Simulation of piezoelectric devices by two- and three-dimensional finite elements, IEEE Trans. Ultrason. Ferroelectr. Freq. Control, 37 (1990) 233–247.
[57] t. li, c.-y. chiang, e.k.-w. chu, and w.-w. lin. The palindromic generalized eigenvalue problem A^*x = λAx: numerical solution and applications, Lin. Alg. Applic., 434 (2011) 2269–2284.
[58] t. li, e.k.-w. chu, j. juang, and w.-w. lin. Solution of a nonsymmetric algebraic Riccati equation from a two-dimensional transport model, Lin. Alg. Applic., 434 (2011) 201–214.
[59] t. li, e.k.-w. chu, and w.-w. lin. A structure-preserving doubling algorithm for quadratic eigenvalue problems arising from time-delay systems, J. Comp. Appl. Math., 223 (2010) 1799–1745.
[60] t. li, e.k.-w. chu, and w.-w. lin. Solving large-scale discrete-time algebraic Riccati equations by doubling, Technical Report, NCTS, Chiao Tung University, Hsinchu, 2011.
[61] t. li, e.k.-w. chu, w.-w. lin, and c.-y. weng. Solving large-scale continuous-time algebraic Riccati equations by doubling, Technical Report, NCTS, Chiao Tung University, Hsinchu, 2011.
[62] t. li, c.-y. weng, e.k.-w. chu, and w.-w. lin. Solving large-scale Stein and Lyapunov equations by doubling, Technical Report, NCTS, Chiao Tung University, Hsinchu, 2011.
[63] w.-w. lin. A new method for computing the closed-loop eigenvalues of a discrete-time algebraic Riccati equation, Lin. Alg. Applic., 96 (1987) 157–180.
[64] w.-w. lin and s.-f. xu. Convergence analysis of structure-preserving doubling algorithms for Riccati-type matrix equations, SIAM J. Matrix Anal. Applic., 28 (2006) 26–39.
[65] w.-w. lin, v. mehrmann, and h. xu. Canonical forms for Hamiltonian and symplectic matrices and pencils, Lin. Alg. Appl., 302/303 (1999) 469–533.
[66] d.s. mackey, n. mackey, c. mehl, and v. mehrmann. Vector spaces of linearizations for matrix polynomials, SIAM J. Matrix Anal. Applic., 28 (2006) 971–1004.
[67] d.s. mackey, n. mackey, c. mehl, and v. mehrmann. Structured polynomial eigenvalue problems: Good vibrations from good linearizations, SIAM J. Matrix Anal. Applic., 28 (2006) 1029–1051.
[68] d.s. mackey, n. mackey, c. mehl, and v. mehrmann. Numerical methods for palindromic eigenvalue problems, Technical Report, TU Berlin, MATHEON, Germany, 2007.
[69] v.l. markine, a.p. de man, s. jovanovic, and c. esveld. Optimal design of embedded rail structure for high-speed railway lines, in Railway Engineering 2000, 3rd Int. Conf., London, 2000.
[70] v.l. mehrmann. The Autonomous Linear Quadratic Control Problem, Lecture Notes in Control and Information Sciences, Vol. 163, Springer-Verlag, Berlin, 1991.
[71] w. michiels and s.-i. niculescu. Stability and Stabilization of Time-Delay Systems: An Eigenvalue-Based Approach, Advances in Design and Control 12, SIAM Publications, Philadelphia, 2007.
[72] w. michiels and t. vyhlídal. An eigenvalue based approach for the stabilization of linear time-delay systems of neutral type, Automatica, 41 (2005) 991–998.
[73] m. mohamed, m.m. el gowini, and w.a. moussa. A finite element model of a MEMS-based surface acoustic wave hydrogen sensor, Sensors, 10 (2010) 1232–1250.
[74] r.v. patel. On computing the eigenvalues of a symplectic pencil, Lin. Alg. Applic., 188 (1993) 591–611.
[75] j. saak, h. mena, and p. benner. Matrix Equation Sparse Solvers (MESS): a Matlab Toolbox for the Solution of Sparse Large-Scale Matrix Equations, Chemnitz University of Technology, Germany, 2010.
[76] c. schröder. A QR-like algorithm for the palindromic eigenvalue problem, Technical Report, Preprint 388, TU Berlin, MATHEON, Germany, 2007.
[77] c. schröder. URV decomposition based structured methods for palindromic and even eigenvalue problems, Technical Report, Preprint 375, TU Berlin, MATHEON, Germany, 2007.
[78] g.w. stewart and j.g. sun. Matrix Perturbation Theory, Academic Press, 1990.
[79] j.w. strutt (Lord Rayleigh). Theory of Sound, Dover, New York, 1945.
[80] f. tisseur and k. meerbergen. A survey of the quadratic eigenvalue problem, SIAM Rev., 43 (2001) 234–286.
[81] j. tomfohr and o.f. sankey. Theoretical analysis of electron transport through organic molecules, J. Chemical Physics, 120 (2004) 1542–1554.
[82] p. van dooren. The computation of Kronecker's canonical form of a singular pencil, Lin. Alg. Applic., 27 (1979) 103–141.
[83] d. williams. A potential-theoretical note on the quadratic Wiener–Hopf equation for Q-matrices, in Seminar on Probability XVI, Lecture Notes in Mathematics 920, pp. 91–94, Springer-Verlag, Berlin, 1982.
[84] y.s. wu, y.b. yang, and j.d. yau. Three-dimensional analysis of train-rail-bridge interaction problems, Vehicle Syst. Dyn., 36 (2001) 1–35.
[85] h. xu. Transformations between discrete-time and continuous-time algebraic Riccati equations, Lin. Alg. Applic., 425 (2007) 77–101.
[86] s. zaglmayr. Eigenvalue Problems in SAW-Filter Simulations, Diplomarbeit, Institute of Computational Mathematics, Johannes Kepler University Linz, Linz, Austria, 2002.
[87] q.c. zhong. Robust Control of Time-Delay Systems, Springer-Verlag, New York, 2006.