Computing the Nearest Doubly Stochastic Matrix with a Prescribed Entry*

Zheng-Jian Bai†    Delin Chu‡    Roger C. E. Tan§

June 2006
Abstract. In this paper a nearest doubly stochastic matrix problem is studied: given a matrix, find the closest doubly stochastic matrix whose (1,1) entry equals that of the given matrix. By the well-established duality theory in optimization, the dual of the underlying problem is an unconstrained convex optimization problem that is differentiable but not twice differentiable. A Newton-type method is used to solve this dual problem, and the desired nearest doubly stochastic matrix is then obtained. Under some mild assumptions, the quadratic convergence of the proposed Newton method is proved. The numerical performance of the method is also demonstrated by numerical examples.
Keywords. Doubly stochastic matrix, generalized Jacobian, Newton’s method, quadratic convergence.
1  Introduction
A matrix A ∈ R^{n×n} is called doubly stochastic if it is non-negative and all its row and column sums equal one. Doubly stochastic matrices have found many important applications in probability and statistics, quantum mechanics, the study of hypergroups, economics and operations research, physical chemistry, communication theory and graph theory; see [3, 5, 14, 15, 22] and the references therein. In this paper, we are interested in the following best approximation problem related to doubly stochastic matrices: given a matrix T ∈ R^{n×n}, find its nearest doubly stochastic matrix with the same (1,1) entry as T. This problem can be stated mathematically as follows:

    min  (1/2) ||M − T||_F^2
    s.t. M e = e,  e^T M = e^T,
         e_1^T M e_1 = e_1^T T e_1,                                        (1)
         M ≥ 0,

where

    e = (1, . . . , 1)^T ∈ R^n,   e_1 = (1, 0, . . . , 0)^T ∈ R^n,
* This work was partially supported by Research Grant R-146-000-047-112 of National University of Singapore and Research Grant 0000-X07152 of Xiamen University.
† Department of Information and Computational Mathematics, Xiamen University, Xiamen 361005, P.R. China. Email: [email protected]
‡ Department of Mathematics, National University of Singapore, 2 Science Drive 2, Singapore 117543. Email: [email protected]
§ Department of Mathematics, National University of Singapore, 2 Science Drive 2, Singapore 117543. Email: [email protected]
and M ≥ 0 means that M is non-negative. Problem (1) was originally suggested by Professor Zhaojun Bai (Department of Computer Science, UC Davis). It arose from the numerical simulation of large (semi-conductor, electronic) circuit networks. The Padé approximation technique using the Lanczos process is very powerful for computing a lower-order approximation to the linear system matrix describing a large linear network [1, 2]. The matrix T produced by the Lanczos process is in general not doubly stochastic. If the original system matrix is doubly stochastic, then we need to find the nearest doubly stochastic matrix M to T and at the same time match the moments. Problem (1) has been studied in [8] based on the alternating projection method [4]. In [8, 11], Problem (1) is simplified by removing the requirements on the (1,1) entry and the non-negativity of the matrix M; in this case, the solution can be obtained explicitly. We will revisit Problem (1). Based on the dual approach in optimization [13], we will first reformulate (1) as an unconstrained convex optimization problem that is differentiable but not twice differentiable, next apply Newton's method to solve this convex problem, and then obtain the desired nearest doubly stochastic matrix. Under some mild assumptions, we will show that the proposed Newton method is quadratically convergent. We will also demonstrate the numerical performance of the method by numerical examples. Throughout this paper, the following notation will be used:
• T = (t_{i,j}) ∈ R^{n×n} denotes the given matrix.

• A ≥ 0 (A > 0) means that A is non-negative (positive).

• K = {A : A ∈ R^{n×n}, A ≥ 0},  (z)_+ = max{0, z}.

• Π_K(X) denotes the metric projection of X onto K, i.e., Π_K(X) = ((x_{i,j})_+) for all X = (x_{i,j}) ∈ R^{n×n}.

2  Newton's Method
In this section we consider a Newton-type method for computing the solution of Problem (1). Let

    f(M) := (1/2) ||M − T||_F^2,
    A(M) := [ M e ; [I_{n−1} 0] M^T e ; e_1^T M e_1 ] ∈ R^{2n},
    b := [ e ; [I_{n−1} 0] e ; e_1^T T e_1 ] ∈ R^{2n}.

Then Problem (1) is equivalent to

    min f(M)   s.t.  A(M) = b,  M ∈ K.                                     (2)

The dual problem [13] of (2) is

    sup −θ(x)   s.t.  x ∈ R^{2n},                                          (3)

where

    θ(x) = (1/2) ||Π_K(T + A^*(x))||_F^2 − x^T b − (1/2) ||T||_F^2,

and A^* is the adjoint of A, given entrywise by

    (A^*(x))_{1,1} = x_1 + x_{n+1} + x_{2n},
    (A^*(x))_{i,j} = x_i + x_{n+j},   (i,j) ≠ (1,1),  1 ≤ j ≤ n−1,
    (A^*(x))_{i,n} = x_i,             1 ≤ i ≤ n,

for all x = (x_1, . . . , x_{2n})^T ∈ R^{2n}.
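To make these maps concrete, the following small sketch (our own illustration, not part of the paper; the names proj_K, A_map and A_star are ours) implements Π_K, A and A^* for n = 4 and verifies the defining adjoint identity ⟨A(M), x⟩ = ⟨M, A^*(x)⟩ on random data.

```python
# Illustrative sketch of Pi_K, the constraint map A, and its adjoint A*.
import random

n = 4

def proj_K(X):
    # Metric projection onto the non-negative cone K: clip entrywise.
    return [[max(0.0, X[i][j]) for j in range(n)] for i in range(n)]

def A_map(M):
    # A(M) = [row sums; first n-1 column sums; (1,1) entry] in R^{2n}.
    rows = [sum(M[i]) for i in range(n)]
    cols = [sum(M[i][j] for i in range(n)) for j in range(n - 1)]
    return rows + cols + [M[0][0]]

def A_star(x):
    # Adjoint: (A*(x))_{ij} = x_i + x_{n+j} for j <= n-1, x_i in the last
    # column, with an extra x_{2n} added to the (1,1) entry.
    Y = [[x[i] + (x[n + j] if j < n - 1 else 0.0) for j in range(n)]
         for i in range(n)]
    Y[0][0] += x[2 * n - 1]
    return Y

random.seed(0)
M = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
x = [random.uniform(-1, 1) for _ in range(2 * n)]
Y = A_star(x)
lhs = sum(a * b for a, b in zip(A_map(M), x))
rhs = sum(M[i][j] * Y[i][j] for i in range(n) for j in range(n))
print(abs(lhs - rhs))  # agrees up to rounding
```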
The relation between the value of (2) at its minimum and that of the dual (3) at its maximum is stated in the following theorem.

Theorem 1 There exists a matrix M ∈ R^{n×n} in the topological interior of K such that A(M) = b if and only if

    0 < e_1^T T e_1 < 1.                                                   (4)

Under condition (4),
(i) Problem (2) has a unique solution, denoted by M^*;
(ii) the supremum of the dual problem (3) is actually a maximum. If this maximum is achieved at x^*, then

    M^* = Π_K(T + A^*(x^*)).                                               (5)

Proof. If M is in the topological interior of K and A(M) = b, then (4) follows directly from the properties that e_1^T T e_1 = e_1^T M e_1 > 0, e_1^T M e = 1, and all entries of e_1^T M are positive. Conversely, if (4) holds, then it is clear that the matrix

                 [ r_0  r   · · ·  r  ]
        1        [ r    r_0 · · ·  r  ]
    M := ——————  [ .    .   .      .  ]   with  r_0 = (n−1) e_1^T T e_1 > 0,  r = 1 − e_1^T T e_1 > 0,    (6)
        n − 1    [ r    r   · · ·  r_0]

satisfies that M is in the topological interior of K and A(M) = b. Hence Theorem 1 follows. Under condition (4), parts (i) and (ii) are now well known; see [10, 13].

Remark 1 In [10], the condition that there exists a matrix M ∈ R^{n×n} in the topological interior of K such that A(M) = b is called the Slater condition for (2). Hence, we can regard (4) as the Slater condition for (2).
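The interior feasible point (6) from the proof of Theorem 1 is easy to check numerically. The sketch below is our own illustration, not from the paper; it builds one admissible arrangement (diagonal entries r_0/(n−1), off-diagonal entries r/(n−1); any arrangement with a single r_0 per row and column works) and verifies the prescribed (1,1) entry, the row and column sums, and strict positivity.

```python
# Check the interior feasible point (6) for arbitrary n and t11.
n, t11 = 5, 0.3                         # any 0 < t11 < 1, per condition (4)
r0, r = (n - 1) * t11, 1.0 - t11
M = [[(r0 if i == j else r) / (n - 1) for j in range(n)] for i in range(n)]

assert abs(M[0][0] - t11) < 1e-14               # prescribed (1,1) entry
for i in range(n):
    assert abs(sum(M[i]) - 1.0) < 1e-12         # row sums equal one
    assert abs(sum(M[k][i] for k in range(n)) - 1.0) < 1e-12  # column sums
assert min(min(row) for row in M) > 0           # topological interior of K
print("feasible interior point verified")
```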
According to Theorem 1, once we can compute an optimal solution x^* of the dual problem (3), we can obtain the optimal solution M^* of Problem (2) by using (5). Define F(x) := A(Π_K(T + A^*(x))) − b, i.e., for any x = (x_1, . . . , x_{2n})^T ∈ R^{2n},

    F_1(x)    = (t_{1,1} + x_1 + x_{n+1} + x_{2n})_+ + Σ_{i=2}^{n−1} (t_{1,i} + x_1 + x_{n+i})_+ + (t_{1,n} + x_1)_+ − 1,
    F_j(x)    = Σ_{i=1}^{n−1} (t_{j,i} + x_j + x_{n+i})_+ + (t_{j,n} + x_j)_+ − 1,   j = 2, . . . , n,
    F_{n+1}(x) = (t_{1,1} + x_1 + x_{n+1} + x_{2n})_+ + Σ_{j=2}^{n} (t_{j,1} + x_j + x_{n+1})_+ − 1,     (7)
    F_{n+j}(x) = Σ_{i=1}^{n} (t_{i,j} + x_i + x_{n+j})_+ − 1,   j = 2, . . . , n−1,
    F_{2n}(x) = (t_{1,1} + x_1 + x_{n+1} + x_{2n})_+ − t_{1,1}.

It is easy to see that the function θ(x) is continuously differentiable and its gradient ∇θ(x) = F(x) is globally Lipschitz continuous. So both gradient-type methods and quasi-Newton methods can be directly employed to solve (3). However, since θ(x) is not twice continuously differentiable, the convergence rates of these methods are at most linear. Since θ(x) is convex and differentiable, at a solution x^* of (3),

    ∇θ(x^*) = 0,   i.e.,   F(x^*) = 0.
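As an informal check (ours, not the paper's code) that F is indeed the gradient of θ, the sketch below evaluates θ and F on a small random instance, choosing T and x so that no kinked term (t_{i,j} + · · ·) is near zero, and compares F with central finite differences of θ.

```python
# Verify grad theta == F by central differences at a generic point.
import random

n = 3

def t_plus_astar(x, T):
    # The matrix T + A*(x), entrywise.
    Y = [[T[i][j] + x[i] + (x[n + j] if j < n - 1 else 0.0)
          for j in range(n)] for i in range(n)]
    Y[0][0] += x[2 * n - 1]
    return Y

def b_vec(T):
    return [1.0] * (2 * n - 1) + [T[0][0]]

def theta(x, T):
    P = [[max(0.0, v) for v in row] for row in t_plus_astar(x, T)]
    return 0.5 * sum(v * v for row in P for v in row) \
        - sum(u * v for u, v in zip(x, b_vec(T))) \
        - 0.5 * sum(v * v for row in T for v in row)

def F(x, T):
    P = [[max(0.0, v) for v in row] for row in t_plus_astar(x, T)]
    rows = [sum(P[i]) for i in range(n)]
    cols = [sum(P[i][j] for i in range(n)) for j in range(n - 1)]
    return [u - v for u, v in zip(rows + cols + [P[0][0]], b_vec(T))]

random.seed(1)
T = [[random.uniform(0.1, 1.0) for _ in range(n)] for _ in range(n)]
x = [random.uniform(-0.01, 0.01) for _ in range(2 * n)]  # stays off the kinks
eps = 1e-6
for k in range(2 * n):
    xp, xm = list(x), list(x)
    xp[k] += eps
    xm[k] -= eps
    fd = (theta(xp, T) - theta(xm, T)) / (2 * eps)
    assert abs(fd - F(x, T)[k]) < 1e-6   # gradient matches F componentwise
print("grad theta == F verified")
```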
This indicates that we can obtain a solution of (3) by solving the equation F(x) = 0. Since F(x) is globally Lipschitz continuous, Rademacher's theorem [20, Chapter 9.J] implies that F(x) is Fréchet differentiable almost everywhere. Let Ω_F be the set of points at which F is Fréchet differentiable, and denote the Jacobian of F(x) at x ∈ Ω_F by F'(x). The generalized Jacobian ∂F(x) of F at x ∈ R^{2n} in the sense of Clarke is defined by

    ∂F(x) := conv{∂_B F(x)},

where "conv" denotes the convex hull and

    ∂_B F(x) := { V ∈ R^{2n×2n} : V is an accumulation point of F'(x^{(k)}), x^{(k)} → x, x^{(k)} ∈ Ω_F }.

The nonsmooth Newton method for solving the equation

    F(x) = 0                                                               (8)

is given by

    x^{(k+1)} = x^{(k)} − V_k^{−1} F(x^{(k)}),   V_k ∈ ∂F(x^{(k)}).        (9)
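The iteration (9) can be sketched as follows. This is our own toy implementation for n = 3 with made-up data T, not the paper's experiments: V_k is the element of ∂_B F(x) obtained by setting each kink entry to 1, and the linear systems are solved by plain Gauss-Jordan elimination. Since this T is a mildly perturbed doubly stochastic matrix with positive entries, F is affine near x = 0 and the iteration settles essentially in one step.

```python
# A runnable sketch of the semismooth Newton iteration (9).
n = 3
T = [[0.2, 0.5, 0.3],
     [0.4, 0.2, 0.4],
     [0.3, 0.3, 0.4]]   # rows sum to 1; column sums are 0.9, 1.0, 1.1

def pre(x):   # T + A*(x), entrywise
    Y = [[T[i][j] + x[i] + (x[n + j] if j < n - 1 else 0.0)
          for j in range(n)] for i in range(n)]
    Y[0][0] += x[2 * n - 1]
    return Y

def F(x):
    P = [[max(0.0, v) for v in row] for row in pre(x)]
    return [sum(P[i]) - 1.0 for i in range(n)] \
        + [sum(P[i][j] for i in range(n)) - 1.0 for j in range(n - 1)] \
        + [P[0][0] - T[0][0]]

def jac(x):   # one element of del_B F(x): kink entries set to 1
    P = pre(x)
    a = [[1.0 if P[i][j] >= 0 else 0.0 for j in range(n)] for i in range(n)]
    V = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        V[i][i] = sum(a[i])
        for j in range(n - 1):
            V[i][n + j] = V[n + j][i] = a[i][j]
    for j in range(n - 1):
        V[n + j][n + j] = sum(a[i][j] for i in range(n))
    for k in (0, n):
        V[k][2 * n - 1] = V[2 * n - 1][k] = a[0][0]
    V[2 * n - 1][2 * n - 1] = a[0][0]
    return V

def solve(A, rhs):   # Gauss-Jordan elimination with partial pivoting
    m = len(rhs)
    W = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(W[r][c]))
        W[c], W[p] = W[p], W[c]
        for r in range(m):
            if r != c:
                f = W[r][c] / W[c][c]
                W[r] = [u - f * v for u, v in zip(W[r], W[c])]
    return [W[r][m] / W[r][r] for r in range(m)]

x = [0.0] * (2 * n)
for _ in range(5):                         # iteration (9)
    x = [u - d for u, d in zip(x, solve(jac(x), F(x)))]
res = max(abs(v) for v in F(x))
M_star = [[max(0.0, v) for v in row] for row in pre(x)]
print(res, M_star[0][0])   # residual should be ~0; (1,1) entry stays 0.2
```

Recovering M^* = Π_K(T + A^*(x^*)) at the end is exactly formula (5).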
The following result has been established in [17].

Theorem 2 [17] Let x^* be a solution of the equation F(x) = 0. If all V ∈ ∂F(x^*) are nonsingular and F is semismooth at x^*, i.e., F is directionally differentiable at x^* and

    F(x^* + δx) − F(x^*) − V(δx) = o(||δx||),   ∀ V ∈ ∂F(x^* + δx), δx → 0,

then every sequence generated by (9) is superlinearly convergent to x^* provided that the starting point x^{(0)} is sufficiently close to x^*. Moreover, if F is strongly semismooth at x^*, i.e., F is semismooth at x^* and

    F(x^* + δx) − F(x^*) − V(δx) = O(||δx||^2),   ∀ V ∈ ∂F(x^* + δx), δx → 0,

then the convergence rate is quadratic.
Motivated by Theorem 2, in the following we discuss the strong semismoothness of F and the nonsingularity of all V ∈ ∂F(x^*) at a solution x^* of F(x) = 0.

A point x = (x_1, . . . , x_{2n})^T belongs to Ω_F, i.e., F'(x) exists, if and only if

    t_{1,1} + x_1 + x_{n+1} + x_{2n} ≠ 0,
    t_{1,j} + x_1 + x_{n+j} ≠ 0,   j = 2, . . . , n−1,
    t_{i,j} + x_i + x_{n+j} ≠ 0,   i = 2, . . . , n,  j = 1, . . . , n−1,
    t_{i,n} + x_i ≠ 0,             i = 1, . . . , n.

In the case that these inequalities hold, define

    a_{1,1} := 1 if t_{1,1} + x_1 + x_{n+1} + x_{2n} > 0,  a_{1,1} := 0 if t_{1,1} + x_1 + x_{n+1} + x_{2n} < 0,
    a_{1,j} := 1 if t_{1,j} + x_1 + x_{n+j} > 0,  a_{1,j} := 0 if t_{1,j} + x_1 + x_{n+j} < 0,   j = 2, . . . , n−1,
    a_{i,j} := 1 if t_{i,j} + x_i + x_{n+j} > 0,  a_{i,j} := 0 if t_{i,j} + x_i + x_{n+j} < 0,   i = 2, . . . , n,  j = 1, . . . , n−1,
    a_{i,n} := 1 if t_{i,n} + x_i > 0,  a_{i,n} := 0 if t_{i,n} + x_i < 0,   i = 1, . . . , n,

so that a_{i,j} is the common value of the partial derivatives of the corresponding (·)_+ term with respect to each of its variables. Then, in block form,

              [ D_1              Â                a_{1,1} e_1 ]
    F'(x) =   [ Â^T              D_2              a_{1,1} ê_1 ],
              [ a_{1,1} e_1^T    a_{1,1} ê_1^T    a_{1,1}     ]

where A = (a_{i,j}) ∈ R^{n×n}, Â ∈ R^{n×(n−1)} consists of the first n−1 columns of A, D_1 := diag(Σ_{j=1}^n a_{1,j}, . . . , Σ_{j=1}^n a_{n,j}), D_2 := diag(Σ_{i=1}^n a_{i,1}, . . . , Σ_{i=1}^n a_{i,n−1}), and e_1 ∈ R^n and ê_1 ∈ R^{n−1} are the first columns of I_n and I_{n−1}, respectively;
thus, for any x = (x_1, . . . , x_{2n})^T ∈ R^{2n}, V ∈ ∂_B F(x) if and only if

         [ D_1              B̂                b_{1,1} e_1 ]
    V =  [ B̂^T              D_2              b_{1,1} ê_1 ],                (10)
         [ b_{1,1} e_1^T    b_{1,1} ê_1^T    b_{1,1}     ]

where B = (b_{i,j}) ∈ R^{n×n}, B̂ ∈ R^{n×(n−1)} consists of the first n−1 columns of B, D_1 := diag(Σ_{j=1}^n b_{1,j}, . . . , Σ_{j=1}^n b_{n,j}), D_2 := diag(Σ_{i=1}^n b_{i,1}, . . . , Σ_{i=1}^n b_{i,n−1}), e_1 ∈ R^n and ê_1 ∈ R^{n−1} are the first columns of I_n and I_{n−1}, and

    b_{1,1} = 1 if t_{1,1} + x_1 + x_{n+1} + x_{2n} > 0,  b_{1,1} ∈ {0, 1} if = 0,  b_{1,1} = 0 if < 0,
    b_{1,j} = 1 if t_{1,j} + x_1 + x_{n+j} > 0,  b_{1,j} ∈ {0, 1} if = 0,  b_{1,j} = 0 if < 0,   j = 2, . . . , n−1,
    b_{i,j} = 1 if t_{i,j} + x_i + x_{n+j} > 0,  b_{i,j} ∈ {0, 1} if = 0,  b_{i,j} = 0 if < 0,   i = 2, . . . , n,  j = 1, . . . , n−1,     (11)
    b_{i,n} = 1 if t_{i,n} + x_i > 0,  b_{i,n} ∈ {0, 1} if = 0,  b_{i,n} = 0 if < 0,   i = 1, . . . , n.

As a result, we obtain
Theorem 3 V ∈ ∂F(x) if and only if V is of the form

         [ D_1              V̂                v_{1,1} e_1 ]
    V =  [ V̂^T              D_2              v_{1,1} ê_1 ],                (12)
         [ v_{1,1} e_1^T    v_{1,1} ê_1^T    v_{1,1}     ]

with entries v_{i,j} in place of the b_{i,j} of (10), where

    v_{1,1} = 1 if t_{1,1} + x_1 + x_{n+1} + x_{2n} > 0,  v_{1,1} ∈ [0, 1] if = 0,  v_{1,1} = 0 if < 0,
    v_{1,j} = 1 if t_{1,j} + x_1 + x_{n+j} > 0,  v_{1,j} ∈ [0, 1] if = 0,  v_{1,j} = 0 if < 0,   j = 2, . . . , n−1,
    v_{i,j} = 1 if t_{i,j} + x_i + x_{n+j} > 0,  v_{i,j} ∈ [0, 1] if = 0,  v_{i,j} = 0 if < 0,   i = 2, . . . , n,  j = 1, . . . , n−1,     (13)
    v_{i,n} = 1 if t_{i,n} + x_i > 0,  v_{i,n} ∈ [0, 1] if = 0,  v_{i,n} = 0 if < 0,   i = 1, . . . , n.
We are now ready to present our results on the strong semismoothness of F and the nonsingularity of all V ∈ ∂F(x).

Theorem 4 At any point x ∈ R^{2n}, F(x) is directionally differentiable and

    F(x + δx) − F(x) − V δx = 0,   ∀ V ∈ ∂F(x + δx), δx → 0.               (14)

Hence, F is strongly semismooth at any x ∈ R^{2n}.

Proof. A simple calculation yields that

    lim_{t→0+} (F(x + th) − F(x)) / t

exists for any x, h ∈ R^{2n}, and so F(x) is directionally differentiable at any point x ∈ R^{2n}. In addition, it can be verified using (10) and (11) that

    F(x + δx) − F(x) − V δx = 0,   ∀ V ∈ ∂_B F(x + δx), δx → 0.

Since any V ∈ ∂F(x + δx) is just a convex combination of elements in ∂_B F(x + δx), (14) holds.
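Because each component of F is piecewise affine, the expansion in Theorem 4 is exact as soon as δx is small enough that no kink is crossed. The sketch below (our own illustration, not the paper's code) checks identity (14) at a generic point where every kinked term is strictly positive.

```python
# Numerical check of the exact expansion (14) for a small instance.
import random

n = 3
random.seed(2)
T = [[random.uniform(0.2, 1.0) for _ in range(n)] for _ in range(n)]

def pre(x):   # T + A*(x), entrywise
    Y = [[T[i][j] + x[i] + (x[n + j] if j < n - 1 else 0.0)
          for j in range(n)] for i in range(n)]
    Y[0][0] += x[2 * n - 1]
    return Y

def F(x):
    P = [[max(0.0, v) for v in row] for row in pre(x)]
    return [sum(P[i]) - 1.0 for i in range(n)] \
        + [sum(P[i][j] for i in range(n)) - 1.0 for j in range(n - 1)] \
        + [P[0][0] - T[0][0]]

def jac(x):   # V in del_B F(x), as in (10)-(11)
    P = pre(x)
    a = [[1.0 if P[i][j] > 0 else 0.0 for j in range(n)] for i in range(n)]
    V = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        V[i][i] = sum(a[i])
        for j in range(n - 1):
            V[i][n + j] = V[n + j][i] = a[i][j]
    for j in range(n - 1):
        V[n + j][n + j] = sum(a[i][j] for i in range(n))
    for k in (0, n):
        V[k][2 * n - 1] = V[2 * n - 1][k] = a[0][0]
    V[2 * n - 1][2 * n - 1] = a[0][0]
    return V

x = [random.uniform(-0.02, 0.02) for _ in range(2 * n)]
dx = [random.uniform(-1e-4, 1e-4) for _ in range(2 * n)]
xd = [u + v for u, v in zip(x, dx)]
V = jac(xd)
gap = [F(xd)[k] - F(x)[k] - sum(V[k][l] * dx[l] for l in range(2 * n))
       for k in range(2 * n)]
print(max(abs(g) for g in gap))   # vanishes up to rounding
```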
Theorem 5 For any x = (x_1, . . . , x_{2n})^T ∈ R^{2n}, let

    M := (m_{i,j}) = Π_K(T + A^*(x)),

i.e.,

    m_{1,1} = (t_{1,1} + x_1 + x_{n+1} + x_{2n})_+,
    m_{i,j} = (t_{i,j} + x_i + x_{n+j})_+,   (i,j) ≠ (1,1),  1 ≤ j ≤ n−1,
    m_{i,n} = (t_{i,n} + x_i)_+,             1 ≤ i ≤ n,

and let

           [ D_1              M̂                m_{1,1} e_1 ]
    N_M := [ M̂^T              D_2              m_{1,1} ê_1 ],
           [ m_{1,1} e_1^T    m_{1,1} ê_1^T    m_{1,1}     ]

where M̂ ∈ R^{n×(n−1)} consists of the first n−1 columns of M, D_1 := diag(Σ_{j=1}^n m_{1,j}, . . . , Σ_{j=1}^n m_{n,j}) and D_2 := diag(Σ_{i=1}^n m_{i,1}, . . . , Σ_{i=1}^n m_{i,n−1}). Then
(i) N_M is symmetric and positive semi-definite;
(ii) all V ∈ ∂F(x) are nonsingular if and only if N_M is positive definite.

Proof. (i) Since m_{i,j} ≥ 0, i, j = 1, . . . , n, for any h = (h_1, . . . , h_{2n})^T ∈ R^{2n},

    h^T N_M h = m_{1,1}(h_1 + h_{n+1} + h_{2n})^2 + Σ_{i=2}^n m_{i,1}(h_i + h_{n+1})^2
              + Σ_{j=2}^{n−1} Σ_{i=1}^n m_{i,j}(h_i + h_{n+j})^2 + Σ_{i=1}^n m_{i,n} h_i^2 ≥ 0,

and N_M is symmetric; hence N_M is symmetric and positive semi-definite.

(ii) Among all V ∈ ∂F(x), we consider a particular element V_min ∈ ∂F(x):
the matrix of the form (12) with entries v_{i,j}^{(min)},                  (15)

where

    v_{1,1}^{(min)} = 1 if m_{1,1} = (t_{1,1} + x_1 + x_{n+1} + x_{2n})_+ > 0,  v_{1,1}^{(min)} = 0 if m_{1,1} = 0,
    v_{1,j}^{(min)} = 1 if m_{1,j} = (t_{1,j} + x_1 + x_{n+j})_+ > 0,  v_{1,j}^{(min)} = 0 if m_{1,j} = 0,   j = 2, . . . , n−1,
    v_{i,j}^{(min)} = 1 if m_{i,j} = (t_{i,j} + x_i + x_{n+j})_+ > 0,  v_{i,j}^{(min)} = 0 if m_{i,j} = 0,   i = 2, . . . , n,  j = 1, . . . , n−1,     (16)
    v_{i,n}^{(min)} = 1 if m_{i,n} = (t_{i,n} + x_i)_+ > 0,  v_{i,n}^{(min)} = 0 if m_{i,n} = 0,   i = 1, . . . , n.
V_min and V − V_min are symmetric and positive semi-definite for every V ∈ ∂F(x): since all V ∈ ∂F(x) are given by (12) and (13),

    v_{i,j}^{(min)} ≥ 0,   v_{i,j} − v_{i,j}^{(min)} ≥ 0,

and for any h = (h_1, . . . , h_{2n})^T ∈ R^{2n},

    h^T V_min h = v_{1,1}^{(min)}(h_1 + h_{n+1} + h_{2n})^2 + Σ_{i=2}^n v_{i,1}^{(min)}(h_i + h_{n+1})^2
                + Σ_{j=2}^{n−1} Σ_{i=1}^n v_{i,j}^{(min)}(h_i + h_{n+j})^2 + Σ_{i=1}^n v_{i,n}^{(min)} h_i^2 ≥ 0,

    h^T (V − V_min) h = (v_{1,1} − v_{1,1}^{(min)})(h_1 + h_{n+1} + h_{2n})^2 + Σ_{i=2}^n (v_{i,1} − v_{i,1}^{(min)})(h_i + h_{n+1})^2
                      + Σ_{j=2}^{n−1} Σ_{i=1}^n (v_{i,j} − v_{i,j}^{(min)})(h_i + h_{n+j})^2 + Σ_{i=1}^n (v_{i,n} − v_{i,n}^{(min)}) h_i^2 ≥ 0.

Thus, all V ∈ ∂F(x) are nonsingular if and only if V_min is positive definite.

Recall that v_{i,j}^{(min)} > 0 if and only if m_{i,j} > 0, i, j = 1, . . . , n. Comparing, for any h = (h_1, . . . , h_{2n})^T ∈ R^{2n}, the quadratic form

    h^T V_min h = v_{1,1}^{(min)}(h_1 + h_{n+1} + h_{2n})^2 + Σ_{i=2}^n v_{i,1}^{(min)}(h_i + h_{n+1})^2
                + Σ_{j=2}^{n−1} Σ_{i=1}^n v_{i,j}^{(min)}(h_i + h_{n+j})^2 + Σ_{i=1}^n v_{i,n}^{(min)} h_i^2

with

    h^T N_M h = m_{1,1}(h_1 + h_{n+1} + h_{2n})^2 + Σ_{i=2}^n m_{i,1}(h_i + h_{n+1})^2
              + Σ_{j=2}^{n−1} Σ_{i=1}^n m_{i,j}(h_i + h_{n+j})^2 + Σ_{i=1}^n m_{i,n} h_i^2,

we get that h^T V_min h > 0 if and only if h^T N_M h > 0. This implies that V_min is positive definite if and only if N_M is positive definite. Hence, Part (ii) is proved.
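Our own numerical check of the two quadratic-form identities used in this proof: h^T N_M h equals a sum of squares weighted by the entries m_{i,j}, and h^T V_min h uses the same squares weighted by v_{i,j}^{(min)}, which is 1 exactly where m_{i,j} > 0, so the two forms vanish on the same vectors h.

```python
# Build the structured matrix of (12) from a weight pattern and compare
# its quadratic form with the stated sum of weighted squares.
import random

n = 4
random.seed(3)
M = [[max(0.0, random.uniform(-0.3, 1.0)) for _ in range(n)]
     for _ in range(n)]          # any entrywise non-negative M will do
vmin = [[1.0 if M[i][j] > 0 else 0.0 for j in range(n)] for i in range(n)]

def build(w):
    # the common block structure of N_M and V_min (cf. (12))
    N = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        N[i][i] = sum(w[i])
        for j in range(n - 1):
            N[i][n + j] = N[n + j][i] = w[i][j]
    for j in range(n - 1):
        N[n + j][n + j] = sum(w[i][j] for i in range(n))
    for k in (0, n):
        N[k][2 * n - 1] = N[2 * n - 1][k] = w[0][0]
    N[2 * n - 1][2 * n - 1] = w[0][0]
    return N

def sum_of_squares(w, h):
    return w[0][0] * (h[0] + h[n] + h[2 * n - 1]) ** 2 \
        + sum(w[i][0] * (h[i] + h[n]) ** 2 for i in range(1, n)) \
        + sum(w[i][j] * (h[i] + h[n + j]) ** 2
              for j in range(1, n - 1) for i in range(n)) \
        + sum(w[i][n - 1] * h[i] ** 2 for i in range(n))

NM = build(M)
h = [random.uniform(-1, 1) for _ in range(2 * n)]
quad = sum(h[i] * NM[i][j] * h[j]
           for i in range(2 * n) for j in range(2 * n))
sos = sum_of_squares(M, h)
assert abs(quad - sos) < 1e-10 and sos >= 0     # N_M is PSD
assert sum_of_squares(vmin, h) >= 0             # V_min is PSD
print("quadratic-form identities verified")
```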
Remark 2 The positive semi-definiteness of N_M can be proved(1) alternatively as follows. Let u := e_1 + e_{n+1} + e_{2n} ∈ R^{2n}, where e_i denotes the ith column of I_{2n}, and set

    N_2 := m_{1,1} u u^T,   N_1 := N_M − N_2.

Obviously, N_2 is positive semi-definite. Furthermore, N_1 is non-negative, symmetric and weakly diagonally dominant (its first diagonal entry is Σ_{i=2}^n m_{1,i}, its (n+1)st is Σ_{i=2}^n m_{i,1}, and its last row and column vanish), so the well-known Gershgorin theorem [24] gives that all eigenvalues of N_1 are non-negative. Thus, N_1 is positive semi-definite. Hence, N_M = N_1 + N_2 is also positive semi-definite.

(1) This alternative proof was given by an anonymous referee.

If x = x^* with F(x^*) = 0, then Theorem 5 (ii) can be simplified significantly, as shown in the next result.

Theorem 6 Let M^* be the (unique) solution of Problem (2) and x^* ∈ R^{2n} satisfy F(x^*) = 0. Write

    M^* =: (m^*_{i,j}) ∈ R^{n×n},   so that m^*_{1,1} = t_{1,1},           (17)

let M̂^* ∈ R^{n×(n−1)} be the matrix formed by the first n−1 columns of M^* with its (1,1) entry replaced by zero, and denote

    L^* := diag(1/√(1 − t_{1,1}), 1, . . . , 1) M̂^* diag(1/√(1 − t_{1,1}), 1, . . . , 1) ∈ R^{n×(n−1)}.   (18)

Then
(i) it is true that

    ||L^*||_2 ≤ 1;                                                         (19)

(ii) all V ∈ ∂F(x^*) are nonsingular if and only if

    ||L^*||_2 < 1.                                                         (20)
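Before turning to the proof, the bound (19) can be illustrated numerically: as the proof below shows, it only uses the row and column sums in (21), so any doubly stochastic matrix with 0 < t_{1,1} < 1 serves as a stand-in for M^*. The sketch below (our own illustration) builds L^* as in (18) for a circulant doubly stochastic matrix and estimates its 2-norm by power iteration on (L^*)^T L^*.

```python
# Estimate ||L*||_2 by power iteration; the bound (19) gives sigma <= 1.
import random

n = 4
# a doubly stochastic matrix: each row is a cyclic shift of w
w = [0.4, 0.3, 0.2, 0.1]
M = [[w[(j - i) % n] for j in range(n)] for i in range(n)]
t11 = M[0][0]                      # = 0.4, strictly between 0 and 1

# B: first n-1 columns of M with the (1,1) entry replaced by zero
B = [[0.0 if (i == 0 and j == 0) else M[i][j] for j in range(n - 1)]
     for i in range(n)]
s = 1.0 / (1.0 - t11) ** 0.5
d_row = [s] + [1.0] * (n - 1)      # diag(1/sqrt(1 - t11), 1, ..., 1)
d_col = [s] + [1.0] * (n - 2)
L = [[d_row[i] * B[i][j] * d_col[j] for j in range(n - 1)] for i in range(n)]

random.seed(5)
v = [random.uniform(0.1, 1.0) for _ in range(n - 1)]
for _ in range(500):               # power iteration on L^T L
    u = [sum(L[i][j] * v[j] for j in range(n - 1)) for i in range(n)]
    v = [sum(L[i][j] * u[i] for i in range(n)) for j in range(n - 1)]
    nv = sum(t * t for t in v) ** 0.5
    v = [t / nv for t in v]
sigma = nv ** 0.5                  # largest singular value of L*
print(sigma)                       # at most 1, consistent with (19)
```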
Proof. (i) We have from Theorem 5 (i) that N_{M^*} is symmetric and positive semi-definite. Now 0 < t_{1,1} < 1, and M^* satisfies

    0 < m^*_{1,1} = t_{1,1} < 1,   Σ_{i=2}^n m^*_{1,i} = 1 − t_{1,1},   Σ_{i=2}^n m^*_{i,1} = 1 − t_{1,1},
    Σ_{i=1}^n m^*_{j,i} = 1,   j = 2, . . . , n,                                                           (21)
    Σ_{i=1}^n m^*_{i,j} = 1,   j = 2, . . . , n−1,

so, using the positive semi-definiteness of N_{M^*}, we obtain that the matrix

               [ D_1          M̂^*  ]
    Ñ_{M^*} := [ (M̂^*)^T      D_2  ],   D_1 := diag(1 − t_{1,1}, 1, . . . , 1) ∈ R^{n×n},   D_2 := diag(1 − t_{1,1}, 1, . . . , 1) ∈ R^{(n−1)×(n−1)},    (22)

is positive semi-definite. Equivalently, (19) holds.

(ii) By Theorem 5 (ii) we know that all V ∈ ∂F(x^*) are nonsingular if and only if N_{M^*} is positive definite, which is equivalent to the matrix Ñ_{M^*} defined by (22) being positive definite. Therefore, Part (ii) follows directly from the property that Ñ_{M^*} is positive definite if and only if (20) holds.

Theorem 6 is very pleasant because it indicates that for almost all T ∈ R^{n×n}, all V ∈ ∂F(x^*) are nonsingular at the solution x^* of the equation F(x) = 0. The following corollary contains two important sufficient conditions ensuring that all V ∈ ∂F(x^*) are nonsingular.

Corollary 7 With the notation in Theorem 6, if M^* e_i > 0 for some 1 ≤ i ≤ n, or e_j^T M^* > 0 for some 1 ≤ j ≤ n, then all V ∈ ∂F(x^*) are nonsingular. Here e_i and e_j are the ith and jth columns of I_n, respectively.

Proof. By Theorem 6 and its proof we only need to show that Ñ_{M^*} defined by (22) is positive definite provided M^* e_i > 0 for some 1 ≤ i ≤ n (or e_j^T M^* > 0 for some 1 ≤ j ≤ n). In the following we only treat the case M^* e_i > 0 for some 1 ≤ i ≤ n, because the case e_j^T M^* > 0 for some 1 ≤ j ≤ n can be discussed similarly. First, we have

    0 < t_{1,1} < 1,   0 ≤ m^*_{1,j} < 1,   0 ≤ m^*_{j,1} < 1,   j = 2, . . . , n,                         (23)

and for any h = (h_1, . . . , h_{2n−1})^T ∈ R^{2n−1},

    h^T Ñ_{M^*} h = Σ_{i=2}^n m^*_{i,1}(h_i + h_{n+1})^2 + Σ_{j=2}^{n−1} Σ_{i=1}^n m^*_{i,j}(h_i + h_{n+j})^2 + Σ_{i=1}^n m^*_{i,n} h_i^2.

Next, we show by considering three different cases that h^T Ñ_{M^*} h = 0 only if h = 0, as follows.

Case 1: m^*_{2,1} ≠ 0, . . . , m^*_{n,1} ≠ 0. In this case,

    h^T Ñ_{M^*} h = 0  ⟹  h_2 = · · · = h_n = −h_{n+1},
    h^T Ñ_{M^*} h = Σ_{j=2}^{n−1} (h_{n+1} − h_{n+j})^2 Σ_{i=2}^n m^*_{i,j} + h_{n+1}^2 Σ_{i=2}^n m^*_{i,n} + Σ_{j=2}^{n−1} m^*_{1,j}(h_1 + h_{n+j})^2 + m^*_{1,n} h_1^2 = 0
    ⟹  h_2 = · · · = h_n = −h_{n+1} = · · · = −h_{2n−1} = 0,   h^T Ñ_{M^*} h = h_1^2 Σ_{j=2}^n m^*_{1,j} = 0   (since 0