Computing the Nearest Doubly Stochastic Matrix with a Prescribed Entry*

Zheng-Jian Bai†    Delin Chu‡    Roger C. E. Tan§

June 2006
Abstract. In this paper a nearest doubly stochastic matrix problem is studied: given a matrix, find the closest doubly stochastic matrix whose (1,1) entry equals that of the given matrix. By the well-established duality theory in optimization, the dual of the underlying problem is an unconstrained convex optimization problem that is differentiable but not twice differentiable. A Newton-type method is used to solve this dual problem, and the desired nearest doubly stochastic matrix is then obtained. Under some mild assumptions, the quadratic convergence of the proposed Newton method is proved. The numerical performance of the method is also demonstrated by numerical examples.
Keywords. Doubly stochastic matrix, generalized Jacobian, Newton’s method, quadratic convergence.
1  Introduction
A matrix A ∈ R^{n×n} is called doubly stochastic if it is non-negative and all its row and column sums equal one. Doubly stochastic matrices have found many important applications in probability and statistics, quantum mechanics, the study of hypergroups, economics and operations research, physical chemistry, communication theory and graph theory; see [3, 5, 14, 15, 22] and the references therein. In this paper, we are interested in the following best approximation problem related to doubly stochastic matrices: given a matrix T ∈ R^{n×n}, find its nearest doubly stochastic matrix with the same (1,1) entry as T. This problem can be stated mathematically as follows:

    min  (1/2) ||M − T||_F^2
    s.t. M e = e,  e^T M = e^T,
         e_1^T M e_1 = e_1^T T e_1,                                        (1)
         M ≥ 0,

where

    e = (1, . . . , 1)^T ∈ R^n,   e_1 = (1, 0, . . . , 0)^T ∈ R^n,
* This work was partially supported by Research Grant R-146-000-047-112 of National University of Singapore and Research Grant 0000-X07152 of Xiamen University.
† Department of Information and Computational Mathematics, Xiamen University, Xiamen 361005, P.R. China. Email: [email protected]
‡ Department of Mathematics, National University of Singapore, 2 Science Drive 2, Singapore 117543. Email: [email protected]
§ Department of Mathematics, National University of Singapore, 2 Science Drive 2, Singapore 117543. Email: [email protected]
and M ≥ 0 means that M is non-negative. Problem (1) was originally suggested by Professor Zhaojun Bai (Department of Computer Science, UC Davis). It arose from the numerical simulation of large (semi-conductor, electronic) circuit networks. The Padé approximation technique using the Lanczos process is very powerful for computing a lower-order approximation to the linear system matrix describing a large linear network [1, 2]. The matrix T produced by the Lanczos process is in general not doubly stochastic. If the original system matrix is doubly stochastic, then we need to find the nearest doubly stochastic matrix M to T and at the same time match the moments. Problem (1) has been studied in [8] based on the alternating projection method [4]. In [8, 11], Problem (1) is simplified by removing the requirements on the (1,1) entry and the non-negativity of the matrix M; in this case, the solution can be obtained explicitly. We will revisit Problem (1). Based on the dual approach in optimization [13], we will first reformulate (1) as an unconstrained convex optimization problem that is differentiable but not twice differentiable, next apply Newton's method to solve this convex problem, and then obtain the desired nearest doubly stochastic matrix. Under some mild assumptions, we will show that the proposed Newton method is quadratically convergent. We will also demonstrate the numerical performance of the method by numerical examples. Throughout this paper, the following notation will be used:
• T = (t_{i,j}) ∈ R^{n×n} denotes the given matrix.

• A ≥ 0 (A > 0) means that A is non-negative (positive).

• K = {A : A ∈ R^{n×n}, A ≥ 0},  (z)_+ = max{0, z}.

• Π_K(X) denotes the metric projection of X onto K, i.e., Π_K(X) = ((x_{i,j})_+) for all X = (x_{i,j}) ∈ R^{n×n}.

2  Newton's Method
In this section we consider a Newton-type method for computing the solution of Problem (1). Let

    f(M) := (1/2) ||M − T||_F^2,
    A(M) := [ M e ; [I_{n−1} 0] M^T e ; e_1^T M e_1 ] ∈ R^{2n},
    b := [ e ; [I_{n−1} 0] e ; e_1^T T e_1 ] ∈ R^{2n}.

Then Problem (1) is equivalent to

    min f(M)   s.t.  A(M) = b,  M ∈ K.                                     (2)

The dual problem [13] of (2) is

    sup −θ(x)   s.t.  x ∈ R^{2n},                                          (3)

where

    θ(x) = (1/2) ||Π_K(T + A^*(x))||_F^2 − x^T b − (1/2) ||T||_F^2,

and A^* is the adjoint of A, given entrywise by

    (A^*(x))_{1,1} = x_1 + x_{n+1} + x_{2n},
    (A^*(x))_{i,j} = x_i + x_{n+j},   (i,j) ≠ (1,1),  1 ≤ j ≤ n−1,
    (A^*(x))_{i,n} = x_i,             1 ≤ i ≤ n,

for all x = (x_1, . . . , x_{2n})^T ∈ R^{2n}.
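To make these maps concrete, the following small sketch (our own illustration, not part of the paper; the names proj_K, A_map and A_star are ours) implements Π_K, A and A^* for n = 4 and verifies the defining adjoint identity ⟨A(M), x⟩ = ⟨M, A^*(x)⟩ on random data.

```python
# Illustrative sketch of Pi_K, the constraint map A, and its adjoint A*.
import random

n = 4

def proj_K(X):
    # Metric projection onto the non-negative cone K: clip entrywise.
    return [[max(0.0, X[i][j]) for j in range(n)] for i in range(n)]

def A_map(M):
    # A(M) = [row sums; first n-1 column sums; (1,1) entry] in R^{2n}.
    rows = [sum(M[i]) for i in range(n)]
    cols = [sum(M[i][j] for i in range(n)) for j in range(n - 1)]
    return rows + cols + [M[0][0]]

def A_star(x):
    # Adjoint: (A*(x))_{ij} = x_i + x_{n+j} for j <= n-1, x_i in the last
    # column, with an extra x_{2n} added to the (1,1) entry.
    Y = [[x[i] + (x[n + j] if j < n - 1 else 0.0) for j in range(n)]
         for i in range(n)]
    Y[0][0] += x[2 * n - 1]
    return Y

random.seed(0)
M = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
x = [random.uniform(-1, 1) for _ in range(2 * n)]
Y = A_star(x)
lhs = sum(a * b for a, b in zip(A_map(M), x))
rhs = sum(M[i][j] * Y[i][j] for i in range(n) for j in range(n))
print(abs(lhs - rhs))  # agrees up to rounding
```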
The relation between the value of (2) at its minimum and that of the dual (3) at its maximum is stated in the following theorem.

Theorem 1 There exists a matrix M ∈ R^{n×n} in the topological interior of K such that A(M) = b if and only if

    0 < e_1^T T e_1 < 1.                                                   (4)

Under condition (4),
(i) Problem (2) has a unique solution, denoted by M^*;
(ii) the supremum of the dual problem (3) is actually a maximum. If this maximum is achieved at x^*, then

    M^* = Π_K(T + A^*(x^*)).                                               (5)

Proof. If M is in the topological interior of K and A(M) = b, then (4) follows directly from the properties that e_1^T T e_1 = e_1^T M e_1 > 0, e_1^T M e = 1, and all entries of e_1^T M are positive. Conversely, if (4) holds, then it is clear that the matrix

                 [ r_0  r   · · ·  r  ]
        1        [ r    r_0 · · ·  r  ]
    M := ——————  [ .    .   .      .  ]   with  r_0 = (n−1) e_1^T T e_1 > 0,  r = 1 − e_1^T T e_1 > 0,    (6)
        n − 1    [ r    r   · · ·  r_0]

satisfies that M is in the topological interior of K and A(M) = b. Hence Theorem 1 follows. Under condition (4), parts (i) and (ii) are now well known; see [10, 13].

Remark 1 In [10], the condition that there exists a matrix M ∈ R^{n×n} in the topological interior of K such that A(M) = b is called the Slater condition for (2). Hence, we can regard (4) as the Slater condition for (2).
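The interior feasible point (6) from the proof of Theorem 1 is easy to check numerically. The sketch below is our own illustration, not from the paper; it builds one admissible arrangement (diagonal entries r_0/(n−1), off-diagonal entries r/(n−1); any arrangement with a single r_0 per row and column works) and verifies the prescribed (1,1) entry, the row and column sums, and strict positivity.

```python
# Check the interior feasible point (6) for arbitrary n and t11.
n, t11 = 5, 0.3                         # any 0 < t11 < 1, per condition (4)
r0, r = (n - 1) * t11, 1.0 - t11
M = [[(r0 if i == j else r) / (n - 1) for j in range(n)] for i in range(n)]

assert abs(M[0][0] - t11) < 1e-14               # prescribed (1,1) entry
for i in range(n):
    assert abs(sum(M[i]) - 1.0) < 1e-12         # row sums equal one
    assert abs(sum(M[k][i] for k in range(n)) - 1.0) < 1e-12  # column sums
assert min(min(row) for row in M) > 0           # topological interior of K
print("feasible interior point verified")
```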
According to Theorem 1, once we can compute an optimal solution x^* of the dual problem (3), we can obtain the optimal solution M^* of Problem (2) by using (5). Define F(x) := A(Π_K(T + A^*(x))) − b, i.e., for any x = (x_1, . . . , x_{2n})^T ∈ R^{2n},

    F_1(x)    = (t_{1,1} + x_1 + x_{n+1} + x_{2n})_+ + Σ_{i=2}^{n−1} (t_{1,i} + x_1 + x_{n+i})_+ + (t_{1,n} + x_1)_+ − 1,
    F_j(x)    = Σ_{i=1}^{n−1} (t_{j,i} + x_j + x_{n+i})_+ + (t_{j,n} + x_j)_+ − 1,   j = 2, . . . , n,
    F_{n+1}(x) = (t_{1,1} + x_1 + x_{n+1} + x_{2n})_+ + Σ_{j=2}^{n} (t_{j,1} + x_j + x_{n+1})_+ − 1,     (7)
    F_{n+j}(x) = Σ_{i=1}^{n} (t_{i,j} + x_i + x_{n+j})_+ − 1,   j = 2, . . . , n−1,
    F_{2n}(x) = (t_{1,1} + x_1 + x_{n+1} + x_{2n})_+ − t_{1,1}.

It is easy to see that the function θ(x) is continuously differentiable and its gradient ∇θ(x) = F(x) is globally Lipschitz continuous. So both gradient-type methods and quasi-Newton methods can be directly employed to solve (3). However, since θ(x) is not twice continuously differentiable, the convergence rates of these methods are at most linear. Since θ(x) is convex and differentiable, at a solution x^* of (3),

    ∇θ(x^*) = 0,   i.e.,   F(x^*) = 0.
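As an informal check (ours, not the paper's code) that F is indeed the gradient of θ, the sketch below evaluates θ and F on a small random instance, choosing T and x so that no kinked term (t_{i,j} + · · ·) is near zero, and compares F with central finite differences of θ.

```python
# Verify grad theta == F by central differences at a generic point.
import random

n = 3

def t_plus_astar(x, T):
    # The matrix T + A*(x), entrywise.
    Y = [[T[i][j] + x[i] + (x[n + j] if j < n - 1 else 0.0)
          for j in range(n)] for i in range(n)]
    Y[0][0] += x[2 * n - 1]
    return Y

def b_vec(T):
    return [1.0] * (2 * n - 1) + [T[0][0]]

def theta(x, T):
    P = [[max(0.0, v) for v in row] for row in t_plus_astar(x, T)]
    return 0.5 * sum(v * v for row in P for v in row) \
        - sum(u * v for u, v in zip(x, b_vec(T))) \
        - 0.5 * sum(v * v for row in T for v in row)

def F(x, T):
    P = [[max(0.0, v) for v in row] for row in t_plus_astar(x, T)]
    rows = [sum(P[i]) for i in range(n)]
    cols = [sum(P[i][j] for i in range(n)) for j in range(n - 1)]
    return [u - v for u, v in zip(rows + cols + [P[0][0]], b_vec(T))]

random.seed(1)
T = [[random.uniform(0.1, 1.0) for _ in range(n)] for _ in range(n)]
x = [random.uniform(-0.01, 0.01) for _ in range(2 * n)]  # stays off the kinks
eps = 1e-6
for k in range(2 * n):
    xp, xm = list(x), list(x)
    xp[k] += eps
    xm[k] -= eps
    fd = (theta(xp, T) - theta(xm, T)) / (2 * eps)
    assert abs(fd - F(x, T)[k]) < 1e-6   # gradient matches F componentwise
print("grad theta == F verified")
```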
This indicates that we can obtain a solution of (3) by solving the equation F(x) = 0. Since F(x) is globally Lipschitz continuous, Rademacher's theorem [20, Chapter 9.J] implies that F(x) is Fréchet differentiable almost everywhere. Let Ω_F be the set of points at which F is Fréchet differentiable, and denote the Jacobian of F(x) at x ∈ Ω_F by F'(x). The generalized Jacobian ∂F(x) of F at x ∈ R^{2n} in the sense of Clarke is defined by

    ∂F(x) := conv{∂_B F(x)},

where "conv" denotes the convex hull and

    ∂_B F(x) := { V ∈ R^{2n×2n} : V is an accumulation point of F'(x^{(k)}), x^{(k)} → x, x^{(k)} ∈ Ω_F }.

The nonsmooth Newton method for solving the equation

    F(x) = 0                                                               (8)

is given by

    x^{(k+1)} = x^{(k)} − V_k^{−1} F(x^{(k)}),   V_k ∈ ∂F(x^{(k)}).        (9)
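The iteration (9) can be sketched as follows. This is our own toy implementation for n = 3 with made-up data T, not the paper's experiments: V_k is the element of ∂_B F(x) obtained by setting each kink entry to 1, and the linear systems are solved by plain Gauss-Jordan elimination. Since this T is a mildly perturbed doubly stochastic matrix with positive entries, F is affine near x = 0 and the iteration settles essentially in one step.

```python
# A runnable sketch of the semismooth Newton iteration (9).
n = 3
T = [[0.2, 0.5, 0.3],
     [0.4, 0.2, 0.4],
     [0.3, 0.3, 0.4]]   # rows sum to 1; column sums are 0.9, 1.0, 1.1

def pre(x):   # T + A*(x), entrywise
    Y = [[T[i][j] + x[i] + (x[n + j] if j < n - 1 else 0.0)
          for j in range(n)] for i in range(n)]
    Y[0][0] += x[2 * n - 1]
    return Y

def F(x):
    P = [[max(0.0, v) for v in row] for row in pre(x)]
    return [sum(P[i]) - 1.0 for i in range(n)] \
        + [sum(P[i][j] for i in range(n)) - 1.0 for j in range(n - 1)] \
        + [P[0][0] - T[0][0]]

def jac(x):   # one element of del_B F(x): kink entries set to 1
    P = pre(x)
    a = [[1.0 if P[i][j] >= 0 else 0.0 for j in range(n)] for i in range(n)]
    V = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        V[i][i] = sum(a[i])
        for j in range(n - 1):
            V[i][n + j] = V[n + j][i] = a[i][j]
    for j in range(n - 1):
        V[n + j][n + j] = sum(a[i][j] for i in range(n))
    for k in (0, n):
        V[k][2 * n - 1] = V[2 * n - 1][k] = a[0][0]
    V[2 * n - 1][2 * n - 1] = a[0][0]
    return V

def solve(A, rhs):   # Gauss-Jordan elimination with partial pivoting
    m = len(rhs)
    W = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(W[r][c]))
        W[c], W[p] = W[p], W[c]
        for r in range(m):
            if r != c:
                f = W[r][c] / W[c][c]
                W[r] = [u - f * v for u, v in zip(W[r], W[c])]
    return [W[r][m] / W[r][r] for r in range(m)]

x = [0.0] * (2 * n)
for _ in range(5):                         # iteration (9)
    x = [u - d for u, d in zip(x, solve(jac(x), F(x)))]
res = max(abs(v) for v in F(x))
M_star = [[max(0.0, v) for v in row] for row in pre(x)]
print(res, M_star[0][0])   # residual should be ~0; (1,1) entry stays 0.2
```

Recovering M^* = Π_K(T + A^*(x^*)) at the end is exactly formula (5).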
The following result has been established in [17].

Theorem 2 [17] Let x^* be a solution of the equation F(x) = 0. If all V ∈ ∂F(x^*) are nonsingular and F is semismooth at x^*, i.e., F is directionally differentiable at x^* and

    F(x^* + δx) − F(x^*) − V(δx) = o(||δx||),   ∀ V ∈ ∂F(x^* + δx), δx → 0,

then every sequence generated by (9) is superlinearly convergent to x^* provided that the starting point x^{(0)} is sufficiently close to x^*. Moreover, if F is strongly semismooth at x^*, i.e., F is semismooth at x^* and

    F(x^* + δx) − F(x^*) − V(δx) = O(||δx||^2),   ∀ V ∈ ∂F(x^* + δx), δx → 0,

then the convergence rate is quadratic.
Motivated by Theorem 2, in the following we discuss the strong semismoothness of F and the nonsingularity of all V ∈ ∂F(x^*) at a solution x^* of F(x) = 0.

A point x = (x_1, . . . , x_{2n})^T belongs to Ω_F, i.e., F'(x) exists, if and only if

    t_{1,1} + x_1 + x_{n+1} + x_{2n} ≠ 0,
    t_{1,j} + x_1 + x_{n+j} ≠ 0,   j = 2, . . . , n−1,
    t_{i,j} + x_i + x_{n+j} ≠ 0,   i = 2, . . . , n,  j = 1, . . . , n−1,
    t_{i,n} + x_i ≠ 0,             i = 1, . . . , n.

In the case that these inequalities hold, define

    a_{1,1} := 1 if t_{1,1} + x_1 + x_{n+1} + x_{2n} > 0,  a_{1,1} := 0 if t_{1,1} + x_1 + x_{n+1} + x_{2n} < 0,
    a_{1,j} := 1 if t_{1,j} + x_1 + x_{n+j} > 0,  a_{1,j} := 0 if t_{1,j} + x_1 + x_{n+j} < 0,   j = 2, . . . , n−1,
    a_{i,j} := 1 if t_{i,j} + x_i + x_{n+j} > 0,  a_{i,j} := 0 if t_{i,j} + x_i + x_{n+j} < 0,   i = 2, . . . , n,  j = 1, . . . , n−1,
    a_{i,n} := 1 if t_{i,n} + x_i > 0,  a_{i,n} := 0 if t_{i,n} + x_i < 0,   i = 1, . . . , n,

so that a_{i,j} is the common value of the partial derivatives of the corresponding (·)_+ term with respect to each of its variables. Then, in block form,

              [ D_1              Â                a_{1,1} e_1 ]
    F'(x) =   [ Â^T              D_2              a_{1,1} ê_1 ],
              [ a_{1,1} e_1^T    a_{1,1} ê_1^T    a_{1,1}     ]

where A = (a_{i,j}) ∈ R^{n×n}, Â ∈ R^{n×(n−1)} consists of the first n−1 columns of A, D_1 := diag(Σ_{j=1}^n a_{1,j}, . . . , Σ_{j=1}^n a_{n,j}), D_2 := diag(Σ_{i=1}^n a_{i,1}, . . . , Σ_{i=1}^n a_{i,n−1}), and e_1 ∈ R^n and ê_1 ∈ R^{n−1} are the first columns of I_n and I_{n−1}, respectively;
thus, for any x = (x_1, . . . , x_{2n})^T ∈ R^{2n}, V ∈ ∂_B F(x) if and only if

         [ D_1              B̂                b_{1,1} e_1 ]
    V =  [ B̂^T              D_2              b_{1,1} ê_1 ],                (10)
         [ b_{1,1} e_1^T    b_{1,1} ê_1^T    b_{1,1}     ]

where B = (b_{i,j}) ∈ R^{n×n}, B̂ ∈ R^{n×(n−1)} consists of the first n−1 columns of B, D_1 := diag(Σ_{j=1}^n b_{1,j}, . . . , Σ_{j=1}^n b_{n,j}), D_2 := diag(Σ_{i=1}^n b_{i,1}, . . . , Σ_{i=1}^n b_{i,n−1}), e_1 ∈ R^n and ê_1 ∈ R^{n−1} are the first columns of I_n and I_{n−1}, and

    b_{1,1} = 1 if t_{1,1} + x_1 + x_{n+1} + x_{2n} > 0,  b_{1,1} ∈ {0, 1} if = 0,  b_{1,1} = 0 if < 0,
    b_{1,j} = 1 if t_{1,j} + x_1 + x_{n+j} > 0,  b_{1,j} ∈ {0, 1} if = 0,  b_{1,j} = 0 if < 0,   j = 2, . . . , n−1,
    b_{i,j} = 1 if t_{i,j} + x_i + x_{n+j} > 0,  b_{i,j} ∈ {0, 1} if = 0,  b_{i,j} = 0 if < 0,   i = 2, . . . , n,  j = 1, . . . , n−1,     (11)
    b_{i,n} = 1 if t_{i,n} + x_i > 0,  b_{i,n} ∈ {0, 1} if = 0,  b_{i,n} = 0 if < 0,   i = 1, . . . , n.

As a result, we obtain
Theorem 3 V ∈ ∂F(x) if and only if V is of the form

         [ D_1              V̂                v_{1,1} e_1 ]
    V =  [ V̂^T              D_2              v_{1,1} ê_1 ],                (12)
         [ v_{1,1} e_1^T    v_{1,1} ê_1^T    v_{1,1}     ]

with entries v_{i,j} in place of the b_{i,j} of (10), where

    v_{1,1} = 1 if t_{1,1} + x_1 + x_{n+1} + x_{2n} > 0,  v_{1,1} ∈ [0, 1] if = 0,  v_{1,1} = 0 if < 0,
    v_{1,j} = 1 if t_{1,j} + x_1 + x_{n+j} > 0,  v_{1,j} ∈ [0, 1] if = 0,  v_{1,j} = 0 if < 0,   j = 2, . . . , n−1,
    v_{i,j} = 1 if t_{i,j} + x_i + x_{n+j} > 0,  v_{i,j} ∈ [0, 1] if = 0,  v_{i,j} = 0 if < 0,   i = 2, . . . , n,  j = 1, . . . , n−1,     (13)
    v_{i,n} = 1 if t_{i,n} + x_i > 0,  v_{i,n} ∈ [0, 1] if = 0,  v_{i,n} = 0 if < 0,   i = 1, . . . , n.
We are now ready to present our results on the strong semismoothness of F and the nonsingularity of all V ∈ ∂F(x).

Theorem 4 At any point x ∈ R^{2n}, F(x) is directionally differentiable and

    F(x + δx) − F(x) − V δx = 0,   ∀ V ∈ ∂F(x + δx), δx → 0.               (14)

Hence, F is strongly semismooth at any x ∈ R^{2n}.

Proof. A simple calculation yields that

    lim_{t→0+} (F(x + th) − F(x)) / t

exists for any x, h ∈ R^{2n}, and so F(x) is directionally differentiable at any point x ∈ R^{2n}. In addition, it can be verified using (10) and (11) that

    F(x + δx) − F(x) − V δx = 0,   ∀ V ∈ ∂_B F(x + δx), δx → 0.

Since any V ∈ ∂F(x + δx) is just a convex combination of elements in ∂_B F(x + δx), (14) holds.
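Because each component of F is piecewise affine, the expansion in Theorem 4 is exact as soon as δx is small enough that no kink is crossed. The sketch below (our own illustration, not the paper's code) checks identity (14) at a generic point where every kinked term is strictly positive.

```python
# Numerical check of the exact expansion (14) for a small instance.
import random

n = 3
random.seed(2)
T = [[random.uniform(0.2, 1.0) for _ in range(n)] for _ in range(n)]

def pre(x):   # T + A*(x), entrywise
    Y = [[T[i][j] + x[i] + (x[n + j] if j < n - 1 else 0.0)
          for j in range(n)] for i in range(n)]
    Y[0][0] += x[2 * n - 1]
    return Y

def F(x):
    P = [[max(0.0, v) for v in row] for row in pre(x)]
    return [sum(P[i]) - 1.0 for i in range(n)] \
        + [sum(P[i][j] for i in range(n)) - 1.0 for j in range(n - 1)] \
        + [P[0][0] - T[0][0]]

def jac(x):   # V in del_B F(x), as in (10)-(11)
    P = pre(x)
    a = [[1.0 if P[i][j] > 0 else 0.0 for j in range(n)] for i in range(n)]
    V = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        V[i][i] = sum(a[i])
        for j in range(n - 1):
            V[i][n + j] = V[n + j][i] = a[i][j]
    for j in range(n - 1):
        V[n + j][n + j] = sum(a[i][j] for i in range(n))
    for k in (0, n):
        V[k][2 * n - 1] = V[2 * n - 1][k] = a[0][0]
    V[2 * n - 1][2 * n - 1] = a[0][0]
    return V

x = [random.uniform(-0.02, 0.02) for _ in range(2 * n)]
dx = [random.uniform(-1e-4, 1e-4) for _ in range(2 * n)]
xd = [u + v for u, v in zip(x, dx)]
V = jac(xd)
gap = [F(xd)[k] - F(x)[k] - sum(V[k][l] * dx[l] for l in range(2 * n))
       for k in range(2 * n)]
print(max(abs(g) for g in gap))   # vanishes up to rounding
```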
Theorem 5 For any x = (x_1, . . . , x_{2n})^T ∈ R^{2n}, let

    M := (m_{i,j}) = Π_K(T + A^*(x)),

i.e.,

    m_{1,1} = (t_{1,1} + x_1 + x_{n+1} + x_{2n})_+,
    m_{i,j} = (t_{i,j} + x_i + x_{n+j})_+,   (i,j) ≠ (1,1),  1 ≤ j ≤ n−1,
    m_{i,n} = (t_{i,n} + x_i)_+,             1 ≤ i ≤ n,

and let

           [ D_1              M̂                m_{1,1} e_1 ]
    N_M := [ M̂^T              D_2              m_{1,1} ê_1 ],
           [ m_{1,1} e_1^T    m_{1,1} ê_1^T    m_{1,1}     ]

where M̂ ∈ R^{n×(n−1)} consists of the first n−1 columns of M, D_1 := diag(Σ_{j=1}^n m_{1,j}, . . . , Σ_{j=1}^n m_{n,j}) and D_2 := diag(Σ_{i=1}^n m_{i,1}, . . . , Σ_{i=1}^n m_{i,n−1}). Then
(i) N_M is symmetric and positive semi-definite;
(ii) all V ∈ ∂F(x) are nonsingular if and only if N_M is positive definite.

Proof. (i) Since m_{i,j} ≥ 0, i, j = 1, . . . , n, for any h = (h_1, . . . , h_{2n})^T ∈ R^{2n},

    h^T N_M h = m_{1,1}(h_1 + h_{n+1} + h_{2n})^2 + Σ_{i=2}^n m_{i,1}(h_i + h_{n+1})^2
              + Σ_{j=2}^{n−1} Σ_{i=1}^n m_{i,j}(h_i + h_{n+j})^2 + Σ_{i=1}^n m_{i,n} h_i^2 ≥ 0,

and N_M is symmetric; hence N_M is symmetric and positive semi-definite.

(ii) Among all V ∈ ∂F(x), we consider a particular element V_min ∈ ∂F(x):
the matrix of the form (12) with entries v_{i,j}^{(min)},                  (15)

where

    v_{1,1}^{(min)} = 1 if m_{1,1} = (t_{1,1} + x_1 + x_{n+1} + x_{2n})_+ > 0,  v_{1,1}^{(min)} = 0 if m_{1,1} = 0,
    v_{1,j}^{(min)} = 1 if m_{1,j} = (t_{1,j} + x_1 + x_{n+j})_+ > 0,  v_{1,j}^{(min)} = 0 if m_{1,j} = 0,   j = 2, . . . , n−1,
    v_{i,j}^{(min)} = 1 if m_{i,j} = (t_{i,j} + x_i + x_{n+j})_+ > 0,  v_{i,j}^{(min)} = 0 if m_{i,j} = 0,   i = 2, . . . , n,  j = 1, . . . , n−1,     (16)
    v_{i,n}^{(min)} = 1 if m_{i,n} = (t_{i,n} + x_i)_+ > 0,  v_{i,n}^{(min)} = 0 if m_{i,n} = 0,   i = 1, . . . , n.
V_min and V − V_min are symmetric and positive semi-definite for every V ∈ ∂F(x): since all V ∈ ∂F(x) are given by (12) and (13),

    v_{i,j}^{(min)} ≥ 0,   v_{i,j} − v_{i,j}^{(min)} ≥ 0,

and for any h = (h_1, . . . , h_{2n})^T ∈ R^{2n},

    h^T V_min h = v_{1,1}^{(min)}(h_1 + h_{n+1} + h_{2n})^2 + Σ_{i=2}^n v_{i,1}^{(min)}(h_i + h_{n+1})^2
                + Σ_{j=2}^{n−1} Σ_{i=1}^n v_{i,j}^{(min)}(h_i + h_{n+j})^2 + Σ_{i=1}^n v_{i,n}^{(min)} h_i^2 ≥ 0,

    h^T (V − V_min) h = (v_{1,1} − v_{1,1}^{(min)})(h_1 + h_{n+1} + h_{2n})^2 + Σ_{i=2}^n (v_{i,1} − v_{i,1}^{(min)})(h_i + h_{n+1})^2
                      + Σ_{j=2}^{n−1} Σ_{i=1}^n (v_{i,j} − v_{i,j}^{(min)})(h_i + h_{n+j})^2 + Σ_{i=1}^n (v_{i,n} − v_{i,n}^{(min)}) h_i^2 ≥ 0.

Thus, all V ∈ ∂F(x) are nonsingular if and only if V_min is positive definite.

Recall that v_{i,j}^{(min)} > 0 if and only if m_{i,j} > 0, i, j = 1, . . . , n. Comparing, for any h = (h_1, . . . , h_{2n})^T ∈ R^{2n}, the quadratic form

    h^T V_min h = v_{1,1}^{(min)}(h_1 + h_{n+1} + h_{2n})^2 + Σ_{i=2}^n v_{i,1}^{(min)}(h_i + h_{n+1})^2
                + Σ_{j=2}^{n−1} Σ_{i=1}^n v_{i,j}^{(min)}(h_i + h_{n+j})^2 + Σ_{i=1}^n v_{i,n}^{(min)} h_i^2

with

    h^T N_M h = m_{1,1}(h_1 + h_{n+1} + h_{2n})^2 + Σ_{i=2}^n m_{i,1}(h_i + h_{n+1})^2
              + Σ_{j=2}^{n−1} Σ_{i=1}^n m_{i,j}(h_i + h_{n+j})^2 + Σ_{i=1}^n m_{i,n} h_i^2,

we get that h^T V_min h > 0 if and only if h^T N_M h > 0. This implies that V_min is positive definite if and only if N_M is positive definite. Hence, Part (ii) is proved.
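Our own numerical check of the two quadratic-form identities used in this proof: h^T N_M h equals a sum of squares weighted by the entries m_{i,j}, and h^T V_min h uses the same squares weighted by v_{i,j}^{(min)}, which is 1 exactly where m_{i,j} > 0, so the two forms vanish on the same vectors h.

```python
# Build the structured matrix of (12) from a weight pattern and compare
# its quadratic form with the stated sum of weighted squares.
import random

n = 4
random.seed(3)
M = [[max(0.0, random.uniform(-0.3, 1.0)) for _ in range(n)]
     for _ in range(n)]          # any entrywise non-negative M will do
vmin = [[1.0 if M[i][j] > 0 else 0.0 for j in range(n)] for i in range(n)]

def build(w):
    # the common block structure of N_M and V_min (cf. (12))
    N = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        N[i][i] = sum(w[i])
        for j in range(n - 1):
            N[i][n + j] = N[n + j][i] = w[i][j]
    for j in range(n - 1):
        N[n + j][n + j] = sum(w[i][j] for i in range(n))
    for k in (0, n):
        N[k][2 * n - 1] = N[2 * n - 1][k] = w[0][0]
    N[2 * n - 1][2 * n - 1] = w[0][0]
    return N

def sum_of_squares(w, h):
    return w[0][0] * (h[0] + h[n] + h[2 * n - 1]) ** 2 \
        + sum(w[i][0] * (h[i] + h[n]) ** 2 for i in range(1, n)) \
        + sum(w[i][j] * (h[i] + h[n + j]) ** 2
              for j in range(1, n - 1) for i in range(n)) \
        + sum(w[i][n - 1] * h[i] ** 2 for i in range(n))

NM = build(M)
h = [random.uniform(-1, 1) for _ in range(2 * n)]
quad = sum(h[i] * NM[i][j] * h[j]
           for i in range(2 * n) for j in range(2 * n))
sos = sum_of_squares(M, h)
assert abs(quad - sos) < 1e-10 and sos >= 0     # N_M is PSD
assert sum_of_squares(vmin, h) >= 0             # V_min is PSD
print("quadratic-form identities verified")
```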
Remark 2 The positive semi-definiteness of N_M can be proved(1) alternatively as follows. Let u := e_1 + e_{n+1} + e_{2n} ∈ R^{2n}, where e_i denotes the ith column of I_{2n}, and set

    N_2 := m_{1,1} u u^T,   N_1 := N_M − N_2.

Obviously, N_2 is positive semi-definite. Furthermore, N_1 is non-negative, symmetric and weakly diagonally dominant (its first diagonal entry is Σ_{i=2}^n m_{1,i}, its (n+1)st is Σ_{i=2}^n m_{i,1}, and its last row and column vanish), so the well-known Gershgorin theorem [24] gives that all eigenvalues of N_1 are non-negative. Thus, N_1 is positive semi-definite. Hence, N_M = N_1 + N_2 is also positive semi-definite.

(1) This alternative proof was given by an anonymous referee.

If x = x^* with F(x^*) = 0, then Theorem 5 (ii) can be simplified significantly, as shown in the next result.

Theorem 6 Let M^* be the (unique) solution of Problem (2) and x^* ∈ R^{2n} satisfy F(x^*) = 0. Write

    M^* =: (m^*_{i,j}) ∈ R^{n×n},   so that m^*_{1,1} = t_{1,1},           (17)

let M̂^* ∈ R^{n×(n−1)} be the matrix formed by the first n−1 columns of M^* with its (1,1) entry replaced by zero, and denote

    L^* := diag(1/√(1 − t_{1,1}), 1, . . . , 1) M̂^* diag(1/√(1 − t_{1,1}), 1, . . . , 1) ∈ R^{n×(n−1)}.   (18)

Then
(i) it is true that

    ||L^*||_2 ≤ 1;                                                         (19)

(ii) all V ∈ ∂F(x^*) are nonsingular if and only if

    ||L^*||_2 < 1.                                                         (20)
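Before turning to the proof, the bound (19) can be illustrated numerically: as the proof below shows, it only uses the row and column sums in (21), so any doubly stochastic matrix with 0 < t_{1,1} < 1 serves as a stand-in for M^*. The sketch below (our own illustration) builds L^* as in (18) for a circulant doubly stochastic matrix and estimates its 2-norm by power iteration on (L^*)^T L^*.

```python
# Estimate ||L*||_2 by power iteration; the bound (19) gives sigma <= 1.
import random

n = 4
# a doubly stochastic matrix: each row is a cyclic shift of w
w = [0.4, 0.3, 0.2, 0.1]
M = [[w[(j - i) % n] for j in range(n)] for i in range(n)]
t11 = M[0][0]                      # = 0.4, strictly between 0 and 1

# B: first n-1 columns of M with the (1,1) entry replaced by zero
B = [[0.0 if (i == 0 and j == 0) else M[i][j] for j in range(n - 1)]
     for i in range(n)]
s = 1.0 / (1.0 - t11) ** 0.5
d_row = [s] + [1.0] * (n - 1)      # diag(1/sqrt(1 - t11), 1, ..., 1)
d_col = [s] + [1.0] * (n - 2)
L = [[d_row[i] * B[i][j] * d_col[j] for j in range(n - 1)] for i in range(n)]

random.seed(5)
v = [random.uniform(0.1, 1.0) for _ in range(n - 1)]
for _ in range(500):               # power iteration on L^T L
    u = [sum(L[i][j] * v[j] for j in range(n - 1)) for i in range(n)]
    v = [sum(L[i][j] * u[i] for i in range(n)) for j in range(n - 1)]
    nv = sum(t * t for t in v) ** 0.5
    v = [t / nv for t in v]
sigma = nv ** 0.5                  # largest singular value of L*
print(sigma)                       # at most 1, consistent with (19)
```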
Proof. (i) We have from Theorem 5 (i) that N_{M^*} is symmetric and positive semi-definite. Now 0 < t_{1,1} < 1, and M^* satisfies

    0 < m^*_{1,1} = t_{1,1} < 1,   Σ_{i=2}^n m^*_{1,i} = 1 − t_{1,1},   Σ_{i=2}^n m^*_{i,1} = 1 − t_{1,1},
    Σ_{i=1}^n m^*_{j,i} = 1,   j = 2, . . . , n,                                                           (21)
    Σ_{i=1}^n m^*_{i,j} = 1,   j = 2, . . . , n−1,

so, using the positive semi-definiteness of N_{M^*}, we obtain that the matrix

               [ D_1          M̂^*  ]
    Ñ_{M^*} := [ (M̂^*)^T      D_2  ],   D_1 := diag(1 − t_{1,1}, 1, . . . , 1) ∈ R^{n×n},   D_2 := diag(1 − t_{1,1}, 1, . . . , 1) ∈ R^{(n−1)×(n−1)},    (22)

is positive semi-definite. Equivalently, (19) holds.

(ii) By Theorem 5 (ii) we know that all V ∈ ∂F(x^*) are nonsingular if and only if N_{M^*} is positive definite, which is equivalent to the matrix Ñ_{M^*} defined by (22) being positive definite. Therefore, Part (ii) follows directly from the property that Ñ_{M^*} is positive definite if and only if (20) holds.

Theorem 6 is very pleasant because it indicates that for almost all T ∈ R^{n×n}, all V ∈ ∂F(x^*) are nonsingular at the solution x^* of the equation F(x) = 0. The following corollary contains two important sufficient conditions ensuring that all V ∈ ∂F(x^*) are nonsingular.

Corollary 7 With the notation in Theorem 6, if M^* e_i > 0 for some 1 ≤ i ≤ n, or e_j^T M^* > 0 for some 1 ≤ j ≤ n, then all V ∈ ∂F(x^*) are nonsingular. Here e_i and e_j are the ith and jth columns of I_n, respectively.

Proof. By Theorem 6 and its proof we only need to show that Ñ_{M^*} defined by (22) is positive definite provided M^* e_i > 0 for some 1 ≤ i ≤ n (or e_j^T M^* > 0 for some 1 ≤ j ≤ n). In the following we only treat the case M^* e_i > 0 for some 1 ≤ i ≤ n, because the case e_j^T M^* > 0 for some 1 ≤ j ≤ n can be discussed similarly. First, we have

    0 < t_{1,1} < 1,   0 ≤ m^*_{1,j} < 1,   0 ≤ m^*_{j,1} < 1,   j = 2, . . . , n,                         (23)

and for any h = (h_1, . . . , h_{2n−1})^T ∈ R^{2n−1},

    h^T Ñ_{M^*} h = Σ_{i=2}^n m^*_{i,1}(h_i + h_{n+1})^2 + Σ_{j=2}^{n−1} Σ_{i=1}^n m^*_{i,j}(h_i + h_{n+j})^2 + Σ_{i=1}^n m^*_{i,n} h_i^2.

Next, we show by considering three different cases that h^T Ñ_{M^*} h = 0 only if h = 0, as follows.

Case 1: m^*_{2,1} ≠ 0, . . . , m^*_{n,1} ≠ 0. In this case,

    h^T Ñ_{M^*} h = 0  ⟹  h_2 = · · · = h_n = −h_{n+1},
    h^T Ñ_{M^*} h = Σ_{j=2}^{n−1} (h_{n+1} − h_{n+j})^2 Σ_{i=2}^n m^*_{i,j} + h_{n+1}^2 Σ_{i=2}^n m^*_{i,n} + Σ_{j=2}^{n−1} m^*_{1,j}(h_1 + h_{n+j})^2 + m^*_{1,n} h_1^2 = 0
    ⟹  h_2 = · · · = h_n = −h_{n+1} = · · · = −h_{2n−1} = 0,   h^T Ñ_{M^*} h = h_1^2 Σ_{j=2}^n m^*_{1,j} = 0   (since 0