Mathematical Programming 52 (1991) 377-404 North-Holland
Global convergence of the affine scaling methods for degenerate linear programming problems

Takashi Tsuchiya
The Institute of Statistical Mathematics, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106, Japan

Received 19 January 1990
Revised manuscript received 1 February 1991

In this paper we show the global convergence of the affine scaling methods without assuming any condition on degeneracy. The behavior of the method near degenerate faces is analyzed in detail on the basis of the equivalence between the affine scaling methods for homogeneous LP problems and Karmarkar's method. It is shown that the step-size $1/8$, where the displacement vector is normalized with respect to the distance in the scaled space, is sufficient to guarantee the global convergence of the affine scaling methods.
Key words: Linear programming, interior point methods, affine scaling methods, global analysis, degenerate problems.
Introduction
Since Karmarkar [9] proposed the projective scaling method for linear programming in 1984, a number of interior point methods have been proposed and implemented. The affine scaling method, originated by Dikin [6] and rediscovered by several authors including Barnes [4], Vanderbei et al. [21], and Adler et al. [1], is one of the most popular interior point methods; it is obtained by substituting the affine scaling transformation for the projective transformation in Karmarkar's method. Though there is no proof of the polynomiality of the method and many researchers have rather pessimistic conjectures on this problem (see [11] on this point), this simple method works well in practice, and several promising experimental results [1, 2, 10, 12] have been reported.

One of the most interesting questions regarding this method has been to prove its global convergence without requiring nondegeneracy assumptions. Since the papers by Barnes [4] and Vanderbei et al. [21] appeared with (independent) proofs of global convergence assuming both primal and dual nondegeneracy, several researchers have dealt with this problem. The best global convergence results obtained so far are (i) the proof by Dikin [7] (assuming primal nondegeneracy and $\mu = 1$, where $\mu$ is the step-size when the displacement vector is normalized with respect to the Euclidean distance in the scaled space; see also [22]); (ii) the proof by Tsuchiya [18] (assuming dual nondegeneracy and $\mu = 1/8$); and (iii) the proof by Tseng and Luo [15] (assuming no condition on degeneracy and, typically, $\mu = O(2^{-L})$, where $L$ is the bit-size of the problem). For the continuous version of the method, global convergence was shown by Adler and Monteiro [3] without assuming nondegeneracy conditions. Tseng and Luo's result surpasses the other results in view of the conditions on degeneracy; however, their step-size bound for global convergence depends on the problem, and usually it can only be estimated by a quantity of $O(2^{-L})$.

In this paper we complete the approach taken in [18] by removing the dual nondegeneracy assumption, giving a new proof of global convergence of the affine scaling method without requiring nondegeneracy conditions. In the previous paper, we introduced the local Karmarkar potential function as a key ingredient to analyze the behavior of the affine scaling method near degenerate vertices, on the basis of the equivalence between Karmarkar's method and the affine scaling method applied to homogeneous problems [5, 8, 16, 17, 23]. Applying the techniques developed for the local analysis of several interior point methods in the presence of degeneracy [19, 20] and the global analysis of the dynamical systems associated with the interior point methods [14], we proved the global convergence of the affine scaling method under the assumption of dual nondegeneracy. In this paper we study extensively the behavior of the method in the vicinity of dual degenerate faces to remove this assumption. Combining the new results with the key idea of our previous work, we show that the step-size $1/8$, where the displacement vector is normalized with respect to the Euclidean distance in the scaled slack variables, is sufficient to guarantee the convergence of the sequence to an interior point of the optimal solution set.

This paper was presented at the International Symposium "Interior Point Methods for Linear Programming: Theory and Practice," held on January 18-19, 1990, at the Europa Hotel, Scheveningen, the Netherlands.

1. Problem
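We deal with the dual standard form linear programming problem (D):

$$\text{minimize } c^t x \quad \text{subject to } x \in \mathcal{P}, \tag{1.1}$$

where $\mathcal{P} = \{x \in \mathbb{R}^n \mid A^t x - b \ge 0\}$, $A = (a_1, \ldots, a_m) \in \mathbb{R}^{n \times m}$, $c \in \mathbb{R}^n$ and $b \in \mathbb{R}^m$.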
We study the global convergence of the affine scaling method for the dual standard form linear programming problems [1] under the following assumptions.

Assumption 1. The feasible region $\mathcal{P}$ has an interior point and $\mathrm{Rank}(A) = n$.

Assumption 2. $c \ne 0$.
We do not require any condition on degeneracy. Note also that the boundedness of the optimal solution set is not assumed.
We introduce basic notation. For a vector $v$, we denote by $[v]$ the diagonal matrix whose diagonal entries are the elements of $v$. We denote the slack variables $A^t x - b$ by $\xi(x)$, and define the "metric" matrix $G(x)$ for the affine scaling method as follows:

$$G(x) = A[\xi(x)]^{-2}A^t. \tag{1.2}$$

$\mathbf{1}$ and $I$ denote the vector of all ones and the identity matrix of proper dimension, respectively. We use $\|\cdot\|$ (without subscript) for the 2-norm. For the sequence $\{x^{(\nu)}\}$ ($\nu = 1, \ldots$; $x^{(\nu)} \in \mathbb{R}^n$), we abbreviate $\{f(x^{(\nu)})\}$, $\{g(x^{(\nu)})\}$, etc. as $\{f^{(\nu)}\}$, $\{g^{(\nu)}\}$, etc. We denote by $x^+$ the new point obtained by performing one iterative step at the point $x \in \mathbb{R}^n$, and use $f^+$, $g^+$, etc. to denote $f(x^+)$, $g(x^+)$, etc. We do not indicate arguments of functions when they are obvious from the context.
2. The dual affine scaling method and the main result

Let $x^{(\nu)}$ be an interior point of the polyhedron $\mathcal{P}$. The iteration of the affine scaling method for the dual standard form problem (D) is defined as follows:

$$x^{(\nu+1)} = x^{(\nu)} - \mu^{(\nu)} \frac{G(x^{(\nu)})^{-1} c}{\{c^t G(x^{(\nu)})^{-1} c\}^{1/2}}. \tag{2.1}$$
We refer to this method as the dual affine scaling method. We determine the step-size $\mu^{(\nu)}$ by an appropriate step-size selection procedure such that $x^{(\nu+1)}$ is also an interior point of $\mathcal{P}$, which guarantees that the iterations can be continued recursively. Concerning the choice of step-sizes for the iteration (2.1), we have the following well-known lemma.

Lemma 2.1 [18, Lemma 2.2]. If $x^{(\nu)}$ is an interior point of $\mathcal{P}$ and $0 < \mu^{(\nu)} < 1$ in the iteration (2.1), then $x^{(\nu+1)}$ is also an interior point of $\mathcal{P}$.

Proof. See [18]. The counterpart of the lemma for the affine scaling method for the standard form problems is given in [4]. []

Since $G(x)$ is a positive definite matrix, we easily see that the method is a descent method for $c^t x$. Now, the main result of this paper is stated as follows.

Theorem 2.2. Let (D) be a linear programming problem satisfying Assumptions 1 and 2, and let $\{x^{(\nu)}\}$ be the sequence generated by the dual affine scaling method (2.1) applied to (D) with step-size $\mu^{(\nu)} = 1/8$. If (D) has an optimal solution, the sequence converges to an interior point $x^*$ of the whole set $\mathcal{S}$ of the optimal solutions with $\|x^{(\nu)} - x^*\| = O(c^t x^{(\nu)} - c^t x^*)$, where the reduction rate of $(c^t x^{(\nu)} - c^t x^*)$ is bounded from above by $(1 - 1/(8k^{1/2}))$ asymptotically. Here $k$ denotes the number of always-active constraints on $\mathcal{S}$ and, if $\mathcal{S}$ is a vertex, we regard the interior point of the vertex as the vertex itself. On the other hand, if (D) does not have an optimal solution, the sequence is unbounded with $c^t x^{(\nu)} \to -\infty$ as $\nu \to \infty$.
The proof of this theorem will be given in Section 6. We note that, although our result is stated for the dual standard form problems, it can be directly applied to show the global convergence of the affine scaling method for the standard form problems. A brief discussion of this point is given in the Appendix of [18].
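As an illustration of the iteration (2.1), the following is a minimal sketch in plain Python. The toy instance (minimize $x_1 + x_2$ subject to $x_1 \ge 0$, $x_2 \ge 0$, $x_1 + x_2 \ge 0$), the helper names `solve`, `slacks` and `dual_affine_step`, and the starting point are all assumptions made for this example; they do not come from the paper. The constant step-size $1/8$ is the one appearing in Theorem 2.2.

```python
import math

def solve(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(aug[r][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for r in range(k + 1, n):
            f = aug[r][k] / aug[k][k]
            for j in range(k, n + 1):
                aug[r][j] -= f * aug[k][j]
    z = [0.0] * n
    for k in reversed(range(n)):
        tail = sum(aug[k][j] * z[j] for j in range(k + 1, n))
        z[k] = (aug[k][n] - tail) / aug[k][k]
    return z

def slacks(cols, b, x):
    """xi(x) = A^t x - b, where cols[i] is the i-th column a_i of A."""
    return [sum(aj * xj for aj, xj in zip(a, x)) - bi for a, bi in zip(cols, b)]

def dual_affine_step(cols, b, c, x, mu):
    """One step of (2.1): x+ = x - mu * G^{-1} c / sqrt(c^t G^{-1} c)."""
    n = len(x)
    xi = slacks(cols, b, x)
    # G(x) = A [xi]^{-2} A^t = sum_i a_i a_i^t / xi_i^2, see (1.2)
    G = [[sum(a[p] * a[q] / t ** 2 for a, t in zip(cols, xi)) for q in range(n)]
         for p in range(n)]
    d = solve(G, c)
    scale = math.sqrt(sum(ci * di for ci, di in zip(c, d)))
    return [xj - mu * dj / scale for xj, dj in zip(x, d)]

# Hypothetical toy instance; the third constraint is redundant at the origin.
cols = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
b = [0.0, 0.0, 0.0]
c = [1.0, 1.0]
x = [1.0, 2.0]
objective = [sum(ci * xj for ci, xj in zip(c, x))]
for _ in range(50):
    x = dual_affine_step(cols, b, c, x, mu=0.125)
    objective.append(sum(ci * xj for ci, xj in zip(c, x)))
```

In this sketch the recorded objective values decrease at every step and all slacks stay positive, in line with the descent property and Lemma 2.1; the example says nothing about rates beyond this single instance.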
3. Preliminaries
In this section we introduce further notation regarding polyhedra, together with preliminary lemmas and concepts which will be used in the remainder of this paper. See [13] for the basic theory of polyhedra.

(1) We use calligraphic letters $\mathcal{A}, \mathcal{B}, \ldots, \mathcal{Y}$ to denote the faces of $\mathcal{P}$. We denote by $\mathcal{S}$ the set of the optimal solution(s) if it exists. We do not treat the empty set as a face. For a face $\mathcal{F}$ of $\mathcal{P}$, we denote by $E(\mathcal{F})$ the set of indices of the constraints which are always satisfied with equality on the face. We sometimes abbreviate $E(\mathcal{F})$ as $E$ when the face $\mathcal{F}$ associated with the notation $E$ is obvious from the context.

(2) Given a set $X \subseteq \{1, \ldots, m\}$ of indices, we denote by $A_X$, $b_X$ the matrix and the vector composed of the corresponding coefficient vectors and constants. We use $\xi_X(x)$ for $A_X^t x - b_X$. Analogously, for a vector $v$, we denote by $v_X$ the vector composed of the part of $v$ associated with $X$. We introduce the following set, which is naturally determined from $X$:

$$\Psi(X) = \{x \in \mathcal{P} \mid A_X^t x - b_X = 0\}. \tag{3.1}$$

Note that $\Psi(\emptyset)$ is $\mathcal{P}$ itself, and that $\Psi(X)$ may be empty if $X$ is chosen arbitrarily. If $\Psi(X)$ is nonempty, it determines a face of $\mathcal{P}$. Conversely, given a face $\mathcal{F}$, there always exists an index set $X'$ with which $\mathcal{F}$ is written as $\Psi(X')$. $X'$ may not be determined uniquely.

(3) A point $x$ on a face $\mathcal{F}$ of $\mathcal{P}$ is referred to as an "interior point of $\mathcal{F}$" if $\xi_{E(\mathcal{F})}(x) = 0$ and $\xi_i(x) > 0$ ($i \notin E(\mathcal{F})$). The interior point of a vertex is the vertex itself. The face $\mathcal{F}$ is characterized as the smallest face (as a set) among the faces which contain the point $x$.

(4) For an index set $X$, we use $|X|$ to denote its cardinality. If $X$ is a (proper) subset of another index set $Y$, we write $X \subseteq (\subset)\, Y$. We denote by $X - Y$ the set consisting of the indices which belong to $X$ but not to $Y$. The complement of $X$, which is defined as $\{1, \ldots, m\} - X$, is written as $X^c$.

Let $Z$ be an index set of constraints. We can choose the index set $B \subseteq Z$ such that the columns of $A_B$ form a basis for the range space of $A_Z$. Since $\mathrm{Rank}(A) = n$, by elementary linear algebra we can choose the index set $\hat{B}$ from the complement of $Z$ such that $A_{B \cup \hat{B}}$ is a nonsingular matrix. Then $\xi_{B \cup \hat{B}}$ is regarded as a coordinate system alternative to $x$, where the coordinate transformation is given by

$$\xi_{B \cup \hat{B}}(x) = A_{B \cup \hat{B}}^t x - b_{B \cup \hat{B}} \quad \text{and} \quad x(\xi_{B \cup \hat{B}}) = (A_{B \cup \hat{B}}^t)^{-1}(\xi_{B \cup \hat{B}} + b_{B \cup \hat{B}}). \tag{3.2}$$
We refer to the pair $(B, \hat{B})$ as a "pair of basis index sets associated with the index set $Z$". In this paper we use the letters $B$ and $\hat{B}$ as the notation for such pairs of
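basis index sets. When we want to make clear that the pair is associated with the index set $Z$, we write them as $B(Z)$ and $\hat{B}(Z)$. We refer to $(\xi_{B(Z)}, \xi_{\hat{B}(Z)})$ as the "slack coordinate associated with the index set $Z$". We denote by $R(Z, B)$ the index set $Z - B$. By definition, there exists a constant matrix $T_{BR}$ such that

$$A_R = A_B T_{BR}. \tag{3.3}$$

Once the index set $Z$ and its associated pair of basis index sets $(B, \hat{B})$ are determined, we define the matrices $\tilde{A}_{B(Z)}$ and $\tilde{A}_{\hat{B}(Z)}$ as

$$\begin{pmatrix} \tilde{A}_{B(Z)} \\ \tilde{A}_{\hat{B}(Z)} \end{pmatrix} \equiv \begin{pmatrix} A_{B(Z)} & A_{\hat{B}(Z)} \end{pmatrix}^{-1} = (A_{B(Z) \cup \hat{B}(Z)})^{-1}. \tag{3.4}$$

Then we have

$$\begin{pmatrix} \tilde{A}_{B(Z)} A_{B(Z)} & \tilde{A}_{B(Z)} A_{\hat{B}(Z)} \\ \tilde{A}_{\hat{B}(Z)} A_{B(Z)} & \tilde{A}_{\hat{B}(Z)} A_{\hat{B}(Z)} \end{pmatrix} = \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix}, \qquad A_{B(Z)} \tilde{A}_{B(Z)} + A_{\hat{B}(Z)} \tilde{A}_{\hat{B}(Z)} = I. \tag{3.5}$$

Note that $A_{B(Z)} \tilde{A}_{B(Z)}$ and $A_{\hat{B}(Z)} \tilde{A}_{\hat{B}(Z)}$ are projection matrices. The objective function $c^t x$ can be written as

$$c^t x = c^t \tilde{A}_{B(Z)}^t (\xi_{B(Z)} + b_{B(Z)}) + c^t \tilde{A}_{\hat{B}(Z)}^t (\xi_{\hat{B}(Z)} + b_{\hat{B}(Z)}) \tag{3.6}$$

in terms of the coordinate $(\xi_{B(Z)}, \xi_{\hat{B}(Z)})$.

With these notations, all the constraints can be categorized into four groups:

$$\xi(x) = A^t x - b = \begin{pmatrix} A_{R(Z,B)}^t x - b_{R(Z,B)} \\ A_{B(Z)}^t x - b_{B(Z)} \\ A_{\hat{B}(Z)}^t x - b_{\hat{B}(Z)} \\ A_{N(Z,\hat{B})}^t x - b_{N(Z,\hat{B})} \end{pmatrix} = \begin{pmatrix} \xi_R(x) \\ \xi_B(x) \\ \xi_{\hat{B}}(x) \\ \xi_N(x) \end{pmatrix}, \tag{3.7}$$

where $N(Z, \hat{B}) = \{1, \ldots, m\} - Z - \hat{B} = \{1, \ldots, m\} - (R \cup B \cup \hat{B})$. We use $R$ and $N$ also as global notations in this paper. We omit the arguments $(Z, B)$ of $R$ and $(Z, \hat{B})$ of $N$ if they are obvious from the context.

A face $\mathcal{F}$ of $\mathcal{P}$ is referred to as a "dual degenerate face" if the objective function $c^t x$ is constant on the face. We include vertices also as dual degenerate faces. Dual degenerate faces are characterized as follows.

Proposition 3.1 [18, Proposition 3.2]. A face $\mathcal{F}$ of $\mathcal{P}$ is a dual degenerate face if and only if $c \in \mathrm{Im}(A_{E(\mathcal{F})})$.

Proof. It is an easy exercise, hence we omit the proof.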
[]
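The criterion of Proposition 3.1 can be checked numerically on small instances. The sketch below is an illustration only: the helper names `rank` and `is_dual_degenerate`, the tolerance, and the toy polyhedron ($x_1 + x_2 \ge 1$, $x_1 \ge 0$, $x_2 \ge 0$ with $c = (1, 1)$) are assumptions of this example, and indices are 0-based. It tests $c \in \mathrm{Im}(A_E)$ by comparing the rank of the columns indexed by $E$ with the rank after appending $c$.

```python
def rank(rows, tol=1e-12):
    """Rank of a small dense matrix (given as a list of rows) by elimination."""
    M = [list(r) for r in rows]
    ncols = len(M[0]) if M else 0
    rk = 0
    for col in range(ncols):
        piv = max(range(rk, len(M)), key=lambda r: abs(M[r][col]), default=None)
        if piv is None or abs(M[piv][col]) <= tol:
            continue
        M[rk], M[piv] = M[piv], M[rk]
        for r in range(rk + 1, len(M)):
            f = M[r][col] / M[rk][col]
            for j in range(col, ncols):
                M[r][j] -= f * M[rk][j]
        rk += 1
    return rk

def is_dual_degenerate(cols, E, c):
    """True if c lies in the span of the columns a_i, i in E (Proposition 3.1)."""
    vectors = [cols[i] for i in E]
    return rank(vectors) == rank(vectors + [tuple(c)])

# Hypothetical toy polyhedron: x1 + x2 >= 1, x1 >= 0, x2 >= 0.
cols = [(1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
c = (1.0, 1.0)
edge_sum = is_dual_degenerate(cols, [0], c)       # edge x1 + x2 = 1
edge_x1 = is_dual_degenerate(cols, [1], c)        # edge x1 = 0
vertex = is_dual_degenerate(cols, [0, 1], c)      # vertex (0, 1)
```

Here the edge $x_1 + x_2 = 1$ is reported as dual degenerate (the objective is constant on it), the edge $x_1 = 0$ is not, and the vertex is, consistent with the remark that vertices are always included.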
By definition, the set of the optimal solution(s) is a dual degenerate face. We note that a dual degenerate face does not necessarily contain an optimal solution. Any face $\mathcal{F}$ of $\mathcal{P}$ that is contained in a hyperplane $\{x \mid c^t x = c_0\}$ with appropriate $c_0$ is a dual degenerate face. For example, every vertex is a dual degenerate face. For reference, we describe how the degeneracy conditions, which are assumed in Dikin [7] and Tsuchiya [18], are written in terms of the notation introduced above. We do not require these assumptions here.
(i) Assumption of primal nondegeneracy (assumed in Dikin [7]): For each face $\mathcal{F}$ of $\mathcal{P}$, there exists no redundant constraint which is always active on the face. In other words, $E(\mathcal{F})$ is the unique index set such that $\Psi(E(\mathcal{F})) = \mathcal{F}$.

(ii) Assumption of dual nondegeneracy (assumed in Tsuchiya [18]): $\mathcal{P}$ has no dual degenerate face other than its vertices, i.e., $c \notin \mathrm{Im}(A_{E(\mathcal{F})})$ for any face $\mathcal{F}$ of $\mathcal{P}$ that is not a vertex.

The global convergence proofs by Barnes [4] and Vanderbei et al. [21] require both primal and dual nondegeneracy assumptions. (Strictly speaking, the dual nondegeneracy assumption (ii) is slightly weaker than the one assumed in their proofs. See the Appendix of [18] on this point.)

Since the main subject of this paper is to study the asymptotic behavior of the sequence $\{x^{(\nu)}\}$ generated by (2.1) as $\nu \to \infty$, it is necessary to consider the situations where the sequence (or a subsequence) approaches a face or diverges to infinity. In such situations, it turns out to be useful in the analysis to divide the constraints into two groups: one consisting of the constraints whose values converge to 0 or are relatively small, and the other consisting of the constraints whose values are large or diverge to infinity. We introduce the following characteristic determined from the index set $F$:

$$\Phi_F(x) = \frac{\max_{i \in F} \xi_i(x)}{\min_{i \in F^c} \xi_i(x)}. \tag{3.8}$$

By convention, we define $\Phi_F(x) = 0$ in the special cases where $F = \emptyset$ or $F = \{1, \ldots, m\}$ (the index set of all constraints of (D)). If the value of $\Phi_F(x)$ is small, $\xi_F(x)$ consists of small components of $\xi(x)$, while $\xi_{F^c}(x)$ is composed of large components. This quantity $\Phi_F(x)$ is exactly the same as the one introduced in (4.20) of [18], and plays an important role. The following lemma relates the existence of a sequence of interior points to the existence of a face through the quantity (3.8).

Lemma 3.2. Let $F$ be a nonempty index set of constraints, and let $\{x^{(\nu)}\}$ be a sequence of interior points of $\mathcal{P}$. If

(i) $\xi_F^{(\nu)} \to 0$, and

(ii) $\Phi_F(x^{(\nu)})$ (equivalently, $\|\xi_F^{(\nu)}\| / \min_{i \notin F} \xi_i^{(\nu)}$) converges to zero,

then there exists a face $\mathcal{F}$ such that $E(\mathcal{F}) = F$.

Proof. Due to (i), there exists a point $\hat{x}$ such that $\xi_F(\hat{x}) = A_F^t \hat{x} - b_F = 0$. If $F = \{1, \ldots, m\}$, the lemma is obvious. In the following we assume $F \subset \{1, \ldots, m\}$, and show the existence of a point $x \in \mathcal{P}$ such that

$$A_F^t x - b_F = 0, \qquad a_i^t x - b_i > 0 \quad (i \notin F). \tag{3.9}$$

If this can be shown, the lemma immediately follows by putting $\mathcal{F} = \Psi(F)$. To this end, consider the following system of linear equations:

$$A_F^t \Delta x^{(\nu)} = \xi_F^{(\nu)}. \tag{3.10}$$
This equation has a solution, say, $\Delta x^{(\nu)} = x^{(\nu)} - \hat{x}$. Let us denote by $\Delta\tilde{x}^{(\nu)}$ the minimum norm solution of (3.10). As is well known, the norm of $\Delta\tilde{x}^{(\nu)}$ is bounded from above by $\|\Delta\tilde{x}^{(\nu)}\| \le M\|\xi_F^{(\nu)}\|$, where $M$ is a positive constant. We put

$$\tilde{x}^{(\nu)} = x^{(\nu)} - \Delta\tilde{x}^{(\nu)}. \tag{3.11}$$

Then we have

$$A_F^t \tilde{x}^{(\nu)} = b_F \tag{3.12}$$

for all $\nu$, and, due to (ii),

$$a_i^t \tilde{x}^{(\nu)} - b_i = a_i^t x^{(\nu)} - b_i - a_i^t \Delta\tilde{x}^{(\nu)} \ge \xi_i^{(\nu)} - M\|a_i\|\,\|\xi_F^{(\nu)}\| > 0 \tag{3.13}$$

for each $i \notin F$ when $\nu$ is large enough. Therefore, if we take $x$ to be $\tilde{x}^{(\nu)}$ where $\nu$ is sufficiently large, $x$ satisfies (3.9). This completes the proof. []
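The construction in the proof of Lemma 3.2 can be traced on a small instance. The following sketch is restricted, as a simplifying assumption, to an index set $F$ with a single element, so that the minimum norm solution of (3.10) has the closed form $\Delta\tilde{x} = a_i \xi_i / \|a_i\|^2$. The function name `phi`, the toy constraints and the chosen point are hypothetical, and indices are 0-based.

```python
def phi(xi, F):
    """Phi_F(x) of (3.8): largest slack in F over smallest slack outside F.
    Returns 0 by convention when F is empty or contains every index."""
    inside = [xi[i] for i in F]
    outside = [xi[i] for i in range(len(xi)) if i not in F]
    if not inside or not outside:
        return 0.0
    return max(inside) / min(outside)

# Hypothetical toy polyhedron: x1 + x2 >= 1, x1 >= 0, x2 >= 0.
cols = [(1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
b = [1.0, 0.0, 0.0]
F = [0]                                  # single constraint x1 + x2 >= 1

x = [0.5, 0.501]                         # interior point close to the face
xi = [a[0] * x[0] + a[1] * x[1] - bi for a, bi in zip(cols, b)]
phi_value = phi(xi, F)

# Minimum norm solution of a^t dx = xi_F for a single row a: dx = a * xi_F / ||a||^2.
a = cols[F[0]]
norm_sq = a[0] ** 2 + a[1] ** 2
dx = [a[0] * xi[F[0]] / norm_sq, a[1] * xi[F[0]] / norm_sq]

# Shifted point of (3.11); it should satisfy (3.9).
x_tilde = [x[0] - dx[0], x[1] - dx[1]]
xi_tilde = [p[0] * x_tilde[0] + p[1] * x_tilde[1] - bi for p, bi in zip(cols, b)]
```

In this instance $\Phi_F(x)$ is small, the shifted point has a vanishing slack for the constraint in $F$, and the remaining slacks stay strictly positive, which is the content of (3.12) and (3.13).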
4. Asymptotic formula of the projection matrix associated with the dual affine scaling method

Let $y$ be a vector in $\mathbb{R}^m$ which satisfies the equality $Ay = c$. The iteration (2.1) of the dual affine scaling method is written as follows in the space of slack variables $\xi(x)$:

$$\begin{aligned} \xi(x^+) = A^t x^+ - b &= A^t x - b - \mu A^t \frac{G(x)^{-1} c}{\{c^t G(x)^{-1} c\}^{1/2}} \\ &= A^t x - b - \mu A^t \frac{G(x)^{-1} A y}{\{y^t A^t G(x)^{-1} A y\}^{1/2}} \\ &= \xi(x) - \mu[\xi(x)] \frac{P(x)\alpha(x)}{\{\alpha(x)^t P(x) \alpha(x)\}^{1/2}}, \end{aligned} \tag{4.1}$$

where

$$P(x) = [\xi(x)]^{-1} A^t G(x)^{-1} A [\xi(x)]^{-1} \tag{4.2}$$

and $\alpha(x) = [\xi(x)]y$. Note that $P(x)$ is a projection matrix. Multiplying both sides of (4.1) by $[\xi]^{-1}$, we have

$$[\xi]^{-1}\xi^+ = \mathbf{1} - \mu \frac{P\alpha}{(\alpha^t P \alpha)^{1/2}}. \tag{4.3}$$

This means that the value of each slack variable is multiplied at most by the factor of $1 + \mu$ at each step of the iteration (2.1). Since $c \ne 0$ due to Assumption 2 of Section 1, we see that the sequence generated by (2.1) has accumulation points on the boundary of $\mathcal{P}$, or has unbounded subsequences. In our previous paper [18], we studied the asymptotic formula of $P(x)$ in detail in the cases where some of the constraints tend to zero or diverge to infinity (or both), and obtained the following lemmas.
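The relations (4.2) and (4.3) can be checked numerically on a small instance. The sketch below is an illustration under stated assumptions: it is written for $n = 2$ only (so that a closed-form $2 \times 2$ inverse suffices), and the instance, the vector $y$ with $Ay = c$, the point $x$ and the helper names `inv2` and `quad` are hypothetical. It forms $P(x)$, tests $P^2 = P$, and evaluates the slack ratios $\xi_i(x^+)/\xi_i(x)$ from (4.3).

```python
import math

def inv2(M):
    """Inverse of a 2x2 matrix (enough for this n = 2 illustration)."""
    (p, q), (r, s) = M
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

cols = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]   # columns a_i of A
b = [0.0, 0.0, 0.0]
y = [0.0, 0.0, 1.0]                           # A y = c = (1, 1)
x = [1.0, 2.0]
mu = 0.125
m = len(cols)

xi = [a[0] * x[0] + a[1] * x[1] - bi for a, bi in zip(cols, b)]
G = [[sum(a[r] * a[s] / t ** 2 for a, t in zip(cols, xi)) for s in range(2)]
     for r in range(2)]
Ginv = inv2(G)

def quad(u, v):
    """u^t G^{-1} v for 2-vectors u and v."""
    return sum(u[r] * Ginv[r][s] * v[s] for r in range(2) for s in range(2))

# P(x) = [xi]^{-1} A^t G^{-1} A [xi]^{-1}, entry by entry, see (4.2)
P = [[quad(cols[i], cols[j]) / (xi[i] * xi[j]) for j in range(m)] for i in range(m)]
P2 = [[sum(P[i][k] * P[k][j] for k in range(m)) for j in range(m)] for i in range(m)]

alpha = [t * yi for t, yi in zip(xi, y)]
Palpha = [sum(P[i][j] * alpha[j] for j in range(m)) for i in range(m)]
norm = math.sqrt(sum(al * pa for al, pa in zip(alpha, Palpha)))
ratio = [1.0 - mu * pa / norm for pa in Palpha]   # xi_i(x+) / xi_i(x), see (4.3)
```

Because $P$ is a symmetric projection, $\|P\alpha\| = (\alpha^t P \alpha)^{1/2}$, so each entry of `ratio` lies between $1 - \mu$ and $1 + \mu$; this is the bound stated after (4.3).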
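Lemma 4.1 [18, Lemma 4.1]. Let $F$ be an index set, and choose a pair of basis index sets $(B(F), \hat{B}(F))$ associated with $F$ to take the slack coordinate $(\xi_{B(F)}, \xi_{\hat{B}(F)})$. Let $x$ be an interior point of $\mathcal{P}$ and let the slack variables $\xi(x)$ be ordered as $\xi(x) = (\xi_F(x), \xi_{\hat{B}(F)}(x), \xi_{N(F)}(x)) = (\xi_{R(F)}(x), \xi_{B(F)}(x), \xi_{\hat{B}(F)}(x), \xi_{N(F)}(x))$. Then the matrix $P(x)$ is written as follows (a matrix with a pair of index sets as lower indices, say $C_{Z_1 Z_2}$, represents a matrix whose rows and columns are associated with the first index set $Z_1$ and the second one $Z_2$, respectively; we use this convention throughout the paper):

$$P(x) = \begin{pmatrix} \hat{P}_{FF} & 0 \\ 0 & \hat{P}_{F^cF^c} \end{pmatrix} + \Delta P, \tag{4.4}$$

where the rows and columns are partitioned as $(F, F^c)$ with $F^c = \hat{B}(F) \cup N(F)$, and

[The block expression of $\Delta P$ in terms of $Q_{FF}$, $Q_{F\hat{B}}$, $\Delta Q_{\hat{B}\hat{B}}$, $S_{FN}$ and $S_{\hat{B}N}$, together with the definition of $\Delta Q_{\hat{B}\hat{B}}$, is illegible in this copy; see [18, Lemma 4.1].]

$$\begin{aligned} \hat{P}_{FF} &= \begin{pmatrix} S_{BR}^t \\ I \end{pmatrix} (I + S_{BR} S_{BR}^t)^{-1} \begin{pmatrix} S_{BR} & I \end{pmatrix}, \\ \hat{P}_{F^cF^c} &= \begin{pmatrix} I \\ S_{\hat{B}N}^t \end{pmatrix} (I + S_{\hat{B}N} S_{\hat{B}N}^t)^{-1} \begin{pmatrix} I & S_{\hat{B}N} \end{pmatrix}, \\ Q_{FF} &= \hat{P}_{FF} S_{FN} (I + S_{\hat{B}N}^t S_{\hat{B}N} + S_{FN}^t \hat{P}_{FF} S_{FN})^{-1} S_{FN}^t \hat{P}_{FF}, \\ Q_{F\hat{B}} &= \hat{P}_{FF} S_{FN} (I + S_{\hat{B}N}^t S_{\hat{B}N} + S_{FN}^t \hat{P}_{FF} S_{FN})^{-1} S_{\hat{B}N}^t, \\ S_{BR} &= [\xi_B] \tilde{A}_B A_R [\xi_R]^{-1}, \qquad S_{FN} = \begin{pmatrix} 0 \\ [\xi_B] \tilde{A}_B A_N [\xi_N]^{-1} \end{pmatrix}, \qquad S_{\hat{B}N} = [\xi_{\hat{B}}] \tilde{A}_{\hat{B}} A_N [\xi_N]^{-1}, \end{aligned} \tag{4.5}$$

with $\hat{P}_{FF} \in \mathbb{R}^{|F| \times |F|}$, $\hat{P}_{F^cF^c} \in \mathbb{R}^{|F^c| \times |F^c|}$, $Q_{FF} \in \mathbb{R}^{|F| \times |F|}$, $Q_{F\hat{B}} \in \mathbb{R}^{|F| \times |\hat{B}|}$, $\Delta Q_{\hat{B}\hat{B}} \in \mathbb{R}^{|\hat{B}| \times |\hat{B}|}$, $S_{FN} \in \mathbb{R}^{|F| \times |N|}$, $S_{\hat{B}N} \in \mathbb{R}^{|\hat{B}| \times |N|}$ and $S_{BR} \in \mathbb{R}^{|B| \times |R|}$. $\hat{P}_{FF}(x)$ and $\hat{P}_{F^cF^c}(x)$ are projection matrices.

[]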
Lemma 4.2 [18, Lemma 4.2]. Let $F$ be an index set, and choose a pair of basis index sets $(B(F), \hat{B}(F))$ associated with $F$. We take the slack coordinate $(\xi_{B(F)}, \xi_{\hat{B}(F)})$. Then the norms of the matrices $S_{FN}$, $Q_{FF}$, $Q_{F\hat{B}}$, $Q_{F\hat{B}} S_{\hat{B}N}$ and $(I \;\; S_{\hat{B}N})^t \Delta Q_{\hat{B}\hat{B}} (I \;\; S_{\hat{B}N})$ appearing in Lemma 4.1 are bounded as follows:

$$\|S_{FN}\| = \left\| \begin{pmatrix} 0 \\ [\xi_B] \tilde{A}_B A_N [\xi_N]^{-1} \end{pmatrix} \right\| \le \|\tilde{A}_B A_N\| \, \Phi_F(x), \tag{4.6}$$

$$\|Q_{FF}\| = \|\hat{P}_{FF} S_{FN} (I + S_{\hat{B}N}^t S_{\hat{B}N} + S_{FN}^t \hat{P}_{FF} S_{FN})^{-1} S_{FN}^t \hat{P}_{FF}\| \le \|S_{FN}\|^2 = O(\{\Phi_F(x)\}^2), \tag{4.7}$$

$$\|Q_{F\hat{B}}\| = \|\hat{P}_{FF} S_{FN} (I + S_{\hat{B}N}^t S_{\hat{B}N} + S_{FN}^t \hat{P}_{FF} S_{FN})^{-1} S_{\hat{B}N}^t\| \le [\ldots]$$

[...]

If $F$ is chosen to be $E(\mathcal{F})$, the index set consisting of the indices corresponding to the "always-active constraints" of the face $\mathcal{F}$, the system $A_F^t x - b_F \ge 0$ becomes homogeneous. Then we have the following lemma, which shows that the associated projection matrix $\hat{P}_{FF}$ is essentially the same as the projection matrix appearing in the conical formulations [2, 8, 17, 23] of the Karmarkar method.

Lemma 4.4 [18, Lemma 4.3]. Let $\mathcal{F}$ be a face of $\mathcal{P}$, and apply Lemma 4.1 with $F := E(\mathcal{F})$. Then the projection matrix $\hat{P}_{FF}(x)$ appearing in (4.4) has the following property:

$$\hat{P}_{FF}(x)\mathbf{1}_F = \mathbf{1}_F \quad (\text{or, equivalently, } \hat{P}_{E(\mathcal{F})E(\mathcal{F})}\mathbf{1}_{E(\mathcal{F})} = \mathbf{1}_{E(\mathcal{F})}). \tag{4.12}$$

[]

In this case, the two (asymptotically) dominant components $\hat{P}_{FF}(x)$ and $\hat{P}_{F^cF^c}(x)$ of $P(x)$ (as $\Phi_F(x) \to 0$) may be regarded as "the Karmarkar (projective scaling) part" and "the affine scaling part", respectively.
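The property (4.12) can be observed numerically in the simplest situation, namely a homogeneous system in which every constraint is always active on the face (here the vertex at the origin), so that $F = \{1, \ldots, m\}$, $F^c$ is empty and $\hat{P}_{FF}$ coincides with $P(x)$ itself. This identification, the $n = 2$ restriction, the instances and the helper name `projection` are assumptions of the sketch below, not part of the paper. For contrast, the same computation is repeated with one inhomogeneous constraint.

```python
def projection(cols, b, x):
    """P(x) = [xi]^{-1} A^t G^{-1} A [xi]^{-1} of (4.2), for n = 2 only."""
    xi = [a[0] * x[0] + a[1] * x[1] - bi for a, bi in zip(cols, b)]
    g = [[sum(a[r] * a[s] / t ** 2 for a, t in zip(cols, xi)) for s in range(2)]
         for r in range(2)]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    ginv = [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]
    m = len(cols)
    return [[sum(cols[i][r] * ginv[r][s] * cols[j][s]
                 for r in range(2) for s in range(2)) / (xi[i] * xi[j])
             for j in range(m)] for i in range(m)]

def deviation_from_ones(P):
    """max_i |(P 1)_i - 1|, i.e. how far P 1 is from the all-ones vector."""
    return max(abs(sum(row) - 1.0) for row in P)

cols = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
x = [0.3, 1.7]

# Homogeneous cone A^t x >= 0: every constraint is active at the origin.
dev_hom = deviation_from_ones(projection(cols, [0.0, 0.0, 0.0, 0.0], x))

# Same columns, but the last constraint reads 2 x1 + x2 >= -1 (not homogeneous).
dev_inhom = deviation_from_ones(projection(cols, [0.0, 0.0, 0.0, -1.0], x))
```

In the homogeneous case $[\xi]^{-1} A^t x = \mathbf{1}$, so the all-ones vector lies in the range of the projection and `dev_hom` vanishes up to rounding; in the second case $b$ is not in the range of $A^t$, and `dev_inhom` is visibly nonzero.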
5. Convergence of the sequence
In this section we show that the sequence $\{x^{(\nu)}\}$ generated by the dual affine scaling method converges to an interior point of a dual degenerate face if the sequence $\{c^t x^{(\nu)}\}$ of the values of the objective function is bounded below by a constant. In Section 3, we referred to a face $\mathcal{H}$ of $\mathcal{P}$, including a vertex, as a "dual degenerate face" if the objective function is constant on the face. In terms of the active set $E(\mathcal{H})$ on the face, the dual degenerate face $\mathcal{H}$ is characterized as a face such that $c \in \mathrm{Im}(A_{E(\mathcal{H})})$
(Proposition 3.1). In the following we introduce the concept of "maximal dual degenerate faces" and related notation. We call a dual degenerate face which is not a face of any other dual degenerate face a "maximal dual degenerate face". Given an interior point $x$ of the feasible region, we consider a measure for the distance between the point $x$ and the nearest maximal dual degenerate face,

$$\min\{\|\xi_{E(\mathcal{H})}(x)\| \mid \mathcal{H} \text{ is a maximal dual degenerate face of } \mathcal{P}\}. \tag{5.1}$$

We choose $\mathcal{H}_{\min}$ that gives the minimum in (5.1) (if there are several faces that attain this minimum, we select one in an appropriate manner), and define $\mathcal{M}(x)$ to be $\mathcal{H}_{\min}$, which is a function of $x$. Then, we have

$$\|\xi_{E(\mathcal{M}(x))}(x)\| = \|\xi_{E(\mathcal{H}_{\min})}(x)\| = \min\{\|\xi_{E(\mathcal{H})}(x)\| \mid \mathcal{H} \text{ is a maximal dual degenerate face of } \mathcal{P}\}. \tag{5.2}$$

Our purpose in this section is to show the following lemma.

Lemma 5.1. Let $\{x^{(\nu)}\}$ be the sequence generated by the dual affine scaling method (2.1) with a step-size selection procedure such that the step-size $\mu^{(\nu)}$ is guaranteed to be bounded from below by a positive constant $\mu_{\inf}$. (As was mentioned in Section 2, we assume that $\mu^{(\nu)}$ at each step is chosen so that the next iterate remains an interior point.) If the sequence $\{c^t x^{(\nu)}\}$ of the values of the objective function has the limit value $c^\infty$ ($> -\infty$), then:

(1) The sequence $\{x^{(\nu)}\}$ converges to an interior point $x^\infty$ of a dual degenerate face $\mathcal{H}$ with

$$\frac{c^t x^{(\nu)} - c^\infty}{\|\xi_{E(\mathcal{H})}(x^{(\nu)})\|} > \eta > 0 \tag{5.3}$$

and $\|x^{(\nu)} - x^\infty\| = O(c^t x^{(\nu)} - c^\infty)$, where $\eta$ is a constant.

(2) The value $c^t x^{(\nu)}$ of the objective function converges linearly to $c^\infty$ asymptotically, where the reduction rate is less than $(1 - \mu_{\inf}/|E(\mathcal{H})|^{1/2})$.

Concerning the step-size selection procedure, in the lemma we required the interior feasibility of the next iterate and the existence of a positive lower bound for the step-size $\mu^{(\nu)}$. Note that, due to Lemma 2.1, the choice of the constant step-size $\mu^{(\nu)} = \mu$ [...]

[...] while (5.16), i.e., the leftmost-hand side of (5.17), converges to 0 as $\tau$ tends to infinity, we have $y_{\hat{B}} + \tilde{A}_{\hat{B}} A_N y_N = 0$. This implies
$$A_{\hat{B}} y_{\hat{B}} + A_{\hat{B}} \tilde{A}_{\hat{B}} A_N y_N = 0. \tag{5.18}$$

On the other hand, due to the definition of $y$, we have

$$Ay = A_X y_X + A_{\hat{B}} y_{\hat{B}} + A_N y_N = c. \tag{5.19}$$

Subtracting (5.18) from (5.19), and using the second relation of (3.5), we have

$$A_X y_X + (I - A_{\hat{B}} \tilde{A}_{\hat{B}}) A_N y_N = A_X y_X + A_B \tilde{A}_B A_N y_N = c. \tag{5.20}$$

Since $B(X) \subseteq X$, we see $c \in \mathrm{Im}(A_X)$. If $X = \emptyset$, we have $c = 0$, which contradicts Assumption 2. Thus, $X \ne \emptyset$. After all, we obtained the index set $X \ne \emptyset$ such that $c \in \mathrm{Im}(A_X)$ and $(\|s_X^{(\tau)}\| =)\ \|\xi_X(x^{(\tau)})\| / \|\xi_{M(x^{(\tau)})}(x^{(\tau)})\| \to 0$ as $\tau \to \infty$. But this contradicts the definition of $M(x)$; the contradiction comes from assuming that $\delta > 0$ in (5.6) does not exist. Hence, there must be a positive constant $\delta$ for which (5.6) is satisfied for all $\nu$.

Proof of Step 2. We show $\|\xi_{M(x^{(\nu)})}(x^{(\nu)})\| \to 0$. Since
$$c^t x^{(\nu+1)} = c^t x^{(\nu)} - \mu^{(\nu)} \{c^t G(x^{(\nu)})^{-1} c\}^{1/2} \tag{5.21}$$

and $\{c^t x^{(\nu)}\}$ is a monotone decreasing sequence bounded from below, $c^t G(x^{(\nu)})^{-1} c$ converges to 0 as $\nu \to \infty$ if the step-size $\mu^{(\nu)}$ is bounded below by the positive number $\mu_{\inf}$. Since

[...] (5.22)

we see that $\xi_{M(x^{(\nu)})}(x^{(\nu)})$ converges to 0 as $\nu \to \infty$. This completes the proof.

Proof of Step 3. We prove

[...] $> \delta' > 0$ (5.23)

for all $\nu$, where $\delta'$ is a constant. We show this by using the result of Step 2. By contradiction, we assume that such $\delta'$ does not exist. Then we can choose the subsequence $\{x^{(\nu_p)}\}$ of $\{x^{(\nu)}\}$ such that

[...] $\to 0$. (5.24)
Let $W$ be one of the index sets that appear infinitely many times in the sequence $\{M(x^{(\nu_p)})\}$, and consider the subsequence $\{x^{(\tau)}\}$ of $\{x^{(\nu_p)}\}$ such that $M(x^{(\tau)}) = W$ for each $\tau$ and each component of the sequence [...]

[...] (5.38)

holds for all $\nu$, where $\delta$ is the positive constant appearing in the statement of Lemma 5.2. On the other hand, by using Lemma 5.4, we have [...]; this shows the linear convergence of the sequence $\{c^t x^{(\nu)} - c^\infty\}$. Moreover, (5.39) tells us that the step-size $\{\mu^{(\nu)}\}$ can be bounded from above [...] for sufficiently large $\nu$. [...]

[...] holds regardless of $x^{(\nu)}$ in the special case where $\mathcal{H}$ happens to be "the origin" of a homogeneous system with $E(\mathcal{H}) = \{1, \ldots, m\}$, because we have $\Phi_{E(\mathcal{H})}(x^{(\nu)}) = 0$ by definition (recall the remark following Lemma 4.2). To prove this lemma, we provide the following two auxiliary lemmas.

Lemma 5.6. Under the assumptions of Lemma 5.1, there exists a dual degenerate face $\mathcal{Y}$ such that 0 is an accumulation point of $\{\Phi_{E(\mathcal{Y})}(x^{(\nu)})\}$.

Proof. It is enough to find a face $\mathcal{Y}$ for which we have a subsequence $\{x^{(\tau)}\}$ of $\{x^{(\nu)}\}$ such that $\Phi_{E(\mathcal{Y})}(x^{(\tau)}) \to 0$ as $\tau \to \infty$. For this purpose, let us choose a subsequence $\{x^{(\tau)}\}$ such that each component $\xi_i^{(\tau)}$ of $\xi^{(\tau)}$ converges or diverges to
infinity and $\mathcal{M}(x^{(\tau)})$ is a specific face, say, $\mathcal{G}$ for all $\tau$. Let $Z$ be the set consisting of the indices such that $\xi_i^{(\tau)} \to 0$ as $\tau \to \infty$. Since $\|\xi_{E(\mathcal{G})}^{(\tau)}\| \to 0$ as $\tau$ tends to infinity due to Lemma 5.2, we see that the set $Z$ is nonempty, $\Phi_Z(x^{(\tau)}) \to 0$ and $Z \supseteq E(\mathcal{G})$. Then, due to Lemma 3.2, there exists a face $\mathcal{Y}$ such that $E(\mathcal{Y}) = Z$, and this implies $\Phi_{E(\mathcal{Y})}(x^{(\tau)}) \to 0$ as $\tau \to \infty$. (This holds also in the case $E(\mathcal{Y}) = \{1, \ldots, m\}$.) Since $E(\mathcal{Y}) = Z \supseteq E(\mathcal{G})$, $\mathcal{Y}$ is a dual degenerate face. This completes the proof. []

Lemma 5.7. Let $\mathcal{H}$ be a dual degenerate face of $\mathcal{P}$, and let $\{x^{(\nu)}\}$ be the sequence satisfying the assumptions of Lemma 5.1. If $(0, 0)$ is an accumulation point of the sequence $\{(\Phi_{E(\mathcal{H})}(x^{(\nu)}), \Lambda_{\mathcal{H}}(x^{(\nu)}))\}$, there exists a dual degenerate face $\mathcal{V}$ such that $\mathcal{V} \supset \mathcal{H}$ (i.e., $E(\mathcal{V}) \subset E(\mathcal{H})$) and the sequence $\{\Phi_{E(\mathcal{V})}(x^{(\nu)})\}$ has zero as one of its accumulation points.

Proof. Due to the assumption, we have a subsequence $\{x^{(\tau)}\}$ such that

$$\Phi_{E(\mathcal{H})}(x^{(\tau)}) \to 0 \quad \text{and} \quad \Lambda_{\mathcal{H}}(x^{(\tau)}) \to 0 \tag{5.42}$$
as $\tau$ tends to $\infty$. By using (5.38) and the second condition of (5.42), we have
0.
(5.43)
Choose a maximal dual degenerate face ~J which appears infinitely m a n y times in the sequence {~t(x(~-))}, and consider the sequence S (v') _
~(v)
"
(5.44)
We can choose a subsequence {x %)} of {x (~,)} such that ~t(x (~)) = ~ for all ~r and each component of s (~) converges or diverges to infinity as o- tends to infinity. Note (~,) that II~ce~'+)J]-->0 as o--->oo. Denote by U the set consisting of the indices such that sl ~) converges to a finite value as o- -->oo. Then we have E ( ~ ) ~_ U and q~u (x(~')) -->0. Since IlscE(~)]]-> 0, we see [1~)1[ -->0, then, due to L e m m a 3.2, there exists the face 0// such that E ( ~ ) = U and E(~u)~ 0
(5.45)
as o - ~ . Since E ( ~ ) ~ U = E ( ~ ) , ~ is a dual degenerate face. In the following we show 0 / / = y . For the purpose, it is enough to see E ( a g ) c E ( ~ ) . Since I1~:~)11/11~(''~) I= se (%) II/11~ (%) II.]]S(~)[] we have, due to (5.43) and the definition E(~) E(~)II H E(~). , of u,
II u = I[ II
U )ll-*0
(5.46)
as r--> co. Now, by contradiction we assume E ( ° t / ) ¢ E ( ~ ) . Since (5.46) holds, we have E ( ~ ) = U # E ( Y ) , which implies E ( ~ ) # { 1 , . . . , m}. Then the assumption E(~)¢E(Y) implies the existence of an index i such that i~ U and i ~ E ( Y ) . Since ~e(~.) ~, we have ]]~)][ ~>~(~-)>~ ~(~¢)~(~)I for sufficiently large o-, which, however, is a contradiction to (5.46). Thus, we have og ~ N, which, together with (5.45), shows that ~ is the face that we wanted. []
On the basis of these lemmas, we give the proof of Lemma 5.5.

Proof of Lemma 5.5. We want to show the existence of a dual degenerate face that satisfies conditions (i) and (ii) of Lemma 5.5. By contradiction, we assume that no dual degenerate face satisfies (i) and (ii) simultaneously. Let us denote by $\Omega$ the set of the faces that satisfy (i). Due to Lemma 5.6, $\Omega$ is not empty. Since no face satisfies conditions (i) and (ii) simultaneously, for any face $\mathcal{U} \in \Omega$, the point $(0, 0)$ is an accumulation point of the sequence $\{(\Phi_{E(\mathcal{U})}(x^{(\nu)}), \Lambda_{\mathcal{U}}(x^{(\nu)}))\}$. Let $\mathcal{W}$ be a maximal element of $\Omega$ in the sense that it is not a face of another face that belongs to $\Omega$. The point $(0, 0)$ is an accumulation point of the sequence $\{(\Phi_{E(\mathcal{W})}(x^{(\nu)}), \Lambda_{\mathcal{W}}(x^{(\nu)}))\}$. Applying Lemma 5.7 to the face $\mathcal{W}$, we see that there exists a dual degenerate face $\mathcal{V}$ such that $\mathcal{W} \subset \mathcal{V}$ and $\{\Phi_{E(\mathcal{V})}(x^{(\nu)})\}$ has an accumulation point at 0. Hence $\mathcal{V}$ is a face that belongs to $\Omega$ and contains $\mathcal{W}$ as its proper subset, which contradicts the maximality of $\mathcal{W}$ in $\Omega$. This contradiction comes from the initial assumption that no face satisfies (i) and (ii) of Lemma 5.5 simultaneously. Thus, there exists a dual degenerate face that satisfies (i) and (ii). This completes the proof. []
Now we are ready to prove Lemma 5.1, the main result of this section.

Proof of Lemma 5.1. Due to Lemma 5.5, there exists a dual degenerate face $\mathcal{H}$ such that

(i) $\{\Phi_{E(\mathcal{H})}(x^{(\nu)})\}$ has 0 as its accumulation point;

(ii) there exist positive constants $\eta$ and $\varepsilon$ such that

[...] (5.47)

Proof of the property (1). We show that the sequence $\{x^{(\nu)}\}$ converges to an interior point of $\mathcal{H}$. Let us take a pair of basis index sets $(B(E(\mathcal{H})), \hat{B}(E(\mathcal{H})))$ to determine the slack coordinate $(\xi_{B(E(\mathcal{H}))}, \xi_{\hat{B}(E(\mathcal{H}))})$ associated with $E(\mathcal{H})$, and choose $y$ such that

[...] (5.48)

We put $\alpha = [\xi]y$. The iteration of the dual affine scaling method is written as in (4.1) in the space of slack variables. More precisely, by applying Lemma 4.1 with $F := E(\mathcal{H})$ and taking account of $y_N = 0$, we see that the iteration of the dual affine scaling method in the space of slack variables with the step-size $\mu^{(\nu)}$ at the $\nu$th iteration is written as follows:

[...] (5.49)

where

[...] (5.50)
First we observe that property (1) holds in the special case where $E(\mathcal{H}) = \{1, \ldots, m\}$. In this case, from (i), (ii) and the linear convergence of $\{c^t x^{(\nu)} - c^\infty\}$, we see that the sequence converges to the unique vertex $x^\infty$ which is the solution of $\xi(x) = \xi_{E(\mathcal{H})}(x) = 0$, satisfying condition (5.3). (N.B. As was mentioned in the remark following Lemma 5.5, the inequality $\Lambda_{\mathcal{H}}(x^{(\nu)}) \ge \eta > 0$ always holds.) Since $\|x^{(\nu)} - x^\infty\| = O(\|\xi^{(\nu)}\|)$, the order of the convergence $\|x^{(\nu)} - x^\infty\| =$ [...]

[...] (5.56)

then [...] is bounded from above, by using (5.39), (5.54), (5.55) and the remark following (4.3), as follows:

[...]

[...] (6.5)

which suggests that $f_{\mathcal{H}}(x)$ can be a significant indicator to observe the value of (6.2). In fact, roughly speaking, we can show that "$f_{\mathcal{H}}(x)$ can be reduced by a constant in the vicinity of $\mathcal{H}$ per iteration by the iteration (2.1) with an appropriate choice of step-sizes if $\mathcal{H}$ is not the whole set of the optimal solutions of (D)". This fact, which was proved in [18] as Lemma 5.1 and will be stated formally as Lemma 6.1 later, is the core of our approach. Since it has already been proved in [18], we do not repeat the proof of the lemma here. Instead, we outline the underlying idea of the proof.

Let $\hat{x}$ be a point on the face $\mathcal{H}$. Putting $z = x - \hat{x}$, we consider the following homogeneous linear programming problem:

$$\text{minimize } c^t z \quad \text{subject to } A_{E(\mathcal{H})}^t z \ge 0. \tag{6.6}$$
This linear programming problem can be regarded as the problem obtained by removing the constraints $\xi_i(x) \ge 0$ ($i \notin E(\mathcal{H})$) from (D), where its associated Karmarkar potential function is given by $f_{\mathcal{H}}(x)$. Several researchers (see, e.g., [5, 8, 17]) pointed out the equivalence between Karmarkar's method and the affine scaling method applied to homogeneous linear programming problems. This equivalence implies that the local Karmarkar potential function $f_{\mathcal{H}}(x)$ can be reduced by a constant by simply applying the affine scaling method for (6.6) with an appropriate choice of step-sizes, provided that $\mathcal{H}$ does not contain the whole set of the optimal solutions of (D).

Unfortunately, since we want to show the reduction of $f_{\mathcal{H}}(x)$ by the iteration (2.1) of "the affine scaling method for (D)" while this property holds for "the affine scaling method for the simplified problem (6.6)", the observation above cannot be directly used to analyze the behavior of the affine scaling method for (D). However, intuitively, if the current iterate $x$ is a point sufficiently close to the face $\mathcal{H}$ in the sense that "$\xi_i(x) \gg \|\xi_{E(\mathcal{H})}(x)\|$ holds for all $i \notin E(\mathcal{H})$", we may neglect the influence of the constraints $\xi_i$ such that $i \notin E(\mathcal{H})$, which are located far away compared with the constraints $\xi_{E(\mathcal{H})}$. Then it is plausible that applying the affine scaling method for (D) at the point $x$ is almost equivalent to applying the affine scaling method for the simplified problem (6.6), so that one iteration of the dual affine scaling method (2.1) for (D) at $x$ is also expected to reduce the local Karmarkar potential function $f_{\mathcal{H}}(x)$ when an appropriate step-size is taken. This is the outline of our idea to relate the iteration of the dual affine scaling method to the reduction of the local Karmarkar potential function.

Lemma 6.1 [18, Lemma 5.1]. Let (D) be the linear programming problem satisfying Assumptions 1 and 2, and let $\mathcal{H}$ be a dual degenerate face which does not contain the whole set of optimal solution(s) of (D), where the value of the objective function is $c_0$. (In the case where the optimal solution does not exist, we regard all the dual degenerate faces as satisfying this condition.) Let $x$ be an interior point such that $c^t x - c_0 > 0$, and denote by $x^+$ the new iterate obtained by performing one iteration of the dual affine scaling method (2.1) with the step-size $\mu$. Then, if (i) $\Phi_{E(\mathcal{H})}(x)$ is sufficiently small, (ii) 0