Working Paper

Free-Steering Relaxation Methods for Problems with Strictly Convex Costs and Linear Constraints

Krzysztof C. Kiwiel

WP-94-089 September 1994

IIASA
International Institute for Applied Systems Analysis
A-2361 Laxenburg, Austria
Telephone: 43 2236 807    Fax: 43 2236 71313    E-Mail: [email protected]


Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute, its National Member Organizations, or other organizations supporting the work.




Abstract

We consider dual coordinate ascent methods for minimizing a strictly convex (possibly nondifferentiable) function subject to linear constraints. Such methods are useful in large-scale applications (e.g., entropy maximization, quadratic programming, network flows), because they are simple, can exploit sparsity and in certain cases are highly parallelizable. We establish their global convergence under weak conditions and a free-steering order of relaxation. Previous comparable results were restricted to special problems with separable costs and equality constraints. Our convergence framework unifies to a certain extent the approaches of Bregman, Censor and Lent, De Pierro and Iusem, and Luo and Tseng, and complements that of Bertsekas and Tseng.

Key words. Convex programming, entropy maximization, nondifferentiable optimization, relaxation methods, dual coordinate ascent, B-functions.

1 Introduction

We study algorithms for the following convex programming problem

    minimize  f(x),      (1.1a)
    subject to  Ax ≤ b,      (1.1b)

where f : R^n → (−∞, ∞] is a (possibly nondifferentiable) strictly convex function that has some properties of differentiable Bregman functions [Bre67, CeL81] (cf. §2), A is a given m × n matrix and b is a given m-vector. (Equality constraints are discussed later.) This problem arises in many applications, e.g., linear programming [Bre67, Erl81, Man84], quadratic programming [Hil57, LeC80, LiP87], image reconstruction [Cen81, Cen88, CeH87, Elf89, HeL78, ZeC91b], matrix balancing [CeZ91, Elf80, Kru37, LaS81], "x log x" entropy optimization [DaR72, CDPE+90, Elf80], "log x" entropy optimization [CDPI91, CeL87], and network flow programming [BHT87, BeT89, NiZ92, NiZ93a, Roc84,

Research supported by the State Committee for Scientific Research under Grant 8S50502206. Systems Research Institute, Newelska 6, 01-447 Warsaw, Poland ([email protected])

ZeC91a, Zen91]. Further references can be found, e.g., in [LuT92b, LuT92c, Tse90, Tse91, TsB91].

The usual dual problem of (1.1) consists in maximizing a concave differentiable dual functional subject to nonnegativity constraints. This motivates coordinate ascent methods for solving the dual problem which, at each iteration, increase the dual functional by adjusting one coordinate of the dual vector. Such methods are simple, use little storage and can exploit problem sparsity. They are among the most popular (and sometimes the only practical) methods for large-scale optimization. Also such methods may be used as subroutines in the proximal minimization algorithms with D-functions [CeZ92, Eck93, Teb92], giving rise to massively parallel methods for problems with huge numbers of variables and constraints [CeZ91, NiZ92, NiZ93a, NiZ93b, Zen91, ZeC91a, ZeC91b]. Other examples include methods for specific problems quoted above, and methods for more general problems [Bre67, CeL81, DPI86, LuT92b, LuT92c, Tse90, Tse91, TsB87, TsB91].

At least three general approaches to convergence analysis of such methods can be distinguished. Because different assumptions on the problem are employed, each approach covers many applications, but not all. First, the approach based on Bregman functions [Bre67, CeL81, DPI86] imposes some smoothness assumptions on f and so-called zone consistency conditions that may be difficult to ensure. Second, the approach of [LuT92b, Tse91] assumes that f is essentially smooth. (Our terminology follows [Roc70]; see below for a review.) Third, the approach of [Tse90, TsB87, TsB91] requires that f be cofinite.

Usually it is assumed that the relaxed coordinates are chosen in an almost (essentially) cyclic order [CeL81, DPI86, LuT92b, Tse90, Tse91, TsB87, TsB91] (i.e., each coordinate is chosen at least once every i_cyc iterations, for some fixed i_cyc ≥ m), by a Gauss-Southwell max-type rule [Bre67, LuT92b, Tse90, Tse91, TsB91], or, for strongly convex costs only, in a quasi-cyclic order [Tse90, TsB87, TsB91] (in which the lengths of the cycles, i.e., i_cyc, are allowed to grow, but not too fast). Convergence under the weakest assumption of free-steering relaxation (in which each coordinate is merely chosen infinitely many times) has so far been established only for network flow problems with separable costs and equality constraints [BHT87], [BeT89, §5.5] and for special cases of iterative scaling [BCP93].

In this paper we establish global convergence of a general dual ascent method under free-steering relaxation (for both equality and inequality constraints), weak assumptions on (1.1) and inexact line searches. Our assumptions on problem (1.1) (cf. §2) are weaker than those of [Bre67, CeL81, DPI86] and [LuT92b, Tse91]; thus we generalize those approaches. We show that inexact line searches are implementable because the dual functional, being essentially smooth, may act like a barrier to keep iterates within the region where it is differentiable. In particular, our results imply global convergence under free-steering relaxation of Hildreth's method [Hil57] for quadratic programming. We note that for the related problem of finding a point in the intersection of a finite family of closed convex sets, convergence of "inexact" free-steering versions of the successive projection method [GPR67] has been established quite recently [ABC83, Kiw94, FlZ90]; see [Ott88, Tse92] for results under "exact" projections.

Attempting to capture objective features essential to optimization, we introduce the class of B-functions (cf. Definition 2.1) which generalizes that of Bregman functions [CeL81] and covers more applications. The usefulness of our B-functions is not limited to linearly constrained minimization; this will be shown elsewhere. We concentrate on global convergence under general conditions, whereas the recent results on linear rate of convergence of relaxation methods [Ius91, LuT91, LuT92a, LuT92b, LuT92c, LuT93] require additional regularity assumptions.

The paper is organized as follows. In §2 we introduce the class of B-functions, highlight some of its properties and present our method. Its global convergence under free-steering relaxation control is established in §3. In §4 we relate Bregman projections [CeL81] to exact line searches and give conditions for overrelaxation that supplement those in [DPI86, Tse90]. Convergence under conditions similar to those in [LuT92b, Tse91] and under another regularity condition is established in §§5 and 6 respectively. Some additional remarks are given in §7. In §8 we discuss block coordinate relaxation. The Appendix contains proofs of certain technical results.

Our notation and terminology mostly follow [Roc70]. ⟨·,·⟩ and |·| are the Euclidean inner product and norm respectively, a^i is column i of the transpose A^T of A, b_i is component i of b, R^m_+ and R^m_> are the nonnegative and positive orthants of R^m respectively, [·]_+ denotes the orthogonal projection onto R^m_+, i.e., ([p]_+)_i = max{p_i, 0} for p ∈ R^m and i = 1:m, where 1:m denotes 1, 2, ..., m, and e^i is the ith coordinate vector in R^m. For any set C in R^n, cl C, int C, ri C and bd C denote the closure, interior, relative interior and boundary of C respectively. C*(ξ) = sup_{x∈C} ⟨ξ, x⟩ is the support function of C. For any closed proper convex function f on R^n and x in its effective domain C_f = {x : f(x) < ∞}, ∂f(x) denotes the subdifferential of f at x and f'(x; d) = lim_{t↓0} [f(x + td) − f(x)]/t denotes the derivative of f in any direction d ∈ R^n. By [Roc70, Thms 23.1-23.2],

    f'(x; d) ≥ −f'(x; −d)  and  f'(x; d) ≥ (∂f(x))*(d) = sup{⟨g, d⟩ : g ∈ ∂f(x)}.      (1.2)

The domain and range of ∂f are denoted by C_∂f and im ∂f respectively. By [Roc70, Thm 23.4], ri C_f ⊂ C_∂f ⊂ C_f. f is differentiable at x iff ∂f(x) = {∇f(x)}, where ∇f is the gradient of f [Roc70, Thm 25.1]. f is called essentially strictly convex if f is strictly convex on every convex subset of C_∂f. f is called cofinite when its conjugate f*(ξ) = sup_x {⟨ξ, x⟩ − f(x)} is real-valued. A proper convex function f is called essentially smooth if int C_f ≠ ∅, f is differentiable on int C_f, and f'(x + t(y − x); y − x) ↓ −∞ as t ↓ 0 for any y ∈ int C_f and x ∈ bd C_f (equivalently |∇f(x^k)| → ∞ if {x^k} ⊂ int C_f, x^k → x ∈ bd C_f [Roc70, Lem. 26.2]); then f'(y + t(x − y); x − y) ↑ ∞ as t ↑ 1 (cf. f'(·; d) ≥ −f'(·; −d) for all d).

2 B-functions and the algorithm

We first define our B-functions and Bregman functions [CeL81]. For any convex function f on R^n, we define its difference functions

    D_f♭(x, y) = f(x) − f(y) − (∂f(y))*(x − y)  for all x, y ∈ C_f,      (2.1a)
    D_f♯(x, y) = f(x) − f(y) + (∂f(y))*(y − x)  for all x, y ∈ C_f.      (2.1b)

By convexity (cf. (1.2)), f(x) ≥ f(y) + (∂f(y))*(x − y) and

    0 ≤ D_f♭(x, y) ≤ f(x) − f(y) − ⟨g, x − y⟩ ≤ D_f♯(x, y)  for all x, y ∈ C_f, g ∈ ∂f(y).      (2.2)

D_f♭ and D_f♯ generalize the usual D-function of f [Bre67, CeL81], defined by

    D_f(x, y) = f(x) − f(y) − ⟨∇f(y), x − y⟩  for all x ∈ C_f, y ∈ C_∇f,      (2.3)

since

    D_f(x, y) = D_f♭(x, y) = D_f♯(x, y)  for all x ∈ C_f, y ∈ C_∇f.      (2.4)

Definition 2.1. A closed proper (possibly nondifferentiable) convex function f is called a B-function (generalized Bregman function) if
(a) f is strictly convex on C_f.
(b) f is continuous on C_f.
(c) For every α ∈ R and x ∈ C_f, the set L¹_f(x, α) = {y ∈ C_∂f : D_f♭(x, y) ≤ α} is bounded.
(d) For every α ∈ R and x ∈ C_f, if {y^k} ⊂ L¹_f(x, α) is a convergent sequence with limit y ∈ C_f \ {x}, then D_f♯(y, y^k) → 0.

Definition 2.2. Let S be a nonempty open convex set in R^n. Then h : cl S → R is called a Bregman function with zone S, denoted by h ∈ B(S), if
(i) h is continuously differentiable on S.
(ii) h is strictly convex on cl S.
(iii) h is continuous on cl S.
(iv) For every α ∈ R, ỹ ∈ S and x̃ ∈ cl S, the sets L²_h(ỹ, α) = {x ∈ cl S : D_h(x, ỹ) ≤ α} and L³_h(x̃, α) = {y ∈ S : D_h(x̃, y) ≤ α} are bounded.
(v) If {y^k} ⊂ S is a convergent sequence with limit y, then D_h(y, y^k) → 0.
(vi) If {y^k} ⊂ S converges to y, {x^k} ⊂ cl S is bounded and D_h(x^k, y^k) → 0, then x^k → y.

(Note that the extension f of h to R^n, defined by f(x) = h(x) if x ∈ cl S, f(x) = ∞ otherwise, is a B-function with C_f = cl S, ri C_f = S and D_f♭(·, y) = D_f♯(·, y) = D_f(·, y) for all y ∈ S.)

D_f♭ and D_f♯ are used like distances, because for x, y ∈ C_f, 0 ≤ D_f♭(x, y) ≤ D_f♯(x, y), and D_f♭(x, y) = 0 ⟺ D_f♯(x, y) = 0 ⟺ x = y by strict convexity. Definition 2.2 (due to [CeL81]), which requires that h be finite-valued on cl S, does not cover Burg's entropy [CDPI91]. Our Definition 2.1 captures features of f essential for algorithmic purposes. We show in §7 that condition (b) implies (c) if f is cofinite. Sometimes one may verify the following stronger version of condition (d)

    C_∂f ⊃ {y^k} → y ∈ C_f  ⟹  D_f♯(y, y^k) → 0      (2.5)

by using the following lemmas. Their proofs are given in the Appendix.

Lemma 2.3. (a) Let f be a closed proper convex function on R^n, and let S ≠ ∅ be a compact subset of ri C_f. Then there exists β ∈ R such that |(∂f(y))*(x − z)| ≤ β|x − z|, |f(x) − f(y)| ≤ β|x − y| and |D_f♯(x, y)| ≤ 2β|x − y| for all x, y, z ∈ S.
(b) Let h = δ_S, where δ_S is the indicator function of a convex polyhedral set S ≠ ∅ in R^n, i.e., δ_S(x) = 0 if x ∈ S, δ_S(x) = ∞ if x ∉ S. Then h satisfies condition (2.5).
(c) Let h be a proper polyhedral convex function on R^n. Then h satisfies condition (2.5).
(d) Let f be a closed proper convex function on R. Then f is continuous on C_f, and D_f♯(y, y^k) → 0 if y^k → y ∈ C_f, {y^k} ⊂ C_f.

Lemma 2.4. (a) Let f = Σ_{i=1}^k f_i, where f_1, ..., f_k are closed proper convex functions such that f_{j+1}, ..., f_k (j ≥ 0) are polyhedral and (∩_{i=1}^j ri C_{f_i}) ∩ (∩_{i=j+1}^k C_{f_i}) ≠ ∅. If f_1 satisfies condition (c) of Definition 2.1, then so does f. If f_1, ..., f_j satisfy condition (d) of Definition 2.1 or (2.5), then so does f. If f_1 is a B-function, and f_2, ..., f_j are continuous on C_f = ∩_{i=1}^k C_{f_i} and satisfy condition (d) of Definition 2.1, then f is a B-function. In particular, f is a B-function if so are f_1, ..., f_j.
(b) Let f_1, ..., f_j be B-functions such that ∩_{i=1}^j ri C_{f_i} ≠ ∅. Then f = max_{i=1:j} f_i is a B-function.
(c) Let f_1 be a B-function and let f_2 be a closed proper convex function such that C_{f_1} ⊂ ri C_{f_2}. Then f = f_1 + f_2 is a B-function.
(d) Let f_1, ..., f_n be closed proper strictly convex functions on R such that L¹_{f_i}(t, α) is bounded for any t, α ∈ R, i = 1:n. Then f(x) = Σ_{i=1}^n f_i(x_i) is a B-function.

Lemma 2.5. Let h be a proper convex function on R. Then L¹_h(x, α) is bounded for each x ∈ C_h and α ∈ R iff C_{h*} = int C_{h*}.

Examples 2.6. Let φ : R → (−∞, ∞] and f(x) = Σ_{i=1}^n φ(x_i). In each of the examples, it can be verified that f is an essentially smooth B-function.
1 [Eck93]. φ(t) = |t|^α/α for t ∈ R and α > 1, i.e., f(x) = ||x||_α^α/α. Then f*(ξ) = ||ξ||_β^β/β with 1/α + 1/β = 1 [Roc70, p. 106]. For α = 2, f(x) = |x|²/2 and D_f(x, y) = |x − y|²/2.
2. φ(t) = −t^α/α if t ≥ 0 and α ∈ (0, 1), φ(t) = ∞ if t < 0, i.e., f(x) = −||x||_α^α/α if x ≥ 0. Then f*(y) = −||y||_β^β/β if y < 0 with 1/α + 1/β = 1, f*(y) = ∞ if y ≮ 0 [Roc70, p. 106].
3 (`x log x' entropy) [Bre67]. φ(t) = t ln t if t ≥ 0 (0 ln 0 = 0), φ(t) = ∞ if t < 0. Then f*(y) = Σ_{i=1}^n exp(y_i − 1) [Roc70, p. 105] and D_f(x, y) = Σ_{i=1}^n x_i ln(x_i/y_i) + y_i − x_i (the Kullback-Leibler entropy).
4 [Teb92]. φ(t) = t ln t − t if t ≥ 0, φ(t) = ∞ if t < 0. Then f*(y) = Σ_{i=1}^n exp(y_i) [Roc70, p. 105] and D_f is the Kullback-Leibler entropy.
5 [Teb92]. φ(t) = −(1 − t²)^{1/2} if t ∈ [−1, 1], φ(t) = ∞ otherwise. Then f*(y) = Σ_{i=1}^n (1 + y_i²)^{1/2} [Roc70, p. 106] and D_f(x, y) = Σ_{i=1}^n [(1 − x_i y_i)/(1 − y_i²)^{1/2} − (1 − x_i²)^{1/2}] on [−1, 1]^n × (−1, 1)^n.
6 (Burg's entropy) [CDPI91]. φ(t) = −ln t if t > 0, φ(t) = ∞ if t ≤ 0. Then f*(y) = −n − Σ_{i=1}^n ln(−y_i) if y < 0, f*(y) = ∞ if y ≮ 0, and D_f(x, y) = −Σ_{i=1}^n {ln(x_i/y_i) − x_i/y_i} − n.
7 [Teb92]. φ(t) = (αt − t^α)/(1 − α) if t ≥ 0 and α ∈ (0, 1), φ(t) = ∞ if t < 0. Then f*(y) = Σ_{i=1}^n (1 − y_i/β)^{−β} for y ∈ C_{f*} = (−∞, β)^n, where β = α/(1 − α). For α = 1/2, D_f(x, y) = Σ_{i=1}^n (x_i^{1/2} − y_i^{1/2})²/y_i^{1/2}.

Examples 2.7. In both examples, it can be verified that f is a cofinite B-function.
1. f(x) = Σ_{i=1}^n x_i ln x_i if x ≥ 0 and Σ_{i=1}^n x_i = 1, f(x) = ∞ otherwise (cf. Lemmas 2.3(b) and 2.4(a)). Then f*(y) = ln(Σ_{i=1}^n exp(y_i)) [Roc70, pp. 148-149].
2. f(x) = −(α² − |x|²)^{1/2} if |x| ≤ α, α ≥ 0, f(x) = ∞ if |x| > α. (Here (2.5) fails if n > 1 and α > 0.) Then f*(y) = α(1 + |y|²)^{1/2} [Roc70, p. 106].

We make the following standing assumptions about problem (1.1).

Assumption 2.8. (i) f is a (possibly nonsmooth) B-function. (ii) The feasible set X = {x ∈ C_f : Ax ≤ b} of (1.1) is nonempty. (iii) (−A^T P) ∩ im ∂f ≠ ∅, where P = R^m_+.

This assumption is required in [Bre67, CeL81, DPI86], where the condition (−A^T P) ∩ im ∂f ≠ ∅ is only used to start algorithms. We now exhibit important implications of this condition for the usual dual problem of (1.1) (missing in [Bre67, CeL81, DPI86]). The dual problem, obtained by assigning a multiplier vector p to the constraints Ax ≤ b, is

    maximize  q(p),      (2.6a)
    subject to  p ≥ 0,      (2.6b)

where q : R^m → [−∞, ∞) is the concave dual functional given by

    q(p) = inf_x {f(x) + ⟨p, Ax − b⟩} = −f*(−A^T p) − ⟨p, b⟩.      (2.7)

The dual problem (2.6) is a concave program with simple bounds. Weak duality means sup_{p∈P} q(p) ≤ inf_{x∈X} f(x). The following lemma is proven in the Appendix.

Lemma 2.9. Let f be a closed proper essentially strictly convex function, (−A^T)⁻¹ im ∂f ≠ ∅ and C_q = {p : q(p) > −∞}. Then q is closed proper concave and continuously differentiable on

    int C_q = {p : −A^T p ∈ im ∂f} = {p : Argmin_x [f(x) + ⟨p, Ax⟩] ≠ ∅},  im ∂f = int C_{f*},

and ∇q(p) = Ax(p) − b for any p ∈ int C_q, where

    x(p) = ∇f*(−A^T p) = argmin_x {f(x) + ⟨p, Ax⟩} = (∂f)⁻¹(−A^T p)      (2.8)

is continuous on int C_q. Further, −q is essentially smooth, so that q'(p̄ + t(p − p̄); p − p̄) ↓ −∞ as t ↑ 1 for any p̄ ∈ int C_q and p ∈ bd C_q.

The first assertion of Lemma 2.9 is well known (cf. [Fal67]). The final assertion will be used to keep our algorithm within int C_q, where q is smooth. For each p ∈ int C_q ∩ P, we let

    r(p) = ∇q(p) = Ax(p) − b.      (2.9)

Note that x(p̄) and p̄ solve (1.1) and (2.6) if

    p̄ = [p̄ + r(p̄)]_+.      (2.10)

Indeed, then r(p̄) ≤ 0, p̄ ≥ 0, ⟨p̄, r(p̄)⟩ = 0 and −A^T p̄ ∈ ∂f(x(p̄)) by (2.8).

In the kth iteration of our method for solving (2.6), given p^k ∈ int C_q ∩ P, a coordinate i_k such that r_{i_k}(p^k) > 0 (or < 0) is chosen and p_{i_k} is increased (or decreased, respectively) to increase the value of q, using the fact that r_{i_k}(p) is continuous around p^k and nonincreasing in p_{i_k}, since q is concave. We let (cf. (2.8) and (2.1)-(2.2))

    x^k = x(p^k),  g^k := −A^T p^k ∈ ∂f(x^k),      (2.11)
    D_f^k(x, x^k) = f(x) − f(x^k) − ⟨g^k, x − x^k⟩  for all x.      (2.12)

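As an illustration of (2.7)-(2.10), the following sketch evaluates x(p), r(p) and the fixed-point optimality test on a tiny hypothetical instance (our own choice, not from the paper): the entropy objective of Example 2.6.3 with a single inequality ⟨a, x⟩ ≤ 1/2, a = (1, 1).

```python
import numpy as np

# Hypothetical instance: f(x) = sum_i x_i ln x_i (Example 2.6.3) with one
# inequality <a, x> <= 1/2, a = (1, 1).  Then f*(y) = sum_i exp(y_i - 1),
# so x(p) = grad f*(-A^T p) = exp(-A^T p - 1) componentwise.
A = np.array([[1.0, 1.0]])
b = np.array([0.5])

def x_of_p(p):                  # x(p), cf. (2.8)
    return np.exp(-A.T @ p - 1.0)

def q(p):                       # dual functional (2.7): -f*(-A^T p) - <p, b>
    return -np.sum(np.exp(-A.T @ p - 1.0)) - p @ b

def r(p):                       # dual gradient (2.9): r(p) = A x(p) - b
    return A @ x_of_p(p) - b

# Here r(p) = 2 exp(-p - 1) - 1/2, which vanishes at p* = ln 4 - 1 > 0.
p_star = np.array([np.log(4.0) - 1.0])

assert np.allclose(r(p_star), 0.0)                               # stationarity
assert np.allclose(np.maximum(p_star + r(p_star), 0.0), p_star)  # test (2.10)
assert q(p_star) > q(np.zeros(1))                                # dual ascent
assert np.allclose(A @ x_of_p(p_star), b)                        # constraint active
```

Since the unconstrained dual maximizer is positive here, the fixed point (2.10) reduces to r(p̄) = 0, and the primal point x(p̄) is feasible with the constraint active.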

Algorithm 2.10.
Step 0 (Initiation). Select an initial p¹ ∈ P ∩ int C_q, relaxation bounds ω_min ∈ (0, 1) and ω_max ∈ [1, 2) and a relaxation tolerance δ_D ∈ (0, 1]. Set x¹ = x(p¹) by (2.8). Set k = 1.
Step 1 (Coordinate selection). Choose i_k ∈ {1:m}.
Step 2 (Direction finding). Find the derivative q_k'(0) of the reduced objective

    q_k(t) = q(p^k(t))  with  p^k(t) = p^k + t e^{i_k}  for all t ∈ R.      (2.13)

Step 3 (Trivial step). If q_k'(0) = 0, or q_k'(0) < 0 and p^k_{i_k} = 0, set t_k = 0 and go to Step 5.
Step 4 (Linesearch). Find t_k ≥ −p^k_{i_k} such that p^k(t_k) ∈ int C_q and
(i) if q_k'(0) > 0 then ω_k ∈ [ω_min, ω_max];
(ii) if q_k'(0) < 0 then either ω_k ∈ [ω_min, ω_max], or ω_k ∈ [0, ω_min) and t_k = −p^k_{i_k};
and

    q(p^k(t_k)) − q(p^k) ≥ δ_D D_f^k(x(p^k(t_k)), x^k)  if ω_k > 1,      (2.14)

where

    ω_k = [q_k'(0) − q_k'(t_k)]/q_k'(0).      (2.15)

Step 5 (Dual step). Set p^{k+1} = p^k(t_k), x^{k+1} = x(p^{k+1}), increase k by 1 and go to Step 1.

A few remarks on the algorithm are in order. Step 0 is well defined, since P ∩ int C_q ≠ ∅ by Assumption 2.8 and Lemma 2.9. Suppose p^k ∈ P ∩ int C_q at Step 1. By (2.13) and Lemma 2.9, q_k'(·) = r_{i_k}(p^k(·)) is continuous and nonincreasing on the nonempty open interval T' = {t : p^k(t) ∈ int C_q}. Step 4 chooses p^k(t_k) ≥ 0, since −p^k_{i_k} = inf{t : p^k(t) ≥ 0}. To see that Step 4 is well defined, suppose q_k'(0) > 0 (the case of q_k'(0) < 0 is similar). Let t'_k = sup_{t∈T'} t and T'' = {t ∈ T' : 0 ≤ q_k'(t) ≤ (1 − ω_min) q_k'(0)}. It suffices to show that T'' is a nontrivial interval. If t'_k is finite then q_k'(t) ↓ −∞ as t ↑ t'_k by Lemma 2.9, whereas if t'_k = ∞ then, if we had q_k'(t) > ε for some fixed ε ∈ (0, (1 − ω_min) q_k'(0)] and all t ≥ 0, q(p^k(t)) = q(p^k) + ∫₀ᵗ q_k'(τ) dτ → ∞ as t → ∞ would contradict the weak duality relation sup_P q ≤ inf_X f. Hence, using the continuity and monotonicity of q_k' on T', the required t_k can be found, e.g., via bisection. Note that t_k q_k'(t_k) ≥ 0 iff ω_k ≤ 1 (q_k' is monotone), where ω_k = 1 if t_k = 0. To sum up, by induction we have for all k

    p^k ∈ P ∩ int C_q,      (2.16)
    t_k q_k'(t_k) ≥ 0  if  ω_k ≤ 1.      (2.17)

We make the following standing assumption on the order of relaxation.

Assumption 2.11. Every element of {1:m} appears in {i_k} infinitely many times.
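For f(x) = |x|²/2, Algorithm 2.10 with exact line searches reduces to Hildreth's method [Hil57]: x(p) = −A^T p and the coordinate maximizer of q_k has a closed form. A minimal sketch on a hypothetical instance of our own (projecting the origin onto {x : x₁ + x₂ ≥ 1, x₁ ≤ 0.3}, rewritten in the "Ax ≤ b" form of (1.1)), with a randomized coordinate order as one admissible free-steering rule:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hildreth's method [Hil57] as an instance of Algorithm 2.10: for
# f(x) = |x|^2/2 we have x(p) = -A^T p, r_i(p) = <a^i, x(p)> - b_i, and the
# exact coordinate line search is t = max(r_i(p)/|a^i|^2, -p_i).
# Hypothetical instance: project 0 onto {x : x1 + x2 >= 1, x1 <= 0.3},
# written as Ax <= b.  A randomized order is free-steering: each index
# recurs infinitely often with probability one.
A = np.array([[-1.0, -1.0],
              [ 1.0,  0.0]])
b = np.array([-1.0, 0.3])

p = np.zeros(2)                              # p^1 in P; here C_q = R^m
for k in range(500):
    i = rng.integers(2)                      # free-steering coordinate choice
    x = -A.T @ p                             # x(p), cf. (2.8)
    step = (A[i] @ x - b[i]) / (A[i] @ A[i])  # r_i(p) / |a^i|^2
    p[i] = max(p[i] + step, 0.0)             # exact line search, clamped at 0

x = -A.T @ p
assert np.allclose(x, [0.3, 0.7], atol=1e-6)   # primal solution of (1.1)
assert np.allclose(p, [0.7, 0.4], atol=1e-6)   # dual solution of (2.6)
```

Each update zeroes the chosen dual partial derivative (or truncates at p_i = 0), so ω_k = 1 and conditions (2.14)-(2.17) hold trivially; the randomized order satisfies Assumption 2.11 almost surely.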

3 Convergence

We shall show that {x^k} converges to the solution of (1.1). Because the proof of convergence is quite complex, it is broken into a series of lemmas. We shall need the following two results proven in [TsB91].

Lemma 3.1 ([TsB91, Lemma 1]). Let h : R^n → (−∞, ∞] be a closed proper convex function continuous on C_h. Then:
(a) For any y ∈ C_h, there exists ε > 0 such that {x ∈ C_h : |x − y| ≤ ε} is closed.
(b) For any y ∈ C_h and z such that y + z ∈ C_h, and any sequences y^k → y and z^k → z such that y^k ∈ C_h and y^k + z^k ∈ C_h for all k, we have lim sup_{k→∞} h'(y^k; z^k) ≤ h'(y; z).

Lemma 3.2. Let h : R^n → (−∞, ∞] be a closed proper convex function continuous on C_h. If {y^k} ⊂ C_h is a bounded sequence such that, for some y ∈ C_h, {h(y^k) + h'(y^k; y − y^k)} is bounded from below, then {h(y^k)} is bounded and any limit point of {y^k} is in C_h.

Proof. Use the final paragraph of the proof of [TsB91, Lemma 2].

Lemmas 3.1-3.2 could be expressed in terms of the following analogue of (2.1)

    D_f^0(x, y) = f(x) − f(y) − f'(y; x − y)  for all x, y ∈ C_f.      (3.1)

Lemma 3.3. Let h : R^n → (−∞, ∞] be a closed proper strictly convex function continuous on C_h. If y* ∈ C_h and {y^k} is a bounded sequence in C_h such that D_h^0(y*, y^k) → 0, then y^k → y*.

Proof. Let y^∞ be the limit of a subsequence {y^k}_{k∈K}. Since h(y^k) + h'(y^k; y* − y^k) = h(y*) − D_h^0(y*, y^k) → h(y*), y^∞ ∈ C_h by Lemma 3.2 and h(y^k) →_K h(y^∞) by continuity of h on C_h. Then by Lemma 3.1(b), 0 = lim inf_{k∈K} D_h^0(y*, y^k) ≥ h(y*) − h(y^∞) − h'(y^∞; y* − y^∞) yields y^∞ = y* by strict convexity of h. Hence y^k → y*.

Using (2.9) and (2.16), we let

    r^k = ∇q(p^k) = Ax^k − b.      (3.2)

By (2.7), (2.8), (2.13), (1.2), (2.11), (2.2), (2.12) and (3.1), for all k

    q(p^k) = f(x^k) + ⟨p^k, Ax^k − b⟩,      (3.3)
    x^k ∈ C_∂f ⊂ C_f,      (3.4)
    q_k'(0) = r^k_{i_k} = ⟨a^{i_k}, x^k⟩ − b_{i_k},      (3.5)
    q_k'(t_k) = r^{k+1}_{i_k} = ⟨a^{i_k}, x^{k+1}⟩ − b_{i_k},      (3.6)
    0 ≤ D_f^0(x, x^k) ≤ D_f♭(x, x^k) ≤ D_f^k(x, x^k) ≤ D_f♯(x, x^k)  for all x.      (3.7)

Lemma 3.4. Let Δ_k q = q(p^{k+1}) − q(p^k) for all k. Then:

    Δ_k q = D_f^k(x^{k+1}, x^k) + t_k q_k'(t_k) ≥ δ_D D_f^k(x^{k+1}, x^k) ≥ 0  for all k,      (3.8)
    Σ_{k=1}^∞ D_f^k(x^{k+1}, x^k) ≤ Σ_{k=1}^∞ Δ_k q / δ_D < ∞,      (3.9)
    Σ_{k=1}^∞ |t_k q_k'(t_k)| ≤ (1 + 1/δ_D) Σ_{k=1}^∞ Δ_k q < ∞,      (3.10)
    q(p^k) ≤ f(x) − D_f^k(x, x^k) ≤ f(x) − D_f♭(x, x^k) ≤ f(x)  for all x ∈ X and all k.      (3.11)

Proof. Using (3.3), (3.2), (2.11), (2.12), (3.6) and p^{k+1} = p^k + t_k e^{i_k}, we have

    Δ_k q = f(x^{k+1}) − f(x^k) + ⟨p^{k+1}, Ax^{k+1} − b⟩ − ⟨p^k, Ax^k − b⟩
          = f(x^{k+1}) − f(x^k) + ⟨p^k, Ax^{k+1} − Ax^k⟩ + ⟨p^{k+1} − p^k, Ax^{k+1} − b⟩
          = f(x^{k+1}) − f(x^k) + ⟨A^T p^k, x^{k+1} − x^k⟩ + ⟨p^{k+1} − p^k, r^{k+1}⟩
          = f(x^{k+1}) − [f(x^k) + ⟨g^k, x^{k+1} − x^k⟩] + ⟨t_k e^{i_k}, r^{k+1}⟩
          = D_f^k(x^{k+1}, x^k) + t_k q_k'(t_k),

so (3.8) follows from (2.14) and (2.17), since δ_D ∈ (0, 1] and D_f^k(x^{k+1}, x^k) ≥ 0 for all k (cf. (3.7)). Then, by summing (3.8), we get (3.9)-(3.10) from Σ_{k=1}^∞ Δ_k q = lim_{k→∞} q(p^k) − q(p¹) ≤ inf_X f − q(p¹), using sup_P q ≤ inf_X f. In fact for x ∈ X, ⟨p^k, Ax^k − b⟩ ≤ ⟨p^k, Ax^k − Ax⟩ = ⟨−A^T p^k, x − x^k⟩, since Ax ≤ b and p^k ≥ 0 (cf. (2.16)), so by (3.3), (2.11), (2.12) and (3.7),

    q(p^k) ≤ f(x^k) + ⟨−A^T p^k, x − x^k⟩ = f(x) − D_f^k(x, x^k) ≤ f(x).

Lemma 3.5. {x^k} is bounded and {x^k} ⊂ L¹_f(x, f(x) − q(p¹)) for all x ∈ X.

Proof. Let x ∈ X. Since {q(p^k)} is nondecreasing, (3.11) yields D_f♭(x, x^k) ≤ f(x) − q(p¹) for all k, so by (3.4), x^k ∈ L¹_f(x, f(x) − q(p¹)), a bounded set by Definition 2.1(c).

Lemma 3.6. {f(x^k)} is bounded and every limit point of {x^k} is in C_f.

Proof. Let x ∈ X. By (3.1), (3.7) and (3.11), f(x^k) + f'(x^k; x − x^k) ≥ f(x) − D_f^k(x, x^k) ≥ q(p^k) ≥ q(p¹) for all k, so the desired conclusion follows from the continuity of f on C_f (cf. Definition 2.1(b)), {x^k} ⊂ C_f (cf. (3.4)) and Lemmas 3.5 and 3.2.

Lemma 3.7. x^{k+1} − x^k → 0.

Proof. If the assertion does not hold, then (since {x^k} is bounded; cf. Lemma 3.5) there exists a subsequence K such that {x^k}_{k∈K} and {x^{k+1}}_{k∈K} converge to some x^∞ and x^∞ + z respectively with z ≠ 0. By Lemma 3.6, x^∞ ∈ C_f and x^∞ + z ∈ C_f. By (3.1), (3.7) and (3.8), Δ_k q ≥ δ_D [f(x^{k+1}) − f(x^k) − f'(x^k; x^{k+1} − x^k)], so from (3.9), the continuity of f on C_f (cf. Definition 2.1(b)) and Lemma 3.1(b), we get 0 = lim inf_{k∈K} Δ_k q ≥ δ_D [f(x^∞ + z) − f(x^∞) − f'(x^∞; z)], contradicting the strict convexity of f.

Lemma 3.8. r^{k+1} − r^k → 0 and q_k'(t_k) − q_k'(0) → 0.

Proof. We have r^{k+1} − r^k = A(x^{k+1} − x^k) by (3.2), and q_k'(t_k) − q_k'(0) = r^{k+1}_{i_k} − r^k_{i_k} by (3.5)-(3.6), so the desired conclusion follows from x^{k+1} − x^k → 0 (cf. Lemma 3.7).

Lemma 3.9. [p^k_{i_k} + r^k_{i_k}]_+ − p^k_{i_k} → 0.

Proof. If the claim does not hold, there exist ε > 0 and an infinite K ⊂ {1, 2, ...} such that |[p^k_{i_k} + r^k_{i_k}]_+ − p^k_{i_k}| ≥ ε for all k ∈ K. Thus for each k ∈ K, either (a) r^k_{i_k} ≥ ε or (b) r^k_{i_k} ≤ −ε and p^k_{i_k} ≥ ε, where r^k_{i_k} = q_k'(0) by (3.5). Using (3.9) and Lemma 3.8, pick k̂ such that Δ_k q < (1 − ω_min)ε² and |q_k'(0) − q_k'(t_k)| < ω_min ε for all k ≥ k̂. Let k ∈ K, k ≥ k̂. Since |q_k'(0)| ≥ ε, ω_k < ω_min (cf. (2.15)). Hence case (a) cannot occur, and for case (b) Step 4(ii) sets t_k = −p^k_{i_k}. Thus t_k ≤ −ε and q_k'(t_k) < (1 − ω_min) q_k'(0) ≤ −(1 − ω_min)ε, so, since q_k'(·) is nonincreasing, q(p^{k+1}) − q(p^k) = ∫₀^{t_k} q_k'(τ) dτ ≥ (1 − ω_min)ε², a contradiction.

Lemma 3.10. {x^k} converges to some x^∞ ∈ C_f.

Proof. We first show that for all k,

    D_f^{k+1}(x, x^{k+1}) + D_f^k(x^{k+1}, x^k) − D_f^k(x, x^k) = t_k ⟨a^{i_k}, x − x^{k+1}⟩  for all x ∈ C_f.      (3.12)

By (2.12), the left side equals ⟨g^k − g^{k+1}, x − x^{k+1}⟩, where g^k − g^{k+1} = A^T(p^{k+1} − p^k) = t_k a^{i_k} (cf. (2.11)), since p^{k+1} = p^k + t_k e^{i_k}. Since {x^k} is bounded (Lemma 3.5), a subsequence {x^{k_j}} converges to some x^∞ ∈ C_f (cf. Lemma 3.6). Let I_< = {i : ⟨a^i, x^∞⟩ < b_i}, I_= = {i : ⟨a^i, x^∞⟩ = b_i} and I_> = {i : ⟨a^i, x^∞⟩ > b_i}. Pick ε > 0 for B(x^∞, ε) = {x : |x − x^∞| ≤ ε} such that

    ⟨a^i, x⟩ − b_i < −ε  for all i ∈ I_<, x ∈ B(x^∞, ε),      (3.13)
    ⟨a^i, x⟩ − b_i > ε  for all i ∈ I_>, x ∈ B(x^∞, ε).      (3.14)

Suppose {x^k} does not converge. Then there exists ε₁ > 0 such that, for each j, x^k ∉ B(x^∞, ε₁) for some k > k_j. Replacing ε by min{ε, ε₁}, for each j such that x^{k_j} ∈ B(x^∞, ε) let k'_j = min{k ≥ k_j : x^{k+1} ∉ B(x^∞, ε)}, so that x^k ∈ B(x^∞, ε) for k ∈ K^j = [k_j, k'_j]. Summing (3.12) for x = x^∞ and using D_f^k(x^{k+1}, x^k) ≥ 0 (cf. (3.7)) gives

    D_f^{k'_j}(x^∞, x^{k'_j}) ≤ D_f^{k_j}(x^∞, x^{k_j}) + Σ_{k=k_j}^{k'_j − 1} t_k ⟨a^{i_k}, x^∞ − x^{k+1}⟩  for all j.      (3.15)

We need to show that the sum above vanishes. Let K^j_> = {k ∈ K^j : i_k ∈ I_>}. Since p^k_{i_k} ≥ 0 for all k, Lemma 3.9 yields lim sup_{k→∞} r^k_{i_k} ≤ 0, where r^k_{i_k} = ⟨a^{i_k}, x^k⟩ − b_{i_k} (cf. (3.5)), so there exists j_> such that, for all j ≥ j_> and k ≥ k_j, r^k_{i_k} < ε and K^j_> = ∅ (otherwise i_k ∈ I_> and x^k ∈ B(x^∞, ε) would give r^k_{i_k} > ε by (3.14), a contradiction). Since q_k'(t_k) − q_k'(0) → 0 (Lemma 3.8), there exists j_< ≥ j_> such that q_k'(t_k) [...]

(c) If q_k'(0) > 0 (< 0) and q_k'(t) < 0 (> 0) for some t, then (4.3) is well defined for some t̃_k > 0 (< 0 respectively).
(d) For any α ∈ R, the level set {x : D_f^k(x, x^k) ≤ α} is bounded.
(e) If H^k ∩ C_f ≠ ∅ then x̃^{k+1} is well defined by (4.2).
(f) If H^k ∩ ri C_f ≠ ∅ then x̃^{k+1} is well defined by (4.2) and (4.3) holds for some t̃_k.
(g) If H^k ∩ ri C_f ≠ ∅ and C_∂f = ri C_f (e.g., f is essentially smooth) then x̃^{k+1} ∈ ri C_f.

Proof. If (4.3) holds then −A^T p^k(t̃_k) ∈ ∂f(x̃^{k+1}) [Roc70, Thm 23.5], so p^k(t̃_k) ∈ int C_q and x̃^{k+1} = x(p^k(t̃_k)) by Lemma 2.9, and q_k'(t̃_k) = 0 by (4.1). Similarly, (b) and (c) follow from Lemma 2.9, (4.1), the monotonicity and continuity of q_k' and the strict convexity of f. As for (d), by (2.11), (2.12) and the strict convexity of f, {x : D_f^k(x, x^k) ≤ 0} = {x^k}, so D_f^k(·, x^k) has bounded level sets by [Roc70, Cor. 8.7.1]. Then (e) follows from the lower semicontinuity of f and D_f^k(·, x^k), (f) from [Roc70, Thm 28.2] and (g) from [Roc70, Thm 28.3].

Remark 4.2. The proof of (d) above shows that the requirement on L²_h(ỹ, α) in condition (iv) of Definition 2.2 is a consequence of conditions (i)-(iii) (since L²_h(ỹ, 0) = {ỹ}). This fact, implicit in Bregman's original work [Bre67], has been ignored in its follow-up [CeL81].

We now turn to overrelaxation. It follows from (3.8) that

    Δ_k q ≥ δ_D D_f^k(x^{k+1}, x^k) ⟺ t_k q_k'(t_k) ≥ (δ_D − 1) D_f^k(x^{k+1}, x^k) ⟺ (δ_D − 1) Δ_k q ≤ δ_D t_k q_k'(t_k).      (4.4)

Depending on which quantities are computed, any of these conditions can be used at Step 4 (cf. (2.14)). The third condition occurred in the quite abstract framework of [Tse90], where the case of a quadratic f required considerable additional effort. Generalizing an idea from [DPI86], we now give another useful condition based on the following

Lemma 4.3. For all k, ω_k Δ_k q = (1 − ω_k) D_f^{k+1}(x^k, x^{k+1}) + D_f^k(x^{k+1}, x^k).

Proof. By (2.15), ω_k q_k'(t_k) = (1 − ω_k)[q_k'(0) − q_k'(t_k)], and by (3.5), (3.6), (2.11) and (2.12),

    t_k [q_k'(0) − q_k'(t_k)] = t_k ⟨a^{i_k}, x^k − x^{k+1}⟩ = ⟨A^T p^{k+1} − A^T p^k, x^k − x^{k+1}⟩
                             = D_f^{k+1}(x^k, x^{k+1}) + D_f^k(x^{k+1}, x^k),

so

    ω_k Δ_k q = ω_k t_k q_k'(t_k) + ω_k D_f^k(x^{k+1}, x^k)
              = (1 − ω_k) t_k [q_k'(0) − q_k'(t_k)] + ω_k D_f^k(x^{k+1}, x^k)
              = (1 − ω_k)[D_f^{k+1}(x^k, x^{k+1}) + D_f^k(x^{k+1}, x^k)] + ω_k D_f^k(x^{k+1}, x^k)
              = (1 − ω_k) D_f^{k+1}(x^k, x^{k+1}) + D_f^k(x^{k+1}, x^k),

where the first equality follows from (3.8).
The third condition occured in the quite abstract framework of [Tse90], where the case of a quadratic f required considerable additional e ort. Generalizing an idea from [DPI86], we now give another useful condition based on the following Lemma 4.3. For all k, !k kq = (1 ? !k )Dfk+1 (xk ; xk+1) + Dfk (xk+1; xk ). Proof. By (2.15), !k qk0 (tk ) = (1 ? !k )[qk0 (0) ? qk0 (tk )], and by (3.5), (3.6), (2.11) and (2.12), E E D D tk [qk0 (0) ? qk0 (tk )] = tk aik ; xk ? xk+1 = AT pk+1 ? AT pk ; xk ? xk+1 = Dfk+1 (xk ; xk+1) + Dfk (xk+1; xk ); so !k kq = !k tk qk0 (tk ) + !k Dfk (xk+1; xk ) = (1 ? !k )tk [qk0 (0) ? qk0 (tk )] + !k Dfk (xk+1; xk ) = (1 ? !k )[Dfk+1(xk ; xk+1) + Dfk (xk+1; xk )] + !k Dfk (xk+1; xk ); where the rst equality follows from (3.8). 13

Let ω_max ∈ (1, 2), ε_D ∈ [0, (2 − ω_max)/(ω_max − 1)) and d = 1 + (1 + ε_D)(1 − ω_max), so that d > 0. Condition (2.14) may be replaced by

    D_f^{k+1}(x^k, x^{k+1}) ≤ (1 + ε_D) D_f^k(x^{k+1}, x^k)  if ω_k > 1,      (4.5)

since

    2Δ_k q ≥ ω_k Δ_k q ≥ [(1 − ω_k)(1 + ε_D) + 1] D_f^k(x^{k+1}, x^k) ≥ d D_f^k(x^{k+1}, x^k)  if ω_k > 1      (4.6)

by Lemma 4.3 and the choice of d, so (2.14) holds with δ_D = d/2, as required for convergence. If f is a strictly convex quadratic function then D_f(x, y) = D_f(y, x) (cf. (2.3)) for all x and y [DPI86], whereas by (2.11) and (2.12), D_f^k(·, x^k) = D_f(·, x^k). Thus in the quadratic case conditions (4.5) and (4.4) hold automatically (for some d, ε_D > 0).
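The symmetry of D_f invoked above for quadratic costs is a one-line computation:

```latex
f(x) = \tfrac12 \langle Qx, x\rangle + \langle c, x\rangle,\ Q \succ 0
\quad\Longrightarrow\quad
D_f(x,y) = f(x) - f(y) - \langle Qy + c,\, x - y\rangle
         = \tfrac12 \langle Q(x-y),\, x-y\rangle = D_f(y,x),
```

so in the quadratic case D_f^{k+1}(x^k, x^{k+1}) = D_f^k(x^{k+1}, x^k) and (4.5) holds with any ε_D ≥ 0.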

5 Convergence for essentially smooth objectives

Generalizing the analysis in [Tse91, LuT92b], let us now replace Assumption 2.8 by

Assumption 5.1. (i) f is closed, proper and essentially strictly convex. (ii) C_∂f = C_f. (iii) int C_{f*} ⊃ {y^k} → y ∈ bd C_{f*} ⟹ f*(y^k) → ∞. (iv) The feasible set X = {x ∈ C_f : Ax ≤ b} of (1.1) is nonempty. (v) p¹ ∈ P is such that −A^T p¹ ∈ im ∂f and the set A^T L_q is bounded, where L_q = {p ∈ P ∩ C_q : q(p) ≥ q(p¹)}.

If f is essentially smooth then (i) yields (ii) (cf. [Roc70, Thm 26.1]). In general, (i) implies that f* is essentially smooth (cf. [Roc70, Thm 26.3]), so C_∂f* = int C_{f*} (cf. [Roc70, Thm 26.1]) and ∇f* is continuous on int C_{f*} (cf. [Roc70, Thm 25.5]). Since f* is lower semicontinuous, (iii) holds if C_{f*} = int C_{f*}, and conversely (iii) yields C_{f*} = int C_{f*} (otherwise let y ∈ C_{f*} \ int C_{f*} and ȳ ∈ int C_{f*} to get (cf. [Roc70, Thm 6.1 and Cor. 7.5.1]) lim_{t↑1} f*((1 − t)ȳ + ty) = f*(y) < ∞, contradicting (iii)). Thus under Assumption 5.1, Lemma 2.9 holds with int C_q = C_q and conditions (ii)-(iii) of Assumption 2.8 hold, but f need not satisfy conditions (a)-(d) of Definition 2.1. Yet the proofs of Lemmas 3.5-3.7 and 3.10 and Theorem 3.12 can be modified by using the following results.

Lemma 5.2 ([Tse91, Lemma 8.1]). Let h be a proper convex function on R^n, E be an n × m matrix, c be an m-vector, and P be a convex polyhedral set in R^m. Let q̃(p) = h(Ep) + ⟨c, p⟩ for all p ∈ R^m. Suppose inf_P q̃ > −∞ and the set {Ep : p ∈ P, q̃(p) ≤ ζ} is bounded for each ζ ∈ R. Then for any ζ ∈ R such that the set {p ∈ P : q̃(p) ≤ ζ} is nonempty, the functions p → h(Ep) and p → ⟨c, p⟩ are bounded on this set.

Proof. This follows from the proof of [Tse91, Lem. 8.1].

Lemma 5.3. If Assumption 5.1 holds, then:
(a) p^k ∈ L_q, −A^T p^k ∈ (−A^T L_q) ∩ int C_{f*} and x^k = ∇f*(−A^T p^k) ∈ C_f for all k.
(b) {A^T p^k}, {f*(−A^T p^k)} and {⟨b, p^k⟩} are bounded.

(c) fxk g is bounded. (d) ff (xk )g is bounded and every limit point of fxk g is in Cf . Moreover, if a subsequence L L fxk gk2L converges to some x1 then x1 2 Cf , f (xk ) ?! f (x1) and Df] (x1; xk ) ?! 0. (e) xk+1 ? xk ! 0. (f) The set P  = Arg maxP q is nonempty and P   Cq . Proof. (a) Since fq(pk )g is nondecreasing, this follows from pk 2 P \ Cq (cf. (2.16)), Lemma 2.9 with ?AT pk 2 @f (xk ) for all k and Assumption 5.1(ii). (b) Apply Lemma 5.2 to ?q, using part (a) and supP q  inf X f < 1. L (c) Let yk = ?AT pk , so that xk = rf (yk ) 8k. Suppose jxk j ?! 1 for a subsequence L of f1; 2; : : :g. Since fyk g is bounded (cf. part (b)), there exist y1 and a subsequence K 1 K of L such that yk ?! y . Since fykg  Cf  (cf. part (a)) and ff (yk )g is bounded (cf. part (b)), y1 2 Cf  (cf. Assumption 5.1(iii)). But rf  is continuous on Cf  , so K 1 K xk ?! x = rf (y1), contradicting jxk j ?! 1. Hence fxk g is bounded. L (d) Let xk ?! x1. As in part (c), we deduce that x1 = rf (y1) with y1 2 Cf  . Thus y1 2 @f (x1) (cf. [Roc70, Thm 23.5]), so x1 2 Cf (cf. Assumption 5.1(ii)). Invoking L L Lemma 2.3(a), we get f (xk ) ?! f (x1 ) and Df] (x1; xk ) ?! 0. Hence, since fxk g is bounded, so is ff (xk )g (otherwise we could get a contradiction as in part (c)). (e) If the assertion does not hold, then (cf. parts (c){(d)) there exists a subsequence K such that fxk gk2K and fxk+1gk2K converge to some x1 2 Cf and x1 + z 2 Cf respectively K K K with z 6= 0, f (xk ) ?! f (x1) and f (xk+1) ?! f (x1 + z). Then xk+1 ? xk ?! z yields 0 k k +1 k 0 1 lim supk2K f (x ; x ? x )  f (x ; z) (cf. [Roc70, Thm 24.5]). But (cf. (3.1), (3.7) and (3.8)) kq  D [f (xk+1) ? f (xk ) ? f 0(xk ; xk+1 ? xk )], so (cf. (3.9)) 0 = lim infk2K kq  D [f (x1 + z) ? f (x1) ? f 0(x1; x1 + z)] contradicts strict convexity of f on Cf  C@f (cf. Assumption 5.1(a)). (f) This follows from [Tse91, p. 429]. Parts (c)-(e) of Lemma 5.3 subsume Lemmas 3.5{3.7. Proof of Lemma 3.10 under Assumption 5.1. 
We only modify the argument following (3.16). By Lemma 5.3(d), $x^{k_j} \to x^\infty \in C_f$ yields $D_f^\sharp(x^\infty, x^{k_j}) \to 0$. Then $D_f'(x^\infty, x^{k_j'}) \to 0$ as before. Suppose a subsequence $\{x^{k_j'}\}_{j\in J}$ converges to some $x' \neq x^\infty$. By Lemma 5.3(d), $f(x^{k_j'}) \to f(x')$ along $J$ and $x' \in C_f$. Then $\limsup_{j\in J} f'(x^{k_j'}; x^\infty - x^{k_j'}) \le f'(x'; x^\infty - x')$ (cf. [Roc70, Thm 24.5]), so (cf. (3.1)) $0 = \liminf_{j\in J} D_f'(x^\infty, x^{k_j'}) \ge f(x^\infty) - f(x') - f'(x'; x^\infty - x')$, which, with $x^\infty \in C_f$ and $x' \in C_f$, contradicts strict convexity of $f$ on $C_f \supset C_{\partial f}$ (cf. Assumption 5.1(i)). Therefore, $x^{k_j'} \to x^\infty$. The rest of the proof goes as before.

Theorem 5.4. If Assumption 5.1 holds, then:

(a) Problem (1.1) has a unique solution, say $\bar x$, in $C_f$, and $x^k \to \bar x$.

(b) $q(p^k) \uparrow \max_P q = \min_X f$.

(c) Every limit point of $\{p^k\}$ (if any) solves the dual problem (2.6). In particular, if Slater's condition holds, then $\{p^k\}$ is bounded and $q(p^k) \uparrow \max_P q = \min_X f$.

Proof. (a) Apply Lemma 5.3(d) to (3.18) in the proof of Theorem 3.12(a). (b) Proceed as for part (a) and invoke Lemma 5.3(f). (c) Use the proof of Theorem 3.12(c).

Remark 5.5. Assumption 5.1 holds if $f$ is closed proper essentially strictly convex and essentially smooth, $C_{f^*} = \operatorname{int} C_{f^*}$, $C_f \cap \hat X \neq \emptyset$, where $\hat X = \{x : Ax \le b\}$, and the set $\{x \in \hat X : f(x) \le \alpha\}$ is bounded for all $\alpha \in \mathbb{R}$. This can be shown as in [Tse91, p. 440]. Also, Assumption 5.1 holds if $C_{f^*} = \operatorname{int} C_{f^*}$, $f^*$ is strictly convex differentiable on $C_{f^*}$, $\operatorname{Arg\,min}_X f \neq \emptyset$ and $\hat X \cap \operatorname{ri} C_f \neq \emptyset$. This follows from the analysis in [LuT92b] (use Lemma 5.2 instead of [LuT92b, Lem. 3.3] and observe that $C_{\partial f} = C_f$).

6 Convergence under a regularity condition

Let us now replace Assumption 2.8 by the following

Assumption 6.1. (i) $f$ satisfies conditions (a)–(c) of Definition 2.1. (ii) $\hat X \cap \operatorname{ri} C_f \neq \emptyset$, where $\hat X = \{x : Ax \le b\}$. (iii) $(-A^T P) \cap \operatorname{im} \partial f \neq \emptyset$, where $P = \mathbb{R}^m_+$.

Condition (ii) is stronger than Assumption 2.8(ii), but $f$ need not satisfy condition (d) of Definition 2.1. To modify the proofs of Lemma 3.10 and Theorem 3.12, we shall need

Lemma 6.2. Let $h : \mathbb{R}^n \to (-\infty, \infty]$ be a closed proper convex function continuous on $C_h$. If $\{y^k\}$ is a sequence in $C_h$ convergent to some $y \in \bar C_h$ such that, for some $x \in \operatorname{ri} C_h$, $\alpha \in \mathbb{R}$ and $\tilde g^k \in \partial h(y^k)$, $h(x) - h(y^k) - \langle \tilde g^k, x - y^k\rangle \le \alpha$ for all $k$, then $h(y) - h(y^k) - \langle \tilde g^k, y - y^k\rangle \to 0$.

Proof. For all $k$, let $D_h^k(y, y^k) = h(y) - h(y^k) - \langle g^k, y - y^k\rangle$, where $g^k$ is the orthogonal projection of $\tilde g^k$ onto the linear span $L$ of $C_h - x$, so that $y - y^k \in L$ and $\langle \tilde g^k, y - y^k\rangle = \langle g^k, y - y^k\rangle$ for all $y \in C_h$. Since $D_h^k(\cdot, y^k) \ge 0$ by convexity of $h$, we need to show that $\limsup_{k\to\infty} D_h^k(y, y^k) = 0$. If this does not hold, there exist $\varepsilon > 0$ and a subsequence $K$ of $\{1, 2, \ldots\}$ such that $D_h^k(y, y^k) \ge \varepsilon$ $\forall k \in K$. We have $|g^k| \to \infty$ along $K$ (otherwise, for some $\beta \in \mathbb{R}$ and a subsequence $K'$ of $K$, $|g^k| \le \beta$ for all $k \in K'$ and $h(y^k) \to h(y)$ (cf. continuity of $h$ on $C_h$) would yield $\lim_{k\in K'} D_h^k(y, y^k) = 0$, a contradiction). Hence we may assume that $\bar g^k = g^k/|g^k|$ converges to some $g^\infty$ as $k \to \infty$ along $K$. Clearly, $|g^\infty| = 1$ and $g^\infty \in L$ ($L$ is closed). For any $y' \in C_h$, taking the limit of $h(y') \ge h(y^k) + \langle g^k, y' - y^k\rangle$ divided by $|g^k|$ yields $\langle g^\infty, y' - y\rangle \le 0$. Similarly, $h(x) - h(y^k) - \langle g^k, x - y^k\rangle \le \alpha$ for all $k$ yields $\langle g^\infty, x - y\rangle \ge 0$. Then $x \in C_h$ and $\langle g^\infty, y' - y\rangle \le 0$ for all $y' \in C_h$ imply $\langle g^\infty, x - y\rangle = 0$ and (since $|g^\infty| = 1$ and $g^\infty \in L$) $x \notin \operatorname{ri} C_h$, a contradiction.

In the proof of Lemma 3.10 (after (3.16)), letting $x \in \hat X \cap \operatorname{ri} C_f$ (cf. Assumption 6.1(ii)), use $x^{k_j} \to x^\infty \in \bar C_f$, continuity of $f$ on $C_f$ (cf. Definition 2.1(b)), the fact that $D_f^k(x, x^k) \le f(x) - q(p^1)$ $\forall k$ (cf. (3.11)) and Lemma 6.2 (cf. (2.12)) to get $D_f^{k_j}(x^\infty, x^{k_j}) \to 0$.

Theorem 6.3. If Assumption 6.1 holds, then:

(a) Problem (1.1) has a unique solution, say $\bar x$, in $C_{\partial f}$, and $x^k \to \bar x$.

(b) $q(p^k) \uparrow \max_P q = \min_X f$.

(c) Every limit point of $\{p^k\}$ (if any) solves the dual problem (2.6). In particular, if Slater's condition holds, then $\{p^k\}$ is bounded and $q(p^k) \uparrow \max_P q = \min_X f$.

Proof. Using the proof of Theorem 3.12(a) and the argument preceding our theorem, we have $x^k \to x^\infty$ and $D_f^k(x^\infty, x^k) \to 0$, so (cf. (3.18)) $q(p^k) \uparrow f(x^\infty)$ yields $x^\infty = \bar x$ as before. Since $\hat X \cap \operatorname{ri} C_f \neq \emptyset$ (cf. Assumption 6.1(ii)), $\bar x \in C_{\partial f}$ and $\max_P q = f(\bar x)$ (cf. [Roc70, Cor. 28.3.1 and Cor. 28.4.1]). This yields parts (a) and (b). For part (c), use the proof of Theorem 3.12(c).

7 Additional remarks

Equality constraints may be handled directly (instead of converting equalities into pairs of inequalities). Consider problem (1.1) with equality constraints $Ax = b$. Then $X = \{x \in C_f : Ax = b\}$ and $P = \mathbb{R}^m$ in Assumption 2.8, (2.6b) is deleted, and (2.10) becomes $r(p) = 0$. Thus $p$ is no longer constrained at Step 4. It is easy to verify the preceding convergence results. In the proof of (3.11), $Ax = b$ yields $\langle p^k, Ax^k - b\rangle = -\langle A^T p^k, x - x^k\rangle$ as before. In Lemma 3.9, $r_{i_k}^k \to 0$ can be shown as before by using $\omega_k \in [\omega_{\min}, \omega_{\max}]$ for all $k$. Then $s^\infty = 0$ in Lemma 3.11. Extension to the case of mixed equality and inequality constraints is straightforward.

Following [TsB91, Tse90], let us now assume that $f$ is closed, proper, strictly convex, continuous on $C_f$ and co-finite. Then Assumption 2.8(ii,iii) and Lemma 2.9 hold with $\operatorname{im} \partial f = C_{f^*} = \operatorname{int} C_{f^*} = \mathbb{R}^n$ (cf. [Roc70, Thm 25.5]) and $C_q = C_{\nabla q} = \mathbb{R}^m$. Moreover, $f$ satisfies conditions (a)–(b) of Definition 2.1; we now show that condition (c) holds automatically by using the following result of [TsB91].

Lemma 7.1 ([TsB91, Lemma 2]). Let $h : \mathbb{R}^n \to (-\infty, \infty]$ be a closed proper convex function continuous on $C_h$ and co-finite. If $\{y^k\}$ is a sequence in $C_h$ such that, for some $y \in C_h$, $\{h(y^k) + h'(y^k; y - y^k)\}$ is bounded from below, then $\{y^k\}$ and $\{h(y^k)\}$ are bounded and any limit point of $\{y^k\}$ is in $C_h$.

Lemma 7.2. If $f$ is closed proper convex, continuous on $C_f$ and co-finite, $x \in C_f$ and $\alpha \in \mathbb{R}$, then the sets $L_f^\triangle(x, \alpha) = \{y \in C_f : D_f'(x, y) \le \alpha\}$ and $L_f^\infty(x, \alpha) = \{y \in C_{\partial f} : D_f^\flat(x, y) \le \alpha\}$ are bounded.

Proof. Note that $L_f^\infty(x, \alpha) \subset L_f^\triangle(x, \alpha)$ (cf. (1.2), (2.1a) and (3.1)) and invoke Lemma 7.1.

As a corollary we list a simple result which improves on [DPI86, Thm 5.1].

Lemma 7.3. If $f : \mathbb{R}^n \to \mathbb{R}$ is strictly convex and co-finite then $f$ is a $B$-function satisfying condition (2.5). In particular, $f$ is co-finite if $\lim_{|x|\to\infty} f(x)/|x| = \infty$.

Proof. Condition (a) of Definition 2.1 holds by assumption. Invoke [Roc70, Thm 10.1] for (b), Lemma 7.2 for (c), and Lemma 2.3(a) for (2.5). If $\lim_{|x|\to\infty} f(x)/|x| = \infty$ then $f(x) - \langle x, y\rangle \ge \{f(x)/|x| - |y|\}\,|x| \to \infty$ as $|x| \to \infty$, and hence $f^*(y) < \infty$ for all $y$.

Lemma 7.3 confirms that if $f$ is a strictly convex quadratic function then Theorem 3.12 holds (also under overrelaxation; cf. §4) with $q(p^k) \uparrow \max_P q$ (cf. Remark 5.5).

Remark 7.4. Suppose $f(x) = \sum_{i=1}^n f_i(x_i)$, where each $f_i$ is closed proper strictly convex on $\mathbb{R}$ with $L_{f_i}^\infty(t, \alpha)$ bounded $\forall t, \alpha \in \mathbb{R}$. Then $f$ is a $B$-function (cf. Lemma 2.4(d)) satisfying condition (2.5) (cf. Lemma 2.3(d), $C_f = \prod_{i=1}^n C_{f_i}$ and $D_f^\sharp(x, y) = \sum_{i=1}^n D_{f_i}^\sharp(x_i, y_i)$). In particular, if each $f_i$ is also co-finite then $L_{f_i}^\infty(t, \alpha)$ is bounded $\forall t, \alpha \in \mathbb{R}$ (cf. Lemma 7.2), $f$ is co-finite ($f^*(y) = \sum_{i=1}^n f_i^*(y_i)$ $\forall y$) and Assumption 2.8 merely requires that $X \neq \emptyset$, so Theorem 3.12 holds with $q(p^k) \uparrow \sup_P q = \min_X f$ if $X \neq \emptyset$.
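To make the separable equality-constrained case concrete, here is an illustration of ours (not taken from the paper): with the entropy-type cost $f(x) = \sum_j x_j(\log x_j - 1)$ and 0/1 constraint rows, an exact dual coordinate step for one equality constraint multiplies the primal variables in its support by a common factor, and cycling (or any free-steering order) over row- and column-sum constraints of a matrix recovers classical iterative scaling (cf. [Kru37, DaR72]). The function name and interface below are hypothetical.

```python
import numpy as np

def iterative_scaling(M, row_sums, col_sums, sweeps=200):
    """Cyclic exact dual coordinate ascent (iterative scaling) sketch for
        min sum_uv x_uv*(log(x_uv/M_uv) - 1)  s.t.  fixed row and column sums.
    The primal iterate paired with the dual vector p is x(p) = M * exp(-A^T p);
    an exact step for one equality constraint rescales the entries in its
    support so that the constraint holds exactly."""
    X = np.array(M, dtype=float)  # x(p) for p = 0
    r = np.asarray(row_sums, dtype=float)
    c = np.asarray(col_sums, dtype=float)
    for _ in range(sweeps):
        X *= (r / X.sum(axis=1))[:, None]  # one multiplicative step per row constraint
        X *= c / X.sum(axis=0)             # one multiplicative step per column constraint
    return X

# Balance a positive matrix to doubly stochastic form.
X = iterative_scaling([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0], [1.0, 1.0])
```

With interval bounds on the row and column sums instead of equalities, one is in the setting of interval-constrained matrix balancing (cf. [CeZ91]), where the dual variables must in addition be kept feasible.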

8 Block coordinate relaxation

We now consider the block coordinate relaxation (BCR) algorithm of [Tse90, §5.2]. Given the current $p^k \in P \cap C_q$ and $x^k = x(p^k)$, choose a nonempty set $I^k \subset \{1{:}m\}$. Let $I_-^k = \{i \in I^k : r_i^k < 0\}$ and $I_+^k = \{i \in I^k : r_i^k > 0\}$, where $r^k = Ax^k - b$. If $I_-^k = I_+^k = \emptyset$, set $p^{k+1} = p^k$ and $x^{k+1} = x^k$. Otherwise, let $x^{k+1}$ be the solution and $\pi_i^k$, $i \in I_-^k \cup I_+^k$, the associated Lagrange multipliers of the following problem with a parameter $\theta \in (0, 1/2]$:

$$\begin{aligned} \text{minimize}\quad & f(x) + \sum_{i\notin I_-^k} p_i^k \langle a^i, x\rangle, && (8.1a)\\ \text{subject to}\quad & \langle a^i, x\rangle \le b_i \quad \forall i \in I_-^k, && (8.1b)\\ & \langle a^i, x\rangle \le b_i + \theta r_i^k \quad \forall i \in I_+^k. && (8.1c) \end{aligned}$$

Let

$$p_i^{k+1} = \begin{cases} \pi_i^k & \text{if } i \in I_-^k,\\ p_i^k + \pi_i^k & \text{if } i \in I_+^k,\\ p_i^k & \text{otherwise.} \end{cases} \qquad (8.2)$$

Lemma 8.1. $p^{k+1} \in P \cap C_q$ and $x^{k+1} = x(p^{k+1})$ are well defined,
$$p_i^{k+1} = [p_i^{k+1} + r_i^{k+1}]_+ \quad \forall i \in I_-^k; \qquad (8.3)$$
$$\bigl|[p_i^{k+1} + r_i^{k+1}]_+ - p_i^{k+1}\bigr| \le |r_i^{k+1} - r_i^k| \quad \forall i \in I^k; \qquad (8.4)$$
$$\langle r^{k+1}, p^{k+1} - p^k\rangle \ge \sum_{i\in I_-^k} p_i^k \bigl(b_i - \langle a^i, x^{k+1}\rangle\bigr) \ge 0; \qquad (8.5)$$
$$\gamma_k^q = D_f^k(x^{k+1}, x^k) + \langle r^{k+1}, p^{k+1} - p^k\rangle \ge D_f^k(x^{k+1}, x^k) \ge 0. \qquad (8.6)$$

Proof. Let $h$ and $X^k$ denote the objective and the feasible set of (8.1), respectively. Then $C_h = C_f$ and $C_h \cap X^k \neq \emptyset$, since for any $x \in X$, $\tau \in (0, \theta)$ and $y = (1-\tau)x + \tau x^k$ we have $y \in C_f$ ($x, x^k \in C_f$; cf. (2.8)), $\langle a^i, y\rangle - b_i \le \tau r_i^k < 0$ $\forall i \in I_-^k$, and $\langle a^i, y\rangle - b_i \le \tau r_i^k < \theta r_i^k$ $\forall i \in I_+^k$. By (2.8) and [Roc70, Thm 28.3], $x^k = \arg\min\{h(x) : \langle a^i, x\rangle \le \langle a^i, x^k\rangle\ \forall i \in I_-^k\}$, where the feasible set and $h$ have no direction of recession in common (otherwise the minimum would be nonunique). Hence $h$ and $X^k$ have no common direction of recession, so $\operatorname{Arg\,min}_{X^k} h \neq \emptyset$ (cf. [Roc70, Thm 27.3]). Since (8.1) is solvable and strictly consistent ($C_h \cap X^k \neq \emptyset$), it has a solution $x^{k+1}$ and Lagrange multipliers $\pi_i^k$ (cf. [Roc70, Cor. 29.1.5]) satisfying the Kuhn-Tucker conditions (cf. [Roc70, Thm 28.3])
$$\pi_i^k = [\pi_i^k + r_i^{k+1}]_+ \ \forall i \in I_-^k, \quad \pi_i^k = [\pi_i^k + r_i^{k+1} - \theta r_i^k]_+ \ \forall i \in I_+^k, \quad 0 \in \partial f(x^{k+1}) + \sum_{i\notin I_-^k} p_i^k a^i + \sum_{i\in I_-^k \cup I_+^k} \pi_i^k a^i,$$
with $r^{k+1} = Ax^{k+1} - b$. Then by (8.2), $p^{k+1} \in P$ and $-A^T p^{k+1} \in \partial f(x^{k+1})$, so $p^{k+1} \in P \cap C_q$ and $x^{k+1} = x(p^{k+1})$ by Lemma 2.9. By the Kuhn-Tucker conditions and (8.2),
$$\langle p^{k+1} - p^k, r^{k+1}\rangle = \sum_{i\in I_+^k:\ \pi_i^k > 0} \theta \pi_i^k r_i^k + \sum_{i\in I_-^k:\ \pi_i^k = 0} (\pi_i^k - p_i^k)\bigl(\langle a^i, x^{k+1}\rangle - b_i\bigr) \ge 0,$$
which yields (8.5). The equality in (8.6) follows from the proof of Lemma 3.4. Finally, to prove (8.4), note that $\pi_i^k = [\pi_i^k + r_i^{k+1}]_+$ $\forall i \in I_-^k$ and (8.2) yield (8.3). Since $p^{k+1} \ge 0$, we also have $|[p_i^{k+1} + r_i^{k+1}]_+ - p_i^{k+1}| \le |r_i^{k+1}|$ $\forall i \in I^k$. But $r_i^k = 0$ and $|r_i^{k+1}| = |r_i^{k+1} - r_i^k|$ $\forall i \in I^k \setminus (I_-^k \cup I_+^k)$, whereas for each $i \in I_+^k$, since $r_i^{k+1} \le \theta r_i^k$ and $\theta \in (0, 1/2]$, we have $|r_i^{k+1}| \le |r_i^{k+1} - r_i^k|$. Combining the preceding relations yields (8.4).

Let us now establish convergence of the BCR method under Assumption 2.8 and

Assumption 8.2. Every element of $\{1{:}m\}$ appears in $\{I^k\}$ infinitely many times.

In view of (8.6), Lemmas 3.4–3.8 hold with (3.10) replaced by

$$\sum_{k=1}^{\infty} \bigl|\langle r^{k+1}, p^{k+1} - p^k\rangle\bigr| \le 2 \sum_{k=1}^{\infty} \gamma_k^q < \infty. \qquad (8.7)$$

Lemmas 3.9–3.10 are replaced by the following results.

Lemma 8.3. $\bigl|[p_{I^k}^{k+1} + r_{I^k}^{k+1}]_+ - p_{I^k}^{k+1}\bigr| \to 0$ and
$$\limsup_{k\to\infty}\, \max_{i\in I^k} r_i^{k+1} = \limsup_{k\to\infty}\, \max_{i\in I^k} r_i^k \le 0. \qquad (8.8)$$

Proof. This follows from (8.4) and $r^{k+1} - r^k \to 0$ (cf. Lemma 3.8).

Lemma 8.4. $\{x^k\}$ converges to some $x^\infty \in C_f$.

Proof. In the proof of Lemma 3.10, (3.12) and (3.15) are replaced by
$$D_f^{k+1}(x, x^{k+1}) + D_f^k(x^{k+1}, x^k) - D_f^k(x, x^k) = \langle p^{k+1} - p^k, A(x - x^{k+1})\rangle \quad \forall x \in C_f,$$
$$D_f^{k_j'}(x^\infty, x^{k_j'}) \le D_f^{k_j}(x^\infty, x^{k_j}) + \sum_{k=k_j}^{k_j'-1} \langle p^{k+1} - p^k, A(x^\infty - x^{k+1})\rangle \quad \forall j, \qquad (8.9)$$
and we need only show that the sum above vanishes. Since $\limsup_{k\to\infty} \max_{i\in I^k} r_i^k \le 0$ by Lemma 8.3, there exists $\bar\jmath$ such that, for all $j \ge \bar\jmath$ and $k \in [k_j, k_j')$, $\langle a^i, x^k\rangle - b_i = r_i^k < \varepsilon$ for all $i \in I^k$ and $I^k \cap I_> = \emptyset$ (otherwise $i \in I^k \cap I_>$ and $x^k \in B(x^\infty, \delta)$ would give $r_i^k > \varepsilon$ by (3.14)). Now, $I^k \cap I_> = \emptyset$, $\langle a^i, x^\infty\rangle = b_i$ $\forall i \in I_=$ and $Ax^\infty - Ax^{k+1} = Ax^\infty - b - r^{k+1}$ yield
$$\langle p^{k+1} - p^k, A(x^\infty - x^{k+1})\rangle = -\langle p^{k+1} - p^k, r^{k+1}\rangle + \sum_{i\in I^k \cap I_<} (p_i^{k+1} - p_i^k)\bigl(\langle a^i, x^\infty\rangle - b_i\bigr). \qquad (8.10)$$
For $j \ge \bar\jmath$ and $i \in I^k \cap I_<$ we have $i \in I_-^k$ with $r_i^{k+1} \le -\varepsilon$, so $p_i^{k+1} = 0$ by (8.3) and $\sum_{i\in I^k\cap I_<} p_i^k \le \langle r^{k+1}, p^{k+1} - p^k\rangle/\varepsilon$ by (8.5). Hence (8.10) and $\langle a^i, x^\infty\rangle < b_i$ $\forall i \in I_<$ imply
$$\sum_{k=k_j}^{k_j'-1} \langle p^{k+1} - p^k, A(x^\infty - x^{k+1})\rangle \le \Bigl(1 + \max_{i\in I_<}\,[\,b_i - \langle a^i, x^\infty\rangle\,]/\varepsilon\Bigr) \sum_{k=k_j}^{\infty} \bigl|\langle r^{k+1}, p^{k+1} - p^k\rangle\bigr| \to 0$$
as $j \to \infty$, using (8.7). This replaces (3.16) in the rest of the proof.

Proof of Lemma 3.11 for the BCR method. In the original proof, note that, for any $i \in \{1{:}m\}$, $K = \{k : i \in I^k\}$ is infinite by Assumption 8.2, and $r_i^\infty = \lim_{k\in K} r_i^{k+1} \le \limsup_{k\to\infty} \max_{i'\in I^k} r_{i'}^{k+1} \le 0$ (cf. Lemma 8.3). If $r_i^\infty < 0$ then $[p_i^{k+1} + r_i^{k+1}]_+ - p_i^{k+1} \to 0$ along $K$ (Lemma 8.3) and $p^{k+1} \ge 0$ $\forall k$ yield $p_i^{k+1} \to 0$ along $K$, so in fact $p_i^k \to 0$ because $p_i^{k+1} = p_i^k$ if $k \notin K$, and hence $[p_i^k + r_i^k]_+ - p_i^k \to 0$. The rest of the proof goes as before.

We conclude that for the BCR method under Assumption 8.2, Theorem 3.12 holds under Assumption 2.8, and Theorems 5.4 and 6.3 remain true; this follows from their proofs. Following [Tse90, §5.1], we note that these results extend to exact coordinate maximization, in which $p^{k+1} \in \operatorname{Arg\,max}\{q(p) : p \ge 0,\ p_i = p_i^k\ \forall i \notin I^k\}$. Specifically, if one can find the solution $x^{k+1}$ and Lagrange multipliers $\pi_i^k$, $i \in I^k$, of the problem
$$\begin{aligned} \text{minimize}\quad & f(x) + \sum_{i\notin I^k} p_i^k \langle a^i, x\rangle, && (8.11a)\\ \text{subject to}\quad & \langle a^i, x\rangle \le b_i \quad \forall i \in I^k, && (8.11b) \end{aligned}$$
one may set $p_i^{k+1} = \pi_i^k$, $i \in I^k$, and $p_i^{k+1} = p_i^k$, $i \notin I^k$.
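As an illustration (ours, not from the paper): with the quadratic cost $f(x) = \frac{1}{2}|x|^2$, so that $x(p) = -A^T p$, the subproblem (8.11) with a singleton block $I^k = \{i\}$ has a closed-form solution, and exact coordinate maximization reduces to Hildreth's method [Hil57, LeC80], i.e., $p_i \leftarrow [\,p_i + r_i/|a^i|^2\,]_+$ with $r = Ax(p) - b$. A minimal sketch under this assumption (the function name is ours):

```python
import numpy as np

def hildreth(A, b, sweeps=100):
    """Exact dual coordinate maximization sketch for
        min (1/2)|x|^2  s.t.  Ax <= b,
    whose dual is  max q(p) = -(1/2)|A^T p|^2 - <b, p>  over p >= 0.
    Maximizing q over the single coordinate p_i gives the closed form
        p_i <- max(0, p_i + r_i/|a_i|^2),   r = A x(p) - b,
    mirroring the fixed-point relation p_i = [p_i + r_i]_+ of (8.3)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    p = np.zeros(len(b))
    x = -A.T @ p                     # x(p) = -A^T p for f = (1/2)|.|^2
    for _ in range(sweeps):
        for i in range(len(b)):      # cyclic order; any free-steering order visiting all i works
            r_i = A[i] @ x - b[i]    # residual of constraint i at the current primal iterate
            t = max(-p[i], r_i / (A[i] @ A[i]))  # exact step, clipped to keep p_i >= 0
            p[i] += t
            x -= t * A[i]            # maintain x = -A^T p incrementally
    return x, p

# Projects the origin onto {x : x_1 + x_2 <= -2}; the solution is (-1, -1).
x, p = hildreth([[1.0, 1.0]], [-2.0])
```

For a block $I^k$ with several elements, one would instead solve (8.11) restricted to those coordinates; the single-coordinate version above is the classical row-action special case.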

A Appendix

Proof of Lemma 2.3. (a) This follows from [Roc70, Thm 24.7] if $\operatorname{ri} C_f = \operatorname{int} C_f$, and from [GoT89, Thm 1.2.7] if $\operatorname{int} C_f = \emptyset$ (briefly, letting $L$ be the linear span of $C_f - \tilde x$ for any $\tilde x \in C_f$, one defines $f_L(y) = f(\tilde x + y)$ with $\partial f_L(y) = \partial f(\tilde x + y) \cap L$ $\forall y \in L$, and applies [Roc70, Thm 24.7] to $f_L$ with $\operatorname{int} C_{f_L} \neq \emptyset$).

(b) Let $\bar y \in S$. There exist $\varepsilon > 0$ and $z^i \in \mathbb{R}^n$, $i = 1{:}j$, such that $S \cap B(\bar y, \varepsilon) = \{y : \langle z^i, y\rangle \le \langle z^i, \bar y\rangle\}$, where $B(\bar y, \varepsilon) = \{y : |y - \bar y| \le \varepsilon\}$. For any $y \in S \cap B(\bar y, \varepsilon)$, $\partial h(y) = \{0\} \cup \bigcup\{\mathbb{R}_+ z^i : \langle z^i, y\rangle = \langle z^i, \bar y\rangle\}$, so $\langle g, \bar y - y\rangle = 0$ for every $g \in \partial h(y)$ and $D_h^-(\bar y, y) = 0$.

(c) We have $h = h_1 + h_2$, where $h_1$ is polyhedral with $C_{h_1} = \mathbb{R}^n$ and $h_2 = \delta_{C_h}$. Since $\partial h = \partial h_1 + \partial h_2$ (cf. [Roc70, Thm 23.8]), $D_h^- = D_{h_1}^- + D_{h_2}^-$. Apply (a) to $h_1$ and (b) to $h_2$.

(d) By [Roc70, Cor. 7.5.1 and Thm 10.1], $f$ is continuous on $C_f$. Let $l = \inf_{y\in C_f} y$ and $c = \sup_{y\in C_f} y$. If $y \in (l, c)$ then $D_f^\sharp(y, y^k) \to 0$ by part (a). Suppose $y^k \downarrow y = l$. By [Roc70, pp. 227–230], $f'(y^k; 1) \downarrow f'(y; 1) \in [-\infty, \infty)$. If $f'(y; 1) > -\infty$ then $0 \le D_f^\sharp(y, y^k) = f(y) - f(y^k) + (y^k - y) f'(y^k; 1) \to 0$. If $f'(y; 1) = -\infty$ then $0 \le D_f^\sharp(y, y^k) \le f(y) - f(y^k) \to 0$ once $f'(y^k; 1) \le 0$. The case $y^k \uparrow y = c$ is similar.

Proof of Lemma 2.4. For each part, checking conditions (a)–(b) of Definition 2.1 is standard, so we only provide relations for verifying conditions (c)–(d).

(a) We have $C_f = \cap_{i=1}^k C_{f_i}$, $\partial f = \sum_{i=1}^k \partial f_i$ [Roc70, Thm 23.8] and $C_{\partial f} = \cap_{i=1}^k C_{\partial f_i}$. Use $D_f^\flat = \sum_{i=1}^k D_{f_i}^+$ and (by nonnegativity of the $D_{f_i}^+$) $L_f^\infty(x, \alpha) \subset \cap_{i=1}^k L_{f_i}^\infty(x, \alpha)$ $\forall x, \alpha$ for condition (c), and $D_f^\sharp = \sum_{i=1}^k D_{f_i}^-$ for condition (d).

(b) We have $C_f = \cap_{i=1}^j C_{f_i}$ and $\partial f(x) = \operatorname{co}\{\partial f_i(x) : i \in I(x)\}$ for $x \in C_f$ and $I(x) = \{i : f_i(x) = f(x)\}$ [GoT89, Thm 1.5.5], so $C_{\partial f} \subset \cup_{i=1}^j C_{\partial f_i}$, $f'(\cdot\,; \cdot) = \max_{i\in I(\cdot)} f_i'(\cdot\,; \cdot)$, $D_f^\flat(x, \cdot) \le \max_{i\in I(\cdot)} D_{f_i}^+(x, \cdot)$ and $L_f^\infty(x, \alpha) \subset \cup_{i=1}^j L_{f_i}^\infty(x, \alpha)$ $\forall \alpha$. If $\{y^k\} \subset C_{\partial f}$ converges to $y \in C_f$ then, for all large $k$, $I(y^k) \subset I(y)$ (by continuity of the $f_i$ on $C_f$) and $D_f^\sharp(y, y^k) = D_{f_i}^-(y, y^k)$ for some $i \in I(y^k)$, so $D_f^\sharp(y, y^k) \to 0$.

(c) As in (a), $C_f = C_{f_1}$, $\partial f = \partial f_1 + \partial f_2$, $C_{\partial f} = C_{\partial f_1}$ ($C_{\partial f_1} \subset C_{f_1} \subset \operatorname{ri} C_{f_2} \subset C_{\partial f_2}$), $D_f^\flat = D_{f_1}^+ + D_{f_2}^+ \ge D_{f_1}^+$ and $L_f^\infty(x, \alpha) \subset L_{f_1}^\infty(x, \alpha)$. If $\{y^k\} \subset \operatorname{ri} C_{f_2}$ converges to $y \in \operatorname{ri} C_{f_2}$ then $D_{f_2}^-(y, y^k) \to 0$ by Lemma 2.3(a).

(d) Using Lemma 2.3(d) and $\partial f = \prod_{i=1}^n \partial f_i$, argue as in part (a).

Proof of Lemma 2.5. Suppose $C_{h^*} = \operatorname{int} C_{h^*}$. If $L_h^\infty(x, \alpha/2)$ is unbounded for some $x \in C_h$ and $\alpha > 0$ then (cf. (2.1a)) there exist $z^k$ and $\eta^k \in \partial h(z^k)$ such that $h(z^k) + \eta^k(x - z^k) \ge h(x) - \alpha$ and $|z^k - x| \ge 1$ for all $k$, with $|z^k| \uparrow \infty$. Let $\zeta^k = x + (z^k - x)/|z^k - x|$ for all $k$. By convexity of $h$, $\zeta^k \in C_h$ and $h(\zeta^k) \ge h(z^k) + \eta^k(\zeta^k - z^k)$, so $h(\zeta^k) \ge h(x) - \alpha + \eta^k(\zeta^k - x)$ for all $k$. Suppose $K \subset \{1, 2, \ldots\}$ is such that $\{z^k\}_{k\in K} \uparrow \infty$. Then $\{\eta^k\}_{k\in K}$ is nondecreasing (cf. [Roc70, pp. 227–230]), $\zeta^k = x + 1$ and $\eta^k = \eta^k(\zeta^k - x) \le h(\zeta^k) - h(x) + \alpha$ for all $k \in K$, so there exists $\eta^\infty \in \mathbb{R}$ such that $\{\eta^k\}_{k\in K} \uparrow \eta^\infty$. But $h(z^k) + \eta^k(x - z^k) \ge h(x) - \alpha$ means $h^*(\eta^k) \le \alpha - h(x) + \eta^k x$, so in the limit $h^*(\eta^\infty) < \infty$ from closedness of $h^*$. Hence $\eta^\infty \in C_{h^*} = \operatorname{int} C_{h^*}$ yields the existence in $C_{h^*}$ of some $\eta > \eta^\infty$. Then $h(z^k) \ge \eta z^k - h^*(\eta)$ and $h(z^k) \le h(x) + \eta^k(z^k - x)$ imply $(\eta - \eta^k) z^k \le h^*(\eta) + h(x) - \eta^k x$ for all $k$. But the left side tends to $\infty$ as $\{z^k\}_{k\in K} \uparrow \infty$ (since $\eta^k \le \eta^\infty < \eta$), whereas (since $\{\eta^k\}_{k\in K} \uparrow \eta^\infty$) the right side is bounded, a contradiction. A similar contradiction is derived if $\{z^k\}_{k\in K} \downarrow -\infty$, using $\{\eta^k\}_{k\in K} \downarrow \eta^\infty$ and some $\eta < \eta^\infty$ in $C_{h^*}$. Hence $L_h^\infty(x, \alpha/2)$ must be bounded for all $\alpha > 0$.

Next, suppose $\bar\eta = \max \bar C_{h^*} \in C_{h^*}$. Using $\psi(z) = h(z) - \bar\eta z + h^*(\bar\eta) \ge 0$ and $\operatorname{im} \partial h \subset C_{h^*}$ (cf. [Roc70, Thm 23.5]), we have (cf. [Roc70, Thm 23.8]) $\partial\psi(z) = \partial h(z) - \bar\eta \subset -\mathbb{R}_+$ for all $z$, so $\psi$ is nonincreasing (cf. [Roc70, pp. 227–230]). Hence there exist $z^k \uparrow \infty$, $\eta^k \in \partial h(z^k)$ with $\eta^1 \le \eta^k \le \bar\eta$, and $x > 0$ such that $h(z^k) + \eta^k(x - z^k) \ge -h^*(\bar\eta) + (\bar\eta - \eta^k) z^k + \eta^k x \ge h(x) - \alpha$ for all $k$, with $\alpha = h^*(\bar\eta) + h(x) - \eta^1 x < \infty$. Thus $\{z^k\} \subset L_h^\infty(x, \alpha)$, so $L_h^\infty(x, \alpha)$ is unbounded. The case where $\bar\eta = \min \bar C_{h^*} \in C_{h^*}$ is similar.

We need the following result for proving Lemma 2.9.

Lemma A.1. Let $\tilde f(\cdot) = f^*(\tilde A\,\cdot)$, where $f$ is a closed proper essentially strictly convex function on $\mathbb{R}^n$ and $\tilde A$ is a linear map from $\mathbb{R}^m$ to $\mathbb{R}^n$. If $\tilde A^{-1} \operatorname{im} \partial f \neq \emptyset$ then $f^*$ and $\tilde f$ are essentially smooth, $C_{\nabla f^*} = \operatorname{im} \partial f$, $C_{\nabla\tilde f} = \tilde A^{-1} C_{\nabla f^*}$, and $\nabla f^* \tilde A$ and $\nabla\tilde f = \tilde A^T \nabla f^* \tilde A$ are continuous on $C_{\nabla\tilde f} = C_{\partial\tilde f} = \tilde A^{-1} C_{\nabla f^*}$, i.e., $\nabla\tilde f(p) = \tilde A^T \nabla f^*(\tilde A p)$ for any $p \in C_{\nabla\tilde f}$, with $x = \nabla f^*(\tilde A p) \iff \tilde A p \in \partial f(x) \iff x \in \operatorname{Arg\,min}\{f(\cdot) - \langle \tilde A p, \cdot\rangle\} \iff \tilde f(p) = \langle \tilde A p, x\rangle - f(x)$. If $f$ is co-finite then $\tilde A^{-1} \operatorname{im} \partial f \neq \emptyset$ and $\tilde f$ is continuously differentiable on $\mathbb{R}^m$.

Proof. By the assumptions on $f$ and [Roc70, Thm 26.3], $f^*$ is closed proper essentially smooth, so $\partial f^*(y) = \{\nabla f^*(y)\}$ $\forall y \in C_{\nabla f^*} = C_{\partial f^*}$ by [Roc70, Thm 26.1], and $\nabla f^* \tilde A$ is continuous on $\tilde A^{-1} C_{\nabla f^*}$ by [Roc70, Thm 25.5]. By [Roc70, Thm 23.5], $\partial f^* = (\partial f)^{-1}$, i.e., $x \in \partial f^*(y)$ iff $y \in \partial f(x)$, so $\operatorname{im} \partial f = C_{\partial f^*}$. If $\tilde A \tilde p \in \operatorname{im} \partial f = C_{\nabla f^*}$ for some $\tilde p$ then $\operatorname{ri} C_{\tilde f} = \tilde A^{-1} \operatorname{ri} C_{f^*}$ by [Roc70, Thm 6.7], since $C_{\tilde f} = \tilde A^{-1} C_{f^*}$, and $\partial\tilde f = \tilde A^T \partial f^* \tilde A$ by [Roc70, Thm 23.9]. Thus $\partial\tilde f$ is single-valued and $\tilde f$ is closed and proper (so is $f^*$), so $\tilde f$ is essentially smooth and $\partial\tilde f$ reduces to $\nabla\tilde f$ on $C_{\nabla\tilde f} = C_{\partial\tilde f}$ by [Roc70, Thm 26.1], where $\nabla\tilde f$ is continuous by [Roc70, Thm 25.5]. The equivalences follow from [Roc70, Thm 23.5]. If $f$ is co-finite then $\operatorname{im} \partial f = C_{f^*} = \mathbb{R}^n$.

Proof of Lemma 2.9. By Assumption 2.8 and Lemma A.1 with $\tilde A p = -A^T p$, $\tilde f = f^*(-A^T \cdot)$ is essentially smooth with $C_{\nabla\tilde f} = \{p : -A^T p \in \operatorname{im} \partial f\}$, and so is $-q(\cdot) = \tilde f(\cdot) + \langle b, \cdot\rangle$.

References

[ABC83] R. Aharoni, A. Berman and Y. Censor, An interior points algorithm for the convex feasibility problem, Adv. in Appl. Math. 4 (1983) 479–489.
[BCP93] J. B. Brown, P. J. Chase and A. O. Pittenger, Order independence and factor convergence in iterative scaling, Linear Algebra Appl. 190 (1993) 1–38.
[BeT89] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods, Prentice-Hall, Englewood Cliffs, NJ, 1989.
[BHT87] D. P. Bertsekas, P. A. Hosein and P. Tseng, Relaxation methods for network flow problems with convex arc costs, SIAM J. Control Optim. 25 (1987) 1219–1243.
[Bre67] L. M. Bregman, The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming, Zh. Vychisl. Mat. i Mat. Fiz. 7 (1967) 620–631 (Russian). English transl. in U.S.S.R. Comput. Math. and Math. Phys. 7 (1967) 200–217.

[CDPE+90] Y. Censor, A. R. De Pierro, T. Elfving, G. T. Herman and A. N. Iusem, On iterative methods for linearly constrained entropy maximization, in Numerical Analysis and Mathematical Modelling, A. Wakulicz, ed., Banach Center Publications, vol. 24, PWN-Polish Scientific Publishers, Warsaw, 1990, pp. 145–163.
[CDPI91] Y. Censor, A. R. De Pierro and A. N. Iusem, Optimization of Burg's entropy over linear constraints, Appl. Numer. Math. 7 (1991) 151–165.
[CeH87] Y. Censor and G. T. Herman, On some optimization techniques in image reconstruction from projections, Appl. Numer. Math. 3 (1987) 365–391.
[CeL81] Y. Censor and A. Lent, An iterative row action method for interval convex programming, J. Optim. Theory Appl. 34 (1981) 321–353.
[CeL87] Y. Censor and A. Lent, Optimization of "log x" entropy over linear equality constraints, SIAM J. Control Optim. 25 (1987) 921–933.
[Cen81] Y. Censor, Row-action methods for huge and sparse systems and their applications, SIAM Rev. 23 (1981) 444–466.
[Cen88] Y. Censor, Parallel application of block-iterative methods in medical imaging and radiation therapy, Math. Programming 42 (1988) 307–325.
[CeZ91] Y. Censor and S. A. Zenios, Interval-constrained matrix balancing, Linear Algebra Appl. 150 (1991) 393–421.
[CeZ92] Y. Censor and S. A. Zenios, Proximal minimization algorithm with D-functions, J. Optim. Theory Appl. 73 (1992) 451–464.
[DaR72] J. N. Darroch and D. Ratcliff, Generalized iterative scaling for log-linear models, Ann. Math. Statist. 43 (1972) 1470–1480.
[DPI86] A. R. De Pierro and A. N. Iusem, A relaxed version of Bregman's method for convex programming, J. Optim. Theory Appl. 51 (1986) 421–440.
[Eck93] J. Eckstein, Nonlinear proximal point algorithms using Bregman functions, with applications to convex programming, Math. Oper. Res. 18 (1993) 202–226.
[Elf80] T. Elfving, On some methods for entropy maximization and matrix scaling, Linear Algebra Appl. 34 (1980) 321–339.
[Elf89] T. Elfving, An algorithm for maximum entropy image reconstruction from noisy data, Math. Comput. Modelling 12 (1989) 729–745.
[Erl81] S. Erlander, Entropy in linear programs, Math. Programming 21 (1981) 137–151.
[Fal67] J. E. Falk, Lagrange multipliers and nonlinear programming, J. Math. Anal. Appl. 19 (1967) 141–159.
[FlZ90] S. D. Flam and J. Zowe, Relaxed outer projections, weighted averages and convex feasibility, BIT 30 (1990) 289–300.
[GoT89] E. G. Golshtein and N. V. Tretyakov, Modified Lagrange Functions: Theory and Optimization Methods, Nauka, Moscow, 1989 (Russian).
[GPR67] L. G. Gurin, B. T. Polyak and E. V. Raik, The method of projections for finding a common point of convex sets, Zh. Vychisl. Mat. i Mat. Fiz. 7 (1967) 1211–1228 (Russian). English transl. in U.S.S.R. Comput. Math. and Math. Phys. 7 (1967) 1–24.
[HeL78] G. T. Herman and A. Lent, A family of iterative quadratic optimization algorithms for pairs of inequalities with application in diagnostic radiology, Math. Programming Stud. 9 (1978) 15–29.
[Hil57] C. Hildreth, A quadratic programming procedure, Naval Res. Logist. Quart. 4 (1957) 79–85. Erratum, ibid., p. 361.
[Ius91] A. N. Iusem, On dual convergence and the rate of primal convergence of Bregman's convex programming method, SIAM J. Optim. 1 (1991) 401–423.
[Kiw94] K. C. Kiwiel, Block-iterative surrogate projection methods for convex feasibility problems, Linear Algebra Appl. ? (1994). To appear.

[Kru37] J. Kruithof, Calculation of telephone traffic, De Ingenieur (E. Elektrotechnik 3) 52 (1937) E15–E25.
[LaS81] B. Lamond and F. Stewart, Bregman's balancing method, Transportation Res. B 15 (1981) 239–249.
[LeC80] A. Lent and Y. Censor, Extensions of Hildreth's row-action method for quadratic programming, SIAM J. Control Optim. 18 (1980) 444–454.
[LiP87] Y. Y. Lin and J.-S. Pang, Iterative methods for large convex quadratic programs: A survey, SIAM J. Control Optim. 25 (1987) 383–411.
[LuT91] Z.-Q. Luo and P. Tseng, On the convergence of a matrix splitting algorithm for the symmetric monotone linear complementarity problem, SIAM J. Control Optim. 29 (1991) 1037–1060.
[LuT92a] Z.-Q. Luo and P. Tseng, Error bound and convergence analysis of matrix splitting algorithms for the affine variational inequality problem, SIAM J. Optim. 2 (1992) 43–54.
[LuT92b] Z.-Q. Luo and P. Tseng, On the convergence of the coordinate descent method for convex differentiable minimization, J. Optim. Theory Appl. 72 (1992) 7–35.
[LuT92c] Z.-Q. Luo and P. Tseng, On the linear convergence of descent methods for convex essentially smooth minimization, SIAM J. Control Optim. 30 (1992) 408–425.
[LuT93] Z.-Q. Luo and P. Tseng, Error bound and reduced-gradient projection algorithms for convex minimization over a polyhedral set, SIAM J. Optim. 3 (1993) 43–59.
[Man84] O. L. Mangasarian, Sparsity-preserving SOR algorithms for separable quadratic and linear programming problems, Comput. Oper. Res. 11 (1984) 105–112.
[NiZ92] S. S. Nielsen and S. A. Zenios, Massively parallel algorithms for singly constrained convex programs, ORSA J. Comput. 4 (1992) 166–181.
[NiZ93a] S. S. Nielsen and S. A. Zenios, A massively parallel algorithm for nonlinear stochastic network problems, Oper. Res. 41 (1993) 319–337.
[NiZ93b] S. S. Nielsen and S. A. Zenios, Proximal minimizations with D-functions and the massively parallel solution of linear network programs, Comput. Optim. Appl. 1 (1993) 375–398.
[Ott88] N. Ottavy, Strong convergence of projection-like methods in Hilbert spaces, J. Optim. Theory Appl. 56 (1988) 433–461.
[Roc70] R. T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, NJ, 1970.
[Roc84] R. T. Rockafellar, Network Flows and Monotropic Optimization, Wiley, New York, 1984.
[Teb92] M. Teboulle, Entropic proximal mappings with applications to nonlinear programming, Math. Oper. Res. 17 (1992) 670–690.
[TsB87] P. Tseng and D. P. Bertsekas, Relaxation methods for problems with strictly convex separable costs and linear constraints, Math. Programming 38 (1987) 303–321.
[TsB91] P. Tseng and D. P. Bertsekas, Relaxation methods for problems with strictly convex costs and linear constraints, Math. Oper. Res. 16 (1991) 462–481.
[Tse90] P. Tseng, Dual ascent methods for problems with strictly convex costs and linear constraints: A unified approach, SIAM J. Control Optim. 28 (1990) 214–242.
[Tse91] P. Tseng, Descent methods for convex essentially smooth minimization, J. Optim. Theory Appl. 71 (1991) 425–463.
[Tse92] P. Tseng, On the convergence of the products of firmly nonexpansive maps, SIAM J. Optim. 2 (1992) 425–434.
[ZeC91a] S. A. Zenios and Y. Censor, Massively parallel row-action algorithms for some nonlinear transportation problems, SIAM J. Optim. 1 (1991) 373–400.
[ZeC91b] S. A. Zenios and Y. Censor, Parallel computing with block-iterative image reconstruction algorithms, Appl. Numer. Math. 7 (1991) 399–415.
[Zen91] S. A. Zenios, On the fine-grain decomposition of multicommodity transportation problems, SIAM J. Optim. 1 (1991) 643–669.
