Path-Following Methods for Linear Programming
Author(s): Clovis C. Gonzaga
Source: SIAM Review, Vol. 34, No. 2 (Jun., 1992), pp. 167-224
Published by: Society for Industrial and Applied Mathematics
Stable URL: http://www.jstor.org/stable/2132853
Accessed: 21-09-2015 16:22 UTC


SIAM REVIEW, Vol. 34, No. 2, pp. 167-224, June 1992

© 1992 Society for Industrial and Applied Mathematics

PATH-FOLLOWING METHODS FOR LINEAR PROGRAMMING*

CLOVIS C. GONZAGA†

Abstract. In this paper a unified treatment of algorithms for linear programming methods based on the central path is described. This path is a curve along which the cost decreases, and that stays always far from the boundary of the feasible set. Several parameterizations of this curve are described in primal and primal-dual problems, and it is shown how different algorithms are obtained by following the curve using different parameterizations. Polynomial algorithms are obtained by following the curve approximately, and this concept becomes precise by using explicit rules for measuring the proximity of a point in relation to the central path.

Key words. linear programming, interior point methods, path-following algorithms

AMS(MOS) subject classification. 49D

1. Introduction. In this paper we study a family of algorithms for solving the linear programming problem

(P)    minimize c^T x
       subject to Ax = b
                  x ≥ 0,

where c ∈ R^n, b ∈ R^m, and A is a full-rank m × n matrix, n > m. We assume that the feasible region

S = {x ∈ R^n | Ax = b, x ≥ 0}

is bounded and has a nonempty relative interior given by

S^0 = {x ∈ R^n | Ax = b, x > 0}.
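A tiny concrete instance helps fix the format of (P). The sketch below uses a hypothetical toy instance (not from the paper): A = [1 1 1], b = 1, so that S is the standard simplex and S^0 its relative interior.

```python
# Hypothetical toy instance of (P): m = 1, n = 3, A = [1 1 1], b = 1.
# Then S is the standard simplex and S0 its relative interior.

def in_S(x, tol=1e-12):
    """x in S = {x : Ax = b, x >= 0} for A = [1, 1, 1], b = 1."""
    return abs(sum(x) - 1.0) <= tol and all(xi >= 0.0 for xi in x)

def in_S0(x, tol=1e-12):
    """x in S0: same equality constraint, but strictly positive."""
    return abs(sum(x) - 1.0) <= tol and all(xi > 0.0 for xi in x)
```

Points on a face of the simplex belong to S but not to S^0.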

The linear programming problem was first solved by Dantzig [14] forty years ago. The simplex method developed by him is still the most widely used algorithm, and it will possibly remain so in the future. Although the simplex method is efficient and elegant, it does not possess a property that became more and more charming in the last two decades: polynomial complexity. In fact, a problem devised by Klee and Minty [60] forced the simplex method to execute a number of arithmetical operations that grew exponentially with the number of variables of the problem, attaching to the method an exponential worst-case complexity.

The question on whether a polynomial algorithm for the linear programming problem exists was answered in 1978 by Khachiyan [58], [59]. He applied the ellipsoidal method of Shor [102] and Yudin and Nemirovskii [123] to the linear programming problem and proved a polynomial bound on the number of arithmetical operations needed to find an optimal solution. The bound, O(n^4 L), depends on a number L, the length of the input (total number of bits used in the description of the problem data), which is somewhat frustrating. The existence of a "strongly polynomial" algorithm, i.e., a method with a complexity bound based only on the number of variables and constraints, is still a difficult open problem. The method raised an enormous enthusiasm, and had a great impact

*Received by the editors September 4, 1990; accepted for publication (in revised form) May 10, 1991.
†COPPE, Federal University of Rio de Janeiro, C. Postal 68511, 21945 Rio de Janeiro, RJ, Brazil (email: gonzaga@brlncc.bitnet).



on the theory of complexity, but unfortunately the practical implementations have been irremediably inefficient. For comprehensive studies of these two approaches, see for instance Dantzig [15], Schrijver [100], and Goldfarb and Todd [32]. Complexity issues are discussed in Shamir [101], Megiddo [70], [71], Bland, Goldfarb, and Todd [12], Borgwardt [13], and Tardos [107].

In 1984, Karmarkar [55] published his algorithm, which not only had a polynomial complexity bound of O(n^3.5 L) operations, lower than Khachiyan's, but was announced as more efficient than the simplex method. There was initially much discussion about this claim, but now it is clear that well-coded versions of the new methodology are very efficient, especially when the problem size increases above some thousands of variables. Karmarkar's algorithm is essentially different from the simplex method in that it evolves through the (relative) interior of the feasible set, instead of following a sequence of vertices as does the simplex method. Karmarkar's algorithm has a flavor of nonlinear programming, in contrast with the combinatorial gait of the simplex method.

Karmarkar's algorithm in its original form needed a special formulation of the linear programming problem, and relied on the knowledge of the value of an optimal solution, or a process for generating efficient lower bounds for it. Soon standard-form variants were developed by Anstreicher [4], Gay [24], Gonzaga [34], Steger [106], and Ye and Kojima [121], and an efficient method for generating lower bounds for the optimal cost was devised by Todd and Burrell [108]. Another approach for finding lower bounds was developed by Anstreicher [3].

A thorough simplification of Karmarkar's algorithm reproduces the algorithm due to Dikin [16], [17], which now received the name of "affine-scaling." This method will be briefly discussed in §3.2. Karmarkar's algorithm, its variants and implementations, are discussed in Goldfarb and Todd [32]. We describe a variant of Karmarkar's algorithm in §3.6.
Our concern starts from the fact that Karmarkar's algorithm performs well by avoiding the boundary of the feasible set. And it does this with the help of a classical resource, first used in optimization by Frisch [22] in 1955: the logarithmic barrier function

x ∈ R^n, x > 0  ↦  p(x) = -Σ_{i=1}^n log x_i.

This function grows indefinitely near the boundary of the feasible set S, and can be used as a penalty attached to those points. Combining p(·) with the objective makes points near the boundary expensive, and forces any minimization algorithm to avoid them.

A question is then naturally raised: how far from the boundary should one stay? A successful answer was given through the definition of the analytic center of a polytope by Sonnevend [103], the unique point that minimizes the barrier function. A well-behaved curve is formed by the analytic centers of all the constant-cost slices of the feasible set in (P): the central path. This is the subject of this paper. This path is a region with some very attractive primal-dual properties, and provides an answer to our question: try to stay near the central path. Renegar did so in 1986 [96], and obtained the first path-following algorithm, with a complexity lower than that of Karmarkar's method in terms of number of iterations (O(√n L) against O(nL)). Renegar's approach was based on Huard's method of centers [50].

Soon afterwards Vaidya [111], refining Renegar's results, and Gonzaga [36], following a penalty function approach, described algorithms with an overall complexity of O(n^3 L) arithmetical operations, a limit that is still standing. Simultaneously, Kojima,


Mizuno, and Yoshise [65] developed a primal-dual path-following method, which was soon to be reduced to that low complexity by the same authors [64] and by Monteiro and Adler [89]. A fourth approach based on Karmarkar's potential function appeared later, first in Ye [118] and then in Freund [19], and in Gonzaga [41].

Only proven complexity results were cited in the brief historical account above. An amazing fact has been found out by Anstreicher [6]: the classical barrier function method (SUMT) developed for nonlinear programming by Fiacco and McCormick [18], exactly as implemented in 1968, solves linear and quadratic programs in O(√n L log L) iterations.

The field of interior point methods has been extremely active in the last few years. Over a hundred papers were written, developing the four approaches for linear programming, extending them to convex quadratic programming, to linear complementarity problems and to convex nonlinear programming. Path-following methods, which started as short-steps algorithms with nice theoretical properties, evolved into practical large-steps methods.

The purpose of this paper is to describe a unified treatment of central path algorithms, and to show how one arrives naturally at the four approaches commented above (methods of centers, penalty function methods, potential reduction methods, and primal-dual methods). We shall see that the good properties of points near the central path are intimately associated to nice primal-dual properties at these points, and this will provide the unifying concepts for the whole theory. And not surprisingly, we shall in the end be able to abandon the central path and work directly with primal-dual properties, while keeping all the nice properties of path-following methods.

Proofs will be given only for some results. We hope to achieve the goal of providing a complete treatment of one approach (penalty function methods), and to pave the way for straightforward analyses of the other ones. We shall restrain ourselves to linear programming, and we do not intend to make a survey of the field in this paper: it should rather be considered as a tutorial on the basic techniques.

Organization of the paper. The next section is an informal overview of the geometrical aspects of the methods. Section 3 describes the linear programming problem and the main tools used in interior point methods, including a variant of Karmarkar's algorithm with the Todd-Burrell lower bound updating procedure. Section 4 describes the central path and conceptual path-following algorithms, which assume that exact points on the path are computed by an oracle. The treatment stresses the similarities among several approaches and ends by a complexity theorem for conceptual algorithms. Sections 5 and 6 discuss nearly central points and centralization algorithms, allowing the construction of computationally implementable algorithms, in which only points near the central path are allowed. The specialization of these algorithms to the various approaches (penalty function methods, methods of centers, potential reduction methods, and primal-dual methods) is described in detail in §§7, 8, and 9. Section 10 has references to topics not covered in this paper and to extensions of the approach to nonlinear problems.

Notation. We shall work with column vectors and matrices denoted, respectively, by lower case and upper case letters. Different vectors will be denoted by superindices; subindices will denote components of a vector. These are some special conventions: For a vector like x, x^k, z, the corresponding upper case symbols X, X_k, Z will denote the diagonal matrix formed by the vector's components. Given a vector x ∈ R^n, the notation x^{-1} will be used for the vector with components x_i^{-1}, i = 1, ..., n. The letter e will denote the vector of ones, e = [1 ... 1]^T, with dimension indicated by the context.


For future reference, here is a listing of the main symbols used in the text:

e = [1 ... 1]^T.
X = diag(x_1, ..., x_n).
R^n_+, R^n_{++}: nonnegative and positive vectors in R^n.
x, w, z, A, b, c: variables and data for (P) and (D) (§3.1).
y, w, z̄, Ā, b, c̄: variables and data for scaled problems (§3.2).
x*, w*, z*: optimal solutions for (P) and (D) (§3.1).
v* = c^T x*: optimal value (§3.1).
S, S^0: feasible set for (P) and its relative interior (§3.1).
Z, Z^0: set of feasible dual slacks and its relative interior (§3.1).
P_M, P̄_M: projection matrix onto N(M) and its orthogonal complement (§3.1).
r^p = Pr: projection of r onto a subspace given by the context (§3.1).
p(·): barrier function (§3.3).
p ↦ x(p): generic parameterization of the central path (§4.2).
p ↦ z(p): generic parameterization of the dual central path (§4.2).
p ↦ v(p): dual cost (lower bound) associated to p (§4.2).
α: penalty multiplier (18).
K: upper bound for the cost (19).
v: lower bound for the cost (20).
f_α(·), f_K(·), f_v(·): penalized, center, and potential functions (§3.4).
q: fixed multiplier, q > n (§3.4).
χ: analytic center of S (§3.3).
h̄(x, p) = X^{-1} h(x, p): SSD direction from e in the scaled space (22).
h(x, p): SSD direction for f_p(·) from x (23).
δ(x, p) = ||h̄(x, p)||: proximity measure from x to x(p) (40).
v(p), z(p), Δ(p): lower bound, dual slacks, and gap associated to central points (§4.1).
v(x, p), z(x, p), Δ(x, p): lower bound, dual slacks, and gap associated to nearly central points (42).
v̂(x), ẑ(x), Δ̂(x): best guesses for lower bound, dual slacks, and gap at x (§3.5).

2. An overview of central path methods. This section discusses very informally geometrical aspects of the central path. Precise definitions, properties, and references will be given later.

Let us start by observing the barrier function

x ∈ R^n, x > 0  ↦  p(x) = -Σ_{i=1}^n log x_i = -log Π_{i=1}^n x_i.
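As a quick numerical illustration (a sketch on a hypothetical instance, not from the paper), the barrier can be evaluated on the slice {x > 0 : x_1 + x_2 + x_3 = 1}; minimizing p there is the same as maximizing the product of the variables, so the analytic center of this slice is e/3.

```python
import math

def p(x):
    """Barrier function p(x) = -sum_i log(x_i), defined for x > 0."""
    return -sum(math.log(xi) for xi in x)

center = [1.0 / 3.0] * 3          # analytic center of the slice
nearby = [0.2, 0.3, 0.5]          # another interior point of the slice
```

p(center) = 3 log 3 is strictly smaller than p at any other interior point of the slice.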

This function penalizes variables that approach zero, and hence penalizes points near the boundary of S. The unique minimizer of p(·) in S^0 is the analytic center of S, and it coincides with the point that maximizes the product of the variables in S. Figure 2.1 illustrates the center for a simple problem.

Much of the paper will be dedicated to showing that Newton-Raphson's method can be adapted to the determination of a center with a prescribed precision with excellent theoretical and practical performance. For the time being, let us assume that an exact solution is at hand.

The main idea behind all interior point methods is that costs should be decreased and one should simultaneously move away from the boundary. As is natural to do in the



FIG. 2.1. Level curves and values for the product of variables in S.

face of competing objectives, we shall examine combinations of cost and barrier function, in a traditional construction known as internal penalized function:

α ∈ R, x > 0  ↦  f_α(x) = α c^T x + p(x).

This function was extensively studied by Fiacco and McCormick in their book [18], and described since then in all nonlinear programming textbooks.

Now associate to each value of the parameter α a central point x(α) uniquely defined by

(1)    x(α) = argmin_{x∈S^0} f_α(x).

The curve α ↦ x(α) is the central path for the linear programming problem (P). The curve is smooth and has the important property that as α increases it converges to an optimal solution of (P).

There are several different descriptions of the central path, as we shall see below. Each description corresponds to a different parameterization of the curve. One of them has a simple geometrical interpretation: consider a central point

x(α) = argmin_{x∈S^0} {α c^T x + p(x)}.

This point obviously solves the problem obtained by constraining the cost to its value at x(α), that is,

x(α) = argmin_{x∈S^0} {p(x) | c^T x = c^T x(α)}.

This problem describes the analytic center of a constant-cost slice of the original feasible set S, and this is illustrated by Fig. 2.2.

Path-following algorithms follow the central path. Let p ∈ (w^-, w^+) ↦ x(p) be a parameterization of the central path, with w^+ > w^-, w^+ possibly infinite. All algorithms follow the model below.

ALGORITHM 2.1. Conceptual path-following: given x^0 ∈ S^0, p_0 ∈ (w^-, w^+), with x^0 = x(p_0).



FIG. 2.2. The central points are the analytic centers of the constant-cost slices of S.

k := 0. REPEAT

    Choose p_{k+1} > p_k.
    Call an internal minimization algorithm to find x^{k+1} := x(p_{k+1}).
    k := k + 1.

UNTIL convergence.

As it is, the model simply generates a sequence of independent central points. Actual algorithms will depend on the parameterization, the initialization (choice of p_0 and x^0), and what is more important, the criterion for updating the parameter.

The parameterization and the updating rule characterize different algorithms. As an example, the penalty function approach uses the parameterization in (1) and updates the parameter by

α_{k+1} := (1 + ρ) α_k,

where ρ is a positive constant.
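For a concrete feeling of this update rule, consider the hypothetical two-variable problem minimize x_1 subject to x_1 + x_2 = 1, x ≥ 0 (an illustration, not an instance from the paper). Eliminating x_2, the central point x(α) minimizes α x_1 - log x_1 - log(1 - x_1); the first-order condition reduces to the quadratic α x_1² - (α + 2) x_1 + 1 = 0, whose root in (0, 1) can be followed under the geometric parameter update:

```python
import math

def x1_central(alpha):
    """First coordinate of x(alpha) for: min x1  s.t.  x1 + x2 = 1, x >= 0.
    Root in (0, 1) of alpha*x1**2 - (alpha + 2)*x1 + 1 = 0, the
    first-order condition of alpha*x1 - log(x1) - log(1 - x1)."""
    disc = math.sqrt(alpha * alpha + 4.0)   # = sqrt((alpha+2)**2 - 4*alpha)
    return (alpha + 2.0 - disc) / (2.0 * alpha)

# Follow the path with the penalty update alpha := (1 + rho) * alpha.
rho, alpha, path = 1.0, 1.0, []
for _ in range(20):
    path.append(x1_central(alpha))
    alpha *= 1.0 + rho
```

The sequence decreases monotonically toward x_1 = 0, the optimal solution, as the theory predicts.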

The internal algorithm is essentially the same for all methods. Finding a central point is a minimization problem with a nonlinear objective function with a simple Hessian matrix, and a natural choice is the algorithm of Newton-Raphson. We shall use in all cases an algorithm to be discussed in the next section, called scaling-steepest descent (SSD). This method is in some cases exactly equivalent to Newton-Raphson. The crucial point to be discussed in relation to the internal algorithm is its stopping rule.

It is impossible to find a central point exactly in finite time, and we want to construct polynomial algorithms. Also from the practical point of view the internal algorithm should be terminated as soon as possible. We must then renounce the determination of central points, and work "near" the central path. Precise criteria must be defined for considering a point "near" a central point, and this will be the object of §5.

Assuming that we do have a good criterion for measuring proximity, our model can be improved a little, as follows. Figure 2.3 illustrates the behavior of algorithms in this model.



FIG. 2.3. Short and large step path-following algorithms.

ALGORITHM 2.2. Implementable path-following: given x^0 ∈ S^0, p_0 ∈ (w^-, w^+), with x^0 near x(p_0).
k := 0.
REPEAT
    Choose p_{k+1} > p_k.
    Call an internal minimization algorithm to find x^{k+1} near x(p_{k+1}). The algorithm starts at x^k.
    k := k + 1.
UNTIL convergence.

Figure 2.3 shows two possible combinations of parameter updates and proximity criteria. In the first case, a short-step update forces the algorithm to trace the path, so that all points generated are near the path. In this case, the internal algorithm usually executes exactly one iteration per iteration of the main algorithm. In the second case, the parameter is updated by large steps, and several iterations of the internal algorithm are needed to approach the central point corresponding to p_{k+1}.

3. Tools and non-path-following methods. This section establishes the main facts and definitions that compose the language of interior point methods in general.

3.1. The linear programming problem. The linear programming problem (P) was already stated in §1. We chose the format with equality constraints and nonnegative variables, but the equivalent format with inequality constraints and unrestricted variables could have been chosen as well. In fact, there are simple rules for "translating" results from one formulation into the other (see Gonzaga [35]). The extension to inequality constraints of the barrier function and of the notion of analytic center is straightforward by using slacks.

We assume as above that the constraint matrix A is full-rank, and that the feasible set S is bounded with nonempty relative interior S^0. These assumptions can be relaxed: we only need a bounded optimal set for most results, but this generalization affects the simplicity of the results.

We shall also assume that an initial interior point x^0 ∈ S^0 is at hand. This assumption is in practice replaced by an initialization procedure that modifies the problem. A



typical procedure is a big-M method, like the one discussed in Adler, Karmarkar, Resende, and Veiga [2].

With these hypotheses, the problem has an optimal solution x* (not necessarily unique), and the optimal value of the problem will be denoted by

(2)    v* = c^T x*.
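Under these hypotheses v* is attained at a vertex; on the hypothetical simplex instance (A = [1 1 1], b = 1, with an arbitrary cost vector chosen for illustration), the vertices are the unit vectors, so (2) can be checked by enumeration:

```python
c = [3.0, 2.0, 1.0]               # hypothetical cost vector
vertices = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def cost(x):
    """c^T x."""
    return sum(ci * xi for ci, xi in zip(c, x))

v_star = min(cost(v) for v in vertices)       # optimal value (2)
x_star = min(vertices, key=cost)              # an optimal solution x*
```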

Figure 3.1 illustrates problem (P). The figure shows the projection of c onto N(A), the null space of A. Projected vectors will play an important role in interior point methods, and we shall take a little space to review the concept of projection.


FIG. 3.1. The linear programming problem.

Two subspaces of R^n are associated to the linear transformation represented by A: the null space N(A) = {x ∈ R^n | Ax = 0}, and its orthogonal complement, the range space of A^T, defined by R(A^T) = {x ∈ R^n | x = A^T w, w ∈ R^m}. Any vector d ∈ R^n can be uniquely decomposed as d = d^p + d̄^p, where d^p ∈ N(A) and d̄^p ∈ R(A^T). d^p and d̄^p are, respectively, the projections of d onto N(A) and its orthogonal complement.

Since the projection operator is linear, it can be represented by a matrix P_A, such that d^p = P_A d. The orthogonal complement will be d̄^p = P̄_A d, where P̄_A = I - P_A.

If A is a full-rank matrix, then there is a closed formula for the projection matrix:

(3)    P_A = I - A^T (A A^T)^{-1} A.

The projection of d onto N(A) is the point in N(A) with smallest Euclidean distance to d. This is actually the most usual definition of projection:

(4)    d^p = argmin {||x - d|| | x ∈ N(A)}.

Similar statements are associated to the orthogonal complement:

(5)    d̄^p = argmin {||z|| | z = d - A^T w, w ∈ R^m},

(6)    ||d̄^p|| = min {||d - A^T w|| | w ∈ R^m}.
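For a single full-rank row these formulas collapse to something very concrete. With the hypothetical choice A = [1 1 1] (not an instance from the paper), formula (3) gives P_A d = d - mean(d)·e, so the decomposition d = d^p + d̄^p can be checked directly:

```python
def project_null_A(d):
    """P_A d = d - A^T (A A^T)^{-1} A d for the row A = [1, 1, 1];
    here (A A^T)^{-1} = 1/3, so the formula is d minus its mean."""
    m = sum(d) / len(d)
    return [di - m for di in d]

d = [3.0, 1.0, -1.0]
dp = project_null_A(d)                       # component in N(A)
dbar = [d[i] - dp[i] for i in range(3)]      # component in R(A^T)
```

Here A d^p = 0 and d̄^p is a constant vector, i.e., a multiple of A^T.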



The optimal set for a problem (P) does not change if we replace the objective by (c^p)^T x. The importance of the projected cost c^p should be clear: it provides the steepest ascent direction for the cost from an interior point. Given any differentiable function f: S^0 → R and an interior point x, the steepest descent direction for f(·) from x is -P_A ∇f(x).

Remark on notation. Given any matrix M, the projection matrix onto N(M) will be denoted by P_M. Whenever no confusion is possible, we use the simplified notation P for the projection matrix, and then the projection of a vector r will be denoted by r^p = Pr = P_M r.

Dual problem. The dual problem associated to (P) is

(D)    maximize b^T w
       subject to A^T w + z = c
                  z ≥ 0.

The variables z ∈ R^n are called dual slacks. Under our hypotheses, (D) has an optimal solution (w*, z*) (not necessarily unique), and b^T w* = v*.

The duality gap. Given any pair (x, z), where x ∈ S and (w, z) is feasible for (D) for some w ∈ R^m,

x^T z = c^T x - b^T w.

This is a well-known fact, which can be proved by direct substitution of z = c - A^T w. Note that optimality is equivalent to x^T z = 0: this is the theorem of complementary slacks.

The dual problem seems to have too many variables. In fact, the variables w can be eliminated, leading to a very convenient symmetrical primal-dual pair. This has been thoroughly studied by Todd and Ye [110], and we use here a very simple reduction procedure.

LEMMA 3.1. z ∈ R^n is a feasible dual slack for (D) if and only if z ≥ 0 and P_A z = P_A c.

Proof. Consider a vector z ≥ 0. Then z is a feasible dual slack if and only if for some w ∈ R^m,

c - z = A^T w.

But c - z can be decomposed in a unique way as c - z = P_A(c - z) + A^T w, and it follows from the comparison of the two expressions above that P_A(c - z) = P_A c - P_A z = 0, completing the proof. □
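Lemma 3.1 is easy to exercise numerically. In the sketch below (hypothetical data, with A = [1 1 1] as before), a dual slack built as z = c - A^T w with z > 0 passes the test P_A z = P_A c, while a perturbed vector fails it:

```python
def proj(d):
    """Projection onto N(A) for the row A = [1, 1, 1] (see (3))."""
    m = sum(d) / len(d)
    return [di - m for di in d]

c = [3.0, 2.0, 1.0]
w = -1.0                          # any multiplier giving z > 0
z = [ci - w for ci in c]          # z = c - A^T w = [4, 3, 2]
bad = [4.0, 3.0, 1.0]             # not of the form c - A^T w
```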

This lemma is very interesting: it provides a simple rule for testing the feasibility of a dual slack. Some conclusions can be obtained now.

Given any point x ∈ S, an equivalent dual problem (in the sense that the objective differs by a constant at all feasible points) can be written as

(7)    minimize x^T z
       subject to P_A z = P_A c
                  z ≥ 0.



Here the objective is the duality gap, and the optimal value is x^T z* = c^T x - v*.

Similarly, the primal problem can also be modified: its objective can be replaced by (c^p)^T x as we saw above, or by z^T x for any feasible dual slack z. The equivalent primal problem will be:

(8)    minimize z^T x
       subject to Ax = b
                  x ≥ 0.

Notation. The dual feasible set for (7) and its relative interior will be defined as

Z = {z ∈ R^n | P_A z = P_A c, z ≥ 0},
Z^0 = {z ∈ R^n | P_A z = P_A c, z > 0}.

3.2. The scaling-steepest descent algorithm. A scaling transformation on problem (P) is a change of variables x = Dy, where D is a positive diagonal matrix. Given a point x^0 ∈ S, scaling about x^0 is the scaling transformation x = X_0 y, where according to our notational convention, X_0 = diag(x^0_1, ..., x^0_n). The linear programming problem scaled about x^0 will be

(SP)    minimize c̄^T y
        subject to Ā y = b
                   y ≥ 0,

where Ā = A X_0, c̄ = X_0 c are obtained by substitution of x := X_0 y in (P). The point x^0 is transported to e, the vector of ones.

Scaling affects dual variables in a simple way.

LEMMA 3.2. (w, z) is a feasible dual solution for (P) if and only if (w, X_0 z) is a feasible dual solution for (SP).

Proof. (w, z) is feasible for (P) if and only if

A^T w + z = c,    z ≥ 0.

Multiplying by the positive diagonal matrix X_0,

(A X_0)^T w + X_0 z = X_0 c,    X_0 z ≥ 0.

This characterizes a dual feasible pair (w, X_0 z) for (SP), completing the proof. □

Remark on notation. The primal variables in scaled problems will be either y or x. All other entities associated to the scaled problem will be indicated by a bar.

There are several reasons why scaling is very useful. It obviously does not change problem (P), and so it is in principle innocuous. The first reason why we shall always work with scaled problems is that it simplifies the expressions in most of the theory and procedures to be studied, yielding very clear formulas.

The second reason is that scaling does affect the steepest descent direction. And it does so in a clever way, as we show now. Consider a generalization of problem (P) for a differentiable objective f(·):

minimize f(x),



and a point x^0 ∈ S^0. The steepest descent direction was studied by Cauchy around 1840: it is the direction that solves the minimization of the linear approximation of f(·) about x^0 over a ball centered in x^0,

(9)    minimize {∇f(x^0)^T d | ||d|| ≤ δ, d ∈ N(A)}.

The optimal solution stems from the Cauchy-Schwarz inequality, and is always a multiple of

(10)    h = -P_A ∇f(x^0).

The steepest descent direction may be very inefficient in constrained problems, as we illustrate in Fig. 3.2 for a very simple problem, with S = R^2_+. The steepest descent computation is actually a simple trust-region minimization: a simple objective (the linear approximation) is minimized in a simple region (a ball) to obtain a hint on the behavior of the function around the point. A ball is an obvious choice for trust region because it is easy (all one needs is a projection) and democratic (no directions are favored).

The presence of positivity constraints spoils the second advantage, and motivates the search for an easy region capable of reflecting the shape of the region of interest more precisely. The easiest large shape available is the largest possible simple ellipsoid in the positive orthant, shown in Fig. 3.2. The ellipsoid, with axes parallel to the coordinate axes (and hence simple), provides a large trust region when intersected with S.

FIG. 3.2. Trust region minimization in Cauchy and SSD algorithms.

Scaling the problem about x^0 deforms this ellipsoid into a ball centered at e, and hence the solution of the trust region minimization is obtained by scaling followed by the projection of the resulting gradient vector (see Fig. 3.3).

This analysis results in a general first-order interior trust region minimization algorithm that can in principle be used for any continuously differentiable objective function. This algorithm will be called scaling-steepest descent (SSD), and will be the minimization method used in most of this paper (primal-dual algorithms will use a slightly different scaling).

ALGORITHM 3.3 (SSD): given x^0 ∈ S^0, f: S^0 → R continuously differentiable.
k := 0.
REPEAT

    Scaling: Ā := A X_k, ḡ := X_k ∇f(x^k).



FIG. 3.3. Affine-scaling trust regions.

    Direction: h̄ := -P_Ā ḡ.
    Line search: y := e + λ h̄, y > 0.
    Scaling: x^{k+1} := X_k y.

k := k+ 1.

UNTIL convergence.
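One iteration of this scheme can be sketched for a hypothetical instance (not from the paper): for minimize x_1 subject to x_1 + x_2 = 1, x > 0, with f(x) = c^T x, c = [1, 0], the direction h = -X P_{AX} X c can be computed with a hand-coded projection, taking a fixed fraction of the maximum feasible step:

```python
def affine_scaling_step(x, gamma=0.9):
    """One SSD step with f(x) = c^T x (affine-scaling) for the toy LP
    min x1  s.t.  x1 + x2 = 1, x > 0  (A = [1, 1], c = [1, 0]).
    Direction h = -X P_{AX} X c; step = gamma * (max feasible step)."""
    a = [x[0], x[1]]                  # the single row of A X
    v = [x[0], 0.0]                   # scaled gradient X c
    t = (a[0] * v[0] + a[1] * v[1]) / (a[0] ** 2 + a[1] ** 2)
    pv = [v[0] - t * a[0], v[1] - t * a[1]]      # projection onto N(A X)
    h = [-x[0] * pv[0], -x[1] * pv[1]]           # direction; note A h = 0
    lam = gamma * min(-x[i] / h[i] for i in range(2) if h[i] < 0)
    return [x[i] + lam * h[i] for i in range(2)]

x = [0.5, 0.5]
for _ in range(10):
    x = affine_scaling_step(x)
```

On this instance the cost x_1 is cut by the factor 1 - γ at every step while the iterates stay interior; the projection is the operation that dominates the work of each iteration, as noted below for interior point methods in general.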

The scaling transformation transports x^k to the vector e. The direction h̄ minimizes the linear approximation of y ↦ f(X_k y) in a ball (corresponding to the largest simple ellipsoid in the original space). The line search along h̄ is not specified here: it is usually an approximate minimization of f(·) along h̄, perhaps with a heuristic procedure to avoid the boundary (not needed if the barrier function is present).

An amazingly efficient algorithm for linear programming is obtained by the direct application of SSD to the original problem (P). This is the method known as affine-scaling, first proposed by Dikin in 1967 [16]. Dikin took always a step of length one in the line search, i.e., λ = 1/||h̄||. Other researchers used large steps, a fixed percentage (above 95 percent) of the maximum possible step length in the positive orthant. Like in any interior point algorithm, the computational work is concentrated in the projection operation needed in each iteration.

The affine-scaling algorithm is naturally obtained as a simplified variant of Karmarkar's algorithm, and was rediscovered along this path by several authors: Barnes [9], Vanderbei, Meketon, and Freedman [115]. The algorithm is globally convergent for problems with no primal degeneration, as Dikin proved in 1974 [17] for unit step lengths. His proof was improved and clarified by Vanderbei and Lagarias [114], and extended to large steps by Gonzaga [39]. The method has been successfully implemented by many groups, like Adler, Karmarkar, Resende, and Veiga [2] and Monma and Morton [84]. See Goldfarb and Todd [32] for a discussion of implementations.

The search direction. We wrote the SSD algorithm using an explicit scaling operation at each iteration. This is not needed, since scaling was only a method for the


trust region minimization; the search direction can be explicitly expressed in the original space, and it is easy to see that it is given by

(11)    h = X_k h̄ = -X_k P_{A X_k} X_k ∇f(x^k).

space. directly intheoriginal canbe written The SSD algorithm xo E S0, f: S0 -* R continuously ALGORITHM 3.4. (SSD): given differentiable. k := 0. REPEAT

Direction:h == -XkPAXk XkVf (Xkk). Linesearch:xk+l := xk + Ah,xk+l > 0. k := k+ 1. UNTIL convergence. mathematics, tendstoproducehard-to-read expressions Sincetheuse oftheseexplicit space. do scalingandworkinthetransformed we shallfrequently inFig.3.3. The ellipforan example,are illustrated iterations, The affine-scaling at eachpointto the space: theycorrespond soidaltrustregionsare shownin original at thepoint.Herewe simpleellipsoidinBR+centered intersection ofS andthelargest tookunitsteplengths. algorithm needs function andtheanalytic center.Theaffine-scaling 3.3. Thebarrier byreandobtainsthisfeature trustregions, interior pointstogenerateniceellipsoidal and a fixedpercentage of A stepofunitlengthis inefficient, thesteplength. stricting may it the boundary, the points avoids stepis farfromelegant.Although themaximum algorithm isnot accumulate nearit.Therearegoodreasonstobelievethattheresulting polynomial (see MegiddoandShub[73]). a cenwillbe obtained bydefining An elegantwayofactively avoiding theboundary of the two reducing objectives: the and simultaneous consideration terfor polytope S, by offinding thecostfora whileandturntotheproblem Wenowforget costsandcentering. a "center" forthepolytope S. thecenter ofgravity, butitscompuofcenter isprobably Thebestpossibledefinition thanthelinearprogramming moredifficult problem tationis knowntobe verydifficult, volumeinscribed itself.Anothernicecenteris thecenterofan ellipsoidofmaximum inpolynomial and timebyKhachiyan in S. Although thisellipsoidhasbeencomputed centers.A thirdgoodcenter Theseare geometrical Todd[57],it is stilltoo difficult. in polynomial timefora has beendefined byVaidya[113],andcan also be computed volume centeris likebeforethecenterofa maximum givenprecision:thevolumetric ofS istakenamongthesimpleellipsoids (intersections ellipsoidinS, butthemaximum andellipsoids withaxesparalleltothecoordinate axes).Forthetimebeingitis alsotoo difficult methods. 
forpractical At thistime,themostusefulcenteris theanalytic center,definedbySonnevend centerofS is theuniquepointgivenby [103]:The analytic (12)

X = argminp(x). xESO

Approaching the analytic center (centering) depends on a good understanding of the barrier function. The notational conventions explained in the introduction will be used in the study of its properties. The barrier function p : \mathbb{R}^n \to \mathbb{R} is defined by

This content downloaded from 200.17.211.124 on Mon, 21 Sep 2015 16:22:39 UTC All use subject to JSTOR Terms and Conditions

180

CLOVIS C. GONZAGA

p(x) = -\sum_{i=1}^{n} \log x_i,

and has derivatives

(13) \nabla p(x) = -x^{-1}, \qquad \nabla^2 p(x) = X^{-2},

(14) \nabla p(e) = -e, \qquad \nabla^2 p(e) = I.

Convexity. Since \nabla^2 p(x) is positive definite in S^0, p(\cdot) is strictly convex. Besides this, p(x) grows indefinitely as x approaches the boundary of S, and thus the analytic center is well defined.

Effect of scaling. Consider a positive diagonal matrix D. We have

p(Dx) = p(x) - \sum_{i=1}^{n} \log d_i.

Given two points x^1, x^2 > 0,

p(Dx^2) - p(Dx^1) = p(x^2) - p(x^1),

and hence scaling operations do not affect variations of p(\cdot).

Here we see another reason for the use of scaling: while not affecting variations of the barrier function, scaling yields extremely easy derivatives. Still more striking is the fact that at e the Hessian matrix is the identity, with the consequence that the steepest descent direction from e coincides with the Newton-Raphson direction. Hence, the scaling-steepest descent algorithm and Newton-Raphson's method with line searches coincide for the barrier function.

Linear approximations around e. In our study of the efficiency of algorithms we shall use linear approximations of the barrier function. At this point we establish a bound on the error of the linear approximation around e. We begin by listing some results on the logarithm function around 1.

LEMMA 3.5. Let \lambda \in (-1, 1) be given. Then

(15) \lambda \ge \log(1 + \lambda) \ge \lambda - \frac{\lambda^2}{2} \cdot \frac{1}{1 - |\lambda|}.

Proof. The first inequality is a direct consequence of the concavity of the logarithm. The second inequality was proved by Karmarkar [55], by developing the logarithm in Taylor's series:

\log(1 + \lambda) = \lambda - \frac{\lambda^2}{2} + \frac{\lambda^3}{3} - \frac{\lambda^4}{4} + \cdots
\ge \lambda - \frac{\lambda^2}{2}\left(1 + |\lambda| + |\lambda|^2 + \cdots\right)
= \lambda - \frac{\lambda^2}{2} \cdot \frac{1}{1 - |\lambda|},

and the proof is complete. \Box
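Inequality (15) is easy to sanity-check numerically. The short sketch below is only an illustration; the grid resolution and endpoints are chosen arbitrarily inside (-1, 1).

```python
import numpy as np

# Check  lambda >= log(1 + lambda) >= lambda - (lambda^2 / 2) / (1 - |lambda|)
# on a grid inside (-1, 1), as in inequality (15).
lam = np.linspace(-0.95, 0.95, 381)
log_val = np.log1p(lam)            # log(1 + lambda), accurate near 0
upper = lam
lower = lam - 0.5 * lam**2 / (1.0 - np.abs(lam))
ok = bool(np.all(upper >= log_val) and np.all(log_val >= lower))
```

Both bounds are tight at \lambda = 0, where all three expressions vanish.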


PATH-FOLLOWING METHODS FOR LINEAR PROGRAMMING

181

Variation of the barrier function around e.

LEMMA 3.6. Consider a vector d \in \mathbb{R}^n such that \|d\|_\infty < 1. Then

(16) p(e + d) \ge \nabla p(e)^T d = -e^T d,

(17) p(e + d) \le \nabla p(e)^T d + \frac{\|d\|^2}{2} \cdot \frac{1}{1 - \|d\|_\infty} = -e^T d + \frac{\|d\|^2}{2} \cdot \frac{1}{1 - \|d\|_\infty}.

Proof. We have

p(e + d) = -\sum_{i=1}^{n} \log(1 + d_i).

Since d_i \in (-1, 1) by hypothesis, it is enough to extend the properties (15). The extension is straightforward, by adding the inequalities for i = 1 to n. \Box

Centering. Since the SSD algorithm coincides with Newton-Raphson's method with line searches for the minimization of the barrier function, it is natural to conclude that either method must be efficient for the determination of the analytic center. The resulting algorithm is indeed efficient both in theory and in practice: it is the only method used in the literature. Its complexity was studied by Vaidya [112], and will be revised in §6.

3.4. Auxiliary functions. Given a point in S^0, our task is obtaining a better point with respect to two goals: cost improvement and centering. As is natural when two objectives are present, we take combinations of them.

Following this reasoning, different auxiliary functions are constructed, each one leading to a different family of algorithms. Each auxiliary function uses a parameter that weights in some way the importance given to each of the two objectives. Each auxiliary function will be associated to one parameterization of the central path, as we shall see in detail in the next section. Here we simply list the functions used in primal methods (primal-dual methods will be examined separately).

(i) The penalized function (Frisch [22], Fiacco and McCormick [18]), with parameter \alpha associated to a duality gap:

(18) f_\alpha(x) = \alpha c^T x + p(x), \qquad x \in S^0.

(ii) The center function (Huard [50], Renegar [96]), with parameter K, an upper bound to the optimal cost, and a constant q \ge n:

(19) f_K(x) = -q \log(K - c^T x) + p(x), \qquad x \in S^0, \ c^T x < K.

(iii) The potential function (Karmarkar [55]), with parameter v, a lower bound for the optimal cost, and a constant q \ge n:

(20) f_v(x) = q \log(c^T x - v) + p(x), \qquad x \in S^0.
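Definitions (18)-(20) translate directly into code. In the sketch below the data (c, x) and the parameter values (\alpha, K, v, q) are made up, chosen so that v < c^T x < K; the gradient expressions in the last lines follow by differentiating each definition.

```python
import numpy as np

def barrier(x):
    return -np.sum(np.log(x))

def f_alpha(x, c, alpha):        # (18) penalized function
    return alpha * (c @ x) + barrier(x)

def f_K(x, c, K, q):             # (19) center function, requires c @ x < K
    return -q * np.log(K - c @ x) + barrier(x)

def f_v(x, c, v, q):             # (20) potential function, requires c @ x > v
    return q * np.log(c @ x - v) + barrier(x)

# Made-up data with v < c@x = 1.7 < K.
c = np.array([1.0, 2.0, 3.0])
x = np.array([0.5, 0.3, 0.2])

# Gradients obtained by direct differentiation (1/x taken componentwise):
g_alpha = 5.0 * c - 1.0 / x                  # grad f_alpha, alpha = 5
g_K = 3.0 / (2.5 - c @ x) * c - 1.0 / x      # grad f_K, K = 2.5, q = 3
g_v = 3.0 / (c @ x - 1.0) * c - 1.0 / x      # grad f_v, v = 1,   q = 3
```

In each case the second term of the gradient, -x^{-1}, comes from the barrier; only the cost term changes.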

In the notation used for these functions we sacrificed formal precision for simplicity: the actual function is singled out by the symbol used for the parameter. We shall dedicate much effort to each of these functions ahead. At this point we want to make some comments on their similarities.


All auxiliary functions have as second term the barrier function, responsible for avoiding the boundary. The first term involves the cost, and the parameter weights both terms. In (i), increasing \alpha increases the importance of the cost term; in (ii), decreasing K increases -\log(K - c^T x); in (iii) the same effect is obtained by increasing v.

Still more interesting is a comparison of the gradients of these functions at e (e will result from a scaling operation in algorithms), respectively,

\alpha c - e, \qquad \frac{q}{K - c^T e}\, c - e, \qquad \frac{q}{c^T e - v}\, c - e.

The steepest descent directions at e are all combinations of two vectors, -P_A c and P_A e, called respectively cost-reduction direction and centering direction. This is actually true for most existent interior point methods: the search directions used by the algorithms are combinations of the cost-reduction and centering directions (for scaled problems).

Another interesting conclusion is that given K or v, it is straightforward to find a value of \alpha such that \nabla f_\alpha(e) coincides with, respectively, \nabla f_K(e) or \nabla f_v(e):

(21) \alpha = \frac{q}{K - c^T e}, \qquad \alpha = \frac{q}{c^T e - v}.
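The coincidence (21) can be checked in a few lines. The values below are made up, subject only to v < c^T e < K.

```python
import numpy as np

c = np.array([1.0, 2.0, 3.0])
e = np.ones(3)
q, K, v = 3.0, 8.0, 2.0                  # made-up data with v < c@e = 6 < K

grad_alpha = lambda a: a * c - e         # gradient of f_alpha at e
grad_K = q / (K - c @ e) * c - e         # gradient of f_K at e
grad_v = q / (c @ e - v) * c - e         # gradient of f_v at e

# (21): the matching values of alpha reproduce the other two gradients.
same_K = bool(np.allclose(grad_alpha(q / (K - c @ e)), grad_K))
same_v = bool(np.allclose(grad_alpha(q / (c @ e - v)), grad_v))
```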

Descent directions for the auxiliary functions. Most internal algorithms apply the SSD algorithm to the auxiliary functions with a fixed value of the parameter. Let us introduce some notation for these descent directions.

Given a point x \in S^0, the descent directions in the original space will be denoted h(x, p), where p is a parameter among \alpha, K, v. The corresponding directions in scaled space will be denoted \bar{h}(x, p). We have

(22) \bar{h}(x, p) = -P_{AX} X \nabla f_p(x),

(23) h(x, p) = X \bar{h}(x, p).
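For the penalized function, (22)-(23) can be sketched as follows (hypothetical data; `null_proj` and `descent_direction` are ad-hoc names). The scaled gradient is X \nabla f_\alpha(x) = \alpha X c - e, and the resulting h is a feasible descent direction.

```python
import numpy as np

def null_proj(B):
    """Orthogonal projection matrix onto the null space of B."""
    n = B.shape[1]
    return np.eye(n) - B.T @ np.linalg.solve(B @ B.T, B)

def descent_direction(A, c, x, alpha):
    """h(x, alpha) per (22)-(23): hbar = -P_{AX} X grad f_alpha(x), h = X hbar."""
    X = np.diag(x)
    scaled_grad = alpha * X @ c - np.ones_like(x)   # X (alpha c - x^{-1})
    hbar = -null_proj(A @ X) @ scaled_grad
    return X @ hbar

A = np.array([[1.0, 1.0, 1.0]])
c = np.array([1.0, 2.0, 3.0])
x = np.array([0.5, 0.3, 0.2])
h = descent_direction(A, c, x, alpha=5.0)
# A @ h = 0 (h preserves feasibility) and grad f_alpha(x) @ h < 0 (descent).
```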

Now let us discuss the relationship between the SSD direction h(x, p) and the Newton-Raphson step (NR step). The main result is for the penalized function: the directions coincide.

LEMMA 3.7. Consider the function f_\alpha(\cdot) for a fixed \alpha > 0. The NR step from x coincides with the SSD direction, given by
(i) \bar{h}(e, \alpha) = -P_A \nabla f_\alpha(e) = -\alpha c_P + e_P;
(ii) h(x, \alpha) = -X P_{AX} X \nabla f_\alpha(x) = -X P_{AX} (\alpha X c - e).

Proof. Assume initially that x = e. Then the quadratic approximation of f_\alpha(\cdot) about e has derivatives given by

\nabla f_\alpha(e + h) = \nabla f_\alpha(e) + \nabla^2 f_\alpha(e) h = \nabla f_\alpha(e) + I h,

since \nabla^2 f_\alpha(e) = \nabla^2 p(e) = I. The NR step corresponds to the minimizer of the quadratic approximation, obtained by setting the projected gradient to zero:

P_A \nabla f_\alpha(e) + \bar{h}(e, \alpha) = 0,

completing the proof of (i).

To prove (ii), note that the NR algorithm is scale invariant. In fact, the quadratic approximation of a function does not depend on the metric of the space (in contrast


with the norm-dependent steepest descent direction). We conclude that for an arbitrary x \in S^0, the NR step in scaled space will be computed as in (i) in that space,

\bar{h}(x, \alpha) = -P_{AX} X \nabla f_\alpha(x),

completing the proof. \Box

For the other auxiliary functions, the Hessian matrix is influenced by the first term, and SSD is no longer equivalent to NR. But it can be interpreted as a quasi-Newton method in the following sense.

A quasi-Newton method minimizes at each iteration a quadratic model of the function, which may differ from the Taylor expansion:

f_p(x + h) = f_p(x) + \nabla f_p(x)^T h + \frac{1}{2} h^T E h.

The SSD algorithm uses E = \nabla^2 p(x) instead of E = \nabla^2 f_p(x). We lose the contribution

isnullinthe Thiscontribution termofthefunctions. ofthefirst ofthesecondderivatives term thefirst function, coincide.Forthepotential andthemethods penalizedfunction, definiteness thepositive matrix thatmaydestroy rank-one definite a negative contributes can theequivalence Forthecenterfunction, andthusis ignored. oftheHessianmatrix, in?8. tobe described transformation be reestablished bya problem 3.5. Guessinga dual slack and a lowerbound. The parameterused bythepotential

function f (*) mustbe a lowerboundto thevaluev ofan optimalsolution.We now a lowerbound a feasibledualslack,andconsequently forguessing a procedure describe in[40],andgivesthesameboundsas themethods waspresented forv. Thisprocedure andVial[25]usingprojective developedbyToddandBurrell[108],andbyde Ghellinck as theonesusedinall The dualslacksgenerated byithavethesameformat geometry. methods. reduction existent primalpotential associatestoita lowerboundv(x) > Givenanyfeasiblepointx E S, theprocedure fails,and no dual slackis generated.The -oo. If v(x) = -oo, thentheprocedure willbe ensuredbythefact(to be seenin ?5.2)thata good oftheprocedure usefulness path. atpointson ornearthecentral lowerboundwillalwaysbe generated thate E S. Suppose initially

From Lemma 3.1, we deduce that a vector z \in \mathbb{R}^n is a feasible dual slack if and only if z \ge 0 and z = c_P + \gamma, where \gamma \perp \mathcal{N}(A). Our guess consists in trying to find a "very positive" vector \gamma \perp \mathcal{N}(A) and adding it to c_P to obtain a nonnegative vector. The ideal guess would be \gamma proportional to e, but in general e is not orthogonal to \mathcal{N}(A). We try \gamma proportional to e - e_P.

Let us define the vector \bar{u} \perp \mathcal{N}(A) given by

(24) \bar{u} = \frac{e - e_P}{\|e - e_P\|^2}.

If for some \mu \in \mathbb{R}, c_P - \mu \bar{u} \ge 0, then z = c_P - \mu \bar{u} is a feasible dual slack. The duality gap associated to the primal-dual pair (e, z) will be

\Delta = e^T z = c_P^T e - \mu,

since \bar{u}^T e = 1, as is easy to see because (e - e_P)^T (e - e_P) = (e - e_P)^T e.

Now c^T e - \Delta is a lower bound for v. The best lower bound will correspond to the maximum admissible value for \mu.
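Under the assumption e \in S, the guess above can be sketched numerically. The data below are made up; equation (24) is garbled in the source, and the normalization \bar{u} = (e - e_P)/\|e - e_P\|^2 used here is an assumption, chosen because it yields \bar{u}^T e = 1 as the text requires.

```python
import numpy as np

def null_proj(B):
    """Orthogonal projection matrix onto the null space of B."""
    n = B.shape[1]
    return np.eye(n) - B.T @ np.linalg.solve(B @ B.T, B)

# Made-up problem with e feasible: S = {x >= 0 : x1 + 2 x2 + 3 x3 = 6}.
A = np.array([[1.0, 2.0, 3.0]])
c = np.array([2.0, 1.0, 1.0])
e = np.ones(3)

P = null_proj(A)
c_P, e_P = P @ c, P @ e
u = (e - e_P) / float((e - e_P) @ (e - e_P))   # assumed form of (24); u @ e = 1

# Here every component of u is positive, so the admissible set
# {mu : c_P - mu*u >= 0} is simply mu <= min_i c_P[i] / u[i].
mu = float(np.min(c_P / u))                    # maximum admissible mu
z = c_P - mu * u                               # feasible dual slack, z >= 0
gap = float(e @ z)                             # duality gap = c_P @ e - mu
lower_bound = float(c @ e) - gap               # lower bound for the optimal value v
```

For this particular data the procedure gives gap = 2 and lower bound 2, which happens to equal the optimal value of the toy LP (attained at (0, 0, 2)).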


We can now formalize the procedure, associating to each x \in S a lower bound v(x) \in [-\infty, v], obtained by the procedure above after scaling the problem about x:

(25) \Delta(x) = \inf\{\, e^T P_{AX} X c - \mu \mid P_{AX} X c - \mu \bar{u} \ge 0 \,\}, \qquad v(x) = c^T x - \Delta(x),

with the convention that \inf \emptyset = +\infty. If \Delta(x) < +\infty, then the procedure defines the dual feasible slack

(26) z(x) = X^{-1} (P_{AX} X c - \bar{\mu} \bar{u}),

where \bar{\mu} is the minimizer in (25).

3.6. Non-path-following variants of Karmarkar's algorithm. Karmarkar's original algorithm [55] is based on the potential function with q = n. It assumes that the optimal cost value v is known, and uses this parameter value from the beginning. Since the optimal value is seldom available, Karmarkar proposed the use of lower bounds to v, and an updating procedure. Updating procedures were soon improved in the references cited in §3.5.

Karmarkar's algorithm is not simply the SSD algorithm applied to the potential function. For completeness, we now present a very brief description of its mechanics. First, assume that the primal problem (P) is stated in the format

minimize   c^T x
subject to A'x = 0,
           a^T x = 1,
           x \ge 0.

Obtaining this format from (P) is straightforward with the introduction of an extra variable. Let q = n in the potential function (20) and assume initially that v = 0.

The resulting function is f_0(x) = n \log c^T x + p(x). It is zero-degree homogeneous, i.e., for any x > 0, \lambda > 0, f_0(\lambda x) = f_0(x). This means that given any point x > 0 such that A'x = 0 and a^T x > 0 but a^T x \ne 1, the point x / a^T x is feasible and has the same potential value (since a^T (x / a^T x) = 1). Thus the following scheme can be used:
(i) Drop the constraint a^T x = 1;
(ii) Use SSD to find x^k such that f_0(x^k) ...

... active region. Renegar found the clever property that when q \ge n the center is more than halfway toward the optimal solutions, that is, c^T x(K) \le (v + K)/2.
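Renegar's halfway property can be observed numerically. The sketch below is not the paper's algorithm: it simply minimizes f_K over a toy simplex with the scaled projected-gradient step of the form (22)-(23) and a crude backtracking rule, then checks c^T x(K) \le (v + K)/2 for q = n. All names and data are made up.

```python
import numpy as np

def null_proj(B):
    """Orthogonal projection matrix onto the null space of B."""
    n = B.shape[1]
    return np.eye(n) - B.T @ np.linalg.solve(B @ B.T, B)

def center_point(A, c, x, K, q, iters=500):
    """Approximate x(K) = argmin f_K over {x > 0 : A x = A x0, c @ x < K},
    using the scaled steepest-descent direction of (22)-(23)."""
    fK = lambda y: -q * np.log(K - c @ y) - np.sum(np.log(y))
    for _ in range(iters):
        X = np.diag(x)
        g_scaled = q / (K - c @ x) * (X @ c) - np.ones_like(x)  # X grad f_K
        h = X @ (-null_proj(A @ X) @ g_scaled)
        lam = 1.0
        while (np.any(x + lam * h <= 0) or c @ (x + lam * h) >= K
               or fK(x + lam * h) > fK(x)):
            lam *= 0.5
            if lam < 1e-14:
                return x
        x = x + lam * h
    return x

# Simplex {x >= 0 : x1 + x2 + x3 = 1}, c = (1, 2, 3): optimal value v = 1 at (1, 0, 0).
A = np.array([[1.0, 1.0, 1.0]])
c = np.array([1.0, 2.0, 3.0])
xK = center_point(A, c, np.ones(3) / 3, K=2.5, q=3)
# Halfway property: c @ xK should not exceed (v + K)/2 = 1.75 (here roughly 1.49).
```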

Again, the characterization of central points is well defined, since f_K(x) is strictly convex and grows indefinitely near the boundary of the restricted feasible region. The central point x(K) associated to K > v is uniquely determined by the condition

(32) P_{AX}\left( \frac{q}{K - c^T x}\, X c - e \right) = 0.

Consider the features of this parameterization. The first column of Table 1 describes x = x(K); let us look closely at the duality gap for the case in which q = n. We have

\Delta(K) = K - c^T x(K).

The duality gap equals the slack in the extra restriction c^T x < K. This shows that indeed c^T x is lower than halfway between v and K, and provides plenty of room to decrease K while keeping the constraint c^T x < K inactive. There is still more room if q > n.

The conceptual Algorithm 2.1 is specialized by specifying the update rule:

Set K_{k+1} := \beta K_k + (1 - \beta) c^T x^k.

The duality gap at x(K_{k+1}) will be such that \Delta(K_{k+1}) \ge \beta \Delta(K_k), since \Delta(K_{k+1}) = (n/q)(K_{k+1} - c^T x^{k+1}) and c^T x^{k+1} < c^T x^k. This characterizes the method as "optimistic," since the actual gap reduction is smaller than the desired reduction. Note that this does not destroy convergence, since K - v has a sound decrease at each iteration, as we show now.

LEMMA 4.4. Consider the center function f_K(\cdot) with q \ge n and the central point x = x(K) for K > v. Define r = n/q and

K' = \beta K + (1 - \beta) c^T x.

Then

(33) \frac{K' - v}{K - v} \le \frac{r + \beta}{1 + r} \le \frac{1 + \beta}{2},

and the gap is related to K - v by

(34) \Delta(K) \ge \frac{r (K - v)}{1 + r}.
