An Algorithm for Posynomial Geometric Programming, Based on Generalized Linear Programming

Jayant Rajgopal*
Department of Industrial Engineering, University of Pittsburgh, Pittsburgh, PA 15261

Dennis L. Bricker
Department of Industrial and Management Engineering, University of Iowa, Iowa City, IA 52242

* Corresponding author. This work was partially supported by the National Science Foundation, Grant No. DMII-9209935.

SUMMARY

This paper describes a column generation algorithm for posynomial geometric programming that is based on Dantzig's generalized linear programming principle. The algorithm exploits the linear constraint structure of the dual problem while successfully avoiding all of the traditional computational problems associated with dual-based algorithms. Test results indicate that the algorithm is extremely robust and can be used successfully to solve large-scale geometric programming problems on a microcomputer.

1. INTRODUCTION

Geometric Programming (GP) is a methodology for solving algebraic nonlinear optimization problems. The theory of GP was initially developed about three decades ago and culminated in the publication of the seminal text in this area by Duffin, Peterson and Zener [1967]. Subsequently, GP has been studied by a number of researchers. Its attractive structural properties as well as its elegant theoretical basis have led to a number of interesting applications and the development of numerous useful results. Perhaps the most important property of GP is that a problem with highly nonlinear constraints can be stated equivalently as one with only linear constraints. This allows for the development of powerful solution techniques, since linear constraints generally make a problem significantly easier to solve.

A second important aspect of GP is that it yields invariance properties that have very useful practical implications. In some cases (e.g., problems with "zero degrees of difficulty"), these properties imply that regardless of the cost coefficients, for a given set of design constraints the different elements of an optimal design should make the same relative contributions to total cost. These invariance properties are especially significant in engineering design. GP has found applications in a wide variety of areas, most notably engineering design, where many problems have cost and constraint functions that are power functions of the design variables and readily lend themselves to formulation as geometric programs.

In addition to engineering, Avriel [1980] and Ecker [1980] list numerous other application areas such as transportation, management science, planning, and reliability. Despite this, there appears to be a rather persistent belief that GP is a very specialized and narrow subset of nonlinear programming with a rigid format requirement that precludes most optimization problems from being addressed by this technique. This is, quite simply, not true. Beightler and Phillips [1976] show in their book how many nonlinear programs may be restated as geometric programs with very little additional effort, by simple techniques such as a change of variables or by straightforward algebraic manipulation of the terms. Moreover, it appears that this misconception also leads people to solve many GP problems via standard nonlinear programming techniques. While there are, of course, a multitude of general-purpose algorithms for nonlinear programming, a technique that specifically exploits the structure and properties of the problem at hand is intuitively appealing and should be expected to perform better; indeed, this point was discussed at length during a special conference on GP that was held at the Colorado School of Mines in 1992 to mark the 25th anniversary of the field's development. The consensus was that there is a need not only for good GP algorithms, but also for packaging these into a user-friendly system that can easily be used by practitioners who may be able to formulate their optimization problems but may not necessarily have any specialized knowledge of GP.

Several algorithms were developed specifically for GP in the first ten or so years after its inception; surveys of these may be found in Dembo [1976], Sarma et al. [1978] and Rijckaert and Martens [1978]. Since then, development has been somewhat slower, although of late some new techniques based on interior point methods have been presented, e.g., Bricker and Chang [1992], Kortanek and No [1992] and Kortanek, Xu and Ye [1997]. All of the solution techniques may be categorized as either primal-based algorithms that directly solve the nonlinear primal problem, or dual-based algorithms that solve the equivalent linearly constrained dual. There are two schools of thought on which approach is superior.
While the dual is intuitively more attractive on account of its relative structural simplicity, it also presents numerous computational problems, especially in the presence of slack primal constraints at the optimum, when difficulties arise with nondifferentiability. Furthermore, recovering the optimal values of the primal variables from a dual optimum in some of these situations can also involve complications, such as the need to solve "subsidiary problems." This has caused some researchers to abandon the linear structure and address the primal directly; however, one then obviously has to contend with a completely nonlinear problem.

This paper describes an algorithm that solves the linearly constrained GP dual while avoiding all of the computational difficulties traditionally associated with the dual. It has been tested on a large number of problems, for which it has found the optimum solution in every instance. The algorithm is based on a reformulation of the GP dual as a generalized linear program (GLP), as presented by Rajgopal and Bricker [1990], and uses column generation to solve a sequence of linear programs. It is written in FORTRAN, and in its current form the size of the problems that it can solve is limited only by available memory on the computer. Problem data can be entered interactively for smaller problems, or entered into a file in an appropriate format for larger problems. Unlike most codes for solving nonlinear programs, no starting point need be provided and there are no parameters of any kind that need to be specified. If desired, the user may set the tolerances on constraint feasibility and convergence; otherwise the algorithm uses default values. These values have been tested and the resulting optimum solutions are all well within reasonable tolerances. The entire package, including a user's manual and examples, is in a self-extracting file that has been placed in the public domain and can be retrieved by anonymous ftp from ftp.pitt.edu/ie/gp/gpinstal.exe.

2. THE PRIMAL-DUAL PAIR IN GEOMETRIC PROGRAMMING

The primal geometric program may be stated as follows:

Program A

Minimize $g_0(x)$  (1)

st  $g_k(x) \le 1, \quad k = 1, 2, \ldots, p,$  (2)

    $x \in \mathbb{R}^m_{++},$

where $g_k(x) = \sum_{i \in [k]} c_i \prod_{j=1}^{m} x_j^{a_{ij}}, \quad k = 0, 1, \ldots, p,$ and $c_i > 0$.  (3)
Each function gk (k=0,1,...,p) is a generalized polynomial consisting of the sum of the terms with index set [k]; each term within the polynomial has a strictly positive coefficient (so that the function is also called a posynomial) followed by a product of the design variables, each having an exponent that, unlike in a standard polynomial, is not restricted to the set of nonnegative integers but may be any real number (including negative numbers). Because of the possible fractional and/or negative exponents, the domain of a posynomial is restricted to the set of strictly positive real numbers. There are a total of n terms in the p+1 posynomials, and the index set I = {1, 2, ..., n} numbers these sequentially, so that the first term of posynomial 0 has index 1 and the last term of posynomial p has index n. The index subset [k] numbers the terms in the kth posynomial, with I = [0] ∪ [1] ∪ ... ∪ [p] and [k] ∩ [l] = ∅ for k ≠ l. The corresponding dual geometric program may be stated as

Program B

Maximize $v(\lambda, \delta) = \left[\prod_{i=1}^{n} \left(\frac{c_i}{\delta_i}\right)^{\delta_i}\right] \prod_{k=0}^{p} \lambda_k^{\lambda_k}$  (4)

st  $\sum_{i \in [k]} \delta_i = \lambda_k, \quad k = 0, 1, \ldots, p,$  (5)

    $\lambda_0 = 1,$  (normality)  (6)

    $\sum_{i=1}^{n} \delta_i a_{ij} = 0, \quad j = 1, 2, \ldots, m,$  (orthogonality)  (7)

    $\delta_i, \lambda_k \ge 0$ for all defined i and k.
Note that the dual has a highly nonlinear objective but is linearly constrained. There are (effectively) n variables (one δi corresponding to each term in the primal), since one could eliminate each λk by substituting $\sum_{i \in [k]} \delta_i$ for it. It is worth examining the structure of the dual a little further, since the proposed algorithm is based upon it. First, note that one could instead maximize the logarithm of the objective (since the logarithm is a monotone increasing function of its argument). There are several advantages to this. If the λk are eliminated by substituting from (5) into (4) and (6), the result is a concave function to be maximized over a convex set, and numerous efficient algorithms exist for solving problems with this property. Alternatively, if one retains the λk and works with the logarithm of the dual objective, the resulting function, although no longer concave over its domain, is separable in the δi and λk variables, and once again there are various techniques that can exploit this separability.
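Written out, with $\lambda_k$ eliminated via (5) (and noting that the $k=0$ term vanishes because $\lambda_0 = 1$), the log-dual objective reads:

```latex
\ln v(\delta) \;=\; \sum_{i=1}^{n} \delta_i \ln\frac{c_i}{\delta_i}
\;+\; \sum_{k=1}^{p} \Big(\sum_{i\in[k]}\delta_i\Big)\,
\ln\!\Big(\sum_{i\in[k]}\delta_i\Big),
```

to be maximized subject to $\sum_{i\in[0]} \delta_i = 1$, the orthogonality conditions (7), and $\delta \ge 0$.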

In summary, the polyhedral constraint set and the attractive structural features of the objective make the dual a natural choice to solve, as opposed to the primal. Unfortunately, there are also certain disadvantages associated with dual-based methods, and it is worth examining these as well. These problems arise primarily when one or more primal constraints are slack at the optimum. First, if primal constraint k is inactive at the optimum, then λk is equal to zero at the dual optimum (it is easily shown that λk is equivalent to the Lagrange multiplier for constraint k, so this result follows from the complementary slackness property). Then (5) implies that each δi, i∈[k], must also be equal to zero. This causes several difficulties. First, terms such as $(c_i/\delta_i)^{\delta_i}$ and $\lambda_k^{\lambda_k}$ (or $\delta_i \ln \delta_i$ and $\lambda_k \ln \lambda_k$ if the logarithm of the dual objective is being used) are no longer defined. However, in the limit these quantities approach 1 (or 0 with the log-dual objective) and may therefore be explicitly defined as being equal to 1 (or 0) when δi and λk are zero; in a computer code, special provisions need to be made for this. This does not eliminate the difficulties associated with variables equal to zero at optimality, because the objective function is nondifferentiable at zero. This causes problems for any algorithm that requires the computation of gradients, and special provisions need to be made when such algorithms are coded. A second disadvantage lies in the recovery of the primal solution vector x from the optimal dual solution (λ, δ). With slack constraints at the optimum there may occasionally be insufficient information to recover the primal solution vector.
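The 0·ln(0) = 0 convention just described can be implemented directly; a small sketch of evaluating the log-dual objective this way, using `scipy.special.xlogy` (which defines `xlogy(0, 0) = 0`):

```python
import numpy as np
from scipy.special import xlogy   # xlogy(0, 0) == 0, exactly the convention needed

def log_dual(c, delta, groups):
    """ln v(lambda, delta), with lambda_k = sum of delta_i over [k] and 0*ln(0) taken as 0."""
    val = -sum(xlogy(d, d / ci) for ci, d in zip(c, delta))   # sum of delta_i ln(c_i/delta_i)
    for idx in groups:
        lam = sum(delta[i] for i in idx)
        val += xlogy(lam, lam)                                # lambda_k ln(lambda_k)
    return val

# A zero delta_i causes no NaN or log(0) warning: its term simply drops out.
v0 = log_dual([2.0, 1.0], [1.0, 0.0], [[0, 1]])   # = ln 2
```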

In order to see this, first consider the relationship between the primal and dual solutions at their respective optima. These are as follows:

$c_i \prod_{j=1}^{m} x_j^{a_{ij}} = \delta_i\, v(\lambda, \delta), \quad i \in [0],$  (8)

$c_i \prod_{j=1}^{m} x_j^{a_{ij}} = \delta_i / \lambda_k, \quad i \in [k]$ such that $k \ge 1,\ \lambda_k > 0.$  (9)

In the above system there are m unknowns (the vector x), and one equation corresponding to each term in the objective, as well as to each term that lies in a constraint posynomial for which the corresponding value of λk is positive at the dual optimum. Assuming that the program is canonical (in essence, a "well-behaved" problem), the optimum values of δi for such k will also be positive, and one may then take logarithms on both sides of the above equations to obtain a system that is linear in $w_j = \ln x_j$:

$\sum_{j=1}^{m} a_{ij} w_j = \begin{cases} \ln[\delta_i\, v(\lambda, \delta)/c_i], & i \in [0] \\ \ln[\delta_i/(\lambda_k c_i)], & i \in [k] \text{ with } k \ge 1,\ \lambda_k > 0 \end{cases}$  (10)
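When enough equations of the form (10) are available, the system can be solved (e.g., by least squares) for $w$ and hence for $x$. A minimal sketch with made-up, consistent data standing in for the right-hand sides:

```python
import numpy as np

# Hypothetical exponent rows corresponding to terms with delta_i > 0.
A = np.array([[3.0, -2.0],
              [-1.0, 0.0],
              [1.0, -3.0]])
x_true = np.array([1.5, 0.8])
rhs = A @ np.log(x_true)            # plays the role of ln[delta_i v/c_i] etc. in (10)

w, *_ = np.linalg.lstsq(A, rhs, rcond=None)   # solve the log-linear system
x = np.exp(w)                                  # recovered primal point
```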

If all the λk and the δi are positive, then the coefficient matrix for the vector w is the (n-row by m-column) exponent matrix A of the original dual problem.

The reduced cost $\bar{G}_k$ of a column generated from a vector $\rho^k \in Q_k$ is given by

$\bar{G}_k = \begin{cases} \sum_{j=1}^{m} w_j A_{kj}(\rho^k) - G_k(\rho^k), & k > 0 \\ w_0 + \sum_{j=1}^{m} w_j A_{kj}(\rho^k) - G_k(\rho^k), & k = 0 \end{cases}$  (23)

If there exists $\rho^k \in Q_k$ such that $\bar{G}_k$ is negative, then the corresponding column may be introduced into the basis to improve the objective. Equivalently, there is no attractive column from $Q_k$ as long as $\bar{G}_k \ge 0$ for all $\rho^k \in Q_k$. Thus, for each k, a subproblem of maximizing $\bar{G}_k$ over all $\rho^k \in Q_k$ is solved, and its optimal value is checked for negativity. If it is negative, the corresponding value of $\rho^k$ is used to generate an additional column for the LP; otherwise there is no beneficial column for k. It is also easy to show that this subproblem is trivially solved, since it has a closed-form solution. Specifically, the ith element of the maximizing vector $\rho^k$ is given by

$\rho_i^{k*} = \dfrac{c_i \exp\left(-\sum_{j=1}^{m} w_j a_{ij}\right)}{\sum_{s \in [k]} c_s \exp\left(-\sum_{j=1}^{m} w_j a_{sj}\right)}.$  (24)

Substituting the value of $\rho^{k*}$ from (24) into the expressions for $G_k(\rho^k)$ and $A_{kj}(\rho^k)$ given by (12) and (13), and the resulting values into the expression for the reduced cost $\bar{G}_k$ given by (23), yields the maximum value of the latter. After some algebraic simplification, it is easily shown that this value is $-\ln\Big[\sum_{i \in [k]} c_i \exp\big(-\sum_{j=1}^{m} w_j a_{ij}\big)\Big]$ for $k \ge 1$, and $w_0 - \ln\Big[\sum_{i \in [0]} c_i \exp\big(-\sum_{j=1}^{m} w_j a_{ij}\big)\Big]$ for $k = 0$. It is thus clear that for the LP to be at its optimum,

$-\ln\Big[\sum_{i \in [k]} c_i \exp\big(-\sum_{j=1}^{m} w_j a_{ij}\big)\Big] \ge 0,$ i.e., $\sum_{i \in [k]} c_i \exp\big(-\sum_{j=1}^{m} w_j a_{ij}\big) \le 1, \quad k \ge 1,$  (25)

$w_0 - \ln\Big[\sum_{i \in [0]} c_i \exp\big(-\sum_{j=1}^{m} w_j a_{ij}\big)\Big] \ge 0,$ i.e., $\sum_{i \in [0]} c_i \exp\big(-\sum_{j=1}^{m} w_j a_{ij}\big) \le \exp(w_0).$  (26)

In order to see an elegant relationship between the preceding statement and the original primal, let us define

$x_j = \exp(-w_j), \quad j = 1, 2, \ldots, m.$  (27)

Substituting from (27) into (25) and (26) yields the following conditions for optimality:

$\sum_{i \in [k]} c_i \prod_{j=1}^{m} x_j^{a_{ij}} = g_k(x) \le 1, \quad k = 1, \ldots, p,$  (28)

$\sum_{i \in [0]} c_i \prod_{j=1}^{m} x_j^{a_{ij}} = g_0(x) \le \exp(w_0).$  (29)

Thus, at each iteration of the algorithm, the simplex multipliers corresponding to (18) are negated and then exponentiated in order to obtain estimates of the primal solution vector via (27). Then (28) is used to check whether this vector is feasible in the original GP primal, and (29) is used to check for any duality gap. If (28) and (29) are satisfied, then the vector x must be optimal for the original problem and the algorithm stops. Otherwise, corresponding to each k for which (28) is violated (and for k=0 if (29) is violated), a vector ρk is computed via (24) and used to generate a column, which is then added to the current LP approximation. This is solved to obtain a new vector of simplex multipliers, and the procedure continues.

4.3 Specification of the Algorithm

The GPGLP algorithm discussed above may now be specified via the following steps:

STEP 0: Define a suitable tolerance ε > 0. Construct the initial LP approximation to Program D by adding nk columns for each k as described in Section 4.1.

STEP 1: Solve the current LP approximation and obtain the simplex multiplier vector w = (w1, w2, ..., wm, w0).

STEP 2: Obtain xj from wj via (27). For k=1,2,...,p check whether gk(x) < 1+ε, and for k=0 check whether g0(x) < exp(w0)+ε. If all of these conditions hold, STOP: x is optimal. Otherwise, for each k whose condition is violated, compute ρk via (24), add the corresponding column to the LP approximation, and return to Step 1.

As an example, consider the following problem:

Minimize $g_0(x) = 0.44 x_1^3 x_2^{-2} + 10 x_1^{-1} + 0.592 x_1 x_2^{-3}$

st $8.62 x_1^{-1} x_2^{3} \le 1$, with $x_1, x_2 > 0$.
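The steps above can be sketched in Python. This is a minimal illustration, not the authors' FORTRAN/XMP code: it uses `scipy.optimize.linprog` (HiGHS) as the LP engine, simple column addition rather than the paper's column-replacement strategy, and the dual signs follow SciPy's convention for equality-constraint marginals (so the paper's multipliers are their negatives). The data are those of the example problem.

```python
import numpy as np
from scipy.optimize import linprog

# Example problem data: terms 0-2 form g0, term 3 is the single constraint g1 <= 1.
c = np.array([0.44, 10.0, 0.592, 8.62])
A = np.array([[3.0, -2.0],
              [-1.0, 0.0],
              [1.0, -3.0],
              [-1.0, 3.0]])
groups = [[0, 1, 2], [3]]   # index sets [0] and [1]
m = A.shape[1]
eps = 1e-6                  # relative optimality/feasibility tolerance

def make_column(k, rho):
    # G_k(rho) and A_kj(rho) as in (12)-(13); last entry sigma marks k = 0 columns.
    idx = groups[k]
    G = sum(r * np.log(c[i] / r) for i, r in zip(idx, rho) if r > 0)
    a = sum(r * A[i] for i, r in zip(idx, rho))
    return G, np.append(a, 1.0 if k == 0 else 0.0)

# STEP 0: one unit vector rho per term gives the initial columns.
costs, cols = [], []
for k, idx in enumerate(groups):
    for pos in range(len(idx)):
        rho = np.zeros(len(idx)); rho[pos] = 1.0
        G, col = make_column(k, rho)
        costs.append(G); cols.append(col)

b = np.append(np.zeros(m), 1.0)     # orthogonality rows = 0, normality row = 1
for _ in range(100):
    # STEP 1: linprog minimizes, so negate the (maximization) objective.
    res = linprog(-np.array(costs), A_eq=np.column_stack(cols), b_eq=b,
                  method="highs")
    w = -res.eqlin.marginals        # paper's multipliers (w_1, ..., w_m, w_0)
    x = np.exp(-w[:m])              # STEP 2: primal estimate via (27)
    g = [sum(c[i] * np.prod(x ** A[i]) for i in idx) for idx in groups]
    new = []
    for k, idx in enumerate(groups):
        bound = np.exp(w[m]) if k == 0 else 1.0
        if g[k] > bound * (1 + eps):                  # (28)/(29) violated
            z = np.array([c[i] * np.exp(-A[i] @ w[:m]) for i in idx])
            new.append(make_column(k, z / z.sum()))   # rho* via (24)
    if not new:                      # feasible and no duality gap: stop
        break
    for G, col in new:
        costs.append(G); cols.append(col)

print(x, g[0])
```

Run on this example, the loop reproduces the solution reported below (x1 ≈ 1.2867, x2 ≈ 0.5305, g0 ≈ 16.206).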

Step 0 generates the initial ρ0 vectors [1 0 0 0]T, [0 1 0 0]T and [0 0 1 0]T from the set Q0, and the vector ρ1 = [0 0 0 1]T from the set Q1. From (12) and (13), A01(ρ0) = 3ρ1 − ρ2 + ρ3, A02(ρ0) = −2ρ1 − 3ρ3 and G0(ρ0) = ρ1 ln(0.44/ρ1) + ρ2 ln(10/ρ2) + ρ3 ln(0.592/ρ3). For the three vectors from Q0 these yield the columns [Gk(ρk) Ak1(ρk) Ak2(ρk) σ]T as [ln(0.44) 3 −2 1]T, [ln(10) −1 0 1]T and [ln(0.592) 1 −3 1]T for the initial LP. Similarly, A11(ρ1) = −ρ4, A12(ρ1) = 3ρ4 and G1(ρ1) = ρ4 ln(8.62/ρ4), so that the vector from Q1 yields the column [ln(8.62) −1 3 0]T. The initial LP solved is thus the following:

Maximize λ01 ln(0.44) + λ02 ln(10) + λ03 ln(0.592) + λ11 ln(8.62)

st

3λ01 − λ02 + λ03 − λ11 = 0
−2λ01 − 3λ03 + 3λ11 = 0
λ01 + λ02 + λ03 = 1
All λ ≥ 0.

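The column arithmetic in this example is easy to check numerically; a small sketch using the example data (coefficients and exponents as above):

```python
import numpy as np

c = np.array([0.44, 10.0, 0.592, 8.62])
A = np.array([[3.0, -2.0], [-1.0, 0.0], [1.0, -3.0], [-1.0, 3.0]])
groups = [[0, 1, 2], [3]]

def column(k, rho):
    # (G_k(rho), A_k1(rho), A_k2(rho), sigma) as in (12)-(13)
    idx = groups[k]
    G = sum(r * np.log(c[i] / r) for i, r in zip(idx, rho) if r > 0)
    a = sum(r * A[i] for i, r in zip(idx, rho))
    return (G, a[0], a[1], 1.0 if k == 0 else 0.0)

first = column(0, [1.0, 0.0, 0.0])     # -> (ln 0.44, 3, -2, 1)
mono = column(1, [1.0])                # -> (ln 8.62, -1, 3, 0)
added = column(0, [0.35, 0.35, 0.30])  # the column added at the first iteration
```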
(Note: the superscripts on λ correspond to particular candidates from a set Qk.) The solution to this LP (Step 1) yields the simplex multipliers w1 = −0.5063, w2 = 0.5492, w0 = 1.7963, so that x1 = exp(−w1) = 1.659 and x2 = exp(−w2) = 0.577 (Step 2). The value of g1(x) = 0.998 < 1, while g0(x) = 17.175 > exp(w0) = 6.027. Thus we generate a new column only for k=0. Using (24), the corresponding vector ρ0 is [0.35 0.35 0.30 0]T, which in turn generates the column [1.457 1 −1.6 1]T. This is added to the previous LP approximation along with a new variable λ04, and the procedure continues. Using ε = 10^−6, the procedure took 12 iterations to converge to the optimum solution x1 = 1.28669, x2 = 0.53046 with an objective of 16.2059; a total of 18 columns were generated.

5. DISCUSSION

Some features of the algorithm presented in the previous section are now discussed. First, unlike most nonlinear programming algorithms, the user does not need to specify any starting point; the procedure used in Step 0 guarantees a dual feasible starting point (assuming one exists). It may be shown via the well-known universal convergence theorem [Zangwill, 1969] that, under some mild assumptions of compactness, the algorithm is guaranteed to converge as long as the original problem is feasible and has an attained minimum. The compactness assumptions are readily satisfied by placing a suitable upper bound on each variable if necessary. Although a linear program is solved at each step, the computational effort is minimal, since the optimum from the prior step constitutes a valid starting basis that is usually very close to the new optimum, which is then reached after a small number of pivots.

An interesting feature of the algorithm is that if a constraint has a single term in it (i.e., the corresponding posynomial is a monomial), then the value of the corresponding ρ must always be equal to 1. Thus for such constraints one never has to generate any columns after Step 0. As a corollary, if all posynomials are monomials then the algorithm solves only a single linear program; this, of course, also follows trivially from the fact that the original GP may be converted to an LP by a logarithmic transformation. Simple upper and/or lower bounds are readily incorporated into the procedure without the need to state them explicitly as posynomial constraints. These bounding constraints are simply monomial constraints, so that after Step 0 no additional computational effort is required for them. Another feature of the algorithm is that the values of x at each iteration may be viewed as estimates of the primal solution, so that along with progress towards the dual optimum, progress towards primal feasibility can also be observed. Finally, in our implementation a column replacement (as opposed to a column addition) strategy is employed: at each iteration when a new column is generated, the oldest nonbasic column is replaced in order to prevent growth in the dimensions of the LP tableau. Of course, it is possible that this column may be generated again, but the replacement strategy proved to be far more efficient from a computational standpoint.

The GPGLP algorithm has been coded in standard FORTRAN and linked with the XMP routines for linear programming written by Marsten [1981]. The code uses dynamic array allocation, and the sizes of the problems that it can solve are limited only by available computer memory. The entire package is a stand-alone "black box" with no user-defined parameters other than those related to desired tolerances and desired detail of output. The code will run on any microcomputer using an Intel 386-based processor or better. A version of the code that functions as a callable subroutine also exists; this may be used to embed the solver within any other program of the user's choice.

The algorithm has been tested on a wide variety of problems from the literature, including the numerous test problems listed in Beck and Ecker [1972], Dembo [1976], and Rijckaert and Martens [1978]. It was also tested on a couple of larger problems, namely a water pollution control model with 50 degrees of difficulty [Fiacco and Ghaemi, 1982] and a production planning problem with 274 degrees of difficulty [Jha, Kortanek and No, 1990]. The GPGLP algorithm converged to the optimum in every case. Several problems in the literature [Beck and Ecker, 1972] that require the use of subsidiary problems were also tested, and once again, as expected, GPGLP was able to find the optimum directly without the need to solve any subsidiary problems. These results attest to the robustness of the proposed algorithm and the associated code.

ACKNOWLEDGEMENT

This work was partially supported by the Operations Research and Production Systems Program of the Division of Design, Manufacture and Industrial Innovation, National Science Foundation, via Grant No. DMII-9209935.

REFERENCES

Avriel, M. (1980), Advances in Geometric Programming, Plenum Press, New York, NY.

Beck, P. A., and J. G. Ecker (1972), "Some Computational Experience with a Modified Convex Simplex Algorithm for Geometric Programming," Technical Report ADTC-72-70, Armament Development and Test Center, Eglin AFB, Florida.

Beightler, C. S., and D. T. Phillips (1976), Applied Geometric Programming, John Wiley and Sons, New York, NY.

Bricker, D. L., and H. Y. Chang (1992), "A Path-Following Algorithm for Posynomial Geometric Programming," presented at the 25th Anniversary Geometric Programming Conference, August 3-5, 1992, Colorado School of Mines, Golden, CO.

Dantzig, G. B. (1963), Linear Programming and Extensions, Princeton University Press, Princeton, NJ.

Dembo, R. S. (1976), "A Set of Geometric Programming Test Problems and Their Solutions," Mathematical Programming, 10: 192-213.

Dembo, R. S. (1978), "Dual to Primal Conversion in Geometric Programming," Journal of Optimization Theory and Applications, 26: 243-252.

Duffin, R. J. (1970), "Linearizing Geometric Programs," SIAM Review, 12: 211-227.

Duffin, R. J., E. L. Peterson, and C. Zener (1967), Geometric Programming - Theory and Applications, John Wiley and Sons, New York, NY.

Ecker, J. G. (1980), "Geometric Programming: Methods, Computations and Applications," SIAM Review, 22: 338-362.

Fiacco, A. V., and A. Ghaemi (1982), "Sensitivity Analysis of a Nonlinear Water Pollution Control Model Using an Upper Hudson River Data Base," Operations Research, 30: 1-28.

Gochet, W., Y. Smeers, and K. O. Kortanek (1973), "Using Semi-Infinite Programming in Geometric Programming," Proceedings of the 20th International Meeting of TIMS, Tel Aviv, Israel; E. Shlifer (Ed.), Academic Press, New York, NY, 2: 430-438.

Jha, S., K. O. Kortanek, and H. No (1988), "Lotsizing and Setup Time Reduction under Stochastic Demand: A Geometric Programming Approach," Working Paper Series No. 88-12, Department of Management Science, University of Iowa, Iowa City, IA.

Kortanek, K. O., and H. No (1992), "A Second Order Affine Scaling Algorithm for the Geometric Programming Dual with Logarithmic Barrier," Optimization, 23: 303-322.

Kortanek, K. O., X. Xu, and Y. Ye (1997), "An Infeasible Interior-Point Algorithm for Solving Primal and Dual Geometric Programs," Mathematical Programming, 76: 155-181.

Marsten, R. E. (1981), "The Design of the XMP Linear Programming Library," ACM Transactions on Mathematical Software, 7: 481-497.

Rajgopal, J., and D. L. Bricker (1990), "Posynomial Geometric Programming as a Special Case of Semi-Infinite Linear Programming," Journal of Optimization Theory and Applications, 66: 455-475.

Rajgopal, J., and D. L. Bricker (1992), "On Subsidiary Problems in Geometric Programming," European Journal of Operational Research, 63: 102-113.

Rijckaert, M. J., and X. M. Martens (1978), "Comparison of Generalized Geometric Programming Algorithms," Journal of Optimization Theory and Applications, 26: 205-242.

Sarma, P. V. L. N., X. M. Martens, G. V. Reklaitis, and M. J. Rijckaert (1978), "A Comparison of Computational Strategies for Geometric Programs," Journal of Optimization Theory and Applications, 26: 185-203.

Zangwill, W. I. (1969), Nonlinear Programming: A Unified Approach, Prentice-Hall, Englewood Cliffs, NJ.
