Chap12 Nonlinear Programming

College of Management, NCTU

Operation Research II

Spring, 2009

■ General Form of Nonlinear Programming Problems
Max $f(\mathbf{x})$
S.T. $g_i(\mathbf{x}) \le b_i$ for $i = 1, \dots, m$
     $\mathbf{x} \ge 0$
✓ No single algorithm is available that will solve every specific problem fitting this format.

■ An Example: The Product-Mix Problem with Price Elasticity
✓ The amount of a product that can be sold has an inverse relationship to the price charged. That is, the price $p(x)$ required to sell $x$ units decreases as $x$ increases (the price-demand curve slopes downward).

✓ The firm's profit from producing and selling $x$ units is the sales revenue $x\,p(x)$ minus the production cost $cx$. That is, $P(x) = x\,p(x) - cx$.
✓ If each of the firm's products has a similar profit function, say $P_j(x_j)$ for producing and selling $x_j$ units of product $j$, then the overall objective function is
$$f(\mathbf{x}) = \sum_{j=1}^{n} P_j(x_j),$$
a sum of nonlinear functions.

✓ Nonlinearities may also arise in the constraint functions $g_i(\mathbf{x})$.

Jin Y. Wang


■ An Example: The Transportation Problem with Volume Discounts
✓ Determine an optimal plan for shipping goods from various sources to various destinations, given supply and demand constraints.
✓ In practice, the shipping cost per unit may not be fixed. Volume discounts are sometimes available for large shipments, which makes the cost function piecewise linear rather than linear.

■ Graphical Illustration of Nonlinear Programming Problems
Max $Z = 3x_1 + 5x_2$
S.T. $x_1 \le 4$
     $9x_1^2 + 5x_2^2 \le 216$
     $x_1, x_2 \ge 0$
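As a rough numerical check of this example, the sketch below scans a grid over the feasible region and keeps the best feasible point. The function name `best_grid_point` and the grid resolution are illustrative assumptions, not part of the notes; a grid scan is only a sanity check, not a solution method.

```python
def best_grid_point(steps_per_unit=20):
    """Scan a grid over 0 <= x1 <= 4, 0 <= x2 <= 7 and keep the best feasible point."""
    best = None
    for i in range(4 * steps_per_unit + 1):        # enforces 0 <= x1 <= 4
        for j in range(7 * steps_per_unit + 1):    # x2 cannot exceed sqrt(216/5) < 7
            x1, x2 = i / steps_per_unit, j / steps_per_unit
            if 9 * x1**2 + 5 * x2**2 <= 216:       # the nonlinear constraint
                z = 3 * x1 + 5 * x2
                if best is None or z > best[0]:
                    best = (z, x1, x2)
    return best


z, x1, x2 = best_grid_point()
print(z, x1, x2)
```

On this grid the best point is on the curved part of the boundary, at $(2, 6)$ with $Z = 36$, rather than at a corner of the feasible region.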

✓ The optimal solution, $(x_1, x_2) = (2, 6)$, is not a corner-point feasible (CPF) solution here (in other problems it sometimes is and sometimes is not), but it still lies on the boundary of the feasible region.
  ➢ We no longer have the tremendous simplification used in LP of limiting the search for an optimal solution to just the CPF solutions.
✓ What if the constraints are linear but the objective function is not?


Max $Z = 126x_1 - 9x_1^2 + 182x_2 - 13x_2^2$
S.T. $x_1 \le 4$
     $2x_2 \le 12$
     $3x_1 + 2x_2 \le 18$
     $x_1, x_2 \ge 0$

✓ What if we change the objective function to $Z = 54x_1 - 9x_1^2 + 78x_2 - 13x_2^2$?
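Both objective functions are sums of concave one-variable quadratics, so each has a single unconstrained maximizer obtained by setting the partial derivatives to zero. The sketch below (helper names are illustrative assumptions) computes that point for each objective and tests it against the linear constraints.

```python
def unconstrained_max(a1, b1, a2, b2):
    """Maximizer of a1*x1 - b1*x1**2 + a2*x2 - b2*x2**2 for b1, b2 > 0.

    Setting each partial derivative to zero gives x_j = a_j / (2 * b_j).
    """
    return a1 / (2 * b1), a2 / (2 * b2)


def is_feasible(x1, x2):
    """Linear constraints of the example: x1 <= 4, 2*x2 <= 12, 3*x1 + 2*x2 <= 18."""
    return x1 >= 0 and x2 >= 0 and x1 <= 4 and 2 * x2 <= 12 and 3 * x1 + 2 * x2 <= 18


first = unconstrained_max(126, 9, 182, 13)   # original objective
second = unconstrained_max(54, 9, 78, 13)    # modified objective
print(first, is_feasible(*first))
print(second, is_feasible(*second))
```

For the original objective the unconstrained maximizer $(7, 7)$ is infeasible, so the constraints are binding at the optimum. For the modified objective the unconstrained maximizer $(3, 3)$ satisfies every constraint with slack.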

✓ The optimal solution now lies inside the feasible region.
✓ That means we cannot focus only on the boundary of the feasible region. We need to look at the entire feasible region.

■ A local optimum need not be a global optimum, which complicates matters further


✓ Nonlinear programming algorithms generally are unable to distinguish between a local optimum and a global optimum.
✓ It is therefore important to know the conditions under which any local optimum is guaranteed to be a global optimum.

■ If a nonlinear programming problem has no constraints, the objective function being concave (convex) guarantees that a local maximum (minimum) is a global maximum (minimum).
✓ What is a concave (convex) function?
✓ A function that is always "curving downward" (or not curving at all) is called a concave function.

✓ A function that is always "curving upward" (or not curving at all) is called a convex function.

✓ A function that curves upward over some intervals and downward over others is neither concave nor convex.

■ Definition of concave and convex functions of a single variable
✓ A function of a single variable $f(x)$ is a convex function if, for each pair of values of $x$, say $x'$ and $x''$ ($x' < x''$),
$$f[\lambda x'' + (1 - \lambda)x'] \le \lambda f(x'') + (1 - \lambda) f(x')$$
for all values of $\lambda$ such that $0 < \lambda < 1$.
✓ It is a strictly convex function if $\le$ can be replaced by $<$. It is a concave function if $\le$ is replaced by $\ge$ (or by $>$ for the case of strictly concave).


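✓ The geometric interpretation of concave and convex functions: $f(x)$ is convex if the line segment joining any two points on its graph never lies below the graph, and concave if the segment never lies above the graph.

■ How to judge whether a single-variable function is convex or concave?
✓ Consider any function of a single variable $f(x)$ that possesses a second derivative at all possible values of $x$. Then $f(x)$ is
  ➢ convex if and only if $\dfrac{d^2 f(x)}{dx^2} \ge 0$ for all possible values of $x$;
  ➢ concave if and only if $\dfrac{d^2 f(x)}{dx^2} \le 0$ for all possible values of $x$.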

■ How to judge whether a two-variable function is convex or concave?
✓ If the derivatives exist, the following table can be used to determine whether a two-variable function is concave or convex (the conditions must hold for all possible values of $x_1$ and $x_2$).

Quantity | Convex | Concave
$\dfrac{\partial^2 f(x_1, x_2)}{\partial x_1^2} \dfrac{\partial^2 f(x_1, x_2)}{\partial x_2^2} - \left[\dfrac{\partial^2 f(x_1, x_2)}{\partial x_1 \partial x_2}\right]^2$ | $\ge 0$ | $\ge 0$
$\dfrac{\partial^2 f(x_1, x_2)}{\partial x_1^2}$ | $\ge 0$ | $\le 0$
$\dfrac{\partial^2 f(x_1, x_2)}{\partial x_2^2}$ | $\ge 0$ | $\le 0$


✓ Example: $f(x_1, x_2) = x_1^2 - 2x_1x_2 + x_2^2$
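For a quadratic function the second partial derivatives are constants, so the two-variable test can be applied mechanically. The sketch below is one way to do that; the function `classify_two_variable` is an illustrative helper, and it assumes the second partials passed in hold for all $(x_1, x_2)$.

```python
def classify_two_variable(f11, f22, f12):
    """Apply the two-variable convexity test to constant second partial derivatives.

    f11 = d2f/dx1^2, f22 = d2f/dx2^2, f12 = d2f/(dx1 dx2).
    """
    det = f11 * f22 - f12**2
    convex = det >= 0 and f11 >= 0 and f22 >= 0
    concave = det >= 0 and f11 <= 0 and f22 <= 0
    if convex and concave:
        return "both (linear)"
    if convex:
        return "convex"
    if concave:
        return "concave"
    return "neither"


# f = x1^2 - 2*x1*x2 + x2^2 has f11 = 2, f22 = 2, f12 = -2.
print(classify_two_variable(2, 2, -2))
```

Here the first quantity equals $2 \cdot 2 - (-2)^2 = 0$ and both pure second derivatives are positive, so the test reports a convex function.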

■ How to judge whether a multivariable function is convex or concave?
✓ The sum of convex functions is a convex function, and the sum of concave functions is a concave function.
✓ Example: $f(x_1, x_2, x_3) = 4x_1 - x_1^2 - (x_2 - x_3)^2 = [4x_1 - x_1^2] + [-(x_2 - x_3)^2]$. Both bracketed terms are concave, so $f$ is concave.

■ If there are constraints, then one more condition will provide the guarantee, namely, that the feasible region is a convex set.

■ Convex set
✓ A convex set is a collection of points such that, for each pair of points in the collection, the entire line segment joining these two points is also in the collection.

✓ In general, the feasible region for a nonlinear programming problem is a convex set whenever all the $g_i(\mathbf{x})$ (for the constraints $g_i(\mathbf{x}) \le b_i$) are convex. For example:
Max $Z = 3x_1 + 5x_2$
S.T. $x_1 \le 4$
     $9x_1^2 + 5x_2^2 \le 216$
     $x_1, x_2 \ge 0$


✓ What happens when just one of these $g_i(\mathbf{x})$ is a concave function instead?
Max $Z = 3x_1 + 5x_2$
S.T. $x_1 \le 4$
     $2x_2 \le 14$
     $8x_1 - x_1^2 + 14x_2 - x_2^2 \le 49$
     $x_1, x_2 \ge 0$
  ➢ The feasible region is not a convex set. For example, $(0, 7)$ and $(4, 3)$ are both feasible, but their midpoint $(2, 5)$ violates the third constraint ($16 - 4 + 70 - 25 = 57 > 49$).
  ➢ Under this circumstance, we cannot guarantee that a local maximum is a global maximum.

■ Condition for local maximum = global maximum (with $g_i(\mathbf{x}) \le b_i$ constraints)
✓ To guarantee that a local maximum is a global maximum for a nonlinear programming problem with constraints $g_i(\mathbf{x}) \le b_i$ and $\mathbf{x} \ge 0$, the objective function $f(\mathbf{x})$ must be a concave function and each $g_i(\mathbf{x})$ must be a convex function.
✓ Such a problem is called a convex programming problem.

■ One-Variable Unconstrained Optimization
✓ Assume that the differentiable function $f(x)$ to be maximized is concave.
✓ The necessary and sufficient condition for $x = x^*$ to be optimal (a global maximum) is
$$\frac{df}{dx} = 0 \quad \text{at } x = x^*.$$

✓ It is usually not easy to solve the above equation analytically.
✓ The One-Dimensional Search Procedure
  ➢ Find a sequence of trial solutions that leads toward an optimal solution.
  ➢ Use the sign of the derivative to determine where to move: a positive derivative at $x$ indicates that $x^*$ is greater than $x$, and a negative derivative indicates that $x^*$ is less than $x$.


■ The Bisection Method
✓ Initialization: Select $\varepsilon$ (error tolerance). Find an initial $\underline{x}$ (lower bound on $x^*$) and $\overline{x}$ (upper bound on $x^*$) by inspection. Set the initial trial solution $x' = \dfrac{\underline{x} + \overline{x}}{2}$.
✓ Iteration:
  ➢ Evaluate $\dfrac{df(x)}{dx}$ at $x = x'$.
  ➢ If $\dfrac{df(x)}{dx} \ge 0$, reset $\underline{x} = x'$.
  ➢ If $\dfrac{df(x)}{dx} \le 0$, reset $\overline{x} = x'$.
  ➢ Select a new $x' = \dfrac{\underline{x} + \overline{x}}{2}$.
✓ Stopping Rule: If $\overline{x} - \underline{x} \le 2\varepsilon$, so that the new $x'$ must be within $\varepsilon$ of $x^*$, stop. Otherwise, perform another iteration.
✓ Example: Max $f(x) = 12x - 3x^4 - 2x^6$

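Iteration | $df(x)/dx$ | $\underline{x}$ | $\overline{x}$ | New $x'$ | $f(x')$
0 | | | | |
1 | | | | |
2 | | | | |
3 | 4.09 | 0.75 | 1 | 0.875 | 7.8439
4 | -2.19 | 0.75 | 0.875 | 0.8125 | 7.8672
5 | 1.31 | 0.8125 | 0.875 | 0.84375 | 7.8829
6 | -0.34 | 0.8125 | 0.84375 | 0.828125 | 7.8815
7 | 0.51 | 0.828125 | 0.84375 | 0.8359375 | 7.8839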
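The bisection steps above can be sketched in a few lines. The initial bounds $\underline{x} = 0$, $\overline{x} = 2$ and the tolerance $\varepsilon = 0.01$ are assumptions (the notes do not state them); they were chosen because they lead to the same bounds as the later rows of the table.

```python
def df(x):
    """Derivative of f(x) = 12x - 3x^4 - 2x^6."""
    return 12 - 12 * x**3 - 12 * x**5


def bisection(dfdx, lower, upper, eps):
    """Bisection search for the maximizer of a concave function on [lower, upper]."""
    trial = (lower + upper) / 2
    while upper - lower > 2 * eps:
        if dfdx(trial) >= 0:
            lower = trial   # x* is at or to the right of the trial solution
        else:
            upper = trial   # x* is to the left of the trial solution
        trial = (lower + upper) / 2
    return trial, lower, upper


# Assumed starting interval [0, 2] and tolerance 0.01.
x_trial, lo, hi = bisection(df, 0.0, 2.0, 0.01)
print(x_trial, lo, hi)
```

With these assumed inputs the search stops with bounds $0.828125$ and $0.84375$ and trial solution $0.8359375$, which agrees with the last row of the table.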


■ Newton's Method
✓ The bisection method converges slowly.
  ➢ It takes only first-derivative information into account.
✓ The basic idea is to approximate $f(x)$ within the neighborhood of the current trial solution by a quadratic function and then to maximize (or minimize) the approximating function exactly to obtain the new trial solution.

✓ This approximating quadratic function is obtained by truncating the Taylor series after the second-derivative term:
$$f(x_{i+1}) \approx f(x_i) + f'(x_i)(x_{i+1} - x_i) + \frac{f''(x_i)}{2}(x_{i+1} - x_i)^2$$

✓ This quadratic function can be optimized in the usual way by setting its first derivative with respect to $x_{i+1}$ to zero and solving for $x_{i+1}$. Thus,
$$x_{i+1} = x_i - \frac{f'(x_i)}{f''(x_i)}.$$

✓ Stopping Rule: If $|x_{i+1} - x_i| \le \varepsilon$, stop and output $x_{i+1}$.
✓ Example: Max $f(x) = 12x - 3x^4 - 2x^6$ (same as the bisection example)
  ➢ Here $f'(x) = 12 - 12x^3 - 12x^5$ and $f''(x) = -36x^2 - 60x^4$, so
$$x_{i+1} = x_i - \frac{f'(x_i)}{f''(x_i)} = x_i + \frac{1 - x_i^3 - x_i^5}{3x_i^2 + 5x_i^4}$$
  ➢ Select $\varepsilon = 0.00001$, and choose $x_1 = 1$.

Iteration $i$ | $x_i$ | $f(x_i)$ | $f'(x_i)$ | $f''(x_i)$ | $x_{i+1}$
1 | | | | |
2 | | | | |
3 | 0.84003 | 7.8838 | -0.1325 | -55.279 | 0.83763
4 | 0.83763 | 7.8839 | -0.0006 | -54.790 | 0.83762
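A minimal sketch of the Newton iteration for this example follows. The function names and the `max_iter` safeguard are illustrative assumptions; the starting point $x_1 = 1$ and $\varepsilon = 0.00001$ are taken from the example.

```python
def f(x):
    return 12 * x - 3 * x**4 - 2 * x**6


def d1(x):
    """First derivative f'(x)."""
    return 12 - 12 * x**3 - 12 * x**5


def d2(x):
    """Second derivative f''(x)."""
    return -36 * x**2 - 60 * x**4


def newton_max(x, eps, max_iter=50):
    """Newton's method x_{i+1} = x_i - f'(x_i) / f''(x_i) with the stopping rule above."""
    rows = []  # one tuple (x_i, f(x_i), f'(x_i), f''(x_i), x_{i+1}) per iteration
    for _ in range(max_iter):
        x_next = x - d1(x) / d2(x)
        rows.append((x, f(x), d1(x), d2(x), x_next))
        if abs(x_next - x) <= eps:
            return x_next, rows
        x = x_next
    raise RuntimeError("Newton's method did not converge")


x_star, rows = newton_max(1.0, 0.00001)
for row in rows:
    print(["%.5f" % value for value in row])
```

The printed rows can be compared with the table: the first step moves from $1$ to $0.875$, and the third trial solution is about $0.84003$. Far fewer iterations are needed than with bisection because the second derivative is used as well.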

■ Multivariable Unconstrained Optimization
✓ Usually, there is no analytical method for solving the system of equations given by setting the respective partial derivatives equal to zero.
✓ Thus, a numerical search procedure must be used.


■ The Gradient Search Procedure (for multivariable unconstrained maximization problems)
✓ The goal is to reach a point where all the partial derivatives are 0.
✓ A natural approach is to use the values of the partial derivatives to select the specific direction in which to move.
✓ The gradient at the point $\mathbf{x} = \mathbf{x}'$ is
$$\nabla f(\mathbf{x}) = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \dots, \frac{\partial f}{\partial x_n}\right) \quad \text{at } \mathbf{x} = \mathbf{x}'.$$

✓ The direction of the gradient is interpreted as the direction of the directed line segment from the origin to the point $\left(\dfrac{\partial f}{\partial x_1}, \dfrac{\partial f}{\partial x_2}, \dots, \dfrac{\partial f}{\partial x_n}\right)$, which is the direction of change in $\mathbf{x}$ that maximizes the rate at which $f(\mathbf{x})$ increases.
✓ However, it normally would not be practical to change $\mathbf{x}$ continuously in the direction of $\nabla f(\mathbf{x})$, because this series of changes would require continuously reevaluating the $\dfrac{\partial f}{\partial x_j}$ and changing the direction of the path.

✓ A better approach is to keep moving in a fixed direction from the current trial solution, not stopping until $f(\mathbf{x})$ stops increasing.
✓ The stopping point becomes the next trial solution, where the gradient is recalculated to determine the new direction in which to move.
  ➢ Reset $\mathbf{x}' = \mathbf{x}' + t^* \nabla f(\mathbf{x}')$, where $t^*$ is the positive value of $t$ that maximizes $f(\mathbf{x}' + t \nabla f(\mathbf{x}'))$, that is, $f(\mathbf{x}' + t^* \nabla f(\mathbf{x}')) = \max_{t \ge 0} f(\mathbf{x}' + t \nabla f(\mathbf{x}'))$.
✓ The iterations continue until $\nabla f(\mathbf{x}) = 0$ within a small tolerance $\varepsilon$.

■ Summary of the Gradient Search Procedure
✓ Initialization: Select $\varepsilon$ and any initial trial solution $\mathbf{x}'$. Go first to the stopping rule.
✓ Step 1: Express $f(\mathbf{x}' + t\nabla f(\mathbf{x}'))$ as a function of $t$ by setting
$$x_j = x_j' + t\left(\frac{\partial f}{\partial x_j}\right)_{\mathbf{x} = \mathbf{x}'}, \quad j = 1, 2, \dots, n,$$
and then substituting these expressions into $f(\mathbf{x})$.
✓ Step 2: Use the one-dimensional search procedure to find $t = t^*$ that maximizes $f(\mathbf{x}' + t\nabla f(\mathbf{x}'))$ over $t \ge 0$.
✓ Step 3: Reset $\mathbf{x}' = \mathbf{x}' + t^*\nabla f(\mathbf{x}')$. Then go to the stopping rule.
✓ Stopping Rule: Evaluate $\nabla f(\mathbf{x})$ at $\mathbf{x} = \mathbf{x}'$. Check whether
$$\left|\frac{\partial f}{\partial x_j}\right| \le \varepsilon \quad \text{for all } j = 1, 2, \dots, n.$$
If so, stop with the current $\mathbf{x}'$ as the desired approximation of an optimal solution $\mathbf{x}^*$. Otherwise, perform another iteration.

■ Example of multivariable unconstrained nonlinear programming
Max $f(\mathbf{x}) = 2x_1x_2 + 2x_2 - x_1^2 - 2x_2^2$
$$\frac{\partial f}{\partial x_1} = 2x_2 - 2x_1, \qquad \frac{\partial f}{\partial x_2} = 2x_1 + 2 - 4x_2$$

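We can verify that $f(\mathbf{x})$ is concave. Suppose we pick $\mathbf{x} = (0, 0)$ as the initial trial solution. Then $\nabla f(0, 0) = (0, 2)$.

✓ Iteration 1: $\mathbf{x} = (0, 0) + t(0, 2) = (0, 2t)$
$$f(\mathbf{x}' + t\nabla f(\mathbf{x}')) = f(0, 2t) = 4t - 8t^2$$
Setting the derivative $4 - 16t$ to zero gives $t^* = 1/4$, so the new trial solution is $\mathbf{x}' = (0, 1/2)$, where $\nabla f(0, 1/2) = (1, 0)$.

✓ Iteration 2: $\mathbf{x} = (0, 1/2) + t(1, 0) = (t, 1/2)$
$$f(\mathbf{x}' + t\nabla f(\mathbf{x}')) = f(t, 1/2) = t - t^2 + 1/2$$
Setting the derivative $1 - 2t$ to zero gives $t^* = 1/2$, so the new trial solution is $\mathbf{x}' = (1/2, 1/2)$.

✓ Usually, we use a table for convenience.

Iteration | $\mathbf{x}'$ | $\nabla f(\mathbf{x}')$ | $\mathbf{x}' + t\nabla f(\mathbf{x}')$ | $f(\mathbf{x}' + t\nabla f(\mathbf{x}'))$ | $t^*$ | $\mathbf{x}' + t^*\nabla f(\mathbf{x}')$
1 | $(0, 0)$ | $(0, 2)$ | $(0, 2t)$ | $4t - 8t^2$ | $1/4$ | $(0, 1/2)$
2 | $(0, 1/2)$ | $(1, 0)$ | $(t, 1/2)$ | $t - t^2 + 1/2$ | $1/2$ | $(1/2, 1/2)$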
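The procedure can be sketched for this example as follows. One simplification is an assumption of the sketch, not part of the notes: because $f$ is a concave quadratic with constant Hessian $H$, the line search in Step 2 is done in closed form, $t^* = -\dfrac{g^T g}{g^T H g}$ with $g = \nabla f(\mathbf{x}')$, instead of by a one-dimensional search. The names `grad`, `exact_step`, and `gradient_search` are illustrative.

```python
def grad(x):
    """Gradient of f(x) = 2*x1*x2 + 2*x2 - x1^2 - 2*x2^2."""
    x1, x2 = x
    return (2 * x2 - 2 * x1, 2 * x1 + 2 - 4 * x2)


# Constant Hessian of f (matrix of second partial derivatives).
HESSIAN = ((-2.0, 2.0), (2.0, -4.0))


def exact_step(g):
    """Maximizer over t of f(x' + t*g) for this concave quadratic: t* = -(g.g)/(g^T H g)."""
    gg = g[0] * g[0] + g[1] * g[1]
    hg = (HESSIAN[0][0] * g[0] + HESSIAN[0][1] * g[1],
          HESSIAN[1][0] * g[0] + HESSIAN[1][1] * g[1])
    return -gg / (g[0] * hg[0] + g[1] * hg[1])


def gradient_search(x, eps, max_iter=1000):
    """Gradient search with the stopping rule |df/dx_j| <= eps for all j."""
    path = [x]
    for _ in range(max_iter):
        g = grad(x)
        if abs(g[0]) <= eps and abs(g[1]) <= eps:
            return x, path
        t = exact_step(g)
        x = (x[0] + t * g[0], x[1] + t * g[1])
        path.append(x)
    raise RuntimeError("gradient search did not converge")


x_final, path = gradient_search((0.0, 0.0), 0.01)
print(path[:3], x_final)
```

The first two moves reproduce the iterations worked above, $(0, 0) \to (0, 1/2) \to (1/2, 1/2)$, and the zigzag path then approaches $(1, 1)$, where both partial derivatives vanish.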

■ For minimization problems
✓ We move in the opposite direction. That is, $\mathbf{x}' = \mathbf{x}' - t^*\nabla f(\mathbf{x}')$.
✓ The other change is that $t = t^*$ is chosen to minimize $f(\mathbf{x}' - t\nabla f(\mathbf{x}'))$ over $t \ge 0$.

■ Necessary and Sufficient Conditions for Optimality (Maximization)

Problem | Necessary condition | Also sufficient if
One-variable unconstrained | $\dfrac{df}{dx} = 0$ | $f(x)$ concave
Multivariable unconstrained | $\dfrac{\partial f}{\partial x_j} = 0$ ($j = 1, 2, \dots, n$) | $f(\mathbf{x})$ concave
General constrained problem | KKT conditions | $f(\mathbf{x})$ concave and each $g_i(\mathbf{x})$ convex

■ The Karush-Kuhn-Tucker (KKT) Conditions for Constrained Optimization
✓ Assume that $f(\mathbf{x}), g_1(\mathbf{x}), g_2(\mathbf{x}), \dots, g_m(\mathbf{x})$ are differentiable functions. Then $\mathbf{x}^* = (x_1^*, x_2^*, \dots, x_n^*)$ can be an optimal solution for the nonlinear programming problem only if there exist $m$ numbers $u_1, u_2, \dots, u_m$ such that all the following KKT conditions are satisfied:
(1) $\dfrac{\partial f}{\partial x_j} - \sum_{i=1}^{m} u_i \dfrac{\partial g_i}{\partial x_j} \le 0$ at $\mathbf{x} = \mathbf{x}^*$, for $j = 1, 2, \dots, n$
(2) $x_j^* \left(\dfrac{\partial f}{\partial x_j} - \sum_{i=1}^{m} u_i \dfrac{\partial g_i}{\partial x_j}\right) = 0$ at $\mathbf{x} = \mathbf{x}^*$, for $j = 1, 2, \dots, n$
(3) $g_i(\mathbf{x}^*) - b_i \le 0$, for $i = 1, 2, \dots, m$
(4) $u_i [g_i(\mathbf{x}^*) - b_i] = 0$, for $i = 1, 2, \dots, m$


(5) $x_j^* \ge 0$, for $j = 1, 2, \dots, n$
(6) $u_i \ge 0$, for $i = 1, 2, \dots, m$

■ Corollary of KKT Theorem (Sufficient Conditions)
✓ Note that, in general, satisfying these conditions does not guarantee that the solution is optimal.
✓ Assume that $f(\mathbf{x})$ is a concave function and that $g_1(\mathbf{x}), g_2(\mathbf{x}), \dots, g_m(\mathbf{x})$ are convex functions. Then $\mathbf{x}^* = (x_1^*, x_2^*, \dots, x_n^*)$ is an optimal solution if and only if all the KKT conditions are satisfied.

■ An Example
Max $f(\mathbf{x}) = \ln(x_1 + 1) + x_2$
S.T. $2x_1 + x_2 \le 3$
     $x_1, x_2 \ge 0$
Here $n = 2$, $m = 1$, $g_1(\mathbf{x}) = 2x_1 + x_2$ is convex, and $f(\mathbf{x})$ is concave. The KKT conditions are:
1. (j = 1) $\dfrac{1}{x_1 + 1} - 2u_1 \le 0$
2. (j = 1) $x_1\left(\dfrac{1}{x_1 + 1} - 2u_1\right) = 0$
1. (j = 2) $1 - u_1 \le 0$
2. (j = 2) $x_2(1 - u_1) = 0$
3. $2x_1 + x_2 - 3 \le 0$
4. $u_1(2x_1 + x_2 - 3) = 0$
5. $x_1 \ge 0$, $x_2 \ge 0$
6. $u_1 \ge 0$
✓ The values $x_1 = 0$, $x_2 = 3$, and $u_1 = 1$ satisfy all the KKT conditions. Therefore, the optimal solution is $(0, 3)$.

■ How to solve the KKT conditions
✓ Unfortunately, there is no easy way.
✓ In the above example, there are 8 combinations, according to whether each of $x_1$, $x_2$, and $u_1$ is zero or positive. Try each one until one that fits is found.
✓ What if there are lots of variables?
✓ Let's look at some easier (special) cases.
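Finding a KKT point is hard, but checking a candidate is mechanical. The sketch below tests a candidate $(x_1, x_2, u_1)$ against the six conditions of the example above ($\max \ln(x_1 + 1) + x_2$ subject to $2x_1 + x_2 \le 3$). The function name and the numerical tolerance `tol` are illustrative assumptions.

```python
def kkt_satisfied(x1, x2, u1, tol=1e-9):
    """Check the KKT conditions for max ln(x1 + 1) + x2 s.t. 2*x1 + x2 <= 3, x >= 0."""
    d1 = 1.0 / (x1 + 1.0) - 2.0 * u1   # df/dx1 - u1 * dg1/dx1
    d2 = 1.0 - u1                      # df/dx2 - u1 * dg1/dx2
    slack = 2.0 * x1 + x2 - 3.0        # g1(x) - b1
    return (
        d1 <= tol and d2 <= tol                         # condition 1 (j = 1, 2)
        and abs(x1 * d1) <= tol and abs(x2 * d2) <= tol # condition 2 (j = 1, 2)
        and slack <= tol                                # condition 3
        and abs(u1 * slack) <= tol                      # condition 4
        and x1 >= 0 and x2 >= 0                         # condition 5
        and u1 >= 0                                     # condition 6
    )


print(kkt_satisfied(0.0, 3.0, 1.0))   # the solution found in the example
print(kkt_satisfied(1.5, 0.0, 0.2))   # another feasible point, for contrast
```

The second candidate lies on the constraint boundary as well, but it fails condition 1 for $j = 2$ because $1 - u_1 > 0$.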


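■ Quadratic Programming
Max $f(\mathbf{x}) = \mathbf{c}\mathbf{x} - \frac{1}{2}\mathbf{x}^T Q \mathbf{x}$
S.T. $A\mathbf{x} \le \mathbf{b}$
     $\mathbf{x} \ge 0$

✓ The objective function is
$$f(\mathbf{x}) = \mathbf{c}\mathbf{x} - \frac{1}{2}\mathbf{x}^T Q \mathbf{x} = \sum_{j=1}^{n} c_j x_j - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} q_{ij} x_i x_j.$$

✓ The $q_{ij}$ are the elements of $Q$. If $i = j$, then $x_i x_j = x_j^2$, so $-\frac{1}{2} q_{jj}$ is the coefficient of $x_j^2$. If $i \ne j$, then $-\frac{1}{2}(q_{ij} x_i x_j + q_{ji} x_j x_i) = -q_{ij} x_i x_j$, so $-q_{ij}$ is the coefficient of the product of $x_i$ and $x_j$ (since $q_{ij} = q_{ji}$).

✓ An example
Max $f(x_1, x_2) = 15x_1 + 30x_2 + 4x_1x_2 - 2x_1^2 - 4x_2^2$
S.T. $x_1 + 2x_2 \le 30$
     $x_1, x_2 \ge 0$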
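Reading the coefficients of this example with the rule above gives $\mathbf{c} = (15, 30)$, $q_{11} = 4$, $q_{22} = 8$, and $q_{12} = q_{21} = -4$. These values are derived here rather than stated in the notes, so the sketch below checks that the matrix form $\mathbf{c}\mathbf{x} - \frac{1}{2}\mathbf{x}^T Q \mathbf{x}$ reproduces the polynomial at a few points.

```python
c = [15.0, 30.0]
Q = [[4.0, -4.0],
     [-4.0, 8.0]]


def f_matrix(x):
    """Evaluate c x - (1/2) x^T Q x."""
    linear = sum(cj * xj for cj, xj in zip(c, x))
    quadratic = sum(Q[i][j] * x[i] * x[j] for i in range(2) for j in range(2))
    return linear - 0.5 * quadratic


def f_poly(x1, x2):
    """Evaluate the objective function as written in the example."""
    return 15 * x1 + 30 * x2 + 4 * x1 * x2 - 2 * x1**2 - 4 * x2**2


for point in [(1.0, 2.0), (3.0, 0.5), (12.0, 9.0)]:
    print(point, f_matrix(point), f_poly(*point))
```

Both columns of output agree at every test point, which supports the derived $\mathbf{c}$ and $Q$.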

✓ The KKT conditions for the above quadratic programming problem:
1. (j = 1) $15 + 4x_2 - 4x_1 - u_1 \le 0$
2. (j = 1) $x_1(15 + 4x_2 - 4x_1 - u_1) = 0$
1. (j = 2) $30 + 4x_1 - 8x_2 - 2u_1 \le 0$
2. (j = 2) $x_2(30 + 4x_1 - 8x_2 - 2u_1) = 0$
3. $x_1 + 2x_2 - 30 \le 0$
4. $u_1(x_1 + 2x_2 - 30) = 0$
5. $x_1 \ge 0$, $x_2 \ge 0$
6. $u_1 \ge 0$


✓ Introduce slack variables ($y_1$, $y_2$, and $v_1$) for conditions 1 (j = 1), 1 (j = 2), and 3:
1. (j = 1) $-4x_1 + 4x_2 - u_1 + y_1 = -15$
1. (j = 2) $4x_1 - 8x_2 - 2u_1 + y_2 = -30$
3. $x_1 + 2x_2 + v_1 = 30$
Condition 2 (j = 1) can then be reexpressed as
2. (j = 1) $x_1 y_1 = 0$
Similarly, we have
2. (j = 2) $x_2 y_2 = 0$
4. $u_1 v_1 = 0$
✓ For each of these pairs, $(x_1, y_1)$, $(x_2, y_2)$, and $(u_1, v_1)$, the two variables are called complementary variables, because only one of them can be nonzero.
  ➢ Because all six variables are nonnegative, the three conditions can be combined into one constraint, $x_1y_1 + x_2y_2 + u_1v_1 = 0$, called the complementarity constraint.
✓ Rewriting the whole set of conditions:
$4x_1 - 4x_2 + u_1 - y_1 = 15$
$-4x_1 + 8x_2 + 2u_1 - y_2 = 30$
$x_1 + 2x_2 + v_1 = 30$
$x_1y_1 + x_2y_2 + u_1v_1 = 0$
$x_1 \ge 0, x_2 \ge 0, u_1 \ge 0, y_1 \ge 0, y_2 \ge 0, v_1 \ge 0$
✓ Except for the complementarity constraint, these are all linear constraints.
✓ For any quadratic programming problem, the KKT conditions have this form:
$Q\mathbf{x} + A^T\mathbf{u} - \mathbf{y} = \mathbf{c}^T$
$A\mathbf{x} + \mathbf{v} = \mathbf{b}$
$\mathbf{x} \ge 0, \mathbf{u} \ge 0, \mathbf{y} \ge 0, \mathbf{v} \ge 0$
$\mathbf{x}^T\mathbf{y} + \mathbf{u}^T\mathbf{v} = 0$
✓ Assume that the objective function of the quadratic programming problem is concave. The constraints are convex because they are all linear.
✓ Then $\mathbf{x}$ is optimal if and only if there exist values of $\mathbf{y}$, $\mathbf{u}$, and $\mathbf{v}$ such that all four vectors together satisfy all these conditions.
✓ The original problem is thereby reduced to the equivalent problem of finding a feasible solution to these constraints.
✓ These constraints are really the constraints of an LP, except for the complementarity constraint. Why don't we just modify the simplex method?


■ The Modified Simplex Method
✓ The complementarity constraint implies that it is not permissible for both complementary variables of any pair to be basic variables.
✓ The problem reduces to finding an initial BF solution to any linear programming problem that has these constraints, subject to this additional restriction on the identity of the basic variables.
✓ When $\mathbf{c}^T \le 0$ (unlikely) and $\mathbf{b} \ge 0$, the initial solution is easy to find: $\mathbf{x} = 0$, $\mathbf{u} = 0$, $\mathbf{y} = -\mathbf{c}^T$, $\mathbf{v} = \mathbf{b}$.
✓ Otherwise, introduce an artificial variable into each of the equations where $c_j > 0$ or $b_i < 0$, in order to use these artificial variables as initial basic variables.
  ➢ This choice of initial basic variables sets $\mathbf{x} = 0$ and $\mathbf{u} = 0$ automatically, which satisfies the complementarity constraint.
✓ Then use phase 1 of the two-phase method to find a BF solution for the real problem.
  ➢ That is, apply the simplex method to the following problem (where the $z_j$ are the artificial variables):
    Min $Z = \sum_j z_j$
    subject to the linear programming constraints obtained from the KKT conditions, but with these artificial variables included.
  ➢ We still need to modify the simplex method to satisfy the complementarity constraint.
✓ Restricted-Entry Rule:
  ➢ When choosing the entering basic variable, exclude from consideration any nonbasic variable whose complementary variable is already a basic variable.
  ➢ Choose among the other nonbasic variables according to the usual criterion.
  ➢ This rule keeps the complementarity constraint satisfied at all times.
✓ When an optimal solution $\mathbf{x}^*, \mathbf{u}^*, \mathbf{y}^*, \mathbf{v}^*, z_1 = 0, \dots, z_n = 0$ is obtained for the phase 1 problem, $\mathbf{x}^*$ is the desired optimal solution for the original quadratic programming problem.


■ A Quadratic Programming Example
Max $15x_1 + 30x_2 + 4x_1x_2 - 2x_1^2 - 4x_2^2$
S.T. $x_1 + 2x_2 \le 30$
     $x_1, x_2 \ge 0$
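The notes do not record the worked solution of this example. As a hedged sketch, the code below checks a candidate against the KKT system in slack form ($Q\mathbf{x} + A^T\mathbf{u} - \mathbf{y} = \mathbf{c}^T$, $A\mathbf{x} + \mathbf{v} = \mathbf{b}$, nonnegativity, and complementarity). The candidate $x_1 = 12$, $x_2 = 9$, $u_1 = 3$ is an assumption to be verified, not a value taken from the notes, and the function names are illustrative.

```python
def kkt_slacks(x1, x2, u1):
    """Solve the linear KKT equations of the example for the slacks y1, y2, v1."""
    y1 = 4 * x1 - 4 * x2 + u1 - 15        # from  4x1 - 4x2 +  u1 - y1 = 15
    y2 = -4 * x1 + 8 * x2 + 2 * u1 - 30   # from -4x1 + 8x2 + 2u1 - y2 = 30
    v1 = 30 - x1 - 2 * x2                 # from   x1 + 2x2 + v1 = 30
    return y1, y2, v1


def satisfies_kkt(x1, x2, u1, tol=1e-9):
    """True if all variables are nonnegative and the complementarity constraint holds."""
    y1, y2, v1 = kkt_slacks(x1, x2, u1)
    nonnegative = min(x1, x2, u1, y1, y2, v1) >= -tol
    complementary = abs(x1 * y1 + x2 * y2 + u1 * v1) <= tol
    return nonnegative and complementary


print(kkt_slacks(12, 9, 3), satisfies_kkt(12, 9, 3))
print(kkt_slacks(0, 0, 0), satisfies_kkt(0, 0, 0))
```

For the candidate all three slacks are zero, so every condition holds. Since the objective is concave and the constraint is linear, a point that satisfies the KKT conditions is optimal. The origin fails because $y_1$ and $y_2$ would have to be negative.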


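■ Constrained Optimization with Equality Constraints
✓ Consider the problem of finding the minimum or maximum of the function $f(\mathbf{x})$, subject to the restriction that $\mathbf{x}$ must satisfy all the equations
$g_1(\mathbf{x}) = b_1$
…
$g_m(\mathbf{x}) = b_m$
✓ Example: Max $f(x_1, x_2) = x_1^2 + 2x_2$
  S.T. $g(x_1, x_2) = x_1^2 + x_2^2 = 1$
✓ A classical method is the method of Lagrange multipliers.
  ➢ The Lagrangian function is $h(\mathbf{x}, \boldsymbol{\lambda}) = f(\mathbf{x}) - \sum_{i=1}^{m} \lambda_i [g_i(\mathbf{x}) - b_i]$, where $(\lambda_1, \lambda_2, \dots, \lambda_m)$ are called Lagrange multipliers.
✓ For the feasible values of $\mathbf{x}$, $g_i(\mathbf{x}) - b_i = 0$ for all $i$, so $h(\mathbf{x}, \boldsymbol{\lambda}) = f(\mathbf{x})$.
✓ The method reduces to analyzing $h(\mathbf{x}, \boldsymbol{\lambda})$ by the procedure for unconstrained optimization.
  ➢ Set all partial derivatives to zero:
$$\frac{\partial h}{\partial x_j} = \frac{\partial f}{\partial x_j} - \sum_{i=1}^{m} \lambda_i \frac{\partial g_i}{\partial x_j} = 0, \quad j = 1, 2, \dots, n$$
$$\frac{\partial h}{\partial \lambda_i} = -g_i(\mathbf{x}) + b_i = 0, \quad i = 1, 2, \dots, m$$
  ➢ Notice that the last $m$ equations are equivalent to the constraints in the original problem, so only feasible solutions are considered.
✓ Back to our example
  ➢ $h(x_1, x_2, \lambda) = x_1^2 + 2x_2 - \lambda(x_1^2 + x_2^2 - 1)$
  ➢ $\dfrac{\partial h}{\partial x_1} = 2x_1 - 2\lambda x_1 = 0$, $\quad \dfrac{\partial h}{\partial x_2} = 2 - 2\lambda x_2 = 0$, $\quad \dfrac{\partial h}{\partial \lambda} = -(x_1^2 + x_2^2 - 1) = 0$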
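For the example $\max x_1^2 + 2x_2$ subject to $x_1^2 + x_2^2 = 1$, setting the partial derivatives of the Lagrangian $h = x_1^2 + 2x_2 - \lambda(x_1^2 + x_2^2 - 1)$ to zero gives the candidates $(x_1, x_2, \lambda) = (0, 1, 1)$ and $(0, -1, -1)$. These candidates are worked out here rather than stated in the notes, so the sketch below checks that they are stationary and compares them with a scan of $f$ around the circle. The names and the scan resolution are illustrative assumptions.

```python
import math


def f(x1, x2):
    return x1**2 + 2 * x2


def lagrangian_gradient(x1, x2, lam):
    """Partial derivatives of h with respect to x1, x2, and lambda."""
    return (2 * x1 - 2 * lam * x1, 2 - 2 * lam * x2, -(x1**2 + x2**2 - 1))


candidates = [(0.0, 1.0, 1.0), (0.0, -1.0, -1.0)]
for x1, x2, lam in candidates:
    print((x1, x2, lam), lagrangian_gradient(x1, x2, lam), f(x1, x2))

# Scan the feasible circle x1 = cos(theta), x2 = sin(theta) as an independent check.
values = [f(math.cos(2 * math.pi * k / 3600), math.sin(2 * math.pi * k / 3600))
          for k in range(3600)]
print(max(values), min(values))
```

Both candidates make all three partial derivatives zero. The scan indicates that $(0, 1)$ gives the maximum value $2$ and $(0, -1)$ the minimum value $-2$ on the circle.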


■ Other Types of Nonlinear Programming Problems
✓ Separable Programming
  ➢ It is a special case of convex programming with one additional assumption: the $f(\mathbf{x})$ and $g_i(\mathbf{x})$ functions are separable functions.
  ➢ A separable function is a function where each term involves just a single variable.
  ➢ Example: $f(x_1, x_2) = 126x_1 - 9x_1^2 + 182x_2 - 13x_2^2 = f_1(x_1) + f_2(x_2)$, where $f_1(x_1) = 126x_1 - 9x_1^2$ and $f_2(x_2) = 182x_2 - 13x_2^2$.
  ➢ Such a problem can be closely approximated by a linear programming problem. Please refer to Section 12.8 for details.
✓ Geometric Programming
  ➢ The objective and the constraint functions take the form
$$g(\mathbf{x}) = \sum_{i=1}^{N} c_i P_i(\mathbf{x}), \quad \text{where } P_i(\mathbf{x}) = x_1^{a_{i1}} x_2^{a_{i2}} \cdots x_n^{a_{in}} \text{ for } i = 1, 2, \dots, N.$$
  ➢ When all the $c_i$ are strictly positive and the objective function is to be minimized, this geometric programming problem can be converted to a convex programming problem by setting $x_j = e^{y_j}$.
✓ Fractional Programming
  ➢ Suppose that the objective function is in the form of a (linear) fraction: Maximize $f(\mathbf{x}) = f_1(\mathbf{x}) / f_2(\mathbf{x}) = (\mathbf{c}\mathbf{x} + c_0) / (\mathbf{d}\mathbf{x} + d_0)$.
  ➢ Also assume that the constraints are linear: $A\mathbf{x} \le \mathbf{b}$, $\mathbf{x} \ge 0$.
  ➢ We can transform it to an equivalent problem of a standard type for which effective solution procedures are available.
  ➢ Specifically, we can transform the problem to an equivalent linear programming problem by letting $\mathbf{y} = \mathbf{x} / (\mathbf{d}\mathbf{x} + d_0)$ and $t = 1 / (\mathbf{d}\mathbf{x} + d_0)$, so that $\mathbf{x} = \mathbf{y}/t$.
  ➢ The original formulation is transformed to the linear programming problem
    Max $Z = \mathbf{c}\mathbf{y} + c_0 t$
    S.T. $A\mathbf{y} - \mathbf{b}t \le 0$
         $\mathbf{d}\mathbf{y} + d_0 t = 1$
         $\mathbf{y}, t \ge 0$
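The change of variables for fractional programming can be checked numerically. The sketch below uses made-up data ($\mathbf{c}$, $c_0$, $\mathbf{d}$, $d_0$, and the point $\mathbf{x}$ are all hypothetical) and assumes $\mathbf{d}\mathbf{x} + d_0 > 0$, which is needed for $t$ to be positive. It confirms that the linear objective $\mathbf{c}\mathbf{y} + c_0 t$ equals the original ratio and that $\mathbf{d}\mathbf{y} + d_0 t = 1$.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def to_yt(x, d, d0):
    """Map x to (y, t) with y = x / (d x + d0) and t = 1 / (d x + d0)."""
    denom = dot(d, x) + d0
    if denom <= 0:
        raise ValueError("the transformation assumes d x + d0 > 0")
    t = 1.0 / denom
    return [xj * t for xj in x], t


# Hypothetical data, chosen only for illustration.
c, c0 = [2.0, 1.0], 1.0
d, d0 = [1.0, 3.0], 4.0
x = [1.0, 2.0]

y, t = to_yt(x, d, d0)
ratio = (dot(c, x) + c0) / (dot(d, x) + d0)   # original fractional objective
linear = dot(c, y) + c0 * t                   # transformed linear objective
normalization = dot(d, y) + d0 * t            # should equal 1
print(ratio, linear, normalization)
```

Dividing $\mathbf{y}$ by $t$ recovers the original point, so a solution of the transformed linear program with $t > 0$ maps back to a solution of the fractional problem.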
