Granularity in nonlinear mixed-integer optimization - Optimization Online

0 downloads 0 Views 249KB Size Report
Dec 12, 2017 - Abstract We study a deterministic technique to check the existence of feasible points for mixed-integer nonlinear optimization problems which ...
Noname manuscript No. (will be inserted by the editor)

Granularity in nonlinear mixed-integer optimization Christoph Neumann · Oliver Stein · Nathan Sudermann-Merx

December 12, 2017

Abstract We study a deterministic technique to check the existence of feasible points for mixed-integer nonlinear optimization problems which satisfy a structural requirement that we call granularity. We show that solving certain purely continuous optimization problems and rounding their optimal points leads to feasible points of the original mixed-integer problem, as long as the latter is granular. The present analysis is motivated by numerical results for mixed-integer linear problems from C. Neumann, O. Stein, N. Sudermann-Merx: A feasible rounding approach for mixed-integer optimization problems, Optimization Online, Preprint ID 2017-12-6367, 2017. We explain why the practical performance observed there improves when the number of discrete feasible points increases relative to the size of the relaxed feasible set. In particular, for the generated feasible points we present computable a-priori and a-posteriori bounds on the deviation of their objective function values from the optimal value. We illustrate our results computationally for large scale bounded knapsack problems. Keywords Rounding · granularity · inner parallel set · consistency · global error bound Mathematics Subject Classification (2000) 90C11 · 90C10 · 90C31 · 90C30 Preliminary citation Optimization Online, Preprint ID 2017-12-6373, 2017 Christoph Neumann Institute of Operations Research, Karlsruhe Institute of Technology (KIT), Germany E-mail: [email protected] Oliver Stein Institute of Operations Research, Karlsruhe Institute of Technology (KIT), Germany E-mail: [email protected] Nathan Sudermann-Merx Advanced Business Analytics, BASF Business Services GmbH, Germany E-mail: [email protected]

2

Christoph Neumann et al.

1 Introduction In this paper we suggest a deterministic feasibility test along with a construction method for good feasible points of mixed-integer nonlinear optimization problems (MIPs). It is based on a property of the feasible set which we call granularity and which may be checked efficiently under suitable assumptions. Our analysis is motivated by the successful numerical application of our algorithmic approach to mixedinteger linear problems in [28]. In fact, in the latter paper a computational study of problems from the MIPLIB libraries shows that granularity may be expected and exploited in various real world applications. Moreover, the practical performance of this approach is observed to improve for optimization problems with ‘many’ discrete feasible points relative to the size of the continuously relaxed feasible set. The present article will explain this effect by introducing a mesh size parameter for the discrete variables which, while approaching zero, models such an increasing number of discrete feasible points. An error bound result will show that the optimality error of our approach tends to zero at least linearly with decreasing mesh sizes, which is in accordance with the above observations. Moreover, while [28] exploits granularity of linear problems, the present paper extends this technique to the nonlinear setting. Mixed-integer optimization problems with varying mesh size also appear when the effect of different granularity levels of discrete decision variables is explicitly modeled as when they correspond, for example, to monetary or measurement units which may either be considered in full or fractional units, or to differently sized transaction round lots of shares (or normal trading units) in an investment portfolio. Such problems were first systematically studied in [34,35] for the mixed-integer linear and nonlinear cases, respectively. We emphasize that the present paper only uses some techniques, but not the results from [34,35]. 
We shall call the feasible set of a MIP (and the MIP itself) granular if a certain inner parallel set is nonempty (cf. Def. 3.4). The main effect of granularity is that, under weak assumptions, it may not only be checked efficiently in terms of some auxiliary NLP, but also provides a sufficient condition for the existence of feasible points for the MIP. Note that even for purely integer linear optimization problems the construction of feasible points is known to be an NP-hard problem [31]. This has triggered the development of many search heuristics, among them the feasibility pump ([1,13,14]), Undercover ([7]), relaxation enforced neighborhood search ([6]), diving strategies ([8]), and many others (see [5, Sec. 6] for a survey). As opposed to such heuristics, the technique from the present paper is deterministic and efficient for granular MIPs. In particular, we only have to solve purely continuous auxiliary problems. While granularity will thus provide an efficiently computable condition for a nonempty feasible set, our approach will also be constructive in the sense that it computes some feasible point. To estimate its quality, we shall bound the deviation of its objective function value from the optimal value of MIP by explicitly computable expressions. In applications, such small bounds may lead to the decision to accept the feasible point as ‘close enough to optimal’. Otherwise, a good feasible point may

Granularity in nonlinear mixed-integer optimization

3

be used, for example, to initialize an appropriate branch-and-cut method with a small upper bound on the optimal value, or to start a local search heuristic there (cf., e.g., [10]). The remainder of this article is structured as follows. After stating some preliminaries in Section 2, in Section 3 we define granularity in terms of some enlarged inner parallel set of the continuously relaxed feasible set, give a functional description for this enlarged inner parallel set, and explain how granularity may be checked algorithmically both for fixed and varying mesh sizes. Based on this, Section 4 introduces feasible rounding approaches for the construction of feasible points and provides apriori and a-posteriori bounds on the deviation of their objective function values from the optimal value. In Section 5 we illustrate our theoretical findings by computational results for large scale knapsack problems. Some conclusions and final remarks end the article in Section 6. 2 Preliminaries We will study mixed-integer nonlinear optimization problems of the form MIPh :

min

(x,y)∈Rn ×hZm

f (x, y) s.t. gi (x, y) ≤ 0, i ∈ I,

(x, y) ∈ D

with a mesh size parameter h > 0, a nonempty and closed set D ⊆ Rn × Rm , a finite index set I = {1, . . . , p} with p ∈ N, and real-valued functions f , gi , i ∈ I, defined on D. For remarks on the appropriate treatment of equality constraints in MIPh we refer to [28]. Note that the discrete decision vector y is assumed to lie in the scaled set hZm . To verify granularity of a given problem MIPh we will need additional Lipschitz assumptions for the functions f , gi , i ∈ I, on the set D, and to make auxiliary NLPs like the continuous relaxation of MIPh numerically tractable, we shall impose further assumptions like convexity of the set D and the functions f , gi , i ∈ I. However, these additional assumptions will only be introduced where necessary. In the mixed-integer linear case these assumptions will necessarily be satisfied, but they also cover various nonlinear instances. While the purely integer case (n = 0) is included in our analysis, we will assume m, p > 0 throughout this article. Remark 2.1 As mentioned already in [34], the mesh dependence of MIPh might as well be modeled using the substitution y = hz with z ∈ Zm , which leads to the problem MIPh′ :

min

(x,z)∈Rn ×Zm

f (x, hz) s.t. gi (x, hz) ≤ 0, i ∈ I,

(x, hz) ∈ D.

However, the main advantage of the model without substitution of y is the clear visibility of the granular nature of y. Moreover, the NLP relaxation d : MIP

min

(x,y)∈Rn ×Rm

f (x, y) s.t. gi (x, y) ≤ 0, i ∈ I, (x, y) ∈ D

b and its optimal value vb are mesh indepenof MIPh and, in particular, its feasible set M dent. We emphasize, on the other hand, that our choice of the model is for notational

4

Christoph Neumann et al.

convenience only, and that the alternative modeling would not affect our results below. ⊓ ⊔ Example 2.2 We shall specify our subsequent results to the case of a mixed-integer linear optimization problem MILPh :

min

(x,y)∈Rn ×hZm

c⊺ x + d ⊺ y

s.t. Ax + By ≤ b,

(x, y) ∈ D

with vectors c ∈ Rn , d ∈ Rm , b ∈ R p , a (p, n)-matrix A, a (p, m)-matrix B, and a polyhedral set D ⊆ Rn × Rm . Denoting by αi⊺ and βi⊺ , i = 1, . . . , p, the rows of A and B, respectively, we obtain gi (x, y) = αi⊺ x + βi⊺ y − bi , i ∈ I. This setting slightly differs from the one in [34], which is further discussed in [35]. An explicit nonlinear example for MIPh is provided in [35, Ex. 1.2]. ⊓ ⊔ The reformulation from Remark 2.1 for the problem MILPh from Example 2.2 leads to the problem MILPh′ :

min

(x,z)∈Rn ×Zm

1 ⊺ ⊺ hc x+d z

s.t.

1 h Ax + Bz

≤ h1 b,

(x, hz) ∈ D.

In the purely integer case with D = Rm this problem becomes ILPh′ :

min d ⊺ z

z∈Zm

s.t. Bz ≤ 1h b

and, thus, gives rise to the following further application of integer problems with variables from a scaled set Zm . Example 2.3 Consider the purely integer linear problem ILP(t) :

min d ⊺ z s.t. Bz ≤ tb

z∈Zm

with a parameter t > 0 scaling the right hand side vector b. Then, by our above considerations, with h := 1/t the problem ILP(t) is equivalent to the problem ILPh :

min d ⊺ y s.t. By ≤ b.

y∈hZm

⊓ ⊔ The numerical illustration of our results by large scale bounded knapsack problems in Section 5 actually is of the type considered in Example 2.3. 3 Granularity d of MIPh In the following we shall denote the feasible set of the NLP relaxation MIP by b = {(x, y) ∈ D| gi (x, y) ≤ 0, i ∈ I}. M

Moreover, for any point (x, y) ∈ Rn × Rm we call (q xh , yqh ) a rounding if xqh = x

and

yqh ∈ hZm ,

|(q yh ) j − y j | ≤ h2 , j = 1, . . . , m,

hold, that is, y is rounded componentwise to a point in the mesh hZm , and x remains unchanged. Note that a rounding does not have to be unique.

Granularity in nonlinear mixed-integer optimization

5

3.1 The inner parallel set With the sets B∞ 0, 21 and



:= {y ∈ Rm | kyk∞ ≤ 12 }

K := {0} × B∞ 0, 12 any rounding of (x, y) obviously satisfies



(q xh , yqh ) ∈ ((x, y) + hK) ∩ (Rn × hZm ) .

(3.1)

The central object of our technique is the inner  parallel set (in the sense of b with respect to K = {0} × B∞ 0, 1 at distance h, Minkowski) to M 2 b −h := {(x, y) ∈ Rn × Rm | (x, y) + hK ⊆ M}. b M

In the terminology of [34], this inner parallel set is also called grid relaxation retract. b −h must lie in Mh since in view of (3.1) it satisfies Any rounding of any point (x, y) ∈ M b ∩ (Rn × hZm ) = Mh . (q xh , yqh ) ∈ ((x, y) + hK) ∩ (Rn × hZm ) ⊆ M

Hence, as a first sufficient condition for consistency of Mh we obtain the following result. b −h be nonempty. Then Proposition 3.1 For given h > 0 let the inner parallel set M also Mh is nonempty.

While the applicability of Proposition 3.1 of course hinges on a functional description b −h , a more serious drawback is that this condition may of the inner parallel set M not be expected to show consistency of sets Mh with h = 1 and involving binary variables, as they often appear in practice. In fact, if y1 is a binary variable modeled b −1 must satisfy y1 = 1/2. This will often be ruled as y1 ∈ Z ∩ [0, 1] then all (x, y) ∈ M out by other constraints, the more so if many binary variables appear. Consequently, b −1 then is empty and the condition from Proposition 3.1 is useless. the set M

3.2 An enlarged inner parallel set

b −h may be enlarged without losing the It turns out that often the inner parallel set M property that any rounding of any of its elements lies in Mh . Note that any enlargement of the inner parallel set is beneficial, since it increases chances for its consistency. In Example 3.8 we will see how this promotes the successful application of a consistency test to binary problems. To this end, we extend an idea from [28] and pay special attention to any inequality constraint function gi with i ∈ Ie := {i ∈ I| gi (x, y) = βi⊺ y − bi , βi ∈ Zm \ {0} }. By the transformation y = hz with z ∈ Z, for each i ∈ Ie the solutions of the inequality βi⊺ y ≤ bi in hZm correspond to the solutions of βi⊺ z ≤ bi /h in Zm . Since the left hand

6

Christoph Neumann et al.

side of the latter inequality can only attain integer values, one may increase the right hand side to any value strictly below the next integer, that is, to any value ⌊bi /h⌋ + δ with δ ∈ [0, 1), without changing the solution set. In some cases the right hand side may even be further increased without changing the solution set. In fact, by a result from, e.g., [9] the left hand side can only attain values which are multiples of the greatest common divisor ωi of the nonzero entries of βi . Hence, the right hand side may even be increased to any value strictly below the next multiple of ωi . To describe this construction formally, for a ∈ R and ω ∈ Z we put ⌊a⌋ω := max{z ∈ ω Z| z ≤ a} and increase bi /h to ⌊bi /h⌋ωi + δ ωi with any δ ∈ [(bi /h − ⌊bi /h⌋ωi )/ωi , 1). We emphasize that the smaller choices δ ∈ [0, (bi /h − ⌊bi /h⌋ωi )/ωi ) are also possible, but would not increase bi /h and, thus, not lead to an enlarged inner parallel set. So far, we have relaxed the inequality βi⊺ z ≤ bi /h to βi⊺ z ≤ ⌊bi /h⌋ωi + δ ωi without allowing for additional solutions in Zm . For the original variables y ∈ hZm this means that we may relax the right hand side of βi⊺ y ≤ bi to h⌊bi /h⌋ωi + hδ ωi without changing its set of solutions y ∈ hZm . For any i ∈ Ie , with bhi (δ ) := h⌊bi /h⌋ωi + hδ ωi − bi we may thus replace the inequality gi (x, y) ≤ 0 by gi (x, y) ≤ bhi (δ ) with any δ ∈ [(bi /h − ⌊bi /h⌋ωi )/ωi , 1). Since the value ( b /h−⌊b /h⌋ maxi∈Ie i ωii ωi if Ie 6= 0/ h δe := 0 if Ie = 0, / satisfies δeh < 1, the set [δeh , 1) is nonempty. For any function gi with i ∈ I \ Ie we put bhi (δ ) = 0 for any δ ∈ [δeh , 1). Then, with the set b h := {(x, y) ∈ D| gi (x, y) ≤ bhi (δ ), i ∈ I} M δ

b⊆M bh, and any δ ∈ [δeh , 1) we obtain M δ and

b h ∩ (Rn × hZm ) , Mh = M δ

b h }. b −h ⊆ M b −h := {(x, y) ∈ Rn × Rm | (x, y) + hK ⊆ M M δ δ

b −h Repeating the arguments from Section 3.1 for the enlarged inner parallel set M δ yields the following lemma which not only prepares the statement of Theorem 3.3, but will also be employed in Section 4. b −h , Lemma 3.2 Let h > 0 and some δ ∈ [δeh , 1) be given. Then for any point (x, y) ∈ M δ any of its roundings (q xh , yqh ) lies in Mh . Our first main result is an immediate consequence of Lemma 3.2.

Granularity in nonlinear mixed-integer optimization

7

Theorem 3.3 For given h > 0 and for some δ ∈ [δeh , 1) let the enlarged inner parallel b −h be nonempty. Then also Mh is nonempty. set M δ Theorem 3.3 motivates the following definition.

Definition 3.4 For h > 0 we call the set Mh granular if the enlarged inner parallel set b −h is nonempty for some δ ∈ [δeh , 1). Moreover, we call a problem MIPh granular M δ if its feasible set Mh is granular. ⊓ ⊔

In this terminology Theorem 3.3 states that any granular problem MIPh possesses a nonempty feasible set Mh . We note that, independently of the above enlargement technique, already the modelling of a problem may affect the chances for its granularity [28].

3.3 A functional description for the enlarged inner parallel set For an algorithmic employment of Theorem 3.3 we need a functional description at b −h . To this end, its defining functions gi : D → R, i ∈ I, will be least of a subset of M δ assumed to satisfy global Lipschitz conditions with respect to y uniformly in x. To be more specific, for any x ∈ Rn we define the set D(x) := {y ∈ Rm | (x, y) ∈ D} and denote by prx D := {x ∈ Rn | D(x) 6= 0} / the parallel projection of D to the ‘x-space’ Rn . Then the functions gi , i ∈ I, are assumed to satisfy Lipschitz conditions with respect to the ℓ∞ -norm on the fibers {x} × D(x), independently of the choice of x ∈ prx D. gi ≥ 0 such that for all x ∈ pr D and Assumption 3.5 For all i ∈ I there exists some L∞ x 1 2 all y , y ∈ D(x) we have gi gi 1 |gi (x, y1 ) − gi (x, y2 )| ≤ L∞ k(x, y1 ) − (x, y2 )k∞ = L∞ ky − y2 k∞ .

We allow for vanishing Lipschitz constants to cover trivial cases. Some problem classes for which the Lipschitz constants from Assumption 3.5 can be calculated are discussed in Examples 3.9–3.11 below. Under Assumption 3.5, for any δ ∈ [δeh , 1) the set n o gi h h (δ ), i ∈ I Tδ−h := (x, y) ∈ D−h | gi (x, y) + L∞ ≤ b i 2 with the inner parallel set of D with respect to K at distance h,

D−h := {(x, y) ∈ Rn × Rm | (x, y) + hK ⊆ D}, b −h , as we shall show next. A precursor of this result is an inner approximation of M δ was used to show [35, Lem. 2.4].

8

Christoph Neumann et al.

Lemma 3.6 Under Assumption 3.5, for any h > 0 and δ ∈ [δeh , 1) we have Tδ−h ⊆ b −h . M δ

Proof In the case Tδ−h = 0/ the assertion trivially holds. Otherwise, let (x, y) ∈ Tδ−h . We have to show that b h = {(x, y) ∈ D| gi (x, y) ≤ bhi (δ ), i ∈ I} (x, y + hη ) ∈ M δ  holds for any η ∈ B∞ 0, 21 . First, (x, y) ∈ D−h and (0, η ) ∈ K imply (x, y + hη ) = (x, y) + h(0, η ) ∈ D. This also yields x ∈ prx D and y+hη ∈ D(x). As also y lies in D(x), Assumption 3.5 implies for any i ∈ I gi gi h gi (x, y + hη ) − gi (x, y) ≤ L∞ hkη k∞ ≤ L∞ 2. From the definition of Tδ−h we thus obtain gi gi (x, y + hη ) ≤ gi (x, y) + L∞

h 2

≤ bhi (δ ),

b −h . so that altogether we have shown (x, y) ∈ M ⊓ ⊔ δ −h h Note that in some situations for sufficiently large values of δ ∈ [δe , 1) the set Tδ b [28]. Still, Lemmata 3.2 and 3.6 as well may be so large that it is not contained in M as Theorem 3.3 immediately give rise to the next result. Proposition 3.7 Under Assumption 3.5, for any h > 0 and δ ∈ [δeh , 1) the following assertions are true: a) For any point (x, y) ∈ Tδ−h , any of its roundings (q xh , yqh ) lies in Mh . b) If Tδ−h is nonempty, then MIPh is granular. c) If Tδ−h is nonempty, then also Mh is nonempty. The following example illustrates our main motivation for the introduction of the enlarged inner parallel set. Example 3.8 Consider the problem MIP :

min

(x,y)∈Rn ×Bm

f (x, y) s.t. gi (x, y) ≤ 0, i ∈ I,

with binary variables y j ∈ B = {0, 1}, j = 1, . . . , m, and inequality constraints gi satisfying Assumption 3.5. Then we may put h := 1 and introduce the additional inequality constraints 0 ≤ y ≤ e to obtain a problem of type MIPh with h = 1, where e denotes the all ones vector. Any nonnegativity constraint −y j ≤ 0 may be written as −e⊺j y ≤ b j := 0 with ω j = 1, so that we are allowed to increase b j by any δ ∈ [0, 1), that is, replace the inequality y j ≥ 0 by y j ≥ −δ . Analogously, any constraint y j ≤ 1 may be replaced by y j ≤ 1 + δ . All Lipschitz moduli of these additional constraints are easily seen to equal one. For any δ ∈ [0, 1) this results in the inner approximation  gi ≤ 0, i ∈ I, ( 1 − δ )e ≤ y ≤ ( 1 + δ )e Tδ−1 = (x, y) ∈ Rn × Rm | gi (x, y) + 12 L∞ 2 2 of the enlarged inner parallel set.

Granularity in nonlinear mixed-integer optimization

9

Already for the choice δ = 1/2 the relaxed feasible set b1 = {(x, y) ∈ Rn × Rm | gi (x, y) ≤ 0, i ∈ I, 0 ≤ y ≤ e} M

and

−1 T1/2 =

 gi ≤ 0, i ∈ I, 0 ≤ y ≤ e , (x, y) ∈ Rn × Rm | gi (x, y) + 12 L∞

gi /2, i ∈ I. Hence, depending on the only differ by the presence of the constants L∞ −1 b1 is to be nonempty if M size of the Lipschitz constants, there is a chance for T1/2 nonempty. Increasing δ to values close to one further increases this chance. This shows that even binary problems may be granular in the sense of Definition 3.4. In fact, in [28] it is shown that many linear test problems from the MIPLIB libraries are both binary and granular. ⊓ ⊔

Before we continue by studying feasibility conditions for the set Tδ−h in Sections 3.4 and 3.5, let us identify some problem classes which are suitable for the computation of the necessary Lipschitz constants in Assumption 3.5. From the mean value theorem, the H¨older inequality and the Weierstrass theorem it is well-known that for fixed x ∈ prx D with a nonempty and compact set D(x) and a continuously differentiable (in y) function gi (x, ·), i ∈ I, the value gi L∞ (x) = max k∇y gi (x, y)k1 y∈D(x)

is a Lipschitz constant for gi (x, ·) on D(x) with respect to the ℓ∞ -norm. To achieve the uniformity of the Lipschitz condition with respect to x, as required in Assumption 3.5, a natural additional assumption is the separability of gi with respect to x and y, that is, gi (x, y) = Fi (x) + Gi (y), along with a Cartesian product structure D = X ×Y with a nonempty and closed set X ⊆ Rn and a nonempty and compact set Y ⊆ Rm . This results in gi L∞ = max k∇Gi (y)k1 . y∈Y

Example 3.9 If Gi (y) = βi⊺ y, i ∈ I, is a linear function, as in the mixed-integer linear problem MILPh from Example 2.2, we do not even need the Cartesian product structure of D, but immediately arrive at the explicit formula gi L∞ = kβi k1 .

⊓ ⊔ Example 3.10 If Gi (y) = 21 y⊺ Qi y + βi⊺ y is a quadratic function with some square, symmetric, but possibly indefinite matrix Qi , and if Y is a polytope, then gi L∞ = max kQi y + βi k1 y∈Y

may be computed as the optimal value of the linear optimization problem LP :

max e⊺ (u + v) s.t. Qi y + βi = u − v, u, v ≥ 0, y ∈ Y. y,u,v

⊓ ⊔

10

Christoph Neumann et al.

Example 3.11 If the entries of the gradient ∇Gi , i ∈ I, are factorable functions, techgi as a guaranteed niques from interval arithmetic may be employed to compute L∞ upper bound for maxy∈Y k∇Gi (y)k1 (cf., e.g., [16,27]). ⊓ ⊔ Example 3.12 Using the result from Example 3.9, for the mixed-integer linear problem MILPh from Example 2.2 with any h > 0 and δ ∈ [δeh , 1) we obtain n o Tδ−h = (x, y) ∈ D−h | Ax + By + kβ k1 h2 ≤ b + bh (δ )

where, by a slight abuse of notation, kβ k1 stands for the vector with entries kβi k1 , i = 1, . . . , p. We remark that in the case D = X × Rm , with some nonempty and closed set X ⊆ Rn , we have D−h = D, so that in [34] not only the inclusion from Lemma 3.6, but b −h for the inner parallel set could be shown for the mixedeven the identity T −h = M b −h of the enlarged inner integer linear case. The extension to the description Tδ−h = M δ parallel set is straightforward. ⊓ ⊔ 3.4 Computable sufficient conditions for granularity under fixed mesh size

According to Proposition 3.7b, for a fixed mesh size h > 0 and δ ∈ [δeh , 1) a straightforward condition for granularity is a successful feasibility test for the set Tδ−h , for example the negativity of the infimum of the purely continuous feasibility problem Fδ−h :

min

z

(x,y,z)∈Rn ×Rm ×R

s.t.

gi h −h h gi (x, y) + L∞ 2 − bi (δ ) ≤ z, i ∈ I, (x, y) ∈ D

z ≥ −1. Fδ−h

Note that itself is a consistent problem for a nonempty set D−h , where a functional description of D allows an analogous approach for checking consistency of D−h . If, in addition, the sets D and, thus, D−h are polyhedral, and the functions gi , i ∈ I, are, for example, smooth and convex, then the problem Fδ−h is efficiently solvable. Under such assumptions, the following result provides an efficiently computable sufficient condition for the existence of feasible points for MIPh . Proposition 3.13 Let Assumption 3.5 hold, and for some h > 0 and δ ∈ [δeh , 1) let the infimum of the feasibility problem Fδ−h be negative. Then the set Mh is nonempty. Note that in the case Ie = 0/ the problem Fδ−h from Proposition 3.13 does not depend on δ . While the result from Proposition 3.13 in general requires the choice of some fixed δ ∈ [δeh , 1), one may as well state a feasibility problem to check whether appropriate values of δ exist in the first place. This is the case if the infimum of the purely continuous problem F −h :

min

δ

(x,y,δ )∈Rn ×Rm ×R

s.t.

gi gi (x, y) + L∞

h 2

≤ h⌊bi /h⌋ωi − bi + hωi δ , i ∈ Ie ,

gi gi (x, y) + L∞

h 2

≤ 0, i ∈ I \ Ie , (x, y) ∈ D−h ,

δ ≥ δeh lies strictly below one, since the set Tδ¯−h then is nonempty for some δ¯ ∈ [δeh , 1).

Granularity in nonlinear mixed-integer optimization

11

Proposition 3.14 Let Assumption 3.5 hold, and for some h > 0 let the infimum of the feasibility problem F −h be strictly smaller than one. Then the set Mh is nonempty. Under the above convexity and smoothness assumptions, also F −h may be solved efficiently. In the case Ie = 0/ the result formally still is correct, since Tδ−h then does not depend on δ and is nonempty iff the infimum of F −h is zero, while it is empty iff the infimum is +∞. However, algorithmically for Ie = 0/ it may be safer to work with the feasibility problem Fδ−h from Proposition 3.13. Remark 3.15 As a main result of Section 3.4 we stress that, under appropriate assumptions, a feasibility proof for M1 can be obtained by exact and efficient methods, as opposed to heuristics. On the other hand, we emphasize that the feasibility tests h from Propositions 3.13 and 3.14 cannot be successful if  for no δ ∈ [δe , 1) the set 1 h b Mδ contains any translate of the set K = {0} × B∞ 0, 2 . This may happen, for exb h are bounded with too small diameter or if they are too flat. ample, if the sets M δ Such instances corresponding to empty inner parallel sets give rise to the general NPhardness of the feasibility problem for MIP1 . ⊓ ⊔ 3.5 Computable sufficient conditions for granularity under variable mesh size In some applications the user may have control over the mesh size h > 0, for example, if it models the coarseness of measurement units or if it corresponds to the right hand side scaling t from Example 2.3. Since the size of the set Tδ−h increases for decreasing h, there is a chance that the granularity condition Tδ−h 6= 0/ holds for all sufficiently small values of h. In fact, the consistency of Tδ−h may be checked by solving a purely continuous feasibility problem in which h plays the role of an additional decision variable. 
However, since the terms bhi (δ ), i ∈ Ie , from the enlargement technique in Section 3.2 are nonsmooth in the variable h, we will only state conditions for the consistency of the smaller sets n o gi h (3.2) T −h := (x, y) ∈ D−h | gi (x, y) + L∞ ≤ 0, i ∈ I ⊆ Tδ−h 2

for sufficiently small h > 0. The latter, of course, also implies consistency of Tδ−h for all δ ∈ [δeh , 1). The following two results extend [34, Lem. 2.6] and [35, Prop. 2.8].

Proposition 3.16 Let Assumption 3.5 hold, and let the supremum h¯ of the problem F:

max

h

(x,y,h)∈Rn ×Rm ×R

gi h −h s.t. gi (x, y) + L∞ 2 ≤ 0, i ∈ I, (x, y) ∈ D

¯ the set Mh is nonempty. be positive. Then for all h ∈ (0, h) ¯ for any h ∈ (0, h) ¯ we may choose Proof By the assumption of a positive supremum h, a feasible point (x, ¯ y, ¯ h) of F. Since the constraints of F then imply (x, ¯ y) ¯ ∈ T −h , the −h ¯ The assertion now follows from (3.2) and set T is nonempty for all h ∈ (0, h). Proposition 3.7c. ⊓ ⊔

12

Christoph Neumann et al.

If, again, the set D is polyhedral, and the functions gi , i ∈ I, are smooth and convex, then the problem F in Proposition 3.16 is efficiently solvable. Note that F may be unbounded, leading to h¯ = +∞. Next, we discuss an assumption that ensures a positive supremum h¯ of the prob¯ In the lem F in Proposition 3.16, along with a computable lower bound h′ for h. following Slater condition, int D stands for the topological interior of the set D. b satisfies the Slater condition in the Assumption 3.17 The relaxed feasible set M sense that there exists some (x, ¯ y) ¯ ∈ int D with gi (x, ¯ y) ¯ < 0, i ∈ I.

Proposition 3.18 Let Assumptions 3.5 and 3.17 hold with a corresponding point (x, ¯ y), ¯ choose some h1 > 0 with (x, ¯ y) ¯ + hK ∈ D for all h ∈ (0, h1 ], and put   2gi (x, ¯ y) ¯ gi h2 := − max i ∈ I, L∞ > 0 . gi L∞

Then the explicitly computable value h′ := min{h1 , h2 } is positive, and for all h ∈ (0, h′ ] the set Mh is nonempty.

Proof First, since (x, ¯ y) ¯ is chosen from int D, and since K is a bounded set, the stated choice of h1 > 0 is possible, that is, the value h′ is well-defined. Moreover, the value h2 is positive by Assumption 3.17 which implies h′ > 0. Next we shall show that (x, ¯ y, ¯ h′ ) is feasible for the problem F from Propo′ −h sition 3.16. In fact, (x, ¯ y) ¯ ∈D holds by h′ ≤ h1 and the definition of h1 , and ′ g i gi (x, ¯ y) ¯ + h L∞ /2 ≤ 0, i ∈ I, holds by h′ ≤ h2 and the definition of h2 . This yields the feasibility of the point (x, ¯ y, ¯ h′ ) for F so that the supremum h¯ ≥ h′ > 0 of F must be positive. The final assertion now follows from Proposition 3.16. ⊓ ⊔

4 Explicit constructions of feasible points In Section 3 we were interested in sufficient conditions for set Mh of MIPh to be nonempty. If the latter holds, in practical applications one often also needs to know some element of Mh explicitly. In view of Proposition 3.7a, for h > 0 and δ ∈ [δeh , 1) xh , yqh ) ∈ this may be achieved by computing any point (x, y) ∈ Tδ−h and round it to (q Mh . We call such a technique a feasible rounding approach (FRA). We emphasize that all of the subsequently presented feasible rounding approaches will require explicit knowledge of the Lipschitz constants from Assumption 3.5.

4.1 Feasible rounding approaches A first possibility for a feasible rounding approach is to take (x, y) ∈ Tδ−h from an optimal point (x, y, z) of the feasibility problem Fδ−h from Proposition 3.13 with z < 0 (if it exists). While here δ ∈ [δeh , 1) has to be given, one might as well take (x, y) ∈ Tδ−h from an optimal point (x, y, δ ) of the feasibility problem F −h from Proposition 3.14 with δ < 1 (if it exists). Although both of these approaches may be possible at low

Granularity in nonlinear mixed-integer optimization

13

computational cost, the quality of the obtained point (q xh , yqh ) ∈ Mh in terms of its objective function value f (q xh , yqh ) cannot be expected to be good. To generate ‘good’ points in Mh instead, the following method proves to be successful in numerical tests for MILPs [28]. FRA-ROR (feasible rounding approach by retract-optimize-round): Compute an optimal point (xhr , yrh ) of f over Tδ−h , that is, of the problem Phr :

min

(x,y)∈Rn ×Rm

gi f (x, y) s.t. gi (x, y) + L∞

h 2

≤ bhi (δ ), i ∈ I, (x, y) ∈ D−h ,

and then round it to (q xhr , yqrh ) ∈ Mh . The problem Phr may be solved efficiently if, for example, the sets D and, thus, are polyhedral, and the functions f , gi , i ∈ I, are smooth and convex. If linear constraints describing D are known explicitly, then [34, Lem. 2.3] may be used to compute an explicit description of D−h .

D−h

Example 4.1 By Example 3.12 the problem Phr for the mixed-integer linear problem MILPh from Example 2.2 is a linear optimization problem. Note that for the efficient solvability of Phr the linearity of f is of minor importance. For example, a smooth convex function f leads to a smooth convex optimization problem Phr with polyhedral feasible set. ⊓ ⊔

4.2 An a-posteriori error bound For a two-dimensional purely integer linear problem with two inequality constraints, Figure 4.1 illustrates that the point generated by FRA-ROR may be far from optimal. Modifying the illustrated situation by forcing the angle between the two constraints to become more acute results in examples which move the constructed point arbitrarily far away from an optimal point. y2 T −h

b M y⋆h

yqrh

Fig. 4.1 Possible location of the feasible point constructed by FRA-ROR for the minimization of y1

y1

14

Christoph Neumann et al.

To evaluate how 'good' the point $(\check x^r_h, \check y^r_h) \in M_h$ is, we compare its objective function value $\check v^r_h := f(\check x^r_h, \check y^r_h)$ with the optimal value $v_h$ of $MIP_h$. As the latter is unknown, we rather bound the difference $\check v^r_h - v_h$ in terms of the optimal value $\hat v$ of the relaxed problem $\widehat{MIP}$ by

$$0 \le \check v^r_h - v_h \le \check v^r_h - \hat v. \qquad(4.1)$$

This bound can be computed explicitly. In fact, after the solution of $P^r_h$ only the relaxed problem $\widehat{MIP}$ has to be solved additionally.

4.3 An a-priori error bound

While an a-posteriori error bound can be achieved at low computational cost under suitable assumptions, it is not useful for controlling the error in the sense that, for a given accuracy $\varepsilon > 0$, the mesh size $h$ may be chosen a priori such that $\check v^r_h - v_h < \varepsilon$ holds, given that the application allows varying values of $h$ as in Section 3.5. Hence we shall derive an a-priori error bound for $\check v^r_h - v_h$ which does not depend on the solutions of auxiliary optimization problems, but merely on the problem data. Moreover, our results yield an intuition for the performance of FRA-ROR on problems with a structure similar to Example 2.3, where the right-hand side of the inequality constraints is scaled. To this end, in the following

$$\mathrm{dist}\big((\hat x, \hat y), T^{-h}_\delta\big) := \inf_{(x,y)\in T^{-h}_\delta} \|(x,y)-(\hat x,\hat y)\| \qquad(4.2)$$

shall denote the distance of some point $(\hat x,\hat y)\in\mathbb{R}^n\times\mathbb{R}^m$ to the set $T^{-h}_\delta$ with respect to some norm $\|\cdot\|$ on $\mathbb{R}^n\times\mathbb{R}^m$. In addition to the Lipschitz continuity of the functions $g_i$, $i\in I$, with respect to the $\ell_\infty$-norm from Assumption 3.5, in the following we will also need Lipschitz continuity of $f$ with respect to the norm from (4.2).

Assumption 4.2 There exists some $L_f \ge 0$ such that for all $(x^1,y^1), (x^2,y^2) \in D$ we have

$$|f(x^1,y^1) - f(x^2,y^2)| \le L_f \|(x^1,y^1)-(x^2,y^2)\|.$$

Furthermore, let $L^f_\infty \ge 0$ denote a Lipschitz constant of $f$ on $D$ with respect to $y$, uniformly in $x$, in the $\ell_\infty$-norm. Under Assumption 4.2 a possible, but not necessarily tight, choice is $L^f_\infty := \kappa L_f$ with some norm constant $\kappa > 0$ such that $\|(x,y)\| \le \kappa\,\|(x,y)\|_\infty$ holds for all $(x,y)\in\mathbb{R}^n\times\mathbb{R}^m$. The following example indicates a better choice for the MILP case.

Example 4.3 In the mixed-integer linear problem $MILP_h$ from Example 2.2, the best possible Lipschitz constant for $f(x,y) = c^\intercal x + d^\intercal y$ on $\mathbb{R}^n\times\mathbb{R}^m$, that is, the Lipschitz modulus

$$\sup_{(x^1,y^1)\ne(x^2,y^2)} \frac{|f(x^1,y^1)-f(x^2,y^2)|}{\|(x^1,y^1)-(x^2,y^2)\|},$$

Granularity in nonlinear mixed-integer optimization


is easily seen to coincide with the dual norm of $(c,d)$, so that we may put $L_f := \|(c,d)\|_\star := \max\{c^\intercal x + d^\intercal y \mid \|(x,y)\| \le 1\}$. Moreover, we may choose $L^f_\infty := \|d\|^\star_\infty = \|d\|_1$. ⊓⊔

Lemma 4.4 Let Assumptions 3.5 and 4.2 hold, let $L^f_\infty \ge 0$ denote a Lipschitz constant of $f$ on $D$ with respect to $y$, uniformly in $x$, in the $\ell_\infty$-norm, let $(\hat x^\star, \hat y^\star)$ denote any optimal point of $\widehat{MIP}$, and for any $h>0$ and $\delta\in[\tilde\delta_h,1)$ let $(\check x^r_h,\check y^r_h)$ denote any rounding of any optimal point $(x^r_h,y^r_h)$ of $f$ over $T^{-h}_\delta$. Then the value $\check v^r_h = f(\check x^r_h,\check y^r_h)$ satisfies

$$0 \le \check v^r_h - v_h \le L^f_\infty\,\tfrac{h}{2} + L_f\,\mathrm{dist}\big((\hat x^\star,\hat y^\star), T^{-h}_\delta\big).$$

Proof As above, the first inequality stems from Proposition 3.7a. For the proof of the second inequality note that, with any projection $(x^\pi_h,y^\pi_h)$ of $(\hat x^\star,\hat y^\star)$ onto the set $T^{-h}_\delta$ with respect to $\|\cdot\|$, the upper bound $\check v^r_h - \hat v$ of $\check v^r_h - v_h$ from (4.1) may be written as

$$\check v^r_h - \hat v = \big(f(\check x^r_h,\check y^r_h) - f(x^r_h,y^r_h)\big) + \big(f(x^r_h,y^r_h) - f(x^\pi_h,y^\pi_h)\big) + \big(f(x^\pi_h,y^\pi_h) - f(\hat x^\star,\hat y^\star)\big).$$

Due to $\check x^r_h = x^r_h$, the first term satisfies

$$f(\check x^r_h,\check y^r_h) - f(x^r_h,y^r_h) \le L^f_\infty \|(x^r_h,\check y^r_h)-(x^r_h,y^r_h)\|_\infty = L^f_\infty \|\check y^r_h - y^r_h\|_\infty \le L^f_\infty\,\tfrac{h}{2}.$$

Since $(x^r_h,y^r_h)$ is an optimal point of $P^r_h$, while $(x^\pi_h,y^\pi_h)$ is a feasible point, for the second term we obtain $f(x^r_h,y^r_h) - f(x^\pi_h,y^\pi_h) \le 0$. Finally, as the distance is the optimal value of the corresponding projection problem, the third term can be bounded by

$$f(x^\pi_h,y^\pi_h) - f(\hat x^\star,\hat y^\star) \le L_f \|(x^\pi_h,y^\pi_h)-(\hat x^\star,\hat y^\star)\| = L_f\,\mathrm{dist}\big((\hat x^\star,\hat y^\star), T^{-h}_\delta\big),$$

and the assertion is shown. ⊓⊔

It remains to bound the expression $\mathrm{dist}((\hat x^\star,\hat y^\star), T^{-h}_\delta)$ from the upper bound in Lemma 4.4 in terms of the problem data. As in Section 3.5, to facilitate the analysis we reduce the set $T^{-h}_\delta$ to $T^{-h}$. Due to (3.2) this yields

$$\mathrm{dist}\big((\hat x^\star,\hat y^\star), T^{-h}_\delta\big) \le \mathrm{dist}\big((\hat x^\star,\hat y^\star), T^{-h}\big)$$

for any $\delta\in[\tilde\delta_h,1)$. The latter distance may be bounded above in terms of problem data by employing a global error bound for the system of inequalities describing $T^{-h}$. To make the description of $T^{-h}$ purely functional, in the sequel we will choose $D=\mathbb{R}^n\times\mathbb{R}^m$.

To state the global error bound, let $g$ denote the vector of functions $g_i$, $i\in I$, and $L^g_\infty$ the vector of Lipschitz constants $L^{g_i}_\infty$, $i\in I$, both in $\mathbb{R}^p$. With the componentwise positive-part operator $a^+ := (\max\{0,a_1\},\ldots,\max\{0,a_p\})^\intercal$ for vectors $a\in\mathbb{R}^p$ we


may then define the penalty function $\|(g(x,y) + \tfrac{h}{2} L^g_\infty)^+\|_\infty$ of the set $T^{-h}$. A global error bound relates the geometric distance to the (consistent) set $T^{-h}$ with the evaluation of its penalty function by stating the existence of a constant $\gamma_{-h} > 0$ such that for all $(\hat x,\hat y)\in\mathbb{R}^n\times\mathbb{R}^m$ we have

$$\mathrm{dist}\big((\hat x,\hat y), T^{-h}\big) \le \gamma_{-h}\,\big\|\big(g(\hat x,\hat y) + \tfrac{h}{2} L^g_\infty\big)^+\big\|_\infty. \qquad(4.3)$$

As Hoffman showed the existence of such a bound for any linear system of inequalities in his seminal work [17], $\gamma_{-h}$ is also called a Hoffman constant, and the error bound (4.3) is known as a Hoffman error bound. Short proofs of this result for the polyhedral case can be found in [15,19]. For global error bounds of broader problem classes see, for example, [2,12,20-26,33], and [3,4,30] for surveys. These references also contain sufficient conditions for the existence of global error bounds. To cite an early result for the nonlinear case from [33], if for convex functions $g_i$, $i\in I$, the set $T^{-h}$ is bounded and satisfies Slater's condition, then a global error bound holds.

The next result simplifies the error bound for points $(\hat x,\hat y)\in\widehat M$. It was used analogously in [35, Th. 3.3] and follows from the subadditivity of the max operator, the monotonicity of the $\ell_\infty$-norm, as well as $g^+(\hat x,\hat y) = 0$ for any $(\hat x,\hat y)\in\widehat M$.

Lemma 4.5 Let $D=\mathbb{R}^n\times\mathbb{R}^m$, let Assumption 3.5 hold, for given $h>0$ let $T^{-h}\ne\emptyset$, and let the error bound (4.3) hold with some $\gamma_{-h}>0$. Then all $(\hat x,\hat y)\in\widehat M$ satisfy

$$\mathrm{dist}\big((\hat x,\hat y), T^{-h}\big) \le \gamma_{-h}\,\|L^g_\infty\|_\infty\,\tfrac{h}{2}.$$

The combination of Lemmata 4.4 and 4.5 yields the main result of this section.

Theorem 4.6 Let $D=\mathbb{R}^n\times\mathbb{R}^m$, let Assumptions 3.5 and 4.2 hold, let $L^f_\infty\ge 0$ denote a Lipschitz constant of $f$ on $D$ with respect to $y$, uniformly in $x$, in the $\ell_\infty$-norm, and for any $h>0$ with $T^{-h}\ne\emptyset$ let the error bound (4.3) hold with some $\gamma_{-h}>0$. Then for any $\delta\in[\tilde\delta_h,1)$ the objective function value $\check v^r_h$ of any rounding of any optimal point of $f$ over $T^{-h}_\delta$ satisfies

$$0 \le \check v^r_h - v_h \le \big(L^f_\infty + L_f\,\gamma_{-h}\,\|L^g_\infty\|_\infty\big)\,\tfrac{h}{2}.$$

Example 4.7 For the mixed-integer linear problem $MILP_h$ from Example 2.2 we obtain

$$\|L^g_\infty\|_\infty = \max_{i\in I}\|\beta_i\|_1 = \|B\|_\infty,$$

where $\|B\|_\infty$ denotes the maximal absolute row sum of the matrix $B$. Furthermore, from [17] it is known that for polyhedral constraints not only the global error bound always exists, but also that the corresponding Hoffman constant $\gamma$ may be chosen independently of the right-hand side vector and, thus, in our case independently of $h$. The assumption on the existence of an error bound may thus be dropped from Theorem 4.6, and the remaining assumptions yield

$$0 \le \check v^r_h - v_h \le \big(\|d\|_1 + \|(c,d)\|_\star\,\gamma\,\|B\|_\infty\big)\,\tfrac{h}{2}.$$

Hence, the error $\check v^r_h - v_h$ tends to zero at least linearly with $h\to 0$. The same holds for nonlinear objective functions $f$ under Assumption 4.2, when $\|d\|_1$ and $\|(c,d)\|_\star$ are replaced by the corresponding Lipschitz constants $L^f_\infty$ and $L_f$, respectively. ⊓⊔


In the general nonlinear setting with Lipschitz constants $L_f$, $L^f_\infty$ and $L^g_\infty$, Theorem 4.6 shows that the dependence of the error bound on the mesh size $h$ is intimately related to the dependence of the Hoffman constant $\gamma_{-h}$ on $h$. A linear decrease of the error with $h\to 0$, as in the situation of Example 4.7, is then possible if Hoffman constants remain bounded under small perturbations of the underlying inequality system. This was shown in [29] for convex problems under mild assumptions and, under Assumption 3.17, for sufficiently small $h > 0$ it may be applied to the inequalities describing the set $T^{-h}$ along the lines of the proof of [35, Cor. 3.6]. This shows the following assertion.

Corollary 4.8 Let $D=\mathbb{R}^n\times\mathbb{R}^m$, let Assumptions 3.5, 3.17 and 4.2 hold, let $\widehat M$ be bounded, and let the functions $g_i$, $i\in I$, be real-valued and convex. Then for $h\to 0$ the error $\check v^r_h - v_h$ decreases at least linearly.

Note that, along with Example 2.3, the latter result explains why error bounds improve for optimization problems when the right-hand sides of their inequality constraints are increased. On the other hand, fixed box constraints, as in binary problems, prevent such an argument.

5 An application to bounded knapsack problems

The following computational study comprises results for the bounded knapsack problem, which was introduced in [11] and is known to be an NP-hard optimization problem (cf. [18, pp. 483-491]). In its original formulation, which is also called the 0-1 knapsack problem, all decision variables are binary. The bounded knapsack problem (BKP) is a generalization of the 0-1 knapsack problem where it is possible to pick more than one piece per item, that is, the integer decision variables need not be binary. A possible numerical approach to bounded knapsack problems is to transform them into equivalent 0-1 knapsack problems, for which solution techniques exist that perform very well in practical applications. In contrast to this approach, we exploit granularity of the BKP and obtain very good feasible points by applying FRA-ROR to test instances of the bounded knapsack problem.

In the bounded knapsack problem we have $m\in\mathbb{N}$ item types and denote the value and weight of item $j\in\{1,\ldots,m\}$ by $v_j$ and $w_j$, respectively. Further, there are at most $b_j\in\mathbb{N}$ units of item $j$ available, and the capacity of the knapsack is given by $c>0$. By maximizing the total value of all items in the knapsack we arrive at the purely integer optimization problem

$$BKP:\quad \max_{y\in\mathbb{Z}^m}\ \sum_{j=1}^m v_j y_j \quad\text{s.t.}\quad \sum_{j=1}^m w_j y_j \le c,\quad 0\le y_j\le b_j,\ j=1,\ldots,m.$$

In order to obtain hard test examples of the BKP we create so-called strongly correlated instances (cf. [32] for an analogous treatment in the context of 0-1 knapsack problems), that is, the weights $w_j$ are uniformly distributed in the interval $[1, 10000]$ and we set $v_j = w_j + 1000$. Furthermore, $b_j$, $j\in\{1,\ldots,m\}$, is uniformly distributed within the set $\{0,\ldots,U\}$ for an integer upper bound $U\in\mathbb{N}$ and, in order to avoid trivial solutions, we set $c = \sigma \sum_{j=1}^m w_j b_j$ for some $\sigma\in(0,1)$.
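This instance generation can be sketched in a few lines of Python (an illustrative reconstruction; function name and seeding are our own choices, not taken from the paper):

```python
import random

def strongly_correlated_bkp(m, U, sigma=1/3, seed=0):
    """Generate a strongly correlated BKP instance as described above."""
    rng = random.Random(seed)                         # reproducible instances
    w = [rng.randint(1, 10000) for _ in range(m)]     # weights uniform in [1, 10000]
    v = [wj + 1000 for wj in w]                       # strong correlation: v_j = w_j + 1000
    b = [rng.randint(0, U) for _ in range(m)]         # at most b_j units of item j
    c = sigma * sum(wj * bj for wj, bj in zip(w, b))  # capacity avoiding trivial solutions
    return v, w, b, c
```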


Note that the granularity level of the BKP is controlled by the randomly chosen data $b_j$ and $w_j$, $j\in\{1,\ldots,m\}$, as well as by $\sigma\in(0,1)$. The expected value of each $b_j$ is $U/2$ and, at least for fixed weights $w_j$, $j\in\{1,\ldots,m\}$, the expected value of $c$ is $\sigma(\sum_{j=1}^m w_j)U/2$. For the expected test instances the parameter $U$ thus plays the role of the parameter $t$ from Example 2.3 and controls the granularity level.

Using the technique from Section 3.2 for the box constraints with $\delta = 1/2$, the enlarged inner parallel set of the BKP is given by

$$T^{-1}_{1/2} := \Big\{\, y\in\mathbb{R}^m \ \Big|\ \sum_{j=1}^m w_j y_j \le c - \tfrac{1}{2}\sum_{j=1}^m w_j,\quad 0\le y_j\le b_j,\ j=1,\ldots,m \,\Big\}.$$

We see that $T^{-1}_{1/2}$ is nonempty if and only if $c - \tfrac{1}{2}\sum_{j=1}^m w_j \ge 0$ holds. For our specific choice of $c$ the latter is equivalent to

$$\sum_{j=1}^m w_j\Big(\sigma b_j - \tfrac{1}{2}\Big) \ge 0.$$

In particular, $T^{-1}_{1/2}$ may be empty for small values of $\sigma$ and $b_j$, $j\in\{1,\ldots,m\}$. In the remainder of this section we set $\sigma = 1/3$ and use different values of $U\ge 5$. Then the expected values of the terms $\sigma b_j - 1/2$, $j=1,\ldots,m$, exceed $1/3$, so that the enlarged inner parallel sets may be expected to be nonempty.

In fact, the inner parallel set $T^{-1}_{1/2}$ turns out to be nonempty in all created test instances, so that all test problems are granular in the sense of Definition 3.4. In particular, no further enlargement of the inner parallel set is necessary.

    m \ U         5          10         100        1000       10000
    100       4.35e-01   2.27e-01   2.43e-02   2.45e-03   2.45e-04
    1000      4.62e-01   2.50e-01   2.70e-02   2.72e-03   2.72e-04
    10000     4.46e-01   2.40e-01   2.58e-02   2.60e-03   2.60e-04
    100000    4.48e-01   2.41e-01   2.60e-02   2.62e-03   2.62e-04
    1000000   4.47e-01   2.40e-01   2.59e-02   2.61e-03   2.61e-04

Table 5.1 Relative optimality gap of FRA-ROR for different choices of U and m

In Table 5.1 we consider the relative optimality gap $(\hat v - \check v)/\hat v$ of FRA-ROR applied to different instances of the BKP. The results seem to indicate that the optimality gap is independent of the problem size $m$. However, we see a strong dependence of the optimality gap on the upper bound $U$. This is caused by the fact that $U$ controls the expected granularity level, which plays a crucial role in the error bound obtained for FRA-ROR. Note that the error bound given in Example 4.7 actually bounds the absolute optimality gap, and that this bound decreases linearly with higher granularity levels. Thus, for the current setting this result predicts a hyperbolic decrease of the relative optimality gap with increasing values of $U$. This is confirmed by Figure 5.1.
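The predicted hyperbolic decrease can also be read off the $m = 1000$ row of Table 5.1: if the gap behaves like $\mathrm{const}/U$, the products $\mathrm{gap}\cdot U$ should be roughly constant across four orders of magnitude of $U$. A quick sanity check in Python on the tabulated values:

```python
# Relative optimality gaps for m = 1000 from Table 5.1, indexed by U.
gaps = {5: 4.62e-01, 10: 2.50e-01, 100: 2.70e-02, 1000: 2.72e-03, 10000: 2.72e-04}

# Under a hyperbolic decrease gap ~ const / U, gap * U is nearly constant.
products = [U * gap for U, gap in sorted(gaps.items())]
assert max(products) / min(products) < 1.25   # products range from 2.31 to 2.72
```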


Fig. 5.1 Relative optimality gap for m = 1000 and different choices of U

As mentioned above, solving the BKP to optimality is an NP-hard optimization problem. Instead, for nonempty enlarged inner parallel sets the main effort of our feasible rounding approach consists of solving a continuous linear optimization problem, which can be done in polynomial time. This fact is demonstrated in Table 5.2 and Figure 5.2, where we see that especially for the larger test instances FRA-ROR is able to find very good feasible points in reasonable time. Their relative optimality gaps (cf. Table 5.1) are of order $10^{-3}$, that is, the additional time that Gurobi needs to identify a globally optimal point yields only a marginal benefit.

    m \ U          5                 10                100               1000              10000
    100       0.015 |  0.016   0.002 |  0.031   0.001 |  0.000   0.001 |  0.000   0.002 |  0.000
    1000      0.004 |  0.016   0.004 |  0.016   0.002 |  0.000   0.002 |  0.016   0.002 |  0.016
    10000     0.045 |  0.141   0.035 |  0.125   0.024 |  0.125   0.051 |  0.125   0.028 |  0.125
    100000    0.580 |  2.063   0.654 |  2.109   0.546 |  2.047   0.574 |  2.031   0.548 |  1.984
    1000000   7.546 | 35.484   7.845 | 13.766   8.014 | 33.219   8.154 | 33.078   8.254 | 33.594

Table 5.2 Computing time in seconds for FRA-ROR (left) and Gurobi (right) for different choices of U and m

6 Conclusions

In this article, feasibility tests for mixed-integer nonlinear optimization problems are studied. They make use of purely continuous optimization problems over an enlarged inner parallel set of the continuously relaxed feasible set, which possesses the crucial property that rounding any of its points sustains feasibility for the original problem. In particular, no mixed-integer auxiliary problems have to be treated.

Fig. 5.2 Computing time in seconds for an optimal point by Gurobi and a feasible point by FRA-ROR for U = 1000 and different choices of m

For assessing the actual quality of the generated feasible points, their optimality gap is estimated by a-posteriori as well as a-priori error bounds, and the latter are shown to decrease at least linearly in the granularity level. The bounded knapsack problem illustrates our findings computationally. Numerical results for the application of the feasible rounding approach to problems from the MIPLIB libraries, which motivated the current research, are reported in [28].

As already stressed for the linear case in [28], feasible rounding approaches need not be viewed as standalone concepts in the nonlinear setting either. For example, in combination with branch-and-bound ideas they may be used for pruning, by providing good upper bounds at NLP nodes.

As the Lipschitz constants are crucial for consistency of the functional description of the inner approximation $T^{-h}_\delta$ of the enlarged inner parallel set, calculating at least good upper bounds for them is of major importance. In special cases, for example when the functions describing the feasible set are separable and linear in $y$, even the precise Lipschitz moduli can be computed, as shown in Example 3.9. This makes such problems especially suitable for the feasibility tests and feasible rounding approaches, particularly if used as a standalone concept. If, on the other hand, the constraints do not fulfill such special properties, using single global Lipschitz constants may be insufficient to ensure consistency of $T^{-h}_\delta$, even if the problem itself is granular in the sense of Definition 3.4. Here, dividing the feasible set into multiple boxes, each with its own local Lipschitz constant, increases the chance of finding a good feasible point. Therefore, combining a feasible rounding approach with a concept like branch-and-bound, where optimization problems have to be solved over multiple boxes of decreasing size, seems particularly promising. Such adaptations of the presented techniques to nonlinear problems are left for future research.

References

1. T. Achterberg, T. Berthold, Improving the feasibility pump, Discrete Optimization, Vol. 4 (2007), 77-86.
2. A. Auslender, J.-P. Crouzeix, Global regularity theorems, Mathematics of Operations Research, Vol. 13 (1988), 243-253.
3. A. Auslender, M. Teboulle, Asymptotic Cones and Functions in Optimization and Variational Inequalities, Springer, New York, 2003.
4. D. Azé, A survey on error bounds for lower semicontinuous functions, in: Proceedings of 2003 MODE-SMAI Conference, EDP Sci., Les Ulis, 2003, 1-17.
5. P. Belotti, C. Kirches, S. Leyffer, J. Linderoth, J. Luedtke, A. Mahajan, Mixed-integer nonlinear optimization, Acta Numerica, Vol. 22 (2013), 1-131.
6. T. Berthold, RENS – the optimal rounding, Mathematical Programming Computation, Vol. 6 (2014), 33-54.
7. T. Berthold, A.M. Gleixner, Undercover: a primal MINLP heuristic exploring a largest sub-MIP, Mathematical Programming, Vol. 144 (2014), 315-346.
8. P. Bonami, J.P.M. Gonçalves, Heuristics for convex mixed integer nonlinear programs, Computational Optimization and Applications, Vol. 51 (2012), 729-747.
9. M. Conforti, G. Cornuéjols, G. Zambelli, Integer Programming, Springer, Cham, 2014.
10. E. Danna, E. Rothberg, C. Le Pape, Exploring relaxation induced neighborhoods to improve MIP solutions, Mathematical Programming, Vol. 102 (2005), 71-90.
11. G.B. Dantzig, Discrete-Variable Extremum Problems, Operations Research, Vol. 5 (1957), 266-277.
12. S. Deng, Computable error bounds for convex inequality systems in reflexive Banach Spaces, SIAM Journal on Optimization, Vol. 7 (1997), 274-279.
13. M. Fischetti, F. Glover, A. Lodi, The feasibility pump, Mathematical Programming, Vol. 104 (2005), 91-104.
14. M. Fischetti, D. Salvagnin, Feasibility pump 2.0, Mathematical Programming Computation, Vol. 1 (2009), 201-222.
15. O. Güler, A.J. Hoffman, U.G. Rothblum, Approximations to solutions to systems of linear inequalities, SIAM Journal on Matrix Analysis and Applications, Vol. 16 (1995), 688-696.
16. E. Hansen, Global Optimization using Interval Analysis, Marcel Dekker, New York, 1992.
17. A.J. Hoffman, On approximate solutions of systems of linear inequalities, Journal of Research of the National Bureau of Standards, Vol. 49 (1952), 263-265.
18. H. Kellerer, U. Pferschy, D. Pisinger, Knapsack Problems, Springer, New York, 2004.
19. D. Klatte, Eine Bemerkung zur parametrischen quadratischen Optimierung, Seminarbericht Nr. 50, Sektion Mathematik der Humboldt-Universität zu Berlin, 1983, 174-185.
20. A.S. Lewis, J.-S. Pang, Error bounds for convex inequality systems, in: J.P. Crouzeix, J.E. Martinez-Legaz, M. Volle (eds.), Generalized Convexity, Generalized Monotonicity: Recent Results, Kluwer Academic Publishers, 1996, 75-110.
21. G. Li, Global error bounds for piecewise convex polynomials, Mathematical Programming, Vol. 137 (2013), 37-64.
22. G. Li, B.S. Mordukhovich, T.S. Pham, New fractional error bounds for polynomial systems with applications to Hölderian stability in optimization and spectral theory of tensors, Mathematical Programming, Vol. 153 (2015), 333-362.
23. X.D. Luo, Z.Q. Luo, Extension of Hoffman's error bound to polynomial systems, SIAM Journal on Optimization, Vol. 4 (1994), 383-392.
24. Z.Q. Luo, J.S. Pang, Error bounds for analytic systems and their applications, Mathematical Programming, Vol. 67 (1994), 1-28.
25. O.L. Mangasarian, A condition number for differentiable convex inequalities, Mathematics of Operations Research, Vol. 10 (1985), 175-179.
26. O.L. Mangasarian, T.H. Shiau, Lipschitz continuity of solutions of linear inequalities, programs and complementarity problems, SIAM Journal on Control and Optimization, Vol. 25 (1987), 583-595.
27. A. Neumaier, Interval Methods for Systems of Equations, Cambridge University Press, Cambridge, 1990.
28. C. Neumann, O. Stein, N. Sudermann-Merx, A feasible rounding approach for mixed-integer optimization problems, Optimization Online, Preprint ID 2017-12-6367, 2017.
29. H.V. Ngai, A. Kruger, M. Théra, Stability of error bounds for semi-infinite convex constraint systems, SIAM Journal on Optimization, Vol. 20 (2010), 2080-2096.
30. J.-S. Pang, Error bounds in mathematical programming, Mathematical Programming, Vol. 79 (1997), 299-332.
31. C.H. Papadimitriou, K. Steiglitz, Combinatorial Optimization, Dover Publications, Mineola, 1998.
32. D. Pisinger, Where are the hard knapsack problems?, Computers & Operations Research, Vol. 32 (2005), 2271-2284.
33. S.M. Robinson, An application of error bounds for convex programming in a linear space, SIAM Journal on Control and Optimization, Vol. 13 (1975), 271-273.
34. O. Stein, Error bounds for mixed integer linear optimization problems, Mathematical Programming, Vol. 156 (2016), 101-123.
35. O. Stein, Error bounds for mixed integer nonlinear optimization problems, Optimization Letters, Vol. 10 (2016), 1153-1168.