Proceedings of GOW’16, pp. 81 – 84.
On sBB Branching for Trilinear Monomials∗

Emily Speakman and Jon Lee

Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI, USA. {eespeakm, jonxlee}@umich.edu
Abstract
The case of having three or more quantities multiplied together occurs frequently in global-optimization models. In the context of spatial branch-and-bound, using volume as a comparison measure, we investigate the choice of branching variable and branching point and provide some analytic insights.
Keywords:
Global optimization, Mixed-integer non-linear optimization, Factorable formulation, Trilinear, Trilinear hull, McCormick relaxation, Spatial branch-and-bound, Branching point.
1. Introduction
The spatial branch-and-bound (sBB) algorithm (see [1, 7, 10], for example) is designed to find a globally-optimal solution of factorable mathematical-optimization formulations (see [4]). This divide-and-conquer technique works by introducing auxiliary variables so as to express every function of the original formulation as a labeled directed acyclic graph (DAG). From these DAGs, relaxations are formed and refined (see [2], for example). For a given function, the DAG can be constructed in more than one way, and therefore the algorithm has a choice to make. This choice can have a strong impact on the quality of the convex relaxation obtained from the formulation, and because sBB obtains its bounds from these convex relaxations, the choice can have a significant impact on the performance of the algorithm. There has been substantial research on how to obtain quality convex relaxations (see [3] for references), and some consideration has been given to constructing DAGs in a favorable way. In particular, [11] obtained analytic results regarding the convexifications obtained from different DAGs for trilinear monomials. They compute both the extreme-point and inequality representations of alternative relaxations and calculate their n-dimensional volumes as a comparison measure. Using volume as a measure gives a way to compare formulations analytically, and it corresponds to assuming a uniform distribution of the optimal solution across a relaxation.

Along with finding good convex relaxations, another important choice in the implementation of sBB is that of branching variable and branching point. There has been extensive computational research into branching-point selection (e.g., see [2]). The commonly used approaches (see [9]) are to take the midpoint of the upper and lower bounds of a variable, to branch on the value of the variable at the current solution, or to take a convex combination of the two; this last method ensures that the branching point is not too close to a bound. These alternatives are intuitive and have been supported by empirical evidence. Our work aims to provide analytic results for branching-point selection.

In our work, we focus on trilinear monomials, that is, functions of the form f = x1 x2 x3, where each xi is a simple variable. This is an important class of functions for sBB because the results also apply to monomials in which the xi are auxiliary variables; this means that whenever a formulation contains the product of three (or more) expressions (possibly complex themselves), our results apply.
∗ The authors gratefully acknowledge partial support from NSF grant CMMI-1160915 and ONR grant N00014-14-1-0315.
In addition, the case of non-zero lower bounds is particularly important; even if an original variable has a lower bound of zero, there is no guarantee that this will also be the case for an auxiliary variable. Furthermore, after branching, the lower bound of a variable will no longer be zero for at least one child.

Using the same notation as [11], for variables xi ∈ [ai, bi], i = 1, 2, 3, let Ωi := ai bj bk + bi aj ak, where {i, j, k} = {1, 2, 3}. Then construct a labeling such that Ω1 ≤ Ω2 ≤ Ω3; that is, we can assume (w.l.o.g.) that

a1 b2 b3 + b1 a2 a3 ≤ b1 a2 b3 + a1 b2 a3 ≤ b1 b2 a3 + a1 a2 b3.   (Ω)
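To make the relabeling concrete, here is a minimal Python sketch (our own illustration; the function name omega_labeling is ours, not from [11]) that computes Ωi for each variable of a monomial and returns an ordering of the indices under which Ω1 ≤ Ω2 ≤ Ω3 holds.

```python
def omega_labeling(a, b):
    """Given lower bounds a = (a1, a2, a3) and upper bounds b = (b1, b2, b3),
    return the variable indices ordered so that Omega_1 <= Omega_2 <= Omega_3,
    together with the corresponding Omega values."""
    def omega(i):
        j, k = [t for t in range(3) if t != i]
        return a[i] * b[j] * b[k] + b[i] * a[j] * a[k]

    order = sorted(range(3), key=omega)   # indices sorted by Omega_i
    return order, [omega(i) for i in order]

# Example: x1 in [1, 2], x2 in [1, 3], x3 in [2, 5].
order, omegas = omega_labeling((1.0, 1.0, 2.0), (2.0, 3.0, 5.0))
print(order, omegas)  # the first index listed plays the role of x1 in condition (Omega)
```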
This condition also arises in the complete characterization of the inequality description of the convex hull of the trilinear monomial f = x1 x2 x3 (see [6, 5]). The (complicated) inequality description of the convex hull is used directly by some global-optimization software (e.g., BARON and ANTIGONE). However, other software (e.g., COUENNE and SCIP) instead uses the iterative McCormick technique to obtain a (simpler) convex relaxation for the trilinear case. These alternative approaches reflect the tradeoff between using a more complicated but stronger convexification and a simpler but weaker one. Furthermore, it is not obvious which method will lead to a faster algorithm for a given problem. [11] use volume to compare the alternative (equally simple) relaxations arising from double McCormick with the trilinear convex hull.

The classic result of McCormick [4] is used to convexify bilinear monomials. Here, we are interested in the convex hull of the points (f, x1, x2) := (a1 a2, a1, a2), (a1 b2, a1, b2), (b1 a2, b1, a2), (b1 b2, b1, b2), a tetrahedron in R^3. To derive the facets of this tetrahedron, we multiply out the following inequalities and substitute the variable f for all instances of x1 x2:

(x1 − a1)(x2 − a2) ≥ 0,  (x1 − a1)(b2 − x2) ≥ 0,  (b1 − x1)(x2 − a2) ≥ 0,  (b1 − x1)(b2 − x2) ≥ 0.

When we use McCormick iteratively to convexify the trilinear monomial f := x1 x2 x3, we have three choices of double-McCormick convexifications, corresponding to which pair of variables we deal with first. For example, we could first group the variables x1 and x2, introduce an auxiliary variable w = x1 x2, and convexify, and then convexify f = w x3, also using the McCormick inequalities. However, we could instead group as x2(x1 x3) or x1(x2 x3). Concretely (using the same notation as [11]), consider the monomial f = xi xj xk, and assume that we first group the variables xi and xj. We let wij = xi xj, and so f = wij xk.

Convexify wij = xi xj:

wij − aj xi − ai xj + ai aj ≥ 0,
−wij + bj xi + ai xj − ai bj ≥ 0,
−wij + aj xi + bi xj − bi aj ≥ 0,
wij − bj xi − bi xj + bi bj ≥ 0.

Convexify f = wij xk:

f − ak wij − ai aj xk + ai aj ak ≥ 0,
−f + bk wij + ai aj xk − ai aj bk ≥ 0,
−f + ak wij + bi bj xk − bi bj ak ≥ 0,
f − bk wij − bi bj xk + bi bj bk ≥ 0.
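Purely as an illustration (this is not code from [11], and the helper names are our own), the following Python sketch assembles these eight inequalities for a given box, assuming all lower bounds are nonnegative so that [ai aj, bi bj] is a valid range for wij.

```python
def mccormick(a_x, b_x, a_y, b_y):
    """The four McCormick inequalities for w = x*y with x in [a_x, b_x] and
    y in [a_y, b_y], each returned as a function g(x, y, w) with g >= 0."""
    return [
        lambda x, y, w: w - a_y * x - a_x * y + a_x * a_y,
        lambda x, y, w: -w + b_y * x + a_x * y - a_x * b_y,
        lambda x, y, w: -w + a_y * x + b_x * y - b_x * a_y,
        lambda x, y, w: w - b_y * x - b_x * y + b_x * b_y,
    ]

def double_mccormick(a, b):
    """Inequalities of the double-McCormick relaxation that groups x_i and x_j first:
    four cuts for w_ij = x_i x_j, then four cuts for f = w_ij x_k, with
    w_ij in [a_i a_j, b_i b_j] (valid when all lower bounds are nonnegative)."""
    (ai, aj, ak), (bi, bj, bk) = a, b
    cuts_w = mccormick(ai, bi, aj, bj)            # in the variables (x_i, x_j, w_ij)
    cuts_f = mccormick(ai * aj, bi * bj, ak, bk)  # in the variables (w_ij, x_k, f)
    return cuts_w, cuts_f

# Sanity check: a point on the graph of f = x_i x_j x_k satisfies all eight cuts.
cuts_w, cuts_f = double_mccormick((1.0, 1.0, 2.0), (2.0, 3.0, 5.0))
xi, xj, xk = 1.5, 2.0, 3.0
w, f = xi * xj, xi * xj * xk
print(all(g(xi, xj, w) >= 0 for g in cuts_w) and all(g(w, xk, f) >= 0 for g in cuts_f))
```

Grouping x2(x1 x3) or x1(x2 x3) instead corresponds simply to permuting the roles of the indices before constructing the two sets of cuts.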
For each of the three double-McCormick relaxations, [11] use Fourier-Motzkin elimination to project out the auxiliary variable and obtain a system in the original variables f, xi, xj, and xk (i.e., in R^4). In doing so, they are able to compute and compare the volume of the system resulting from each choice along with the volume of the convex hull (also in R^4). A key result of [11] is that the 'optimal' (smallest-volume) double-McCormick relaxation is obtained by first grouping variables x1 and x2. We refer to the polytope arising from this relaxation as P3.

From [11], we have formulae for the volume of the convex hull and of the best double-McCormick relaxation, parameterized in terms of the upper and lower variable bounds. Let ci ∈ [ai, bi] be the branching point of variable xi. By substituting bi = ci (for the child with domain [ai, ci]) and ai = ci (for the child with domain [ci, bi]) into the appropriate formula and summing the two results, we obtain the total resulting volume, given that we branch on variable xi at point ci. Using this approach, we show that when using the convex hull and branching on any variable, the midpoint gives the smallest total volume. In this sense, the commonly-used midpoint is indeed the optimal branching point.
We compare the results from branching at the midpoint of each variable and show that branching on the first variable (labeled according to Ω) gives the smallest total volume. We then show how many steps of branching can be completed before the labeling of the variables must change and branching on a different variable becomes optimal.

Next, we consider the best double-McCormick relaxation, P3. For this relaxation, we show that when branching on variable x3, the optimal branching point is the midpoint. However, when we consider branching on either variable x1 or variable x2, the optimal branching point is not the midpoint. From [8], any double-McCormick relaxation reduces to the convex hull when the lower bounds are all zero; however, once we branch, we no longer have all-zero lower bounds, hence the difference in the optimal branching point when using a double-McCormick relaxation compared with the convex hull. We show that even in these cases, the sum of the two volumes resulting from branching (on the double-McCormick relaxation) is a convex function of the branching point over the appropriate domain. We also show that the minimum of this function always occurs at a point greater than the midpoint. Convexity is convenient because we can then find the optimal branching point via a simple bisection search.
2. Results
2.1 Trilinear hull
From [11], we have that the 4-dimensional volume of the convex hull is given by

VolPH = (b1 − a1)(b2 − a2)(b3 − a3) [ b1(5 b2 b3 − a2 b3 − b2 a3 − 3 a2 a3) + a1(5 a2 a3 − b2 a3 − a2 b3 − 3 b2 b3) ] / 24,

and the volume of the smallest double-McCormick relaxation (referred to as P3), which comes from first grouping variables x1 and x2, is given by

VolP3 = VolPH + (b1 − a1)(b2 − a2)^2 (b3 − a3)^2 [ 5(a1 b1 b2 − a1 b1 a2) + 3(b1^2 a2 − a1^2 b2) ] / ( 24 (b1 b2 − a1 a2) ).

Theorem 1. Let ci ∈ [ai, bi] be the branching point for xi. With the full convex hull, the smallest total volume after branching is obtained when ci = (ai + bi)/2, i.e., branching at the midpoint is optimal.

Theorem 2. With the full convex hull (and branching at the midpoint of a variable), branching on x1 obtains the smallest total volume and branching on x3 obtains the largest total volume.

Proposition 3. With the full convex hull and branching on x1 at the midpoint, for the left interval, if sBB bounds tightening does not occur, the optimal branching variable will not change until log2( a2(b1 − a1) / (a1(b2 − a2)) ) steps.

Proposition 4. With the full convex hull and branching on x1 at the midpoint, for the right interval, if sBB bounds tightening does not occur, the optimal branching variable will not change until log2( b2(b1 − a1) / (b1(b2 − a2)) ) steps.
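The next Python sketch (again our own illustration, not code accompanying [11]) encodes the two volume formulae above and the total volume obtained after branching, computed by evaluating the chosen formula on each child box and summing. On an example box satisfying (Ω), it can be used to check numerically that, for the convex hull, the midpoint minimizes the total volume, as Theorem 1 states.

```python
def vol_hull(a, b):
    """4-dimensional volume of the trilinear convex hull over [a1,b1] x [a2,b2] x [a3,b3]
    (the formula above; it presumes the labeling satisfies condition (Omega))."""
    (a1, a2, a3), (b1, b2, b3) = a, b
    return ((b1 - a1) * (b2 - a2) * (b3 - a3)
            * (b1 * (5*b2*b3 - a2*b3 - b2*a3 - 3*a2*a3)
               + a1 * (5*a2*a3 - b2*a3 - a2*b3 - 3*b2*b3))) / 24.0

def vol_p3(a, b):
    """Volume of the best double-McCormick relaxation P3 (grouping x1 and x2 first)."""
    (a1, a2, a3), (b1, b2, b3) = a, b
    extra = ((b1 - a1) * (b2 - a2)**2 * (b3 - a3)**2
             * (5 * (a1*b1*b2 - a1*b1*a2) + 3 * (b1**2 * a2 - a1**2 * b2))
             / (24.0 * (b1*b2 - a1*a2)))
    return vol_hull(a, b) + extra

def total_volume_after_branching(vol, a, b, i, c):
    """Sum of the two child volumes when branching on variable i (0-based) at point c:
    substitute b_i = c for the child [a_i, c] and a_i = c for the child [c, b_i]."""
    left_b = list(b); left_b[i] = c
    right_a = list(a); right_a[i] = c
    return vol(a, left_b) + vol(right_a, b)

# Example box satisfying (Omega): x1 in [1,3], x2 in [2,5], x3 in [1,2].
a, b = (1.0, 2.0, 1.0), (3.0, 5.0, 2.0)
grid = [a[0] + t * (b[0] - a[0]) / 100.0 for t in range(1, 100)]
best = min(grid, key=lambda c: total_volume_after_branching(vol_hull, a, b, 0, c))
print(best, (a[0] + b[0]) / 2.0)  # for the hull, the best grid point is the midpoint
```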
2.2 Best double-McCormick relaxation
Some software does not use the explicit convex hull for trilinear monomials but instead employs repeated McCormick to obtain a relaxation. Here we describe some branching-point analysis for the double-McCormick relaxation P3 (the relaxation with the smallest volume).
Theorem 5. Let c3 ∈ [a3, b3] be the branching point for x3. Using the relaxation P3, the smallest total volume after branching is obtained when c3 = (a3 + b3)/2, i.e., branching at the midpoint is optimal.

Next we consider branching on x1 and x2. Even for the special case of ai = 0 and bi = 1 for i = 1, 2, the midpoint of, say, x1 is not the optimal branching point when using the relaxation P3. Substituting these values into the volume formulae, we find that the minimum of the appropriate (convex) function is obtained when branching at x1 = √3/3 ≈ 0.577.

Theorem 6. For i = 1, 2 and using the relaxation P3, the total volume of the relaxations after branching on xi,

VolP3|ai=ci + VolP3|bi=ci,

is a convex function of the branching point ci over the domain ci ∈ [ai, bi].

Proposition 7. For i = 1, 2 and using the relaxation P3, the minimum of the convex function VolP3|ai=ci + VolP3|bi=ci over the domain ci ∈ [ai, bi] occurs at some value of ci > (ai + bi)/2.
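Because Theorem 6 guarantees convexity, the optimal branching point for P3 can be located by a simple bisection search, as noted earlier. The sketch below is again illustrative only; it reuses vol_p3 and total_volume_after_branching from the previous sketch, bisects on a central-difference approximation of the derivative, and on the unit box (our own choice of example, with all lower bounds zero) should return a value near √3/3 ≈ 0.577 when branching on x1, to the right of the midpoint as Proposition 7 predicts.

```python
def argmin_branch_point(vol, a, b, i, tol=1e-8, h=1e-6):
    """Bisection on a central-difference derivative of the total-volume function
    (convex in the branching point by Theorem 6); assumes vol_p3 and
    total_volume_after_branching from the previous sketch are in scope."""
    phi = lambda c: total_volume_after_branching(vol, a, b, i, c)
    dphi = lambda c: (phi(c + h) - phi(c - h)) / (2.0 * h)
    lo, hi = a[i] + 10 * h, b[i] - 10 * h
    if dphi(lo) >= 0:   # minimum at (or before) the left end of the domain
        return lo
    if dphi(hi) <= 0:   # minimum at (or after) the right end of the domain
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dphi(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Unit box, relaxation P3, branching on x1: the optimum lies to the right of the
# midpoint (Proposition 7), here near 3**0.5 / 3 ~ 0.577.
a, b = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
print(argmin_branch_point(vol_p3, a, b, 0))
```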
3. Conclusions
We have presented some analytic results on branching-variable and branching-point selection in the context of sBB applied to models having functions involving the multiplication of three or more terms. Of course, variables often appear in multiple functions, and therefore, when deciding on a branching variable or a branching point, we may obtain conflicting guidance. But this is an issue with any branching rule, including those tested empirically, and it is always a challenge to find good ways to combine local information to make algorithmic decisions.
References

[1] C.S. Adjiman, S. Dallwig, C.A. Floudas, and A. Neumaier. A global optimization method, αBB, for general twice-differentiable constrained NLPs: I. Theoretical advances. Computers & Chemical Engineering, 22(9):1137–1158, 1998.
[2] P. Belotti, J. Lee, L. Liberti, F. Margot, and A. Wächter. Branching and bounds tightening techniques for non-convex MINLP. Optimization Methods & Software, 24(4-5):597–634, 2009.
[3] S. Cafieri, J. Lee, and L. Liberti. On convex relaxations of quadrilinear terms. Journal of Global Optimization, 47:661–685, 2010.
[4] G.P. McCormick. Computability of global solutions to factorable nonconvex programs: Part I. Convex underestimating problems. Mathematical Programming, 10:147–175, 1976.
[5] C.A. Meyer and C.A. Floudas. Trilinear monomials with mixed sign domains: Facets of the convex and concave envelopes. Journal of Global Optimization, 29:125–155, 2004.
[6] C.A. Meyer and C.A. Floudas. Trilinear monomials with positive or negative domains: Facets of the convex and concave envelopes. Frontiers in Global Optimization, pages 327–352, 2004.
[7] H.S. Ryoo and N.V. Sahinidis. A branch-and-reduce approach to global optimization. Journal of Global Optimization, 8(2):107–138, 1996.
[8] H.S. Ryoo and N.V. Sahinidis. Analysis of bounds for multilinear functions. Journal of Global Optimization, 19(4):403–424, 2001.
[9] N. Sahinidis. BARON 15.6.5: Global Optimization of Mixed-Integer Nonlinear Programs, User's Manual, 2015.
[10] E.M.B. Smith and C.C. Pantelides. A symbolic reformulation/spatial branch-and-bound algorithm for the global optimisation of nonconvex MINLPs. Computers & Chemical Engineering, 23:457–478, 1999.
[11] E. Speakman and J. Lee. Quantifying double McCormick. arXiv:1508.02966v2, http://arxiv.org/abs/1508.02966, 2015.