How to Compute a Local Minimum of the MPCC

3 downloads 0 Views 1016KB Size Report
May 20, 2017 - Tangi Migot, J.-P Dussault, M Haddou, A Kadrani. To cite this ...... [23] Abdeslam Kadrani, Jean Pierre Dussault, and Abdelhamid Benchakroun.
How to Compute a Local Minimum of the MPCC Tangi Migot, J.-P Dussault, M Haddou, A Kadrani

To cite this version: Tangi Migot, J.-P Dussault, M Haddou, A Kadrani. How to Compute a Local Minimum of the MPCC. 2017.

HAL Id: hal-01525402 https://hal.archives-ouvertes.fr/hal-01525402 Submitted on 20 May 2017

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destin´ee au d´epˆot et `a la diffusion de documents scientifiques de niveau recherche, publi´es ou non, ´emanant des ´etablissements d’enseignement et de recherche fran¸cais ou ´etrangers, des laboratoires publics ou priv´es.

How to Compute a Local Minimum of the MPCC Migot, T.





Dussault, J.-P.

Haddou, M.



Kadrani, A.



May 20, 2017

Abstract We discuss here the convergence of relaxation methods for MPCC with approximate sequence of stationary points by presenting a general framework to study these methods. It has been pointed out in the literature, [25], that relaxation methods with approximate stationary points fail to give guarantee of convergence. We show that by dening a new strong approximate stationarity we can attain the desired goal of computing an M-stationary point. We also provide an algorithmic strategy to compute such point. Existence of strong approximate stationary point in the neighborhood of an M-stationary point is proved.

Keywords : nonlinear programming - MPCC - MPEC - relaxation methods - regularization methods - stationarity - constraint qualication - complementarity - optimization model with complementarity constraints AMS Subject Classication : 90C30, 90C33, 49M37, 65K05

1 Introduction We consider the Mathematical Program with Complementarity Constraint

min f (x)

x∈Rn s.t.

with

f : R n → R, h : Rn → Rm , g : Rn → Rp

is a shortcut for

u ≥ 0, v ≥ 0

and

uT v = 0.

G, H : Rn → Rq . All these functions are assumed The notation 0 ≤ u ⊥ v ≥ 0 for two vectors u and v

and

be continuously dierentiable through this paper.

Rn

(MPCC)

h(x) = 0, g(x) ≤ 0, 0 ≤ G(x) ⊥ H(x) ≥ 0,

to in

In this context solving the problem means nding a local

minimum. Even so this goal apparently modest is hard to achieve in general. This problem has become an active subject in the literature since the last two decades and has been the subject of several monographs [28, 29] and PhD thesis [11, 21, 17, 34, 32, 6]. The wide variety of applications that can be casted as an MPCC is one of the reason for this popularity.

Among other we can cite truss

topology optimization [17], discrete optimization [1], image restoration [3], optimal control [2, 16]. Otherwise, another source of problem are bilevel programming problems [8, 9], where the lower-level problem is replaced by its optimality conditions. This may lead to a more general kind of problem called Mathematical Program with Equilibrium Constraint [29] or Optimization Problem with Variational Inequality Constraint [20]. The MPCC formulation has been the most popular in the literature motivated by more accessible numerical approaches. (MPCC) is clearly a non-linear programming problem and in general most of the functions involved in the formulation are non-convex. In this context solving the problem means nding a local minimum. Even ∗ IRMAR-Insa † BISOUS,

Département d'informatique, Université de Sherbrooke

‡ INSEA

1

so this goal apparently modest is hard to achieve in general due to the degenerate nature of the MPCC. Therefore, numerical methods that consider only rst order informations may expect to compute a stationary point. The wide variety of approaches with this aim computes the KKT conditions, which require that some constraint qualication holds at the solution to be an optimality condition. However, it is well-known that these constraint qualications never hold in general for (MPCC). For instance, the classical MangasarianFromowitz constraint qualication that is very often used to guarantee convergence of algorithms is violated at any feasible point. This is partly due to the geometry of the complementarity constraint that always has an empty relative interior. These issues have motivated the denition of enhanced constraint qualications and optimality conditions for (MPCC) as in [20, 19, 30, 13] to cite some of the earliest research. In 2005, Flegel & Kanzow provide an essential result that denes the right necessary optimality condition to (MPCC). This optimality condition is called M(Mordukhovich)-stationary condition. The name comes from the fact that those conditions are derived by using Mordukhovich normal cone in the usual optimality conditions of (MPCC). In view of the constraint qualications issues pledge to the (MPCC) the relaxation methods provide an intuitive answer. The complementarity constraint is relaxed using a parameter so that the new feasible domain is not thin anymore. It is assumed here that the classical constraints

g(x) ≤ 0

and

h(x) = 0

are not

more dicult to handle than the complementarity constraint. Finally, the relaxing parameter is reduce to converge to the feasible set of (MPCC) similar to an homotopy technique. The interest for such methods has already been the subject of some PhD thesis in [32, 34] and is an active subject in the literature. These methods have been suggested in the literature back to 2000 by Scheel & Scholtes in [30] replacing the complementarity

∀i ∈ {1, . . . , q}

by

Gi (x)Hi (x) − t ≤ 0, . For more clarity we denote

Φt (x)

the map that relaxed the complementarity constraint and so in this case

∀i ∈ {1, . . . , q} ΦSS t,i (x) = Gi (x)Hi (x) − t, .

(SS)

This natural approach is later extended by Demiguel, Friedlander, Nogales & Scholtes in [7] by also relaxing the positivity constraints

G(x) ≥ −t, H(x) ≥ −t.

In [27], Lin & Fukushima improve this relaxation

by expressing the same set with two constraints instead of three.

This improvement leads to improved

constraint qualication satised by the relaxed sub-problem. Even so the feasible set is not modied this improved regularity does not come as a surprise, since constraint qualication measures the way the feasible set is described and not necessarily the geometry of the feasible set itself. In [33], the authors consider a

G(x) = H(x) = 0 in the following way ∀i ∈ {1, . . . , q} ( |Gi (x) − Hi (x)| if |Gi (x) − Hi (x)| ≥ t , (SU) ΦSU t,i (Gi (x), Hi (x)) = Gi (x) + Hi (x) − i (x) ) otherwise tψ( Gi (x)−H t

relaxation of the same type but only around the corner

ψ is a suitable function as described in [33]. An example of such function being ψ(z) = π2 sin( π2 z + 3π 2 ) + 1. In the corresponding papers it has been shown that under suitable conditions providing convergence of

where

the methods they still might converge to some spurious point, called C-stationary point. The convergence to M-stationary being guaranteed only under some second-order condition. Up to this point it is to be noted that dierent methods used in the literature such as interior-point methods, smoothing of an NCP function and elastic net methods share a lot of common properties with the (SS) method and its extension. A new perspective for those schemes has been motivated in [22] providing an approximation scheme with convergence to M-stationary point by considering

∀i ∈ {1, . . . , q}

ΦKDB (x) = (Gi (x) − t)(Hi (x) − t), . t,i

(KDB)

This is not a relaxation since the feasible domain of (MPCC) is not included in the feasible set of the subproblems.

The method has been extended has a relaxation method through a NCP function in [24]

2

∀i ∈ {1, . . . , q}

as

ΦKS t,i (x) = φ(Gi (x) − t, Hi (x) − t), .

(KS)

In a recent paper, [25], Kanzow & Schwartz discuss convergence of the method (KS) considering sequence of approximate stationary points, that is a point that satisfy approximately the KKT conditions.

They

illustrate the fact that the method may converge to spurious weak-stationary point. Our motivation in this paper is to deal with this issue and present an algorithmic approach. We present in this paper a generalized framework to study relaxation methods that encompass the methods KDB, KS and the recent buttery relaxation. We introduce a new method called asymmetric relaxation that also belong to this framework. Then, by introducing a new kind of approximate stationary point we prove that the methods that belong to the generalized framework converge to M-stationary points. We also deal with the question of existence and computation of such point. Existence is proved in the neighbourhood of an M-stationary point without the need of constraint qualication. An active set-penalization scheme that is a generalization of the penalized-active-set method proposed in [23] is proposed to solve the sub-problems of the relaxation. The rest of this paper is organized as follows : Section 2 introduces the basic knowledge from non-linear programming and mathematical programming with complementarity constraints that will be extensively used along this paper. Section 3 presents a unied framework to study relaxation methods and Section 4 illustrates that the strongest existing methods as well as a new asymmetric relaxation belong to this framework. Section 5 motivates the diculty of computing an stationary point of MPCC by using approximate sequences of stationary point. These issues are solved in Section 6, which presents a new notion of approximate stationary point that is sucient to guarantee the well-behaviour of relaxation methods. Section 7 discuss existence of approximate stationary point in a neighborhood of a solution that may not exist. These issues are handled in two steps. Firstly, Section 8 introduces the MPCC with slack variables. Secondly, Section 9 shows that for the MPCC with slack variables, existence of strong epsilon stationary point is guaranteed under very mild conditions. Finally, Secion 10 presents an algorithmic strategy to compute the new approximate stationary point, while Section 11 shows preliminaries numerical results using this strategy.

Notations

Through this paper, we use classical notation in optimization. Let

xT

denotes the transpose

x is denoted ∇x f (x) ∇f (x) when the derivative is clear from the context. supp(x) for x ∈ Rn is the set of indices such that xi 6= 0 for i ∈ {1, . . . , n}. 1l is the vector whose components are all one. R+ and R++ denotes the set of n non-negative and positive real numbers. Given two vectors u and v in R , let u ◦ v denotes the Hadamard product of two vectors so that u ◦ v = (ui vi )1≤i≤n . We also use classical asymptotic Landau notations : f (x) = o(g(x)) as x → a if and only if for all positive constant M there exists a positive number δ such f (x) that |f (x)| ≤ M |g(x)| for all |x − a| ≤ δ , in other words limx→a g(x) = 0. f (x) = ω(g(x)) as x → a if and only if for all positive constant M there exists a positive number δ such that |f (x)| ≥ M |g(x)| for all |x − a| ≤ δ . (x) = K with K a positive nite constant. f (x) ∼ (g(x)) as x → a if and only limx→a fg(x) of a vector or a matrix

x.

The gradient of a function

f

at a point

x

with respect to

and

2 Preliminaries We give classical denitions from non-linear programming and then present their enhanced version to MPCC that will be used in the sequel.

3

2.1 Non-Linear Programming Let a general non-linear program be

minf (x) x∈R

s.t.

(NLP)

h(x) = 0, g(x) ≤ 0,

h : Rn → Rm , g : Rn → Rp and f : Rn → R. Denote F the feasible region of (NLP), the set of active r T g indices Ig (x) := {i ∈ {1, ..., p} | gi (x) = 0} at x, the generalized Lagrangian L (x, λ) = rf (x) + g(x) λ + T h g h r h(x) λ with λ = (λ , λ ) and M (x) is the set of index r multipliers. r By denition, λ is an index r multiplier for (NLP) at a feasible point x if (r, λ) 6= 0 and ∇x L (x, λ) = T 0, λ ≥ 0, g(x) λ = 0. An index 0 multiplier is also called singular multiplier, [4], or an abnormal multiplier, [5]. We call a KKT-point a couple (x, λ) with λ an 1-index multiplier at x. A couple (x, λ) with λ a 0-index multiplier at x is called Fritz-John point. with

In the context of solving non-linear program, that is nding a local minimum, one widely used technique is to compute necessary conditions also called optimality conditions. The principle tool is the Karush-KuhnTucker (KKT) conditions. Let there exists

1



λ ∈ M (x )

x∗

be a local minimum of (NLP) that satisfy a constraint qualication, then

that satisfy

∇x L1 (x∗ , λ) = 0,

(KKT)

h(x∗ ) = 0, min(−g(x∗ ), λg ) = 0 A point

(x∗ , λ)

that satisfy (KKT) is called a stationary point.

In the context of numerical computation it can be dicult to compute stationary points. Hence, it is of interest to consider epsilon-stationary points.

Denition 2.1

(epsilon-stationary point). Given a general non-linear program (NLP) and  ≥ 0 with  ∈ Rn × Rm × Rp . We say that (x∗ , λ) ∈ Rn × Rm × Rp is an epsilon-stationary point (or an epsilon-KKT point) if it satises

∇x L1 (x∗ , λ) ≤  , ∞

and |hi (x∗ )| ≤ , ∀i ∈ {1, . . . , m} gi (x∗ ) ≤ , λgi ≥ 0, |λgi gi (x∗ )| ≤  ∀i ∈ {1, . . . , p} At

=0

.

we get the classical denition of a stationary point of (NLP).

2.2 Mathematical Program with Complementarity Constraints Let

Z

be the set of feasible points of (MPCC). Given

I +0 (x)

x ∈ Z,

we denote

:= {i ∈ {1, . . . , n} | Gi (x) > 0

and

Hi (x) = 0}

(x)

:= {i ∈ {1, . . . , n} | Gi (x) = 0

and

Hi (x) > 0}

I 00 (x)

:= {i ∈ {1, . . . , n} | Gi (x) = 0

and

Hi (x) = 0}

I

0+

We dene the generalized MPCC-Lagrangian function of (MPCC) as

LrM P CC (x, λ) := rf (x) + λg g(x) + λh h(x) − λG G(x) − λH H(x) with

λ := (λg , λh , λG , λH ).

We remind that the tangent cone of a set

X

TX (x∗ ) = {d ∈ Rn | ∃tk ≥ 0

at

x∗ ∈ X

and

is a closed cone dened by

xk →X x∗ 4

s.t.

tk (xk − x∗ ) → d} .

The cone

LM P CC ,

dened in [30] is given as the following not necessarily convex cone

LM P CC (x∗ ) := {d ∈ Rn |∇hi (x∗ )T d = 0 ∀i = 1, . . . , m, ∇gi (x∗ )T d ≤ 0 ∀i = 1, . . . , p, ∇Gi (x∗ )T d = 0 ∀i ∈ I 0+ (x∗ ), ∇Hi (x∗ )T d = 0 ∀i ∈ I +0 (x∗ ), ∇Gi (x∗ )T d ≥ 0, ∇Hi (x)T d ≥ 0 ∀i ∈ I 00 (x∗ ), (∇Gi (x∗ )T d)(∇Hi (x∗ )T d) = 0 ∀i ∈ I 00 (x∗ )} . Due to [12], one always has the following inclusions

TZ (x∗ ) ⊆ LM P CC (x∗ ) . Given a cone

K ⊂ Rn ,

the polar of

K

is the cone dened by

K ◦ := {z ∈ Rn | z T x ≤ 0, ∀x ∈ K}.

We can now dene a mild constraint qualication for (MPCC) called MPCC-Guignard CQ.

Denition 2.2.

∗ ◦ Let x∗ ∈ Z . We say that MPCC-GCQ holds at x∗ if TZ◦ (x∗ ) = LM P CC (x )

In general, there does not exist KKT stationary points since (MPCC) is highly degenerate and does not satisfy classical constraint qualication from non-linear programming. So we introduce weaker stationary concept as in [30, 19].

Denition 2.3 (Stationary point).

x∗ ∈ Z is said

• Weak-stationary if there exists λ ∈ Rm × Rp × Rq × Rq such that ∇x L1M P CC (x∗ , λ) = 0, min(−g(x∗ ), λg ) = 0, 0+ ∗ ∀i ∈ I +0 (x∗ ), λG (x ), λH i = 0, and ∀i ∈ I i = 0.

• Clarke-stationary point if x∗ is weak-stationary and H ∀i ∈ I 00 (x∗ ), λG i λi ≥ 0.

• Alternatively (or Abadie)-stationary point if x∗ is weak-stationary and H ∀i ∈ I 00 (x∗ ), λG i ≥ 0 or λi ≥ 0.

• Mordukhovich-stationary point if x∗ is weak-stationary and H G H ∀i ∈ I 00 (x∗ ), either λG i > 0, λi > 0 or λi λi = 0.

• Strong-stationary point if x∗ is weak-stationary and H ∀i ∈ I 00 (x∗ ), λG i ≥ 0, λi ≥ 0. Relations between these notions are straightforward from the denition. The following theorem is a keystone to dene necessary optimality conditions for (MPCC).

Theorem 2.1 ([14]).

A local minimum of is an M-stationary point.

(MPCC)

that satises MPCC-GCQ or any stronger MPCC-CQ

5

Figure 1: Signs of

λG , λH

for indices

i ∈ I 00 .

From the left to the right : weak-stationary, C-stationary,

A-stationary, M-stationary and S-stationay.

Therefore, devising algorithms to reach KKT stationary points (S-stationary) is not possible in general, and we must satisfy ourselves in devising algorithms reaching M-stationary points. The following example due to Kanzow and Schwartz exhibits a situation where the global minimizer is not a KKT point but an M-stationary point. We will return to this example later on.

Example 2.1.

min

x1 + x2 − x3

s.t.

g1 (x) := −4x1 + x3 ≤ 0 g2 (x) := −4x2 + x3 ≤ 0 0 ≤ G(x) := x1 ⊥ H(x) := x2 ≥ 0.

x∈R3

The global solution is (0, 0, 0)t but is not a KKT point. Indeed, the gradient of the Lagrangian equal to zero yields             1 0 0 0 −4 1 g g 0 − λH 1 + η 0 0 =  1  + λ1  0  + λ2 −4 − λG 1 2 −1 0 0 0 1 1 H and since λg1 + λg2 = 1(third line), summing the rst two lines yields 2 − 4(λg1 + λg2 ) − λG 1 − λ2 = 0 and G H therefore λ1 + λ2 = −2; both cannot be non-negative. Apart from MPCC-GCQ there exists a wide variety of MPCC-constraint qualication described in the literature. We conclude this section by dening only a short selection of them with interesting properties for relaxation methods. Most of these conditions are constraint qualication from non-linear programming that are extended for (MPCC). One of the most principal constraint qualication used in the literature of (MPCC) is the MPCC-LICQ, see [31] for a discussion on this CQ. In a similar way we extend CRCQ as in [15]. A condition that is similar was used in [24, 18] to prove convergence of relaxation methods for (MPCC).

Denition 2.4.

Let x∗ ∈ Z .

1. MPCC-LICQ holds at x∗ if the gradients {∇gi (x∗ ) (i ∈ Ig (x∗ )), ∇hi (x∗ ) (i = 1, . . . , m), ∇Gi (x∗ ) (i ∈ I 00 (x∗ )∪I 0+ (x∗ )), ∇Hi (x∗ ) (i ∈ I 00 (x∗ )∪I +0 (x∗ ))}

are linearly independent. 2. MPCC-CRCQ holds at x∗ if there exists δ > 0 such that, for any subsets I1 ⊆ Ig (x∗ ), I2 ⊆ {1, . . . , m}, I3 ⊆ I 0+ (x∗ ) ∪ I 00 (x∗ ), and I4 ⊆ I +0 (x∗ ) ∪ I 00 (x∗ ), the family of gradients {∇gi (x∗ ) (i ∈ I1 ), ∇hi (x∗ ) (i ∈ I2 ), ∇Gi (x∗ ) (i ∈ I3 ), ∇Hi (x∗ ) (i ∈ I4 )}

has the same rank for each x ∈ Bδ (x∗ ), where Bδ (x∗ ) is the ball of radius δ centered at x∗ .

6

In order to prove convergence of very general relaxation methods we consider the denition of MPCCCRSC, which was introduce and proved to be a constraint qualication very recently in [10]. The polar of the cone

LM P CC

is a key tool in the denition of constraint qualication. It is however not

trivial to compute. Therefore, we introduce the following cone:

PM (x∗ ) := {d ∈ Rn | ∃(λg , λh , λG , λH ) ∈ Rp+ × Rm × Rq × Rq with

d=

H G H 00 ∗ λG i λi = 0 or λi > 0, λi > 0 ∀i ∈ I (x ), m X X λgi ∇gi (x∗ ) + λhi ∇hi (x∗ )

i∈Ig (x∗ )

i=1

X



i∈I 0+ (x∗ )∪I 00 (x∗ )

Denition 2.5. that

X

∗ λH i ∇Hi (x )}.

i∈I +0 (x∗ )∪I 00 (x∗ )

00 00 00 Let x∗ ∈ Z . MPCC-CRSC holds at x∗ if for any partition I++ ∪ I0− ∪ I−0 = I 00 (x∗ ) such

λgi ∇gi (x∗ ) +

m X

X

∗ λG i ∇Gi (x ) +

00 i∈I−0

X

λhi ∇hi (x∗ ) −

∗ λG i ∇Gi (x ) −

00 i∈I 0+ (x∗ )∪I++

i=1

i∈Ig

+

X

∗ λG i ∇Gi (x ) −

X

X

∗ λH i ∇Hi (x )

00 i∈I +0 (x∗ )∪I++

∗ λH i ∇Hi (x ) = 0 ,

00 i∈I0−

G 00 H 00 H 00 with λgi ≥ 0 (i ∈ Ig (x∗ )),λG i and λi ≥ 0 (i ∈ I++ ), λi > 0 (i ∈ I−0 ), λi (i ∈ I0− ) > 0, there exists δ > 0 such that the family of gradients

{∇gi (x) (i ∈ I1 ), ∇hi (x) (i = 1, . . . , m), ∇Gi (x) (i ∈ I3 ), ∇Hi (x) (i ∈ I4 )}

has the same rank for every x ∈ Bδ (x∗ ), where I1 := {i ∈ Ig (x∗ )| − ∇gi (x∗ ) ∈ PM (x∗ )}, 00 00 I3 := I 0+ (x∗ ) ∪ {i ∈ I++ |∇Gi (x∗ ) ∈ PM (x∗ )} ∪ I−0 , 00 00 I4 := I +0 (x∗ ) ∪ {i ∈ I++ |∇Hi (x∗ ) ∈ PM (x∗ )} ∪ I0− .

−∇Gi (x∗ ) and −∇Hi (x∗ ) belong to PM (x∗ ). Indeed, we G H 00 00 already require that λi and λi must be non-zero respectively for the indices i ∈ I−0 and i ∈ I0− and so it implies that these gradients belong to this set. It is not necessary to add that the gradients

Furthermore, MPCC-CRSC is weaker than MPCC-CRCQ. Indeed, MPCC-CRCQ requires that every family of linearly dependant gradients remains linearly dependant in some neighbourhood, while the MPCCCRSC condition consider only the family of gradients that are linearly dependant with coecients that have M-stationary signs.

3 A Unied Framework for Relaxation/Approximation Methods In the past decade, several methods have been proposed to compute an M-stationary point of (MPCC). The rst was the approximation scheme proposed by [22], which was latter improved as a relaxation by [24]. This relaxation scheme has been generalized recently in [10] to a more general family of relaxation schemes. We proposed in this section a unied framework that embraces those methods and may be used to derive new methods.

7

Consider the following parametric non-linear program

Rt (x, s)

parametrized by the vector

t:

min f (x)

x∈Rn s.t.

limktk→0 t¯(t) → 0 and the relaxation map Φ : Rn → Rq . In the sequel we skip t¯ to simplify the notation. It is to be noted here that t is a vector of an arbitrary size denoted l as for instance in [10] where l = 2. The generalized Lagrangian function of (Rt (x)) m p q q q is dened for ν ∈ R × R × R × R × R as with

t¯ : Rl+ → R+

(Rt (x))

g(x) ≤ 0, h(x) = 0, G(x) ≥ −t¯(t)1l, H(x) ≥ −t¯(t)1l, Φ(G(x), H(x); t) ≤ 0,

such that

the dependency in t and denote

LrRt (x, ν) := rf (x) + g(x)T ν g + h(x)T ν h − G(x)T ν G − H(x)T ν H + Φ(G(x), H(x); t)T ν Φ . Let



be the set of active indices for the constraint

Φ(G(x), H(x); t) ≤ 0

so that

IΦ (x; t) := {i ∈ {1, . . . , q} | Φi (G(x), H(x); t) = 0} . The denition of a generic relaxation scheme is completed by the following hypothesis :

• Φ(G(x), H(x); t) is a continuously dierentiable so that Φi (G(x), H(x); t) := Φ(Gi (x), Hi (x); t). •

real valued map extended component by component,

Direct computations give that the gradient with respect to all

x ∈ Rn

x

for

i ∈ {1, . . . , q}

of

Φi (G(x), H(x); t)

for

is given by

∇x Φi (G(x), H(x); t) = ∇Gi (x)αiH (x; t) + ∇Hi (x)αiG (x; t) , αH (x; t) and αG (x; t) are continuous maps by smoothness assumption on Φ(G(x), H(x); t), which assume satisfy ∀x ∈ Z

where we

lim αH (x; t) = H(x)

ktk→0



At the limit when

ktk

and

lim αG (x; t) = G(x) .

ktk→0

(H2)

goes to 0, the feasible set of the parametric non-linear program (Rt (x)) must

converges to the feasible set of (MPCC). In other words, given

F(t) the feasible set of (Rt (x)) it holds

that

lim F(t) = Z ,

(H3)

ktk→0 where the limit is assume pointwise.



At the boundary of the feasible set of the relaxation of the complementarity constraint it holds that for all

i ∈ {1, . . . , q} Φi (G(xk ), H(xk ); t) = 0 ⇐⇒ FGi (x; t) = 0

or

FH i (x; t) = 0,

(H4)

where

FG (x; t) := G(x) − ψ(H(x); t), FH (x; t) := H(x) − ψ(G(x); t), and

ψ

(1)

is a continuously dierentiable real valued function extended component by component. Note

ψ may be two dierent functions in (1) as long as they satisfy the assumptions below. ψ(H(x); t), ψ(G(x); t) are non-negative for all x ∈ {x ∈ Rn | Φ(G(x), H(x); t) = 0} ∀z ∈ Rq lim ψ(z; t) = 0 . (H5)

that the function Those functions and satisfy

ktk→0

8

We prove in Lemma 6.1 that this generic relaxation scheme for (MPCC) converges to an M-stationary point requiring the following essential assumption on the functions respect to the rst variable of

ψ

satises

ψ.

As

t

goes to 0 the derivative with

∀z ∈ Rq ∂ψ(x; t) =0. lim ∂x x=z ktk→0

(H6)

We conclude this section by giving an explicit formula for the relaxation map at the boundary of the feasible set.

Lemma 3.1.

Given Φ(G(x), H(x); t) be such that for all i ∈ IΦ (x; t) Φi (G(x), H(x); t) = FGi (x; t)FH i (x; t) .

The gradient with respect to x of Φi (G(x), H(x); t) for i ∈ IΦ (x; t) is given by ∇x Φi (G(x), H(x); t) := ∇Gi (x)αiH (x; t) + ∇Hi (x)αiG (x; t) ,

with ∂ψ(x; t) FH i (x; t), = FGi (x; t) − ∂x x=Hi (x) ∂ψ(x; t) H FGi (x; t). αi (x; t) = FH i (x; t) − ∂x αiG (x; t)

x=Gi (x)

4 Existing Methods under the Unied Framework In this section, we illustrate the fact that the existing methods in the literature fall under this unied framework.

Indeed, the approximation method from Kadrani et al.

[22] as well as the two relaxation

methods from Kanzow & Schwartz [24] and from Dussault,Haddou & Migot [10] satisfy those hypothesis. We conclude this section by presenting a new asymetric relaxation method that also belong to our framework. An optimization method that satises all of the 6 hypothesis dened in the previous section is called an UF-method.

4.1 The Boxes Approximation In 2009 Kadrani, Dussault and Bechakroun introduce a method, which enjoys the desired goal to converges to an M-stationary point, see [22]. Their original method consider an approximation of the complementarity constraints as a union of two boxes connected only on one point

(t, t)

(here

t ∈ R),

in the following way

∀i ∈ {1, . . . , q}: ΦKDB (x; t) = (Gi (x) − t)(Hi (x) − t) . i

(2)

This is not a relaxation but an approximation, since the feasible domain of the relaxed problem does not include the feasible domain of (MPCC) as illustrated on Figure 2.

Proposition 4.1. Proof.

The approximation scheme (Rt (x)) with

Continuity of the map

Φ

(2)

is an UF-method.

as well as (H3) has been proved in [22].

(H4) is satised by construction considering

ψ(z; t) = t.

In this case (H6) and (H5) are obviously satised.

Now, we consider (H2). Direct computations give that the gradient of

Φ

for all

i ∈ {1, . . . , q}

∇x ΦKDB (x; t) = ∇Gi (x)(Hi (x) − t) + ∇Hi (x)(Gi (x) − t) . i

9

is given by

Figure 2: Feasible set of the approximation (2).

Therefore,

αiG

and

αiH

are given by

αiH (x; t) = Hi (x) − t, αiG (x; t) = Gi (x) − t. It clearly holds that

αiG (x; t) →ktk→0 Gi (x)

and

αiH (x; t) →ktk→0 Hi (x).

So, in this case (H2) is satised.

This completes the proof that all of the 6 hypothesis are satised and so the approximation (2) is an UF-method.

4.2 The L-shape Relaxation The previous method has latter been extended to a relaxation in [24] as illustrated on Figure 3 using a piecewise NCP function by considering

∀i ∈ {1, . . . , q}

ΦKS i (x; t) = φ(Gi (x) − t, Hi (x) − t) , where

2

φ:R →R

is a continuously dierentiable NCP-function with for instance

( φ(a, b) =

Proposition 4.2. Proof.

(3)

ab, if a + b ≥ 0 − 21 (a2 + b2 ), if a + b < 0

The relaxation scheme (Rt (x)) with

Continuity of the map

Φ

(3)

.

is an UF-method.

as well as (H3) has been proved in [24].

(H4) is satised by construction considering

ψ(z; t) = t.

In this case (H6) and (H5) are obviously satised.

Φ for all i ∈ {1, . . . , q} is given ( ∇Gi (x)(Hi (x) − t) + ∇Hi (x)(Gi (x) − t) if Hi (x) − t + Gi (x) − t ≥ 0 ∇x ΦKS . i (x; t) = −∇Gi (x)(Gi (x) − t) − ∇Hi (x)(Hi (x) − t) else

Now, we consider (H2). Direct computations give that the gradient of

Therefore,

αiG

and

αiH

are given by

(

Hi (x) − t if Hi (x) − t + Gi (x) − t ≥ 0 −(Gi (x) − t) otherwise ( Gi (x) − t if Hi (x) − t + Gi (x) − t ≥ 0 αiG (x; t) = −(Hi (x) − t) otherwise

αiH (x; t) =

10

,

.

by

Figure 3: Feasible set of the relaxation (3).

In the case

Hi (x) − t + Gi (x) − t ≥ 0

it clearly holds that

αiG (x; t) → Gi (x)

and

αiH (x; t) → Hi (x).

So, in

this case (H2) is satised.

Hi (x)−t+Gi (x)−t < 0 the opposite holds that is αiG (x; t) → −Hi (x) and αiH (x; t) → −Gi (x). t t ∗ 00 ∗ However, it is to be noted that sequences x with x →ktk→0 x that belong to this case satisfy i ∈ I (x ). G H To sum up, in this case for x ∈ Z then αi (x; t) → Hi (x) = Gi (x) = 0 and αi (x; t) → Gi (x) = Hi (x) = 0. In the case

This proves that (H2) holds in this case too and so this hypothesis holds for this relaxation. This completes the proof that all of the 6 hypothesis are satised and so the relaxation (3) is an UFmethod.

4.3 The Buttery Relaxation The buttery family of relaxations deal with two positive parameters

(t1 , t2 )

dened such that for all

i∈

{1, . . . , q} ΦB i (x; t) := φ(F1i (x; t), F2i (x; t)) ,

(4)

with

F1i (x; t) := Hi (x) − t1 θt2 (Gi (x)), F2i (x; t) := Gi (x) − t1 θt2 (Hi (x)), θt2 : R →] − ∞, 1] are continuously dierentiable non-decreasing concave function with θ(0) = 0, and 0 1 lim θt2 (x) = 1 ∀x ∈ R++ completed in a smooth way for negative values by considering θt2 (z < 0) = zθ t(0)t . 2

where

t2 →0

We assume the following relation between the parameter

t1 = o(t2 )

and

t1 = ω(t22 ).

This method is illustrated on Figure 4.

Proposition 4.3. Proof.

The relaxation scheme (Rt (x)) with

Continuity of the map

Φ

(4)

is an UF-method.

as well as (H3) has been proved in [10].

ψ(z; t) = t1 θt2 (z). t1 = o(t2 ).

(H4) is satised by construction considering satised. The latter being insured by

11

In this case (H5) and (H6) are obviously

Figure 4: Feasible set of the relaxation (4).

Now, we consider (H2). Direct computations give that the gradient of

  F1i (x; t) − t1 θt0 2 (Gi (x))F2i (x; t) ∇Gi (x)     + F (x; t) − t θ0 (H (x))F (x; t) ∇H (x) i 1i i 2i 1 t2 B  ∇x Φi (x; t) = 0  (G (x))F t θ (x; t) − F (x; t) ∇G (x) i 1i 1 2i i  t2    + t1 θt0 2 (Hi (x))F2i (x; t) − F1i (x; t) ∇Hi (x) αiG

Therefore,

and

αiH

for all

i ∈ {1, . . . , q}

is given by

if

F1i (x; t) + F2i (x; t) ≥ 0

if

F1i (x; t) + F2i (x; t) < 0

.

are given by

(

In the case

Φ

F1i (x; t) − t1 θt0 2 (Gi (x))F2i (x; t) t1 θt0 2 (Gi (x))F1i (x; t) − F2i (x; t)

αiH (x; t)

=

αiG (x; t)

( F2i (x; t) − t1 θt0 2 (Hi (x))F1i (x; t) = t1 θt0 2 (Hi (x))F2i (x; t) − F1i (x; t)

F1i (x; t) + F2i (x; t) ≥ 0

it clearly holds that

if

F1i (x; t) + F2i (x; t) ≥ 0

otherwise if

F1i (x; t) + F2i (x; t) ≥ 0

otherwise

αiG (x; t) → Gi (x)

and

,

.

αiH (x; t) → Hi (x).

So, in this

case (H2) is satised.

F1i (x; t) + F2i (x; t) < 0 the opposite holds that is αiG (x; t) → −Hi (x) and αiH (x; t) → −Gi (x). t t ∗ 00 ∗ However, it is to be noted that sequences x with x →ktk→0 x that belongs to this case satisfy i ∈ I (x ). G H Therefore, in this case for x ∈ Z then αi (x; t) → Hi (x) = Gi (x) = 0 and αi (x; t) → Gi (x) = Hi (x) = 0. In the case

This proves that (H2) holds in this case too and so this hypothesis holds for this relaxation. This completes the proof that all of the 6 hypothesis are satised and so the relaxation (4) is an UFmethod.

4.4 An Asymmetric Relaxation Up till now we only consider relaxation methods that are symmetric.

We can dene also asymmetric

relaxation methods illustrated on Figure 5 that respect the hypothesis of our unied framework. Let

IG

and

IH

be two sets of indices such that

IG ∪IH = {1, . . . , q} and IG ∩IH = ∅.

Then, the relaxation

constraint is dened with

( Φi (G(x), H(x); t) =

(Gi (x) − t)Hi (x) Gi (x)(Hi (x) − t)

12

for for

i ∈ IG i ∈ IH

.

(5)

Figure 5: Feasible set of the relaxation (5) with

Proposition 4.4. Proof.

The relaxation scheme (Rt (x)) with

Continuity of the map

Φ(G(x), H(x); t)

(5)

IH = {1}

and

IG = ∅ .

is an UF-method.

as well as (H3) can be easily deduce from the denition of

(5). (H4) is satised by construction considering

ψ(z; t) = t

or

0.

In this case (H5) and (H6) are obviously

satised. Now, we consider (H2). Direct computations give that the gradient of

( ∇x Φi (G(x), H(x); t)(x) = Therefore,

αiG

and

αiH

Φ

for all

∇Gi (x)Hi (x) + ∇Hi (x)(Gi (x) − t) ∇Gi (x)(Hi (x) − t) + ∇Hi (x)Gi (x)

for for

i ∈ {1, . . . , q}

is given by

i ∈ IG i ∈ IH

are given by

(

Hi (x) for i ∈ IG Hi (x) − t for i ∈ IH ( Gi (x) − t for i ∈ IG αiG (x; t) = Gi (x) for i ∈ IH

αiH (x; t)

=

,

.

Clearly in both cases (H2) is satised. This completes the proof that all of the 6 hypothesis are satised and so the relaxation (3) is an UFmethod.

5 Motivations on Epsilon-Solution to the Regularized Subproblems We have seen in the previous sections a general framework to dene relaxations of (MPCC).

From an

algorithmic point of view, the main idea of relaxation methods to solve (MPCC) is to compute a sequence of stationary points or more precisely approximate stationary points for each value of a sequence of parameter

{tk }.

The following denition is a specialization of Denition 2.1 for (Rt (x)). It consists in replacing most

0 in (KKT) by small quantities

.

Denition 5.1.

xk is an epsilon-stationary point for (Rt (x)) with k ≥ 0 if there exists ν k ∈ Rm × Rp × R × R × R such that

∇L1R (xk , ν k ; tk ) ≤ k t ∞ q

q

q

13

and |h(xk )| ≤ k ∀i ∈ {1, . . . , m}, gi (xk ) ≤ k , νig,k ≥ 0, |gi (xk )νig,k | ≤ k ∀i ∈ {1, . . . , p}, Gi (xk ) + t¯k ≥ −k , ν G,k i ≥ 0, ν G,k i (Gi (xk ) + t¯k ) ≤ k ∀i ∈ {1, . . . , q}, Hi (xk ) + t¯k ≥ −k , ν H,k i ≥ 0, ν H,k i (Hi (xk ) + t¯k ) ≤ k ∀i ∈ {1, . . . , q}, Φi (G(xk ), H(xk ); tk ) ≤ k , νiΦ,k ≥ 0, νiΦ,k Φi (G(xk ), H(xk ); tk ) ≤ k ∀i ∈ {1, . . . , q}. Unfortunately, it has been shown in [25] (Theorem 9 and 12) or [10] (Theorem 4.3) for the KDB, L-shape and buttery relaxations that under this denition, sequences of epsilon-stationary points only converge to weak-stationary point without additional hypothesis. Our goal of computing an M-stationary point with a realistic method is far from obvious. Indeed, epsilon-stationary points have two main drawbacks considering our goal. The diculties may come from the approximation of the complementarity condition and the approximate feasibility as shown in Example 5.1 or from the approximation of the feasibility of the relaxed constraint as illustrated in Example 5.2. In those examples, we consider the scheme (2) in order to simplify the presentation, but these observations can be easily generalized to the others methods. Kanzow and Schwartz provide the following example exhibiting convergence to a W-stationary point.

Example 5.1.

min x2 − x1 s.t. 0 ≤ x1 ⊥ x2 ≥ 0.

If we perturb the relation ν Φ Φ(x1 , x2 ; t) ≤  (leaving the other conditions ν Φ ≥ 0, Φ(x1 , x2 ; t) ≤ 0), ν Φ may be positive when the constraint Φ(x1 , x2 ; t) is not active. For the case KDB Φ(x1 , x2 ; t) = (x1 − t)(x2 − t) = −2 with  = t2 , the point x(t, ) = (t−, t+)T ≥ (0, 0)T is epsilonstationary for small enough : Φ(x1 , x2 ; t) ≤  and the choice ν Φ = 1 makes the Lagrangian (−1, 1)T + ν Φ Φ(x1 , x2 ; t) vanish. x(t) converges to the origin when t,  −→ 0 but the origin is only weakly stationary. Now, if the complementarity constraint is relaxed, but the complementarity condition is guaranteed convergence may occur to C-stationary points as shown in the following example.

Example 5.2.

min 21 ((x1 − 1)2 + (x2 − 1)2 ) s.t. 0 ≤ x1 ⊥ x2 ≥ 0.

We specialize the relations in Denition 5.1 as the following, for t and  close to 0.

       

x1 − 1 1 0 x2 − t G H Φ

≤ ,

−ν +ν

x2 − 1 − ν 0 1 x1 − t ∞ 0 ≤ νG, (x1 + t) ≥ 0, ν G (x1 + t) ≤ , 0 ≤ νH , (x2 + t) ≥ 0, ν H (x2 + t) ≤ , 0 ≤ νΦ, (x1 − t)(x2 − t) ≤ , ν Φ [(x1 − t)(x2 − t) − ] ≥ 0. √ √ √ √  % +∞ satisfy the above relations. The points (t + , t + )T together with ν G = ν H = 0 and ν Φ = 1−t−  The limit point when t,  −→ 0 is the origin, which is a C-stationary point with ν G = ν H = −1. On this example, the relaxed regularized complementarity constraint is active for any small enough

t,  > 0;

moreover, the relaxed regularized stationary point is a local maximum for

is a local maximum for the original (MPCC). Another example might help understanding the phenomenon.

14

√ t + 2  < 1.

The origin

Figure 6: Buttery relaxation with a constraint

Example 5.3. We again specialize the relations



1 − x1

1 − x2 0 ≤ νG, 0 ≤ νH , 0 ≤ νΦ,

ΦB (x; t) ≤ .

min − 21 ((x1 − 1)2 + (x2 − 1)2 ) s.t. 0 ≤ x1 ⊥ x2 ≥ 0.

in Denition 5.1 as the following, for t and  close to 0.        1 0 x2 − t

≤ , − νG − νH + νΦ 0 1 x1 − t ∞ (x1 + t) ≥ 0, ν G (x1 + t) ≤ , (x2 + t) ≥ 0, ν H (x2 + t) ≤ , (x1 − t)(x2 − t) ≤ , ν Φ [(x1 − t)(x2 − t) − ] ≥ 0.

√ √ This time, the points (t + , t + )T are no more epsilon-stationary but the points x = (1, −t)T , ν H = 1 + t and x = (−t, 1)T , ν G = 1 + t are. Their limits are (1, 0)T and (0, 1)T which are KKT points for the original MPCC with ν H = 1, ν G = 0 or ν H = 0, ν G = 1. The point (−t, −t)T with ν H = 1 + t, ν G = 1 + t is also stationary, and of course converges to the origin, a local minimizer of the original MPCC. In this example, the limit points are not minimizers for the original MPCC, but satisfy the rst order KKT conditions for a minimizer. The second order conditions fails for those limit points. The two examples show limiting solutions of regularized subproblems which are not local minimizers of the original MPCC. The rst one fails to satisfy a rst order condition while the second one satises such a rst order condition but not the second order one (it is a maximum on the active set). The Figure 6 gives an intuition that explain the weak convergence in Example 5.2 by showing the

-

feasible set of the buttery relaxed complementarity constraint. It can be noticed that this feasible set is very similar to the relaxation from Scheel and Scholtes, [30]. Therefore, it is no surprise that we can not expect more than convergence to a C-stationary point in these conditions.

6 Convergence of Epsilon-Stationary Sequences We now address the convergence of sequences of epsilonstationary points. This motivates the denition of a new kind of epsilon-stationary point called strong epsilon-stationary point, which is more stringent regarding the complementarity constraint.

Denition 6.1. p

R ×R

3q

xk is a strong epsilon-stationary point for (Rt (x)) with k ≥ 0 if there exists ν k ∈ Rm × such that

∇L1R (xk , ν k ; tk ) ≤ k t ∞

15

and |h(xk )| ≤ t¯k + O(k ) ∀i ∈ {1, . . . , m}, gi (xk ) ≤ k , νig,k ≥ 0, |gi (xk )νig,k | ≤ k ∀i ∈ {1, . . . , p}, Gi (xk ) + t¯k ≥ −k , νiG,k ≥ 0, νiG,k (Gi (xk ) + t¯k ) ≤ k ∀i ∈ {1, . . . , q}, Hi (xk ) + t¯k ≥ −k , νiH,k ≥ 0, νiH,k (Hi (xk ) + t¯k ) ≤ k ∀i ∈ {1, . . . , q}, Φi (G(xk ), H(xk ); tk ) ≤ 0, νiΦ,k ≥ 0, νiΦ,k Φi (G(xk ), H(xk ); tk ) = 0 ∀i ∈ {1, . . . , q}. In this case following a similar proof to the one of [25] and [10], we get an improved result that keep the nice properties of the relaxations without strong assumption on the sequence of

{k }.

The following lemma shows that a sequence of strong epsilon-stationary points converges to a weakstationary point. This is not a new result since the same has been proved in [25] and [10] for a sequence of epsilon-stationary points. However, the new denition allows to go further by showing convergence to an M-stationary point.

Lemma 6.1. Given {tk } a sequence of parameter and {k } a sequence of non-negative parameter such that both sequences decrease to zero as k ∈ N goes to innity. Assume that k = o(t¯k ). Let {xk , ν k } be a sequence of strong epsilon-stationary points of (Rt (x)) according to denition 6.1 for all k ∈ N with xk → x∗ . Let {η G,k }, {η H,k } be two sequences such that η G,k := ν G,k − ν Φ,k αH (xk ; tk ), η H,k := ν H,k − ν Φ,k αG (xk ; tk ).

(6)

Assume that the sequence of multipliers {ν h,k , ν g,k , η G,k , η H,k } is bounded. Then, x∗ is an M-stationary point of (MPCC). Proof.

The proof is divided in two parts. We rst show that

x∗

is a weak-stationary point and then we prove

that is an M-stationary point. Let us prove the rst part of the lemma. By denition

{xk , ν k } is a sequence of strong epsilon-stationary

points of (Rt (x)). We make the condition on the Lagrangian



∇L1R (xk , ν k ; tk ) ≤ k t ∞ more explicit. By construction of

Φ(G(x), H(x); t)

this condition becomes



∇f (xk ) + ∇g(xk )T ν g,k + ∇h(xk )T ν h,k − ∇G(xk )T η G,k − ∇H(xk )T η H,k ≤ k . ∞ Besides, the sequence of multipliers

{ν h,k , ν g,k , η G,k , η H,k }

(7)

is assumed bounded. Therefore, it follows that

the sequence converges to some limit point

{ν h,k , ν g,k , η G,k , η H,k } → (ν h , ν g , η G , η H ) . It is to be noted that for

We prove that

k

suciently large it holds supp(ν

g

) ⊂ supp(ν g,k )

supp(η

G

) ⊂ supp(η G,k ) .

supp(η

H

) ⊂ supp(η H,k )

(x∗ , ν h , ν g , η G , η H ) is a weak-stationary point. Obviously, since k ↓ 0 it follows that x∗ ∈ Z , = 0 by (7) and that νig = 0 for i ∈ / Ig (x∗ ). It remains to show that for indices

∇x L1M P CC (x∗ , ν h , ν g , η G , η H )

16

i ∈ I +0 (x∗ ), ηiG = 0. The opposite case for indices i ∈ I 0+ (x∗ ) +0 ∗ So, let i be in I (x ). By denition of strong k -stationarity it holds for all k that

would follow in a completely similar way.

|νiG,k (Gi (xk ) + t¯k )| ≤ k . νiG,k →k→∞ 0

Gi (xk ) → Gi (x∗ ) > 0. Φ,k Without loss of generality we may assume that for k suciently large νi 6= 0 otherwise ηiG = 0 and the Φ,k proof is complete. By strong -stationarity νi 6= 0 implies that FH,i (xk ; tk ) = 0 by (H4). (H2) yields H k ∗ G Φ,k αi (x ; tk ) → Hi (x ) and so ηi = 0 unless ν diverges as k grows. We now prove that the latter case leads

Therefore,

since

k ↓ 0

and

to a contradiction. Assume that constant

C

ν Φ,k → ∞,

boundedness hypothesis on

ηiG

gives that there exists a nite non-vanishing

such that

νiΦ,k αiH (xk ; tk ) → C . ηiH is nite and νiΦ,k αiG (xk ; tk ) → ∞ as Gi (xk ) > 0 then necessarily νiH,k → ∞. Furthermore, H,k k k noticing that FH (x ; tk ) = 0 gives Hi (x ) ≥ 0, leads to a contradiction with νi → ∞ since by -stationarity we get H,k H,k H,k νi (Hi (xk ) + t¯k ) = νi Hi (xk ) + νi t¯k ≤ k Moreover, since

and

k = o(t¯k ).

We can conclude that for

i ∈ I +0 (x∗ ), ηiG = 0

and therefore

x∗

is a weak-stationary point.

x∗ is even stronger that weak-stationary point since it is an M-stationary point. 00 ∗ G H G H We now consider indices i ∈ I (x ). Our aim here is to prove that either ηi > 0, ηi > 0 or ηi ηi = 0. It Φ,k is clear that if νi = 0, then ηiG and ηiH are non-negative values and the result holds true. So, without loss Φ,k of generality we may assume that νi ≥ 0 and then Φi (G(xk ), H(xk ); tk ) = 0 by Denition 6.1. k k k k By construction of Φi (G(x ), H(x ); tk ) given in hypothesis (H4) it follows that Φi (G(x ), H(x ); tk ) = k k 0 ⇐⇒ FG,i (x ; tk ) = 0 or FH,i (x ; tk ) = 0 where we remind that Now, let us prove that

FG,i (xk ; tk ) = Gi (xk ) − ψ(Hi (xk ); tk ), FH,i (xk ; tk ) = Hi (xk ) − ψ(Gi (xk ); tk ), FG,i (xk ; tk ) = 0 since the other case is completely similar. k Furthermore by construction of Φi (G(x ), H(x ); tk ) it holds that Gi (x ) is non-negative in this case. Considering one of the complementarity condition of the strong -stationarity gives Without loss of generality we assume that

k

k

k ≥ |νiG,k (Gi (xk ) + t¯k )| = |νiG,k Gi (xk )| + |νiG,k t¯k | , since

Gi (xk )

is non-negative and it follows that

|νiG,k t¯k | ≤ k . Necessarily Now at

νiG,k →k→∞ 0 as we assume in our statement that k = o(t¯k ). xk we can use Lemma 3.1 that for FGi (xk ; tk ) = 0 gives ∂ψ(x; tk ) G k αi (x ; tk ) = − FH i (xk ; tk ), ∂x x=Hi (xk ) αiH (xk ; tk ) = FH i (xk ; tk ). k FH i (x ; tk ) = 0 we are ∂ψ(x;tk ) →k→∞ 0. ∂x k

Obviously, if holds that

x=Hi (x )

done and so assume that

FH i (xk ; tk ) 6= 0.

By hypothesis (H6), it Φ,k G k Therefore, αi (x ; tk )νi going to a non-zero limit would imply that

17

αiH (xk ; tk )νiΦ,k goes to innity. However, this Φ,k G k converges to zero. necessarily αi (x ; tk )νi

is a contradiction with

FH i (xk ; tk ). FH i (xk ; tk ) ≥ 0

Finally, we examine two cases regarding the sign of non-negative, which satisfy the desired condition. For argument than for

νiG,k .

Thus, it follows that

This concludes the proof that stationarity of

x



x∗

we proved for every

ηiG

being nite. We can conclude that

For we

FH i (xk ; tk ) ≤ 0, H,k get νi →k→∞ 0

we get

ηiG , ηiH

using the same

ηiH = 0.

is an M-stationary point, since additionally to the proof of weak-

i ∈ I 00 (x∗ )

that either

ηiH > 0, ηiG > 0

or

ηiH ηiG = 0.

The following theorem is a direct consequence of both previous lemmas and is our main statement.

Theorem 6.1. Given {tk } a sequence of parameter and {k } a sequence of non-negative parameter such that both sequences decrease to zero as k ∈ N goes to innity. Assume that k = o(t¯k ). Let {xk , ν k } be a sequence of epsilon-stationary points of (Rt (x)) according to denition 6.1 for all k ∈ N with xk → x∗ such that MPCC-CRSC holds at x∗ . Then, x∗ is an M-stationary point of (MPCC). Proof.

The proof is direct by Lemma 6.1 and Corollary 2.3 of [10] that ensures boundedness of the sequence

(6) under MPCC-CRSC. Theorem (6.1) attains the ultimate goal, however it is not a trivial task to compute such a sequence of epsilon-stationary points. This is discussed later. Another important question is the existence of strong epsilon-stationary points in the neighbourhood of an M-stationary point.

This problem is tackled in the

following sections.

7 On Lagrange Multipliers of the Regularization The following example develops on Example 2.1 due to Kanzow and Schwartz exhibits a situation where the regularized subproblems have no KKT point.

Example 7.1.

The KDB regularized problem is min

x1 + x2 − x3

s.t.

−4x1 + x3 ≤ 0 −4x2 + x3 ≤ 0 x1 ≥ −t x2 ≥ −t (x1 − t)(x2 − t) ≤ 0.

x∈R3

(8)

The point (t, t, 4t)t is feasible so that the minimum value of this program is ≤ −2t. Moreover, whenever x1 > t, we must have x2 0, consider the point x = 0, (sG , sH ) = (t, t + δ) and Lagrange multiplier (ν sG , ν sH , ν G , ν H , ν Φ ) = (−1, 0, 0, 0, 1δ ). • Condition on the gradient of the Lagrangian |∇x L1 (x, s, ν)| = | − 1 − ν sG | = 0 |∇sG L1 (x, s, ν)| = |ν sG − ν G + ν Φ (sH − t)| = 0 |∇sH L1 (x, s, ν)| = |ν sH − ν H + ν Φ (sG − t)| = 0 • Condition on the feasibility |x − sG | = t ≤  |0 − sH | = t + δ ≤  sG + t = 2t ≥ −, sH + t = 2t + δ ≥ − (sG − t)(sH − t) = 0 • Condition on the complementarity |(sG + t)ν G | = 0, |(sH + t)ν H | = 0 |(sG − t)(sH − t)ν Φ | = 0

This completes the proof that there is a strong epsilon-stationary point for the formulation with slack variables. This example motivates the use of slack variables to dene the (MPCC) and in this case study the existence of strong epsilon-stationary point in a neighbourhood of an M-stationary point.

19

8 The MPCC with Slack Variables Consider the following parametric non-linear program

min

(x,s)∈Rn ×R2q s.t.

Rt (x, s)

parametrized by

t:

f (x) g(x) ≤ 0, h(x) = 0,

s

(Rt (x, s))

sG = G(x), sH = H(x), sG ≥ −et¯, sH ≥ −et¯, Φ(sG , sH ; t) ≤ 0, limktk→0 t¯ = 0+ and the relaxation map Φ(sG , sH ; t) : Rq × Rq → Rq H(x) by sG and sH in the map Φ(G(x), H(x); t). s The generalized Lagrangian function of (Rt (x, s)) is dened as

with

is dened by replacing

G(x)

and

Lrs (x, s, ν; t) :=rf (x) + g(x)T ν g + h(x)T ν h − (G(x) − sG )T ν sG − (H(x) − sH )T ν sH − sG T ν G − sH T ν H + Φ(sG , sH ; t)T ν Φ . Let

Fs

s

be the feasible set of (Rt (x, s)).

The following result is a direct corollary of Theorem 6.1 stating that the reformulation with slack variables does not alter the convergence result.

Corollary 8.1. Given {tk } a sequence of parameter and {k } a sequence of non-negative parameter such that both sequences decrease to zero as k ∈ N goes to innity. Assume that k = o(t¯k ). Let {xk , ν k } be a sequence of strong epsilon-stationary points of (Rts (x, s)) for all k ∈ N with xk → x∗ such that MPCC-CRSC holds at x∗ . Then, x∗ is an M-stationary point of (MPCC). Proof.

˜ h(x) := (h(x), sG −G(x), sH −H(x)) and x ˜ := (x, sG , sH ). s It is clear that the non-linear program (Rt (x, s)) fall under the formulation (Rt (x)). Therefore, we can apply 6.1 to conclude this proof. Let

˜ h(x) : Rn → Rm ×Rq ×Rq

be such that

s

The following lemma giving an explicit form of the gradient of the Lagrangian function of (Rt (x, s)) can be deduced through direct computations.

Lemma 8.1.

The gradient of Lrs (x, s, ν; t) is given by

∇x Lrs (x, s, ν; t) = r∇f (x) + ∇g(x)T ν g + ∇h(x)T ν h − ∇G(x)T ν sG − ∇H(x)T ν sH , ∇sG Lrs (x, s, ν; t) = ν sG − ν G + ∇sG Φ(sG , sH ; t)T ν Φ , ∇sH Lrs (x, s, ν; t) = ν sH − ν H + ∇sH Φ(sG , sH ; t)T ν Φ .

(10)

(11)

There is two direct consequences of this result. First, it is easy to see from this lemma that computing

Lrs (x, s, ν; t) is equivalent to computing a stationary point of Lr (x, ν; t). Secondly, a r stationary point of Ls (x, s, ν; t) with r = 1 satises one of the condition of weak-stationary point of (MPCC) 1 that is ∇LM P CC (x, ν) = 0.

a stationary point of

In the next section, we now consider the existence of strong epsilon-stationary point for the relaxation

s

with slack variables (Rt (x, s)).

9 Existence of Strong Epsilon-Stationary Points for the Regularization with Slack Variables Before stating our main result we give a serie of additional hypothesis on the relaxation and the function

ψ.

It is essential to note once again, that all these hypothesis are not restrictive, since they are satised by the existing methods in the literature.

20

9.1 Assumptions The assumptions made in this section are divided in two parts. The rst part concerns assumptions on the domain of the relaxation to be precised in the following subsection. The second part is assumptions on the relaxation function

ψ

that as been used to dened the relaxation map on the boundary of the feasible set in

Section 3.

9.1.1 Assumptions on the Relaxations We denote

Bc ((−t¯1l, −t¯1l)T )

the ball of radius

c

and centered in

(−t¯1l, −t¯1l)T .

We assume that the schemes

considered in the sequel satisfy belong to one of two following cases for positive constants

c, 

and

t¯ :

Case 1

Bc ((−t¯1l, −t¯1l)T ) ∩ {(sG , sH )T | sG ≥ −t¯1l, sH ≥ −t¯1l} ⊂ Fs

(F1)

Case 2

 Bc ((−t¯1l, ψ(−t¯1l; t))T ) ∪ Bc ((ψ(−t¯1l; t), −t¯)T ) ∩{(sG , sH )T | sG ≥ −t¯1l, sH ≥ −t¯1l, Φ(sG , sH ; t) = 0} ⊂ Fs (F2) The rst case includes the buttery relaxation and the KS relaxation, while the second case includes the approximation KDB.

9.1.2 Assumptions on the Relaxation Function For all

t ∈ Rl++ ,

we make the following supplementary assumptions on the function

remind here that the functions

ψ

are separable with respect to



ψ

for all

x ∈ Rq .

We

x.

∂ψ(x; t) >0. ∂t

(A1)

∂ψ(x; t) ≥0. ∂x

(A2)

ψ(ψ(ktk∞ ; t); t) ≤ ktk∞ .

(A3)

ψ(−ktk∞ ; t) ≤ ktk∞

(A4)



• •

Hypothesis (A1) in particular implies some monotonicity on the feasible set of the relaxed problems. Assumption (A4) is used for the second kind of relaxations only. It is to be noted here that the assumptions (A1),(A2),(A3) and (A4) are not the weakest for obtaining the following results. However, those assumptions are satised by all the relaxations dened in the literature.

Lemma 9.1.

Assume that (A1), (A2) and (A3) hold true. Then, giving constants c > 0, t¯ > 0 and  > 0 the following holds true for all ktk∞ ∈ (0, t¯ + c) t¯ + c − ψ(ψ(t¯ + c; t); t) > 0 .

Proof.

Using (A1), (A2) and that

ktk∞ ∈ (0, t¯ + c)

yields

t¯ + c − ψ(ψ(t¯ + c; t); t) > t¯ + c − ψ(ψ(t¯ + c; e(t¯ + c)); e(t¯ + c)) ≥ 0 . The conclusion comes from assumption (A3).

21

Lemma 9.2.

Assume that (A1) and (A3) holds true. Then, giving constants c > 0, t¯ > 0 and  > 0 the following holds true for all ktk∞ ∈ (0, t¯ + c] ψ(t¯ + c; t) ≤ t¯ + c .

Proof.

Using assumption (A1) and then (A3) gives

ψ(t¯ + c; t) < ψ(t¯ + c; e(t¯ + c)) ≤ t¯ + c , which concludes the proof.

Lemma 9.3. that

Given positive constants t¯, c, , K . There exists a t∗ > 0 such that for all t ∈ (0, t∗ ] it holds ∂ψ(x; t) ≤ K ∂x x=t¯+c

and

∂ψ(x; t) ≤ K . ∂x x=c

0 ≤ t¯ −

Proof.

The proof is clear from Assumption (H6) on the relaxation.

Lemma 9.4.

Assume that (A1) and (A4) holds true. Then, giving constants c > 0, t¯ > 0 and  > 0 with t¯ > c the following holds true for all ktk∞ ∈ (0, t¯ − c] ψ(t¯ − c; t) ≤ t¯ + c .

Proof.

Using assumption (A1) and then (A4) gives

ψ(t¯ − c; t) < ψ(t¯ − c; e(t¯ − c)) ≤ t¯ − c ≤ t¯ + c , which concludes the proof.

9.2 Main Theorem on Existence of Lagrange Multiplier All of the supplementary assumptions made above are now used to derive the following result.

Theorem 9.1.

Let x∗ ∈ Z be an M-stationary point and  > 0 be arbitrarily small. Furthermore, assume that the hypothesis (A1),(A2),(A3),(A4) on ψ and the hypothesis (F1) or (F2) on the relaxation introduced above hold true. Then, there exists positive constants c, t∗ and t¯∗ with t¯∗ > c and a neighbourhood U (x∗ ) of (x∗ , 0)T such that for all t ∈ (0, t∗ ) and t¯ ∈ (0, t¯∗ ) there exists (x, s)T ∈ U (x∗ ), which is strong epsilonstationary point of the relaxation (Rts (x, s)). Regarding the value of

t∗

we need at least that

ktk∞ ≤ t¯ − c.

The constant

c

is given in the proof and

depends on the multipliers of the M-stationary point.

Proof.

The proof is conducted in two steps.

First, we construct a point based on the solution that is a

candidate to be a strong epsilon-stationary point. Then, we verify that this candidate is actually a strong epsilon-stationary point.

x∗

is assumed to be an M-stationary point. Therefore, there exists

λ = (λg , λh , λG , λH )

∇L1M P CC (x∗ , λ) = 0, H min(λg , −gi (x∗ )) = 0, λG I +0 (x∗ ) = 0, λI 0+ (x∗ ) = 0, either Let

c

H λG i > 0, λi > 0

either

H λG i λi = 0

for

i ∈ I 00 (x∗ ).

be the positive constant bounding the value of the Lagrange multipliers so that

c :=

max

i∈supp(λG ),j∈supp(λH )

22

1 1 + H . |λG | |λ i j |

such that

Construction of the point

(ˆ x, sˆ, νˆ)

Let us construct a point

(ˆ x, sˆ, νˆ)

that satisfy the strong epsilon-

stationary conditions (6.1).

x ˆ := x∗ , νˆg := λg , νˆh := λh , νˆsG := λG , νˆsH := λH . We now split into two cases

(A) and (B) corresponding to the two dierent kind of relaxations.

Denote

the following set of indices

00 H I−0 (x∗ , λ) := {i = 1, . . . , q | λG i < 0, λi = 0}, 00 H I0− (x∗ , λ) := {i = 1, . . . , q | λG i = 0, λi < 0}, 00 00 (x∗ , λ) ∪ I0− (x∗ , λ)), Iν G := supp(ˆ ν sG ) \ (I−0 00 00 Iν H := supp(ˆ ν sH ) \ (I−0 (x∗ , λ) ∪ I0− (x∗ , λ)),

Iν H \ Iν G

A) Consider the Case 1, we choose sˆG , sˆH , νˆG , νˆH

and

νˆΦ

such that :

 00 ψ(t¯ + c; t), i ∈ I−0 (x∗ , λ)    t¯ + c, i ∈ I 00 (x∗ , λ) 0− sˆG := −t¯νˆG  , i ∈ I G  νG   νˆ ∈ F otherwise  00 t¯ + c, i ∈ I0− (x∗ , λ)    ψ(t¯ + c; t), i ∈ I 00 (x∗ , λ) −0 sˆH := −t¯νˆH  H , i ∈ I H  ν   νˆ ∈ F otherwise

,

,

( 00 00 νˆisG for i ∈ I 0+ (x∗ ) ∪ I 00 (x∗ )\(I−0 (x∗ , λ) ∪ I0− (x∗ , λ)) νˆ := 0 otherwise ( 00 00 νˆisH for i ∈ I +0 (x∗ ) ∪ I 00 (x∗ )\(I−0 (x∗ , λ) ∪ I0− (x∗ , λ)) νˆH := 0 otherwise G

and nally

νˆΦ :=

 s −ˆ νi G   (s;t)  αH i   

(ˆ x, sˆ, νˆ)

satisfy the stationarity :

s −ˆ νi H G αi (s;t)

0

for

00 i ∈ I−0 (x∗ , λ)

for

00 i ∈ I0− (x∗ , λ)

,

.

.

otherwise

Finally, we verify that in both cases we satisfy the strong epsilon-

stationary conditions, that is



∇L1s (ˆ x, sˆ, νˆ; t) ∞ ≤ , and

|hi (ˆ x)| ≤ t¯ + c ∀i ∈ {1, . . . , m}, gi (ˆ x) ≤ , νˆig ≥ 0, |gi (ˆ x)ˆ νig | ≤  ∀i ∈ {1, . . . , p}, |Gi (ˆ x) − sˆG,i | ≤ t¯ + c ∀i ∈ {1, . . . , q}, |Hi (ˆ x) − sˆH,i | ≤ t¯ + c ∀i ∈ {1, . . . , q}, sˆG,i + t¯ ≥ −, νˆiG ≥ 0, νˆiG (ˆ sG,i + t¯) ≤  ∀i ∈ {1, . . . , q}, sˆH,i + t¯ ≥ −, νˆiH ≥ 0, νˆiH (ˆ sH,i + t¯) ≤  ∀i ∈ {1, . . . , q}, Φi (ˆ sG , sˆH ; t) ≤ 0, νˆiΦ ≥ 0, νˆiΦ Φi (ˆ sG , sˆH ; t) ≤ 0 ∀i ∈ {1, . . . , q}. 23

We split the rest of the proof of I.

II. III. IV. V. VI.

A) in 6 parts :



∇x L1s (ˆ x)ˆ νig | ≤  ∀i ∈ {1, . . . , p} x, sˆ, νˆ; t) ∞ ≤ , gi (ˆ x) ≤ , νˆig ≥ 0, |gi (ˆ {1, . . . , m};

∇s L1s (ˆ x, sˆ, νˆ; t) ∞ ≤ ;

and

|hi (ˆ x)| ≤ t¯ + c ∀i ∈

|Gi (ˆ x) − sˆG,i | ≤ t¯ + c, |Hi (ˆ x) − sˆH,i | ≤ t¯ + c ∀i ∈ {1, . . . , q}; sH,i + t¯) ≤  ∀i ∈ {1, . . . , q}; sG,i + t¯) ≤ , sˆH,i + t¯ ≥ −, νˆiH (ˆ sˆG,i + t¯ ≥ −, νˆiG (ˆ Φi (ˆ sG , sˆH ; t) ≤ 0, , νˆiΦ Φi (ˆ sG , sˆH ; t) ≤ 0 ∀i ∈ {1, . . . , q}; νˆiG ≥ 0, νˆiH ≥ 0, νˆiΦ ≥ 0.

Let us now run through these 6 conditions.

I. Since, xˆ = x∗

(ˆ ν g , νˆh , ν sG , ν sH ) = (λg , λh , λG , λH ) it holds

∇x L1s (ˆ x, sˆ, νˆ; t) ∞ = 0

and

that

and

|hi (ˆ x)| = 0 ∀i ∈ {1, . . . , m}, gi (ˆ x) ≤ 0, νˆig ≥ 0, |gi (ˆ x)ˆ νig | ≤ 0 ∀i ∈ {1, . . . , p}.

II. The gradient of the Lagrangian with respect to s is given by ∇sG L1s (ˆ x, sˆ, νˆ; t) = νˆsG − νˆG + νˆΦ αH (ˆ s; t), ∇sH L1s (ˆ x, sˆ, νˆ; t) = νˆsH − νˆH + νˆΦ αG (ˆ s; t). 00 I−0 (x∗ , λ) (the case s −ˆ νi G . Therefore αH s;t) i (ˆ

In the case

νˆiΦ

=

00 I−0 (x∗ , λ)

is similar by symmetry) it is true that

νˆiG = νˆiH = 0

and

∇sG,i L1s (ˆ x, sˆ, νˆ; t) = 0, ∇sH,i L1s (ˆ x, sˆ, νˆ; t) since for

00 i ∈ I−0 (x∗ , λ)

−ˆ νisG αiG (ˆ s; t) sG ∂ψ(x; t) = νˆi , = H ∂x x=t¯+c αi (ˆ s; t)

by construction of

sˆG , sˆH

it holds that

FGi = 0

and

sˆH,i = t¯ + c.

The conclusion

follows by Lemma 9.3, which gives

∂ψ(x; t) ≤. ∂x x=t¯+c 0+ ∗ 00 00 Now, in the cases I (x ) ∪ I 00 (x∗ ) \ (I−0 (x∗ , λ) ∪ I0− (x∗ , λ)) and I +0 (x∗ ) ∪ I 00 (x∗ ) ∗ 00 x, sˆ, νˆ; t) = 0. I0− (x , λ)) the construction of the multipliers gives directly that ∇(sG ,sH ) L1s (ˆ This concludes the proof of II.

00 \ (I−0 (x∗ , λ) ∪

III. xˆ feasible for the MPCC yields to

|Gi (ˆ x) − sˆG,i | = |ˆ sG,i | |Gi (ˆ x) − sˆG,i | = 0

and

|Gi (ˆ x) − sˆG,i | = |ˆ sG,i |

|Hi (ˆ x) − sˆH,i | = 0,

for

i ∈ I 0+ (x∗ )

|Hi (ˆ x) − sˆH,i | = |ˆ sH,i |,

for

i ∈ I +0 (x∗ ).

and

and

|Hi (ˆ x) − sˆH,i | = |ˆ sH,i |,

By symmetry it is sucient to consider the variables



Let

00 i ∈ I0− (x∗ , λ),

then

sG .

sˆG,i = t¯ + c; 24

for

i ∈ I 00 (x∗ ).

We analyse the cases where

00 00 i ∈ I0− (x∗ , λ), I−0 (x∗ , λ)

and

Iν G .



Let

00 i ∈ I−0 (x∗ , λ),



Let

i ∈ Iν G ,

then

sˆG,i = ψ(t¯ + c; t) ≤ t¯ + c by = νˆG − t¯ ≤ νˆG + t¯ ≤ t¯ + c.

then

sˆG,i

In every cases it holds that

i

i

|ˆ sG,i | ≤ t¯ + c and s ˆH,i are

IV. By construction sˆG,i

such that

Lemma 9.2;

and so this III is veried. both non-negative as

ψ(.; t)

is assumed non-negative for indices

i

Φi (ˆ sG , sˆH ; t) = 0.

It remains to verify the condition in the case where

i ∈ Iν G

and

i ∈ Iν H .

However, in both cases it holds

that

 − t¯νˆiG ¯ +t= νˆiG  − t¯νˆiH ¯ ∀i ∈ Iν H , sˆH,i + t¯ = +t= νˆiH ∀i ∈ Iν G , sˆG,i + t¯ =

 > 0 ≥ −, νˆiG  > 0 ≥ −. νˆiH

So, the feasibility in condition IV is satised. Now, regarding the complementarity condition it holds that

   − t¯νˆiG G ¯ ¯ sG,i + t)ˆ ∀i ∈ Iν G , |(ˆ νi | = + t νˆiG =  G νˆ  i H   − t¯νˆi H ¯ ¯ ∀i ∈ Iν H , |(ˆ sH,i + t)ˆ νi | = + t νˆiH = . H νˆi This proves that the complementarity condition holds true for the relaxed positivity constraints and so condition IV is veried.

V. The feasibility Φi (ˆsG , sˆH ; t) ≤ 0 and the complementarity condition νˆiΦ Φi (ˆsG , sˆH ; t) ≤ 0 are satised



by construction and by hypothesis on the relaxation.

VI. The multiplier νˆΦ

00 i ∈ I−0 (x∗ , λ) it holds that ∂ψ(x; t) H FGi (ˆ s; t) = FH i (ˆ s; t) > 0 , αi (ˆ s; t) = FH i (ˆ s; t) − ∂x x=sG,i

since

FGi (ˆ s; t) = 0

is non-negative since for

by construction of

sˆG,i

and

FH (t¯ + c; t) > 0

by Lemma 9.1. The case

(12)

00 i ∈ I0− (x∗ , λ)

follows by symmetry. The others multipliers are obviously non-negative by construction. This concludes the case VI. The verication of all 6 cases proves that the point constructed above is strong epsilon-stationary, which concludes the proof of the relaxations

B) Consider the Case 2.

Let

(A).

sˆG , sˆH , νˆG , νˆH and νˆΦ be such that :  00 ∗ ¯  ψ(t + c; t), i ∈ I−0 (x , λ)   00 ∗  t¯ + c, i ∈ I0− (x , λ)  −t¯ν ˆG , i∈ Iν G ∩ Iν H sˆG := ν ˆG   −t¯ν ˆH  ψ ; t , i ∈ Iν H \ Iν G  ν ˆH    ∈ F otherwise  00 t¯ + c, i ∈ I0− (x∗ , λ)     00 ¯  i∈ I−0 (x∗ , λ)  ψ(t + c), −t¯ν ˆG ; t , i ∈ Iν G ∩ Iν H sˆH := ψ ν ˆG   H ¯  −tνˆ , i ∈ I  2 H    νˆ ∈ F otherwise 25

,

(13)

,

(14)

(

G

Φ

νˆ :=

νˆisG for i ∈ Iν G ∩ Iν H 0 otherwise ( νˆisH for i ∈ Iν H \ Iν G := 0 otherwise

νˆ :=

,

(15)

νˆH

,

(16)

 s −ˆ νi G   H (ˆ α  i s;t)   

s −ˆ νi H s;t) αG i (ˆ

0

for

00 i ∈ I−0 (x∗ , λ)

for

00 i ∈ I0− (x∗ , λ) ∪ Iν G ∩ Iν H

.

(17)

otherwise

Once again we run through the 6 conditions. It is to be noted that the variables involved in I. have not been changed so this condition stands true.

II. As pointed out earlier, the gradient of the Lagrangian with respect to s is given by ∇sG L1s (ˆ x, sˆ, νˆ; t) = νˆsG − νˆG + νˆΦ αH (ˆ s; t), ∇sH L1s (ˆ x, sˆ, νˆ; t) = νˆsH − νˆH + νˆΦ αG (ˆ s; t).

For indices

Iν H \ I ν G .

i

in

For

with respect to

00 I−0 (x∗ , λ)

00 I−0 (x∗ , λ)

and

Iν H , then νˆiG

i ∈ Iν G ∩ s becomes

>

(A). Let us consider indices i in Iν

we refer to case

0, νˆiH

=0

and ν ˆiΦ

=

G ∩ Iν H and s −ˆ νi H and so the gradient of the Lagrangian αG s;t) i (ˆ

∇sG,i L1s (ˆ x, sˆ, νˆ; t) = νˆisG − νˆiG + νˆiΦ αiH (ˆ s; t) =

−ˆ νisH αiH (ˆ s; t) , G αi (ˆ s; t)

∇sH,i L1s (ˆ x, sˆ, νˆ; t) = νˆisH − νˆiH + νˆiΦ αiG (ˆ s; t) = 0. By construction of

sˆG,i

and

∇sG L1s (ˆ x, sˆ, νˆ; t) for some

t ∈ (0, t∗ )

sˆH,i ,

it holds that

FH i (ˆ s; t) = 0

−ˆ νisH αiH (ˆ s; t) = νˆisH = G αi (ˆ s; t)

and so

∂ψ(x; t) ∂x

=

νˆisH

x=sG,i

∂ψ(x; t) ∂x x=

≤,  s ν ˆ G i

−t¯

according to Lemma 9.3.

Now, for indices

i ∈ Iν H \ I ν G

νˆisG = 0

it holds that

gradient of the Lagrangian with respect to

s

so by the choice of multipliers

νG, νH

and

νΦ

the

vanishes.

This allows to conclude that the condition II holds true.

III. Since xˆ is feasible for the MPCC therefore |Gi (ˆ x) − sˆG,i | = |ˆ sG,i | |Gi (ˆ x) − sˆG,i | = 0

and

|Gi (ˆ x) − sˆG,i | = |ˆ sG,i | Let us consider the case where

|Hi (ˆ x) − sˆH,i | = 0,

for

i ∈ I 0+ (x∗ )

|Hi (ˆ x) − sˆH,i | = |ˆ sH,i |,

for

i ∈ I +0 (x∗ )

and

and

i ∈ Iν G ∩ Iν H

cases have been checked in case

|Hi (ˆ x) − sˆH,i | = |ˆ sH,i |,

noticing that

(A) of this proof.

for

i ∈ Iν H \ Iν G

i ∈ I 00 (x∗ ).

is similar by symmetry. The others

It follows that

 − t¯νˆG  ≤ + t¯ ≤ t¯ + c, |ˆ sG,i | = νˆG νˆiG    − t¯νˆiG . |ˆ sH,i | = ψ ; t G νˆi The second condition is ensured by Lemma 9.4.

IV. and V. A.

These conditions are straightforward and follows the same path that condition IV. in the

case

26

VI. The proof for indices is very similar to the condition VI in case (A) except in the case where i is in Iν G ∩ I ν H .

In this case

νˆiΦ = since by construction of

sˆG

and

sˆH , FH i (ˆ s; t) = 0

νˆiΦ = for

c ≤ t¯ since ψ

−ˆ νisH −ˆ νisH = , FGi (ˆ s; t) s; t) αiG (ˆ here. Now, by denition of

FG (ˆ s; t)

it follows

−ˆ νisH −ˆ νisH ≥0, = sH,i ; t) sˆG,i − ψ(ˆ sH,i ; t) /ˆ νisG − t¯ − ψ(ˆ

is non-negative whenever

sG

and

sH

are chosen such that

FH i (s; t) = 0 (Assumption (H5)).

The verication of all 6 cases proves that the point constructed above is strong epsilon-stationary, which concludes the proof of the relaxations

(B) and complete the whole proof.

It is to be noted that no constraint qualication is required for this result. This is a clear improvement over what was obtained in the literature in the ideal case of sequence of stationary points.

For instance,

Theorem 5.1 in [22] requires some second-order information to get a result on existence of stationary points for the regularized sub-problems.

Theorem 9.2. For any M-stationary point of (MPCC) that satises MPCC-CRSC, there exists a sequence of strong epsilon-stationary points of the relaxation (Rts (x, s)) that converges to that point. Proof.

Theorem 9.1 gives more relations between the parameters that are compatible with Corollary 8.1.

Indeed for a chosen sequence of arbitrarily small parameters Theorem 9.1 requires that

t¯k > c

and

tk

{k },

Corollary 8.1 requires that

k = o(t¯k ) and t¯k − ck .

must be suciently small, in particular smaller than

Thus, a straightforward application of both of these results provides the result. Previous section point out that such result can not be obtained without a formulation with slack variables.

10 How to Compute Strong Epsilon-Stationary Points The previous section introduces the new concept of strong epsilon-stationary point considering the relaxed sub-problems. In this section, we answer to the non-trivial question of how to compute such an approximate stationary point. We present here a generalization of the penalization with active set scheme proposed in [23] and illustrate the fact that it has the desired property.

10.1 A Penalization Formulation The following minimization problem aims at nding

(x, s) ∈ Rn × Rq × Rq

so that

1 φ(x, s) 2ρ , ≥ −t¯1l, Φ(sG , sH ; t) ≤ 0

min Ψρ (x, s) := f (x) + x,s

s.t. where

φ

sG ≥ −t¯1l, sH

(P (x, s))

is the penalty function

φ(x, s) := k max(g(x), 0), h(x), G(x) − sG , H(x) − sH k2 . An adaptation of Theorem 6.1 gives the following result that validate the penalization approach.

Theorem 10.1.

Given a decreasing sequence {ρk } of positive parameter and {k } a sequence of non-negative parameter that decrease to zero as k ∈ N goes to innity. Assume that k = o(t¯k ). Let {xk , ν k } be a sequence of strong epsilon-stationary points of (P (x, s)) according to denition 6.1 for all k ∈ N with xk → x∗ such that MPCC-CRSC holds at x∗ . If x∗ is feasible, then it is an M-stationary point of (MPCC). 27

Proof.

Assuming that

x∗

is feasible for (MPCC), the result is a straightforward adaptation of Theorem

6.1. Unfortunately, the strong assumption on the previous theorem that

x∗

must be feasible is hard to avoid.

Indeed, it is a classical pitfall of penalization methods in optimization to possibly compute a limit point that minimizes the linear combination of the constraints. In other words, we compute a point

x∗

infeasible that

satises

p q q X X X hi (x∗ )∇hi (x∗ )+ max(−gi (x∗ ), 0)∇gi (x∗ )− max(Gi (x∗ ), 0)∇Gi (x∗ )− max(Hi (x∗ ), 0)∇Hi (x∗ ) = 0 .

m X i=1

i=1

i=1

i=1

This phenomenon has been well-known in non-linear programming methods for instance with lter methods. Such a point is sometimes called infeasible stationary point. It is interesting to note that the way the penalty parameter

ρ

behave may provide some informations

on the stationarity of the limit point. Indeed, if we nd a stationary point of the initial problem without driving

ρ to innity, then we get an S-stationary point.

This observation was introduced in [6] in the context

of elastic interior-point for (MPCC) and then adapted to the penalization technique from [23].

Theorem 10.2.

(MPCC),

Let (x, s) be a strong epsilon-stationary point of (P (x, s)) with ρ > 0. If x is feasible for then x is an S-stationary point of (MPCC).

This fact was already observed in Theorem 2 of [23] in a slightly weaker but similar framework. We do not repeat the proof, but gives an interpretation of this result. It has been made clear in the proof of the convergence theorem, Theorem 6.1, that the case where is an M-stationary point only occur if the sequence of multipliers



Φ,k

}

diverges.

x∗

Therefore, it is to be

expected that the penalty parameter must be driven to its limit to observe such phenomenon.

10.2 Active Set Method for the Penalized Problem We discuss here an active set method to solve the penalized problem (P (x, s)). This method is an extension of the method proposed in [23] to the general class of methods presented in previous sections.

s

The set of points that satises the constraints of (Rt (x, s)) is denoted by

Ft,t¯ and

let

βt,t¯(x, s)

denotes

the measure of feasibility

βt,t¯(x, s) :=k max(g(x), 0)k2 + kh(x)k2 + kG(x) − sG k2 + kH(x) − sH k2 + k max(−sG + t¯1l, 0)k2 + k max(−sH + t¯1l, 0)k2 + k max(−Φ(sG , sH ; t), 0)k2 . Let

W(s; t, t¯)

be the set of active constraints among the constraints

sG ≥ −t¯1l, sH ≥ −t¯1l, Φ(sG , sH ; t) ≤ 0 , Ft,Rt¯ denotes the set of points that satises those constraints. some i ∈ {1, . . . , q} the relaxed constraint is active since

and

Φi (sG , sH ; t) = 0 ⇐⇒ sH,i = ψ(sG,i ; t)

or

Remark 10.1.

(18)

We can be even more specic when for

sG,i = ψ(sH,i ; t) .

It is essential to note here that active constraints act almost like bound constraints since an active constraint means that for some i ∈ {1, . . . , q} one (possibly both) of the two case holds sG,i = − t¯ or ψ(sH,i ; t) sH,i

or . ¯ = − t or ψ(sG,i ; t) 28

Considering the relaxation from Kanzow & Schwartz it is obviously a bound constraint since ψ(sG,i ; t) = ψ(sH,i ; t) = t. The buttery relaxation gives ψ(sG,i ; t) = t1 θt2 (sH,i ) and ψ(sH,i ; t) = t1 θt2 (sG,i ). This is not a bound constraint but we can easily use a substitution technique. This key observation is another motivation to use a formulation with slack variables. Furthermore, a careful choice of the function ψ may allow to get an analytical solution of the following equation in α for given values of sG , sH , dsG , dsH : sG,i + αdsG,i − ψ(sH,i + αdsH,i , t) = 0 .

Solving exactly this equation is very useful while computing the largest step so that the iterates remain feasible x , the equation above is reduced to the along a given direction. For the buttery relaxation with θt2 (x) = x+t 2 following second order polynomial equation if sH,i + αdsH,i ≥ 0: (sH,i + αdsH,i + r)(sG,i + αdsG,i ) − t(sH,i + αdsH,i ) = 0 .

(19)

Algorithm 1 presents an active-set scheme to solve (P (x, s)), which is described in depth in the sequel of this section.

Data:

xk−1 , sk−1 ; precision  > 0 ; σρ ∈ (0, 1) update in ρ, ρ0 > 0 initial value of ρ, τvio ∈ (0, 1), ρmax ; 0 Initial estimate of the multiplier ν ; Initial working set W0 of active constraints, A0 initial matrix of gradients Initial data

of active constraints.;

sat:=true

1 Begin ; 2 Set j := 0, ρ := ρ0 ; 3 (xk−1,0 , sk−1,0 )=Projection of (xk−1 , sk−1 ) if not feasible for (P (x, s)) ; 4 while sat and k∇L1 (xk,j , sG k,j , sH k,j , ν j ; tk ))k2∞ > kν j k∞ or min(ν j ) < 0 5 6

do

Substitution of the variables that are xed by the active constraints in Compute a feasible direction

dj

or βtk ,t¯k (xk,j , sk,j ) > 



Wj ;

that lies in the subspace dened by the working set

Wj

(see (20))

and satises the conditions (SDD);

7

Compute

α ¯

the maximum non-negative feasible step along

dj

α ¯ := sup{α : z j + αdj ∈ Ft,t¯} Compute a step length

8 9 10 11 12 13 14 15

if

αj = α ¯

then

αj ≤ α ¯

Update the working set

(see (19)) such that Armijo condition (21) holds ;

→ Wj+1

and compute

Aj+1

the matrix of gradients of active

constraints

(xk,j+1 , sk,j+1 ) = (xk,j , sk,j ) + αj dj

;

j:=j+1 ;

if

βtk ,t¯k (xk,j+1 , sk,j+1 ) ≥ max(τvio βtk ,t¯k (xk,j , sk,j ), ) & ρ < ρmax ρ = σρ ρ

else

Determine the approximate multipliers

ν j+1 = (ν G , ν H , ν Φ )

then

by solving

ν j+1 ∈ arg min kATj+1 ν − ∇Ψρ (z j , t)k2 ν∈R Relaxing rule :

16 17 18

if

Update of the

j

sat:=kd

W|

∃ i, νij+1 < 0 and (satisfy (22) or αj = 0 ) then working set Wj+1 (with an anti-cycling rule) ;

k>

return: x , sk or a decision of unboundedness. Algorithm 1: Outer loop iteration : active set method for relaxed non-linear program (P (x, s)). k

29

At each step, the set

Wj

denotes the set of active constraints of the current iterate

xk,j .

As pointed

out in Remark 10.1, these active constraints x some of the variables. Therefore, by replacing these xed variables we can rewrite the problem in a subspace of the initial domain. Thus, we consider the following minimization problem

min

(x,s)∈Rn ×R2q s.t.

˜ sS ∪S ) ˜ ρ (x, sS ∪S ) := f (x) + 1 φ(x, Ψ G H G H 2ρ , sG,i ≥ −t¯ for i ∈ SG , sH,i ≥ −t¯ for i ∈ SH , Φi (sG , sH ; t) ≤ 0

for

˜t (x, s)) (P

i ∈ SG ∪ SH

with

IG IH

SG

and

SH

:= {i ∈ {1, . . . , q} | sGi = −t¯} := {i ∈ {1, . . . , q} | sH i = −t¯}

0+ IGH

:= {i ∈ {1, . . . , q} | sH i = ψ(sG ; t)}

+0 IGH 00 IGH

:= {i ∈ {1, . . . , q} | sGi = ψ(sH ; t)} := {i ∈ {1, . . . , q} | sGi = sH i = ψ(0; t)}

SG

+0 00 := {i ∈ {1, . . . , q}}\(IG ∪ IGH ∪ IGH )

SH

0+ 00 := {i ∈ {1, . . . , q}}\(IH ∪ IGH ∪ IGH )

respectively denote the set of indices where the variables

sG

and

sH

are free.

Some of the xed variables are replaced by a constant and others are replaced by an expression that depends on the free variables. It is rather clear from this observation that the use of slack variables is an essential tool to handle the non-linear bounds. A major consequence here is that the gradient of

Ψ

in this subspace must be done with care using the

composition of the derivative formula :

˜ ρ (x, sS ∪S ) = J T¯ ∇Ψρ (x, s) , ∇Ψ G H Wj where

JW¯j

is an

(n + 2q) × (n + #SG + #SH )

(20)

matrix dened such that

 x JW ¯j  s  :=  JW¯Gj  . sH JW ¯j 

JW¯j

The three sub-matrices used to dene

JW¯j

are computed in the following way

x JW ¯j = Idn ,

sG JW ¯j ,i

=

  eT ,   i   0,

sH JW ¯j ,i

=

JW¯j ,i

for

  eT ,   i

x=sH

i∈

for

+0 i ∈ IGH

,

+0 ({1, . . . , q}\SG )\IGH

i ∈ SH eTi ,

for

∂ψ(x;t) ∂x

  0, where

i ∈ SG eTi ,

for

∂ψ(x;t) ∂x

for

x=sG

i∈

denotes the i-th line of a matrix and

ei

for

0+ i ∈ IGH

,

0+ ({1, . . . , q}\SH )\IGH is a vector of zero whose i-th component is one. We may

proceed in a similar way to compute the hessian matrix of

30

˜ ρ (x, sS ∪S , t). Ψ G H

dj

The feasible direction

is constructed to lie in a subspace dened by the working set and satisfying the

sucient-descent direction conditions:

∇Ψ(z j )dj ≤ −µ0 k∇Ψ(z j )k2 kdj k ≤ µ1 k∇Ψ(z j )k where

µ0 > 0, µ1 > 0. αj ∈ (0, α ¯]

The step length

If

α ¯

,

(SDD)

is respectively computed to satisfy the Armijo and Wolfe conditions:

Ψ(z j + αj dj ) ≤ Ψ(z j ) + τ0 αj ∇Ψ(z j )T dj , τ0 ∈ (0, 1)

(21)

∇Ψ(z j + αj dj )T dj ≥ τ1 ∇Ψ(z j )T dj , τ1 ∈ (τ0 , 1)

(22)

satises the Armijo condition (21), the active set strategy adds a new active constraint and the Wolfe

condition (22) is not enforced. Otherwise, the Armijo condition requires

α

Suggest Documents