Strictly increasing positively homogeneous functions with application to exact penalization
A. M. RUBINOV, SITMS, University of Ballarat, Victoria 3353, AUSTRALIA
R. N. GASIMOV, Osmangazi University, Department of Industrial Engineering, Bademlik 26030, Eskişehir, TURKEY
Abstract. We study a nonlinear exact penalization for optimization problems with a single constraint. The penalty function is constructed as a convolution of the objective function and the constraint by means of IPH (increasing positively homogeneous) functions. The main results are obtained for penalization by strictly IPH functions. We show that some restrictive assumptions, which have been made in earlier research on this topic, can be removed. We also compare the least exact penalty parameters for penalization by different convolution functions. These results are based on some properties of strictly IPH functions that are established in the paper.
Key words Nonlinear penalization, exact penalty parameter, IPH functions, strictly IPH functions, equivalence of IPH convolution functions.
1 Introduction
Different types of generalized Lagrange and penalty functions have found many applications in theoretical investigations and in solving constrained optimization problems (see [4, 5, 6, 7, 9, 14, 15, 13, 19, 20, 21] and references therein). In this paper we study a generalized nonlinear penalization by means of the so-called IPH functions. Consider the following mathematical programming problem P(f0, f1) with a single inequality constraint:

P(f0, f1):   minimize f0(x) subject to x ∈ X, f1(x) ≤ 0,   (1)
where X is a metric space and f0, f1 are finite functions defined on X. We assume that f0 is a positive function. A mathematical programming problem with a finite number of inequality constraints gi and equality constraints hj can be reduced to a problem (1) by setting f1(x) = max(maxi gi(x), maxj |hj(x)|). Let p be a function of two variables. Consider a nonlinear penalty function for the problem P(f0, f1) with respect to the function p:

Lp+(x, d0, d) = p(d0 f0(x), d f1(x)),   d0 > 0, d > 0,
and the function qp(d0, d) = inf_{x∈X} Lp+(x, d0, d). The function qp(1, d) is called the dual function of P(f0, f1) corresponding to p. The corresponding dual problem Dp(f0, f1) is: maximize qp(1, d) subject to d > 0. A number d̄ is called an exact penalty parameter if qp(1, d̄) = max_{d>0} qp(1, d). The function p, which serves for convoluting the objective f0 and the constraint f1, is called a convolution function. We shall study penalty functions Lp+ corresponding to strictly increasing, positively homogeneous of the first degree (strictly IPH) convolution functions p. The main questions in which we are interested are:

• to find conditions that guarantee the existence of exact penalty parameters;
• how to describe the least exact penalty parameter, if exact penalty parameters exist.

Similar questions, and the zero duality gap property for optimization problems having an optimal solution on the boundary of the set of feasible elements, have been studied in [14] and [16]. In this paper we answer these questions in the general situation. We show that although the formula suggested in [16] holds in a more general situation than in [16], for some problems a different formula should be applied. We give necessary and sufficient conditions for the existence of an exact parameter in a general situation. Note that the above mentioned questions were studied in [14] and [16] in terms of the usual perturbation function β of the problem P(f0, f1):

β(y) = inf{f0(x) : x ∈ X, f1(x) ≤ y},   y ≥ 0.
However, for the investigation of these questions in this study we use the modified perturbation function βm(y) = inf{f0(x) : x ∈ X, 0 < f1(x) ≤ y}, y > 0. Only infeasible elements are taken into account in forming the modified perturbation function. One more question, which is studied in detail in the paper, is the comparison of the least exact parameters for different convolution functions. The significance of this question is due to the following: we can use different convolution functions for penalization of the initial constrained problem. Which of them is better, in the sense that it allows us to obtain an optimal solution with a smaller penalty parameter? We introduce the notion of equivalence of convolution functions and show that many strictly increasing sublinear functions are equivalent via penalization. For the examination of nonlinear penalization by means of strictly IPH convolution functions we need an advanced theory of these functions. The main tool in the study of IPH functions is abstract convexity (see [11, 12, 18]). We use the duality between IPH functions and the so-called normal sets, presented in [12], Chapter 2. The basis of this duality is the abstract convexity of IPH functions with respect to the set of min-type functions. (All required
definitions and results from abstract convexity are given in the paper.) We develop a theory of strictly IPH functions, which is based on the mentioned duality. The paper is structured as follows. The class of IPH functions is studied in Section 2. The nonlinear penalization by means of IPH convolution functions is presented in Section 3. The least exact parameters are examined in Section 4. Section 5 is devoted to the comparison of the least exact penalty parameters corresponding to different convolution functions.
2 IPH functions and their associated functions
2.1 Preliminaries
This subsection contains some known elements from the theory of IPH functions. The reader can find proofs of all results from this subsection in [14] (see Section 2) and also in [12] (see Chapters 2 and 3). Let n be a positive integer and I = {1, . . . , n}. Consider the space IRn of all vectors (xi)i∈I. We shall use the following notations:

• xi is the i-th coordinate of a vector x ∈ IRn;
• if x, y ∈ IRn then x ≥ y ⇔ xi ≥ yi for all i ∈ I;
• if x, y ∈ IRn then x > y ⇔ x ≥ y and x ≠ y;
• if x, y ∈ IRn then x ≫ y ⇔ xi > yi for all i ∈ I;
• IRn+ := {x ∈ IRn : x ≥ 0};
• IRn++ := {x ∈ IRn : x ≫ 0};
• IR = IR1; IR+ = IR1+; IR++ = IR1++; IR+∞ = IR+ ∪ {+∞}.

We accept by definition that the supremum over the empty set is equal to zero. Let K be either IRn+ or IRn++.
Definition 2.1 A function p : K → IR+∞ is called
(i) increasing if x ≥ y implies p(x) ≥ p(y);
(ii) strictly increasing if x > y implies p(x) > p(y);
(iii) positively homogeneous if p(λx) = λp(x) for x ∈ K and λ > 0.
In this section we shall study increasing positively homogeneous (IPH) functions. It is known (and easy to check) that each IPH function defined on IRn++ is continuous. We shall study only continuous IPH functions defined on IRn+. If p is an IPH function defined on IRn+ then its restriction to IRn++ is also an IPH function, and there is a one-to-one correspondence between the set of all IPH functions defined on IRn++ and the set of all continuous IPH functions defined on IRn+. Let p : IRn++ → IR+∞ be an IPH function. If there exists a point y ∈ IRn++ such that p(y) = 0 (p(y) = +∞, respectively), then p(x) = 0 (p(x) = +∞, respectively) for all x ∈ IRn++. Consider the coupling function ⟨l, y⟩ defined on IRn++ × IRn++ by

⟨l, y⟩ = min_{i∈I} li yi.   (2)

The function l : IRn++ → IR+ defined by l(y) = ⟨l, y⟩ is called the min-type function generated by l. Clearly l is an IPH function. The set L of all min-type functions is a supremal generator of the set of all IPH functions, in the sense that each IPH function is the supremum of a certain set of min-type functions. To express this fact more precisely, we need the following definition.

Definition 2.2 1) Let p : IRn++ → IR+∞ be an IPH function. The set

supp(p) = {l ∈ IRn++ : ⟨l, y⟩ ≤ p(y) for all y ∈ IRn++}   (3)
is called the support set of the function p. 2) Let p : IRn+ → IR be a continuous IPH function. Then the support set of the restriction of p to IRn++ is called the support set of p.

Proposition 2.1 Let p : IRn++ → IR+∞ be an IPH function. Then p(y) = sup{⟨l, y⟩ : l ∈ supp(p)} for all y ∈ IRn++. If p ≠ 0, p ≠ +∞, then for each y ∈ IRn++ there exists l ∈ supp(p) such that ⟨l, y⟩ = p(y), hence p(y) = max{⟨l, y⟩ : l ∈ supp(p)}.

Remark 2.1 If p(y) = +∞ for all y ∈ IRn++ then supp(p) = IRn++. If p(y) = 0 for all y ∈ IRn++ then supp(p) = ∅.

For x ∈ IRn++ we shall require the following notational convention:

x^{-1} ≡ 1/x = (1/xi)_{i∈I}.   (4)

Proposition 2.2 Let p : IRn++ → IR+ be an IPH function. Then supp(p) = {l ∈ IRn++ : p(l^{-1}) ≥ 1}.
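To make the support-set duality concrete, the following small numerical sketch (an illustration added here, not part of the original paper) checks Propositions 2.1 and 2.2 for the IPH function p(y1, y2) = y1 + y2 on IR2++: by Proposition 2.2 its support set is {l ∈ IR2++ : 1/l1 + 1/l2 ≥ 1}, and sampling min-type functions ⟨l, ·⟩ over this set recovers p as a pointwise supremum. The function choice and grid parameters are illustrative assumptions.

```python
import numpy as np

def p(y1, y2):
    # IPH convolution function p(y) = y1 + y2 (the classical penalty s1)
    return y1 + y2

def in_support(l1, l2):
    # Proposition 2.2: l belongs to supp(p) iff p(l^{-1}) >= 1
    return p(1.0 / l1, 1.0 / l2) >= 1.0

# sample candidate vectors l on a grid in IR^2_{++}
ls = [(l1, l2) for l1 in np.linspace(0.05, 20, 400)
               for l2 in np.linspace(0.05, 20, 400) if in_support(l1, l2)]

y = (0.7, 1.3)
# Proposition 2.1: p(y) = sup { <l, y> : l in supp(p) },  <l, y> = min(l1*y1, l2*y2)
sup_min_type = max(min(l1 * y[0], l2 * y[1]) for l1, l2 in ls)
print(p(*y), sup_min_type)   # the two values should (approximately) agree
```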
Definition 2.3 A nonempty set U ⊂ IRn++ is called normal if x ∈ U, y ∈ IRn++, y ≤ x implies y ∈ U. The empty set is normal by definition.

In the sequel we shall study closed normal subsets of IRn++. We consider IRn++ as a topological space with the topology induced from IRn. Let U ⊂ IRn++. Denote by cl U and cl∗ U the closure of the set U in the topological spaces IRn and IRn++, respectively. The boundary of the set in IRn and IRn++ will be denoted by bd U and bd∗ U, respectively. It easily follows from the definition of the support set that the set supp(p) is closed and normal for each IPH function p : IRn++ → IR+∞.

To present an important example of a normal set, we need the following definitions. Let g : IRn++ → IR+∞ be a function. The set hypo+ g defined by

hypo+(g) = {(α, y) ∈ IR1+n++ : 0 < α ≤ g(y), y ∈ IRn++}

is called the positive part of the hypograph of the function g. The function g is upper semicontinuous if and only if hypo+ g is a closed (in IR1+n++) set. A function g : IRn++ → IR+ is called decreasing if x ≤ y implies g(x) ≥ g(y).

Proposition 2.3 The function g : IRn++ → IR is upper semicontinuous and decreasing if and only if hypo+ g is a closed normal subset of IR1+n++.

Let U ⊂ IR++ × IRn++ be a closed normal set. Consider the function gU : IRn++ → IR+∞ defined by

gU(y) = sup{α : (α, y) ∈ U}.   (5)

(Recall that we accept by definition that the supremum over the empty set is equal to zero.) The function gU is upper semicontinuous and decreasing. Let g : IRn++ → IR+∞ be an upper semicontinuous and decreasing function. Let dom g = {y ∈ IRn++ : g(y) < +∞} and let C(dom g) = {y ∈ IRn++ : g(y) = +∞} be the complement of dom g. It follows from the properties of g that the set C(dom g) is closed and normal. If U ⊂ IR1+n++ is a closed normal set and g = gU then

U = {(α, y) : y ∈ C(dom gU), α > 0} ∪ {(α, y) : y ∈ dom gU, 0 < α ≤ gU(y)}.   (6)

The following definition plays the main role in this paper.

Definition 2.4 Let p : IR1+n++ → IR+∞ be an IPH function and U = supp(p). Then the function hp = gU is called the associated function to p.

Since U := supp(p) is a closed normal subset of IR1+n++, this definition makes sense.
2.2 Lattices of IPH functions, normal sets and decreasing functions
Recall that an ordered set A is called a complete lattice if each subset B ⊂ A has the exact upper bound sup B and the exact lower bound inf B. We shall use the following well known result (see, for example, [1]):

Proposition 2.4 Let A1 and A2 be two complete lattices and let ψ : A1 → A2 be an isotonic one-to-one correspondence (the isotonicity of ψ means that a ≥ b implies ψ(a) ≥ ψ(b)). Then for each B ⊂ A1 we have ψ(sup B) = sup ψ(B) and ψ(inf B) = inf ψ(B).

Consider the ordered set Pn of all IPH functions defined on IRn++ with the pointwise order relation: p ≥ q ⇐⇒ p(y) ≥ q(y) for all y ∈ IRn++. Clearly Pn is a complete lattice: if B ⊂ Pn, then (sup B)(y) = sup_{p∈B} p(y) and (inf B)(y) = inf_{p∈B} p(y). Let Nn be the ordered set of all closed normal subsets of IRn++ with the order relation defined via the inclusion: U1 ≥ U2 ⇐⇒ U1 ⊃ U2. This set is also a complete lattice. If E ⊂ Nn then inf E = ∩_{U∈E} U and sup E = cl∗ ∪_{U∈E} U. (Here cl∗ stands for the closure in the topological space IRn++.) The mapping ϕ : Pn → Nn defined by ϕ(p) = supp(p) is called the Minkowski duality.

Proposition 2.5 ([8, 12]) 1) The Minkowski duality ϕ : Pn → Nn is an isotonic one-to-one correspondence.

The following result follows directly from Proposition 2.4 and Proposition 2.5.

Proposition 2.6 Let (pt)_{t∈T} be a family of IPH functions and

p^∗(y) = sup_{t∈T} pt(y),   p_∗(y) = inf_{t∈T} pt(y).

Then

supp(p^∗) = cl∗ ∪_{t∈T} supp(pt),   supp(p_∗) = ∩_{t∈T} supp(pt).
Denote by Dn the ordered set of all upper semicontinuous decreasing functions defined on IRn++ with the pointwise order relation. This set is a complete lattice. If Z ⊂ Dn, then (inf Z)(y) = inf_{g∈Z} g(y) and (sup Z)(y) coincides with the upper regularization cl g^Z of the function g^Z(y) = sup_{g∈Z} g(y). (The upper regularization cl g of a function g is the least upper semicontinuous function which is greater than or equal to g.)

Proposition 2.7 Let (pt)_{t∈T} be a family of IPH functions defined on IRn+1++ and p^∗(y) = sup_{t∈T} pt(y), p_∗(y) = inf_{t∈T} pt(y). Then the associated function h_{p^∗} of the IPH function p^∗ coincides with the upper regularization of the function y ↦ sup_{t∈T} h_{pt}(y), and the associated function h_{p_∗} of the IPH function p_∗ coincides with the function y ↦ inf_{t∈T} h_{pt}(y).
Proof: Consider the mapping ϕ1 : Nn+1 → Dn defined by ϕ1(U) = gU, where gU is the function defined by (5). Clearly ϕ1 is an isotonic mapping. It easily follows from (6) that the mapping ϕ1 is a one-to-one correspondence. Consider the composition ψ = ϕ1 ◦ ϕ of the mappings ϕ and ϕ1. Since both ϕ and ϕ1 are isotonic one-to-one correspondences, it follows that ψ is also an isotonic one-to-one correspondence. The desired result now follows from Proposition 2.4. 4

The following assertion will be useful in the sequel.

Proposition 2.8 Let p : IR1+n++ → IR+ be an IPH function and c > 0. Then hcp(y) = c hp(y/c).

Proof: We have

supp(cp) = {l ∈ IRn+1++ : ⟨l, x⟩ ≤ cp(x) for all x ∈ IRn+1++}
         = {l : ⟨l/c, x⟩ ≤ p(x) for all x ∈ IRn+1++}
         = {cl′ : ⟨l′, x⟩ ≤ p(x) for all x ∈ IRn+1++}
         = c supp(p).

Hence

hcp(y) = sup{α : (α, y) ∈ supp(cp)} = sup{α : (α/c, y/c) ∈ supp(p)} = c sup{α/c : (α/c, y/c) ∈ supp(p)} = c hp(y/c). 4
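As a quick numerical illustration (a sketch written for this presentation, not from the paper), the identity of Proposition 2.8 can be tested on the classical convolution function p(y1, y2) = y1 + y2, whose associated function has the closed form hp(y) = y/(y − 1) for y > 1 (this form reappears later as (74)). The helper below computes associated functions directly from the definition hp(y) = sup{α : (α, y) ∈ supp(p)} by a crude grid search, so agreement is only approximate.

```python
import numpy as np

def assoc(p, y, alphas=np.linspace(0.01, 50, 5000)):
    # h_p(y) = sup{alpha : (alpha, y) in supp(p)}; by Proposition 2.2,
    # (alpha, y) in supp(p) iff p(1/alpha, 1/y) >= 1.
    ok = [a for a in alphas if p(1.0 / a, 1.0 / y) >= 1.0]
    return max(ok) if ok else 0.0

p  = lambda u, v: u + v          # s1
c  = 3.0
cp = lambda u, v: c * p(u, v)    # the scaled function cp

y = 5.0
print(assoc(cp, y))                  # h_{cp}(y)
print(c * assoc(p, y / c))           # c * h_p(y/c) -- Proposition 2.8 says this equals h_{cp}(y)
print(c * (y / c) / (y / c - 1.0))   # same quantity from the closed form h_p(t) = t/(t-1), t > 1
```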
2.3 Strictly normal sets and strictly IPH functions
A normal set U ⊂ IRn++ is called strictly normal if for each boundary point x ∈ bd∗ U the inequality y > x implies y ∉ U. Here bd∗ U is the boundary of U in the topological space IRn++. An IPH function p : IRn++ → IR+ is called strictly IPH if this function is strictly increasing on IRn++: y > x implies p(y) > p(x).
Proposition 2.9 An IPH function p : IRn++ → IR+ is strictly IPH if and only if the set supp(p) is strictly normal.

Proof: 1) First assume that supp(p) is strictly normal. We need to show that p is strictly increasing. Let x ∈ IRn++ and x > y. Let lx = p(x)/x. We have p(lx^{-1}) = 1, so it follows from Proposition 2.2 that lx ∈ supp(p). Since ⟨λlx, x⟩ = λp(x) > p(x) for all λ > 1, it follows that λlx ∉ supp(p), hence lx ∈ bd∗ supp(p). The same argument shows that ly ∈ bd∗ supp(p). Now suppose that p(x) = p(y). Then the inequality x > y implies ly > lx. Since lx is a boundary point of supp(p) and supp(p) is strictly normal, it follows that ly ∉ supp(p). We have a contradiction, which shows that p(x) > p(y).

2) Let p be a strictly increasing function. Consider l ∈ bd∗ supp(p). Let m > l. We need to show that m ∉ supp(p). Suppose to the contrary that m ∈ supp(p). Then it follows from the normality of supp(p) that m ∈ bd∗ supp(p). Applying Proposition 2.2 and the continuity of p, we conclude that bd∗ supp(p) = {l′ : p(1/l′) = 1}. Hence p(1/l) = p(1/m). However, this is impossible since 1/l > 1/m and p is a strictly increasing function. 4
Let g : IRn++ → IR+∞ be a function. Recall that the set {y ∈ IRn++ : g(y) < +∞} is called dom g. We say that g is continuous at a point y ∈ IRn++ if yk → y implies g(yk) → g(y). If y ∈ cl dom g then this definition is equivalent to the following: yk → y, yk ∈ dom g implies g(yk) → g(y). The function g is continuous on IRn++ if it is continuous at each point y ∈ IRn++.
Proposition 2.10 Let U ⊂ IR++ × IRn++ be a closed normal set, U ≠ ∅, U ≠ IR1+n++. Then U is strictly normal if and only if the function gU defined by (5) is strictly decreasing on the set {y ∈ dom gU : gU(y) > 0} and continuous on IRn++.

Proof: 1) Let U be strictly normal and let 0 < gU(y) < +∞. Since gU(y) > 0 it follows that (gU(y), y) ∈ U. The definition of gU implies the inclusion (gU(y), y) ∈ bd∗ U. Let y′ > y. Then we also have (gU(y), y′) > (gU(y), y), hence (gU(y), y′) ∉ U. Since U is closed it follows that there exists ε > 0 such that (gU(y) − ε, y′) ∉ U. Thus

gU(y′) = sup{α : (α, y′) ∈ U} ≤ gU(y) − ε < gU(y),

so gU is strictly decreasing. We now show that gU is continuous on IRn++. Since gU is upper semicontinuous it is enough to check the lower semicontinuity of gU. Suppose to the contrary that gU is not lower semicontinuous at a point y ∈ cl dom gU; then

lim inf_{y′→y} gU(y′) := E < gU(y).   (7)

Let ε > 0 be a number such that E + ε < gU(y). It follows from this inequality that (E + ε, y) ∈ U. Let us check whether (E + ε, y) is a boundary point of U or not. Let yk be a sequence such that yk → y, yk ∈ dom gU and gU(yk) → lim inf_{y′→y} gU(y′) = E. Let αk = gU(yk) + ε. Then (αk, yk) ∉ U and, due to (7), we have (αk, yk) → (E + ε, y). Thus (E + ε, y) ∈ bd∗ U. Let E + ε < E′ < gU(y). Since (E′, y) > (E + ε, y) and U is strictly normal, it follows that (E′, y) ∉ U. On the other hand, the inequality E′ < gU(y) implies (E′, y) ∈ U. We arrive at a contradiction, which shows that gU is continuous.

2) Let the function gU be strictly decreasing on the set {y ∈ IRn++ : 0 < gU(y) < +∞} and continuous on IRn++. Let (α, y) be a boundary point of the set U. First we check the equality α = gU(y). Suppose to the contrary that α ≠ gU(y). Since gU(y) = sup{α′ > 0 : (α′, y) ∈ U}, it follows that 0 < α < gU(y). There exists a sequence (αk, yk) → (α, y) such that (αk, yk) ∉ U. The latter means that αk > gU(yk). Since gU is continuous, it follows that α = lim αk ≥ limk gU(yk) = gU(y). Thus our assumption is not valid and α = gU(y). Let (α′, y′) > (α, y). If y′ > y then gU(y′) < gU(y) = α ≤ α′. Since α′ > gU(y′) it follows that (α′, y′) ∉ U. If y′ = y then α′ > α = gU(y). This means that (α′, y′) ∉ U. We have proved that the set U is strictly normal. 4

For application to nonlinear penalization we need to consider continuous IPH functions p defined on the cone IR2+ and such that

p(1, 0) > 0,   p(0, 1) > 0.   (8)
It is easy to check that (8) is equivalent to the following:

p(x) > 0 for all x ∈ IR2+, x ≠ 0.   (9)

Indeed, if there were a point x̄ ∈ IR2++ with p(x̄) = 0 then we would have p(x) = 0 for all x ∈ IR2++, and due to the continuity of p we would have p(1, 0) = p(0, 1) = 0, which contradicts (8). Thus p(x) > 0 for all x ∈ IR2++. Since p is positively homogeneous it follows also that p(u, 0) > 0 and p(0, v) > 0 for all u, v > 0. It is easy to see that

p(1, 0) > 0, p(0, 1) > 0 =⇒ hp(y) > 0 for all y ∈ IR++.   (10)
Indeed, let hp(y) = 0 for some y > 0. Then (α, y) ∉ supp(p) for all α > 0, hence for each ε > 0 there exists xε ∈ IR2++ such that ⟨(ε, y), xε⟩ = min(εx1ε, yx2ε) > p(xε). We can assume without loss of generality that ‖xε‖ = 1 and that there exists limε→0 xε = x̄. Then p(x̄) ≤ limε→0 ⟨(ε, y), xε⟩ = 0, which contradicts (9). Thus (10) is valid.

Proposition 2.11 Let p : IR2+ → IR+ be a continuous strictly IPH function such that (8) holds and let b > 0. Then p(0, 1) = b if and only if

hp(y) = +∞ if y ≤ b,   hp(y) < +∞ if y > b.   (11)
Proof: 1) Let h : IR++ → IR++ be a continuous and decreasing function such that h(y) > 0 for all y > 0. Assume that there exists a number c ≥ 0 such that

h(y) = +∞ if y < c,   h(y) < +∞ if y > c.   (12)

(If c = 0 then h is a finite function.) Since h is continuous on IR++ it follows that limy→c h(y) = +∞. It is easy to check that for all u > 0 the equation y = h(y)u has a unique solution yu. We now show that

lim sup yu ≤ c as u → 0.   (13)

Indeed, assume that there exists a sequence uk → 0 such that

yuk ≥ c + ε,   (14)

where ε > 0 is a sufficiently small number with h(c + ε) > 0. Since h is decreasing, we deduce that h(yuk) ≤ h(c + ε) < +∞, hence yuk = h(yuk)uk ≤ h(c + ε)uk. Thus yuk → 0, which contradicts (14). We have proved that (13) holds.

2) Assume that (11) is valid. Due to (6) we have

supp(p) = {(α, y) : α > 0, y ≤ b} ∪ {(α, y) : 0 < α ≤ hp(y), y > b},

so

p(u, 1) = sup{min(αu, y) : (α, y) ∈ supp(p)} = max( sup_{(α,y): α>0, y≤b} min(αu, y),  sup_{(α,y): 0<α≤hp(y), y>b} min(αu, y) ).   (15)

Applying (10) we conclude that hp(y) > 0 for all y > 0. It is easy to check that

sup_{(α,y): α>0, y≤b} min(αu, y) = b.   (16)

We also have

sup_{0<α≤hp(y), y>b} min(αu, y) = sup_{y>b} min(hp(y)u, y).

The function p is strictly increasing, so applying Proposition 2.9, Proposition 2.10 and (10) we conclude that hp is a continuous function on IR++ which is strictly decreasing and positive on {y : y > b}. Let u be a positive number and let yu be the solution of the equation hp(y)u = y. Then sup_{y>b} min(hp(y)u, y) = yu, hence

sup_{0<α≤hp(y), y>b} min(αu, y) = yu.   (17)

It follows from (15), (16) and (17) that p(u, 1) = max(b, yu). Due to the continuity of p and (13) with c = b, we have p(0, 1) = b.

3) We now prove the reverse assertion: if p(0, 1) = b > 0 then (11) holds. Assume that the associated function hp is finite for all y > 0. Then we have for u > 0:

p(u, 1) = sup{min(αu, y) : (α, y) ∈ supp(p)} = sup_{y>0} min(hp(y)u, y).

We have sup_{y>0} min(hp(y)u, y) = yu, where yu is the solution of the equation hp(y)u = y. Applying (13) with c = 0, we conclude that limu→0 yu = 0. Since p is continuous and p(u, 1) = yu, we have p(0, 1) = 0, which is impossible. Thus we have established that the set Y = {y : hp(y) = +∞} is not empty. Since hp is continuous and decreasing on IR++ it follows that there exists a number c > 0 such that Y = (0, c]. It follows from the second part of the proof that p(0, 1) = c. Hence c = b. The proposition is proved. 4

In the following sections we shall apply some special classes of IPH functions defined on IR2+ to the examination of nonlinear penalization. We shall use the following notations:
P is the set of all continuous IPH functions p : IR2+ → IR+ such that

p(1, 0) > 0,   p(1, u) → +∞ as u → +∞;   (18)

Pa = {p ∈ P : p(1, 0) = a},   a > 0;   (19)

Pa,b is the set of continuous strictly IPH functions p : IR2+ → IR+ such that

p(1, 0) = a,   p(0, 1) = b,   a > 0, b > 0.   (20)
Note that Pa,b ⊂ Pa. Indeed, let u → +∞. Since p(1, u) = u p(1/u, 1) and p(1/u, 1) → b > 0, it follows that limu→+∞ p(1, u) = +∞. Let p ∈ P. We denote the restriction of the function p to IR2++ by the same symbol p. Using this restriction, we can define for p the associated function hp. The following assertion will be useful in the sequel (see [14], Proposition 3.5).

Proposition 2.12 Let p ∈ P. Then p ∈ P1 (that is, p(1, 0) = 1) if and only if lim_{y→+∞} hp(y) = 1.
3 Nonlinear penalization by means of an IPH convolution function

3.1 Preliminaries
Consider the following constrained optimization problem:

P(f0, f1):   minimize f0(x) subject to x ∈ X, f1(x) ≤ 0,   (21)

where X is a metric space and fi : X → IR, i = 0, 1. The optimal value of the problem P(f0, f1) will be denoted by M. We assume in the sequel that

M > inf_{x∈X} f0(x) := γ > 0.   (22)
Remark 3.1 1) The inequality M > γ means that the constraint f1 is essential; 2) the inequality γ > 0 means that the objective function is uniformly positive. If a lower bound b of f0 over X is known, then we can simply replace f0 with f0 − b0, where b0 < b. A certain transformation of an arbitrary problem P(f0, f1) leads to an equivalent problem with a uniformly positive objective function. We now describe this transformation. Let d > 0 and let ϕ : IR → (d, +∞) be a strictly increasing continuous function. Consider a problem P(f0, f1) with a not necessarily positive f0 : X → IR. Let f̃0 = ϕ ◦ f0. Then the problem P(f̃0, f1) is equivalent to P(f0, f1) in the sense that both P(f0, f1) and P(f̃0, f1) have the same local and global minima. Clearly inf_{x∈X} f̃0(x) > 0.
Consider a problem P(f0, f1). We denote by X0 the set of all feasible elements of this problem and by X1 the complement of X0 in X. Thus

X0 = {x ∈ X : f1(x) ≤ 0},   X1 = {x ∈ X : f1(x) > 0}.   (23)

Let f1+(x) = max(f1(x), 0). Then

f1+(x) = 0 if x ∈ X0,   f1+(x) = f1(x) if x ∈ X1.
Let p ∈ P. Consider a problem P(f0, f1). The function

Lp+(x, d0, d) = p(d0 f0(x), d f1+(x)),   x ∈ X, d0 > 0, d > 0,   (24)

is called the nonlinear penalty function corresponding to p. The function

qp(d0, d) = inf_{x∈X} p(d0 f0(x), d f1+(x)) ≡ inf_{x∈X} Lp+(x, d0, d),   d0 > 0, d > 0,   (25)

is called the dual function corresponding to the function p. It is easy to check that qp(d0, d) is an IPH function defined on IR2++, so we can apply the theory of IPH functions to the examination of the dual function. In the sequel we shall mainly consider the function of one variable qp(1, d); however, properties of the IPH function qp(d0, d) will be used in the study of qp(1, d). The problem

(Dp(f0, f1)):   sup qp(1, d) subject to d > 0   (26)

is called the dual problem to P(f0, f1) corresponding to p. Denote by MDp the value of the dual problem:

MDp = sup_{d>0} qp(1, d).   (27)

Let p ∈ P1. Then for each P(f0, f1) we have

MDp := sup_{d>0} qp(1, d) ≤ inf_{x∈X0} f0(x) := M.   (28)

We say that the zero duality gap property (with respect to p) holds for P(f0, f1) if MDp = M. A number d̄ > 0 is called an exact penalty parameter if MDp = qp(1, d̄). If d̄ is an exact parameter, then the problem P(f0, f1) can be reduced to the unconstrained minimization of the function Lp+(x, 1, d̄). Since the function qp(1, d) is increasing, it follows that each number that is greater than an exact parameter is also an exact parameter. The function Lp+ is called an exact nonlinear penalty function if an exact parameter exists. The zero duality gap property and exact penalization for nonlinear penalty functions have been studied in [14, 16] (see also [12], Chapter 4) under the following assumption:

Assumption 3.1 Let X0 and X1 be the sets defined by (23) for a problem P(f0, f1). Then there exists a sequence xk ∈ X1 such that f1(xk) → 0 and f0(xk) → M.

Roughly speaking, Assumption 3.1 means that there is a solution of P(f0, f1) which is located on the boundary of the set X0 of feasible elements. One of the goals of the current paper is to show that the main results from [14, 16] are valid without this assumption.
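As a concrete numerical sketch of the objects defined in this subsection (the problem data and the choice p(u, v) = u + v below are illustrative assumptions, not examples from the paper), the snippet builds the penalty function Lp+(x, 1, d) and estimates the dual function qp(1, d) by a grid search for a small one-dimensional problem.

```python
import numpy as np

# illustrative problem data: X = [-2, 2], f0(x) = x^2 + 2 (positive), f1(x) = 1 - x
f0 = lambda x: x**2 + 2.0
f1 = lambda x: 1.0 - x
f1_plus = lambda x: np.maximum(f1(x), 0.0)

# convolution function p in P_1 (here the classical choice p(u, v) = u + v)
p = lambda u, v: u + v

X = np.linspace(-2.0, 2.0, 20001)          # grid standing in for the metric space X
feasible = X[f1(X) <= 0]                   # X0 = {x : f1(x) <= 0}
M = f0(feasible).min()                     # optimal value of P(f0, f1)

def q_p(d, d0=1.0):
    # dual function (25): q_p(d0, d) = inf_x p(d0*f0(x), d*f1^+(x)), approximated on the grid
    return p(d0 * f0(X), d * f1_plus(X)).min()

for d in (0.5, 1.0, 2.0, 5.0, 20.0):
    print(d, q_p(d))                       # q_p(1, d) increases in d and is bounded above by M
print("M =", M)
```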
3.2 Modified perturbation functions and the zero duality gap property
The main tool in the theoretical study of nonlinear penalization by means of IPH convolution functions is the perturbation function (see [14, 12]). Recall that the perturbation function β of the problem P(f0, f1) is defined on IR+ by

β(y) = inf{f0(x) : x ∈ X, f1(x) ≤ y}.   (29)

Note that β(0) = M. The function β is decreasing, and therefore it is upper semicontinuous. The lower semicontinuity of the perturbation function β at zero is very important in applications. Upper semicontinuity of β implies the following: β is lower semicontinuous at zero if and only if M = lim_{y→+0} β(y). In order to extend some results from [14] to a more general class of problems we need to consider a certain modification of the perturbation function (29).

Definition 3.1 The function

βm(y) = inf{f0(x) : x ∈ X, 0 < f1(x) ≤ y},   y > 0,   (30)

is called the modified perturbation function of the problem P(f0, f1).

Using notation (23) we can present the modified perturbation function in the form

βm(y) = inf{f0(x) : x ∈ X1, f1(x) ≤ y},   y > 0.   (31)

We shall define the modified perturbation function also at the origin. By definition, βm(0) = Mm, where

Mm := lim_{y→+0} βm(y).   (32)
It follows from this definition that βm is continuous at zero. The function βm is decreasing. We have

β(y) = inf_{x∈X, f1(x)≤y} f0(x) = min( inf_{x∈X0} f0(x),  inf_{x∈X1, f1(x)≤y} f0(x) ) = min(M, βm(y)).   (33)

Thus β(y) = βm(y) for all y > 0 if and only if βm(y) ≤ M for all y. It follows from (33) that βm ≥ β.

Proposition 3.1 ([14, 12]) If Assumption 3.1 holds then β(y) = βm(y) for all y > 0.
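The distinction between β and βm matters precisely when Assumption 3.1 fails. The following small numerical sketch (an illustration with made-up problem data, not an example from the paper) exhibits a problem whose unique solution is interior to the feasible set, so that βm(y) > M = β(y) for all small y > 0.

```python
import numpy as np

# illustrative data: X = [-2, 2], f0(x) = (x + 1)^2 + 1, f1(x) = x
# the minimizer x = -1 lies in the interior of the feasible set {x <= 0}
f0 = lambda x: (x + 1.0)**2 + 1.0
f1 = lambda x: x

X = np.linspace(-2.0, 2.0, 40001)

def beta(y):
    # perturbation function (29): inf{f0(x) : f1(x) <= y}
    return f0(X[f1(X) <= y]).min()

def beta_m(y):
    # modified perturbation function (30): inf{f0(x) : 0 < f1(x) <= y}
    sel = (f1(X) > 0) & (f1(X) <= y)
    return f0(X[sel]).min() if sel.any() else np.inf

M = beta(0.0)
for y in (0.01, 0.1, 0.5):
    print(y, beta(y), beta_m(y))   # beta(y) stays at M = 1, while beta_m(y) stays near 2
```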
Assume that the perturbation function β is lower semicontinuous at zero. Then

Mm = βm(0) ≥ lim_{y→+0} β(y) = M.   (34)

Let X0 and X1 be the sets defined by (23). Since f1+(x) = 0 for x ∈ X0 and f1+(x) = f1(x) for x ∈ X1, it follows that

qp(d0, d) = min( inf_{x∈X0} p(d0 f0(x), 0),  inf_{x∈X1} p(d0 f0(x), d f1(x)) ).   (35)

Let p ∈ P1. Then

inf_{x∈X0} p(d0 f0(x), 0) = inf_{x∈X0} d0 f0(x) p(1, 0) = d0 M p(1, 0) = d0 M.   (36)

Consider the function rp defined on IR2++ by

rp(d0, d) = inf_{x∈X1} p(d0 f0(x), d f1(x)).   (37)

It is easy to see that rp is an IPH function. It follows from (35) and (36) that

qp(d0, d) = min(d0 M, rp(d0, d)).   (38)
Assumption 3.1 implies the equality qp = rp (see [14, 12]). The IPH function rp and its associated function hrp were the main tool for examination of properties of nonlinear penalization in [14]. The equality qp = rp does not necessarily hold if Assumption 3.1 is not valid, so we need to study the dual function qp itself and its associated function hqp . The following two statements are analogues to results, which have been proved in [14] for problems P (f0 , f1 ), satisfying Assumption 3.1. These results describe some properties of hrp in terms of the perturbation function β and the value M of the problem P (f0 , f1 ). Analysis of the proofs shows that if we replace β with the modified perturbation function βm and M with Mm , respectively, then these results hold also for problems, for which Assumption 3.1 is not valid. We now present these statements. We omit their proofs since they are similar to those in [14]. Proposition 3.2 (compare with Proposition 6.1 and Corollary 6.1 in [14]) Let p ∈ P. Then hrp can be represented as the multiplicative inf-convolution of βm and hp : hrp (y) = inf βm (z)hp z>0
y , z
y > 0.
Theorem 3.1 (compare with Theorem 6.1 in [14]). Let p ∈ P1 . Then supz>0 hrp (z) = supd>0 rp (1, d) = Mm . The following assertion holds. 14
Lemma 3.1 Let p ∈ P1 . Then (39)
MDp = min(M, Mm ), where MDp is the value of the dual problem Dp .
Proof: By definition, MDp = supd>0 qp (1, d) = limd→+∞ qp (1, d). We also have (see (38)) qp (1, d) = min(M, rp (1, d)). Hence, MDp = min(M, lim rp (1, d)) = min(M, sup rp (1, d)). d→+∞
d>0
Applying Theorem 3.1 we conclude that (39) is valid.
4
Theorem 3.2 Let p ∈ P1 and let the perturbation function β be lower semicontinuous at the origin. Then the zero duality gap property (M = MDp) holds.

Proof: Lower semicontinuity implies (see (34)) that Mm ≥ M. Since, due to (39), MDp = min(M, Mm), we have MDp = M. 4

Remark 3.2 A different proof of Theorem 3.2 can be found in [17]. It has also been proved in [17] that M = MDp implies lower semicontinuity of β at the origin. Thus the zero duality gap property holds if and only if the perturbation function is lower semicontinuous at the origin.
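To see why lower semicontinuity of β at the origin cannot be dropped in Theorem 3.2, consider the following toy computation (an illustrative sketch with invented data, not taken from the paper): the feasible set consists of the single point x = 0, the objective drops immediately outside it, so β is not lower semicontinuous at 0 and every penalty parameter leaves a positive duality gap.

```python
import numpy as np

# illustrative data: X = [0, 1], f1(x) = x, and
# f0(0) = 2 while f0(x) = 1 for x > 0, so the jump sits exactly at the feasible point
def f0(x):
    return np.where(x == 0.0, 2.0, 1.0)

f1 = lambda x: x
p = lambda u, v: u + v            # classical convolution function from P_1

X = np.concatenate(([0.0], np.linspace(1e-6, 1.0, 100000)))
M = 2.0                           # optimal value: the only feasible point is x = 0

def q_p(d):
    # q_p(1, d) = inf_x [f0(x) + d * f1^+(x)]
    return (f0(X) + d * np.maximum(f1(X), 0.0)).min()

for d in (1.0, 10.0, 1000.0):
    print(d, q_p(d))              # stays close to 1, so sup_d q_p(1, d) = 1 < M = 2
```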
3.3 Exact penalization
Consider a problem P(f0, f1). Let p ∈ P1 and let Lp+(x, d0, d) be the nonlinear penalty function of P(f0, f1) corresponding to p. For the examination of exact penalization we need to study the associated function hqp of the dual function qp. First we shall express this function in terms of hrp, and then we shall use Proposition 3.2, which gives a convenient decomposition of the associated function hrp as the multiplicative inf-convolution of the function hp and βm. Note that the associated function hp depends only on p, and the modified perturbation function βm depends only on the problem P(f0, f1). It follows from (38) that

qp(d0, d) = min(s(d0, d), rp(d0, d)),   (d0, d) ∈ IR2++,   (40)

where s(d0, d) = d0 M. Clearly s is an IPH function. First we describe the function hs associated to s.

Proposition 3.3 We have hs(y) = M for all y > 0.
Proof: We have

supp(s) = {(l0, l) ∈ IR2++ : min(l0 d0, l d) ≤ M d0 for all (d0, d) ∈ IR2++}
        = {(l0, l) ∈ IR2++ : d0 min(l0, l d/d0) ≤ M d0 for all (d0, d) ∈ IR2++}
        = {(l0, l) ∈ IR2++ : min(l0, lu) ≤ M for all u > 0}
        = {(l0, l) ∈ IR2++ : l0 ≤ M}.

Hence hs(y) = sup{l0 : (l0, l) ∈ supp(s)} = sup{l0 : l0 ≤ M} = M. 4

Proposition 3.4 hqp(y) = min(M, hrp(y)), (y > 0).

Proof: It follows from Proposition 2.7, Proposition 3.3 and (40). 4

The proof of the following proposition is similar to that of Proposition 7.1 in [14] and we omit it.

Proposition 3.5 Let p ∈ P1 and let d > 0. Then rp(1, d) ≥ M if and only if hrp(z) ≥ M for all z ∈ (0, M/d].

Using Proposition 3.5, Proposition 3.4 and (38) we can establish the following statement.

Proposition 3.6 Let p ∈ P1 and let d > 0. Then qp(1, d) = M if and only if hqp(z) = M for all z ∈ (0, M/d].

Proof: Due to (38) and Proposition 3.4 we have, respectively,

qp(1, d) = min(M, rp(1, d)), (d > 0),   hqp(y) = min(M, hrp(y)), (y > 0).   (41)

Combining (41) with Proposition 3.5 we conclude that

qp(1, d) = M ⟺ rp(1, d) ≥ M ⟺ hqp(M/d) ≥ M ⟺ hqp(M/d) = M. 4
3.4 Nonlinear penalization by means of an IPH function p ∈ Pa
We now extend some results of this section to convolution functions p ∈ Pa with a > 0. Note that p ∈ Pa implies a^{-1} p ∈ P1. This leads to the following definitions. Let p ∈ Pa. We say that the zero duality gap property holds for a problem P(f0, f1) if sup_{d>0} qp(1, d) = aM.
A number d̄ > 0 is called an exact penalty parameter if qp(1, d̄) = aM. The function rp defined by (37) will be used in the study of penalization by means of p ∈ Pa. Using Proposition 2.8 we can obtain the following results.

Theorem 3.3 Let p ∈ Pa. Then sup_{z>0} hrp(z) = sup_{d>0} rp(1, d) = aMm.

Proof: Let p0 = a^{-1} p. Then

rp0(d0, d) = inf_{x∈X1} p0(d0 f0(x), d f1(x)) = (1/a) rp(d0, d).   (42)

It follows from Proposition 2.8 that hrp0(y) = (1/a) hrp(ay). Hence

sup_{y>0} hrp0(y) = a^{-1} sup_{y>0} hrp(y).   (43)

Due to Theorem 3.1 we have

sup_{d>0} rp0(1, d) = sup_{y>0} hrp0(y) = Mm.   (44)

Combining (42), (43) and (44) we obtain the result. 4

Proposition 3.7 Let p ∈ Pa. Then hqp(y) = min(aM, hrp(y)).

Proof: It follows from Proposition 3.4 and Proposition 2.8. 4

Proposition 3.8 Let p ∈ Pa with a > 0 and let d > 0. Then qp(1, d) = aM if and only if hqp(z) = aM for all z ∈ (0, aM/d].

Proof: Let p0 = a^{-1} p. Then qp0(d0, d) = (1/a) qp(d0, d), hence qp(1, d) = aM ⟺ qp0(1, d) = M. We also have, due to Proposition 2.8, hqp0(y) = (1/a) hqp(ay). Thus the result follows from Proposition 3.6. 4
4 The least exact penalty parameter
Consider a problem P(f0, f1) and a function p ∈ P such that Lp+ is an exact penalty function for P(f0, f1), that is, there exists a number d̄ > 0 such that qp(1, d̄) = M.

Definition 4.1 The number dp defined by

dp = inf{d > 0 : qp(1, d) = M}   (45)

is called the least exact penalty parameter of problem P(f0, f1) with respect to an IPH function p.
It is easy to check that the existence of an exact penalty parameter implies the existence of the least exact penalty parameter dp. Since Lp+ is an exact penalty function, it follows that the zero duality gap property holds, hence the perturbation function β is lower semicontinuous at the origin. It follows from (34) that

Mm ≥ M.   (46)

Consider a problem P(f0, f1) and the modified perturbation function βm. Let

y0 = inf{y > 0 : βm(y) < M}.   (47)
It is easy to see that the set {y > 0 : βm(y) < M} is non-empty, so y0 < +∞. Indeed, since M > γ, there exists x ∈ X1 such that f0(x) < M; then βm(y) < M for y := f1(x). Recall that (see (33)) β(y) = min(M, βm(y)). It follows from (47) that β(y) = βm(y) for y > y0. We shall study the least exact penalty parameter only for convolution functions p ∈ Pa,b, where a, b > 0, that is, continuous strictly IPH functions defined on IR2+ such that p(1, 0) = a, p(0, 1) = b. Let hp be the associated function of a function p ∈ Pa,b. It easily follows from Proposition 2.8 and Proposition 2.12 that limy→+∞ hp(y) = a. Due to Proposition 2.11 we have hp(y) = +∞ if y ≤ b and hp(y) < +∞ if y > b. Applying Proposition 2.9 and Proposition 2.10 we conclude that hp is continuous and strictly decreasing on (b, +∞) and limy→b hp(y) = +∞.

Lemma 4.1 Let z > y0. Then aM = hqp(z) if and only if aM ≤ β(y)hp(z/y) for y0 < y < z/b.

Proof: Due to Theorem 3.3 we have aMm ≥ hrp(z) for all z > 0. We also have, by applying Proposition 3.7, aM = min(aM, aMm) ≥ min(aM, hrp(z)) = hqp(z). Thus

aM = hqp(z) ⟺ aM ≤ hrp(z).   (48)
It follows from Proposition 3.2 that

hrp(z) = inf_{y>0} βm(y) hp(z/y).   (49)
Since dom hp = (b, +∞), we conclude that hp
z y
= +∞ for 0
by0 : ϕp (z) =
inf
y0 by0 such that ϕp (˜ z ) > by0 . Let z ∈ A and z 0 > z. Due to Lemma 4.2, z 0 ∈ A. Since ϕp is decreasing, we have ϕp (z 0 ) ≤ ϕp (z). Assume that ϕp (z 0 ) < ϕp (z).
(59)
We have ϕp (z 0 ) =
inf
y0 ϕp (z 0 ) = vp (˜ y . Since y˜ = min{y 0 ∈ [z/b, z 0 /b] : vp (y 0 ) = ϕp (z 0 )} y ). Let z < by < b˜ and y < y˜ it follows that vp (y) > ϕp (z 0 ) = vp (˜ y ). Thus vp (y) > vp (˜ y ) for all y < y˜, so ϕp (b˜ y) =
inf
y0 y0 . Then aM = hqp (z) ⇐⇒ z ≤ vp (y),
y ∈ (y0 , z/b).
(61)
Proof: Let y0 < y < z/b. Since hp is decreasing and finite on (b, +∞), it follows from the definition of vp : z ≤ vp (y) ⇐⇒
vp (y) z z ≤ ⇐⇒ hp y y y 20
≥ hp
vp (y) y
=
aM . β(y)
Thus z ≤ vp (y) for all y ∈ (y0 , z/b) if and only if aM ≤ β(y)hp (z/y) for all y ∈ (y0 , z/b). Hence (61) follows from Lemma 4.1. 4 Let p be an IPH function. We now examine the exact penalization by the penalty function Lp+ corresponding to p. Recall (see (57)) that lim inf y→y0 vp (y) ≥ by0 . In the sequel we consider separately two cases: lim inf y→y0 vp (y) > by0 and lim inf y→y0 vp (y) = by0 . Theorem 4.1 Let P (f0 , f1 ) be a problem with the continuous perturbation function β and let p ∈ Pa,b with a, b > 0. Let y0 be a point defined by (47). Then (i) An exact parameter exists if and only if lim inf y→y0 vp (y) > 0. Let an exact parameter exist and let dp be the least exact penalty parameter defined by (45). Then (ii) If lim inf y→y0 vp (y) = by0 then dp =
aM by0 .
(iii) If lim inf y→y0 vp (y) > by0 then dp = aM sup
y>y0
1 yhp−1
aM β(y)
.
(62)
Proof: (i) Applying Proposition 3.8 we conclude that the existence of an exact parameter is equivalent to the following: there exists z > 0 such that hqp (z) = aM . Due to Lemma 4.1, hqp (z) = aM if and only if: z ≤ yhp−1 (aM/β(y)) ≡ vp (y) for all y ∈ (0, z/b).
(63)
The existence of z > 0 with the property (63) is equivalent to lim inf y→y0 vp (y) > 0. Assume that the exact parameter exists. The proof of (ii) and (iii) is based on Proposition 3.8. It follows from this proposition that the dp = aM/zp , where dp is the least exact penalty parameter and zp is the largest number such that hqp (z) = aM . Hence for calculation of dp we need to calculate zp . (ii) Let lim inf y→y0 vp (y) = by0 . Existence of an exact penalty parameter implies y0 > 0. Let z ≤ by0 . Combining the inequality βm (y) ≥ M for y ≤ y0 and (50) we conclude that hrp (z) =
inf
0 by0 . Since lim inf y→y0 vp (y) = by0 it follows that for each z > by0 there exists y ∈ (y0 , z/b) such that vp (y) ≡ yhp−1 (aM/β(y)) < z. The last inequality is equivalent to 21
aM > β(y)hp (z/y). Applying (50) we conclude that aM > hrp (z). We have demonstrated that zp = by0 . Hence dp = aM/by0 . (iii) Let lim inf y→y0 vp (y) > by0 . Then there exists a number δ > 0 such that vp (y) ≥ by0 +δ if y0 < y < y0 + δ/b. Let zˆ = by0 + δ. Then ϕp (ˆ z ) = inf y0 y0 : z ≥ ϕp (z)}. Due to Proposition 4.1 the function ϕp is constant on Ap2 . Applying Lemma 4.2 we conclude that Ap2 = [zp , +∞). Let mp = inf vp (y). y>y0
For each ε > 0 we can find y 0 > y0 such that vp (y 0 ) ≤ mp + ε. For each z > by 0 we have mp ≤ ϕp (z) =
inf
y0 y0 vp (y) y>y0 yh−1 p β(y)
4
Strong exact penalty parameters
Consider a nonlinear penalty function Lp+ (x, 1, d) for a problem P (f0 , f1 ), corresponding to an IPH function p ∈ Pa,b . An exact penalty parameter d¯ for this function is called a strong exact penalty parameter if the solution set argmin x∈X0 P (f0 , f1 ) of the problem P (f0 , f1 ) coincides with the solution set argmin x∈X Pd of the unconstrained problem (Pd )
+ minimize L+ p (x, 1, d) ≡ p(f0 (x), df1 (x)) subject to x ∈ X.
(65)
We consider the most interesting case, when y0 = 0. The following statement has been proved by A. Rubinov and R. Gasimov (see [13] and references therein). Theorem 4.2 Let p ∈ Pa,b and P (f0 , f1 ) be a problem with continuous perturbation function β and such that y0 = 0, where y0 is the number defined by (47). Let the least exact penalty parameter dp exist. Then each d > dp is a strong exact penalty parameter. The following example demonstrates that not every least exact penalty parameter dp is necessarily strong. 22
Example 4.1 Let X coincide with the segment [−3, 3] ⊂ IR. Consider the problem P(f0, f1), where f1(x) = x and

f0(x) = −x + 10 if −3 ≤ x ≤ 0,   f0(x) = −x^2 + 10 if 0 ≤ x ≤ 3.

Let p(u, v) = u + v; then Lp+ coincides with the classical penalty function L+(x, d):

L+(x, d) = −x + 10 if −3 ≤ x ≤ 0,   L+(x, d) = −x^2 + dx + 10 if 0 ≤ x ≤ 3.
We have X0 = [−3, 0], X1 = (0, 3], M = 10, a = 1. An easy calculation shows that minx∈X L+ (x, d) < M for d < 3, minx∈X L+ (x, d) = M for d ≥ 3, hence dp = 3 is the least exact penalty parameter. Since argmin x∈X L+ (x, 3) = {0, 3} 6= {0} = argmin x∈X0 P (f, g), it follows that dp is not a strong parameter.
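A short numerical check of this example (a sketch written for this presentation, not part of the original text) confirms both computations: the least exact penalty parameter is dp = 3, it coincides with the value sup_{y>0} (M − β(y))/y given later by (77), and at d = 3 the unconstrained problem picks up the spurious minimizer x = 3.

```python
import numpy as np

X = np.linspace(-3.0, 3.0, 60001)
f0 = np.where(X <= 0, -X + 10.0, -X**2 + 10.0)
f1_plus = np.maximum(X, 0.0)
M = 10.0                                    # optimal value of P(f0, f1), attained at x = 0

def penalty_min(d):
    # classical penalty L+(x, d) = f0(x) + d * f1^+(x), minimized over the grid
    vals = f0 + d * f1_plus
    return vals.min(), X[np.isclose(vals, vals.min(), atol=1e-6)]

for d in (2.0, 2.9, 3.0, 3.5):
    m, argmins = penalty_min(d)
    print(d, round(m, 4), argmins.min(), argmins.max())
    # for d < 3 the minimum is below M; at d = 3 both x = 0 and x = 3 attain it

# least exact parameter via formula (77): d_{s1} = sup_{y>0} (M - beta(y))/y, beta(y) = 10 - y^2
y = np.linspace(1e-4, 3.0, 30000)
print(((M - (10.0 - y**2)) / y).max())      # equals 3, matching d_p = 3
```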
4.2 Examples
In this subsection we shall calculate the least exact penalty parameters with respect to some convolution functions p ∈ Pa,b . Example 4.2 Consider the convolution function
tk(y1, y2) = (a y1^k + b y2^k)^{1/k},
with a, b > 0. It is obvious that tk ∈ Pa1/k ,b1/k . Due to Proposition 2.2 we have
a b supp (tk , L) = (α, y) : k + k ≥ 1 = α y
(
ay k (α, y) : α ≤ k y −b k
)
,
(66)
By definition of the associated function, htk (y) = Hence
h−1 tk (y)
1
ak y
(
=
Consider a problem P (f0 , f1 ). We have vtk (y) = yh−1 tk
a1/k M β(y)
!
1 y k −b k
) +∞,
1
,
y > bk ,
, 1 k
0 bk .
aM (aM k − aβ k (y))
23
1/k
,
y > y0 .
(68)
We have lim inf y→+0 vtk (y) > 0 if and only if lim inf y→0
Mk
y > 0. − β k (y)1/k
(69)
Let y0 be a number defined by (47). If y0 > 0 then the exact penalty parameter exists. In this case (69) trivially holds. If y0 = 0 then the existence of an exact penalty parameter is equivalent to (69). Assume that an exact penalty parameter dtk exists. If y0 > 0 and lim inf y→y0 vp (y) = aM . Otherwise 0 then dp = 1/k b y0
b1/k y
dtk = sup
y>y0
1
aM k − aβ k (y) 1
bk y
k
.
(70)
By taking a = b = 1 in (67) and (70) we can obtain the expressions for the inverse associated function and for the least exact penalty parameter to the convolution function
sk (y1 , y2 ) = y1k + y2k By (67) and (68) we have h−1 sk (y) =
y (y k − 1)
1 k
, y > 1,
vsk (y) =
1
k
.
(M k
My , y > y0 − β k (y))1/k
(71)
.
(72)
Assume that y0 = 0. By (70) we have
dsk = sup y>0
Thus dt k =
1
M k − β k (y) y
1
a b
k
k
dsk .
(73)
Note also that by using (67) and (70) we can obtain expressions for inverse associated functions and for the least exact penalty parameter via the classical penalty function s1 (y1 , y2 ) = y1 + y2 . We have h−1 s1 (y) =
y , y > 1, y−1
vs 1 =
My , y > y0 . M − β(y)
(74)
(75)
Necessary and sufficient conditions for the existence of an exact penalty parameter have the following form: y lim inf > 0, (76) y→0 M − β(y) 24
It is easy to check that (76) is equivalent to the calmness of the perturbation function at the origin (see [2, 3]) and the obtained necessary and sufficient conditions coincide with well-known conditions based on calmness (see [2, 3]). If an exact penalty parameter exists then the least exact penalty parameter has the following form: ds1 = sup y>0
M − β (y) . y
(77)
Example 4.3 Let P (f, f1 ) be a problem such that y0 = 0, where y0 is the number defined by (47). Let c > 0 and u ≥ c. Consider the function p(u,c) defined as q
p(u,c) (y1 , y2 ) = c y12 + y22 + cy1 + uy2 This function is strictly IPH. We have
(78)
pu,c (0, 1) = c + u
p(u,c) (1, 0) = 2c,
(79)
so pu,c ∈ P2c,c+u . We denote the associated function corresponding to pu,c by h(u,c) . Then h(u,c) (y) = +∞ for y ≤ u + c and h(u,c) (y) =
2cy (y − u) for y > u + c. (y − u)2 − c2
(80)
The inverse of h(u,c) , is the function: h−1 (u,c) (z)
√ u (z − c) + c z 2 − 2cz + u2 , = z − 2c
z > 2c
(81)
By taking u = c in the last formula, we have: h−1 (c,c) (z) = Let c = c∗ := 1/2. Then the function p(c∗ ,c∗ ) (y1 , y2 ) =
2c (z − c) . z − 2c
(82)
1q 2 1 1 y1 + y22 + y1 + y2 2 2 2
−1 with h−1 belongs to P1,1 . It is interesting to compare h(c s1 . We have due to (82) and ∗ ,c∗ ) (77), respectively: z z − 1/2 −1 h(c , hs−1 . (83) (z) = (z) = 1 ∗ ,c∗ ) z−1 z−1
25
5 The least exact penalty parameters via different convolution functions
5.1 Comparison of exact penalty parameters
In this section we consider only problems P(f0, f1) for which the perturbation function β is continuous and β(y) < M for all y > 0. (The latter means that y0 = 0.) Denote the class of such problems by B. Theorem 4.1 demonstrates that the least exact penalty parameter for P(f0, f1) with respect to a function p ∈ Pa,b can be determined by means of the function hp−1. Hence different convolution functions can lead to exact penalty parameters which are close in a certain sense, if the inverse functions of the corresponding associated functions are close. It is interesting to compare the least exact penalty parameters for different IPH functions. Using Theorem 4.1 we can establish the following proposition.

Proposition 5.1 Let a, b1, b2 > 0 and let p1 ∈ Pa,b1, p2 ∈ Pa,b2 and p1 ≤ p2. Then dp1 ≥ dp2 for each P(f0, f1) ∈ B such that dp1 and dp2 exist.

The proof of Proposition 5.1 is based on the following assertion.

Lemma 5.1 Let h and g be decreasing functions defined on IR+. Let bh, bg and a be positive numbers such that
1) h(x) = +∞ for x ≤ bh and g(x) = +∞ for x ≤ bg;
2) the restriction of h to (bh, +∞) is a finite and strictly decreasing function mapping onto (a, +∞); the restriction of g to (bg, +∞) is a finite and strictly decreasing function mapping onto (a, +∞);
3) limx→bh+0 h(x) = +∞, limx→bg+0 g(x) = +∞.

Assume that h(x) ≥ g(x) for all x. Then h−1(y) ≥ g−1(y) for all y > a, where h−1 and g−1 are the functions inverse to h and g, restricted to (bh, +∞) and (bg, +∞), respectively.
Proof: Since h ≥ g it follows that bh ≥ bg. The function h−1 maps onto (bh, +∞) and the function g−1 maps onto (bg, +∞). Assume that there exists y > a such that x1 = h−1(y) < g−1(y) = x2. Then x2 > x1 > bh ≥ bg and we have y = h(x1) ≥ g(x1) > g(x2) = y, which is a contradiction. 4
Proof of Proposition 5.1. Since p1 ≤ p2 it follows (see Subsection 2.2) that hp1 ≤ hp2. Applying Lemma 5.1 we conclude that hp1−1 ≤ hp2−1. Due to Theorem 4.1 we have

dp1 = aM sup_{y>0} 1/( y hp1−1(aM/β(y)) )   and   dp2 = aM sup_{y>0} 1/( y hp2−1(aM/β(y)) ).   (84)

Thus dp1 ≥ dp2. 4
We now apply Proposition 5.1 to the examination of some increasing sublinear convolution functions. We need the following definitions. The set ∂ + p = {l ∈ IRn+ : [l, x] ≤ p(x) for all x ∈ IRn+ } is called the positive subdifferential of an increasing sublinear function p defined on the P cone IRn+ , where [l, x] = i li xi . The set ∂ + p(x) = {l ∈ ∂ + p : [l, x] = p(x)} is called n . It can be proved that ∂ + p(x) is not the positive subdifferential of p at a point x ∈ IR+ n empty for each x ∈ IR+ . To show this, consider the sublinear function p∗ defined on IRn by p∗ (x) = p(x+ ). Let ∂p∗ ≡ ∂p∗ (0) be a subdifferential of p∗ at zero. Then (see [10], Lemma 8.4), ∂ + p = ∂p∗ . It follows from this assertion that the positive subdifferential ∂ + p(x) = {l ∈ ∂ + p : [l, x] = p(x)} of p at a point x ∈ IRn+ coincides with the subdifferential ∂p∗ (x) = {l ∈ ∂p∗ : [l, x] = p(x)}, hence ∂ + p(x) is nonempty. We are interested in the positive subdifferential of p at the point x0 = (1, 0). It can happen that l2 = 0 for each (l1 , l2 ) ∈ ∂ + p(x0 ). For example, if p = s2 , then ∂s2 (1, 0) = {(1, 0)}. Assume that p(x) = αx2 + p˜(x), where p˜ is an increasing sublinear function and α > 0. Then l2 ≥ α > 0 for each (l1 , l2 ) ∈ ∂ + p(x0 ). Proposition 5.2 Let p ∈ Pa,b with a, b > 0 be a strictly increasing sublinear function such that the positive subdifferential ∂ + p(1, 0) contains a vector (l1 , l2 ) with l2 > 0. Then there exists Λ < +∞ with the following property: for each P (f0 , f1 ) ∈ B, such that there exist exact penalty parameters with respect to s1 and p, it holds: dp ≤ Λds1 . Here dp and ds1 are the least exact penalty parameters for P (f0 , f1 ) with respect to p and s1 , respectively. 2 by Proof: Let (l1 , l2 ) ∈ ∂ + p(1, 0) with l2 > 0. Consider the function t1 defined on IR+ t1 (y) = l1 y1 + l2 y. Since (l1 , l2 ) ∈ ∂ + p(1, 0) it follows that t1 (y) ≤ p(y) for all y ∈ IR2+ and l1 = t1 (1, 0) = p(1, 0) = a. Since t1 is strictly increasing it follows that t1 ∈ Pa,l2 . The 4 result now follows from Proposition 5.1 and Example 4.2.
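The effect described in Proposition 5.1 can also be seen numerically. The sketch below (an illustration under stated assumptions, not a computation from the paper) compares the least exact penalty parameters given by formula (62) for the classical function s1, with hs1−1(z) = z/(z − 1), and for the smaller function p(c∗,c∗) of Example 4.3, with h(c∗,c∗)−1(z) = (z − 1/2)/(z − 1) from (83); the perturbation function β(y) = 10 − y² of Example 4.1 serves as test data. Since p(c∗,c∗) ≤ s1 and both lie in P1,b for some b > 0, Proposition 5.1 predicts d_{p(c∗,c∗)} ≥ d_{s1}.

```python
import numpy as np

# perturbation function of Example 4.1 on (0, 3], and its optimal value M
y = np.linspace(1e-4, 3.0, 30000)
beta = 10.0 - y**2
M, a = 10.0, 1.0

# inverse associated functions of the two convolution functions (both have p(1, 0) = 1)
h_inv_s1 = lambda z: z / (z - 1.0)                 # classical penalty s1(u, v) = u + v
h_inv_pcc = lambda z: (z - 0.5) / (z - 1.0)        # p_(c*,c*) of Example 4.3, c* = 1/2

def least_exact(h_inv):
    # formula (62): d_p = aM * sup_{y > 0} 1 / (y * h_p^{-1}(aM / beta(y)))
    return a * M * (1.0 / (y * h_inv(a * M / beta))).max()

print(least_exact(h_inv_s1))    # = 3, the value found in Example 4.1
print(least_exact(h_inv_pcc))   # slightly larger, as Proposition 5.1 predicts
```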
5.2 Equivalence of penalization with different convolution functions
Definition 5.1 Let a, b1, b2 > 0 and let p1 ∈ Pa,b1, p2 ∈ Pa,b2. The function p1 is said to be equivalent to p2 if there exist numbers λ > 0 and Λ < +∞ with the following property: for each P(f0, f1) ∈ B such that exact penalty parameters with respect to p1 or p2 exist, it holds that λdp2 ≤ dp1 ≤ Λdp2. Here, as usual, dpi is the least exact penalty parameter for P(f0, f1) with respect to the function pi. We will denote the equivalence of p1 and p2 as p1 ∼ p2.
Clearly p1 ∼ p2 and p2 ∼ p3 imply p1 ∼ p3.

Theorem 5.1 Let p1 ∈ Pa,b1, p2 ∈ Pa,b2. Let

0 < inf_{z>0} hp2−1(z)/hp1−1(z) ≤ sup_{z>0} hp2−1(z)/hp1−1(z) < +∞.   (85)

Then the functions p1 and p2 are equivalent.

The proof easily follows from Theorem 4.1.

Consider two IPH functions p, q ∈ Pa,b. Assume that there exist positive numbers λ and Λ such that

λp(x) ≤ q(x) ≤ Λp(x).   (86)

The following example shows that (86) does not imply the equivalence of the functions p and q.
Example 5.1 Consider the IPH functions p = s1 and q = s1/2, where sk(u, v) = (u^k + v^k)^{1/k}. Clearly s1, s1/2 ∈ P1,1. It is known ([14, 12]) that s1 is not equivalent to s1/2. However, (86) holds for these functions with λ = min_{x: p(x)=1} q(x) > 0 and Λ = max_{x: p(x)=1} q(x) < +∞.

We now examine the equivalence of increasing sublinear convolution functions.

Proposition 5.3 Let a, b > 0 and let p ∈ Pa,b be a sublinear strictly increasing function. Then there exists a number λ > 0 such that

λds1 ≤ dp
(87)
for all P(f0, f1) ∈ B having exact penalty parameters with respect to p and s1.

Proof: Let t1(y) = ay1 + by2. Due to Example 4.2, it is sufficient to prove that λdt1 ≤ dp. Suppose to the contrary that for each λ > 0 there exists a problem P(f0, f1) ∈ B such that exact penalty parameters with respect to p and t1 exist and dp < λdt1. We can assume without loss of generality that λb < 1; then λbdt1 is not an exact penalty parameter. We have

aM = inf_{x∈X} p(f0(x), dp f1+(x)) ≤ inf_{x∈X} p(f0(x), λdt1 f1+(x)) ≤ inf_{x∈X} ( p(f0(x), 0) + p(0, λdt1 f1+(x)) ) = inf_{x∈X} (a f0(x) + λdt1 b f1+(x)) < inf_{x∈X} (a f0(x) + dt1 b f1+(x)) = aM,

where the first inequality is obtained due to the monotonicity of p and the second inequality follows from the sublinearity of p. Since λbdt1 is not an exact penalty parameter and dt1 is an exact penalty parameter for t1, we obtain the last inequality and the last equality, respectively. We get a contradiction, which shows the validity of the Proposition. 4
Remark 5.1 Let p be a function, which was considered in Proposition 5.3. It is easy to check that the existence of an exact penalty parameter with respect to p implies the existence of an exact penalty parameter with respect to s1 . Theorem 5.2 Let p ∈ Pa,b with a, b > 0 be a sublinear function, such that the positive subdifferential ∂ + p(1, 0) contains a vector (l1 , l2 ) with l2 > 0. Then p ∼ s1 . Proof: It follows from Proposition 5.2 and Proposition 5.3
4
Corollary 5.1 Let p1 : IR2+ → IR+ and p2 : IR2+ → IR+ be strictly increasing continuous IPH functions, such that pi (1, 0) > 0 and pi (0, 1) > 0 (i = 1, 2). Assume that ∂pi (1, 0) contains a vector (l1i , l2i ) with l2i > 0, i = 1, 2. Then p1 ∼ p2 . Indeed it follows from Theorem 5.2 and Remark 5.1.
References

[1] G. Birkhoff, Lattice theory, third ed., AMS Colloquium Publications, vol. 25, AMS, 1967.
[2] J.V. Burke, Calmness and exact penalization, SIAM J. Control and Optimization, Vol. 29, pp. 493-497, 1991.
[3] J.V. Burke, An exact penalization viewpoint of constrained optimization, SIAM J. Control and Optimization, Vol. 29, pp. 968-998, 1991.
[4] Giannessi, F. and Mastroeni, M., On the theory of vector optimization and variational inequalities. Image space analysis and separation, in: Vector Variational Inequalities and Vector Equilibria. Mathematical Theories, F. Giannessi, ed., Kluwer Academic Publishers, Dordrecht, pp. 153-215, 1999.
[5] Goh, C. J. and Yang, X. Q., A nonlinear Lagrangian theory for nonconvex optimization, J. Optimiz. Theory Appl., Vol. 109, pp. 99-121, 2001.
[6] Huang, X. X. and Yang, X. Q., Approximate optimal solutions and nonlinear Lagrangian functions, Journal of Global Optimization (accepted).
[7] D. Li, Zero duality gap for a class of nonconvex optimization problems, J. Optimization Theory and Appl., 85 (1995), 309-323.
[8] A.M. Kutateladze and A.M. Rubinov, Minkowski Duality and its Applications, Russian Math. Surveys, 27 (1972), 137-191.
[9] Luo, Z. Q., Pang, J. S. and Ralph, D., Mathematical Programs with Equilibrium Constraints, Cambridge University Press, New York, 1996.
[10] V.L. Makarov, M.J. Levin and A.M. Rubinov, Mathematical Economic Theory: Pure and Mixed Types of Economic Mechanisms, North-Holland, Amsterdam, 1995.
[11] D. Pallaschke and S. Rolewicz, Foundations of Mathematical Optimization, Kluwer Academic Press, 1997.
[12] A.M. Rubinov, Abstract Convexity and Global Optimization, Kluwer Academic Publishers, 2000.
[13] A.M. Rubinov, X.Q. Yang, A.M. Bagirov and R. Gasimov, Lagrange-type functions in constrained optimization, to appear.
[14] A.M. Rubinov, B.M. Glover and X.Q. Yang, Decreasing functions with application to penalization, SIAM J. Optimization, Vol. 10, pp. 289-313, 2000.
[15] A.M. Rubinov and A. Uderzo, On global optimality conditions via separation functions, J. Optimization Theory and Appl., 109 (2001), 345-370.
[16] A.M. Rubinov and X.Q. Yang, Penalty functions with a small penalty parameter: theory, submitted paper.
[17] A.M. Rubinov, X.X. Huang and X.Q. Yang, The zero duality gap property and lower semicontinuity of the perturbation function, submitted paper.
[18] I. Singer, Abstract Convex Analysis, A Wiley-Interscience Publication, New York, 1997.
[19] Wang, C. Y., Yang, X. Q. and Yang, X. M., Nonlinear Lagrange duality theorems and penalty function methods in continuous optimization, submitted paper.
[20] X.Q. Yang and X.X. Huang, A nonlinear Lagrangian approach to constrained optimization problems, SIAM J. Optimization (accepted).
[21] Yu. G. Yevtushenko and V. G. Zhadan, Exact auxiliary functions in optimization problems, U.S.S.R. Comput. Maths. Math. Phys., 30 (1990), pp. 31-42.