ERROR BOUNDS FOR VECTOR-VALUED FUNCTIONS: NECESSARY AND SUFFICIENT CONDITIONS
EWA M. BEDNARCZUK AND ALEXANDER Y. KRUGER
Abstract. In this paper, we attempt to extend the definition and existing local error bound criteria to vector-valued functions, or more generally, to functions taking values in a normed linear space. Some new derivative-like objects (slopes and subdifferentials) are introduced and a general classification scheme of error bound criteria is presented.
Keywords: variational analysis, error bounds, subdifferential, slope
Mathematics Subject Classification (2000): 49J52, 49J53, 58C06

1. Introduction
In variational analysis, the term "error bounds" usually refers to the following property. Given an (extended) real-valued function $f$ on a set $X$, consider its lower level set
(1.1) $S(f) := \{x \in X \mid f(x) \le 0\},$
the set of all solutions of the inequality $f(x) \le 0$. If a point $x$ is not a solution, that is, $f(x) > 0$, then it can be important to have an estimate of its distance from the set (1.1) (assuming that $X$ is a metric space) in terms of the value $f(x)$. If a linear estimate is possible, that is, there exists a constant $\gamma > 0$ such that
(1.2) $d(x, S(f)) \le \gamma f^+(x)$
for all $x \in X$, then $f$ possesses the (linear) error bound property, or the error bound property holds for $f$. Here the notation $f^+(x) = \max(f(x), 0)$ is used; hence (1.2) is satisfied automatically for all $x \in S(f)$. If $\bar{x} \in S(f)$ (usually it is assumed that $f(\bar{x}) = 0$) and (1.2) is required to hold (with some $\gamma > 0$) for all $x$ near $\bar{x}$, then we have the definition of the local (near $\bar{x}$) error bound property.

Error bounds play a key role in variational analysis. They are of great importance for optimality conditions, subdifferential calculus, stability and sensitivity issues, convergence of numerical methods, etc. For a summary of the theory of error bounds and its various applications to sensitivity analysis, convergence analysis of algorithms, and penalty function methods in mathematical programming, the reader is referred to the survey papers by Azé [2], Lewis & Pang [21], and Pang [29], as well as the book by Auslender & Teboulle [1]. Numerous characterizations of the error bound property have been established in terms of various derivative-like objects, either in the primal space (directional derivatives, slopes, etc.) or in the dual space (subdifferentials, normal cones) [3, 4, 7–15, 17, 19, 20, 24–30, 34–36].

In the present paper, we attempt to extend (1.2) as well as local error bound criteria to vector-valued functions $f$, defined on a metric space $X$ and taking values in a normed linear space $Y$. The presentation, terminology and notation follow those of [11]. Some new derivative-like objects (slopes and subdifferentials), which can be of independent interest, are introduced, and a general classification scheme of error bound criteria is presented. It is illustrated in Figs. 3–8.

The research of the second author was partially supported by the Australian Research Council, grant DP110102011.
The plan of the paper is as follows. In Section 2, we introduce an abstract ordering operation and define an extension of (1.1) and a nonnegative real-valued function $f_y^+$ whose role is to replace $f^+$ in (1.2) in the case $f$ takes values in a normed linear space. Note that, unlike the scalar case, an additional parameter $y$ is required now. The function $f_y^+$ can be viewed as a scalarizing function for the vector minimization problem defined by $f$. In Sections 3 and 4, various kinds of slopes, directional derivatives and subdifferentials for a vector-valued function $f$ are defined via the scalarizing function $f_y^+$. In Section 3, we discuss primal space error bound criteria in terms of slopes. Section 4 is devoted to criteria in terms of directional derivatives and subdifferentials. In the final Section 5, three special cases are considered: error bounds when either the image or the pre-image is finite dimensional, and the convex case.

Our basic notation is standard; see [23, 31]. Depending on the context, $X$ is either a metric or a normed space; $Y$ is always a normed space. If $X$ is a normed space, then its topological dual is denoted $X^*$, while $\langle \cdot, \cdot \rangle$ denotes the bilinear form defining the pairing between the two spaces. The closed unit balls in a normed space and its dual are denoted $B$ and $B^*$ respectively, and $B_\delta(x)$ denotes the closed ball with radius $\delta$ and center $x$. If $A$ is a set in a metric or normed space, then $\operatorname{cl} A$, $\operatorname{int} A$, and $\operatorname{bd} A$ denote its closure, interior, and boundary respectively; $d(x, A) = \inf_{a \in A} \|x - a\|$ is the point-to-set distance. We also use the notation $\alpha^+ = \max(\alpha, 0)$.
2. Minimality and error bounds
In this section, an ordering operation in a normed linear space is discussed and the error bound property is defined.
2.1. Minimality. Let $Y$ be a normed linear space. Suppose that for each $y \in Y$ a subset $V_y \subset Y$ is given with the property $y \in V_y$. We are going to consider an abstract "order" operation in $Y$ defined by the collection of sets $\{V_y\}$: we say that $v$ is dominated by $y$ in $Y$ if $v \in V_y$. Of course, this operation does not in general possess typical order properties. It can become more natural, for example, if a closed cone $C \subset Y$ is given and $V_y$ is defined as one of the following (for (2.3), $C$ must be assumed pointed):
(2.1) $\{v \in Y \mid y - v \in C\}$,
(2.2) $\{v \in Y \mid v - y \notin C\} \cup \{y\}$,
(2.3) $\{v \in Y \mid v - y \notin \operatorname{int} C\}$.
Now let $X$ be a metric space and $f : X \to Y$. Denote
$S_y(f) := \{u \in X \mid f(u) \in V_y\},$
the $y$-sublevel set of $f$ (with respect to the order defined by the collection of sets $\{V_y\}$). We will also use the following nonnegative real-valued function:
(2.4) $f_y^+(u) := d(f(u), V_y), \quad u \in X.$
If $V_y$ is closed, then $f_y^+(u) = 0$ if and only if $u \in S_y(f)$. We say that $x$ is a local $V_y$-minimal point of $f$ if
(2.5) $f_y^+(x) \le f_y^+(u)$ for all $u$ near $x$.
The definition depends on the choice of $y$. Condition (2.5) is obviously satisfied for any $x \in S_y(f)$.
If $V_y$ is given by (2.1), i.e., $V_y = y - C$, then $f_y^+(u) = d(y - f(u), C)$, and the function $f_y^+$ is nondecreasing in the sense that, for any $x_1, x_2 \in X$ such that $f(x_2) - f(x_1) \in C$, it holds $f_y^+(x_1) \le f_y^+(x_2)$. Indeed,
$f_y^+(x_1) = d(y - f(x_1), C) \le d(y - f(x_2), C) + d(f(x_2) - f(x_1), C) = d(y - f(x_2), C) = f_y^+(x_2).$
By this property, if $C$ is a pointed cone and $x$ is a strict local $V_y$-minimal point of $f$, i.e., $f_y^+(x) < f_y^+(u)$ for all $u$ near $x$, $u \ne x$, then $x$ is locally minimal with respect to the ordering relation generated by $C$, i.e., there exists a neighborhood $U_x$ of $x$ such that
(2.6) $(f(U_x) - f(x)) \cap (-C) = \{0\}.$
If $Y = \mathbb{R}$, we will always assume the natural ordering: $V_y := \{v \in \mathbb{R} \mid v \le y\}$. This corresponds to setting $C = \mathbb{R}_+$ in any of the definitions (2.1)–(2.3). We will also consider the usual distance in $\mathbb{R}$: $d(v_1, v_2) = |v_1 - v_2|$. Then
$S_y(f) = \{u \in X \mid f(u) \le y\}, \qquad f_y^+(u) = (f(u) - y)^+.$
If $y < f(x)$, then (2.5) means that $x$ is a point of local minimum of $f$ in the usual sense and does not depend on $y$ (as long as $y < f(x)$). If $y \ge f(x)$, then $x$ is automatically a $V_y$-minimal point of $f$.

If $Y = \mathbb{R}^n$, $y = (y_1, y_2, \ldots, y_n) \in \mathbb{R}^n$, $f = (f_1, f_2, \ldots, f_n) : X \to \mathbb{R}^n$, and $V_y = \{(v_1, v_2, \ldots, v_n) \in \mathbb{R}^n \mid v_i \le y_i,\ i = 1, 2, \ldots, n\}$, then
(2.7) $S_y(f) = \{u \in X \mid f_i(u) \le y_i,\ i = 1, 2, \ldots, n\}$
and, for any norm $\|\cdot\|$ in $\mathbb{R}^n$, we have
(2.8) $f_y^+(u) = \left\| \left( (f_1(u) - y_1)^+, (f_2(u) - y_2)^+, \ldots, (f_n(u) - y_n)^+ \right) \right\|.$
The local minimality condition (2.5) is satisfied if
(2.9) $f_i(x) \le f_i(u)$ for all $i$ such that $f_i(x) > y_i$ and all $u$ near $x$.
The opposite statement is not true in general.

Example 1. Let $f : \mathbb{R} \to \mathbb{R}^2$ be defined as $f(u) = (u, 1 - u)$. Then the range of $f$ is the set $\{(v_1, v_2) \in \mathbb{R}^2 \mid v_1 + v_2 = 1\}$. Let $y = 0 \in \mathbb{R}^2$ and $V_0 = \mathbb{R}^2_- := \{(v_1, v_2) \in \mathbb{R}^2 \mid v_1 \le 0,\ v_2 \le 0\}$. Suppose that $\mathbb{R}^2$ is equipped with the norm $\|(v_1, v_2)\| = |v_1| + |v_2|$. Then
$f_0^+(u) = \begin{cases} 1 - u, & \text{if } u \le 0, \\ 1, & \text{if } 0 < u \le 1, \\ u, & \text{if } u > 1. \end{cases}$
Hence $f_0^+$ attains its minimum value $1$ on the set $[0, 1]$. At the same time, for any $x \in [0, 1]$, condition (2.9) is violated.

Consider, for instance, the max-type norm on $\mathbb{R}^n$: $\|y\| = \max_i |y_i|$. Then
$f_y^+(u) = \max_{1 \le i \le n} (f_i(u) - y_i)^+ = \left( \max_{1 \le i \le n} (f_i(u) - y_i) \right)^+.$
The local minimality in the sense of (2.6) means that there is no $u \in U_x$ such that $f_i(u) \le f_i(x)$ for all $1 \le i \le n$ and $f(u) \ne f(x)$. In other words, for every $u \in U_x$, either $f(u) = f(x)$ or there exists an index $i$, $1 \le i \le n$, such that $f_i(u) > f_i(x)$.
Hence, for any $y \in \mathbb{R}^n$ and all $u \in U_x$, we have either $f(u) = f(x)$ or $f_y^+(x) < f_y^+(u)$. In consequence, for any $y \in \mathbb{R}^n$,
$f_y^+(x) \le f_y^+(u)$ for all $u$ close to $x$,
which shows that $x$ is a local $V_y$-minimal point of $f$.
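The scalarization (2.8) and the computation of $f_0^+$ in Example 1 are easy to check numerically. A minimal sketch (the helper names are ours, not from the paper); for the $\ell_1$ norm, the distance from a vector to the nonpositive orthant $\mathbb{R}^n_-$ is simply the sum of the positive parts of its components:

```python
def f_y_plus(f_u, y, norm=sum):
    """Scalarizing function (2.8) for the componentwise order in R^n:
    a norm of the vector of positive parts (f_i(u) - y_i)^+.
    With norm=sum this is the l1 norm (the entries are nonnegative);
    norm=max gives the max-type norm variant discussed in the text."""
    return norm(max(fi - yi, 0.0) for fi, yi in zip(f_u, y))

def f0_plus(u):
    """d(f(u), R^2_-) in the l1 norm for f(u) = (u, 1 - u) of Example 1."""
    return f_y_plus((u, 1.0 - u), (0.0, 0.0))

# Matches the piecewise formula for f_0^+ in Example 1:
assert f0_plus(-2.0) == 3.0   # u <= 0:     1 - u
assert f0_plus(0.25) == 1.0   # 0 < u <= 1: constant 1
assert f0_plus(3.0) == 3.0    # u > 1:      u
```

In particular, the minimum value $1$ on $[0, 1]$ and the vanishing of $f_y^+$ exactly on the sublevel set $S_y(f)$ are visible directly from the assertions.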
2.2. Error bounds. Let $\bar{x} \in X$ and $\bar{y} := f(\bar{x})$. Consider the $\bar{y}$-sublevel set $S_{\bar{y}}(f)$ and the function $f_{\bar{y}}^+$. We say that $f$ satisfies the (local) error bound property at $\bar{x}$ if there exist $\gamma > 0$ and $\delta > 0$ such that
(2.10) $d(x, S_{\bar{y}}(f)) \le \gamma f_{\bar{y}}^+(x)$ for all $x \in B_\delta(\bar{x})$.
If $\bar{x}$ is a locally minimal solution in the sense of (2.6), then (2.10) means that $\bar{x}$ is a local weak sharp solution [5, Definition 8.2.3] to $f$. The error bound property can be equivalently defined in terms of the error bound modulus of $f$ at $\bar{x}$:
(2.11) $\operatorname{Er} f(\bar{x}) := \liminf_{x \to \bar{x}} \left[ \frac{f_{\bar{y}}^+(x)}{d(x, S_{\bar{y}}(f))} \right]_\infty;$
namely, the error bound property holds for $f$ at $\bar{x}$ if and only if $\operatorname{Er} f(\bar{x}) > 0$. The notation $[\cdot/\cdot]_\infty$ in (2.11) stands for the extended division operation, which differs from the conventional one by the additional property $[0/0]_\infty = \infty$. This allows one to incorporate implicitly the requirement $x \notin \operatorname{cl} S_{\bar{y}}(f)$ in definition (2.11). The error bound property can be characterized in terms of certain derivative-like objects.
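In the scalar case, the modulus (2.11) can be estimated by sampling points near $\bar{x}$. A crude grid-based sketch under our own naming (illustrative only, not from the paper), for $f(x) = 2x$ at $\bar{x} = 0$, where $\operatorname{Er} f(0) = 2$:

```python
def er_modulus_estimate(f, xbar=0.0, radius=1e-3, samples=2001):
    """Crude grid estimate of (2.11) for f: R -> R with the natural order:
    the infimum of f(x)^+ / d(x, S(f)) over sampled x near xbar with f(x) > 0.
    Here S(f) = {x : f(x) <= 0} is itself approximated on the same grid."""
    xs = [xbar + radius * (2 * k / (samples - 1) - 1) for k in range(samples)]
    level = [x for x in xs if f(x) <= 0]          # grid approximation of S(f)
    ratios = []
    for x in xs:
        fx = f(x)
        if fx > 0 and level:
            d = min(abs(x - s) for s in level)    # distance to the sampled level set
            if d > 0:
                ratios.append(fx / d)
    return min(ratios) if ratios else float("inf")

print(er_modulus_estimate(lambda x: 2 * x))  # 2.0, the error bound modulus at 0
```

Consistently with the text, a positive estimate signals the error bound property, and any $\gamma > 1/\operatorname{Er} f(\bar{x})$ works locally in (1.2).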
3. Criteria in terms of slopes
In this section, several primal space error bound criteria in terms of various kinds of slopes are established. The slopes are defined via the scalarizing function $f_y^+$. A different approach, defining slopes directly in terms of the original vector function and then deducing error bound criteria based on the application of the vector variational principle [6], will be considered elsewhere.
3.1. Slopes. Recall that in the case $f : X \to \mathbb{R}_\infty$, the slope of $f$ at $x \in X$ (with $f(x) < \infty$) is defined as (see, for instance, [16])
(3.1) $|\nabla f|(x) := \limsup_{u \to x} \frac{(f(x) - f(u))^+}{d(u, x)}.$
In our general setting $f : X \to Y$, we define the slope of $f$ at $x \in X$ relative to $y \in Y$ as
(3.2) $|\nabla f|_y(x) := |\nabla f_y^+|(x) = \limsup_{u \to x} \frac{(f_y^+(x) - f_y^+(u))^+}{d(u, x)}.$
Equivalently,
$|\nabla f|_y(x) = \begin{cases} \limsup\limits_{u \to x} \dfrac{f_y^+(x) - f_y^+(u)}{d(u, x)}, & \text{if } x \text{ is not a local } V_y\text{-minimal point of } f, \\ 0, & \text{otherwise.} \end{cases}$
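In the scalar case, the slope (3.1) is straightforward to approximate by sampling $u$ near $x$; a crude numerical sketch with our own helper name (illustrative only):

```python
def slope(f, x, radius=1e-6, samples=2001):
    """Crude estimate of the slope (3.1): the supremum of
    (f(x) - f(u))^+ / |u - x| over sampled u near x."""
    best = 0.0
    for k in range(samples):
        u = x + radius * (2 * k / (samples - 1) - 1)
        if u != x:
            best = max(best, max(f(x) - f(u), 0.0) / abs(u - x))
    return best

print(slope(abs, 0.0))  # 0.0: 0 is a minimum point, so the slope vanishes
print(slope(abs, 1.0))  # 1.0: unit descent rate of |.| at x = 1
```

The two calls illustrate the dichotomy in the equivalent representation above: the slope is zero at a (local) minimal point and measures the descent rate otherwise.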
If $V_y$ is given by (2.1) with $C$ a convex cone, then $c + C \subset C$ for any $c \in C$, and consequently
$f_y^+(x) - f_y^+(u) = \sup_{c \in C} \left[ d(0, f(x) - y + C) - d(0, f(u) - y + c) \right] \le d(f(u) - f(x), C).$
Hence $|\nabla f|_y(x) \le |\nabla f|_y^1(x)$, where
(3.3) $|\nabla f|_y^1(x) := \begin{cases} \limsup\limits_{u \to x} \dfrac{d(f(u) - f(x), C)}{d(u, x)}, & \text{if } x \text{ is not a local } y\text{-minimal point of } f, \\ 0, & \text{otherwise.} \end{cases}$
If $C$ is a pointed cone, then the upper limit in the above formula admits the following equivalent representation:
$\limsup_{u \to x} \frac{d(f(u) - f(x), C)}{d(u, x)} = \inf_{\varepsilon > 0} \inf_{r > 0} \left\{ r \,\middle|\, \frac{d(f(u) - f(x), C)}{d(u, x)} < r \text{ for all } u \in B_\varepsilon(x) \setminus \{x\} \right\}$
$= \inf_{\varepsilon > 0} \inf_{r > 0} \left\{ r \mid \forall u \in B_\varepsilon(x) \setminus \{x\}\ \exists v \in B_Y \text{ such that } r\,d(u, x)\,v \notin f(x) - f(u) - C \right\}.$
If, additionally, $\operatorname{int} C \ne \emptyset$, then $B_Y \subset e + C$ for some $e \in -\operatorname{int} C$, and the latter formula gives
$\limsup_{u \to x} \frac{d(f(u) - f(x), C)}{d(u, x)} \le \inf_{\varepsilon > 0} \inf_{r > 0} \left\{ r \mid r\,d(u, x)\,e \notin f(x) - f(u) - C \text{ for all } u \in B_\varepsilon(x) \setminus \{x\} \right\}.$
In view of this, we have $|\nabla f|_y^1(x) \le |\nabla f|_y^1(x, e)$, where
$|\nabla f|_y^1(x, e) := \begin{cases} \inf\limits_{\varepsilon > 0} \inf\limits_{r > 0} \left\{ r \mid r\,d(u, x)\,e \notin f(x) - f(u) - C \text{ for all } u \in B_\varepsilon(x) \setminus \{x\} \right\}, & \text{if } x \text{ is not a local } y\text{-minimal point of } f, \\ 0, & \text{otherwise.} \end{cases}$

There are obvious similarities between definitions (3.1) and (3.2). Note also an important difference: the latter depends on an additional parameter $y \in Y$. If $Y = \mathbb{R}$ and $y < f(x)$, then the dependence on $y$ disappears and (3.2) reduces to (3.1).

Proposition 3.1. Let $Y = \mathbb{R}$. Then
$|\nabla f|_y(x) = \begin{cases} |\nabla f|(x), & \text{if } y < f(x), \\ 0, & \text{otherwise.} \end{cases}$
Proof. Under the assumptions, $f_y^+(u) = (f(u) - y)^+$. In the case $y < f(x)$, one obviously has $f_y^+(x) = f(x) - y$. If $f$ is lower semicontinuous at $x$, then also $f_y^+(u) = f(u) - y$ for all $u \in X$ near $x$. Hence $f_y^+(x) - f_y^+(u) = f(x) - f(u)$, and consequently $|\nabla f_y^+|(x) = |\nabla f|(x)$. If $f$ is not lower semicontinuous at $x$, then there exists a sequence $x_k \to x$ such that $f(x_k) \to \alpha < f(x)$. Then, by definition (3.1), $|\nabla f|(x) = \infty$. At the same time, $f_y^+(x_k) \to (\alpha - y)^+ < f_y^+(x)$. Hence $f_y^+$ is not lower semicontinuous at $x$ either, and $|\nabla f_y^+|(x) = \infty$. In the case $y \ge f(x)$, one has $f_y^+(x) = 0$. It follows that $x$ is a point of minimum of $f_y^+$, and consequently $|\nabla f_y^+|(x) = 0$. □

It is easy to check that in the case $Y = \mathbb{R}$ and $y < f(x)$, (3.3) also reduces to (3.1). It always holds $|\nabla f|_y(x) \ge 0$, while the equality $|\nabla f|_y(x) = 0$ means that $x$ is a stationary point of $f_y^+$. If the slope $|\nabla f|_y(x)$ is strictly positive, it characterizes quantitatively the descent rate of $f_y^+$ at $x$. If $V_y$ is given by (2.1), then obviously $|\nabla f|_y^1(x) \ge 0$. Moreover, if $C$ is a closed convex pointed cone with nonempty interior, $f$ is directionally differentiable at $x$, and $x$ is not its local $V_y$-minimal point, then the equality $|\nabla f|_y^1(x) = 0$ means that $x$ is a stationary point of $f$ in the sense of Smale [32, 33], i.e., for any direction $p \in X$ we have $f'(x; p) \notin -\operatorname{int} C$. In this context the following proposition holds.

Proposition 3.2. Let $C$ be a closed pointed cone with $\operatorname{int} C \ne \emptyset$, and let $f : X \to Y$ be directionally differentiable at $x \in X$. If $x$ is not a local $y$-minimal point of $f$, then
$f'(x; p) + \|p\|\, |\nabla f|_y^1(x)\, e \notin -\operatorname{int} C$ for all $p \in X$,
where $e \in C$ and $f'(x; p)$ is the directional derivative of $f$ at $x$ in the direction $p$.

Proof. Suppose, on the contrary, that $f'(x; p) + |\nabla f|_y^1(x)\, e + r B_Y \subset -C$ for some $p \in X$, $\|p\| = 1$, and $r > 0$. Hence, there exists an $\varepsilon > 0$ such that
$f(x + tp) - f(x) + t\left(|\nabla f|_y^1(x) + r\kappa/2\right) e \in -C$ for all $t \in (0, \varepsilon)$,
where $\kappa > 0$ and $\kappa e \in B_Y$. Consequently,
$f(u) - f(x) + t\left(|\nabla f|_y^1(x) + r\kappa/2\right) e \in -C$ for some $u \in B_\varepsilon(x)$,
contradicting the definition of $|\nabla f|_y^1(x)$. □
3.2. Strict slopes. Given a fixed point $\bar{x} \in X$ (with $\bar{y} = f(\bar{x})$), one can use the collection of slopes (3.2) computed at nearby points to define a more robust object, the strict outer slope of $f$ at $\bar{x}$:
(3.4) $|\nabla f|^>(\bar{x}) := \liminf_{x \to \bar{x},\ f_{\bar{y}}^+(x) \downarrow 0} |\nabla f|_{\bar{y}}(x).$
The word "strict" reflects the fact that slopes at nearby points contribute to definition (3.4), making it an analogue of the strict derivative. The word "outer" is used to emphasize that only points outside the set $S_{\bar{y}}(f)$ are taken into account. If $Y = \mathbb{R}$, then, due to Proposition 3.1, definition (3.4) takes the form
(3.5) $|\nabla f|^>(\bar{x}) = \liminf_{x \to \bar{x},\ f(x) \downarrow f(\bar{x})} |\nabla f|(x)$
and coincides with the strict outer slope defined in [11]. On the other hand, one can apply (3.5) to the scalar function (2.4) (with $y = \bar{y}$). This leads to an equivalent representation of (3.4): $|\nabla f|^>(\bar{x}) = |\nabla f_{\bar{y}}^+|^>(\bar{x})$.
The last constant provides the exact lower estimate of the "uniform" descent rate of $f_{\bar{y}}^+$ near $\bar{x}$. The strict outer slope (3.4) is the limit of usual slopes $|\nabla f|_{\bar{y}}(x)$, which are themselves limits and do not take into account how close to $\bar{x}$ the point $x$ is. This can be important when characterizing error bounds. In view of this observation, the next definition can be of interest, the uniform strict slope of $f$ at $\bar{x}$:
(3.6) $|\nabla f|^\diamond(\bar{x}) := \liminf_{x \to \bar{x},\ f_{\bar{y}}^+(x) \downarrow 0}\ \sup_{u \ne x} \frac{(f_{\bar{y}}^+(x) - f_{\bar{y}}^+(u))^+}{d(u, x)}.$
It is easy to check that (3.6) coincides with the middle uniform strict slope [11] of $f_{\bar{y}}^+$ at $\bar{x}$. The following representation is straightforward:
$|\nabla f|^\diamond(\bar{x}) = \liminf_{x \to \bar{x},\ f_{\bar{y}}^+(x) \downarrow 0} \max\left[ \sup_{0 < d(u,x) < d(x, S_{\bar{y}}(f))} \frac{(f_{\bar{y}}^+(x) - f_{\bar{y}}^+(u))^+}{d(u, x)},\ \frac{f_{\bar{y}}^+(x)}{d(x, S_{\bar{y}}(f))} \right].$

Example 2. For a point $x = (x_1, x_2)$ with $x_1 > 0$ and $x_2 > 0$, one has $d(x, S_0(f)) = \min(x_1, x_2)$. Obviously,
$\sup_{0 < d(u,x) < d(x, S_0(f))} \frac{(f(x) - f(u))^+}{d(u, x)} = |\nabla f|(x) = \sup_{\|v\| \le 1} (v_1 + v_2) = \sqrt{2}.$

3.3. The main theorem. The uniform strict slope (3.6) provides a precise characterization of the error bound property.

Theorem 3.3. Let $X$ be a complete metric space and let $f_{\bar{y}}^+$ be lower semicontinuous. Then $\operatorname{Er} f(\bar{x}) = |\nabla f|^\diamond(\bar{x})$.

Proof. Let $\gamma < \operatorname{Er} f(\bar{x})$. Then there is a $\delta > 0$ such that
(3.10) $\frac{f_{\bar{y}}^+(x)}{d(x, S_{\bar{y}}(f))} > \gamma$
for any $x \in B_\delta(\bar{x}) \setminus S_{\bar{y}}(f)$. Take any $x_k \in B_{1/k}(\bar{x})$ with $0 < f_{\bar{y}}^+(x_k) \le 1/k$. Then, by (3.10), for any $k > \delta^{-1}$, one can find a $w_k \in S_{\bar{y}}(f)$ such that
$\frac{f_{\bar{y}}^+(x_k) - f_{\bar{y}}^+(w_k)}{d(x_k, w_k)} = \frac{f_{\bar{y}}^+(x_k)}{d(x_k, w_k)} > \gamma.$
It follows from definition (3.6) that $|\nabla f|^\diamond(\bar{x}) \ge \gamma$.

Let $\gamma > \operatorname{Er} f(\bar{x})$. Then for any $\delta > 0$ there is an $x \in B_{\delta \min(1/2, \gamma^{-1})}(\bar{x})$ such that
$0 < f_{\bar{y}}^+(x) < \gamma\, d(x, S_{\bar{y}}(f)).$
Applying to $f_{\bar{y}}^+$ the Ekeland variational principle with $\varepsilon = f_{\bar{y}}^+(x)$ and an arbitrary $\lambda \in (\gamma^{-1}\varepsilon, d(x, S_{\bar{y}}(f)))$, one can find a $w$ such that $f_{\bar{y}}^+(w) \le f_{\bar{y}}^+(x)$, $d(w, x) \le \lambda$, and
(3.11) $f_{\bar{y}}^+(u) + (\varepsilon/\lambda)\, d(u, w) \ge f_{\bar{y}}^+(w)$ for all $u \in X$.
Obviously, $d(w, \bar{x}) \le 2 d(x, \bar{x}) \le \delta$ and $f_{\bar{y}}^+(w) < \gamma\, d(x, S_{\bar{y}}(f)) \le \delta$. Besides, $f(w) \notin V_{\bar{y}}$, since $d(w, x) \le \lambda < d(x, S_{\bar{y}}(f))$. It follows from (3.11) that
$\frac{f_{\bar{y}}^+(w) - f_{\bar{y}}^+(u)}{d(u, w)} \le \varepsilon/\lambda < \gamma$ for all $u \in X$, $u \ne w$.
This implies the inequality $|\nabla f|^\diamond(\bar{x}) \le \gamma$, and consequently $|\nabla f|^\diamond(\bar{x}) \le \operatorname{Er} f(\bar{x})$. □
Remark 1. The assumptions that $X$ is complete and $f_{\bar{y}}^+$ is lower semicontinuous in Theorem 3.3 were used in the proof of the inequality $|\nabla f|^\diamond(\bar{x}) \le \operatorname{Er} f(\bar{x})$. The opposite inequality $\operatorname{Er} f(\bar{x}) \le |\nabla f|^\diamond(\bar{x})$ is always true and can be strict, for instance, if the assumption of lower semicontinuity is dropped.

Example 3 ([11]). Let $f : \mathbb{R} \to \mathbb{R}$ be defined as follows (Fig. 1):
$f(x) = \begin{cases} -3x, & \text{if } x \le 0, \\ 3x - \dfrac{1}{2^i}, & \text{if } \dfrac{1}{2^{i+1}} < x \le \dfrac{1}{2^i},\ i = 0, 1, \ldots, \\ 2x, & \text{if } x > 1. \end{cases}$
Obviously, $\operatorname{Er} f(0) = 1$ while $|\nabla f|^\diamond(0) = 3$.

Figure 1. Example 3
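The value $\operatorname{Er} f(0) = 1$ in Example 3 can be confirmed numerically: on $(0, 1]$ one has $S_0(f) = (-\infty, 0]$, so $d(x, S_0(f)) = x$, and the infimum of $f(x)/x$ is approached just above the left interval ends $2^{-(i+1)}$. A sketch (our transcription of the piecewise formula):

```python
def f(x):
    """The sawtooth function of Example 3 (our transcription)."""
    if x <= 0:
        return -3.0 * x
    if x > 1:
        return 2.0 * x
    i = 0
    while not (0.5 ** (i + 1) < x <= 0.5 ** i):  # find the interval (2^-(i+1), 2^-i]
        i += 1
    return 3.0 * x - 0.5 ** i

# Sample f(x)/x just above the left interval ends 2^{-(i+1)}:
ratios = [f(x) / x for x in (0.5 ** (i + 1) * 1.001 for i in range(3, 20))]
print(min(ratios))  # slightly above 1, consistent with Er f(0) = 1
```

The local slope of $3$ on each linear piece is what makes $|\nabla f|^\diamond(0) = 3$ exceed the modulus when lower semicontinuity fails.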
Remark 2. When Y = R, Theorem 3.3 improves [11, Theorem 2] and goes in line with [18, Corollary 4.3] by Kummer.
3.4. Criteria in terms of slopes. Under the assumptions of Theorem 3.3, the constants (3.4), (3.6), and (3.8) produce criteria for the error bound property of $f$ at $\bar{x}$.
UC1. $|\nabla f|^\diamond(\bar{x}) > 0$.

UC2. $\overline{|\nabla f|}{}^\diamond(\bar{x}) > 0$.

C1. $|\nabla f|^>(\bar{x}) > 0$.

(Uniform) criteria UC1 and UC2 are sufficient, while criterion C1 is necessary and sufficient. Due to (3.7) and (3.9), it holds C1 ⇒ UC2 ⇒ UC1. Another sufficient criterion,
C2. $|\nabla f|_0(\bar{x}) > 0$,

can be formulated using the next nonnegative constant:
(3.12) $|\nabla f|_0(\bar{x}) := \liminf_{x \to \bar{x}} \frac{f_{\bar{y}}^+(x)}{d(x, \bar{x})}.$
Comparing this definition with (3.2), one can easily establish the next relation:
$|\nabla f|_{\bar{y}}(\bar{x}) = \left(-|\nabla f|_0(\bar{x})\right)^+,$
which shows that when $|\nabla f|_0(\bar{x}) = 0$, this constant does not provide any new information. At the same time, when $|\nabla f|_0(\bar{x}) > 0$, the other constant (3.2) equals $0$ and cannot be used for the characterization of error bounds, while criterion C2 involving (3.12) can be useful.
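In the scalar case, the constant (3.12) is easy to estimate by sampling; a crude sketch with our own helper name (illustrative only). For $f(x) = |x|$ at $\bar{x} = 0$ the constant equals $1$, while for $f(x) = x$ it equals $0$ (so (3.12) gives nothing there, although C1 still applies):

```python
def internal_slope(f, xbar=0.0, radius=1e-4, samples=4001):
    """Crude grid estimate of (3.12) in the scalar case with ybar = f(xbar) = 0:
    the infimum of f(x)^+ / |x - xbar| over sampled x near xbar."""
    vals = []
    for k in range(samples):
        x = xbar + radius * (2 * k / (samples - 1) - 1)
        if x != xbar:
            vals.append(max(f(x), 0.0) / abs(x - xbar))
    return min(vals)

print(internal_slope(abs))          # 1.0: |x| grows at unit rate on both sides
print(internal_slope(lambda x: x))  # 0.0: negative side kills the infimum
```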
Proposition 3.4. If $|\nabla f|_0(\bar{x}) > 0$, then $|\nabla f|_0(\bar{x}) = \operatorname{Er} f(\bar{x})$.

Proof. If $|\nabla f|_0(\bar{x}) > 0$, then $\bar{x}$ is an isolated point of $S_{\bar{y}}(f)$, and consequently $d(x, S_{\bar{y}}(f)) = d(x, \bar{x})$ for all $x$ near $\bar{x}$. □

The short proof above shows that when $|\nabla f|_0(\bar{x}) > 0$ the situation with error bounds is pretty simple. We formulate criterion C2 explicitly to make the picture of the existing criteria complete and to simplify the comparison between the different criteria. Note that criteria C1 and C2 are independent. For the function $f$ in Example 2, one obviously has $|\nabla f|^>(0) = \overline{|\nabla f|}{}^\diamond(0) = \sqrt{2}$ while $|\nabla f|_0(0) = 0$. At the same time, for the function in the next example the situation is the opposite.

Example 4 ([11, 17]). Let $f : \mathbb{R} \to \mathbb{R}$ be defined as follows (Fig. 2):
$f(x) = \begin{cases} -x, & \text{if } x \le 0, \\ \dfrac{1}{i+1}, & \text{if } \dfrac{1}{i+1} < x \le \dfrac{1}{i},\ i = 1, 2, \ldots, \\ x, & \text{if } x > 1. \end{cases}$
Obviously, $|\nabla f|(x) = 0$ for any $x \in (0, 1)$, and consequently $|\nabla f|^>(0) = 0$. At the same time, $d(f(x), V_0) = f(x)$ and $|\nabla f|_0(0) = 1$.

Figure 2. Example 4
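Example 4 can likewise be checked numerically: $f$ is constant on each interval $(\frac{1}{i+1}, \frac{1}{i}]$, so the slope vanishes there, while $f(x)/x$ tends to $1$ as $x \downarrow 0$. A sketch (our transcription; the interval index is found via `math.floor`):

```python
import math

def f(x):
    """The step function of Example 4 (our transcription)."""
    if x <= 0:
        return -x
    if x > 1:
        return x
    i = math.floor(1.0 / x)        # then 1/(i+1) < x <= 1/i
    return 1.0 / (i + 1)

# Constant on each step (1/(i+1), 1/i], hence zero slope inside (0, 1):
assert f(0.26) == f(0.3) == 0.25   # both points lie in (1/4, 1/3]
# ... while f(x)/x, i.e. d(f(x), V_0)/d(x, 0), tends to 1 as x -> 0:
print(min(f(1.0 / n) / (1.0 / n) for n in range(100, 1000)))
```

This is exactly the situation where criterion C1 fails but C2 succeeds.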
The relationships among the primal space error bound criteria UC1, UC2, C1, C2 are illustrated in Fig. 3. It is assumed that the space $X$ is complete and the function $f_{\bar{y}}^+$ is lower semicontinuous.

Figure 3. Criteria in terms of slopes
4. Criteria in terms of directional derivatives and subdifferentials
In this section, $X$ is assumed to be a normed linear space. To the slopes considered above there correspond directional derivatives and subdifferentials, and the criteria in terms of slopes established above can be translated into corresponding criteria in terms of directional derivatives and subdifferentials.
4.1. Directional derivatives. Given $x, h \in X$ and $y \in Y$, the lower Dini derivative of $f$ at $x$ in direction $h$ relative to $y$ is defined as
(4.1) $f_y'(x; h) := \liminf_{t \downarrow 0,\ z \to h} \frac{f_y^+(x + tz) - f_y^+(x)}{t}.$
Note that (4.1) is actually the lower Dini derivative of $f_y^+$.
Proposition 4.1.
(i) $|\nabla f|_y(x) \ge \left(-\inf_{\|h\|=1} f_y'(x; h)\right)^+$ for all $x \in X$ and $y \in Y$.
(ii) If $\dim X < \infty$, then (i) holds as equality and the infimum in the right-hand side is attained.

Proof. (i) Let $x \in X$, $y \in Y$, and $\|h\| = 1$. By (4.1), there exist sequences $t_k \downarrow 0$ and $z_k \to h$ such that
$\frac{f_y^+(x + t_k z_k) - f_y^+(x)}{t_k} \to f_y'(x; h).$
Put $x_k := x + t_k z_k$. Then $d(x_k, x) = t_k \|z_k\| \to 0$ and
$\frac{f_y^+(x) - f_y^+(x_k)}{d(x_k, x)} \to -f_y'(x; h).$
The conclusion follows from definition (3.2).
(ii) Let $\dim X < \infty$, $x \in X$, and $y \in Y$. Taking into account (i), we need to prove the opposite inequality. If $|\nabla f|_y(x) = 0$, the required inequality holds trivially. Suppose $|\nabla f|_y(x) > 0$. By definition (3.2), there exists a sequence $x_k \to x$ such that
$\frac{f_y^+(x) - f_y^+(x_k)}{d(x_k, x)} \to |\nabla f|_y(x).$
Put $t_k := d(x_k, x)$ and $z_k := t_k^{-1}(x_k - x)$. Without loss of generality, $z_k \to h$ with $\|h\| = 1$. Then
$f_y'(x; h) \le \lim_{k \to \infty} \frac{f_y^+(x + t_k z_k) - f_y^+(x)}{t_k} = -|\nabla f|_y(x). \qquad \Box$
The next statement provides estimates for the strict outer slope (3.4).

Corollary 4.2.
(i) $|\nabla f|^>(\bar{x}) \ge \left(-\limsup_{x \to \bar{x},\ f_{\bar{y}}^+(x) \downarrow 0}\ \inf_{\|h\|=1} f_{\bar{y}}'(x; h)\right)^+.$
(ii) If $\dim X < \infty$, then (i) holds as equality and the infimum in the right-hand side is attained.

The first assertion of Corollary 4.2 implies a similar estimate using the strict outer Dini derivative of $f$ at $\bar{x}$ in direction $h$:
(4.2) $f'^>(\bar{x}; h) := \limsup_{x \to \bar{x},\ f_{\bar{y}}^+(x) \downarrow 0} f_{\bar{y}}'(x; h).$
The slopes considered above characterize descent rates of $f$. If negative, the lower Dini derivative (4.1) characterizes the descent rate at the given point in the given direction, while the strict outer Dini derivative (4.2), if negative, characterizes the "worst" descent rate in the given direction among all points near $\bar{x}$ whose images lie outside $V_{\bar{y}}$.

Corollary 4.3. $|\nabla f|^>(\bar{x}) \ge \left(-\inf_{\|h\|=1} f'^>(\bar{x}; h)\right)^+.$
When $x = \bar{x}$ and $y = \bar{y} = f(\bar{x})$, the lower Dini derivative (4.1) takes the form
(4.3) $f_{\bar{y}}'(\bar{x}; h) = \liminf_{t \downarrow 0,\ z \to h} \frac{f_{\bar{y}}^+(\bar{x} + tz)}{t}$
and corresponds to constant (3.12).

Proposition 4.4.
(i) $|\nabla f|_0(\bar{x}) \le \inf_{\|h\|=1} f_{\bar{y}}'(\bar{x}; h)$.
(ii) If $\dim X < \infty$, then (i) holds as equality and the infimum in the right-hand side is attained.
The uniform strict slopes (3.6) and (3.8) require a different type of directional derivatives. Given $h \in X$, two uniform strict Dini derivatives of $f$ at $\bar{x}$ in direction $h$ are defined as
(4.4) $f'^\diamond(\bar{x}; h) := \lim_{\varepsilon \downarrow 0}\ \limsup_{x \to \bar{x},\ f_{\bar{y}}^+(x) \downarrow 0}\ \inf_{\|z - h\| < \varepsilon,\ 0 < t < \varepsilon} \frac{f_{\bar{y}}^+(x + tz) - f_{\bar{y}}^+(x)}{t},$
(4.5) $f'^M(\bar{x}; h) := \lim_{\varepsilon \downarrow 0}\ \limsup_{x \to \bar{x},\ f_{\bar{y}}^+(x) \downarrow 0}\ \inf_{\|z - h\| < \varepsilon,\ 0 < t < \varepsilon} \frac{f_{\bar{y}}^+(x + tz) - f_{\bar{y}}^+(x)}{t \|z\|}.$

Proposition 4.5.
(i) $f'^\diamond(\bar{x}; h) \le f'^M(\bar{x}; h) \le f'^>(\bar{x}; h)$ for any $h \in X$;
(ii) $|\nabla f|^\diamond(\bar{x}) \ge \left(-\inf_{\|h\|=1} f'^\diamond(\bar{x}; h)\right)^+$;
(iii) $\overline{|\nabla f|}{}^\diamond(\bar{x}) \ge \left(-\inf_{\|h\|=1} f'^M(\bar{x}; h)\right)^+$.

Proof. (i) The inequalities are obvious.
(ii) Let $\|h\| = 1$ and $\varepsilon \in (0, 1)$. By (3.6),
$|\nabla f|^\diamond(\bar{x}) \ge \liminf_{x \to \bar{x},\ f_{\bar{y}}^+(x) \downarrow 0}\ \sup_{\|z - h\| < \varepsilon,\ 0 < t < \varepsilon} \frac{f_{\bar{y}}^+(x) - f_{\bar{y}}^+(x + tz)}{t \|z\|}$
$\ge (1 + \varepsilon)^{-1} \left[ -\limsup_{x \to \bar{x},\ f_{\bar{y}}^+(x) \downarrow 0}\ \inf_{\|z - h\| < \varepsilon,\ 0 < t < \varepsilon} \frac{f_{\bar{y}}^+(x + tz) - f_{\bar{y}}^+(x)}{t} \right]^+.$
Passing to the limit as $\varepsilon \downarrow 0$ in the right-hand side of the above inequality, we obtain $|\nabla f|^\diamond(\bar{x}) \ge (-f'^\diamond(\bar{x}; h))^+$. This inequality must hold for all $h \in X$ with $\|h\| = 1$. The conclusion follows.
(iii) The proof repeats that of (ii) with appropriate changes caused by the differences between definitions (3.6) and (3.8). □
Using directional derivatives (4.2), (4.4), and (4.5) and taking into account Corollary 4.3 and Proposition 4.5, we can formulate another set of sufficient criteria for the error bound property of $f$ at $\bar{x}$.

UCD1. $f'^\diamond(\bar{x}; h) < 0$ for some $h \in X$.

UCD2. $f'^M(\bar{x}; h) < 0$ for some $h \in X$.

CD1. $f'^>(\bar{x}; h) < 0$ for some $h \in X$.
The relationships among the various primal space criteria are illustrated in Fig. 4. It is assumed that the space $X$ is Banach and the function $f_{\bar{y}}^+$ is lower semicontinuous.
Figure 4. Primal space criteria
4.2. Subdifferentials. In this subsection, we discuss subdifferential error bound criteria corresponding to the conditions formulated in Section 3 in terms of (primal space) slopes. Consider first a subdifferential operator $\partial$ defined on the class of extended real-valued functions and satisfying the following conditions (axioms):

(A1) For any $f : X \to \mathbb{R}_\infty$ and any $x \in X$, the subdifferential $\partial f(x)$ is a (possibly empty) subset of $X^*$.
(A2) If $f$ is convex, then $\partial f$ coincides with the subdifferential of $f$ in the sense of convex analysis.
(A3) If $f(u) = g(u)$ for all $u$ near $x$, then $\partial f(x) = \partial g(x)$.

The majority of known subdifferentials satisfy conditions (A1)–(A3). In the general case of a function $f : X \to Y$ between normed linear spaces, we define the subdifferential and the subdifferential slope of $f$ at $x$ relative to $y \in Y$ as $\partial_y f(x) := \partial(f_y^+)(x)$ and
(4.6) $|\partial f|_y(x) := \inf\{\|x^*\| : x^* \in \partial_y f(x)\},$
respectively. Here the convention $\inf \emptyset = +\infty$ is in use. Subdifferential slopes (4.6) are the main building blocks when defining the strict outer subdifferential slope of $f$ at $\bar{x}$:
(4.7) $|\partial f|^>(\bar{x}) := \liminf_{x \to \bar{x},\ f_{\bar{y}}^+(x) \downarrow 0} |\partial f|_{\bar{y}}(x).$
If $\partial$ is the Fréchet subdifferential operator, we will use the notations $\partial_y^F f(x)$, $|\partial^F f|_y(x)$, and $|\partial^F f|^>(\bar{x})$, respectively. Thus
(4.8) $\partial_y^F f(x) = \left\{ x^* \in X^* \,\middle|\, \liminf_{u \to x} \frac{f_y^+(u) - f_y^+(x) - \langle x^*, u - x \rangle}{\|u - x\|} \ge 0 \right\}.$
This is a convex subset of $X^*$. If $Y = \mathbb{R}$ and $y < f(x)$, then it coincides with the usual Fréchet subdifferential of $f$ at $x$ and does not depend on $y$. The subdifferential slopes $|\partial^F f|_y(x)$ and $|\partial^F f|^>(\bar{x})$ corresponding to (4.8) represent dual space counterparts of the primal space slopes (3.2) and (3.4), respectively. The relationship between the constants is straightforward.

Proposition 4.6.
(i) $|\nabla f|_y(x) \le |\partial^F f|_y(x)$;
(ii) $|\nabla f|^>(\bar{x}) \le |\partial^F f|^>(\bar{x})$.
The inequality in Proposition 4.6 (i) can be strict rather often (for example, if $\partial_y^F f(x) = \emptyset$). If $f_y^+$ is convex, then the two constants coincide (see [11, Theorem 5]). Let us now impose another condition on the subdifferential operator $\partial$:

(A4) If $x$ is a point of local minimum of $f + g$, where $f : X \to \mathbb{R}_\infty$ is lower semicontinuous and $g : X \to \mathbb{R}$ is convex and Lipschitz continuous, then for any $\varepsilon > 0$ there exist $x_1, x_2 \in x + \varepsilon B$, $x_1^* \in \partial f(x_1)$, $x_2^* \in \partial g(x_2)$ such that $|f(x_1) - f(x)| < \varepsilon$, $|g(x_2) - g(x)| < \varepsilon$, and $\|x_1^* + x_2^*\| < \varepsilon$.
Obviously, the inequality $|g(x_2) - g(x)| < \varepsilon$ in the above condition can be omitted. Typical examples of subdifferentials satisfying conditions (A1)–(A4) are the Rockafellar–Clarke and Ioffe subdifferentials in Banach spaces and Fréchet subdifferentials in Asplund spaces.
Proposition 4.7. Let $f_{\bar{y}}^+$ be lower semicontinuous.
(i) If the subdifferential operator $\partial$ satisfies conditions (A1)–(A4), then $|\nabla f|^>(\bar{x}) \ge |\partial f|^>(\bar{x})$.
(ii) If $X$ is Asplund, then $|\nabla f|^>(\bar{x}) = |\partial^F f|^>(\bar{x})$.

Proof. (i) If $|\nabla f|^>(\bar{x}) = \infty$, the assertion is trivial. Take any $\gamma > |\nabla f|^>(\bar{x})$. We are going to show that $|\partial f|^>(\bar{x}) \le \gamma$. By definition (3.4), for any $\beta \in (|\nabla f|^>(\bar{x}), \gamma)$ and any $\delta > 0$ there is an $x \in B_{\delta/2}(\bar{x})$ such that $0 < f_{\bar{y}}^+(x) \le \delta/2$ and $|\nabla f|_{\bar{y}}(x) < \beta$. By definition (3.2), $x$ is a local minimum point of the function $u \mapsto f_{\bar{y}}^+(u) + \beta \|u - x\|$. By (A4) and (A2), this implies the existence of a point $w \in B_{\delta/2}(x)$ with $0 < f_{\bar{y}}^+(w) \le f_{\bar{y}}^+(x) + \delta/2$ and an element $x^* \in \partial f_{\bar{y}}^+(w)$ with $\|x^*\| < \gamma$. The inequality $|\partial f|^>(\bar{x}) \le \gamma$ follows from definition (4.7).
(ii) Since in Asplund spaces Fréchet subdifferentials satisfy conditions (A1)–(A4), the conclusion follows from (i) and Proposition 4.6 (ii). □

As a direct consequence of Proposition 4.7, we have the following statement.

Proposition 4.8. If $X$ is Asplund and $f_{\bar{y}}^+$ is lower semicontinuous, then $|\partial^F f|^>(\bar{x}) \ge |\partial f|^>(\bar{x})$ for any subdifferential operator $\partial$ satisfying conditions (A1)–(A4).
Thanks to Proposition 4.6, the following sufficient (under appropriate assumptions) subdifferential criteria can be used for characterizing the error bound property.
DC1. $|\partial f|^>(\bar{x}) > 0$.
Consider now the Fréchet subdifferential of $f$ at $\bar{x}$ relative to $\bar{y} = f(\bar{x})$:
$\partial_{\bar{y}}^F f(\bar{x}) = \left\{ x^* \in X^* \,\middle|\, \liminf_{x \to \bar{x}} \frac{f_{\bar{y}}^+(x) - \langle x^*, x - \bar{x} \rangle}{\|x - \bar{x}\|} \ge 0 \right\}.$
Following [11], we say that a subset $G \subset \partial_{\bar{y}}^F f(\bar{x})$ is a regular set of subgradients of $f$ at $\bar{x}$ if for any $\varepsilon > 0$ there exists a $\delta > 0$ such that
$f_{\bar{y}}^+(x) - \sup_{x^* \in G} \langle x^*, x - \bar{x} \rangle + \varepsilon \|x - \bar{x}\| \ge 0$ for all $x \in B_\delta(\bar{x})$.
The set $\partial_{\bar{y}}^F f(\bar{x})$ itself does not have to be a regular set of subgradients; see [11, Example 10]. A regular set of subgradients is not defined uniquely. Typical examples are:
• any finite subset of $\partial_{\bar{y}}^F f(\bar{x})$;
• any subset of a regular set of subgradients;
• the set of all $x^* \in X^*$ such that $f_{\bar{y}}^+(x) \ge \langle x^*, x - \bar{x} \rangle$ for all $x$ near $\bar{x}$;
• in particular, the subdifferential of $f_{\bar{y}}^+$ at $\bar{x}$ if $f_{\bar{y}}^+$ is convex.
Regular sets of subgradients are needed to define the internal subdifferential slope of $f$ at $\bar{x}$:
(4.9) $|\partial f|_0(\bar{x}) = \sup\{r \ge 0 \mid r B^* \text{ is a regular set of subgradients of } f \text{ at } \bar{x}\}.$
In other words,
$|\partial f|_0(\bar{x}) = \sup\left\{ r \ge 0 \,\middle|\, \liminf_{x \to \bar{x}} \frac{f_{\bar{y}}^+(x) - \sup_{x^* \in r B^*} \langle x^*, x - \bar{x} \rangle}{\|x - \bar{x}\|} \ge 0 \right\}.$
Since $\sup_{x^* \in r B^*} \langle x^*, x - \bar{x} \rangle = r \|x - \bar{x}\|$, it follows immediately that $|\partial f|_0(\bar{x})$ coincides with the primal space slope defined by (3.12).

Proposition 4.9. $|\partial f|_0(\bar{x}) = |\nabla f|_0(\bar{x})$.
291 292
The next proposition is another consequence of definition (4.9). Proposition 4.10. |∂f |0 (¯ x) ≤ sup{r ≥ 0 : rB∗ ⊂ ∂y¯F f (¯ x)}.
296
In infinite dimensions the inequality in Proposition 4.10 can be strict [11, Example 11]. Inequality |∂f |0 (¯ x) > 0 obviously implies inclusion 0 ∈ int ∂y¯F f (¯ x). Thanks to Proposition 4.9 and condition C2, we can formulate another sufficient subdifferential criteria for the error bound property:
297
DC2. |∂f |0 (¯ x) > 0.
293 294 295
This criterion is equivalent to C2 and independent of DC1. Note that the inclusion $0 \in \operatorname{int} \partial_{\bar{y}}^F f(\bar{x})$ is in general not sufficient. The following nonlocal modification of the Fréchet subdifferential (4.8), depending on two parameters $\alpha \ge 0$ and $\varepsilon \ge 0$, can be of interest:
(4.10) $\partial_{y,\alpha,\varepsilon}^\diamond f(x) = \left\{ x^* \in X^* \,\middle|\, \inf_{0 < \|u - x\| < \alpha} \frac{f_y^+(u) - f_y^+(x) - \langle x^*, u - x \rangle}{\|u - x\|} \ge -\varepsilon \right\}.$
The sets (4.10) define the corresponding uniform strict subdifferential slope $|\partial f|^\diamond(\bar{x})$ of $f$ at $\bar{x}$.

Proposition 4.11.
(i) $|\partial f|^\diamond(\bar{x}) \ge |\nabla f|^\diamond(\bar{x})$.
(ii) Suppose that the following uniformity condition is satisfied:
(UC) there exist a $\delta > 0$ and a function $o : \mathbb{R}_+ \to \mathbb{R}$ such that $\lim_{t \downarrow 0} o(t)/t = 0$ and, for any $x \in B_\delta(\bar{x})$ with $0 < f_{\bar{y}}^+(x) \le \delta$ and any $x^* \in \partial_{\bar{y}}^F f(x)$, it holds
(4.12) $f_{\bar{y}}^+(u) - f_{\bar{y}}^+(x) - \langle x^*, u - x \rangle + o(\|u - x\|) \ge 0$ for all $u \in X$.
Then $|\partial f|^\diamond(\bar{x}) \le |\partial^F f|^>(\bar{x})$.
(iii) If $X$ is Asplund, $f_{\bar{y}}^+$ is lower semicontinuous near $\bar{x}$, and uniformity condition (UC) is satisfied, then
$\operatorname{Er} f(\bar{x}) = |\partial^F f|^>(\bar{x}) = |\nabla f|^>(\bar{x}) = \overline{|\nabla f|}{}^\diamond(\bar{x}) = |\nabla f|^\diamond(\bar{x}) = |\partial f|^\diamond(\bar{x}) \ge |\partial f|_0(\bar{x}).$
Proof. (i) If x∗ ∈ ∂⋄_{ȳ,α,ε} f(x) for some x ∉ S_ȳ(f), α > d(x, S_ȳ(f)), and ε > 0, then, by (4.10),

  sup_{u≠x} [f_ȳ^+(x) − f_ȳ^+(u)] / ‖u − x‖ = sup_{0<‖u−x‖<α} [f_ȳ^+(x) − f_ȳ^+(u)] / ‖u − x‖ ≤ ‖x∗‖ + ε.

Passing to the limits prescribed in the definitions of the two slopes yields |∇f|⋄(x̄) ≤ |∂f|⋄(x̄).
(ii) If |∂^F f|>(x̄) = ∞, the inequality holds trivially. Let |∂^F f|>(x̄) < γ < ∞ and ε > 0. By definition (4.7), for any δ ∈ (0, ε) there is an x ∈ X with ‖x − x̄‖ < δ, 0 < f_ȳ^+(x) < δ and an x∗ ∈ ∂_ȳ^F f(x) with ‖x∗‖ < γ. Without loss of generality we can take δ > 0 such that (4.12) holds true and o(t)/t < ε if 0 < t < δ. Then

  inf_{0<‖u−x‖<δ} [f_ȳ^+(u) − f_ȳ^+(x) − ⟨x∗, u − x⟩] / ‖u − x‖ ≥ − sup_{0<t<δ} o(t)/t > −ε,

that is, x∗ ∈ ∂⋄_{ȳ,δ,ε} f(x). Hence |∂f|⋄(x̄) ≤ γ, and the conclusion follows since γ > |∂^F f|>(x̄) was arbitrary. ¤
When uniformity condition (UC) is imposed, Fréchet subdifferentials (4.8) gain uniformity properties and, thanks to Proposition 4.11, sufficient condition DC1 in terms of the Fréchet strict outer subdifferential slope becomes also necessary. The relationships among the subdifferential and primal space error bound criteria in Banach and Asplund spaces are illustrated in Fig. 5 and Fig. 6 respectively; f_ȳ^+ is assumed lower semicontinuous.

[Figure 5: diagram of implications among the criteria DC1 (|∂f|>(x̄) > 0), C1 (|∇f|>(x̄) > 0), DC2 (|∂f|⁰(x̄) > 0), C2 (|∇f|⁰(x̄) > 0), UC1 (|∇f|⋄(x̄) > 0), UC2 (|∇f|⋄₋(x̄) > 0), UDC1 (|∂f|⋄(x̄) > 0), and the error bound property Er f(x̄) > 0; one implication holds under the additional assumption that f is Lipschitz.]
Figure 5. Subdifferential and primal space criteria in Banach spaces
5. Finite dimensional and convex cases
In this section, three special cases are considered: error bounds when X is finite dimensional, when Y is finite dimensional, and in the convex case.
[Figure 6: diagram of implications among the criteria DC1 (|∂^F f|>(x̄) > 0), C1, DC2, C2, UC1, UC2, UDC1, and Er f(x̄) > 0 in Asplund spaces; several implications become equivalences under uniformity condition (UC), and one implication holds when f is Lipschitz or (UC) is satisfied.]
Figure 6. Subdifferential and primal space criteria in Asplund spaces

5.1. X is finite dimensional. Let dim X < ∞ and f_ȳ^+ be lower semicontinuous. Many relations and estimates from the previous sections can be simplified and sharpened. In particular, the following limiting subdifferentials can be used:

(5.1)  ∂^> f(x̄) = Limsup_{x→x̄, f_ȳ^+(x)↓0} ∂_ȳ^F f(x),

(5.2)  ∂⋄ f(x̄) = Limsup_{x→x̄, ε↓0, f_ȳ^+(x)↓0, α−d(x,S_ȳ(f))↓0} ∂⋄_{ȳ,α,ε} f(x).
In the above definitions, Lim sup denotes the outer limit [31] operation for sets: each of the sets (5.1) and (5.2) is the set of all limits of elements of the appropriate subdifferentials. Sets (5.1) and (5.2) are called the limiting outer subdifferential [17] and the uniform limiting subdifferential [11] of f at x̄ respectively.

Proposition 5.1. (i) If f is continuous near x̄, then |∇f|⋄₋(x̄) = |∇f|⋄(x̄).
(ii) |∂^F f|>(x̄) = inf{‖x∗‖ : x∗ ∈ ∂^> f(x̄)}.
(iii) |∂f|⁰(x̄) = sup{r ≥ 0 : rB∗ ⊂ ∂_ȳ^F f(x̄)}.
(iv) |∂f|⋄(x̄) = inf{‖x∗‖ : x∗ ∈ ∂⋄ f(x̄)}.

Proof. Assertion (i) follows from [11, Proposition 11]. Assertions (ii) and (iv) are consequences of definitions (4.7), (4.11), (5.1), and (5.2). Assertion (iii) follows from definition (4.9) and [11, Proposition 13]. ¤
Thanks to Proposition 5.1 (ii)–(iv) one can formulate the finite dimensional versions of criteria DC1, DC2, and UDC1.
SD1. 0 ∉ ∂^> f(x̄).
SD2. 0 ∈ int ∂_ȳ^F f(x̄).
USD1. 0 ∉ ∂⋄ f(x̄).
Criteria SD1 and SD2 are in general independent. They can be combined to form a weaker sufficient criterion:

  0 ∉ ∂^> f(x̄) \ int ∂_ȳ^F f(x̄).
If f_ȳ^+ is semismooth at x̄ [22], then int ∂_ȳ^F f(x̄) ∩ ∂^> f(x̄) = ∅ [11, Proposition 14], and consequently SD2 ⇒ SD1. The relationships among the error bound criteria for a lower semicontinuous function on a finite dimensional space are illustrated in Fig. 7.
[Figure 7: diagram of implications among the criteria SD1 (0 ∉ ∂^> f(x̄)), SD2 (0 ∈ int ∂_ȳ^F f(x̄)), USD1 (0 ∉ ∂⋄ f(x̄)), C1, C2, UC1, UC2, and Er f(x̄) > 0; SD2 implies SD1 when f is semismooth at x̄, several implications become equivalences under (UC), and one implication holds when f is continuous or (UC) is satisfied.]
Figure 7. Finite dimensional case
5.2. Convex case. In this subsection, X is a general Banach space and f_ȳ^+ is convex and lower semicontinuous. In the convex case, the Fréchet subdifferential (4.8) coincides with the subdifferential of f_y^+ at x in the sense of convex analysis, and uniformity condition (UC) is satisfied automatically. We will omit the superscript F in all denotations where Fréchet subdifferentials are involved. A number of constants considered in the preceding sections coincide, while some definitions get simpler.

Proposition 5.2. (i) Er f(x̄) = |∇f|⋄₋(x̄) = |∇f|⋄(x̄) = |∇f|>(x̄) = |∂f|>(x̄) = |∂f|⋄(x̄).
(ii) |∂f|⁰(x̄) = sup{r ≥ 0 : rB∗ ⊂ ∂_ȳ f(x̄)}.
(iii) 0 ∈ int ∂_ȳ f(x̄) if and only if 0 ∉ bd ∂_ȳ f(x̄).

Proof. (i) and (ii) follow from [11, Theorem 5 and Proposition 15]. ∂_ȳ f(x̄) always contains 0 since f_ȳ^+(x̄) = 0. This observation proves (iii). ¤
Thanks to Proposition 5.2, we have another subdifferential condition for characterizing the error bound property.
SD3. 0 ∉ bd ∂_ȳ f(x̄).
The relationships among the error bound criteria for a convex lower semicontinuous function on a Banach space are illustrated in Fig. 8.
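To make the convex criteria concrete, here is a small hypothetical illustration of our own, not taken from the paper: for f(x) = |x| on X = R with x̄ = 0 and ȳ = 0, the convex subdifferential ∂_ȳ f(0) = [−1, 1] contains 0 in its interior, so SD2 and SD3 hold, and indeed the error bound property holds with Er f(0) = 1:

```python
# Hypothetical convex example: f(x) = |x|, x̄ = 0, ȳ = 0.
# ∂_ȳ f(0) = [-1, 1], so 0 ∈ int ∂_ȳ f(0) (SD2) and 0 ∉ bd ∂_ȳ f(0) (SD3).

def dist_to_level_set(x):
    # S_ȳ(f) = {x : |x| <= 0} = {0}, so d(x, S_ȳ(f)) = |x|
    return abs(x)

def ratio(x):
    # f_ȳ^+(x) / d(x, S_ȳ(f)) for x outside the solution set
    return abs(x) / dist_to_level_set(x)

# the ratio is identically 1 off the solution set, so Er f(0) = 1 > 0
samples = [ratio(10.0 ** (-k)) for k in range(1, 8)]
print(min(samples))  # 1.0
```

The positive error bound modulus computed here is exactly what criteria SD2 and SD3 guarantee in the convex case.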
5.3. Y is finite dimensional. Let Y = Rⁿ, y = (y₁, y₂, …, y_n) ∈ Rⁿ, f = (f₁, f₂, …, f_n) : X → Rⁿ, and V_y = {(v₁, v₂, …, v_n) ∈ Rⁿ | v_i ≤ y_i, i = 1, 2, …, n}. Then S_y(f) and f_y^+ take the form (2.7) and (2.8) respectively. In this subsection, we suppose that Y is equipped with the maximum type norm: ‖y‖ = max(|y₁|, |y₂|, …, |y_n|). Then (2.8) can be rewritten as

(5.3)  f_y^+(u) = [max_{1≤i≤n} (f_i(u) − y_i)]⁺ = max_{1≤i≤n} (f_i(u) − y_i)⁺.

Denote by I_y(u) the set of indices for which the maximum in max_{1≤i≤n} (f_i(u) − y_i) is attained.
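For a quick illustration of (5.3) and the active index set I_y(u), here is a small Python sketch with hypothetical component functions; the names `f_y_plus` and `active_indices` are ours, not the paper's:

```python
# Illustrative sketch of (5.3): for Y = R^n with the maximum norm,
# f_y^+(u) = [max_i (f_i(u) - y_i)]^+ and I_y(u) collects the active indices.
# The component functions below are hypothetical examples.

def f_y_plus(fs, u, y):
    """f_y^+(u) = max(max_i (f_i(u) - y_i), 0), cf. (5.3)."""
    return max(max(f(u) - yi for f, yi in zip(fs, y)), 0.0)

def active_indices(fs, u, y, tol=1e-12):
    """I_y(u): indices attaining max_i (f_i(u) - y_i)."""
    vals = [f(u) - yi for f, yi in zip(fs, y)]
    m = max(vals)
    return [i for i, v in enumerate(vals) if abs(v - m) <= tol]

fs = (lambda x: x, lambda x: 2 * x - 1)   # f = (f1, f2) on X = R
y = (0.0, 0.0)
print(f_y_plus(fs, 2.0, y))        # 3.0: max(2, 3)^+ = 3
print(active_indices(fs, 1.0, y))  # [0, 1]: f1 - y1 = f2 - y2 = 1, both active
```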
[Figure 8: diagram of equivalences and implications among the criteria SD3 (0 ∉ bd ∂_ȳ f(x̄)), SD2 (0 ∈ int ∂_ȳ f(x̄)), C1, C2, UC1, UC2, DC1, UDC1, and Er f(x̄) > 0 in the convex case.]
Figure 8. Convex case
Let x̄ ∈ X and ȳ = f(x̄). The error bound modulus (2.11) of f at x̄ takes the form:

(5.4)  Er f(x̄) = liminf_{x→x̄, x∉S_ȳ(f)} max_{1≤i≤n} (f_i(x) − ȳ_i) / d(x, S_ȳ(f)).

As it was discussed in Section 3, error bounds can be characterized in terms of slopes. If f_i(x) ≤ y_i for all i = 1, 2, …, n, then, in accordance with (5.3), f_y^+(x) = 0 and consequently (due to (3.2)) |∇f|_y(x) = 0. Otherwise,

(5.5)  |∇f|_y(x) = limsup_{u→x} min_{i∈I_y(x)} (f_i(x) − f_i(u))⁺ / d(u, x) = |∇ max_{i∈I_y(x)} f_i|(x),

provided the functions f_i are lower semicontinuous at x for i ∈ I_y(x) and upper semicontinuous at x for i ∉ I_y(x). The strict outer slope of f at x̄ (3.4) can be computed as

(5.6)  |∇f|>(x̄) = liminf_{x→x̄, max_{1≤i≤n}(f_i(x)−ȳ_i)↓0} |∇f|_ȳ(x),

where ȳ_i = f_i(x̄), ȳ = (ȳ₁, ȳ₂, …, ȳ_n), while for the uniform strict slope we have the following formula:

(5.7)  |∇f|⋄(x̄) = liminf_{x→x̄, max_{1≤i≤n}(f_i(x)−ȳ_i)↓0} sup_{u≠x} min_{i∈I_ȳ(x)} (f_i(x) − f_i(u))⁺ / d(u, x),

provided all functions f_i are continuous near x̄. At the same time, constant (3.12) admits the next representation:

(5.8)  |∇f|⁰(x̄) = liminf_{x→x̄} max_{1≤i≤n} (f_i(x) − ȳ_i)⁺ / d(x, x̄).

Application of Theorem 3.3 and Proposition 3.4 leads to the following equivalent representations of the error bound modulus (5.4).

Proposition 5.3. (i) If X is complete and f_i, i = 1, 2, …, n, are continuous, then

  Er f(x̄) = liminf_{x→x̄, max_{1≤i≤n}(f_i(x)−ȳ_i)↓0} sup_{u≠x} min_{i∈I_ȳ(x)} (f_i(x) − f_i(u))⁺ / d(u, x) ≥ |∇f|>(x̄).

(ii) If |∇f|⁰(x̄) > 0, then

  Er f(x̄) = liminf_{x→x̄} max_{1≤i≤n} (f_i(x) − ȳ_i)⁺ / d(x, x̄).
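As a worked instance of (5.4), consider the hypothetical data X = R, n = 2, f₁(x) = x, f₂(x) = 2x, and x̄ = 0, so that ȳ = (0, 0) and S_ȳ(f) = (−∞, 0]; these choices are our own illustration, not from the paper. The ratio in (5.4) is constantly 2 off the solution set, so Er f(0) = 2. A minimal numerical check:

```python
# Hypothetical worked example of (5.4) with X = R, Y = R^2 (max norm):
# f = (f1, f2), f1(x) = x, f2(x) = 2x, x̄ = 0, ȳ = f(0) = (0, 0).
# Then S_ȳ(f) = (-∞, 0] and d(x, S_ȳ(f)) = x for x > 0.

def er_ratio(x):
    """max_i (f_i(x) - ȳ_i) / d(x, S_ȳ(f)) for x outside S_ȳ(f)."""
    assert x > 0, "only points violating the inequalities are considered"
    d = x                      # distance from x > 0 to (-∞, 0]
    return max(x, 2 * x) / d

# approximate the liminf in (5.4) along x ↓ 0
print(min(er_ratio(10.0 ** (-k)) for k in range(1, 10)))  # 2.0
```

Here the slower-growing component f₁ plays no role: the modulus is governed by the largest violated inequality, as (5.4) prescribes.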
It is also possible to characterize error bounds using directional derivatives. Given x, h ∈ X and y ∈ Y, suppose the functions f_i are lower semicontinuous at x for i ∈ I_y(x) and upper semicontinuous at x for i ∉ I_y(x). Then for the lower Dini derivative (4.1) of f at x in direction h relative to y we have the following representations: f_y′(x; h) = 0 if f_i(x) < y_i for all i = 1, 2, …, n; otherwise

(5.9)  f_y′(x; h) = liminf_{t↓0, z→h} max_{i∈I_y(x)} (f_i(x + tz) − f_i(x)) / t = max_{i∈I_y(x)} f_i′(x; h).

If all f_i are continuous in a neighborhood of x̄, then obviously I_ȳ(x) ⊂ I_ȳ(x̄) for all x near x̄ and we have the following representations for the strict outer Dini derivative (4.2) and the uniform strict Dini derivative (4.4) of f at x̄ in direction h:

(5.10)  f′>(x̄; h) = limsup_{x→x̄, max_{1≤i≤n}(f_i(x)−ȳ_i)↓0} max_{i∈I_ȳ(x̄)} f_i′(x; h),

(5.11)  f′⋄(x̄; h) = lim_{ε↓0} limsup_{x→x̄, max_{1≤i≤n}(f_i(x)−ȳ_i)↓0} inf_{‖z−h‖<ε, 0<t<ε} max_{1≤i≤n} (f_i(x + tz) − f_i(x)) / t.
Constants (5.6), (5.7), and (5.8) as well as directional derivatives (5.10) and (5.11) give rise to the sufficient error bound criteria C1, UC1, C2, CD1 and UCD1 respectively.
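Assuming smooth components, formula (5.9) says that the lower Dini derivative relative to y reduces to the maximum of the ordinary directional derivatives over the active indices. A small finite-difference sketch of our own, with hypothetical component functions:

```python
# A check of (5.9) under the assumption that each f_i is smooth:
# f_y'(x; h) = max over active i of f_i'(x; h).
# Hypothetical data: f1(x) = x, f2(x) = 2x - 1, y = (0, 0);
# at x = 1 both components are active since f1(1) - 0 = f2(1) - 0 = 1.

def dini(g, x, h, t=1e-6):
    """One-sided finite-difference surrogate for the lower Dini derivative."""
    return (g(x + t * h) - g(x)) / t

f1 = lambda x: x
f2 = lambda x: 2 * x - 1
g = lambda x: max(f1(x), f2(x))   # realizes the max in (5.9)

# (5.9): f_y'(1; h) = max(f1'(1; h), f2'(1; h)) = max(h, 2h)
print(round(dini(g, 1.0, 1.0), 6))   # 2.0  (= max(1, 2))
print(round(dini(g, 1.0, -1.0), 6))  # -1.0 (= max(-1, -2))
```

The asymmetry between the two directions reflects the max over active indices: the steeper component dominates for h = 1, the flatter one for h = −1.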
References
[1] Auslender, A., and Teboulle, M. Asymptotic Cones and Functions in Optimization and Variational Inequalities. Springer Monographs in Mathematics. Springer-Verlag, New York, 2003.
[2] Azé, D. A survey on error bounds for lower semicontinuous functions. In Proceedings of 2003 MODE-SMAI Conference (2003), vol. 13 of ESAIM Proc., EDP Sci., Les Ulis, pp. 1–17.
[3] Azé, D., and Corvellec, J.-N. On the sensitivity analysis of Hoffman constants for systems of linear inequalities. SIAM J. Optim. 12, 4 (2002), 913–927.
[4] Azé, D., and Corvellec, J.-N. Characterizations of error bounds for lower semicontinuous functions on metric spaces. ESAIM Control Optim. Calc. Var. 10, 3 (2004), 409–425.
[5] Bednarczuk, E. Stability Analysis for Parametric Vector Optimization Problems, vol. 442 of Dissertationes Mathematicae. Polska Akademia Nauk, Instytut Matematyczny, Warszawa, 2007.
[6] Bednarczuk, E. M., and Zagrodny, D. Vector variational principle. Arch. Math. (Basel) 93, 6 (2009), 577–586.
[7] Bosch, P., Jourani, A., and Henrion, R. Sufficient conditions for error bounds and applications. Appl. Math. Optim. 50, 2 (2004), 161–181.
[8] Cornejo, O., Jourani, A., and Zălinescu, C. Conditioning and upper-Lipschitz inverse subdifferentials in nonsmooth optimization problems. J. Optim. Theory Appl. 95, 1 (1997), 127–148.
[9] Corvellec, J.-N., and Motreanu, V. V. Nonlinear error bounds for lower semicontinuous functions on metric spaces. Math. Program., Ser. A 114, 2 (2008), 291–319.
[10] Deng, S. Global error bounds for convex inequality systems in Banach spaces. SIAM J. Control Optim. 36, 4 (1998), 1240–1249.
[11] Fabian, M., Henrion, R., Kruger, A. Y., and Outrata, J. V. Error bounds: necessary and sufficient conditions. Set-Valued Var. Anal. 18, 4 (2010), 121–149.
[12] Henrion, R., and Jourani, A. Subdifferential conditions for calmness of convex constraints. SIAM J. Optim. 13, 2 (2002), 520–534.
[13] Henrion, R., and Outrata, J. V. A subdifferential condition for calmness of multifunctions. J. Math. Anal. Appl. 258, 1 (2001), 110–130.
[14] Henrion, R., and Outrata, J. V. Calmness of constraint systems with applications. Math. Program., Ser. B 104, 2-3 (2005), 437–464.
[15] Ioffe, A. D. Regular points of Lipschitz functions. Trans. Amer. Math. Soc. 251 (1979), 61–69.
[16] Ioffe, A. D. Metric regularity and subdifferential calculus. Russian Math. Surveys 55 (2000), 501–558.
[17] Ioffe, A. D., and Outrata, J. V. On metric and calmness qualification conditions in subdifferential calculus. Set-Valued Anal. 16, 2–3 (2008), 199–227.
[18] Klatte, D., Kruger, A. Y., and Kummer, B. From convergence principles to stability and optimality conditions. To be published.
[19] Klatte, D., and Kummer, B. Nonsmooth Equations in Optimization. Regularity, Calculus, Methods and Applications, vol. 60 of Nonconvex Optimization and its Applications. Kluwer Academic Publishers, Dordrecht, 2002.
[20] Kruger, A. Y., Ngai, H. V., and Théra, M. Stability of error bounds for convex constraint systems in Banach spaces. SIAM J. Optim. 20, 6 (2010), 3280–3296.
[21] Lewis, A. S., and Pang, J.-S. Error bounds for convex inequality systems. In Generalized Convexity, Generalized Monotonicity: Recent Results (Luminy, 1996), vol. 27 of Nonconvex Optim. Appl. Kluwer Acad. Publ., Dordrecht, 1998, pp. 75–110.
[22] Mifflin, R. Semismooth and semiconvex functions in constrained optimization. SIAM J. Control Optimization 15, 6 (1977), 959–972.
[23] Mordukhovich, B. S. Variational Analysis and Generalized Differentiation. I: Basic Theory, vol. 330 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 2006.
[24] Ng, K. F., and Yang, W. H. Regularities and their relations to error bounds. Math. Program., Ser. A 99 (2004), 521–538.
[25] Ng, K. F., and Zheng, X. Y. Error bounds for lower semicontinuous functions in normed spaces. SIAM J. Optim. 12, 1 (2001), 1–17.
[26] Ngai, H. V., Kruger, A. Y., and Théra, M. Stability of error bounds for semi-infinite convex constraint systems. SIAM J. Optim. 20, 4 (2010), 2080–2096.
[27] Ngai, H. V., and Théra, M. Error bounds in metric spaces and application to the perturbation stability of metric regularity. SIAM J. Optim. 19, 1 (2008), 1–20.
[28] Ngai, H. V., and Théra, M. Error bounds for systems of lower semicontinuous functions in Asplund spaces. Math. Program., Ser. B 116, 1-2 (2009), 397–427.
[29] Pang, J.-S. Error bounds in mathematical programming. Math. Programming, Ser. B 79, 1-3 (1997), 299–332. Lectures on Mathematical Programming (ISMP97) (Lausanne, 1997).
[30] Penot, J.-P. Error bounds, calmness and their applications in nonsmooth analysis. In Nonlinear Analysis and Optimization II: Optimization, A. Leizarowitz, B. S. Mordukhovich, I. Shafrir, and A. J. Zaslavski, Eds., vol. 514 of Contemp. Math. AMS, Providence, RI, 2010, pp. 225–248.
[31] Rockafellar, R. T., and Wets, R. J.-B. Variational Analysis, vol. 317 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 1998.
[32] Smale, S. Global analysis and economics. I. Pareto optimum and a generalization of Morse theory. In Dynamical Systems (Proc. Sympos., Univ. Bahia, Salvador, 1971). Academic Press, New York, 1973, pp. 531–544.
[33] Smale, S. Optimizing several functions. In Manifolds–Tokyo 1973 (Proc. Internat. Conf., Tokyo, 1973). Univ. Tokyo Press, Tokyo, 1975, pp. 69–75.
[34] Studniarski, M., and Ward, D. E. Weak sharp minima: characterizations and sufficient conditions. SIAM J. Control Optim. 38, 1 (1999), 219–236.
[35] Wu, Z., and Ye, J. J. Sufficient conditions for error bounds. SIAM J. Optim. 12, 2 (2001/02), 421–435.
[36] Wu, Z., and Ye, J. J. On error bounds for lower semicontinuous functions. Math. Program., Ser. A 92, 2 (2002), 301–314.
Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447 Warszawa, Poland
E-mail address:
[email protected]
School of Information Technology and Mathematical Sciences, Centre for Informatics and Applied Optimization, University of Ballarat, POB 663, Ballarat, Vic, 3350, Australia
E-mail address:
[email protected]