Chapter 1
A MEAN VALUE THEOREM FOR K–DIRECTIONAL EPIDERIVATIVES

Marco Castellani
Dept. Sistemi ed Istituzioni per l’Economia, University of L’Aquila

Anna D’Ottavio
Dept. Sistemi ed Istituzioni per l’Economia, University of L’Aquila

Massimiliano Giuli
Dept. Sistemi ed Istituzioni per l’Economia, University of L’Aquila
Abstract
A generalization of the Zagrodny mean value theorem is given for the K–directional epiderivatives introduced by Elster and Thierfelder.

Keywords: Mean value theorem, local cone approximation, K–directional epiderivative

Mathematics Subject Classification (2000): 41j52, 26E15
1. Introduction and notation
In recent years many authors have introduced axiomatic approaches in order to derive more general results in nonsmooth optimization, and in particular generalizations of mean value theorems (see Ioffe (1984), Correa et al (1994), Thibault and Zagrodny (1995), and Aussel et al (1995)). This effort aims to avoid duplicating results whose proofs follow the same principles. The core of these approaches is the construction of an axiomatic class of abstract subdifferentials containing all the well-known subdifferentials as special cases. The goal of this paper is to show that an abstract form of the approximate mean value theorem can also be obtained by means of the concept of K–epiderivative introduced by Elster and Thierfelder (1988). This approach allows us to avoid the analysis of the smoothness of the norm of the Banach space (see the references in Aussel et al (1995)).

In the sequel $(X, \|\cdot\|)$ is a real Banach space, $(X^*, \|\cdot\|_*)$ is its topological dual space endowed with the weak$^*$ topology, and $\langle \cdot, \cdot \rangle$ denotes the canonical pairing between $X^*$ and $X$. The open unit ball centered at the origin is denoted by $B$. For each set $A \subseteq X$ we define
\[
d_A(x) = \inf_{x' \in A} \|x - x'\|
\]
the distance of $x$ from $A$ and
\[
\delta(x, A) =
\begin{cases}
0 & \text{if } x \in A \\
+\infty & \text{if } x \notin A
\end{cases}
\]
the indicator function of $A$. We denote the topological interior, the closure and the convex hull of $A$ by $\mathrm{int}\, A$, $\mathrm{cl}\, A$ and $\mathrm{conv}\, A$ respectively. The set $K \subseteq X$ is a cone if $tK \subseteq K$ for each $t > 0$; the recession cone of $A$ is the set
\[
0^+ A = \{v \in X : A + tv \subseteq A, \ \forall t > 0\}
\]
where $0^+ \emptyset = \emptyset$.

The extended–valued function $f : X \to \mathbb{R} \cup \{+\infty\}$ is said to be proper if $\mathrm{dom}\, f \ne \emptyset$, where $\mathrm{dom}\, f = \{x \in X : f(x) < +\infty\}$ is the (effective) domain of $f$. We denote by $E(X)$ the class of the proper extended–valued functions defined on $X$, by $F(X) \subset E(X)$ the subclass of the lower semicontinuous functions and by $L(X) \subset F(X)$ the subclass of the locally Lipschitz functions.

We recall that each convex and continuous function $f : X \to \mathbb{R}$ is directionally differentiable, i.e. for each point $x \in X$ and for each direction $v \in X$ the limit
\[
f'(x, v) = \lim_{t \downarrow 0} \frac{f(x + tv) - f(x)}{t}
\]
exists and is finite; moreover the subdifferential of $f$ at $x$ is
\[
\partial f(x) = \{x^* \in X^* : f(x') \ge f(x) + \langle x^*, x' - x \rangle, \ \forall x' \in X\}.
\]
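As a quick illustration of the last two definitions, the following sketch works them out in the simplest nonsmooth convex case. The choices $X = \mathbb{R}$ and $f(x) = |x|$ are assumptions made only for this example and do not come from the text above.

```latex
% Worked example (illustrative assumption: X = \mathbb{R}, f(x) = |x|).
\begin{align*}
f'(0, v) &= \lim_{t \downarrow 0} \frac{|0 + tv| - |0|}{t} = |v|, \\
\partial f(0) &= \{x^* \in \mathbb{R} : |x'| \ge x^* x', \ \forall x' \in \mathbb{R}\} = [-1, 1], \\
\max_{x^* \in \partial f(0)} x^* v &= |v| = f'(0, v).
\end{align*}
```

In this example the directional derivative at the kink is finite but not linear in $v$, and it is recovered as the largest value of $x^* v$ over the subdifferential.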
2. Mean value theorems
In recent years many nonsmooth generalizations of the classical mean value theorem for differentiable functions have been stated via different directional derivatives. In this section we collect the main results. The first one is based on the concept of Dini derivative. Given $f \in E(X)$, the lower Dini derivative of $f$ at $x \in \mathrm{dom}\, f$ in the direction $v \in X$ is
\[
D^- f(x, v) = \liminf_{t \downarrow 0} \frac{f(x + tv) - f(x)}{t}.
\]
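To see how the lower Dini derivative behaves when the difference quotient has no limit, the following sketch evaluates it for an oscillating function. The choices $X = \mathbb{R}$ and $f(x) = x \sin(1/x)$ with $f(0) = 0$ are assumptions made only for this example.

```latex
% Worked example (illustrative assumption: X = \mathbb{R},
% f(x) = x \sin(1/x) for x \ne 0 and f(0) = 0, which is continuous).
\begin{align*}
\frac{f(0 + tv) - f(0)}{t} &= v \sin\!\Bigl(\frac{1}{tv}\Bigr)
  \qquad (t > 0,\ v \ne 0), \\
D^- f(0, v) &= \liminf_{t \downarrow 0} v \sin\!\Bigl(\frac{1}{tv}\Bigr) = -|v|, \\
\limsup_{t \downarrow 0} \frac{f(0 + tv) - f(0)}{t} &= |v|.
\end{align*}
```

The quotient oscillates between $-|v|$ and $|v|$, so the directional derivative does not exist for $v \ne 0$, while the lower Dini derivative is still well defined.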
Theorem 2.1 (Diewert (1981)) Given a function $f \in F(X)$, for each $a, b \in \mathrm{dom}\, f$ with $a \ne b$ there exists $x_0 \in [a, b)$ such that
\[
D^- f(x_0, b - a) \ge f(b) - f(a).
\]

This result was also presented in a weaker form by Borwein and Strójwas (1989) and by Penot (1988). An important application of Diewert’s mean value theorem is due to Komlosi (1995), who shows how quasiconvexity, pseudoconvexity and strict pseudoconvexity of lower semicontinuous functions can be characterized via the quasimonotonicity, pseudomonotonicity and strict pseudomonotonicity of the lower Dini derivative.

A different mean value theorem was stated by Lebourg (1975) for locally Lipschitz functions. It is well known that for each $f \in L(X)$ the directional derivative of Clarke (1975) of $f$ at $x \in \mathrm{dom}\, f$ in the direction $v \in X$,
\[
f^\circ(x, v) = \limsup_{x' \to x,\ t \downarrow 0} \frac{f(x' + tv) - f(x')}{t},
\]
is continuous and sublinear in $v$, and therefore the set
\[
\partial^\circ f(x) = \{x^* \in X^* : f^\circ(x, v) \ge \langle x^*, v \rangle, \ \forall v \in X\}
\]
is a nonempty convex compact set called the Clarke subdifferential of $f$ at $x$. The following mean value theorem plays a fundamental role in nonsmooth analysis.

Theorem 2.2 (Lebourg (1979)) Given a function $f \in L(X)$, for each $a, b \in X$ with $a \ne b$ there exist $x_0 \in (a, b)$ and $x_0^* \in \partial^\circ f(x_0)$ such that
\[
\langle x_0^*, b - a \rangle = f(b) - f(a).
\]

Borwein et al (1987) proved that, under the same assumptions, the Clarke subdifferential $\partial^\circ f$ can be replaced by the smaller subdifferential $\partial^\diamond f$ introduced by Michel and Penot (1984) and defined as the support set of the Michel–Penot directional derivative
\[
f^\diamond(x, v) = \sup_{u \in X} \ \limsup_{\substack{t \downarrow 0 \\ f(x + tu) \to f(x)}} \frac{f(x + tu + tv) - f(x + tu)}{t}.
\]
Both of the theorems just described are in "exact form". The next one, instead, is an "approximate" mean value theorem, and it is based on the directional derivative introduced by Rockafellar (1980). Given $f \in F(X)$, the Clarke–Rockafellar directional derivative of $f$ at $x \in \mathrm{dom}\, f$ in the direction $v \in X$ is
\[
f^\uparrow(x, v) = \sup_{r > 0} \ \limsup_{x' \to_f x,\ t \downarrow 0} \ \inf_{v' \in v + rB} \frac{f(x' + tv') - f(x')}{t}
\]
where $x' \to_f x$ means $x' \to x$ and $f(x') \to f(x)$, and the Clarke–Rockafellar subdifferential of $f$ at $x$ is the closed convex set
\[
\partial^\uparrow f(x) = \{x^* \in X^* : f^\uparrow(x, v) \ge \langle x^*, v \rangle, \ \forall v \in X\}.
\]

Theorem 2.3 (Zagrodny (1988)) Given a function $f \in F(X)$, for each $a, b \in \mathrm{dom}\, f$ with $a \ne b$ there exist $x_0 \in [a, b)$ and two sequences $\{x_k\} \subseteq X$ with $x_k \to x_0$, $\{x_k^*\} \subseteq X^*$ with $x_k^* \in \partial^\uparrow f(x_k)$ such that
\[
\liminf_{k \to +\infty} \langle x_k^*, b - a \rangle \ge f(b) - f(a),
\]
\[
\liminf_{k \to +\infty} \langle x_k^*, b - x_k \rangle \ge \frac{\|b - x_0\|}{\|b - a\|} (f(b) - f(a)).
\]

Like Diewert’s mean value theorem, Zagrodny’s theorem is crucial for studying the convexity of a lower semicontinuous function (see for instance the paper of Correa et al (1992)). We end this section by describing a particular mean value theorem due to Clarke and Ledyaev (1994), which gives an estimate of the rate of growth of a function $f$ in multiple directions simultaneously.

Theorem 2.4 (Clarke and Ledyaev (1994)) Let $C_1, C_2 \subseteq X$ be two nonempty closed convex bounded sets, at least one of which is compact, and let $f \in L(X)$. For each $\varepsilon > 0$ there exist $x_0 \in \mathrm{conv}(C_1 \cup C_2)$ and $x_0^* \in \partial^\circ f(x_0)$ such that
\[
\inf_{C_1} f - \sup_{C_2} f < \langle x_0^*, x_1 - x_2 \rangle + \varepsilon, \qquad \forall x_1 \in C_1, \ \forall x_2 \in C_2.
\]
If in addition $C_1$ and $C_2$ are both compact sets, then
\[
\min_{C_1} f - \max_{C_2} f \le \langle x_0^*, x_1 - x_2 \rangle, \qquad \forall x_1 \in C_1, \ \forall x_2 \in C_2.
\]

This theorem has interesting applications to calculus, flow invariance, and generalized solutions of partial differential equations.
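To make the exact form of Theorem 2.2 concrete, the following sketch checks it by hand in one dimension. The data $X = \mathbb{R}$, $f(x) = |x|$, $a = -1$ and $b = 2$ are assumptions chosen only for this example.

```latex
% Worked example (illustrative assumption: X = \mathbb{R}, f(x) = |x|,
% a = -1, b = 2).
\begin{align*}
f(b) - f(a) &= 2 - 1 = 1, \qquad b - a = 3, \\
f^\circ(0, v) &= |v|, \qquad \partial^\circ f(0) = [-1, 1], \\
x_0 &= 0 \in (a, b), \qquad x_0^* = \tfrac{1}{3} \in \partial^\circ f(0), \\
\langle x_0^*, b - a \rangle &= \tfrac{1}{3} \cdot 3 = 1 = f(b) - f(a).
\end{align*}
```

At every $x \ne 0$ the function is differentiable with derivative $\pm 1$, which would give $\pm 3$ instead of $1$, so in this example the equality can only be realized at the kink, through an element of the Clarke subdifferential.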
3. Axiomatic derivatives and subdifferentials
The concepts of directional derivative and subdifferential of a convex function have been used to advantage in treating convex optimization problems. For more than thirty years much effort has been made to establish similar concepts in the nonconvex nonsmooth case through modifications of the directional derivative. In accordance with such investigations, Elster and Thierfelder (1988) proposed an axiomatic approach for constructing generalized directional derivatives of arbitrary functions: the basic idea is that the epigraphs of the different directional derivatives of a function $f \in E(X)$ can be considered as cone approximations of the epigraph of $f$. The first step was to find an axiomatic definition of abstract local cone approximations.

Definition 3.1 A set–valued map $K : 2^X \times X \rightrightarrows X$ is said to be a local cone approximation if to each set $A \subseteq X$ and each point $x \in X$ a cone $K(A, x)$ is associated such that the following properties hold:
(1) $K(A, x) = K(A - x, 0)$,
(2) $K(A \cap (x + rB), x) = K(A, x)$ for each $r > 0$,
(3) $K(A, x) = \emptyset$ for each $x \notin \mathrm{cl}\, A$,
(4) $K(A, x) = X$ for each $x \in \mathrm{int}\, A$,
(5) $K(\varphi(A), \varphi(x)) = \varphi(K(A, x))$ for each linear homeomorphism $\varphi : X \to X$,
(6) $0^+ A \subseteq 0^+ K(A, x)$ for each $x \in \mathrm{cl}\, A$.
The local cone approximation $K$ is said to be isotone if for each $A_1 \subseteq A_2 \subseteq X$ and for each $x \in X$ we have $K(A_1, x) \subseteq K(A_2, x)$.

The second step was to use the concept of local cone approximation in order to describe generalized directional derivatives.

Definition 3.2 Let $K$ be a local cone approximation, $f \in E(X)$ and $x \in \mathrm{dom}\, f$; the positively homogeneous function $f^K(x, \cdot) : X \to \mathbb{R} \cup \{\pm\infty\}$ defined by
\[
f^K(x, v) = \inf \{y \in \mathbb{R} : (v, y) \in K(\mathrm{epi}\, f, (x, f(x)))\}
\]
is called the K–directional epiderivative of $f$ at $x$. We assume $\inf \emptyset = +\infty$.

From the local property (2) of $K$ we deduce that if two functions $f_1, f_2 \in E(X)$ coincide in a suitable neighbourhood of a point $x \in X$, then
\[
f_1^K(x, v) = f_2^K(x, v), \qquad \forall v \in X.
\]
Using these two notions, general optimality conditions and duality results for nonsmooth optimization problems can be derived (see for instance Elster and Thierfelder (1988), Castellani and Pappalardo (1995)).

A different axiomatic approach in nonsmooth optimization consists in identifying the minimal main properties of subdifferentials. Abstract classes of subdifferentials were considered by Ioffe (1984), Correa et al (1994), and Thibault and Zagrodny (1995). We focus on the paper of Aussel et al (1995), which introduced a class of generalized subdifferentials with much less restrictive properties.

Definition 3.3 A set–valued map $\partial : E(X) \times X \rightrightarrows X^*$ is said to be a generalized subdifferential if it satisfies the following properties:
(1) for any convex function $f$ and any $x \in X$, the set $\partial f(x)$ coincides with the classical subdifferential,
(2) $0 \in \partial f(x)$ whenever $x \in \mathrm{dom}\, f$ is a local minimum of $f$,
(3) for any $x \in X$ and for any real–valued convex continuous function $g$ such that $\partial g(x)$ and $\partial(-g)(x)$ are both nonempty, we have
\[
\partial(f + g)(x) \subseteq \partial f(x) + \partial g(x).
\]
We say that $f \in E(X)$ is $\partial$–subdifferentiable at $x$ if $\partial f(x) \ne \emptyset$; moreover, if also $\partial(-f)(x) \ne \emptyset$, then we say that $f$ is $\partial$–differentiable at $x$.

In particular the Clarke–Rockafellar subdifferential satisfies all three properties. In this direction, Aussel et al (1995) showed that lower semicontinuous
functions on a Banach space satisfy an approximate mean value inequality with respect to any subdifferential operator for which the norm is smooth.

Theorem 3.1 (Aussel et al (1995)) Let $\partial$ be a generalized subdifferential and $\|\cdot\|$ be a $\partial$–smooth norm, i.e. the functions of the form $d^2_{[a,b]}$, where $a, b \in X$, and
\[
\Delta_2(x) = \sum_k \mu_k \|x - x_k\|^2,
\]
where $\sum_k \mu_k = 1$, $\mu_k \ge 0$ and $\{x_k\} \subseteq X$ is a convergent sequence, are $\partial$–differentiable. Given $f \in F(X)$, for each $a, b \in X$ with $a \in \mathrm{dom}\, f$ and $a \ne b$, and for each $r \le f(b)$, there exist $x_0 \in [a, b)$ and two sequences $\{x_k\} \subseteq X$ with $x_k \to_f x_0$, $\{x_k^*\} \subseteq X^*$ with $x_k^* \in \partial f(x_k)$ such that
\[
\liminf_{k \to +\infty} \langle x_k^*, b - a \rangle \ge r - f(a),
\]
\[
\liminf_{k \to +\infty} \langle x_k^*, x - x_k \rangle \ge \frac{\|x - x_0\|}{\|b - a\|} (r - f(a))
\]
for each $x = x_0 + t(b - a)$ with $t > 0$.

Since for any norm the functions $d^2_{[a,b]}$ and $\Delta_2$ belong to $L(X)$, any norm is $\partial^\uparrow$–smooth, and Theorem 3.1 includes Theorem 2.3 as a particular case.
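As an illustration of Theorem 3.1 with $\partial = \partial^\uparrow$, the following sketch works out a discontinuous lower semicontinuous function by hand. The data $X = \mathbb{R}$, the step function $f$, and the values $a = -1$, $b = 1$, $r = f(b)$ are assumptions chosen only for this example.

```latex
% Worked example (illustrative assumption: X = \mathbb{R},
% f(x) = 0 for x \le 0 and f(x) = 1 for x > 0, a = -1, b = 1, r = f(b) = 1).
\begin{align*}
f^\uparrow(0, v) &=
  \begin{cases} 0 & \text{if } v \le 0 \\ +\infty & \text{if } v > 0 \end{cases},
  \qquad
  \partial^\uparrow f(0) = [0, +\infty), \qquad
  \partial^\uparrow f(x) = \{0\} \ \text{ for } x \ne 0, \\
r - f(a) &= 1, \qquad b - a = 2, \\
x_0 &= 0 \in [a, b), \qquad x_k = 0, \qquad x_k^* = \tfrac{1}{2} \in \partial^\uparrow f(x_k), \\
\langle x_k^*, b - a \rangle &= \tfrac{1}{2} \cdot 2 = 1 \ge r - f(a), \\
\langle x_k^*, x - x_k \rangle &= \tfrac{1}{2} \cdot 2t = t
  = \frac{\|x - x_0\|}{\|b - a\|} (r - f(a))
  \qquad \text{for } x = x_0 + t(b - a), \ t > 0.
\end{align*}
```

Away from the jump the only available subgradient is $0$, which cannot satisfy the first inequality, so in this example the sequence has to concentrate at the discontinuity point.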
4. The main result
The aim of this section is to prove an approximate mean value theorem for K–directional epiderivatives. Our proof follows the line of the proof given by Thibault (1995). First of all we need the following necessary optimality condition for an unconstrained minimum problem.

Theorem 4.1 If the function $f \in E(X)$ attains a local minimum at $x \in X$, then the necessary optimality condition
\[
f^K(x, v) \ge 0, \qquad \forall v \in X
\]
is satisfied for each isotone local cone approximation $K$.

Proof. The optimality of $x$ can be written in the following way: there exists $r > 0$ small enough such that
\[
\mathrm{epi}\, f \cap (X \times (-\infty, f(x))) \cap ((x, f(x)) + rB) = \emptyset
\]
or, equivalently,
\[
\mathrm{epi}\, f \cap ((x, f(x)) + rB) \subseteq (X \times (-\infty, f(x)))^c = X \times [f(x), +\infty).
\]
Since $K$ is an isotone local cone approximation, from Axiom (2) and isotonicity we deduce
\[
K(\mathrm{epi}\, f, (x, f(x))) \subseteq K(X \times [f(x), +\infty), (x, f(x))) = K(X \times [0, +\infty), (0, 0))
\]
where the equality follows from Axiom (1). From the inclusion
\[
X \times [0, +\infty) = 0^+ (X \times [0, +\infty)) \subseteq 0^+ K(X \times [0, +\infty), (0, 0)) \tag{1.1}
\]
we deduce that $K(X \times [0, +\infty), (0, 0))$ is a nonempty set and for each $(v, y) \in K(X \times [0, +\infty), (0, 0))$ we have
\[
X \times [y, +\infty) = (v, y) + (X \times [0, +\infty)) \subseteq K(X \times [0, +\infty), (0, 0)).
\]
If $y < 0$, since $K(X \times [0, +\infty), (0, 0))$ is a cone, we could deduce $K(X \times [0, +\infty), (0, 0)) = X \times \mathbb{R}$; but from (1.1) we would have
\begin{align*}
X \times (-\infty, 0] &= -0^+ (X \times [0, +\infty)) \\
&\subseteq -0^+ K(X \times [0, +\infty), (0, 0)) \\
&= 0^+ \bigl( K(X \times [0, +\infty), (0, 0))^c \bigr) \\
&= 0^+ (X \times \mathbb{R})^c = \emptyset,
\end{align*}
which is impossible. Therefore $y \ge 0$ and then
\[
X \times (0, +\infty) \subseteq K(X \times [0, +\infty), (0, 0)) \subseteq X \times [0, +\infty),
\]
which implies
\[
K(\mathrm{epi}\, f, (x, f(x))) \subseteq K(X \times [0, +\infty), (0, 0)) \subseteq X \times [0, +\infty).
\]
For this reason, for each $v \in X$,
\[
f^K(x, v) = \inf \{y \in \mathbb{R} : (v, y) \in K(\mathrm{epi}\, f, (x, f(x)))\} \ge 0,
\]
which concludes the proof.

We emphasize that the last result is equivalent to Axiom (2) in the definition of generalized subdifferential given by Aussel et al (1995). The second result that we need is the well-known Ekeland variational principle.
Theorem 4.2 (Ekeland (1974)) Let $(X, d)$ be a complete metric space and $f \in F(X)$ be bounded from below. For each $\varepsilon > 0$, each $x_0 \in X$ such that
\[
f(x_0) \le \inf_X f + \varepsilon,
\]
and each $\lambda > 0$, there exists $\bar{x} \in X$ such that
\[
f(\bar{x}) \le f(x_0), \qquad d(\bar{x}, x_0) \le \lambda, \qquad f(x) > f(\bar{x}) - \frac{\varepsilon}{\lambda}\, d(x, \bar{x}) \ \text{ for each } x \ne \bar{x}.
\]

We are now in a position to state our main result.

Theorem 4.3 Let $f \in F(X)$ and $K$ be a local cone approximation satisfying the following assumptions:
(1) there exists an isotone local cone approximation $H$ such that
\[
K(A, x) \subseteq H(A, x), \qquad \forall A \subseteq X, \ \forall x \in X;
\]
(2) for each continuous convex function $g$ we have
\[
(f + g)^K(x, v) \le f^K(x, v) + g'(x, v), \qquad \forall x \in \mathrm{dom}\, f, \ \forall v \in X.
\]
Then, for each $a, b \in X$ with $a \in \mathrm{dom}\, f$ and $f(b) \ge f(a)$, there exist $x_0 \in [a, b)$ and a sequence $\{x_k\} \subseteq \mathrm{dom}\, f$ with $x_k \to_f x_0$ such that
\[
\liminf_{k \to +\infty} f^K(x_k, b - a) \ge 0, \tag{1.2}
\]
\[
\liminf_{k \to +\infty} f^K(x_k, b - x_k) \ge 0. \tag{1.3}
\]
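Proof. Since $f$ is lower semicontinuous and $f(b) \ge f(a)$, there exists $x_0 \in [a, b)$ such that
\[
f(x_0) = \min_{[a, b]} f.
\]
Moreover there exist $r > 0$ and $\gamma \in \mathbb{R}$ such that
\[
f(x) \ge \gamma, \qquad \forall x \in \Omega = [a, b] + r\, \mathrm{cl}\, B;
\]
for each $k \in \mathbb{N}$, let $r_k \in (0, r)$ be such that
\[
f(x) \ge f(x_0) - \frac{1}{k^2}, \qquad \forall x \in \Omega_k = [a, b] + r_k B \subset \Omega,
\]
and $t_k > 0$ such that
\[
\gamma + t_k r_k \ge f(x_0) - \frac{1}{k^2}.
\]
Let us observe that
\[
\inf_X \Bigl( f(x) + \delta(x, \Omega) + t_k d_{[a,b]}(x) + \frac{1}{k^2} \Bigr) = \inf_\Omega \Bigl( f + t_k d_{[a,b]} + \frac{1}{k^2} \Bigr) \ge f(x_0);
\]
in fact, if $x \in \Omega_k$, we have
\[
f(x) + t_k d_{[a,b]}(x) + \frac{1}{k^2} \ge f(x_0) - \frac{1}{k^2} + \frac{1}{k^2} = f(x_0),
\]
while if $x \in \Omega \setminus \Omega_k$, then $d_{[a,b]}(x) \ge r_k$ and we have
\[
f(x) + t_k d_{[a,b]}(x) + \frac{1}{k^2} \ge \gamma + t_k r_k + \frac{1}{k^2} \ge f(x_0).
\]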
Since $\Omega$ is a closed set, applying Ekeland’s variational principle to the lower semicontinuous function $f(x) + \delta(x, \Omega) + t_k d_{[a,b]}(x)$ with $\varepsilon = \frac{1}{k^2}$ and $\lambda = \frac{1}{k}$, there exists $x_k \in \Omega$ such that, for each $x \in \Omega$,
\[
\|x_k - x_0\| \le \frac{1}{k},
\]
\[
f(x_k) + t_k d_{[a,b]}(x_k) \le f(x_0),
\]