Essentially Smooth Lipschitz Functions: Compositions and Chain Rules

Jonathan M. Borwein and Warren B. Moors

Abstract. We introduce a new class of real-valued locally Lipschitz functions, which we call arc-wise essentially smooth, and we show that if $g : \mathbb{R}^n \to \mathbb{R}$ is arc-wise essentially smooth on $\mathbb{R}^n$ and each function $f_j : \mathbb{R}^m \to \mathbb{R}$, $1 \le j \le n$, is strictly differentiable almost everywhere in $\mathbb{R}^m$, then $g \circ f$ is strictly differentiable almost everywhere in $\mathbb{R}^m$, where $f \equiv (f_1, f_2, \ldots, f_n)$. We also show that all the semi-smooth and pseudo-regular functions are arc-wise essentially smooth. Thus, we provide a large and robust lattice algebra of Lipschitz functions whose generalized derivatives are extremely well-behaved.

Keywords: Lipschitz functions, Chain rule, Null sets, Differentiability, Essentially strictly differentiable.

AMS (1991) subject classification: Primary: 49J52; 46N10. Secondary: 58C20.

1 Introduction and Preliminaries

In this extended abstract we indicate, in finite dimensions, that locally Lipschitz functions which are essentially smooth (strictly differentiable almost everywhere) possess extremely strong closure properties. The full, quite technical, details are given in [4]. Essentially strictly differentiable functions were studied in great detail in [3]. In particular, it was shown therein that all such functions possess very well-behaved Clarke generalized gradients. More specifically, each such function is:

- generic: generically Fréchet differentiable;
- minimal: the Clarke subgradient is reconstructable from any dense set of gradients;
- integrable: any function sharing the same Clarke subgradient must differ only by a constant (on any open connected set).

In consequence, this class provides an appropriate "pathology free" environment for nonsmooth analysis and optimization, both theoretical and algorithmic. For example, it is by now well appreciated that "semi-smoothness" plays an important role in the analysis of nonsmooth algorithms.

F. Cucker et al. (eds.), Foundations of Computational Mathematics © Springer-Verlag Berlin Heidelberg 1997


We recall some preliminary definitions regarding the Clarke subdifferential mapping [6]. A real-valued function $f$ defined on a non-empty open subset $A$ of a Euclidean (finite dimensional) space $X$ is locally Lipschitz on $A$ if for each $x_0 \in A$ there exist a $K > 0$ and a $\delta > 0$ such that

$$|f(x) - f(y)| \le K \|x - y\| \quad \text{for all } x, y \in B(x_0, \delta).$$

($B(x_0, \delta)$ is the ball of radius $\delta$ around $x_0$.) For functions in this class, it is often instructive to consider the following three directional derivatives:

i. The upper Dini derivative at $x \in A$, in the direction $y$, is given by

$$f^+(x; y) \equiv \limsup_{\lambda \to 0^+} \frac{f(x + \lambda y) - f(x)}{\lambda}.$$

ii. The lower Dini derivative at $x \in A$, in the direction $y$, is given by

$$f^-(x; y) \equiv \liminf_{\lambda \to 0^+} \frac{f(x + \lambda y) - f(x)}{\lambda}.$$

iii. The Clarke generalised directional derivative at $x \in A$, in the direction $y$, is given by

$$f^0(x; y) \equiv \limsup_{z \to x,\ \lambda \to 0^+} \frac{f(z + \lambda y) - f(z)}{\lambda}.$$
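As an illustration that is not part of the original text, consider $f(x) = -|x|$ on $\mathbb{R}$ at the point $x = 0$. The short computation below is a sketch of how the Clarke derivative can be strictly larger than both Dini derivatives, because its base point $z$ is allowed to vary.

```latex
% Illustrative example chosen for exposition (not from the paper):
% f(x) = -|x| on R, evaluated at x = 0.
\begin{align*}
f^+(0; y) &= \limsup_{\lambda \to 0^+} \frac{-|\lambda y|}{\lambda} = -|y|, \qquad
f^-(0; y) = \liminf_{\lambda \to 0^+} \frac{-|\lambda y|}{\lambda} = -|y|, \\
f^0(0; y) &= \limsup_{z \to 0,\ \lambda \to 0^+} \frac{|z| - |z + \lambda y|}{\lambda} = |y|.
\end{align*}
% Upper bound: |z| - |z + \lambda y| \le \lambda |y| by the triangle inequality.
% The bound is attained along z = -\lambda y, where the quotient equals |y|.
```

For $y \ne 0$ the two Dini derivatives agree at $-|y|$, while the Clarke derivative equals $|y|$.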

It is immediate from these three definitions that for each $x \in A$ and each $y \in X$,

$$f^-(x; y) \le f^+(x; y) \le f^0(x; y).$$

We require several notions of differentiability associated with a locally Lipschitz function $f$:

iv. $f$ is differentiable at $x$, in the direction $y$, if

$$f'(x; y) \equiv \lim_{\lambda \to 0} \frac{f(x + \lambda y) - f(x)}{\lambda}$$

exists.

v. $f$ is Gâteaux differentiable at $x$ if

$$\nabla f(x)(y) \equiv \lim_{\lambda \to 0} \frac{f(x + \lambda y) - f(x)}{\lambda}$$

exists for each $y \in X$ and $\nabla f(x)$ is a continuous linear functional on $X$.

We also require two slightly stronger notions of differentiability.

vi. $f$ is strictly differentiable at $x$, in the direction $y$, if

$$\lim_{z \to x,\ \lambda \to 0^+} \frac{f(z + \lambda y) - f(z)}{\lambda}$$

exists.

vii. $f$ is strictly differentiable at $x$ if $f$ is strictly differentiable at $x$ in every direction $y \in X$.

We note that a function $f$ is strictly differentiable at $x$, in the direction $y$ if, and only if,

$$f^0(x; y) = -f^0(x; -y).$$
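To make the criterion $f^0(x; y) = -f^0(x; -y)$ for strict differentiability in the direction $y$ concrete, here is a small worked example that is not taken from the paper: $f(x) = |x|$ on $\mathbb{R}$. It uses $f^0(0; y) = |y|$, which follows from the triangle inequality.

```latex
% Illustrative example (an assumption for exposition): f(x) = |x| on R.
\begin{align*}
x \neq 0:&\quad f^0(x; y) = \operatorname{sgn}(x)\, y = -f^0(x; -y), \\
x = 0:&\quad f^0(0; 1) = 1 \neq -1 = -f^0(0; -1).
\end{align*}
% Near any x \neq 0 the function is linear with slope sgn(x), so the
% criterion holds there; at x = 0 it fails in every direction y \neq 0.
```

So strict differentiability fails only at the single point $0$, which is a Lebesgue null set.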

In [3] the present authors called a real-valued locally Lipschitz function $f$, defined on a non-empty open subset $A$ of a separable Banach space $X$, essentially strictly differentiable on $A$ if $\{x \in A : f \text{ is not strictly differentiable at } x\}$ is a null set (Haar-null in the general setting, Lebesgue null in finite dimensions), and they denoted by $S_e(A)$ the family of all real-valued essentially strictly differentiable locally Lipschitz functions defined on $A$. We shall also call such functions essentially smooth, since strict differentiability is an appropriate localization of continuous differentiability.

We also have reason to consider vector-valued functions. Let $A$ be a non-empty open subset of a Euclidean space $X$ and let $f : A \to \mathbb{R}^n$ be defined by

$$f(x) \equiv (f_1(x), f_2(x), \ldots, f_n(x)), \quad \text{where } f_j : A \to \mathbb{R}.$$

We say that the vector-valued function $f$ is essentially strictly differentiable on $A$ if $f_j \in S_e(A)$ for each $1 \le j \le n$, and in this case we write $f \in S_e(A; \mathbb{R}^n)$. In addition, we will say that $f$ is strictly differentiable at $x \in A$, in the direction $y$, if

$$f_j^0(x; y) = -f_j^0(x; -y) \quad \text{for each } 1 \le j \le n.$$

In this case we write $f^0(x; y) = -f^0(x; -y)$.

Let us now return to the purpose at hand: to show that the family of functions $S_e(A)$ possesses very striking closure properties under composition. A first and optimistic guess might be that if $f_1, f_2, \ldots, f_n \in S_e(A)$ and $g \in S_e(\mathbb{R}^n)$, then $g \circ f \in S_e(A)$, where $f \equiv (f_1, f_2, \ldots, f_n)$. However, the following example shows that in full generality this is not true.

Example 1 Let $A$ denote the open interval $(0,1)$ in $\mathbb{R}$ and let $C$ denote a Cantor subset of $A$ with $\mu(C) > 0$. Define the real-valued functions $f_1$, $f_2$ and $d_C$ on $A$ by $f_1 \equiv 0$, $f_2(t) \equiv t$ and $d_C(t) \equiv \operatorname{dist}(t, C)$. Further, define $g : \mathbb{R}^2 \to \mathbb{R}$ by

$$g(x, y) \equiv \operatorname{dist}((x, y), \{0\} \times C).$$

Clearly, $f_1$ and $f_2$ are strictly differentiable almost everywhere; in fact, $f_1$ and $f_2$ are strictly differentiable everywhere. Moreover, by Theorem 8.5 in [3] we have that $g \in S_e(\mathbb{R}^2)$. We claim that $g \circ f \notin S_e(A)$, where $f \equiv (f_1, f_2)$. To see this, observe that $(g \circ f)(x) = d_C(x)$ for each $x \in A$. Now, it is standard that $d_C$ is not strictly differentiable at any point $x \in C$. Hence $g \circ f$ is not strictly differentiable at any point of $C$, and since $\mu(C) > 0$, $g \circ f \notin S_e(A)$. We will see that imposing a tighter condition on $g$ will repair this defect.
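The two facts used in Example 1 can be sketched as follows. This is an illustrative outline rather than the authors' argument; it assumes the Euclidean norm on $\mathbb{R}^2$ and uses the criterion that a function $h$ is strictly differentiable at $x$ in the direction $y$ exactly when $h^0(x; y) = -h^0(x; -y)$. The symbols $a$, $b'$ and $\epsilon$ are introduced here only for the sketch.

```latex
% Sketch only; assumes the Euclidean norm on R^2
% (any norm with \|(0, s)\| = |s| would serve equally well).
% Step 1: the composition reduces to the one-dimensional distance function.
\[
(g \circ f)(t) = g(0, t)
  = \inf_{c \in C} \|(0, t) - (0, c)\|
  = \inf_{c \in C} |t - c|
  = d_C(t).
\]
% Step 2: fix x in C and \epsilon > 0. Since C has empty interior,
% (x, x + \epsilon) meets a component (a, b) of A \setminus C with
% x \le a < x + \epsilon and a \in C, and (x - \epsilon, x) meets a
% component (a', b') with x - \epsilon < b' \le x and b' \in C.
% For all sufficiently small \lambda > 0,
\[
\frac{d_C(a + \lambda) - d_C(a)}{\lambda} = 1,
\qquad
\frac{d_C(b' - \lambda) - d_C(b')}{\lambda} = 1.
\]
% Step 3: let \epsilon \to 0 and use that d_C is 1-Lipschitz.
\[
d_C^0(x; 1) = 1 \neq -1 = -d_C^0(x; -1).
\]
```

The gaps of the Cantor set accumulate at every point of $C$ from both sides, which is what forces the Clarke derivative to see slope $+1$ in both directions.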

2 A Chain Rule for Lipschitz Functions

We now indicate that there is a large subclass of the essentially strictly differentiable functions that is closed under composition. This class of functions is closely related to Valadier's saine functions [9]. Let $A$ be a non-empty open subset of $\mathbb{R}^n$. We say that a real-valued locally Lipschitz function $f$ defined on $A$ is arc-wise essentially smooth on $A$ if for each locally Lipschitz function $x \in S_e((0,1); \mathbb{R}^n)$,

$$\mu(\{t \in (0,1) : f^0(x(t); x^+(t)) \ne -f^0(x(t); -x^+(t))\}) = 0.$$

[Here, $x^+(t)$ denotes a vector of Dini derivatives. Valadier requires this to hold for all absolutely continuous arcs.] We shall denote by $A_e(A)$ the family of all arc-wise essentially smooth functions on $A$.

Proposition 1 For each non-empty open subset $A$ of $\mathbb{R}^n$, $A_e(A) \subseteq S_e(A)$.

Proposition 2 Let $A$ be a non-empty open subset of $\mathbb{R}$. Then $A_e(A) = S_e(A)$.

Our next goal is to show that the class of functions $A_e(A)$ is reasonably large. Let us begin with the obvious observation that $A_e(A)$ contains all the $C^1$ functions defined on $A$. A useful adjunct is:

Lemma 3 Let $A$ be a non-empty open subset of $\mathbb{R}^n$. Let $f$ be a real-valued locally Lipschitz function defined on $A$. Then $f \in A_e(A)$ if for each essentially strictly differentiable curve $x : (0,1) \to A$,

$$\mu(\{t \in (0,1) : f^0(x(t); x'(t)) = f'(x(t); x'(t))\}) = 1.$$

The following lemma encapsulates the heart of our chain rule.

Lemma 4 Let $A$ be a non-empty open subset of a Euclidean space $X$. Let $f \equiv (f_1, f_2, \ldots, f_n)$ be a locally Lipschitz function from $A$ into $\mathbb{R}^n$. Furthermore, let $g$ be a real-valued locally Lipschitz function defined on a non-empty open subset $B$ of $\mathbb{R}^n$ which contains $f(A)$. Suppose $f$ is strictly differentiable at some point $x_0 \in A$, in the direction $y$, and that either (i) $g$ is strictly differentiable at $f(x_0)$, in the direction $f'(x_0; y)$, or (ii) $f'(x_0; y) = 0$. Then $g \circ f$ is strictly differentiable at $x_0$, in the direction $y$.
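Case (ii) of Lemma 4 may look surprising, since it asks nothing of $g$ beyond the Lipschitz property. The estimate below is a sketch of why it is plausible, not the authors' proof (for which see [4]); the constant $K$ is an assumed Lipschitz constant for $g$ on a neighbourhood of $f(x_0)$.

```latex
% Sketch of case (ii) of Lemma 4. K is an assumed Lipschitz constant
% for g near f(x_0); f is continuous, so f(z) and f(z + \lambda y)
% lie in that neighbourhood once z is near x_0 and \lambda is small.
\begin{align*}
\left| \frac{g(f(z + \lambda y)) - g(f(z))}{\lambda} \right|
  &\le K \left\| \frac{f(z + \lambda y) - f(z)}{\lambda} \right\| \\
  &\longrightarrow K \, \| f'(x_0; y) \| = 0
  \qquad (z \to x_0,\ \lambda \to 0^+).
\end{align*}
% The convergence uses strict differentiability of each f_j at x_0
% in the direction y, which controls the quotient uniformly in z.
```

Thus the limit that defines strict differentiability of $g \circ f$ at $x_0$ in the direction $y$ exists and equals $0$.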

We need a few more definitions before we can show that $A_e(A)$ contains a significant class of functions (above and beyond the $C^1$ functions). Let $f$ be a real-valued locally Lipschitz function defined on a non-empty open subset $A$ of a Euclidean space $X$. Then $f$ is upper semi-smooth (lower semi-smooth) at a point $x \in A$, in the direction $y$, if

$$f^+(x; y) \ge \limsup_{t \to 0^+,\ y' \to y} f^+(x + t y'; y) \qquad \Bigl( f^-(x; y) \le \liminf_{t \to 0^+,\ y' \to y} f^-(x + t y'; y) \Bigr).$$

Moreover, we say that $f$ is semi-smooth at a point $x \in A$, in the direction $y$, if

$$\limsup_{t \to 0^+,\ y' \to y} f^+(x + t y'; y) = f^+(x; y) = f^-(x; y) = \liminf_{t \to 0^+,\ y' \to y} f^-(x + t y'; y).$$

Additionally, we say that $f$ is arc-wise essentially upper semi-smooth (arc-wise essentially lower semi-smooth) on $A$ if for each $x \in S_e((0,1); A)$,

$$\mu(\{t \in (0,1) : f \text{ is upper (lower) semi-smooth at } x(t) \text{ in the direction } x'(t)\}) = 1.$$

The following technical result allows us to perform only "unilateral" analysis in our main theorem.

Theorem 5 Let $f$ be a real-valued locally Lipschitz function defined on a non-empty open subset $A$ of a Euclidean space $X$. If $f$ is arc-wise essentially upper semi-smooth (arc-wise essentially lower semi-smooth) on $A$, then $f \in A_e(A)$.

Now we may show that the family of functions $A_e(A)$ enjoys quite remarkable closure properties.

Theorem 6 Let $A$ be a non-empty open subset of $\mathbb{R}^m$ and suppose $f_1, f_2, \ldots, f_n \in A_e(A)$. Let $g$ be a real-valued locally Lipschitz function defined on a non-empty open subset $U$ of $\mathbb{R}^n$ which contains $f(A)$, where $f \equiv (f_1, f_2, \ldots, f_n)$. If $g \in A_e(U)$, then $g \circ f \in A_e(A)$.

On relaxing the hypotheses on $f_1, f_2, \ldots, f_n$ we may still recover a most satisfactory theorem:

Theorem 7 Let $A$ be a non-empty open subset of $\mathbb{R}^m$, and suppose $f_1, f_2, \ldots, f_n \in S_e(A)$. Suppose further that $U$ is a non-empty open subset of $\mathbb{R}^n$ which contains $f(A)$, where $f \equiv (f_1, f_2, \ldots, f_n)$. Then for each $g \in A_e(U)$, $g \circ f \in S_e(A)$.

Corollary 8 Let $A$ be a non-empty open subset of $\mathbb{R}^m$. Then $A_e(A)$ and $S_e(A)$ are function algebras and vector lattices containing the $C^1$ functions and the convex functions.

We should also observe that from the proof of Theorem 7 it can be shown that for almost all $x$ in $A$,

$$\nabla (g \circ f)(x) = \partial g(f(x)) \cdot \nabla f(x).$$

Here $\partial g$ is the Clarke subgradient of $g$. Note, however, that $g$ may fail to be strictly differentiable at $f(x)$ unless the range of $\nabla f(x)$ is all of $\mathbb{R}^n$.

Finally, let us observe that the class of functions $A_e(\mathbb{R}^n)$ contains many distance functions. Indeed, one can show that if $C$ is a regular subset of $\mathbb{R}^n$, then the distance function generated by this set and any smooth norm on $\mathbb{R}^n$ is in this class. To see this, we need to make the following observations:


(i) For any norm on $\mathbb{R}^n$, the distance function $d_C$ is regular at each point of $C$; see [1].
(ii) If the norm is smooth on $\mathbb{R}^n$, then $-d_C$ is regular on $\mathbb{R}^n \setminus C$; see [2].

We may now define an appropriate class of closed subsets of $\mathbb{R}^n$ as follows. A subset $C$ of $\mathbb{R}^n$ is arc-wise essentially smooth if for each $x \in S_e((0,1); \mathbb{R}^n)$ the set

$$\{t \in (0,1) : x'(t) \in K_C(x(t)) \setminus T_C(x(t))\}$$

has measure zero. Here $K_C(x)$ denotes the contingent cone of $C$ at $x$, and $T_C(x)$ denotes the Clarke tangent cone of $C$ at $x$. Recall that $C$ is regular at $x$ if $K_C(x) = T_C(x)$ [6]. Note that all sets that are regular except on a countable set are arc-wise essentially smooth. Thus, all closed convex sets and all smooth manifolds are arc-wise essentially smooth.

Proposition 9 Let $\|\cdot\|$ be a smooth norm on $\mathbb{R}^n$ and let $C$ be a non-empty closed subset of $\mathbb{R}^n$. Then the distance function generated by the norm $\|\cdot\|$ and the set $C$ is a member of $A_e(\mathbb{R}^n)$ if, and only if, $C$ is arc-wise essentially smooth.

It is immediate from Proposition 9 and Theorem 6 that, unlike the family of regular sets, the family of arc-wise essentially smooth sets is closed under finite unions. We should also observe that from Proposition 2 it follows that the boundary of every arc-wise essentially smooth set is a Lebesgue null set. We also note that the set of Example 1 is a null set but is not arc-wise essentially smooth. Finally, observe that we may now provide general conditions to ensure that exact penalty functions of the form

$$f(x) + K d_C(x)$$

will be in $S_e(A)$ or in $A_e(A)$. Reasonable marginal functions also lie in $A_e(\mathbb{R}^n)$ or $S_e(\mathbb{R}^n)$ ([3]). Thus, these classes successfully exclude most of the "monsters" from nonsmooth optimization.

References

1. J. M. Borwein and M. Fabian, "A note on regularity of sets and of distance functions in Banach space," J. Math. Anal. Appl. 182 (1994), 566-570.
2. J. M. Borwein, S. P. Fitzpatrick and J. R. Giles, "The differentiability of real functions on normed linear spaces using generalized subgradients," J. Math. Anal. Appl. 128 (1987), 512-534.
3. J. M. Borwein and W. B. Moors, "Essentially strictly differentiable Lipschitz functions," CECM Report 95-029.
4. J. M. Borwein and W. B. Moors, "A Chain Rule for Lipschitz Functions," CECM Report 96-057.
5. J. P. R. Christensen, "On sets of Haar measure zero in Abelian groups," Israel J. Math. 13 (1972), 255-260.
6. F. H. Clarke, Optimization and Nonsmooth Analysis, John Wiley, New York (1983).
7. W. B. Moors, "A characterisation of minimal subdifferential mappings of locally Lipschitz functions," Set-Valued Analysis 3 (1995), 129-141.
8. M. E. Munroe, Measure and Integration, Second edition, Addison-Wesley, Reading, 1971.
9. M. Valadier, "Entraînement unilatéral, lignes de descente, fonctions lipschitziennes non pathologiques," C. R. Acad. Sci. Paris 308 Series I (1989), 241-244.
