Report no. 05/10
Two Variational Techniques for the Approximation of Curves of Maximal Slope

Christoph Ortner
This report generalizes several results on approximations of curves of maximal slope in the book by Ambrosio, Gigli and Savaré (Gradient Flows, 2005), allowing general approximations of the functional and of the metric. The conditions guaranteeing the convergence of the approximations are closely related to the conditions of Γ-convergence. Complete convergence results are obtained for λ-convex functionals, together with general results indicating a possible procedure for even more general problems. The theory presented provides convergence results as well as explicit a-priori error estimates for gradient flows of functionals which may be non-convex as well as non-differentiable.
Key words and phrases: curves of maximal slope, Gamma convergence, gradient flows, metric spaces
Oxford University Computing Laboratory Numerical Analysis Group Wolfson Building Parks Road Oxford, England OX1 3QD
June, 2005
Contents

1 Introduction ............................................... 3
  1.1 Analysis in metric spaces .............................. 4
    1.1.1 Metric derivative .................................. 5
    1.1.2 Upper gradient and local slope ..................... 5
2 Approximation by Γ-Convergence ............................. 6
  2.1 The discrete scheme .................................... 6
  2.2 Topological conditions ................................. 6
  2.3 Auxiliary results ...................................... 8
  2.4 The abstract convergence result ........................ 9
  2.5 The λ-convex case ...................................... 14
3 Variational Inequalities ................................... 19
  3.1 Relaxed convexity and coercivity conditions ............ 19
  3.2 Evolutionary variational inequalities .................. 22
  3.3 Continuous case ........................................ 25
  3.4 Discrete case .......................................... 30
  3.5 Examples for (3.25) .................................... 35
1 Introduction
Let (S, d) be a metric space and φ : S → ℝ a functional defined on S. The natural generalization of a gradient flow to the metric setting is a curve of maximal slope, i.e. an absolutely continuous curve u : (a, b) → S satisfying
\[
\frac{d}{dt}\,\phi(u(t)) = -\frac12\, |u'|^2(t) - \frac12\, |\partial\phi|^2(u(t)), \tag{1.1}
\]
where |u'| is the metric derivative of u and |∂φ|(u) the local slope of φ at u (see Sections 1.1.1 and 1.1.2 for the definitions). It is easily seen that, in the Hilbert space setting, (1.1) is equivalent to the classical formulation of a gradient flow if φ is differentiable. For, suppose that S is a Hilbert space and that φ and the curve u are differentiable and satisfy u̇ = −φ′(u); then
\[
\frac{d}{dt}\,\phi(u) = (\phi'(u), \dot u) = -\|\dot u\|\,\|\phi'(u)\| = -\frac12\|\dot u\|^2 - \frac12\|\phi'(u)\|^2.
\]
(Note that in this case ‖u̇‖ = |u'| and ‖φ′(u)‖ = |∂φ|(u).) On the other hand, if (1.1) is satisfied, then
\[
-\frac12\|\dot u\|^2 - \frac12\|\phi'(u)\|^2 = \frac{d}{dt}\,\phi(u) = (\phi'(u), \dot u) \ge -\|\phi'(u)\|\,\|\dot u\| \ge -\frac12\|\dot u\|^2 - \frac12\|\phi'(u)\|^2.
\]
Hence equalities hold in the Cauchy-Schwarz and Cauchy inequalities which we have applied; this, together with the fact that φ is decreasing along u, shows that u̇ = −φ′(u).

The motivation for the study of approximations of curves of maximal slope was the wish to analyze gradient flows of the Mumford-Shah functional, with an eye to applications in computational fracture mechanics. Interesting results for a regularization of the Mumford-Shah functional were recently obtained by Feng and Prohl [FP04] but, except in the one-dimensional setting in a work by Gobbino [Gob98], they could not be extended to gradient flows of the Mumford-Shah functional itself. Even the theory developed here could not yet be applied to this problem, but it is sufficiently general that it should be useful in many other applications.

Chapters 1-3 in [AGS05] provide an existence theory for curves of maximal slope which is based on an implicit (backward Euler) time-discretization, which requires, at the jth time-step, minimizing the functional
\[
U^j \mapsto \frac{d^2(U^j, U^{j-1})}{2\Delta t} + \phi(U^j). \tag{1.2}
\]
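For orientation, a toy example (ours, not from the report): for a quadratic functional on ℝ, minimizing (1.2) reproduces exactly the backward Euler update, which is easily checked numerically.

```python
# Hypothetical 1-D quadratic example: phi(u) = 0.5 * a * u**2 on S = R.
a, dt = 3.0, 0.1
u_prev = 1.0

def phi(u):
    return 0.5 * a * u * u

def step_objective(u):
    # the minimizing-movement functional (1.2) for a single time step
    return (u - u_prev) ** 2 / (2 * dt) + phi(u)

# Stationarity (u - u_prev)/dt + a*u = 0 yields the backward Euler update:
u_euler = u_prev / (1 + dt * a)

# u_euler indeed minimizes (1.2): nearby points give larger objective values
assert all(step_objective(u_euler + s) > step_objective(u_euler)
           for s in (-0.1, -0.01, 0.01, 0.1))
```

The variational form of the step is the point of the metric theory: it makes sense even when φ is merely lower semicontinuous and no gradient exists.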
Using a compactness argument and lower semicontinuity assumptions for φ and |∂φ|, it can be shown that certain interpolants of (U^j) converge to a curve of maximal slope in a suitable topology.

In this paper, a useful generalization of this approach is presented. We allow the functional φ in (1.2) to be replaced by an approximation φ_h. More precisely, for every h ∈ ℕ, we consider an approximate functional φ_h and a time-partition τ_h = (t_h^0, t_h^1, ...) and compute the minimizers U_h^j of the functionals
\[
U_h^j \mapsto \frac{d^2(U_h^j, U_h^{j-1})}{2\,(t_h^j - t_h^{j-1})} + \phi_h(U_h^j).
\]
In Section 2 it is shown that, under suitable conditions on the family (φ_h), certain interpolants of (U_h^j)_{j=0,1,...} converge to a curve of maximal slope. For some classes of functionals, these conditions reduce to Γ(d)-convergence of φ_h to φ, cf. Section 2.5.

Chapter 4 in [AGS05] provides an alternative theory for curves of maximal slope in a more restrictive setting. Under relaxed convexity conditions on the functional φ and strong conditions on the metric (cf. Condition (3.2)), it is shown that the curve of maximal slope u(t) with a given initial value is unique and that it satisfies the evolutionary variational inequality
\[
\frac12 \frac{d}{dt}\, d^2(u(t), v) + \lambda\, d^2(u(t), v) + \phi(u(t)) \le \phi(v) \qquad \forall v \in S,
\]
where λ is related to the relaxed convexity condition (3.2). Based on this inequality, we provide a second theory for the approximation of curves of maximal slope, which yields significantly stronger results, although it is restricted to a much smaller class of problems. In Section 3.3 a time-continuous convergence analysis is developed, whereas Section 3.4 presents the corresponding theory for fully discrete approximations. Section 3 can be read independently of Section 2.

In the remainder of the Introduction, we review some results from the theory of metric spaces which are required for the subsequent analysis.
1.1 Analysis in metric spaces
Let (S, d) be a complete metric space, endowed with an additional Hausdorff topology σ. Throughout this paper, we denote convergence as follows. If d(u_j, u) → 0 as j → ∞, then we write u_j → u. If u_j converges to u in the topology σ, then we write u_j ⇀ u. As σ is typically a weak topology with good compactness properties, we call the convergence u_j ⇀ u weak convergence. We assume throughout this paper that σ is compatible with the topology induced by d, in the sense that if u_j → u then u_j ⇀ u as well. By B(u, r) we denote the closed ball in S with radius r and center u.
1.1.1 Metric derivative
Let v : (a, b) → S, where (a, b) ⊂ ℝ. We say that the curve v belongs to AC^p(a, b; S), for 1 ≤ p ≤ ∞, if there exists a function A ∈ L^p(a, b) such that
\[
d(v(s), v(t)) \le \int_s^t A(r)\, dr \qquad \forall\, s < t \in (a, b).
\]
We identify AC = AC^1. If v ∈ AC^p(a, b; S), then there exists a function |v'| ∈ L^p(a, b), the metric derivative of v, such that
\[
\lim_{s \to t} \frac{d(v(s), v(t))}{|s - t|} = |v'|(t) \qquad \text{for a.e. } t \in (a, b). \tag{1.3}
\]
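As a quick numerical illustration of (1.3) (ours, not from the report): for the unit-speed circle parametrization v(t) = (cos t, sin t) in ℝ², the difference quotients approach |v′|(t) = 1.

```python
import math

def v(t):
    # unit-speed parametrization of the unit circle in R^2
    return (math.cos(t), math.sin(t))

def dist(x, y):
    return math.hypot(x[0] - y[0], x[1] - y[1])

def difference_quotient(curve, t, s):
    # the quotient d(v(s), v(t)) / |s - t| appearing in (1.3)
    return dist(curve(s), curve(t)) / abs(s - t)

# the quotient tends to the metric derivative |v'|(t) = 1 as s -> t
for t in (0.0, 0.7, 2.0):
    assert abs(difference_quotient(v, t, t + 1e-6) - 1.0) < 1e-6
```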
If v ∈ AC^p(a, b; S), then v is uniformly continuous as a mapping from [a, b] to S. In particular, if a ∈ ℝ (or b ∈ ℝ), then v(a+) (or v(b−)) exists. If v ∈ AC(a, b; S), then v(a+) and v(b−) exist even in the case (a, b) = ℝ. If S is a Hilbert space and d the metric induced by its norm, then for any curve v ∈ AC(a, b; S), v̇(t) exists for a.e. t ∈ (a, b) and the metric derivative |v'| coincides with ‖v̇‖_S. For further details and proofs, see Section 1.1 in [AGS05].

1.1.2 Upper gradient and local slope
Let φ : S → (−∞, ∞] be a functional on S. We denote its domain by D(φ), i.e., D(φ) = {v ∈ S : φ(v) < ∞}. We say that a function g : S → [0, ∞] is a strong upper gradient for φ if, for every curve v ∈ AC(a, b; S), the function g ∘ v is Borel and
\[
|\phi(v(t)) - \phi(v(s))| \le \int_s^t g(v(r))\, |v'|(r)\, dr. \tag{1.4}
\]
A typical candidate for an upper gradient, the local slope of φ at a point v, is defined as
\[
|\partial\phi|(v) = \limsup_{w \to v} \frac{(\phi(v) - \phi(w))^+}{d(v, w)}. \tag{1.5}
\]
Here and below we assume p ∈ (1, ∞) and set p′ = p/(p − 1), so that 1/p + 1/p′ = 1. We say that a curve u ∈ AC^p_loc(a, b; S) is a p-curve of maximal slope for the functional φ with respect to its upper gradient g if t ↦ φ(u(t)) is 𝓛¹-a.e. equal to a non-increasing map and
\[
\frac{d}{dt}\,\phi(u(t)) \le -\frac{1}{p}\, |u'|^p(t) - \frac{1}{p'}\, g(u(t))^{p'} \tag{1.6}
\]
for 𝓛¹-a.e. t ∈ (a, b). If S is a Hilbert space and φ is Fréchet differentiable at u, then |∂φ|(u) = ‖φ′(u)‖_S. For further information and proofs, see Sections 1.2-1.4 in [AGS05].
2 Approximation by Γ-Convergence

2.1 The discrete scheme
For every h ∈ ℕ, let τ_h be a mesh on [0, ∞), τ_h = (0 = t_h^0, t_h^1, ...). We denote τ_h^j = t_h^j − t_h^{j−1} and |τ_h| = sup_{j≥1} τ_h^j, and we assume that |τ_h| → 0 as h → ∞. Let (φ_h)_{h∈ℕ} be a family of functionals and (d_h)_{h∈ℕ} a family of metrics on S which approximate φ and d, respectively, as h → ∞, in a sense made precise in Section 2.2. The discrete slopes are defined as
\[
|\partial_h \phi_h|(v) = \limsup_{d_h(w, v) \to 0} \frac{(\phi_h(v) - \phi_h(w))^+}{d_h(v, w)}.
\]
Let (U_h^0)_{h∈ℕ} be a family of approximate initial values. We say that (U_h^j)_{j≥0} is a discrete solution (of the gradient flow) if, for every j ≥ 1, U_h^j minimizes the functional
\[
v \mapsto \Phi_h(U_h^{j-1}, \tau_h^j; v) := \frac{d_h^p(U_h^{j-1}, v)}{p\,(\tau_h^j)^{p-1}} + \phi_h(v) \tag{2.1}
\]
over S, and we define J_{h,τ}[u] to be the set of minimizers of Φ_h(u, τ; ·). Suppose that (U_h^j)_{j≥0} is a discrete solution. We define the interpolants
\[
\bar U_h(t) = \begin{cases} U_h^0, & t = 0, \\ U_h^j, & t_h^{j-1} < t \le t_h^j,\ j \ge 1, \end{cases}
\qquad
|U_h'|(t) = \frac{d_h(U_h^j, U_h^{j-1})}{\tau_h^j} \quad \text{if } t_h^{j-1} < t < t_h^j.
\]
By Ũ_h(t) (De Giorgi's variational interpolant) we denote any interpolant of (U_h^j) such that Ũ_h(t) ∈ J_{h, t − t_h^{j−1}}[U_h^{j−1}] whenever t_h^{j−1} < t ≤ t_h^j. Finally, we denote
\[
G_h(t) = \frac{d_h(\tilde U_h(t), U_h^{j-1})}{t - t_h^{j-1}} \quad \text{if } t_h^{j-1} < t \le t_h^j.
\]
It follows from Lemma 1(b) that
\[
|\partial_h \phi_h|^{p'}(\tilde U_h(t)) \le G_h^p(t) \qquad \forall t \in (0, \infty),\ \forall h \in \mathbb{N}, \tag{2.2}
\]
thus providing a connection between G_h and the discrete slopes. The existence of an interpolant Ũ_h is guaranteed for sufficiently small |τ_h| by condition (2.3) below.
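As an illustration (ours, not from the report), the discrete scheme can be run directly on S = ℝ with p = 2 and the non-smooth functional φ_h = φ(u) = |u|; the minimizer of each step is located here by a crude grid search, which is enough for a sketch.

```python
def discrete_solution(phi, u0, dt, n_steps, p=2):
    """Minimizing-movement steps (2.1) on S = R with d(u, v) = |u - v|.

    Each step minimizes v -> |v - U_prev|**p / (p * dt**(p-1)) + phi(v);
    here the minimizer is located by brute force on a fine grid around U_prev."""
    us = [u0]
    for _ in range(n_steps):
        u_prev = us[-1]
        grid = [u_prev + k * 1e-3 for k in range(-2000, 2001)]
        us.append(min(grid,
                      key=lambda w: abs(w - u_prev) ** p / (p * dt ** (p - 1)) + phi(w)))
    return us

# gradient flow of phi(u) = |u| moves toward 0 at unit speed, then stops
traj = discrete_solution(abs, u0=1.0, dt=0.1, n_steps=15)
```

For this example the step has the closed form of soft-thresholding, U^j = max(U^{j−1} − Δt, 0) for positive data, which the grid search reproduces up to grid resolution.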
2.2 Topological conditions
We start with a condition that guarantees the solvability of the discrete scheme for a sufficiently fine partition. We assume that
\[
\exists\, \tau^* > 0 \ \text{such that}\ \forall\, 0 < \tau \le \tau^*,\ u \in S,\ h \in \mathbb{N}:\quad J_{h,\tau}[u] \ne \emptyset. \tag{2.3}
\]
Detailed assumptions under which (2.3) is satisfied are standard; see for example [Dac89] and, in particular, Section 2.2 in [AGS05]. Next, the initial conditions are required to satisfy
\[
U_h^0 \rightharpoonup u_0, \qquad \phi_h(U_h^0) \to \phi(u_0), \qquad \text{and} \quad \sup_h d(U_h^0, u_0) < \infty. \tag{2.4}
\]
A priori bounds on φ_h(u_h) are usually obtained from an equi-coercivity assumption on the family (φ_h). Here, in the time-dependent case, the following condition is sufficient. We assume that
\[
\exists\, u^* \in S,\ \tau^* > 0:\quad \inf_{h \in \mathbb{N}}\, \inf_{S}\, \Phi_h(\tau^*, u^*; \cdot) = c_0 > -\infty. \tag{2.5}
\]
We have used the time-step restriction τ* both in (2.3) and in (2.5). We may of course assume without loss of generality that the two thresholds coincide but, more importantly, they are typically closely related, since τ* represents the threshold below which Φ(τ, u; ·) becomes bounded below. To obtain compactness of the family of discrete solutions, we require that d_h- and φ_h-bounded sets are sequentially σ-compact:
\[
\sup_h \big( d_h(U_h^0, u_h) + \phi_h(u_h) \big) < \infty
\quad \Rightarrow \quad
\exists\, (h_k)_{k \in \mathbb{N}} \uparrow \infty,\ u \in S \ \text{s.t.}\ u_{h_k} \rightharpoonup u. \tag{2.6}
\]
Here U_h^0 denotes the approximate initial value, which should also satisfy (2.4). To prove that limits of discrete solutions are curves of maximal slope, three liminf conditions, similar to those of Γ-convergence¹, are crucial. We will require for several results below that the following hold:
\[
(u_h, v_h) \rightharpoonup (u, v) \ \Rightarrow\ \liminf_{h \to \infty} d_h(u_h, v_h) \ge d(u, v), \tag{2.7}
\]
\[
u_h \rightharpoonup u \ \Rightarrow\ \liminf_{h \to \infty} \phi_h(u_h) \ge \phi(u). \tag{2.8}
\]
The third liminf condition is a liminf condition for the slopes. Let g : S → [0, ∞] be defined as
\[
g(u) = \Gamma(\sigma)\text{-}\liminf_{h \to \infty} |\partial_h \phi_h|(u)
     = \inf\Big\{ \liminf_{h \to \infty} |\partial_h \phi_h|(u_h) :\ u_h \rightharpoonup u \Big\}. \tag{2.9}
\]
The most general way to formulate the remaining liminf condition is to require that g be a strong upper gradient. In practice, it may be advantageous to show instead that |∂φ| is an upper gradient and is bounded above by g, i.e., to require that
\[
u_h \rightharpoonup u \ \Rightarrow\ |\partial\phi|(u) \le \liminf_{h \to \infty} |\partial_h \phi_h|(u_h). \tag{2.10}
\]
A reader familiar with Γ-convergence will have noticed that no limsup condition was formulated. As it turns out, the limsup condition is implicitly hidden in the assumption that g is an upper gradient. This condition is probably the most difficult one to verify; in Section 2.5 we give an example of such a technique.

¹ For an introduction to Γ-convergence see, for example, [Bra02].
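To see why the slope condition is the delicate one, consider the following toy example (ours, not from the report): on S = ℝ with d_h = d the Euclidean distance, let φ_h(x) = x² + h⁻¹ sin(hx), so that φ_h → φ(x) = x² uniformly; hence (2.8) and the recovery condition of Γ-convergence hold with v_h = v. Nevertheless:

```latex
% slopes of the approximations versus the limit:
|\partial_h \phi_h|(x) = |2x + \cos(hx)|, \qquad |\partial\phi|(x) = 2|x|.
% Since x_h \to x can be chosen so that \cos(h x_h) approaches any value in [-1, 1],
g(x) = \Gamma(\sigma)\text{-}\liminf_{h \to \infty} |\partial_h \phi_h|(x)
     = \big( 2|x| - 1 \big)^+ \;<\; |\partial\phi|(x)
     \quad \text{for } 0 < |x| < \tfrac12,
% so (2.10) fails.  Note that \phi_h is \lambda_h-convex only with
% \lambda_h = 2 - h \to -\infty, in line with the uniform-\lambda assumption
% needed in Section 2.5.
```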
2.3 Auxiliary results
For every τ > 0 and u, v ∈ S, let
\[
\Phi(u, \tau; v) = \frac{d^p(u, v)}{p \tau^{p-1}} + \phi(v), \qquad J_\tau[u] = \operatorname{argmin} \Phi(u, \tau; \cdot).
\]
We define the Moreau-Yosida approximation of φ by φ_τ(u) = inf_S Φ(u, τ; ·). If J_τ[u] ≠ ∅, then we also define
\[
d_\tau^+(u) = \sup_{u_\tau \in J_\tau[u]} d(u_\tau, u) \qquad \text{and} \qquad d_\tau^-(u) = \inf_{u_\tau \in J_\tau[u]} d(u_\tau, u).
\]
The following lemma is the first crucial step towards the convergence theorem. In particular, part (c) will provide a discrete version of (1.6).

Lemma 1 (Properties of the Moreau-Yosida Approximation) Assume that φ is d-lower semicontinuous and that there exists τ* > 0 such that J_τ[u] ≠ ∅ for 0 < τ < τ*.

(a) The mapping (τ, u) ↦ φ_τ(u) is continuous in (0, τ*) × S and, if 0 < τ_0 < τ_1 < τ* and u_{τ_i} ∈ J_{τ_i}[u], then
\[
\phi(u) \ge \phi_{\tau_0}(u) \ge \phi_{\tau_1}(u), \qquad \phi(u) \ge \phi(u_{\tau_0}) \ge \phi(u_{\tau_1}),
\]
\[
d(u_{\tau_0}, u) \le d(u_{\tau_1}, u), \qquad d_{\tau_0}^+(u) \le d_{\tau_1}^-(u) \le d_{\tau_1}^+(u). \tag{2.11}
\]
In particular,
\[
\lim_{\tau \downarrow 0} \phi_\tau(u) = \lim_{\tau \downarrow 0}\, \inf_{u_\tau \in J_\tau[u]} \phi(u_\tau) = \phi(u),
\]
and if u ∈ D(φ) then lim_{τ↓0} d_τ^+(u) = 0. Furthermore, there exists an at most countable set N_u ⊂ (0, τ*) such that d_τ^−(u) = d_τ^+(u) for all τ ∈ (0, τ*) \ N_u.

(b) If u_τ ∈ J_τ[u], then u_τ ∈ D(|∂φ|) and
\[
|\partial\phi|^{p'}(u_\tau) \le \frac{d^p(u_\tau, u)}{\tau^p}.
\]

(c) For every u ∈ S, the map τ ↦ φ_τ(u) is locally Lipschitz continuous in (0, τ*) and
\[
\frac{d}{d\tau}\, \phi_\tau(u) = -\frac{(d_\tau^\pm(u))^p}{p' \tau^p} \qquad \forall \tau \in (0, \tau^*) \setminus N_u.
\]
In particular, we have that, for every τ ∈ (0, τ*) and u_τ ∈ J_τ[u],
\[
\frac{d^p(u_\tau, u)}{p \tau^{p-1}} + \int_0^\tau \frac{(d_r^\pm(u))^p}{p' r^p}\, dr = \phi(u) - \phi(u_\tau). \tag{2.12}
\]
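The monotonicity statements of Lemma 1(a) are easy to observe numerically. A minimal sketch (ours; p = 2, brute-force minimization over a grid) for the hypothetical non-convex double well φ(u) = (u² − 1)²:

```python
GRID = [k * 1e-3 for k in range(-3000, 3001)]  # grid on [-3, 3] standing in for S = R

def phi(u):
    return (u * u - 1.0) ** 2  # non-convex double-well functional

def moreau_yosida(u, tau, p=2):
    # phi_tau(u) = inf_v [ |v - u|**p / (p * tau**(p-1)) + phi(v) ], taken over GRID
    return min(abs(w - u) ** p / (p * tau ** (p - 1)) + phi(w) for w in GRID)

u = 0.25
vals = [moreau_yosida(u, tau) for tau in (0.4, 0.2, 0.1, 0.05)]

# (2.11): phi_tau(u) increases as tau decreases, and stays below phi(u)
assert vals == sorted(vals)
assert all(val <= phi(u) for val in vals)
```

Letting τ ↓ 0 further, one observes φ_τ(u) → φ(u), the first limit in part (a).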
For a proof of Lemma 1, see Lemmas 3.1.2 to 3.1.5 as well as Remark 3.1.7 in [AGS05]. Note, however, that the completeness of the metric space is not required here. This allows a more general choice of the approximate metrics d_h.

We also state two compactness theorems which are used in the analysis of Section 2. For their proofs see Proposition 3.3.1 and Lemma 3.3.3 in [AGS05].

Lemma 2 (Ascoli-Arzelà Theorem) Let T > 0, let K ⊂ S be a sequentially compact set with respect to σ and, for h ∈ ℕ, let u_h : [0, T] → S be curves such that
\[
\forall t \in [0, T],\ h \in \mathbb{N}:\ u_h(t) \in K, \qquad \text{and} \qquad
\forall s, t \in [0, T]:\ \limsup_{h \to \infty} d(u_h(s), u_h(t)) \le w(s, t),
\]
where w : [0, T] × [0, T] → [0, ∞) is a symmetric function such that
\[
\forall r \in [0, T] \setminus N:\quad \lim_{(s, t) \to (r, r)} w(s, t) = 0,
\]
and where N is at most countable. Then there exist a subsequence (u_{h_k}) and a limit curve u : [0, T] → S such that u is d-continuous in [0, T] \ N and
\[
\forall t \in [0, T]:\ u_{h_k}(t) \rightharpoonup u(t).
\]

Lemma 3 (Helly's Theorem) Suppose that, for h ∈ ℕ, ϕ_h : [0, T] → [−∞, ∞] are non-increasing functions. Then there exist a subsequence (ϕ_{h_k}) and a non-increasing map ϕ : [0, T] → [−∞, ∞] such that lim_k ϕ_{h_k}(t) = ϕ(t) for every t ∈ [0, T].
2.4 The abstract convergence result
In this section we present all steps of the proof that discrete solutions converge to a curve of maximal slope. Proposition 4 provides a discrete version of (1.6) together with bounds on the energy and the distance, which are used in Proposition 5 to establish the weak sequential compactness of discrete solutions. Finally, in Proposition 6 we show that any accumulation point of the family of discrete solutions is a curve of maximal slope, using the liminf conditions of Section 2.2.

The following proposition is a version of Lemma 3.2.2 in [AGS05], adapted to the present case, where it is crucial that all a-priori bounds are independent of h.

Proposition 4 (A-Priori Estimates) Suppose that (2.3), (2.4) and (2.5) hold, and suppose that for every h ∈ ℕ, (U_h^j)_{j∈ℕ} is a discrete solution as described in Section 2.1. Then, for every T > 0, there exists a constant C_0 ∈ ℝ such that, for all j, h ∈ ℕ with t_h^j ≤ T, we have
\[
\phi_h(U_h^j) \ge C_0. \tag{2.13}
\]
Furthermore, the following estimates hold whenever 0 ≤ t_h^{j_1} ≤ t_h^{j_2} ≤ T and t_h^{j−1} < t ≤ t_h^j ≤ T:
\[
\frac{1}{p} \int_{t_h^{j_1}}^{t_h^{j_2}} |U_h'|^p(t)\, dt + \frac{1}{p'} \int_{t_h^{j_1}}^{t_h^{j_2}} |G_h(t)|^p\, dt + \phi_h(U_h^{j_2}) = \phi_h(U_h^{j_1}), \tag{2.14}
\]
\[
\frac{1}{p}\, d_h^p(U_h^j, U_h^0) \le (t_h^j)^{p/p'}\, \big( \phi_h(U_h^0) - C_0 \big), \tag{2.15}
\]
\[
\frac{1}{p}\, d_h^p(\tilde U_h(t), \bar U_h(t)) \le \frac{2^p |\tau_h|^{p-1}}{p} \int_{t_h^{j-1}}^{t_h^j} |U_h'|^p(t)\, dt \le 2^p |\tau_h|^{p-1} \big( \phi_h(U_h^0) - C_0 \big). \tag{2.16}
\]
Proof Using equation (2.12) and the definitions of |U_h'| and G_h gives
\[
\frac{1}{p} \int_{t_h^{j-1}}^{t_h^j} |U_h'|^p(t)\, dt + \frac{1}{p'} \int_{t_h^{j-1}}^{t_h^j} |G_h(t)|^p\, dt + \phi_h(U_h^j) = \phi_h(U_h^{j-1}).
\]
Summing over j = j_1 + 1, ..., j_2, we obtain (2.14). To obtain a lower bound on the energy, consider
\[
\phi_h(U_h^j) = \Phi_h(\tau^*, u^*; U_h^j) - \frac{d_h^p(u^*, U_h^j)}{p\,(\tau^*)^{p-1}} \ge c_0 - \frac{d_h^p(u^*, U_h^j)}{p\,(\tau^*)^{p-1}}, \tag{2.17}
\]
where we used assumption (2.5). It is therefore sufficient to compute an upper bound on d(U_h^j, u*). Following the proof of Lemma 3.2.2 in [AGS05], we estimate, for ε > 0,
\[
\frac12 d^2(U_h^j, u^*) - \frac12 d^2(U_h^0, u^*)
= \sum_{i=1}^{j} \Big[ \frac12 d^2(U_h^i, u^*) - \frac12 d^2(U_h^{i-1}, u^*) \Big]
\le \sum_{i=1}^{j} d(U_h^i, U_h^{i-1})\, d(U_h^i, u^*)
\]
\[
\le \frac{1}{2\varepsilon} \int_0^{t_h^j} |U_h'|^2(t)\, dt + \frac{\varepsilon}{2} \sum_{i=1}^{j} \tau_h^i\, d^2(U_h^i, u^*)
\le \phi_h(U_h^0) - c_0 + \frac{1}{2\tau^*}\, d^2(U_h^j, u^*) + \frac{\varepsilon}{2} \sum_{i=1}^{j} \tau_h^i\, d^2(U_h^i, u^*).
\]
Choosing ε = τ*/2 and using (2.4) and (2.5), we obtain
\[
d^2(U_h^j, u^*) \le a + b \sum_{i=1}^{j} \tau_h^i\, d^2(U_h^i, u^*),
\]
where a and b are constants which depend only on the bounds in (2.4) and (2.5). Applying Gronwall's inequality, Lemma 31, we obtain a constant D, depending on T but not on h, such that
\[
d^2(U_h^j, u^*) \le D \qquad \forall h, j \in \mathbb{N} \ \text{s.t.}\ t_h^j \le T. \tag{2.18}
\]
Together with (2.17), this gives (2.13). Successive application of the triangle inequality gives the bound
\[
d_h(U_h^j, U_h^0) \le \sum_{i=1}^{j} \tau_h^i\, \frac{d_h(U_h^i, U_h^{i-1})}{\tau_h^i} = \int_0^{t_h^j} |U_h'|(t)\, dt,
\]
to which we apply Hölder's inequality to obtain
\[
d_h(U_h^j, U_h^0) \le (t_h^j)^{1/p'} \left( \int_0^{t_h^j} |U_h'|^p(t)\, dt \right)^{1/p}.
\]
Equation (2.15) is obtained by taking the p-th power and applying (2.14) as well as (2.13). Suppose now that t_h^{j−1} < t < t_h^j. From (2.11), we see that
\[
d_h(\tilde U_h(t), \bar U_h(t)) \le d_h(\tilde U_h(t), U_h^{j-1}) + d_h(U_h^{j-1}, U_h^j) \le 2\, d_h(U_h^j, U_h^{j-1}).
\]
Equation (2.16) can be obtained by taking the p-th power and applying (2.14) again. □
Proposition 5 (Compactness) Suppose now that, for every h ∈ ℕ, (U_h^j)_{j∈ℕ} is a discrete solution and that all conditions (2.3)-(2.8) are satisfied. Then there exist a subsequence (h_n)_{n∈ℕ} ↑ ∞, a curve u ∈ AC^p_loc([0, ∞); S), a non-increasing function ϕ : [0, ∞) → [−∞, ∞], and a function A ∈ L^p_loc[0, ∞) such that
\[
\forall t \ge 0:\ \bar U_{h_n}(t) \rightharpoonup u(t), \quad \tilde U_{h_n}(t) \rightharpoonup u(t) \quad \text{as } n \to \infty, \tag{2.19}
\]
\[
\forall t \ge 0:\ \lim_{n \to \infty} \phi_{h_n}(\bar U_{h_n}(t)) = \varphi(t) \ge \phi(u(t)), \tag{2.20}
\]
\[
|U_{h_n}'| \rightharpoonup A \ \text{in } L^p_{\mathrm{loc}}[0, \infty), \quad \text{and} \quad A \ge |u'| \ \mathcal{L}^1\text{-a.e.} \tag{2.21}
\]
Proof Equation (2.14) shows that |U_h'| is uniformly bounded, and hence weakly compact, in L^p(0, T) for each T > 0, which guarantees the existence of a subsequence (h_n) such that |U_{h_n}'| converges weakly in L^p_loc[0, ∞) to some function A ∈ L^p_loc[0, ∞). By Lemma 3, possibly upon extracting a further subsequence, we obtain a limit function ϕ such that φ_{h_n}(Ū_{h_n}(t)) → ϕ(t) for every t ≥ 0, and ϕ(0) = φ(u_0).

Conditions (2.14) and (2.15) show that d_h(Ū_h(t), U_h^0) is uniformly bounded in h and t, for 0 ≤ t ≤ T. By assumption (2.6), the ranges of the curves Ū_h are contained in a σ-compact set. Fix 0 ≤ s < t and define
\[
s(h) = \max\{ t_h^j :\ t_h^j < s \} \qquad \text{and} \qquad t(h) = \min\{ t_h^j :\ t_h^j > t \},
\]
so that s(h) ≤ s ≤ t ≤ t(h), and s(h) → s, t(h) → t as |τ_h| → 0. We have
\[
d_h(\bar U_h(s), \bar U_h(t)) \le \int_{s(h)}^{t(h)} |U_h'|(r)\, dr,
\]
and therefore, using the weak convergence of |U_{h_n}'|,
\[
\limsup_{n \to \infty} d_{h_n}(\bar U_{h_n}(s), \bar U_{h_n}(t)) \le \int_s^t A(r)\, dr.
\]
This allows us to apply Lemma 2 on bounded intervals [0, T], so that, upon extracting a further subsequence, we may assume that there exists a limit curve u such that Ū_{h_n}(t) ⇀ u(t) for every t ≥ 0. The liminf condition (2.8) for the functionals gives (2.20), and the liminf condition (2.7) for the metric gives the limit inequality
\[
d(u(s), u(t)) \le \liminf_{n \to \infty} d_{h_n}(\bar U_{h_n}(s), \bar U_{h_n}(t)) \le \int_s^t A(r)\, dr,
\]
which shows that u ∈ AC^p_loc(0, ∞; S) and that |u'| ≤ A a.e. in (0, ∞). □
Proposition 5 puts us in the position to prove the convergence of discrete solutions to a curve of maximal slope. All we require is some information about the limit of G_{h_n}. A trivial lower bound for liminf_n G_{h_n}(t) is the function g(u(t)) defined in (2.9).

Proposition 6 Suppose that, for every h ∈ ℕ, (U_h^j)_{j∈ℕ} is a discrete solution, as described in Section 2.1, with time-partition τ_h, and that u is a limit curve satisfying (2.19)-(2.21). Suppose also that (2.3) and (2.4), as well as (2.7) and (2.8), are satisfied, and that the functional g defined in (2.9) is a strong upper gradient for φ. Then u is a curve of maximal slope for φ with respect to g, φ ∘ u belongs to AC_loc[0, ∞), the energy identity
\[
\phi(u(t)) = \phi(u(0)) - \frac{1}{p} \int_0^t |u'|^p(r)\, dr - \frac{1}{p'} \int_0^t g(u(r))^{p'}\, dr \tag{2.22}
\]
holds for all t > 0, and we have the following convergence properties for the discrete solution:
\[
\lim_{n \to \infty} \phi_{h_n}(\bar U_{h_n}(t)) = \phi(u(t)) \quad \forall t \in [0, \infty), \qquad
\lim_{n \to \infty} |\partial_{h_n} \phi_{h_n}|(\bar U_{h_n}) = g \circ u \ \text{in } L^{p'}_{\mathrm{loc}}[0, \infty),
\]
\[
\lim_{n \to \infty} |U_{h_n}'| = |u'| \ \text{in } L^p_{\mathrm{loc}}[0, \infty). \tag{2.23}
\]
Proof Using (2.21), (2.9), and (2.20), we obtain, for any t > 0,
\[
\frac{1}{p} \int_0^t |u'|^p(r)\, dr + \frac{1}{p'} \int_0^t g(u(r))^{p'}\, dr + \phi(u(t)) \tag{2.24}
\]
\[
\le \frac{1}{p} \int_0^t |A(r)|^p\, dr + \frac{1}{p'} \liminf_n \int_0^t |\partial_{h_n} \phi_{h_n}|^{p'}(\tilde U_{h_n}(r))\, dr + \lim_n \phi_{h_n}(\bar U_{h_n}(t))
\]
\[
\le \frac{1}{p} \liminf_n \int_0^t |U_{h_n}'|^p(r)\, dr + \frac{1}{p'} \liminf_n \int_0^t |G_{h_n}(r)|^p\, dr + \lim_n \phi_{h_n}(\bar U_{h_n}(t))
\]
\[
\le \limsup_n \left[ \frac{1}{p} \int_0^t |U_{h_n}'|^p(r)\, dr + \frac{1}{p'} \int_0^t |G_{h_n}(r)|^p\, dr + \phi_{h_n}(\bar U_{h_n}(t)) \right]
= \limsup_n \phi_{h_n}(\bar U_{h_n}(0)) = \varphi(0) = \phi(u_0).
\]
Since g is a strong upper gradient for φ, we also have
\[
\phi(u(0)) - \phi(u(t)) \le \int_0^t g(u(r))\, |u'|(r)\, dr
\le \int_0^t \Big[ \frac{1}{p}\, |u'|^p(r) + \frac{1}{p'}\, g(u(r))^{p'} \Big]\, dr, \tag{2.25}
\]
and therefore all of the estimates stated in (2.24) must be equalities. First, we immediately obtain the first line of (2.23). Second, we find that the norms ‖ |U_{h_n}'| ‖_{L^p(0,t)} converge, and hence |U_{h_n}'| converges strongly in L^p_loc[0, ∞) to A; furthermore, A = |u'|. To see that g ∘ u = lim_n |∂_{h_n} φ_{h_n}|(Ũ_{h_n}), note that, due to (2.14), the family |∂_{h_n} φ_{h_n}|(Ũ_{h_n}) is bounded in L^{p'}(0, T) for every T > 0. Hence there exist a subsequence (h_{n_k}) and a function G ∈ L^{p'} such that |∂_{h_{n_k}} φ_{h_{n_k}}|(Ũ_{h_{n_k}}) ⇀ G weakly in L^{p'}. As above, we may deduce that g ∘ u = G. This shows that the limit does not depend on the chosen subsequence, and hence the entire sequence |∂_{h_n} φ_{h_n}|(Ũ_{h_n}) converges weakly to g ∘ u. To obtain strong convergence in L^{p'}(0, t), and hence (2.23), we may use the same argument as for the convergence of |U_{h_n}'|. The energy identity (2.22), as well as the fact that φ ∘ u ∈ AC_loc[0, ∞), follows from
\[
\phi(u(0)) - \phi(u(t)) = \int_0^t g(u(r))\, |u'|(r)\, dr
= \frac{1}{p} \int_0^t |u'|^p(r)\, dr + \frac{1}{p'} \int_0^t g(u(r))^{p'}\, dr,
\]
from which we may also deduce that
\[
\frac{d}{dt}\, \phi(u(t)) = -\frac{1}{p}\, |u'|^p(t) - \frac{1}{p'}\, g(u(t))^{p'} \qquad \text{for } \mathcal{L}^1\text{-a.e. } t \in (0, \infty). \qquad \Box
\]
Remark 7 The convergence result contained in Proposition 6, combined with Proposition 5, merely states that the family of approximations contains one subsequence converging to a solution. This might seem a very weak result at first glance. Note, however, that if only one solution exists, then every convergent subsequence must converge to this solution, and it follows easily that the entire family of approximations converges. In practical computations, we do not expect this to be a problem at all. First, we may not even refine the approximation; and even if we did, we would typically use the information previously obtained and thus compute a new discrete solution which lies close to the one we had obtained previously. Unless two exact solutions lie close together, it is therefore unlikely that we would obtain two subsequences converging to different solutions.
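An elementary instance of the convergence discussed above (our illustration, not from the report): take S = ℝ, p = 2, φ(u) = u²/2 with approximations φ_h(u) = u²/2 + u/h, which converge to φ as h → ∞. Refining the time step and the functional together, the discrete solutions approach the exact curve of maximal slope u(t) = e^{−t}:

```python
import math

def discrete_flow_error(dt, h, T=2.0):
    # minimizing movements for the approximate functional
    #   phi_h(u) = u**2/2 + u/h,
    # compared with the curve of maximal slope u(t) = exp(-t) of the
    # limit functional phi(u) = u**2/2, started from u0 = 1
    u, t, err = 1.0, 0.0, 0.0
    while t < T - 1e-12:
        t += dt
        u = (u - dt / h) / (1.0 + dt)  # minimizer of (v-u)**2/(2*dt) + phi_h(v)
        err = max(err, abs(u - math.exp(-t)))
    return err

# refine dt -> 0 and h -> infinity simultaneously
errs = [discrete_flow_error(dt, h)
        for dt, h in ((0.2, 5), (0.1, 10), (0.05, 20), (0.025, 40))]
assert all(e2 < e1 for e1, e2 in zip(errs, errs[1:]))
```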
2.5 The λ-convex case
Condition (2.10) can be verified when φ is convex and weak convergence is equivalent to convergence in the metric. The following proposition is intended only as an example since, for many applications, its conditions are too strong.

Proposition 8 Assume that S is a Banach space and that d is the metric induced by its norm. Suppose that φ and φ_h are convex and, respectively, d- and d_h-lower semicontinuous, and that φ_h Γ(σ)-converges to φ, i.e., that in addition to (2.8) we have
\[
\forall v \in S \ \exists\, (v_h)_{h \in \mathbb{N}} \subset S \ \text{s.t.}\ v_h \rightharpoonup v \ \text{and}\ \phi_h(v_h) \to \phi(v). \tag{2.26}
\]
Furthermore, suppose that the continuity condition
\[
u_h \rightharpoonup u \ \text{and}\ v_h \rightharpoonup v \quad \Rightarrow \quad d_h(u_h, v_h) \to d(u, v) \tag{2.27}
\]
holds. Then the slopes satisfy the liminf condition (2.10), and |∂φ| is an upper gradient for φ.

Proof Note first that, if S is a Banach space and φ is convex and lower semicontinuous, |∂φ| satisfies
\[
|\partial\phi|(u) = \sup_{v \ne u} \frac{(\phi(u) - \phi(v))^+}{d(u, v)}, \tag{2.28}
\]
and based on this equation one can show that |∂φ| is an upper gradient for φ. For a proof see Theorem 1.2.5 and Proposition 1.4.4 in [AGS05]. Let (v_j) be a sequence satisfying
\[
|\partial\phi|(u) = \lim_{j \to \infty} \frac{(\phi(u) - \phi(v_j))^+}{d(u, v_j)},
\]
and for every j ∈ ℕ let (w_{j,h})_{h∈ℕ} be the recovery sequence for v_j described by (2.26). Let u_h ⇀ u. With these preparations, we can now estimate
\[
|\partial\phi|(u) = \lim_j \frac{(\phi(u) - \phi(v_j))^+}{d(u, v_j)}
\le \lim_j \frac{\big(\liminf_h \phi_h(u_h) - \lim_h \phi_h(w_{j,h})\big)^+}{\lim_h d_h(u_h, w_{j,h})}
\]
\[
= \lim_j \liminf_h \frac{(\phi_h(u_h) - \phi_h(w_{j,h}))^+}{d_h(u_h, w_{j,h})}
\le \liminf_h \sup_{j \in \mathbb{N}} \frac{(\phi_h(u_h) - \phi_h(w_{j,h}))^+}{d_h(u_h, w_{j,h})}
\le \liminf_h |\partial_h \phi_h|(u_h). \tag{2.29} \qquad \Box
\]
We move on to a generalization of the above result, treating functionals whose non-convexity is controllable along certain curves in the metric space.

Assumption 9 For given λ ∈ ℝ and p ∈ (1, ∞), we assume that for any v_0, v_1 ∈ D(φ) there exists a curve (v_t)_{t∈(0,1)} such that, for all 0 < τ < 1/λ⁻, λ⁻ = max(0, −λ),
\[
\Phi(\tau, v_0; v_t) \le (1 - t)\, \Phi(\tau, v_0; v_0) + t\, \Phi(\tau, v_0; v_1)
- \frac{1}{p} \Big( \frac{1}{\tau^{p-1}} + \lambda \Big)\, t\, (1 - t^{p-1})\, d^p(v_0, v_1), \tag{2.30}
\]
where Φ(τ, u; v) = d^p(u, v)/(p τ^{p−1}) + φ(v).

Remark 10 Note that the curve v_t above is chosen independently of τ. This is used in the proof of Lemma 12 below.

Remark 11 An archetypical case for Assumption 9 is when the metric space S is geodesically complete, i.e., for any two points v_0, v_1 ∈ S there exists a curve v_t (a constant-speed geodesic) such that d(v_0, v_t) = t d(v_0, v_1). In this case, if a function φ satisfies
\[
\phi(v_t) \le (1 - t)\, \phi(v_0) + t\, \phi(v_1) - \frac{\lambda}{p}\, t\, (1 - t^{p-1})\, d^p(v_0, v_1) \tag{2.31}
\]
along every geodesic t ↦ v_t, then (2.30) is satisfied. The surprising result in [AGS05] was that much more general curves may be used if (2.30) is assumed directly. This procedure has, for example, applications to gradient flows with respect to the Wasserstein metric of probability measures (cf. Part II in [AGS05]).
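In a Hilbert space the constant-speed geodesics are straight segments, so for p = 2 condition (2.31) is the usual λ-convexity. A numerical sanity check (ours, not from the report) for the hypothetical double well φ(u) = (u² − 1)², which satisfies φ″(u) = 12u² − 4 ≥ −4 and is therefore (−4)-convex on ℝ:

```python
import itertools

LAM = -4.0  # inf phi'' = -4, attained at u = 0

def phi(u):
    return (u * u - 1.0) ** 2

def convexity_gap(v0, v1, t, lam=LAM):
    # (2.31) with p = 2 along the straight-line geodesic v_t = (1-t)*v0 + t*v1;
    # the gap (right-hand side minus left-hand side) must be nonnegative
    vt = (1 - t) * v0 + t * v1
    rhs = ((1 - t) * phi(v0) + t * phi(v1)
           - 0.5 * lam * t * (1 - t) * (v0 - v1) ** 2)
    return rhs - phi(vt)

pts = [-2.0 + 0.25 * k for k in range(17)]
ts = [0.1 * k for k in range(1, 10)]
assert all(convexity_gap(v0, v1, t) >= -1e-9
           for v0, v1, t in itertools.product(pts, pts, ts))
```

Equivalently, φ(u) + 2u² = u⁴ + 1 is convex, which is the content of Proposition 16 below.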
Lemma 12 If Assumption 9 is satisfied, then the local slope admits the representation
\[
|\partial\phi|(v) = \sup_{w \ne v} \left( \frac{\phi(v) - \phi(w)}{d(v, w)} + \frac{\lambda}{p}\, d^{p-1}(v, w) \right)^+. \tag{2.32}
\]
If, in addition, φ is d-lower semicontinuous, then |∂φ| is a strong upper gradient for φ and is d-lower semicontinuous.

Proof The proof for p = 2 is given in Theorem 2.4.9 and Corollary 2.4.10 of [AGS05]; the proof we give here is a simple extension to the case p ≠ 2. Note first that we can rewrite (2.30) as
\[
\frac{d^p(v_0, v_t)}{p \tau^{p-1}} + \phi(v_t)
\le (1 - t)\, \phi(v_0) + t\, \phi(v_1) + \frac{t\, d^p(v_0, v_1)}{p \tau^{p-1}}
- \frac{1}{p} \big( \tau^{1-p} + \lambda \big)\, t\, (1 - t^{p-1})\, d^p(v_0, v_1)
\]
\[
= (1 - t)\, \phi(v_0) + t\, \phi(v_1) + \frac{t\, d^p(v_0, v_1)}{p \tau^{p-1}} \Big( t^{p-1} - \lambda \tau^{p-1} (1 - t^{p-1}) \Big). \tag{2.33}
\]
Multiplying (2.33) by τ^{p−1} and passing to the limit as τ → 0⁺ (note that the v_t are independent of τ), we obtain
\[
d(v_0, v_t) \le t\, d(v_0, v_1). \tag{2.34}
\]
Neglecting the first term on the left-hand side of (2.33), we get
\[
\phi(v_0) - \phi(v_t) \ge t \left\{ \phi(v_0) - \phi(v_1) + \frac{d^p(v_0, v_1)}{p \tau^{p-1}} \Big( \lambda \tau^{p-1} (1 - t^{p-1}) - t^{p-1} \Big) \right\}. \tag{2.35}
\]
We divide (2.35) by d(v_0, v_t) and use (2.34) to obtain
\[
\frac{\phi(v_0) - \phi(v_t)}{d(v_0, v_t)}
\ge \frac{t\, d(v_0, v_1)}{d(v_0, v_t)} \left\{ \frac{\phi(v_0) - \phi(v_1)}{d(v_0, v_1)} + \frac{d^{p-1}(v_0, v_1)}{p \tau^{p-1}} \Big( \lambda \tau^{p-1} (1 - t^{p-1}) - t^{p-1} \Big) \right\}
\]
\[
\ge \frac{\phi(v_0) - \phi(v_1)}{d(v_0, v_1)} + \frac{d^{p-1}(v_0, v_1)}{p \tau^{p-1}} \Big( \lambda \tau^{p-1} (1 - t^{p-1}) - t^{p-1} \Big). \tag{2.36}
\]
Fix v, w ∈ S and choose v_0 = v and v_1 = w in (2.36). Using (2.34), we have v_t → v as t → 0 and hence, using the fact that s ↦ (s)⁺ is monotone and continuous,
\[
|\partial\phi|(v) \ge \limsup_{t \to 0} \left( \frac{\phi(v) - \phi(v_t)}{d(v, v_t)} \right)^+
\ge \left( \frac{\phi(v) - \phi(w)}{d(v, w)} + \frac{\lambda}{p}\, d^{p-1}(v, w) \right)^+,
\]
which shows that the left-hand side is greater than or equal to the right-hand side in (2.32). The reverse inequality is trivial; hence we have shown (2.32). To conclude, we simply note that the proof of Corollary 2.4.10 in [AGS05] does not depend on the specific formula for |∂φ| but only on its structure. It can be repeated word for word to show that |∂φ| is lower semicontinuous and that it is a strong upper gradient for φ. □
As we have obtained a formula for |∂φ| which is quite similar to (2.28), it is natural to apply an argument analogous to the proof of Proposition 8 to λ-convex functionals.

Proposition 13 Let p ∈ (1, ∞). Suppose that Assumption 9 is satisfied for Φ as well as for all Φ_h, with the same λ. Suppose also that (2.27) holds and that φ_h Γ(d)-converges to φ (i.e. (2.26) and (2.8) are satisfied). Then the liminf condition for the slope (2.10) holds.

Proof We argue as in the proof of Proposition 8. Let (v_j) be a sequence satisfying
\[
|\partial\phi|(u) = \lim_{j \to \infty} \left( \frac{\phi(u) - \phi(v_j)}{d(u, v_j)} + \frac{\lambda}{p}\, d^{p-1}(u, v_j) \right)^+,
\]
and for every j ∈ ℕ let (w_{j,h})_{h∈ℕ} be the recovery sequence for v_j described in (2.26). Using the liminf condition (2.8) and Lemma 12, it is again easy to deduce that, for u_h ⇀ u,
\[
|\partial\phi|(u)
\le \lim_{j \to \infty} \left( \frac{\liminf_h \phi_h(u_h) - \lim_h \phi_h(w_{j,h})}{\lim_h d_h(u_h, w_{j,h})} + \frac{\lambda}{p} \lim_h d_h^{p-1}(u_h, w_{j,h}) \right)^+
\]
\[
= \lim_{j \to \infty} \liminf_{h \to \infty} \left( \frac{\phi_h(u_h) - \phi_h(w_{j,h})}{d_h(u_h, w_{j,h})} + \frac{\lambda}{p}\, d_h^{p-1}(u_h, w_{j,h}) \right)^+
\]
\[
\le \liminf_{h \to \infty} \sup_{j \in \mathbb{N}} \left( \frac{\phi_h(u_h) - \phi_h(w_{j,h})}{d_h(u_h, w_{j,h})} + \frac{\lambda}{p}\, d_h^{p-1}(u_h, w_{j,h}) \right)^+
\le \liminf_{h \to \infty} |\partial_h \phi_h|(u_h). \qquad \Box
\]
Remark 14
1. A possible avenue for obtaining the liminf condition for the slopes when the functionals are not uniformly λ-convex is to construct a more specific recovery sequence. If it were possible to strengthen the limsup condition (2.26), i.e. to find w_{h,j} satisfying, in addition, lim_{j→∞} d_h(w_{h,j}, v_h) = 0, then the proof of Proposition 13 would apply even when there is no representation of the slopes such as the one obtained in Lemma 12.
2. The approach taken in this section fails completely when σ is much weaker than d. This situation arises, for example, if the compactness in Condition (2.6) is induced by the metric rather than the functional.

Theorem 15 Suppose that Assumption 9 holds for φ and all φ_h, with the same λ, that equation (2.27) is satisfied, and that φ_h Γ(σ)-converges to φ. Suppose also that (2.3)-(2.6) are satisfied. If, for every h ∈ ℕ, (U_h^j)_{j∈ℕ} is a discrete solution as described in Section 2.1, then there exists a subsequence (h_k)_{k∈ℕ} of ℕ such that (U_{h_k}^j)_{j∈ℕ} converges in the sense of Proposition 5 to a curve of maximal slope for φ with respect to the strong upper gradient |∂φ|, with initial value u_0.

Proof Simply combine Lemma 12 and Proposition 13 with Propositions 5 and 6. □
3 Variational Inequalities
The condition of λ-convexity which we used in the previous section to obtain concrete convergence results can be tightened and used favourably to obtain much stronger convergence results based on variational inequalities. In the setting of this section we can only consider 2-curves of maximal slope. We shall call them simply curves of maximal slope for the remainder of the paper.
3.1 Relaxed convexity and coercivity conditions
We review the definition of λ-convexity. For u, v ∈ S and τ > 0, let

    Φ(τ, u; v) = d²(u, v)/(2τ) + φ(v).
Compare this with equation (1.2) to see the relationship to the time-discretization. Suppose for the moment that (S, d) is a Hilbert space. We say that φ is λ-convex, for some λ ∈ R, if

    φ((1 − t)v_0 + tv_1) ≤ (1 − t)φ(v_0) + tφ(v_1) − (λ/2) t(1 − t)‖v_0 − v_1‖²   ∀ t ∈ [0, 1].   (3.1)
In this case, it is easy to show that for 0 < τ < 1/λ⁻, where we define λ⁻ = max(0, −λ), we have

    Φ(τ, u; (1 − t)v_0 + tv_1) ≤ (1 − t)Φ(τ, u; v_0) + tΦ(τ, u; v_1) − ½ (τ⁻¹ + λ) t(1 − t)‖v_0 − v_1‖².

Note that for sufficiently small τ, the ‘time-step’ (1.2) becomes a uniformly convex minimization problem, which is considerably easier to solve both theoretically and numerically. In the general metric case, the following condition is required. We say that Φ(τ, u; ·) is (τ⁻¹ + λ)-convex if for any two points v_0, v_1 ∈ S and every t ∈ (0, 1), there exists v_t ∈ S such that, for all 0 < τ < 1/λ⁻,

    Φ(τ, u; v_t) ≤ (1 − t)Φ(τ, u; v_0) + tΦ(τ, u; v_1) − ½ (τ⁻¹ + λ) t(1 − t) d²(v_0, v_1).   (3.2)

Condition (2.30) is considerably more general than Condition (3.2), as there the convexity condition needs to be satisfied only for the base point v_0. To obtain a better feel for the meaning of λ-convexity, consider the following simple proposition.
Proposition 16 Let S be a closed, convex subset of a Hilbert space H with norm ‖·‖, and let φ : H → (−∞, +∞]. Then φ is λ-convex in S if and only if u ↦ φ(u) − (λ/2)‖u‖² is convex in S. In particular, if φ is differentiable in S, then φ is λ-convex if, and only if,

    (φ′(v_1) − φ′(v_0), v_1 − v_0) ≥ λ‖v_1 − v_0‖²   (∀ v_0, v_1 ∈ S)   (3.3)

holds. If φ is twice differentiable in S and

    φ″(u; v, v) ≥ λ‖v‖²   (∀ v ∈ H)   (3.4)

holds for all points u which are not extremal in S, then φ is λ-convex. If φ = φ_1 + φ_2, where φ_i : S → (−∞, +∞], φ_1 is convex and φ_2 is λ-convex, then φ is λ-convex.

Proof Throughout the proof, let v_0, v_1 ∈ S, v_t = (1 − t)v_0 + tv_1 and F(v) = φ(v) − (λ/2)‖v‖². The crucial observation is that u ↦ ½‖u‖² is 1-convex; in fact, we even have

    ½‖v_t‖² = (1 − t) · ½‖v_0‖² + t · ½‖v_1‖² − ½ t(1 − t)‖v_0 − v_1‖².   (3.5)

Suppose now that φ is λ-convex. Using (3.5), we have

    F(v_t) ≤ (1 − t)φ(v_0) + tφ(v_1) − (λ/2) t(1 − t)‖v_0 − v_1‖² − (1 − t)(λ/2)‖v_0‖² − t(λ/2)‖v_1‖² + (λ/2) t(1 − t)‖v_0 − v_1‖²
          = (1 − t)F(v_0) + tF(v_1).

On the other hand, if F is convex, we may traverse the above inequality in the opposite direction to obtain that φ is λ-convex. The derivative of ‖v‖²/2 is (v, ·); its second derivative is (·, ·). If φ satisfies (3.3), then

    (F′(v_1) − F′(v_0), v_1 − v_0) = (φ′(v_1) − φ′(v_0), v_1 − v_0) − λ‖v_1 − v_0‖² ≥ 0,

which is equivalent to F being convex. If φ is twice differentiable and satisfies (3.4), then F″(u; v, v) = φ″(u; v, v) − λ‖v‖² ≥ 0 for all points u which are not extremal, and hence F is convex. The last statement of the proposition is trivial.
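To make the equivalence in Proposition 16 concrete, here is a minimal numerical sketch (a hypothetical example of my own, not from the text): the double-well energy φ(u) = u⁴/4 − u²/2 on R has φ″(u) = 3u² − 1 ≥ −1, so by the proposition it is λ-convex with λ = −1, since φ(u) + u²/2 = u⁴/4 is convex. The script samples inequality (3.1) at random triples.

```python
import random

# Hypothetical example (not from the text): the double-well energy
# phi(u) = u^4/4 - u^2/2 on R has phi''(u) = 3u^2 - 1 >= -1, so by
# Proposition 16 it is lambda-convex with lambda = -1, because
# phi(u) - (lambda/2)*u^2 = u^4/4 is convex.
LAM = -1.0

def phi(u):
    return 0.25 * u**4 - 0.5 * u**2

def gap(v0, v1, t):
    """RHS minus LHS of (3.1); nonnegative iff the inequality holds there."""
    vt = (1 - t) * v0 + t * v1
    rhs = (1 - t) * phi(v0) + t * phi(v1) - 0.5 * LAM * t * (1 - t) * (v0 - v1) ** 2
    return rhs - phi(vt)

random.seed(0)
worst = min(gap(random.uniform(-2, 2), random.uniform(-2, 2), random.random())
            for _ in range(10000))
print(worst >= -1e-12)   # (3.1) holds at every sampled triple
```

For this choice of λ the quadratic terms in (3.1) cancel exactly against (3.5), so the gap reduces to the ordinary convexity gap of u⁴/4 and is nonnegative up to rounding.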
If Φ satisfies (3.2), then typical coercivity conditions can be generalized as well. Equation (2.4.10) in [AGS05] is a very general condition if constant speed geodesics are available. Here, we formulate a condition which reduces the technicality of some proofs but seems to be sufficiently general for most applications. We assume that

    ∃ τ* > 0, u* ∈ D(φ) s.t. inf_S Φ(τ*, u*; ·) ≥ m* > −∞.   (3.6)

Of course, inf_S φ > −∞ would be sufficient, but this condition may be too restrictive for some applications. If φ satisfies (3.6), we say that φ is coercive. It can easily be checked that (3.6) implies equation (2.4.10) in [AGS05]. A family (φ_i)_{i∈I} is called equi-coercive if (3.6) is satisfied simultaneously for all φ_i, with the same choice of τ*, u* and m*. A typical case of equi-coercivity is covered in the following proposition.

Proposition 17 Let I be an index set, and for every i ∈ I let φ_i : S → (−∞, ∞]. Suppose that for every i ∈ I, φ_i satisfies (3.2) with the same λ ∈ R. Suppose also that there exist u* ∈ S and R > 0 such that, for every i ∈ I, there exists u*_i ∈ B(u*, R/2) so that

    inf_{i∈I} inf_{B(u*,R)} φ_i = c_0 > −∞   and   sup_{i∈I} φ_i(u*_i) = c_1 < ∞.
Then the family (φ_i)_{i∈I} is equi-coercive.

Proof We assume without loss of generality that λ < 0. For v ∈ B(u*, R), we trivially have Φ_i(τ, u*_i; v) ≥ c_0 for all i ∈ I and all τ. For v ∈ S \ B(u*, R), let τ = 1/(2λ⁻), u = u*_i, v_0 = u*_i and v_1 = v in (3.2) to obtain

    Φ_i(τ, u*_i; v_t) ≤ (1 − t)Φ_i(τ, u*_i; u*_i) + tΦ_i(τ, u*_i; v) + (λ/2) t(1 − t) d²(u*_i, v).

Set t = R/(4d(u*_i, v)) and note that in the proof of Lemma 12 we had obtained d(v_0, v_t) ≤ t d(v_0, v_1), so that v_t ∈ B(u*, R). Hence, we get

    Φ_i(τ, u*_i; v) ≥ (4d(u*_i, v)/R) Φ_i(τ, u*_i; v_t) − ((4d(u*_i, v) − R)/R) φ_i(u*_i) − (λ/2)(1 − R/(4d(u*_i, v))) d²(u*_i, v)
                   ≥ −(λ/4) d²(u*_i, v) + ((c_0 − c_1)/R) d(u*_i, v) + c_0.

The last term in the above estimate is a quadratic polynomial in d(u*_i, v) with a positive leading order term and coefficients independent of i, and is hence bounded below by a constant independent of i, C_0 say. Using

    d²(u*_i, v)/(2τ) ≤ d²(u*, v)/τ + d²(u*_i, u*)/τ ≤ d²(u*, v)/τ + R²/(4τ),

we also have

    Φ_i(τ, u*; v) ≥ ½ Φ_i(τ, u*_i; v) − R²/(2τ) ≥ ½ C_0 − R²/(2τ),

which shows that the family (φ_i)_{i∈I} is equi-coercive.
3.2  Evolutionary variational inequalities
Here, we summarize several results from [AGS05] which exploit the convexity conditions of the previous section. In particular, we present evolutionary variational inequalities that are satisfied by curves of maximal slope and their time-discretizations, as well as some approximation results.

Theorem 18 (Existence and uniqueness) Let (S, d) be a complete metric space and φ a coercive, lower semicontinuous functional on S satisfying (3.2).

(a) For each u_0 ∈ D(φ), there exists a unique curve of maximal slope u for φ.

(b) The curve u is locally Lipschitz continuous in (0, ∞) with u(t) ∈ D(|∂φ|) ⊂ D(φ) for all t > 0, and it is the unique solution of the evolutionary variational inequality

    ½ (d/dt) d²(u, v) + (λ/2) d²(u, v) + φ(u) ≤ φ(v)   ∀ v ∈ S, a.e. t ∈ (0, ∞),   (3.7)
among all locally absolutely continuous curves v satisfying v(0+) = u_0.

Theorem 19 (Time-discretization) Let (S, d) be a complete metric space and φ a coercive, lower semicontinuous functional on S satisfying (3.2).

(a) Let τ = (0 = t_0, t_1, t_2, . . .), τ^j = t_j − t_{j−1}, be a partition of [0, ∞) with |τ| = sup_j τ^j ≤ 1/λ⁻. Then, for every U_τ^0 ∈ D(φ), there exists a unique solution of the time-discrete scheme

    For j = 1, 2, . . . find U_τ^j ∈ argmin Φ(τ^j, U_τ^{j−1}; ·).   (3.8)

(b) (U_τ^j)_{j=0,1,...} is the unique solution of the discrete evolutionary variational inequality

    (d²(U_τ^j, v) − d²(U_τ^{j−1}, v))/(2τ^j) + (λ/2) d²(U_τ^j, v) + φ(U_τ^j) ≤ φ(v)   ∀ v ∈ S, j = 1, 2, . . . .   (3.9)

(c) Let Ū_τ(t) = U_τ^j if t_{j−1} < t ≤ t_j. If φ(U_τ^0) is bounded and U_τ^0 → u_0 as |τ| → 0, then Ū_τ → u in L^∞_loc([0, ∞); S) as |τ| → 0.

For explicit error estimates for the time-discretization see especially Theorems 4.0.7, 4.0.9 and 4.0.10 in [AGS05]. To see, at least heuristically, where (3.7) comes from, see Proposition 20 below.

It turns out that the convergence analysis of Sections 3.3 and 3.4 is not restricted to gradient flows of functionals which are constant in time, but may be
applied to gradient flows of time-dependent functionals, i.e., to equations of the form

    u̇(t) = −φ′(t, u(t)),   (3.10)

where φ′ denotes the Fréchet derivative with respect to u. By setting v = (v_1, v_2) ∈ R × S and A(v) = (1, −φ′(v_1, v_2)), (3.10) can be transformed into the autonomous nonlinear evolution equation v̇ = A(v), for which an extensive theory exists; see in particular the recent work of Nochetto and Savaré [NS]. Note that condition (IS3) in [NS] corresponds exactly to equation (3.3) in the present work. For gradient flows in the metric setting, no such theory exists to the author's knowledge. It is conceivable, however, that the techniques of Chapter 4 in [AGS05] can be modified to give an existence and approximation theory for time-dependent gradient flows, if φ(t, u) is sufficiently regular in t. In particular, note that the time-dependent version of definition (1.6), for p = 2, is

    (d/dt) φ(t, u(t)) = −½ |u′|²(t) − ½ |∂φ|²(t, u(t)) + ∂_t φ(t, u(t)),   (3.11)

where |∂φ|(t, u) = |∂φ(t, ·)|(u) and

    ∂_t φ(t, u) = lim_{s→t} (φ(s, u) − φ(t, u))/(s − t).
We postpone this analysis to a later time and assume for now that such a solution does exist. The next two propositions show that the evolutionary variational inequalities on which our analysis is based do not change if we consider time-dependent functionals.

Proposition 20 (a) Suppose that H is a Hilbert space, that φ : (a, b) × H → R is differentiable in u, and that the curve u ∈ AC(a, b; H) satisfies (3.10). If φ(t, ·) is λ-convex in H for every t ∈ (a, b), then u satisfies

    ½ (d/dt)‖u(t) − v‖² + (λ/2)‖u(t) − v‖² + φ(t, u(t)) ≤ φ(t, v)   ∀ v ∈ H, for a.e. t ∈ (a, b).

(b) Let (S, d) be a metric space, let φ : (a, b) × S → (−∞, ∞] satisfy (3.2), and let u_0 ∈ S. Then there exists at most one absolutely continuous curve u with u(0) = u_0 satisfying

    ½ (d/dt) d²(u(t), v) + (λ/2) d²(u(t), v) + φ(t, u(t)) ≤ φ(t, v)   ∀ v ∈ S, a.e. t ∈ (0, T).   (3.12)
Proof (a) Fix v ∈ H. Take the inner product of u̇ + φ′(t, u) with u − v to obtain

    (u̇, u − v) + (φ′(t, u), u − v) = 0.   (3.13)

For s ∈ [0, 1], set v_s = (1 − s)u + sv = u + s(v − u). By a simple computation, we have

    φ(t, u) − φ(t, v) − ∫_0^1 (φ′(v_s), u − v) ds = 0.   (3.14)

Adding (3.14) to (3.13) gives

    (u̇, u − v) + ∫_0^1 (φ′(u) − φ′(v_s), u − v) ds + φ(t, u) = φ(t, v).   (3.15)

First note that (d/dt)‖u − v‖² = 2(u̇, u − v), and second, that s(u − v) = u − v_s. We use these facts, together with (3.3), to obtain from (3.15)

    φ(t, v) = ½ (d/dt)‖u − v‖² + ∫_0^1 (1/s)(φ′(u) − φ′(v_s), u − v_s) ds + φ(t, u)
            ≥ ½ (d/dt)‖u − v‖² + λ ∫_0^1 (1/s)‖u − v_s‖² ds + φ(t, u)
            = ½ (d/dt)‖u − v‖² + λ‖u − v‖² ∫_0^1 s ds + φ(t, u)
            = ½ (d/dt)‖u − v‖² + (λ/2)‖u − v‖² + φ(t, u),
which shows part (a). (b) The uniqueness follows directly from the error estimate (3.26), by taking φh = φ and vh = u.
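As a sanity check of Proposition 20 (a), the following sketch verifies the Hilbert-space EVI numerically for the model case H = R and φ(u) = u²/2 (so λ = 1), whose gradient flow is u(t) = e^{−t}u(0); the example and tolerance are assumptions of mine, not from the text. For this quadratic energy the inequality in fact holds with equality.

```python
import math, random

# Sanity check of Proposition 20 (a) in the model case H = R,
# phi(u) = u^2/2 (lambda = 1), gradient flow u(t) = e^{-t} u(0);
# the example and tolerance are assumptions of this sketch, not from the text.
# For this quadratic energy the EVI actually holds with equality.
phi = lambda x: 0.5 * x * x
u = lambda t: math.exp(-t)        # u(0) = 1
lam, h = 1.0, 1e-6

random.seed(2)
worst = 0.0
for _ in range(1000):
    t, v = random.uniform(0.1, 3.0), random.uniform(-2.0, 2.0)
    ddt = ((u(t + h) - v) ** 2 - (u(t - h) - v) ** 2) / (4 * h)  # (1/2) d/dt |u-v|^2
    lhs = ddt + 0.5 * lam * (u(t) - v) ** 2 + phi(u(t)) - phi(v)
    worst = max(worst, abs(lhs))
print(worst < 1e-4)
```

The residual is dominated by the finite-difference and rounding error, so it stays far below the tolerance.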
The time-dependent implicit time-discretization of (3.10) is obtained by simply replacing φ(v) by φ(t, v) in the time-step functional: let

    Φ(τ, u; t, v) = d²(u, v)/(2τ) + φ(t, v).

We say that (U_τ^j)_{j=0,1,...} is a discrete solution of the gradient flow with respect to the partition τ if

    ∀ j = 1, 2, . . . : U_τ^j minimizes Φ(τ^j, U_τ^{j−1}; t^j, ·).   (3.16)

Proposition 21 Let (S, d) be a complete metric space, and for every t > 0 let φ(t, ·) be d-lower semicontinuous and satisfy (3.2) with the same λ ∈ R. If |τ| < 1/λ⁻, then for a given initial value U_τ^0, (3.16) has a unique solution (U_τ^j), which is also the unique solution of

    (d²(U_τ^j, v) − d²(U_τ^{j−1}, v))/(2τ^j) + (λ/2) d²(U_τ^j, v) + φ(t_τ^j, U_τ^j) ≤ φ(t_τ^j, v)   ∀ v ∈ S, j = 1, 2, . . . ,   (3.17)

with the same initial value U_τ^0.

Proof For every time-step, the problem can be reduced to (3.8); hence Theorem 19 applies, except (possibly) for the convergence result of part (c).
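For a concrete feel for the scheme (3.16), here is a minimal sketch on S = R with an assumed tracking energy φ(t, u) = (u − g(t))²/2 (1-convex in u), chosen so that every implicit step has a closed-form minimizer; the energy, target g and step counts are my own illustrative choices, not from the text.

```python
import math

# Minimal sketch of the scheme (3.16) on S = R with the assumed tracking
# energy phi(t, u) = (u - g(t))^2/2 (1-convex in u); every implicit step
#   argmin_v (v - U^{j-1})^2/(2 tau) + phi(t^j, v)
# has the closed-form minimizer (U^{j-1} + tau*g(t^j))/(1 + tau).
def g(t):                         # assumed moving target
    return math.sin(t)

def step(u_prev, t, tau):
    return (u_prev + tau * g(t)) / (1.0 + tau)

def discrete_solution(u0, T, n):
    tau, u, traj = T / n, u0, [u0]
    for j in range(1, n + 1):
        u = step(u, j * tau, tau)
        traj.append(u)
    return traj

# The scheme is a time-dependent implicit Euler method for u' = -(u - g(t));
# with u(0) = 0 the exact flow is u(t) = (sin t - cos t + e^{-t})/2.
exact = lambda t: 0.5 * (math.sin(t) - math.cos(t) + math.exp(-t))
errs = [abs(discrete_solution(0.0, 1.0, n)[-1] - exact(1.0)) for n in (100, 200)]
print(errs[0] / errs[1])          # roughly 2: the error is O(tau)
```

Halving the step size roughly halves the error at the final time, in line with the first-order a-priori estimates discussed in this section.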
3.3  Continuous case
In this section, we outline a general approach for analysing approximations of curves of maximal slope for a time-dependent functional φ : [0, T ] × S → R, by replacing it with a suitable approximation φh . Based on the relaxed convexity and coercivity conditions of Section 3.1 and the variational inequalities of Section 3.2, we are able to treat non-convex and non-differentiable functionals. The functional approximations are sufficiently general to allow for a wide variety of applications. They are not, for example, restricted to Galerkin or similar discretizations. We summarize the basic assumptions for the approximation analysis in the following hypothesis: Assumption 22
(a) Completeness: (S , d) is a complete metric space.
(b) Lower semicontinuity: For every 0 < t ≤ T and h ∈ N, φ(t, ·) and φ_h(t, ·) are d-lower semicontinuous functionals on S.

(c) Uniform λ-convexity: There exists λ ∈ R such that (3.2) holds for φ(t, ·) and for φ_h(t, ·), for all 0 < t ≤ T.

(d) Equi-coercivity: The families (φ(t, ·))_{0<t≤T} and (φ_h(t, ·))_{h∈N, 0<t≤T} are equi-coercive.

For τ > 0 and u, v ∈ S, let

    Φ_h(τ, u; t, v) = d²(u, v)/(2τ) + φ_h(t, v),

and define U_h^j to be the (unique) minimizer of Φ_h(τ_h^j, U_h^{j−1}; t_h^j, ·). For comparison, let u_h = (u_h^j)_{j∈N} be the solution of the discrete scheme for φ, i.e., u_h^0 = u_0 and u_h^j is the minimizer of Φ(τ_h^j, u_h^{j−1}; t_h^j, ·), where Φ is defined analogously to Φ_h. Following Proposition 21, (u_h^j)_{j=0,1,...} and (U_h^j)_{j=0,1,...} satisfy the variational inequalities

    (d²(u_h^j, v) − d²(u_h^{j−1}, v))/(2τ_h^j) + (λ/2) d²(u_h^j, v) + φ(t_h^j, u_h^j) ≤ φ(t_h^j, v)   ∀ v ∈ S,   (3.36)

    (d²(U_h^j, v) − d²(U_h^{j−1}, v))/(2τ_h^j) + (λ/2) d²(U_h^j, v) + φ_h(t_h^j, U_h^j) ≤ φ_h(t_h^j, v)   ∀ v ∈ S.   (3.37)
As before, we test (3.36) with U_h^j and (3.37) with a family (V_h^j) which satisfies good approximation properties. Adding the two inequalities and arguing as in the continuous case, we obtain

    (d²(u_h^j, U_h^j) − d²(u_h^{j−1}, U_h^{j−1}))/(2τ_h^j) + (λ − |λ|/2) d²(u_h^j, U_h^j) ≤ ε_1^j + ε_2^j + ε_3^j + ε_4^j,   (3.38)

where

    ε_1^j = φ(t_h^j, U_h^j) − φ_h(t_h^j, U_h^j),   ε_2^j = φ_h(t_h^j, V_h^j) − φ(t_h^j, u_h^j),   ε_3^j = |λ| d²(u_h^j, V_h^j),

and

    ε_4^j = (d²(U_h^{j−1}, V_h^j) − d²(U_h^j, V_h^j) + d²(u_h^{j−1}, U_h^{j−1}) − d²(u_h^{j−1}, U_h^j))/(2τ_h^j).

Again, the error term ε_4^j is the most difficult one to control. Lemma 28 is the discrete counterpart of Lemma 24.

Lemma 28 Let (S, d) be a Hilbert space. Then

    ε_4^j = ( (U_h^j − U_h^{j−1})/τ_h^j , V_h^j − u_h^{j−1} ).

In general, we shall assume that the space (S, d) satisfies the inequality

    (d²(u_0, v_0) − d²(u_1, v_0) + d²(u_0, v_1) − d²(u_1, v_1))/(2τ) ≤ c_1 d(v_0, v_1) d(u_1, u_0)/τ   (3.39)

for all points u_0, u_1, v_0, v_1 ∈ S. To simplify notation, we define

    |U_h′|(t) = d(U_h^{j−1}, U_h^j)/τ_h^j   for t ∈ (t_h^{j−1}, t_h^j),

and |u_h′| analogously. Furthermore, we denote the backward piecewise constant interpolation of (U_h^j) by Ū_h, i.e.,

    Ū_h(t) = U_h^j,   if t_h^{j−1} < t ≤ t_h^j.
Theorem 29 Suppose that Assumption 22 is satisfied and that, in addition, (S, d) satisfies (3.39). For each h ∈ N, let (U_h^j)_{j=0,1,...} and (u_h^j)_{j=0,1,...} be the discrete solutions of the gradient flow in the sense of (3.16), for φ_h and φ respectively, and for the time-partition τ_h.
(a) For every j = 1, 2, . . . we have

    e^{α t_h^j} d²(u_h^j, U_h^j) ≤ d²(u_h^0, U_h^0)
        + 2 Σ_{i=1}^{j} τ_h^i e^{α t_h^i} { φ(t_h^i, U_h^i) − φ_h(t_h^i, U_h^i) + φ_h(t_h^i, V_h^i) − φ(t_h^i, u_h^i)
        + |λ| d²(V_h^i, u_h^i) + c_1 d(V_h^i, u_h^{i−1}) d(U_h^i, U_h^{i−1})/τ_h^i },   ∀ (V_h^i)_{i=1,2,...} ⊂ S,   (3.40)

where α = 2(λ − |λ|/2)/(1 − 2(λ − |λ|/2)|τ_h|) and c_1 is the constant from equation (3.39).

(b) Suppose that, in addition, φ and φ_h are independent of t, and that the τ_h are now partitions of [0, ∞), so that Theorems 18 and 19 apply. Suppose also that (3.27) holds, that (3.28) holds for every t, and that

    ∀ h ∈ N, ∃ (V_h^j)_{j=1,2,...} ⊂ S such that φ_h(V̄_h(t)) → φ(u(t)) for a.e. t ∈ (0, T),
        φ_h(V_h^j) ≤ c_2 (1 + φ(u(t_h^j))), and V̄_h → u in L²(0, T; S)   (3.41)

is satisfied. Then Ū_h → u in L^∞(0, T; S) for all T > 0.
Proof The error estimate (a) is a simple application of Lemma 31 to (3.38). To obtain (b), note that d(u_h^j, u(t_h^j)) → 0, uniformly for bounded t_h^j (cf. Theorem 19). Fix 0 ≤ s ≤ T and let j_h be the smallest index such that t_h^{j_h} ≥ T. To estimate the first error term in (3.40), we argue similarly as in the proof of Theorem 25, using (2.13) to obtain a bound which allows the application of Fatou's Lemma:

    lim sup_{h→∞} max_{j=1,...,j_h} Σ_{i=1}^{j} τ_h^i [φ(U_h^i) − φ_h(U_h^i)]
        ≤ lim sup_{h→∞} ∫_0^{t_h^{j_h}} [φ(Ū_h) − φ_h(Ū_h)]⁺ dt
        ≤ lim sup_{h→∞} ∫_0^{T} [φ(Ū_h) − φ_h(Ū_h)]⁺ dt + δ_1
        ≤ ∫_0^{T} lim sup_{h→∞} [φ(Ū_h) − φ_h(Ū_h)]⁺ dt + δ_1
        ≤ 0 + δ_1,

where δ_1, the error made by replacing ∫_0^{t_h^{j_h}} with ∫_0^{T}, can be estimated via

    δ_1 = lim sup_h ∫_T^{t_h^{j_h}} [φ(Ū_h) − φ_h(Ū_h)] dt
        ≤ lim sup_h ∫_T^{t_h^{j_h}} [c_1 + (c_1 − 1) φ_h(Ū_h)] dt
        ≤ lim sup_h ∫_T^{t_h^{j_h}} [c_1 + (c_1 − 1) φ_h(U_h^0)] dt
        = 0.

The second error term can be treated similarly. Let w_h be the piecewise constant interpolant of (u(t_h^j))_{j=1,2,...} (cf. Lemma 32), and let V_h^j be the recovery sequence from equation (3.41); then

    max_{j=1,...,j_h} Σ_{i=1}^{j} τ_h^i [φ_h(V_h^i) − φ(u_h^i)] ≤ ∫_0^{t_h^{j_h}} |φ_h(V̄_h) − φ(ū_h)| dt ≤ ∫_0^{T} |φ_h(V̄_h) − φ(ū_h)| dt + δ_2.

The error term δ_2 can be estimated with the help of Lemma 32, showing that lim_{h→∞} δ_2 = 0. The bound |φ_h(V̄_h) − φ(ū_h)| ≤ C_1 + C_2 φ(u(0)) shows that we can apply the dominated convergence theorem. Using

    |φ_h(V̄_h) − φ(ū_h)| ≤ |φ_h(V̄_h) − φ(u)| + |φ(u) − φ(w_h)| + |φ(w_h) − φ(ū_h)| → 0

almost everywhere in (0, ∞), we obtain that the second error term in (3.40) tends uniformly to zero as h → ∞. Finally, the third and fourth error terms can be estimated by using V̄_h → u in L², together with the interpolation error estimate of Lemma 32.
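The following toy computation (my own construction, not from the text) illustrates the spirit of Theorem 29 on S = R: the discrete solutions for the perturbed energies φ_h(u) = (1 + 1/h)u²/2 approach the discrete solution for φ(u) = u²/2 as h grows, each implicit step being available in closed form.

```python
# Toy illustration of the spirit of Theorem 29 (my own construction, not
# from the text): on S = R, the implicit step for phi_c(u) = c*u^2/2 is
# argmin_v (v-u)^2/(2 tau) + c*v^2/2 = u/(1 + c*tau), so the discrete flows
# are available in closed form. Taking phi_h(u) = (1 + 1/h)*u^2/2 -> u^2/2,
# the discrete solutions converge to the one for phi as h grows.
def discrete_flow(c, u0, tau, n):
    us = [u0]
    for _ in range(n):
        us.append(us[-1] / (1 + c * tau))
    return us

tau, n, u0 = 0.01, 100, 1.0
ref = discrete_flow(1.0, u0, tau, n)
errs = [max(abs(a - b) for a, b in zip(ref, discrete_flow(1.0 + 1.0 / h, u0, tau, n)))
        for h in (1, 10, 100)]
print(errs[0] > errs[1] > errs[2])   # the error decreases as phi_h -> phi
```

The decay of max_j |u_h^j − U_h^j| mirrors the role of the consistency terms φ − φ_h in the error estimate (3.40).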
Remark 30 1. The result of part (b) in Theorem 29 is restricted to time-independent functionals only because, for the time being, we do not have a convergence theory for the time-discretization like Theorem 19 (c). If this gap were closed, which can be expected, then Theorem 29 (b) would hold for time-dependent functionals as well.

2. Estimates of the type (3.41) should be easy to obtain with the help of Lemma 32.

Lemma 31 Let (x_j)_{j∈N}, (a_j)_{j∈N}, (τ_j)_{j∈N} ⊂ R, x_j, τ_j ≥ 0 and λ ∈ R be such that

    (x_j − x_{j−1})/(2τ_j) + λ x_j ≤ a_j,   j = 1, 2, . . . .

If |τ| = sup_j τ_j ≤ 1/(2λ⁻), then (x_j) satisfies

    e^{α t_j} x_j ≤ x_0 + 2 Σ_{i=1}^{j} τ_i e^{α t_{i−1}} a_i,   (3.42)

where t_i = Σ_{k=1}^{i} τ_k, and α = 2λ if λ ≤ 0 and α = 2λ/(1 + 2λ sup_j τ_j) if λ > 0.

Proof Using e^{x/(1+x)} ≤ 1 + x for x > −1, we have, for every j = 1, 2, . . . ,

    e^{α τ_j} x_j ≤ e^{2λτ_j/(1+2λτ_j)} x_j ≤ (1 + 2λτ_j) x_j ≤ x_{j−1} + 2τ_j a_j,   (3.43)

which also proves (3.42) for j = 1. We proceed by induction. Assume that (3.42) holds for some j; then, using (3.43), we obtain

    e^{α t_{j+1}} x_{j+1} ≤ e^{α t_j} (1 + 2λτ_{j+1}) x_{j+1} ≤ e^{α t_j} x_j + 2τ_{j+1} e^{α t_j} a_{j+1} ≤ x_0 + 2 Σ_{i=1}^{j+1} τ_i e^{α t_{i−1}} a_i.
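A quick numerical check of the discrete Gronwall estimate (3.42) on randomly generated data; for λ > 0 it assumes the uniform choice α = 2λ/(1 + 2λ sup_j τ_j), the one suggested by the proof's elementary inequality e^{x/(1+x)} ≤ 1 + x (the precise constant in the original is unclear in this copy, so treat it as an assumption).

```python
import math, random

# Numerical check of the discrete Gronwall estimate (3.42) on random data
# satisfying the recursion (x_j - x_{j-1})/(2 tau_j) + lam*x_j <= a_j;
# for lam > 0 we assume the choice alpha = 2*lam/(1 + 2*lam*max(tau_j)),
# the one suggested by the proof's inequality e^{x/(1+x)} <= 1 + x.
random.seed(1)
lam = 1.0
taus = [random.uniform(0.01, 0.1) for _ in range(50)]
alpha = 2 * lam / (1 + 2 * lam * max(taus))

xs, avals, ts = [1.0], [], [0.0]
for tau in taus:
    a = random.uniform(0.0, 1.0)
    x = (xs[-1] + 2 * tau * a) / (1 + 2 * lam * tau)  # recursion with equality
    xs.append(x); avals.append(a); ts.append(ts[-1] + tau)

ok = all(
    math.exp(alpha * ts[j]) * xs[j]
    <= xs[0] + 2 * sum(taus[i] * math.exp(alpha * ts[i]) * avals[i] for i in range(j))
    + 1e-12
    for j in range(1, len(xs))
)
print(ok)
```

Each step of the construction satisfies (1 + 2λτ_j)x_j = x_{j−1} + 2τ_j a_j, i.e. the hypothesis of the lemma with equality, so the bound (3.42) should hold along the whole sequence.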
Lemma 32 (Interpolation Error) Suppose that Assumption 22 holds for a time-independent functional φ. Let u ∈ AC²_loc([0, ∞); S) be a curve of maximal slope for φ, and let w be its piecewise constant interpolant with respect to the mesh τ, i.e., w(t) = u(t_j) if t_{j−1} < t ≤ t_j. Then the following two error estimates hold for all j > 0:

    ∫_0^{t_j} d²(u, w) dt ≤ (|τ|/(1 − |τ|)) ∫_0^{t_j} |u′|²(t) dt ≤ (|τ|/(1 − |τ|)) (φ(u(0)) − φ(u(t_j))),   (3.44)

    ∫_0^{t_j} (φ(u) − φ(w)) dt ≤ |τ| ∫_0^{t_j} |∂φ|(u)|u′| dt ≤ |τ| (φ(u(0)) − φ(u(t_j))).   (3.45)

Proof For every j, we have

    ∫_{t_{j−1}}^{t_j} d²(u, w) dt = ∫_{t_{j−1}}^{t_j} ∫_{s=t}^{t_j} (−d/ds) d²(u(s), w) ds dt
        ≤ ∫_{t_{j−1}}^{t_j} ∫_{s=t}^{t_j} 2 d(u, w) |u′| ds dt
        ≤ τ_j ∫_{t_{j−1}}^{t_j} (d²(u, w) + |u′|²(t)) dt,

which gives (3.44). Similarly, we can estimate the error in the energy, using the fact that |∂φ| is an upper gradient for φ:

    ∫_{t_{j−1}}^{t_j} (φ(u) − φ(w)) dt = ∫_{t_{j−1}}^{t_j} ∫_{s=t}^{t_j} (−d/ds) φ(u(s)) ds dt
        ≤ ∫_{t_{j−1}}^{t_j} ∫_{s=t}^{t_j} |∂φ|(u)|u′| ds dt
        ≤ τ_j ∫_{t_{j−1}}^{t_j} |∂φ|(u)|u′| dt.

Estimate (3.45) now follows directly from the definition of a curve of maximal slope.
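For a concrete instance of (3.44), the sketch below (a model case of my choosing, not from the text) checks the estimate numerically for φ(u) = u²/2 on S = R, whose curve of maximal slope from u(0) = 1 is u(t) = e^{−t}, using a uniform mesh of size τ = 0.1.

```python
import math

# A concrete check of (3.44) (my own model case, not from the text):
# phi(u) = u^2/2 on S = R, whose curve of maximal slope from u(0) = 1
# is u(t) = e^{-t}, with a uniform mesh of size tau = 0.1 < 1.
u = lambda t: math.exp(-t)
phi = lambda x: 0.5 * x * x

tau, J = 0.1, 20
tj = J * tau
w = lambda t: u(tau * math.ceil(t / tau))    # piecewise constant interpolant

N = 50000                                    # midpoint-rule quadrature
h = tj / N
lhs = sum((u((k + 0.5) * h) - w((k + 0.5) * h)) ** 2 for k in range(N)) * h
rhs = tau / (1 - tau) * (phi(u(0)) - phi(u(tj)))
print(lhs <= rhs)   # the interpolation estimate holds with room to spare
```

For this flow ∫|u′|² dt equals φ(u(0)) − φ(u(t_j)) exactly, so the right-hand side above is the full bound of (3.44).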
Remark 33 1. For mere convergence results the theory presented may be sufficient. Further results concerning the convergence of the sequences of slopes |∂φ_h| and metric derivatives |U_h′| may be obtained by combining this analysis with the results of Section 2. In the Hilbert space case, these may even be extended to the convergence of the gradients φ_h′(U_h) and the velocities U̇_h.
2. For applications in numerical analysis, especially for obtaining a-posteriori error estimates, one may prefer to compare an interpolated version of the discrete variational inequality (3.9) directly to the continuous variational inequality (3.7). This is beyond the scope of this paper, though.
3. Using a different version of the Gronwall Lemma, one can obtain a lower exponential factor in Lemma 26 and in (3.42), at the expense of a slightly larger error bound and a somewhat more complicated error analysis. See also Lemma 4.1.8 in [AGS05].
3.5  Examples for (3.25)
We conclude by showing the relationship between the smoothness of the metric and equation (3.25).

Proposition 34 Suppose that (S, ‖·‖) is a normed space such that the functional F(v) = ‖v‖² is Fréchet differentiable and its derivative has Lipschitz constant c_1. Then, for any curve u ∈ AC(a, b; S) and any two points v_0, v_1 ∈ S, we have

    (d/dt)(‖u − v_0‖² − ‖u − v_1‖²) ≤ c_1 ‖v_0 − v_1‖ ‖u̇‖.

Proof If F is differentiable and u ∈ AC(a, b; S), then (d/dt)F(u − v_i) = ⟨F′(u − v_i), u̇⟩, where ⟨·, ·⟩ is the duality pairing, and hence

    (d/dt)(F(u − v_0) − F(u − v_1)) = ⟨F′(u − v_0) − F′(u − v_1), u̇⟩ ≤ ‖F′(u − v_0) − F′(u − v_1)‖_* ‖u̇‖ ≤ c_1 ‖v_1 − v_0‖ ‖u̇‖,

where ‖·‖_* is the norm in the dual Banach space S*.
For the most common norms, the L^p norms (p ≠ 2), Proposition 34 does not apply, and in fact (3.25) does not hold. As L^p spaces with p ≠ 2 are not candidates for the theory of this section, we do not go into further detail, but show only on a simpler example that the smoothness of the norm is crucial.

Proposition 35 Let S = R² be equipped with the norm ‖v‖_{ℓ¹} = |v_1| + |v_2|. Then there exists a constant c_0 such that for every ε > 0 there exist a Lipschitz continuous curve u and points v, w such that

    (d/dt)(‖u − v‖²_{ℓ¹} − ‖u − w‖²_{ℓ¹}) ≥ c_0 ((1 − 2ε)/ε) ‖u̇‖_{ℓ¹} ‖v − w‖_{ℓ¹}.   (3.46)

Proof Let v = (−ε, 0), w = (0, ε), and let u(t) be such that −ε ≤ u_1(t) ≤ ε. Then

    (d/dt)(d²(u, v) − d²(u, w)) = |2(u_1 − ε)u̇_1 − 2(u_1 + ε)u̇_1 + 4u̇_1 u_2 + 4u_1 u̇_1|
        ≥ 4|u̇_1||u_2| − 8ε(|u̇_1| + |u̇_2|) = 4|u̇_1||u_2| − 4‖u̇‖_{ℓ¹}‖v − w‖_{ℓ¹}.

If we choose u_1 and u_2 accordingly, then we obtain the desired result. For example, let u̇_1 ∈ {−1, 1} and u_2 ≡ 1; then (3.46) holds with c_0 = 2.
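The failure mechanism in Proposition 35 can be observed numerically. The concrete choices below are an assumed instance of the construction (curve kept in the region −ε < u₁ < 0, u̇₁ = 1, u₂ ≡ 1, rather than the oscillating u̇₁ of the proof): the difference quotient stays of order one while ‖u̇‖_{ℓ¹}‖v − w‖_{ℓ¹} = 2ε, so the ratio blows up like 1/ε.

```python
# Numerical illustration of Proposition 35: in (R^2, l1) the quantity
# d/dt [d^2(u,v) - d^2(u,w)] is not controlled by ||u'|| * ||v - w||.
# The concrete choices below are an assumed instance of the construction:
# v = (-eps, 0), w = (0, eps), u(t) = (-eps/2 + t, 1), so that u_1 stays
# in (-eps, 0), ||u'||_l1 = 1 and ||v - w||_l1 = 2*eps.
def l1(p):
    return abs(p[0]) + abs(p[1])

def f(t, eps):
    u = (-eps / 2 + t, 1.0)
    v, w = (-eps, 0.0), (0.0, eps)
    return l1((u[0] - v[0], u[1] - v[1])) ** 2 - l1((u[0] - w[0], u[1] - w[1])) ** 2

for eps in (1e-1, 1e-2, 1e-3):
    h = eps / 100
    deriv = (f(h, eps) - f(0.0, eps)) / h    # stays near 4, independent of eps
    ratio = deriv / (1.0 * 2 * eps)          # grows like 2/eps, as in (3.46)
    print(round(deriv, 6), round(ratio, 2))
```

In this region f(t) is in fact affine with slope exactly 4, which matches the lower bound of (3.46) with c_0 = 2 up to the (1 − 2ε) slack.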
Interestingly, some metrics which are quite far from being Hilbert space metrics satisfy (3.25) and (3.39) as well. Consider, for example, the following metric defined on BV(Ω) ∩ L²(Ω), which is related to strict convergence in BV(Ω):

    d(v, w) = [ ‖v − w‖²_{L²} + (|Dv|(Ω) − |Dw|(Ω))² ]^{1/2},

where |Du|(Ω) is the total variation of u in Ω. It can quite easily be checked that this metric satisfies (3.25) and (3.39). The reason is the one-dimensional structure of the second term in d. It would, however, be very difficult to find curves v_t which are candidates for (3.2).

Acknowledgements I would like to thank Johannes Zimmer for bringing to my attention the excellent book by Luigi Ambrosio, Nicola Gigli and Giuseppe Savaré on the theory of curves of maximal slope. I have profited from several helpful discussions with Endre Süli and Johannes Zimmer during the writing of this paper. Section 3.5 is based on some comments by Bernd Kirchheim. I am indebted to him as well.
References

[AGS05] L. Ambrosio, N. Gigli, and G. Savaré. Gradient Flows in Metric Spaces and in the Space of Probability Measures. Birkhäuser Verlag, 2005.

[Bra02] A. Braides. Γ-convergence for beginners, volume 22 of Oxford Lecture Series in Mathematics and its Applications. Oxford University Press, Oxford, 2002.

[Dac89] B. Dacorogna. Direct Methods in the Calculus of Variations. Springer Verlag, 1989.

[FP04] X. Feng and A. Prohl. Analysis of gradient flow of a regularized Mumford-Shah functional for image segmentation and image inpainting. M2AN Math. Model. Numer. Anal., 38(2):291–320, 2004.

[Gob98] M. Gobbino. Gradient flow for the one-dimensional Mumford-Shah functional. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4), 27(1):145–193 (1999), 1998.

[NS] R. Nochetto and G. Savaré. Nonlinear evolution governed by accretive operators in Banach spaces: error control and applications. (unpublished) http://www.math.umd.edu/~rhn/publications.html.