PROBABILITY METRICS AND UNIQUENESS OF THE SOLUTION TO ...

2 downloads 0 Views 220KB Size Report
2. G. TOSCANI AND C. VILLANI. In this paper, we shall be interested with the case s = 2. To under- stand why this case separates in a natural way from the other ...
PROBABILITY METRICS AND UNIQUENESS OF THE SOLUTION TO THE BOLTZMANN EQUATION FOR A MAXWELL GAS G. TOSCANI AND C. VILLANI Abstract. We consider a metric for probability densities with finite variance on Rd , and compare it with other metrics. We use it for several applications, including a uniqueness result for the solution of the spatially homogeneous Boltzmann equation for a gas of true Maxwell molecules.

Key-words : spatially homogeneous Boltzmann equation, probability metrics, Maxwellian molecules. 1. Introduction Denote by Ps (Rd ), s > 0, the class of all probability distributions F on Rd , d ≥ 1, such that Z |v|s dF (v) < ∞.

Rd

We introduce a metric on Ps (Rd ) by (1)

|fb(ξ) − gb(ξ)| |ξ|s ξ∈Rd

ds (F, G) = sup

where fb is the Fourier transform of F , Z fb(ξ) = e−iξ·v dF (v).

Rd

Let us write s = m + α, where m is an integer and 0 ≤ α < 1. In order that ds (F, G) be finite, it suffices that F and G have the same moments up to order m. The norm (1) has been introduced in [6] to investigate the trend to equilibrium of the solutions to the Boltzmann equation for Maxwell molecules. There, the case s = 2 + α, α > 0, was considered. Further applications of ds , with s = 4, were studied in [3], while the cases s = 2 and s = 2 + α, α > 0, have been considered in [4] in connection with the so-called Mc Kean graphs [8]. 1

2

G. TOSCANI AND C. VILLANI

In this paper, we shall be interested with the case s = 2. To understand why this case separates in a natural way from the other ones, let us briefly introduce and discuss other well-known metrics on Ps (Rd ). Let F , G in Ps (Rd ), and let Π(F, G) be the set of all probability distributions L in Ps (Rd × Rd ) having F and G as marginal distributions. Let Z (2) Ts (F, G) = inf |v − w|s dL(v, w). L∈Π(F,G)

1/s

Then τs = Ts metrizes the weak-* topology T W∗ on Ps (Rd ). We note that T1 is the Kantorovich-Vasershtein distance of F and G [7, 18]. For a detailed discussion, and application of these distances to statistics and information theory, see Vajda [17]. See also [10] for a recent application to kinetic theory. The case s = 2 was introduced and studied independently by Tanaka [15] who, in the one-dimensional case d = 1, applied T2 to the study of Kac’s equation. Subsequently, the properties of T2 were studied in the multidimensional case by Murata and Tanaka [9]. Applications to the kinetic theory of rarefied gases were finally studied by Tanaka in [16]; several of these applications were given a simplified proof in [12]. The importance of Tanaka’s distance τ2 mainly relies on its convexity and superadditivity with respect to rescaled convolutions. We recall this property, that is at the basis of most of the applications of T2 . Let {X0 , Y0 }, {X1 , Y1 } be two independent pairs of random variables, and let Fi (resp. Gi ) be the probability distribution of Xi (resp. Yi ), i = 0, 1. Gλ√ ) be the probability distribution √ For 0 < √ √ λ < 1, let Fλ (resp. of λX0 + 1 − λX1 (resp. λY0 + 1 − λY1 ), i.e. µ ¶ µ ¶ 1 · · 1 (3) Fλ = d/2 F0 √ ∗ F1 √ . λ (1 − λ)d/2 1−λ λ Then, (4)

T2 (Fλ , Gλ ) ≤ λT2 (F0 , G0 ) + (1 − λ)T2 (F1 , G1 ).

Superadditivity is also known for convex functionals (relative entropies), like Boltzmann’s relative entropy Z f (v) f dv, (5) H(f |M ) = f (v) log f M (v) Rd where f is a probability density and M f is the Gaussian density with the same mean vector and variance as those of f . This means that the property (4) holds with T2 replaced by H and {G0 , G1 } replaced by {M f0 , M f1 }. This is a consequence of Shannon’s entropy power

PROBABILITY METRICS AND UNIQUENESS

3

inequality (Cf [13, 1]). The same property holds for the relative Fisher information, Z ¯ ¯ f ¯∇ log f (v) − ∇ log M f (v)¯2 f (v) dv (6) I(f |M ) =

Rd

(see again [13, 1]). As discussed by Csiszar [5], by means of the relative entropy H, one can define the so-called H-neighbourhoods. Even if those do not define a topological space, in the usual sense, their topological structure is finer than the metric topology defined by the total variation distance, Z ν(f, g) = |f (v) − g(v)| dv, in the sense that (7)

ν(f, g) ≤

p

2H(f |g),

which is the so-called Csiszar-Kullback inequality. It turns out that these properties of superadditivity and convexity also hold for d2 , the proofs being in fact much more simple. As an illustration of the interest of these properties, we shall give a version of the central limit theorem and a very simple proof of Kac’s theorem. We shall also apply d2 to the study of the Boltzmann equation with Maxwellian molecules, ∂f (8) (t, v) = Q(f, f )(t, v) ∂t µ ¶ Z u·n = σ [f (v 0 )f (w0 ) − f (v)f (w)] dw dn, |u| 3 2 R ×S where u = v − w is the relative velocity of colliding particles with velocity v and w, and v + w |u| v + w |u| v0 = + n, w0 = − n 2 2 2 2 are the postcollisional velocities. σ(ν) is a nonnegative function which for true Maxwell molecules has a nonintegrable singularity of the form (1 − ν)−5/4 as ν → 1. Usually, one truncates σ in some way, so that it become integrable (cut-off assumption). We shall prove that d2 shares a remarkable property with Tanaka’s distance : it is nonexpanding with time along trajectories of the Boltzmann equation; that is, if f and g are two such solutions, (9)

d2 (f (t), g(t)) ≤ d2 (f (0), g(0)).

This holds even if σ is singular. As an immediate corollary, we obtain that the solution to the Cauchy problem for the Boltzmann equation is

4

G. TOSCANI AND C. VILLANI

unique. Up to our knowledge, this is so far the only uniqueness result available for long-range interactions. The organization of the paper is as follows. First, in section 2, we investigate the connections between several distances on P2 (Rd ), including d2 and τ2 . In section 3, we establish the basic properties of superadditivity for d2 and give applications. Then, in section 4, we apply this metric to the study of the Boltzmann equation. 2. Metrics on P2 (Rd ) In order that d2 be well-defined, we need to restrict it to some space of probability densities with the same mean vector. For simplicity, we shall restrict to probability measures with zero mean vector, and we shall work on ½ ¾ Z Z d 2 (10) Dσ = F ∈ P2 (R ); vi dF (v) = 0, |v| dF (v) = d · σ where σ is some positive real number. We begin with two elementary lemmas. Lemma 1. Let (Fn ) in Dσ , Fn * F in P0 (Rd ). Then Z (11) F ∈ Dσ ⇐⇒ lim sup |v|2 1|v|≥K dFn (v) = 0. K→∞ n→∞

R Proof. It is clear that if limK→∞ supn→∞ |v|2 1|v|≥K dFn (v) = 0, then F ∈ Dσ . On the other hand, if this condition is not satisfied, then R 2 there exists ε > 0 and (Kn ) → ∞ with |v| R 1|v|≤Kn dFn (v) ≤ d · σ − ε. Since |v|2 1|v|≤Kn * |v|2 , this implies that |v|2 dF (v) ≤ dσ − ε, hence F ∈ / Dσ . ¤ Lemma 2. Let (Fn ) and F in Dσ , such that Fn * F . Then, there exists a nonnegative function φ such that φ(r)/r −→ ∞ as r → ∞, which can be chosen smooth and convex, and a constant M > 0, such that Z (12) sup φ(|v|2 ) dFn (v) ≤ M. n

Proof. Using Lemma 1, copy the construction thatRwas done in [6] for one single function : there exists k1 such that supn |v|≥k1 |v|2 dFn (v) ≤ R 1/2; then there exists k2 ≥ k1 such that supn |v|≥k2 |v|2 dFn (v) ≤ 1/4; and so on... for all p ≥ 2, there exists kp ≥ kp−1 such that Z 1 sup |v|2 dFn (v) ≤ p . 2 n |v|≥kp

PROBABILITY METRICS AND UNIQUENESS

5

Then choose φ(|v|2 ) = p|v|2 if |v| ∈ [kp , kp+1 ); smooth this function and slow its growth if necessary, as in [6]. ¤ In the sequel of the paper, for M > 0, we shall denote by ½ ¾ Z M d 2+α (13) P2+α (R ) = F ∈ Dσ ; |v| dF (v) ≤ M , (14)

PφM (Rd )

½ ¾ Z 2 = F ∈ Dσ ; φ(|v| ) dF (v) ≤ M

for φ nonnegative, φ(r)/r −→ ∞ as r → ∞. Lemma 2 enables us to restrict to spaces PφM in all the cases when one is interested with weak convergence in Dσ . In addition to the metrics d2 and τ = τ2 which were introduced in the last section, we consider • Prokhorov’s distance ρ(F, G) : for δ ≥ 0 and U ⊂ Rd , we define © ª © ª U δ = v ∈ Rd ; d(v, U ) < δ , U δ] = v ∈ Rd ; d(v, U ) ≤ δ , where d(v, U ) = inf{kv − wk, w ∈ U }. Let © ª σ(F, G) = inf ε > 0/F (A) ≤ G(Aε ) + ε for all closed A ⊂ Rd ; we set (15)

ρ(F, G) = max{σ(F, G), σ(G, F )}.

• the (C m )∗ distance kF − Gk∗m : for m ≥ 1, let C m (Rd ) be the set of m-times continuously differentiable functions, endowed with its natural norm k · km . Then, let ¯ ½¯Z ¾ ¯ ¯ ∗ m ¯ ¯ (16) kF − Gkm = sup ¯ ϕ dF (v)¯ , ϕ ∈ C , kϕkm ≤ 1 . Theorem 1. Let (Fn ) in Dσ and F in P2 (Rd ). Then, the following statements are equivalent. (i) Fn * F and F ∈ Dσ ; (ii) τ (Fn , F ) −→ 0; (iii) ρ(Fn , F ) −→ 0 and F ∈ Dσ ; (iv) For any m ≥ 1, kFn − F k∗m −→ 0, and F ∈ Dσ ; (v) d2 (Fn , F ) −→ 0. Proof. Here we shall only show (i) ⇐⇒ (v). First, if d2 (Fn , F ) −→ 0, then obviously, for all ξ ∈ Rd \ {0}, fbn (ξ) −→ fb(ξ); on the other hand, fbn (0) = fb(0) = 1. This entails that Fn * F ; it remains to prove that F ∈ Dσ . But for fixed ξ, |ξ| = 1, t ≥ 0, since the first derivatives of

6

G. TOSCANI AND C. VILLANI

fbn and fb coincide at the origin, and since their Fourier transforms are twice continuously differentiable, ¯ ¯ ¯ ¯ ¯ ¯ b b f (tξ) − f (tξ) ¯ 2 b ¯ ¯ ¯ n b D ( f − f )(0) · (ξ, ξ) = lim ¯ ¯ ¯ ¯ ≤ d2 (Fn , F ) −→ 0. n ¯t→0 ¯ t2 Conversely, suppose that Fn * F ∈ Dσ . By Lemma 2, there exists φ and M such that Fn ∈ PφM . By Lemma 3.1 of [6], all D2 fbn have a uniform modulus of continuity. In particular, for ε > 0, there exists δ > 0 such that ∀n |ξ| ≤ δ ⇒ |D2 fbn (ξ) − D2 fbn (0)| ≤ ε. Since Z 1h i µξ ξ ¶ fbn (ξ) − fb(ξ) 2b 2b = D fn (tξ) − D fn (0) · , (1 − t) dt |ξ|2 |ξ| |ξ| 0 iµ ξ ξ ¶ Z 1h iµ ξ ξ ¶ 1h 2b 2b 2b 2b + D fn (0) − D f (0) , − D f (tξ) − D f (0) · , (1−t) dt, 2 |ξ| |ξ| |ξ| |ξ| 0 we obtain

|fbn (ξ) − fb(ξ)| ≤ ε. |ξ|2 |ξ|≤δ On the other hand, clearly, there exists D > 0 such that |fbn (ξ) − fb(ξ)| |ξ| ≥ D =⇒ ≤ ε. |ξ|2 sup

b

b

f (ξ)| Thus d2 (Fn , F ) ≤ max{ε, supδ≤|ξ|≤D |fn (ξ)− } ≤ max{ε, supδ≤|ξ|≤D |ξ|2 fb(ξ)|}. Now, since Fn * F we have

1 b |f (ξ)− δ2 n

∀ξ, fbn (ξ) −→ fb(ξ); and since |D2 fbn (ξ)| ≤ dσ and Dfbn (0) = 0, (fbn ) is uniformly equicontinuous on the compact set {δ ≤ |ξ| ≤ D}. By Ascoli’s theorem, this entails that supδ≤|ξ|≤D |fbn (ξ) − fb(ξ)| goes to 0, and thus there exists n0 ≥ 0 such that for n ≥ n0 , d2 (Fn , F )) ≤ ε. ¤ Next, we would like to compare more precisely these different metrics. We shall say that two metrics m1 and m2 define the same weak-* uniformity on a set S ⊂ P2 (Rd ) if for all ε > 0 there exists η > 0 such that for all F, G ∈ S, m1 (F, G) ≤ η =⇒ m2 (F, G) ≤ ε, m2 (F, G) ≤ η =⇒ m1 (F, G) ≤ ε.

PROBABILITY METRICS AND UNIQUENESS

7

Theorem 2. Let M > 0 and φ be fixed. Then, for any m ≥ 1, τ , ρ, k · k∗m and d2 define the same weak-* uniformity in PφM . In order to simplify the proof, we shall prove this theorem only on M P2+α , where the bounds are explicit. In all the sequel α > 0, M > 0 and m ≥ 1 will be fixed. The main part of the proof has already been performed in [6], therefore we shall only “fill the gaps”. We split the proof in three steps. M First step : τ and ρ define the same weak-* uniformity on P2+α . More precisely, α

(17)

(a) τ (F, G)2 ≤ (2M + 8)ρ(F, G) α+2 + 4ρ(F, G)2 ; (b) ρ(F, G) ≤ τ (F, G)3/2 .

Proof. Part (a) is Theorem 5.2 of [6]. As for part (b), let β > 0, then there exists L∗ ∈ Π(F, G) such that Z T2 (F, G) = |v − w|2 dL∗ (v, w) ≥ β 2 L∗ (|v − w| ≥ β) . Chosing β = T2 (F, G)1/3 , we see that L∗ (|v − w| ≥ β) ≤ β. By Theorem 5.1 of [6], this implies that F (A) ≤ G(Aδ] ) + β for all closed A ⊂ Rd . Thus, ρ(F, G) ≤ β = τ (F, G)3/2 . ¤ Second step : For all m ≥ 1, ρ and k · k∗m define the same weak-* uniformity on P0 (Rd ). This is done by putting together Lemma 5.3 and Corollary 5.5 in [6]. Third step : For all m ≥ 1, k · k∗m and d2 define the same weak-* M uniformity on P2+α . More precisely, d+1

(18)

2

(a) kF − Gk∗d+3 ≤ Cd σ d+3 d2 (F, G) d+3 ; α 2 (b) d2 (F, G) ≤ Cd M 2+α (k|F − Gk∗1 ) 2+α .

Proof. (a) is an easy adaptation of Lemma 5.7 in [6]. Let ϕ ∈ C d+3 (Rd ), kϕkd+3 ≤ 1; let R ≥ 1. Let χR be a smooth function such that 0 ≤ χR ≤ 1, χR = 1 for |v| ≤ R, χR (v) = 0 for |v| ≥ R + 1. We estimate first the tails of the distributions. ¯ Z ¯Z Z ¯ ¯ 2σ ¯ (1 − χR )ϕ d(F − G)¯ ≤ dF + dG ≤ 2 . ¯ ¯ R |v|≥R |v|≥R

8

G. TOSCANI AND C. VILLANI

Then, by Parseval, ¯Z ¯ ¯Z ¯ ¯ ¯ ¯ ¯ b ¯ χR ϕ d(F − G)¯ = ¯ ϕχ ¯ ¯ ¯ ¯ dR (ξ)[f (ξ) − gb(ξ)] dξ ¯ £ ¤ |fb(ξ) − gb(ξ)| d+3 sup χd ) ≤ sup R ϕ(ξ)(1 + |ξ| 2 |ξ| Rd Rd

Z

|ξ|2 dξ. 1 + |ξ|d+3

Using the classical inequality © ª d+1 sup {(1 + |ξ|m )|χd |Dm (χR ϕ)(v)| ≤ Cd (1+R)d+1 , R ϕ(ξ)|} ≤ Cd sup (1 + |v|) ξ

v

we conclude that ¯ ¯Z ¯ ¯ ¯ ϕ d(F − G)¯ ≤ 2σ + Cd Rd+1 d2 (F, G). ¯ ¯ R2 Optimizing over R, we get the desired result. As for (b) : clearly, ¯Z ¯ ¯Z ¯ ¯ ¯ sin(ξ · v) ¯ |fb(ξ) − gb(ξ)| ¯¯ cos(ξ · v) ¯+¯ ¯. ≤ d(F − G) d(F − G) ¯ ¯ ¯ ¯ |ξ|2 |ξ|2 |ξ|2 Let us estimate for instance the first term in the right-hand side. First we note that Z Z cos(ξ · v) cos(ξ · v) − 1 d(F − G) = d(F − G). |ξ|2 |ξ|2 For fixed ξ, Φξ (v) =

cos(ξ · v) − 1 |ξ|2

is a C 2 function, vanishing at the origin, as well as its first-order derivatives, and elementary estimates show that |Φ0ξ (v)| ≤ |v|, Φξ (v) ≤ |v|2 /2, hence kΦξ χR k1 ≤ CR2 , whence by definition ¯ ¯Z ¯ 1 ¯¯ Φξ χR d(F − G)¯¯ ≤ kF − Gk∗1 . ¯ 2 CR On the other hand, ¯ ¯Z ¯ ¯ ¯ (1 − χR )Φξ d(F − G)¯ ≤ 2M . ¯ ¯ Rα Optimizing over R, we get the result.

¤

PROBABILITY METRICS AND UNIQUENESS

9

3. Superadditivity Let {X0 , Y0 } and {X1 , Y1 } be two independent pairs of random variables with zero mean vector. Let Fi (resp. Gi ) denote the law of Xi (resp. Yi ). For 0 < λ < 1, the law of √ √ Xλ = λ X0 + 1 − λ X1 is

¶ ¶ µ 1 · · ∗ . (19) Fλ = d/2 F0 √ F1 √ λ (1 − λ)d/2 1−λ λ Theorem 3. d2 is superadditive with respect to the rescaled convolution, i.e. 1

(20)

µ

d2 (Fλ , Gλ ) ≤ λd2 (F0 , G0 ) + (1 − λ)d2 (F1 , G1 ).

Corollary. The rescaled convolution is continuous with respect to d2 : with obvious notations, if d2 (F0n , G0 ) −→ 0 and d2 (F1n , G1 ) −→ 0, then d2 (Fλn , Gλ ) −→ 0. Proof. We denote by fbi and gbi the Fourier transforms of Fi and Gi . Since √ √ fbλ (ξ) = fb0 ( λ ξ)fb1 ( 1 − λ ξ), and an analogous formula holds for gbλ , we have √ √ √ √ |fb0 ( λ ξ)fb1 ( 1 − λ ξ) − gb0 ( λ ξ)gb1 ( 1 − λ ξ)| d2 (Fλ , Gλ ) = sup |ξ|2 ξ ( √ √ √ |fb1 ( 1 − λ ξ) − gb1 ( 1 − λ ξ)| b = sup (1 − λ)|f0 ( λ ξ)| (1 − λ)|ξ|2 ξ ) √ √ √ |fb0 ( λ ξ) − gb0 ( λ ξ)| +λ|gb1 ( 1 − λ ξ)| . λ|ξ|2 Since kfb0 k∞ ≤ 1 and kgb1 k∞ ≤ 1, this expression can be bounded by (1 − λ) d2 (F1 , G1 ) + λ d2 (F0 , G0 ). ¤ Remark. This property is sufficient to imply that d2 is nonexpandive along solutions of the Boltzmann equation if d = 2 (Cf. [2]) . But we shall show in the next section that this restriction can be dispended with.

10

G. TOSCANI AND C. VILLANI

As an application, for X a random variable with law F , let us consider the functional dF | |fb − M (21) J(X) = J(F ) = d2 (F, M F ) = sup , |ξ|2 ξ where this time M F is the Gaussian distribution with the same mean vector and covariance matrix as F . The same proof as before shows the Theorem 4. For any two independent random variables X0 and X1 , and 0 < λ < 1, ³√ ´ √ (22) J λ X0 + 1 − λ X1 ≤ λJ(X0 ) + (1 − λ)J(X1 ) with equality if and only if X0 and X1 are gaussian variables with the same mean vector and covariance matrix. Proof. Suppose that there is equality in the proof of Theorem 3, with G0 and G1 replaced by a centered gaussian probability law (one can always reduce to this case). Suppose that F0 6= M F0 . Let us denote by g0 and g1 the densities of M F0 and M F1 . Let (ξn ) such that √ √ √ √ √ |fb0 ( λ ξn )fb1 ( 1 − λξn ) − gb0 ( λ ξn )gb1 ( 1 − λ ξn )| n→∞ √ −→ J( λ X + 1 − λ X1 ); 0 |ξn |2 the left-hand side is bounded by ( √ √ √ |fb1 ( 1 − λ ξn ) − gb1 ( 1 − λ ξn )| b (1 − λ)|f0 ( λ ξn )| (1 − λ)|ξn |2 ) √ √ √ |fb0 ( λ ξn ) − gb0 ( λ ξn )| . +λ|gb1 ( 1 − λ ξn )| λ|ξn |2 √ √ If |ξn | −→ ∞, then |gb1 (ξn )| ≤ 1/2 for large n, and J( λ X0 + 1 − λ X1 ) ≤ (1 − λ)J(X1 ) + λ/2J(X1 ), which is impossible since J(X1 ) 6= 0. Therefore, extracting a subsequence if necessary, we may assume that ξn −→ ξ ∈ Rd . The same argument shows then that ξ = 0. But by defintion, √ √ |fb0 ( λ ξn ) − gb0 ( λ ξn )| −→ 0 λ|ξn |2 as n → ∞. Hence, J(Xλ ) = 0, and J(X0 ) = J(X1 ) = 0.

¤

As remarked by Murata and Tanaka [9] and others, a functional having these properties can be used to several applications, as the following.

PROBABILITY METRICS AND UNIQUENESS

11

First application : the central limit theorem. Let (Xn ) be a sequence of independent identically distributed variables with finite variance, and let (23)

1 ξn = √ (X1 + · · · + Xn ). n

Then, if F denotes the common probability law of each Xn , Gn the probability law of ξn , and G the gaussian probability law with the same mean vector and covariance matrix as those of F , (24)

d2 (Gn , G) −→ 0 as n → ∞.

Moreover, given ε > 0, knowing d2 (F, G) and a modulus of continuity of D2 fˆ at the origin, one can compute explicitly n0 such that for n ≥ n0 , d2 (Gn , G) ≤ ε. Proof. For P and Q two probability let us denote by P ◦Q √ √ distributions, −d the rescaled convolution 2 P (·/ 2) ∗ Q(·/ 2). Let ηk = ξ2k , and Pk its probability law, then, thanks to Theorem 4, J(ηk+1 ) = J(Pk ◦ Pk ) ≤ J(Pk ) = J(ηk ) : (J(ηk )) is decreasing, hence converging to some limit l. Admit for a while that there exists Q so that for some subsequence kp , d2 (Pkp , Q) −→ 0, so that l = J(Q). Then, since the rescaled convolution is continuous with respect to d2 , J(Pkp ◦ Pkp ) −→ J(Q ◦ Q); but it is also J(Pkp +1 ) −→ l = J(Q), so that J(Q ◦ Q) = J(Q). This implies that Q is the gaussian distribution G, whence J(ηk ) −→ 0. P Now, any integer n ≥ 1 can be written n0 αk 2k with αk ∈ {0, 1}, and 1X αk 2k J(ηk ) −→ 0. J(ξn ) ≤ n Finally, to prove that (Pk ) has a weak cluster point with respect to note that the Fourier transform of to the topology³of d2 , it suffices ´n √ Fn is fbn (ξ) = fb(ξ/ n) , whose second derivatives can be readily computed, µ µ µ µ ¶¶n ¶ ¶ µ ¶n−2 n − 1 ξ ξ ξ ξ 2 = Dij fb √ Di fb √ Dj fb √ fb √ n n n n n µ µ ¶n−1 ¶ ξ ξ 2 b b +f √ Dij f √ . n n

12

G. TOSCANI AND C. VILLANI

If one denotes by ψ a modulus of continuity of D2 fb near 0, i.e. ¯ ¯ ¯ 2b ¯ ¯D f (ξ) − D2 fˆ(0)¯ ≤ ψ(|ξ|), where ψ is chosen to be increasing from 0, we obtain at once that for D2 fbn one can take as modulus of continuity µ ¶ t 2 2 ψn (t) = σ t + ψ √ . n ψn is bounded, uniformly in n, by ψ1 (t) = σ 2 t2 + ψ(t). This is enough to conclude. Stated in this form, this would seem to be only an overcomplicated way of proving the central limit theorem; but the interest of this method is that it can immediately lead to explicit computations. Indeed, let ε > 0, and let us look for n0 such that for n ≥ n0 , d2 (Gn , G) ≤ ε. First, since J(Pk ) is decreasing, it we look for k0 such that J(Pk0 ) ≤ ε. The proof of Theorem 4 clearly shows that ( Ã Ã ! !) 2 1 + e−|ξ| /2 J(Pk+1 ) = J(Pk ◦ Pk ) ≤ sup inf ψ(|ξ|), J(Pk ) . 2 ξ Let η such that ψ(η) ≤ ε. As long as J(Pk ◦ Pk ) ≥ ε, the supremum can only be obtained for |ξ| ≥ η, hence J(Pk+1 ) ≤ µJ(Pk ) with 1 + e−η µ= 2

2 /2

< 1.

Therefore, one can take k0 = −

log J(X1 ) . log µ

Now, for n ≥ 2k0 , one writes X αk 2k X αk 2k 2k0 J(X1 ) + ε ≤ J(X1 ) + ε, J(Xn ) ≤ n n n k≥k k

Suggest Documents