On logarithmic Sobolev inequalities, Csisz ar ...

111 downloads 0 Views 619KB Size Report
Csisz ar-Kullback inequalities, and the rate of convergence to equilibrium for Fokker-Planck type equations. June 9, 1998. Anton Arnold1, Peter Markowich2, ...
On logarithmic Sobolev inequalities, Csiszar-Kullback inequalities, and the rate of convergence to equilibrium for Fokker-Planck type equations June 9, 1998

Anton Arnold1 , Peter Markowich2, Giuseppe Toscani3, Andreas Unterreiter4 1

Fachbereich Mathematik, TU{Berlin, MA 6-2, D{10623 Berlin, Germany

2

Institut f ur Analysis und Numerik, Universit at Linz, A-4040 Linz, Austria

3

Dipartimento di Matematica, Universit a di Pavia, I-27100 Pavia, Italy

Fachbereich Mathematik, Universit at Kaiserslautern, D-67653 Kaiserslautern, Germany 4

[email protected], [email protected], [email protected], [email protected]

1

Acknowledgements This research was partially supported by the grants ERBFMRXCT970157 (TMRNetwork) from the EU, the bilateral DAAD-Vigoni Program, the DFG under Grant-No. MA 1662/1-3, and the NSF (DMS-9500852). The rst author acknowledges fruitful discussions with L. Gross and D. Stroock, the second author with D. Bakry, and the second and third authors interactions with C. Villani. AMS 1991 Subject Classication: 35K57, 35B40, 47D07, 26D10, 60J60

Abstract It is well known that the analysis of the large-time asymptotics of Fokker-Planck type equations by the entropy method is closely related to proving the validity of logarithmic Sobolev inequalities. Here we highlight this connection from an applied PDE point of view. In our unied presentation of the theory we present new results to the following topics: an elementary derivation of Bakry-Emery type conditions, results concerning perturbations of invariant measures with general admissible entropies, sharpness of convex Sobolev inequalities, explicit construction of optimal Csiszar-Kullback inequalities, applications to nonsymmetric linear (e.g. kinetic), and certain non-linear Fokker-Planck type equations (Desai-Zwanzig model, drift-di usion-Poisson model).

Contents 1 Introduction

3

2 Decay of the relative Entropy as t! 1

9

2.1 Spectral gap of the symmetrized equation . . . . . . . . . . . . . 2.2 Admissible relative entropies and their generating functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3 Exponential decay of the entropy production and the relative entropy 2.4 Convex Sobolev inequalities in entropy version . . . . . . . . . . . 2.5 Non-symmetric Fokker-Planck equations . . . . . . . . . . . . . .

2

10 12 18 28 30

3 Sobolev Inequalities 3.1 3.2 3.3 3.4

32

Three versions of convex Sobolev inequalities . Perturbation lemmata for the potential A(x) . Poincare-type inequalities . . . . . . . . . . . Sharpness results . . . . . . . . . . . . . . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

4 A generalized Csiszar-Kullback Inequality 4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8

Optimality . . . . . . . . . . . . . . . . . . . . . . . . . . . An extension of the inequality . . . . . . . . . . . . . . . . Comparison with the classical Csiszar-Kullback inequality An example . . . . . . . . . . . . . . . . . . . . . . . . . . Asymptotic behavior of U as e (u) ! 0 . . . . . . . . . . . Passing from weak to strong convergence in L1 (d) . . . . The optimal inequality for the logarithmic entropy . . . . . Entropy-type estimates on kf ; gkL1(d) . . . . . . . . . .

5 Nonlinear Model Problems

32 35 39 40

44 . . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

51 52 54 54 55 58 59 60

62

5.1 Desai-Zwanzig type models . . . . . . . . . . . . . . . . . . . . . . 63 5.2 The drift-diusion-Poisson model . . . . . . . . . . . . . . . . . . 67

6 Appendix: Proofs of Section 4

70

1 Introduction One of the fundamental problems in kinetic theory is the analysis of the time decay (rate) of solutions of kinetic models towards their equilibrium states. The maybe most often applied methodology in time asymptotics is the entropy method, where the convergence towards equilibrium is concluded using the time monotonicity of the physical entropy of the system. A classical example for this approach is provided by the spatially homogeneous Boltzmann equation, where convergence to the Maxwellian equilibrium state has been proven in this way (cf., e.g., CeIlPu94], Theorem 6.4.1 and the references therein). 3

To illustrate the entropy approach we shall now present two simple (even explicitly solvable) kinetic model problems outlining the ideas which we will develop in this paper. At rst we consider the Bhatnagar-Gross-Krook (BGK) model of gas dynamics, which is a `simplied' version of the Boltzmann equation carrying to some extent the same amount of physical information BhGrKr54]. The IVP for the space homogeneous version reads: @f = J (f ) :=  ;M f ; f   v 2 IRn t  0 (1.1a) @t

f (v t = 0) = fI (v) v 2 IRn (1.1b) (of n = 3 is of physical interest), with the normalization R course, only the case f f dv = 1. Here M = M f (v) denotes the Maxwellian distribution function IRn I   2 j v ; u j ; n= 2 f (1.2) M (v) = m (2) exp ; 2 with mass m= mean velocity

=

IRn

Z

u= and temperature

Z

IRn

Z

IR

n

fdv

vfdv=m

(v ; u)2f=nm:

The relaxation rate  is assumed to be a positive constant. It is then an easy exercise to show that m u and  are left invariant under the temporal evolution of (1.1), such that these quantities can be computed from the initial data fI , and M f (t)  M fI follows. The exponential L1 (IRnv)-convergence of f (: t) to its equilibrium state M fI with rate  can be easily veried by explicitly solving (1.1). For carrying out the entropy approach we consider the physical entropy (Boltzmann's H -functional)

H (f ) =

Z

IRn

4

f ln fdv

(1.3)

and the entropy production

I (f ) =

Z

IRn

ln fJ (f )dv

(1.4)

with the easily veriable relation d H (f (t)) = I (f (t)): (1.5) dt Since ln M f is quadratic in v and due to the conservation of mass, mean velocity and temperature, we calculate

I (f (t)) =  = =

Z

ln f (t) M f (t) ; f (t) dv n ;

IRZ

; IR Z ; IR



ln f (t) ; ln M f (t) f (t) ; M f (t) dv

; n

;











Z f ( t ) M f (t) M f (t) dv: ln f ( t ) dv ;  ln n M f (t) f (t) IRn

Since f (t) and M f (t) have the same mass, Jensen's inequality implies that both terms are nonpositive. Thus we obtain the stronger version of Boltzmann's H Theorem

;I (f (t))  e(f (t)jM f t ) ()

(1.6)

where the so called relative (to the Maxwellian) entropy is dened by

e(f j

M f ) :=



Z



f dv: f ln Mf IRn

(1.7)

Note that e(f jM f ) = 0 i f = M f (Gibb's Lemma). Again, since ln M f is quadratic in v and since f (t) and M f (t) have the same mass, velocity and temperature, we have

e(f (t)jM f (t) ) = H (f (t)) ; H (M f (t) )

and since M f (t)  M fI we conclude from (1.5), (1.6) d e ;f (t)jM f (t)  = d H (f (t)) = I (f (t))  ;e ;f (t)jM f (t)  : dt dt Exponential convergence to zero of the relative entropy with rate  follows immediately (for initial data which have nite relative entropy)

e(f (t)jM fI )  e;t e(fI jM fI ) t  0: 5

(1.8)

The Csiszar-Kullback inequality Csi63],Kul59]

kf ; M f kL 2

1(

IRn )

 2e(f jM f )

(1.9)

then gives L1 (IRn)-convergence to equilibrium (at the suboptimal rate 2 ). Summing up, we used the lower bound (1.6) of the (negative) entropy production in terms of the relative entropy to explicitly control the convergence of the (relative and absolute) entropies and to control the strong convergence of the solution to its equilibrium state. The second model problem, which is a two-velocity radiative transfer model, demonstrates that the approach of the rst example may not be general enough. We consider the ODE in IR2:   d = ;1 1  t  0 (1.10a) 1 ;1 dt (0) = 





uI vI



(1.10b)

where (t) = uv((tt)) and > 0: Since we consider (1.10) as a kinetic model, we assume { for physical reasons only { uI  vI  0 which implies u(t) v(t)  0: Obviously, u(t) and v(t) tend exponentially with rate 2 to their equilibrium state w1 = (uI + vI )=2. To apply the entropy approach let = (s) be any smooth strictly convex function, dened on IR+, with (1) = 0. We introduce the relative entropy generated by :   u v e (j1) := ( w ) + ( w ) w1 (1.11) 1 1 where we denoted the steady state   w 1 1 = w : 1 Dierentiating (1.11) with respect to t and using (1.10) gives the entropy equation d e ((t)j ) = I ((t)j ) (1.12) 1 1 dt with the production term   u v 0 0 I (j1) = ; (u ; v) ( w ) ; ( w ) : (1.13) 1 1 Since is strictly convex we have I (j1)  0 with equality i  = 1. Clearly, now we cannot bound the entropy production (1.13) from below directly 6

by a multiple of the relative entropy (1.11). Therefore we consider the time evolution of the entropy production. Dierentiating (1.13) gives d I ((t)j ) = R ((t)j ) (1.14) 1 1 dt with the entropy production rate 2 ( u ( t ) ; v ( t )) 00 ( u(t) ) + 00 ( v (t) ) : R ((t)j1) = ;2 I ((t)j1) + w1 w1 w1 (1.15)





2

We can now bound the entropy production rate from below by a multiple of the entropy production:

R ((t)j1 )  ;2 I ((t)j1 )

and conclude exponential convergence with rate 2 of the entropy production from (1.14)

jI ((t)j1)j  jI (I j1)je; t:

(1.16)

2

Inserting I ((t)j1) from (1.15) into (1.12) and using (1.14) gives after integration from s = t to s = 1 (using limt!1 e((t)j1 ) = limt!1 I ((t)j1) = 0)

e ((t)j1 ) +

Z 1

t

(u(s) ; v(s))2 00( u(s) ) + 00 ( v(s) ) ds = ; I ((t)j1) : 2w1 w1 w1 2 (1.17) 



First of all, (1.17) gives the exponential decay with rate 2 of the relative entropy e , and hence again the suboptimal decay rate of (t) towards 1. Moreover, (1.17) furnishes an inequality involving the strictly convex function , i.e. we obtain the Sobolev-type inequality     v 1 u v u I I I I 0 0 ( w ) + ( w ) w1  2 (uI ; vI ) ( w ) ; ( w ) (1.18) 1 1 1 1

for all uI  vI  0 by setting t = 0 in (1.17). Also, (1.17) implies that equality in (1.18) holds i u(t)  v(t) for all t  0 which is equivalent to uI = vI = w1. Clearly, the inequality (1.18) can easily be obtained in a direct way (cp. Example 2.6 of Gro93]), however, the presented approach allows generalization to much more complex kinetic problems (as will be the subject of the next sections). Particularly, the second example clearly shows that in some cases the entropy equality itself is not su!cient to derive exponential decay of the relative entropy (and equivalently, Sobolev-type inequalities). It may be necessary to consider 7

the evolution of the entropy production and to nd a lower bound of the entropy production rate in terms of the entropy production. This leads directly to the main subject of this paper. In the following sections we shall consider the IVP for Fokker-Planck type equations (cf. (2.1)) and apply the methodology presented above. More specically, Section 2 will be concerned with the decay of the relative entropy as t! 1. There we shall obtain conditions on the entropy, the diusion matrix and on the velocity eld which allows for exponential convergence under weak conditions on the initial data (bounded relative entropy). Section 3 then is concerned with the derivation of various forms of Sobolev-type inequalities and with the analysis of conditions which guarantee that the inequalities `saturate' nontrivially. In Section 4 we generalize the Csiszar-Kullback inequality to rather general convex entropies, and in Section 5 we apply the developed theory to nonlinear Fokker-Planck models. We conclude this introduction with a brief (and incomplete) discussion of the relevant literature and of those issues where our approach diers from already existing ones. Most importantly, our approach is to some extent based on the work by D. Bakry and M. Emery (cf. e.g., BaEm84], Bak91], Bak94]), which provides a very general framework for hypercontractive semigroups and generalized Sobolev-type inequalities using the so-called iterated gradient (;2 ) formalism BaEm86]. In fact, some of our results are concretizations of the Bakry-Emery criterion (cf. the references cited above) to Fokker-Planck type equations. However, the Bakry-Emery approach does not allow an explicit control of the remainder term in the Sobolev inequalities. Let us explain this point in more detail. 
Careful reading of the radiative transfer-type example presented above shows that the analysis of the time decay of the entropy production gives at the same time the \sharp" decay of the relative entropy as well as the remainder in (1.17). The knowledge of this remainder allows to identify in (1.18) the (unique) state saturating the Sobolev-type inequality. Hence, the entropy approach gives at the same time a proof of a \convex" inequality and all cases of equality. As far as the classical Gross logarithmic Sobolev inequality is concerned Gro75], the identication of the cases of equality is due to Carlen Car91]. His approach is based on a slightly dierent method, which relies on information-type inequalities, and ultimately requires a deep investigation of properties of Gaussian functions. In the same paper, the connection between Gross' logarithmic Sobolev inequality and the linear Fokker-Planck equation, through the Fisher measure of information and the Blachman-Stam inequality Bla65], Sta59] has been fruitfully developed. The entropy-entropy production approach for the same FokkerPlanck equation has been recently addressed in Tos96b], Tos97a]. There, the cases of equality follow easily from the identication of the remainder. Physically speaking, the entropy approach put in evidence that the remainder in this type 8

of inequalities depends on the complete dynamics in time of the solution of the (mass-conserving) linear problem that generates the inequality itself. In other words, many Sobolev type inequalities can be interpreted as an estimate for the relative entropy of the initial state with respect to the steady state of a linear mass conserving system. An excellent background reference for logarithmic Sobolev inequalities is the overview paper Gro93]. General linear homogeneous radiative transfer equations generating certain inequalities involving convex functions are analyzed in GaMaUn97]. A calculation of the logarithmic Sobolev constant for non-symmetric ODEs in IR2 (generalizing our second example above) is presented in the appendix of DiSC96]. Entropy production arguments for the decay of the solution of the heat equation in IRn towards the fundamental solution were used in Tos96a]. A further application in kinetic theory is to be found in the asymptotic analysis of the spatially homogeneous Boltzmann equation, which in the so called `grazing collisions limit' leads to the Landau-Fokker-Planck equation Vil96], Vil97], Tos97b]. There it is hoped that the decay results of the relative entropy might give an indication of the rate of decay towards equilibrium of the spatially homogeneous Boltzmann equation.

2 Decay of the relative Entropy as t! 1 We now consider the IVP for the Fokker-Planck type equation t = div(D(r + rA)) x 2 IRn t > 0 (2.1a) 1 n (t = 0) = I 2 L (IR ): (2.1b) 21 with A = A(x) su!ciently regular (i.e. A 2 Wloc (IRn) and e;A 2 L1(IRn). We assume that the symmetric diusion matrix D = D(x) = (dij (x)) is locally 21 uniformly positive denite on IRn and dij 2 Wloc (IRn), i j = 1 : : :  n. Obviously we have the conservation property Z

IRn

(x t)dx =

Z

IRn

I (x)dx:

(2.2)

In a kinetic context, the independent variable x in (2.1) stands for the velocity. In this Section we assume (without restriction of generality) Z

IRn

I (x)dx = 1:

One easily sees that (2.1a) has the steady state 1(x) = e;A(x) 2 L1+ (IRn) 9

(2.3)

R

assuming (w.r.o.g.) A to be normalized as IRn e;A(x) dx = 1. We remark that (by a simple minimum principle) I (x)  0 implies (x t)  0 for all x 2 IRn t  0. In this Section we shall investigate the convergence of (t) towards the steady state 1 in various norms and in relative entropy. In particular we are concerned with equations (2.1a) that exhibit an exponential decay to the steady state (2.3).

2.1 Spectral gap of the symmetrized equation We transform equation (2.1) to symmetric form (on L2 (IRn)). Therefore we set

z := =p 1

which satises the IVP

zt = div(Drz) ; V (x)z x 2 IRn t > 0 z(t = 0) = zI := I =p 1 on IRn:

(2.4)

Here, V (x) denotes the potential 2 V (x) = ; 12 Tr(D @@xA2 ) ; 12 (rA)>DrA + (div D)  rA] x 2 IRn t > 0: (2.5)

is the Hessian of A(x) and the superscript \>" denotes transposition. Now dene the Hamiltonian @2A @x2

Hz = ;div(Drz) + V z

on the domain

(2.6)

DQ := fz 2 L2 (IRn)jQ(z z) < 1g

of the quadratic form Q(z1  z2) := (Hz1  z2)L2 (IRn) given by

Q(z1  z2 ) :=

Z

IRn

r

>





pz 11 D(x)r





pz 21 1(dx)

(obtained from (2.6) by a simple calculation). For the following we shall assume that the diusion matrix D(x) and the potential A(x) are such that H generates a C 0-semigroup on L2 (IRn) (for various su!cient conditions see ReSi86]). Note that H satises   z p  z 2 DQ  Hz = 1N p 1 10

where N is the Dirichlet form-type operator on L2(IRn d 1) dened by (Nu v)L2(IRn d1) :=

Z

>uDrv (dx) r 1 IR n

(see Gro93]). For future reference we also remark that   Z 1 2 2 ; IRn 1 L 1 dx = Q p  p  p 1  p 2 2 DQ: 1 1 1 1

(2.7)

Here L denotes the appropriate extension of the Fokker-Planck type operator

L := div (D (r + rA)) : Obviously the spectrum (H ) is contained in 0 1). Since Q(z) = 0 i z = p const 1 we conclude that the ground state z1 = exp(;A=2) of H is nondegenerate and that 1 is the unique normalized steady state of (2.1a). The solution of (2.4) can be written as Z p z(t) = 1 + (2.8) e;t d(P p I ) 1 (01) where Ppis the projection valued spectral measure of H . Of course, we assume spectral representation we immediately conhere I = 1 2 L2 (IRn). From this p clude the convergence of z(t) to 1 in L2(IRn) as t! 1. Let us now consider a Hamiltonian H with a positive spectral gap 0 (i.e. distance of (H ) nf0g from the eigenvalue 0). For the case D(x)  I, the identity matrix, a simple su!cient condition for a positive spectral gap is given by V 2 L1loc(IRn% IR), V (x) bounded below and V (x)! 1 for jxj! 1 (see e.g. ReSi87a, Th. XIII.67]). In this case we obtain exponential convergence: kz(t) ; p 1kL2(IRn)  kzI ; p 1kL2(IRn)e;t0  (2.9) which implies exponential convergence of (t) to 1 in L1 (IRn): Z Z p 1 j (tp) ; 1j dx j ( t ) ; j dx = 1 1 IRn IRn 1 p  k 1kL2 1(IRn)kz(t) ; 1kL2(IRn) = O(e;t0 ): (2.10) Thispvery simple approach of course only holds under the restrictive assumption I = 1 2 L2 (IRn). In order to weaken this restriction and to better illustrate the convergence of (t) towards 1 we now investigate the decay in relative entropy e( (t)j 1) dened by (1.7). In fact, we shall not only consider this logarithmic \physical entropy", but a wider class of relative entropies that essentially lie `between' this physical entropy and the quadratic functional k (t)k2L2 (IRn ;11(dx)) . 11

For future reference we also consider a second symmetrization of (2.1) (on L2 (IRn d 1)) which is more commonly used in the probabilistic literature on this subject. We set

 := = 1 which satises the IVP

t = ;11 div(D 1r) x 2 IRn t > 0 I := I = 1 2 L1 (IRn d 1):

(2.11)

In terms of this symmetrized problem we shall extend the allowed initial data I from L2 (IRn 1) to the Orlicz space L1 log L(IRn 1).

2.2 Admissible relative entropies and their generating functions De nition 2.1. Let J be either IR or IR+. Let

denotes the closure of J ) satisfy the conditions

2 C (J ) \ C (J ) (where J 4

(1) = 0

(2.12a)

00  0 00 6 0 on J

(2.12b)

( 000)2  1 00 IV on J: 2

(2.12c)

Let 1 2 L1 (IRn ) 2 2 L1+(IRn) with 1dx = 2 dx = 1 and 1 = 2 2 J 2 (dx); a.e. Then R

e ( 1 j 2) :=

R

Z

1 ) (dx) ( n 2

IR

2

(2.13)

is called an admissible relative entropy (of 1 with respect to 2 ) with generating function .

Note that

e ( 1j 2 )  0

follows from Jensen's inequality. 12

(2.14)

Remark 2.2. If satises the conditions (2.12a){(2.12c), so does its "normalization" e( ) = ( ) ; 0(1)( ; 1), and they both generate the same relative entropy: e e( 1 j 2 ) = e ( 1 j 2). In the sequel we will therefore assume that the generator of e be normalized as

0(1) = 0: (2.12d) This implies that the relative entropies e and their generators are bijectively related. Due to the convexity of we have  0 on J: (2.15) We now list some typical examples of admissible relative entropies on J = IR+: The physical relative entropy (1.7) is generated by () =  ln  ;  + 1 rather than by () =  ln . It is a special case of +  ; ( ; 1) () = ( +  ) ln 1 +  > 0   0: (2.16a)  For 1 < p < 2

p() =  ( +  )p ; (1 +  )p ; p(1 +  )p;1( ; 1)   > 0   0 (2.16b) and for p = 2 '() = ( ; 1)2  > 0 (2.16c) generate admissible relative entropies. In the last example we clearly have e'( 1 j 2) = (k 1k2L2 (IRn;2 1 (dx)) ; 1): In the above denition we excluded linear entropy functionals as they would be zero due to the assumed normalizations of 1 and 2 . In the following Lemmata we derive important properties of admissible entropies: 



Lemma 2.3. a) Admissible entropies are generated by strictly convex functions . b) For J = IR all admissible entropies are generated by (2.16c). Proof: a) Let g ( ) := 001( )  

(2.12c) is equivalent to

2 J with g() 2 (0 1].

Then, condition

g00()  0 (2.17) whenever 00 () > 0. Since (2.17) excludes (positive) poles of g on J , we conclude 00 > 0 on J . b) g00  0 and g > 0 on IR implies g() = const > 0, and the assertion follows. 13

Hence, our class of generating functions coincides with those considered in BaEm84] (up to the normalizations (2.12a), (2.12d)). For the following we shall assume J = IR+ and I  0. Except being reasonable from a kinetic viewpoint, this case allows for a richer mathematical structure since only quadratic admissible entropies exist for J = IR. The above examples (2.16a), (2.16c) of admissible entropies include the two limiting cases for the asymptotic behavior (as  ! 1) of the generating function . An admissible relative entropy e ( 1 j 2) can be bounded below by a logarithmic subentropy e and bounded above by a quadratic superentropy e' : Lemma 2.4. Let generate an admissible relative entropy with J = IR+. Then there exists a logarithmic-type function  (2.16a) and a quadratic function ' (2.16c) such that ()  ()  '()  2 J (2.18a) and hence (2.18b) 0  e ( 1 j 2)  e ( 1j 2 )  e' ( 1j 2 ):  and ' both satisfy (2.12) and thus generate, respectively, an admissible suband superentropy for e . Proof: Since J = IR+ , the function g from the proof of Lemma 2.3 satises

g > 0 g0  0 g00  0 on J: (2.19) Now denote the derivatives of the given function by (1) = 0 0 (1) = 0 00 (1) =: 2 > 0 000(1) =: 3  0: (2.20) From (2.19) we readily get the estimate  ;2 1 0 <  < 1  g()   +   > 0 ;2 1  > 1 with  := ;3;2 2  0  := (2 + 3);2 2  0. Integrating the corresponding estimate for 00 = g1 , 

2 = 0 <  < 1 2  > 1 we obtain with (2.20) the upper bound for :    2 ( ln  ;  + 1) 0 <  < 1 ()  2 ( ; 1)2  >1 2  2( ; 1)2 =: '()  > 0: ( + );1



00()



14

(2.21)

(2.22)

To derive the lower bound of one integrates (2.21) twice to show ()  (). For  > 0 the function () is given by (2.16a) with  = 1   = . If  = 0 we set (2.23) () = 22 ( ; 1)2 : The well-known Csiszar-Kullback inequality (Csi63], Kul59]) shows that the logarithmic relative entropy (1.7) is a `measure' forRthe distance between two R n 1 normalized L+(IR )-functions 1 2 with IRn 1 dx = IRn 2dx = 1: 1 k ; k2 n  e( j ): (2.24) 1 2 2 1 2 L1(IR ) In a simple calculation using the subentropy e and Lemma 2.4 this inequality can be extended to any admissible entropy e : For  > 0, and  hence given 2 by (2.16a) we introduce the normalized function e :=  1 ++ 

 0, and estimate using (2.24): 1 k ; k2 n = 22 k e ; k2 n  22 e( ej ) = 1 e ( j ): 2 L1 (IR ) 2 2 1 2 L1 (IR ) 223 23 2 1 2

(2.25)

With (2.18b) this gives 1 e ( j ) 1 k ; k2 (2.26) 1 2 L1 (IRn )  2 2 1 2 with the notation 2 = 00 (1) = 00(1). If  = 0, (2.26) is easily derived with the estimate (2.10) and with (2.23). A detailed analysis of Csiszar-Kullback inequalities for a larger class of generating functions will be the topic of x4. In particular we shall derive a sharper variant of (2.24). In our subsequent analysis we shall need the continuity of the relative entropy: Lemma 2.5. Let jR! (as j ! 1) in L2+(IRn ;11(dx)) with the R the sequence R normalization j dx = dx = 1 dx = 1. Then for all admissible entropies with J = IR+:

e ( j j 1) ! e ( j 1) as j ! 1:

Proof: Since

(2.27)

2 C 0 1] there exists for any " > 0 a positive  = (") such that (2.28) j ( ) ; ( )j < " for 0     ("): 1

12

2

15

Integrating (2.21) we readily obtain the estimate:

9 C () such that j 0()j  C ()(1 + ) for   : (2.29) We shall rst derive estimates on j (a) ; (b)j% a b 2 J , for three dierent cases of a and b. For a b   we use the mean value theorem and the monotonicity of 0 1

1

to estimate:

j (a) ; (b)j  ja ; bj(j 0(a)j + j 0(b)j)  ja ; bjC ()(2 + a + b): (2.30) For the next case assume a   b <  (or vice versa). With (2.30) this yields j (a) ; (b)j  j (a) ; ()j + j () ; (b)j  (a ; )C ()(2 + a + ) + ( ; b)C ()  ja ; bj C ()(2 + a + b + ) + C ()]  (2.31) with C () := supc2  j

;;c c j < 1. Finally, for a b <  we have j (a) ; (b)j < " from (2.28). 1

2

1

1

2

( )

0 )

2

( )

Using these three estimates we can now control the dierence of relative entropies. For arbitrarily small " we estimate:

je ( j j 1) ; e Z( j 1)j     j ; 1 dx  1 1   g f 1  or 1  Z j 

+



 < g f 1j  1

j 1





+ C1()



; 1 1dx

C1()(2 + ) + C2()] Z

Z

IRn

j j ; jdx

(2.33)

j pj ; j pj + dx + "Z 1dx

1 1 IRn  C1()(2 + ) + C2()] k j ; kL1(IRn) + C1()k j ; kL2(IRn;11(dx)) k j + kL2(IRn;11(dx)) + ": IRn

(2.32)

(2.34)

As j ! 1 this last term converges to ", since L2 (IRn ;11(dx))-convergence implies L1 (IRn)-convergence. This nishes the proof. For future reference we state the following elementary result for generators of admissible relative entropies: 16

Lemma 2.6. The generator of every admissible relative entropy e satises: a)









() +  1 ; 1  0()  2 () +  1 ; 1   > 0 2 2     with the notation 2 = 00 (1) > 0, b)

() ()

 







(2.35)



2  (0)  + 2  ; 1 ( ; 1)   0 > 0 (2.36) 0 0     (2.37) (0)  + 2  ; 1 ( ; 1) 0   > 0: 0

0

Proof: a) For the right inequality of (2.35) we have to show that G1 ( ) :=

2 + 2( ; 1) ;  0  0. But this follows from G1 (1) = G01 (1) = 0, G001 () = ; 000  0  > 0 (see (2.20)). For the left inequality of (2.35) we have to show that G2 () := + 2 ( ; 1) ;  0  0. Like before we have G2 (1) = G02 (1) = 0 and it remains to show G002 () = ; 00 ;  000  0  > 0: (2.38) Since generates an admissible entropy we have 000 ()  0 IV ()  0 (see (2.19), (2.12c)). Thus there exists a unique 1 2 0 1] such that 000() < 0 on (0 1) and 000() = 0 on 1  1). In the case  2 1  1) we have G002 () = ; 00() < 0. In the case  2 (0 1) we rewrite (2.12c) as  00 0 1  ; 000 and integrate over the interval (" ) 0 < " < 1 : 00 ( ) 00 (") 00 ( )  ; "  ; 000 () + 000(")  ; 000 () : Letting tend " ! 0 we obtain G002 = ; 00 ;  000  0, and this nishes the proof of (2.35). b) Applying the Gronwall lemma to the right inequality of (2.35) (with xed 0 ) gives  2      = 0+1 ()  (0)  + 2  ; 1  ; 2    0 > 0 0 0 17

and (2.36) follows from

0  1. For the proof of (2.37) we use the left inequality of (2.35) and estimate: 







 () + 2 1 ; 1  () + 2 1 ; 12   > 0: Applying the Gronwall lemma to (2.39) gives ()  (0 )  + 2( ln  + 1 ;  ) 0 <   0  0 0 0 and (2.37) follows with the estimate ln x  x ; 1. 0 ()

(2.39)

2.3 Exponential decay of the entropy production and the relative entropy With our notion of superentropies we can now show the convergence of (t) to 1 in relative entropy. Lemmap2.7. Let e be an admissible relative entropy and assume n 2 zI = I = 1 2 L (IR ). Then e ( (t)j 1) ! 0 as t ! 1. Proof: With the notation 2 = 00 (1) we estimate using (2.18b), (2.22):

0  e ( (t)j 1)  e'( (t)j 1) =

(t) p 1k2L2(IRn) ' ( dx ) =  1 2 kz (t) ; 1 IRn (2.40)

Z

and the assertion follows from (2.8). If the Hamiltonian H in (2.6) has a spectral gap 0 > 0, then the exponential decay from (2.9) of course carries over to the relative entropy. In the above lemma, p the assumption I = 1 2 L2 (IRn) is unnaturally restrictive. The subsequent analysis aims at considering initial data which only have nite relative entropy and at proving explicit decay bounds for the relative entropy in terms of the initial relative entropy. The `price' for this extension will be a condition on the evolution problem (2.1a) (see (A3) below) that is stronger than assuming H to have a spectral gap. We now proceed similarly to Tos96b] and to example 2 of the introduction. Consider the entropy production (2.41) I ( (t)j 1) := dtd e ( (t)j 1) 18

and the entropy production rate

R ( (t)j 1) := dtd I ( (t)j 1):

(2.42)

To facilitate the computations we rewrite (2.1a) in the following form: t = div(Du 1) (2.43) with the notation u = r( 1 ). Dierentiating the relative entropy e ( (t)j 1) gives Z

I ( (t)j 1) = n 0 t dx: (2.44) 1 IR By using (2.43) we obtain after an integration by parts Z

I ( (t)j 1) = ; n 00 u>Du 1 dx  0 (2.45) 1 IR due to the positivity of D. Using (2.43) we compute (2.42): Z

R ( (t)j 1) = ; n 000 div(Du 1)u>Du dx 1 Z IR

(2.46) ; 2 IRn 00 1 u>Dut 1dx: Clearly, the computations which lead to (2.45) and (2.46) are formal. However, they can easily be justied for initial data I 2 L2 (IRn% ;11(dx)) and for entropy generators without singularity at  = 0 by taking into account the semigroup property of the Hamiltonian H and the partial integration formula (2.7). General admissible entropies can easily be dealt with by a local cut-o at  = 0.

We now return to proving the exponential decay of e ( (t)j 1) under additional assumptions on A and D. At rst we shall derive an exponential decay rate for the entropy production I by using the special form of the entropy production rate (2.46). At rst we consider the case of a scalar diusion, i.e. D(x) = ID(x). Lemma 2.8. Let the initial condition I 2 L2(IRn% ;11(dx)) satisfy jI ( I j 1)j < 1 for the admissible entropy e . Assume that the scalar coecients A(x) and D(x) of (2.1a) satisfy the condition

(A1) 9 1 > 0 such that

1 n 1 rD rD + 1 (&D ; rD  rA)I ; 2 4 D 2 2 2 +D @@xA2 + rA rD +2 rD rA ; @@xD2

19



1I

(in the sense of positive denite matrices) 8x 2 IRn . Then the entropy production converges to 0 exponentially: jI ( (t)j 1)j  e;21tjI ( I j 1)j t > 0: (2.47) Proof: After an integration by parts (which can be justied as mentioned above)

the rst term of the entropy production rate (2.46) reads

R1 = We set

Z

IV A 2 4 ;A ( e ) D j u j e dx IRnZ Z 000 A 2 ;A > @u +2 n (e )D e u @x udx + n 000 (eA )e;ADjuj2u  rDdx: IR IR

ut = r(eAdiv(De;Au)) in the second term of (2.46), which becomes R2 = ;2

Z

00 (eA )De;A u  r(Ddivu) + u>r (rD ; rAD)u + u> @u (rD ; rAD)]dx: n @x

IR

We differentiate $u\cdot\nabla(D\operatorname{div}u) = (u\cdot\nabla D)\operatorname{div}u + D\,u\cdot\nabla(\operatorname{div}u)$, and we express $D\,u\cdot\nabla(\operatorname{div}u)$ according to

$$\tfrac12\Delta(D|u|^2) = \tfrac12|u|^2\Delta D + 2\nabla D^\top\frac{\partial u}{\partial x}\,u + D\sum_{i,j}\Bigl(\frac{\partial u_i}{\partial x_j}\Bigr)^2 + D\,u\cdot\nabla(\operatorname{div}u).$$

We rewrite $R_2$ as $R_2 = S_1 + S_2$ with

$$S_1 = -\int_{\mathbb{R}^n}\psi''(\rho e^A)\operatorname{div}\bigl[D\,\nabla(D|u|^2)\,e^{-A}\bigr]\,dx = \int_{\mathbb{R}^n}\psi'''(\rho e^A)\bigl(D^2\,u\cdot\nabla(|u|^2) + D\,\nabla D\cdot u\,|u|^2\bigr)e^{-A}\,dx,$$

$$S_2 = 2\int_{\mathbb{R}^n}\psi''(\rho e^A)\,D\,e^{-A}\Bigl[u^\top\tfrac{\partial}{\partial x}(D\nabla A - \nabla D)\,u + \tfrac12\Delta D\,|u|^2 - \tfrac12|u|^2\,\nabla D\cdot\nabla A + \frac1D\Bigl(\tfrac12-\tfrac n4\Bigr)(u\cdot\nabla D)^2\Bigr]dx$$
$$\quad + 2\int_{\mathbb{R}^n}\psi''(\rho e^A)\,e^{-A}\Bigl[\tfrac{n-2}{4}(u\cdot\nabla D)^2 - D(u\cdot\nabla D)\operatorname{div}u + 2D\,\nabla D^\top\tfrac{\partial u}{\partial x}u + D^2\sum_{i,j}\Bigl(\tfrac{\partial u_i}{\partial x_j}\Bigr)^2 + \tfrac12|u|^2|\nabla D|^2\Bigr]dx$$
$$= T_1 + T_2.$$

The second integral in $S_2$ can be written as

$$T_2 = 2\int_{\mathbb{R}^n}\psi''(\rho e^A)\,e^{-A}\sum_{i,j}\Bigl[D\frac{\partial u_i}{\partial x_j} + \frac12\frac{\partial D}{\partial x_i}u_j + \frac12 u_i\frac{\partial D}{\partial x_j} - \frac12\delta_{ij}\,\nabla D\cdot u\Bigr]^2 dx,$$

and (A1) allows to estimate the first term in $S_2$ by

$$T_1 \ge 2\lambda_1\int_{\mathbb{R}^n}\psi''(\rho e^A)\,D\,e^{-A}\,|u|^2\,dx.$$

All in all we have

$$R_1 + R_2 = R_1 + S_1 + S_2 = (R_1 + S_1 + T_2) + T_1$$
$$\ge \int_{\mathbb{R}^n}\Bigl[\psi^{IV}(\rho e^A)\,D^2|u|^4 + \psi'''(\rho e^A)\bigl(4D^2 u^\top\tfrac{\partial u}{\partial x}u + 2D|u|^2\,u\cdot\nabla D\bigr)$$
$$\qquad + 2\psi''(\rho e^A)\sum_{i,j}\Bigl(D\tfrac{\partial u_i}{\partial x_j} + \tfrac12\tfrac{\partial D}{\partial x_i}u_j + \tfrac12 u_i\tfrac{\partial D}{\partial x_j} - \tfrac12\delta_{ij}\,\nabla D\cdot u\Bigr)^2\Bigr]e^{-A}\,dx$$
$$\qquad + 2\lambda_1\int_{\mathbb{R}^n}\psi''(\rho e^A)\,D\,e^{-A}\,|u|^2\,dx.$$

The first integral can be written as

$$\int_{\mathbb{R}^n}\operatorname{tr}(XY)\,e^{-A}\,dx,$$

where $X$ and $Y$ are the $2\times2$ matrices

$$X = \begin{pmatrix} 2\,\psi''(\rho e^A) & 2\,\psi'''(\rho e^A) \\ 2\,\psi'''(\rho e^A) & \psi^{IV}(\rho e^A) \end{pmatrix}$$

and, resp.,

$$Y = \begin{pmatrix} \alpha & D^2\,u^\top\frac{\partial u}{\partial x}u + \frac12 D|u|^2\,u\cdot\nabla D \\ D^2\,u^\top\frac{\partial u}{\partial x}u + \frac12 D|u|^2\,u\cdot\nabla D & D^2|u|^4 \end{pmatrix}$$

with

$$\alpha = \sum_{i,j}\Bigl[D\frac{\partial u_i}{\partial x_j} + \frac12\frac{\partial D}{\partial x_i}u_j + \frac12 u_i\frac{\partial D}{\partial x_j} - \frac12\delta_{ij}\,\nabla D\cdot u\Bigr]^2.$$

$X$ is non-negative definite since $\psi$ generates an admissible entropy (cf. Definition 2.1). A simple calculation (a Cauchy-Schwarz argument for the sum of squares $\alpha$) shows that $Y$ is also non-negative definite. Thus,

$$\int_{\mathbb{R}^n}\operatorname{tr}(XY)\,e^{-A}\,dx \ge 0,$$

and we have for the entropy production rate (2.46):

$$R_1 + R_2 \ge 2\lambda_1\int_{\mathbb{R}^n}\psi''(\rho e^A)\,D\,e^{-A}\,|u|^2\,dx = -2\lambda_1\,I_\psi(\rho(t)|\rho_\infty).$$

The assertion now follows from

$$\frac{d}{dt}\,|I_\psi(\rho(t)|\rho_\infty)| \le -2\lambda_1\,|I_\psi(\rho(t)|\rho_\infty)|. \qquad (2.48)$$

In a special case of equation (2.1a) the condition (A1) has a simple geometric interpretation:

Remark 2.9. For $D(x)\equiv\mathrm{I}$ condition (A1) simply requires the uniform convexity of $A(x)$ on $\mathbb{R}^n$, i.e.

(A2) $\exists\,\lambda_1>0$ such that

$$\Bigl(\frac{\partial^2 A(x)}{\partial x_i\,\partial x_j}\Bigr)_{i,j=1,\dots,n} \ge \lambda_1\,\mathrm{I} \qquad \forall x\in\mathbb{R}^n.$$
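For $D\equiv\mathrm{I}$ and the quadratic potential $A(x)=|x|^2/(2a)$, (A2) holds with $\lambda_1=1/a$ and (2.1a) is the Ornstein-Uhlenbeck equation, which maps Gaussians to Gaussians with explicitly computable moments; since the relative (logarithmic) entropy of two Gaussians is also explicit, the decay rate $2\lambda_1$ can be checked in closed form. A minimal one-dimensional sketch; the values of `a`, `m0`, `s0sq` are arbitrary sample choices, not from the text:

```python
import math

a = 2.0              # steady state N(0, a); lambda_1 = 1/a
m0, s0sq = 1.5, 0.5  # initial Gaussian N(m0, s0sq)

def kl_to_steady(m, ssq):
    # relative entropy e_psi( N(m, ssq) | N(0, a) ) for psi = s ln s - s + 1
    return 0.5 * (ssq / a + m * m / a - 1.0 - math.log(ssq / a))

def evolve(t):
    # exact Ornstein-Uhlenbeck evolution of mean and variance
    m = m0 * math.exp(-t / a)
    ssq = a + (s0sq - a) * math.exp(-2.0 * t / a)
    return m, ssq

e0 = kl_to_steady(m0, s0sq)
# decay estimate (2.47)/(2.59): e(t) <= exp(-2 t / a) * e(0)
ok = all(
    kl_to_steady(*evolve(t)) <= math.exp(-2.0 * t / a) * e0 + 1e-12
    for t in [0.1 * k for k in range(1, 60)]
)
print(ok)
```

The mean contributes exactly at rate $2\lambda_1$; the variance part decays strictly faster, so the bound holds with room to spare.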

The condition (A1) is a special case of the well-known Bakry-Emery condition for logarithmic Sobolev inequalities [BaEm84], [Bak91], [Bak94]. In fact, the proof of Lemma 2.8 is a concretization of the approach of Bakry and Emery and was given here mainly for the sake of clarity. For general (symmetric and uniformly positive definite) diffusion matrices $D(x)$ an `excursion' into basic differential geometry is, however, in order to understand the Bakry-Emery condition. Therefore we consider the Riemannian manifold corresponding to the diffusion matrix $D(x)=(d_{ij}(x))$ as metric tensor. The Christoffel symbols are defined as the elements of the 3-tensor

$$\Gamma^l_{ij} = \frac12\sum_{k=1}^n d^{kl}\Bigl(\frac{\partial d_{ki}}{\partial x_j} + \frac{\partial d_{kj}}{\partial x_i} - \frac{\partial d_{ij}}{\partial x_k}\Bigr), \qquad (2.49)$$

where $d^{kl}(x)$ are the entries of the inverse $D(x)^{-1}=(d^{kl}(x))$. The Riemann curvature tensor then reads

$$R^l_{kij} = \frac{\partial}{\partial x_i}\Gamma^l_{jk} - \frac{\partial}{\partial x_j}\Gamma^l_{ik} + \sum_{m=1}^n\Gamma^l_{im}\Gamma^m_{jk} - \sum_{m=1}^n\Gamma^l_{jm}\Gamma^m_{ik} \qquad (2.50)$$

and the Ricci tensor

$$\gamma_{ij} = \sum_{k=1}^n R^k_{ikj}. \qquad (2.51)$$

The covariant derivative of a vector field $X=(X^1,\dots,X^n)$ is given by

$$\nabla_i X^j = \frac{\partial X^j}{\partial x_i} + \sum_{k=1}^n\Gamma^j_{ik}X^k. \qquad (2.52)$$

We define the symmetric 2-tensor

$$\bigl(\nabla^S X\bigr)_{ij} = \frac12\sum_{l=1}^n\bigl(d_{jl}\,\nabla_i X^l + d_{il}\,\nabla_j X^l\bigr). \qquad (2.53)$$

The Ricci tensor of the Fokker-Planck operator is defined as

$$\mathrm{Ric}_{ij}(x) = \sum_{k,l=1}^n d^{ik}d^{jl}\Bigl[\gamma_{kl} - \bigl(\nabla^S X\bigr)_{kl}(x)\Bigr], \qquad (2.54a)$$

where we set

$$X^i(x) = -\sum_{j=1}^n d^{ij}\,\frac{\partial}{\partial x_j}\Bigl(A(x) - \frac12\ln\det D(x)\Bigr). \qquad (2.54b)$$

Then the Bakry-Emery condition for a general symmetric positive definite diffusion matrix reads

(A3) $\exists\,\lambda_1>0$ such that $D(x)\,\mathrm{Ric}(x)\ge\lambda_1\,\mathrm{I}$ $\forall x\in\mathbb{R}^n$.

For the case $D(x)\equiv\mathrm{I}$ we shall first compare (A3), or rather (A1), to the corresponding condition given in Prop. 6.2 of [Bak94]. If $D\equiv\mathrm{I}$, their condition

$$\frac{1}{m-n}\,X(x)\otimes X(x) \le \mathrm{Ric}(x) - \varrho\,D(x)^{-1}$$

simplifies to

$$\frac{1}{m-n}\,\nabla A\otimes\nabla A \le \frac{\partial^2A}{\partial x^2} - \varrho\,\mathrm{I} \qquad \forall x\in\mathbb{R}^n, \qquad (2.55)$$

with some $\varrho\in\mathbb{R}$ and $m\in(n,\infty]$ denoting, respectively, the curvature and dimension of the Fokker-Planck operator $L$. If we assume $\varrho$ and $m$ to be constant, (2.55) only admits a global solution $A(x)$, $x\in\mathbb{R}^n$, if $m=\infty$. Hence, (2.55) reduces to (A1) with $\varrho=\lambda_1$. We have

Lemma 2.8'. If $A=A(x)$, $D=D(x)$ satisfy (A3), then the estimate (2.47) holds.

Proof: See [BaEm84], [Bak94].

Remark 2.10. To better understand the Bakry-Emery condition (A3) we consider a transformation of the Fokker-Planck type equation under an $x\leftrightarrow y$ diffeomorphism $y=y(x)\Leftrightarrow x=x(y)$ on $\mathbb{R}^n$. We denote $J(y)=\det\frac{\partial x}{\partial y}(y)$, assume $J>0$ on $\mathbb{R}^n$, and set

$$\rho'(y,t) = J(y)\,\rho(x(y),t).$$

A lengthy calculation (cf. also [Ris89]) gives

$$\rho'_t = \operatorname{div}_y\Bigl(\widetilde D\,\bigl(\nabla_y\rho' + \nabla_y(\widetilde A-\ln J)\,\rho'\bigr)\Bigr), \qquad (2.56a)$$

where

$$\widetilde D(y) = \Bigl(\frac{\partial y}{\partial x}(x(y))\Bigr)\,D(x(y))\,\Bigl(\frac{\partial y}{\partial x}(x(y))\Bigr)^{\!\top}, \qquad (2.56b)$$
$$\widetilde A(y) = A(x(y)). \qquad (2.56c)$$

Assume now that the Riemann curvature tensor (2.50) vanishes. Then the geometry produced by $D$ is Euclidean and there exists a transformation $y=y(x)$ such that $\widetilde D(y)\equiv\mathrm{I}$ [Ris89]. The condition (A3) applied to (2.1a) reduces to (A2) applied to (2.56a), i.e. we need to assume the uniform convexity of $\widetilde A-\ln J$ in the $y$-coordinates.

In one dimension ($n=1$) the Riemann curvature tensor always vanishes and the transformation which yields a Fokker-Planck equation with diffusion coefficient 1 can be constructed explicitly. It is defined by

$$\frac{dy}{dx}(x) = \sqrt{D(x)}^{\,-1} > 0, \qquad x\in\mathbb{R}. \qquad (2.57)$$

For arbitrary $n>1$ an analogous transformation works if $D$ is a constant matrix. We set

$$\frac{\partial y}{\partial x}(x) = \sqrt{D}^{\,-1}, \qquad x\in\mathbb{R}^n. \qquad (2.58)$$

Then the transformed equation has the identity matrix as diffusion matrix. In general (i.e. for a non-vanishing Riemann curvature tensor) the metric tensor cannot be transformed to the identity by a coordinate transformation. It can be transformed to the form $\widetilde D(y)=\mathrm{I}\,\widetilde D(y)$, where $\widetilde D$ is a scalar function, if isothermal coordinates exist on the manifold corresponding to $D$. Locally at least, this is always the case for $n=2$ (cf. [BeGo87], p. 421).

From the exponential decay of the entropy production $I_\psi$ (Lemma 2.8) we shall now derive the exponential decay of the relative entropy $e_\psi$:

Theorem 2.11. Let $e_\psi$ be an admissible relative entropy and assume that $e_\psi(\rho_I|\rho_\infty)<\infty$. Let the coefficients $A(x)$ and $D(x)$ satisfy condition (A3). Then the relative entropy converges to $0$ exponentially:

$$e_\psi(\rho(t)|\rho_\infty) \le e^{-2\lambda_1 t}\,e_\psi(\rho_I|\rho_\infty), \qquad t>0. \qquad (2.59)$$

Proof: We proceed in two steps and first derive (2.59) for $\rho_I\in S := \{\rho\in L^2_+(\mathbb{R}^n,\rho_\infty^{-1}(dx)) \mid |I_\psi(\rho|\rho_\infty)|<\infty\}$. From the Lemmata 2.7, 2.8' we then know that $e_\psi(\rho(t)|\rho_\infty)\to0$ and $I_\psi(\rho(t)|\rho_\infty)\to0$ as $t\to\infty$. Hence, integrating (2.48) (which also holds under condition (A3); see [BaEm84], [Bak94]) over $(t,\infty)$ gives

$$I_\psi(t) = \frac{d}{dt}\,e_\psi(t) \le -2\lambda_1\,e_\psi(t), \qquad (2.60)$$

which proves the assertion for sufficiently regular initial data.

For the general case we use a density argument to approximate $\rho_I$ in two steps: $\rho_\infty$ is given in $L^1_+(\mathbb{R}^n)$, and $\rho_I$ is a normalized (i.e. $\int\rho_I\,dx=\int\rho_\infty\,dx=1$) $L^1_+(\mathbb{R}^n)$-function with finite relative entropy, $\rho_I\in\{\rho\in L^1_+(\mathbb{R}^n)\mid e_\psi(\rho|\rho_\infty)<\infty\}$.

We first approximate $\rho_I$ by $\rho_N\in L^1_+(\mathbb{R}^n)$ with $\int\rho_N\,dx=1$ ($N\in\mathbb{N}$):

$$\rho_N(x) := \tau_N\,\rho_I(x)\,1\!\!1_{\{\rho_I/\rho_\infty\le N\}},$$

with the monotonously decreasing normalization constants $\tau_N=\bigl[\int_{\{\rho_I/\rho_\infty\le N\}}\rho_I\,dx\bigr]^{-1}\searrow1$ for $N\to\infty$. By construction we have $\rho_N/\rho_\infty\le\tau_N N$, which implies $\rho_N\in L^2(\mathbb{R}^n,\rho_\infty^{-1}(dx))$. $\rho_N$ converges to $\rho_I$ a.e., and also in $L^1(\mathbb{R}^n)$:

$$\|\rho_I-\rho_N\|_{L^1(\mathbb{R}^n)} = |1-\tau_N|\int_{\{\rho_I/\rho_\infty\le N\}}\rho_I\,dx + \int_{\{\rho_I/\rho_\infty>N\}}\rho_I\,dx \xrightarrow{N\to\infty} 0.$$

Now we construct a uniform bound for $\psi(\rho_N/\rho_\infty)\,\rho_\infty$, $N\in\mathbb{N}$: we consider the convex, monotonously increasing function $\psi_R(\sigma)$, $\sigma>0$, defined by $\psi_R(\sigma)=\psi(R)$ for $0<\sigma\le R$ and $\psi_R(\sigma)=\psi(\sigma)$ for $R<\sigma$, with $R>1$ sufficiently large such that $\psi_R\ge\psi$ on $\mathbb{R}^+$. This implies

$$\psi_R(\sigma) \le \psi(\sigma)+\psi(R), \qquad \sigma>0. \qquad (2.61)$$

From Lemma 2.6b we easily conclude with $\sigma'=2\sigma$:

$$\psi_R(2\sigma) \le 4\,\psi_R(\sigma) + 2\,\psi(2)\,\sigma, \qquad \sigma\ge0. \qquad (2.62)$$

Since $\tau_N\searrow1$ we have $\rho_N\le2\rho_I$ (for large $N$) and we estimate using (2.62), (2.61):

$$\psi\Bigl(\frac{\rho_N}{\rho_\infty}\Bigr)\rho_\infty \le \psi_R\Bigl(\frac{\rho_N}{\rho_\infty}\Bigr)\rho_\infty \le \psi_R\Bigl(\frac{2\rho_I}{\rho_\infty}\Bigr)\rho_\infty \le 4\,\psi_R\Bigl(\frac{\rho_I}{\rho_\infty}\Bigr)\rho_\infty + 2\,\psi(2)\,\rho_I \le 4\,\psi\Bigl(\frac{\rho_I}{\rho_\infty}\Bigr)\rho_\infty + 4\,\psi(R)\,\rho_\infty + 2\,\psi(2)\,\rho_I =: \zeta(x). \qquad (2.63)$$

Our assumptions on $\rho_I$ show $\int_{\mathbb{R}^n}\zeta(x)\,dx<\infty$, and Lebesgue's dominated convergence theorem gives

$$\lim_{N\to\infty} e_\psi(\rho_N|\rho_\infty) = e_\psi(\rho_I|\rho_\infty). \qquad (2.64)$$
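This first approximation step is easy to visualize numerically: for a heavy-tailed $\rho_I$ and a Gaussian $\rho_\infty$, the normalization constants $\tau_N$ decrease to $1$ and $\|\rho_I-\rho_N\|_{L^1}=2(1-\tau_N^{-1})\to0$. A quadrature sketch with sample densities (the Cauchy-type $\rho_I$ and the grid are arbitrary choices):

```python
import math

h = 0.01
xs = [-50.0 + h * k for k in range(int(100.0 / h) + 1)]
# sample data (assumptions): heavy-tailed initial density, Gaussian steady state
rho_I   = [2.0 / math.pi / (1.0 + x * x) ** 2 for x in xs]
rho_inf = [math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi) for x in xs]

def approx(N):
    # mass kept on the set { rho_I / rho_inf <= N }
    kept = sum(rI * h for rI, rinf in zip(rho_I, rho_inf) if rI <= N * rinf)
    tau = 1.0 / kept           # normalization constant tau_N
    l1_err = 2.0 * (1.0 - kept)  # || rho_I - rho_N ||_{L^1}
    return tau, l1_err

taus, errs = zip(*(approx(N) for N in (2, 10, 100, 1000)))
print(taus, errs)
```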

In the second approximation step we shall approximate $\rho_N$ in $L^2_+(\mathbb{R}^n,\rho_\infty^{-1}(dx))$ by the normalized (i.e. $\int\rho_{NM}\,dx=1$) sequence $\{\rho_{NM}\}_{M\in\mathbb{N}}$, having finite entropy production. We choose $\rho_{NM}\in C^\infty_{0,+}(\mathbb{R}^n)$, which denotes non-negative $C^\infty$-functions with compact support in $\mathbb{R}^n$. We shall now show that

$$C^\infty_{0,+}(\mathbb{R}^n) \subset S = \{\rho\in L^2_+(\mathbb{R}^n,\rho_\infty^{-1}(dx)) \mid |I_\psi(\rho|\rho_\infty)|<\infty\}. \qquad (2.65)$$

We consider $\rho\in C^\infty_{0,+}(\mathbb{R}^n)$ with compact support $\Omega=\operatorname{supp}\rho\subset\mathbb{R}^n$. Since $A\in L^\infty_{loc}(\mathbb{R}^n)$, so is $1/\sqrt{\rho_\infty}=e^{A/2}$, and hence $\rho/\sqrt{\rho_\infty}\in L^2(\mathbb{R}^n)$ follows. Using $\psi''(\sigma)\le2\mu\bigl(1+\frac1\sigma\bigr)$, $\sigma>0$, which follows from (2.21), we estimate the entropy production (2.41):

$$|I_\psi(\rho|\rho_\infty)| = \int_\Omega\psi''\Bigl(\frac{\rho}{\rho_\infty}\Bigr)u^\top D(x)\,u\,\rho_\infty\,dx \le 2\mu\,\|D\|_{L^\infty(\Omega)}\int_\Omega\Bigl(1+\frac{\rho_\infty}{\rho}\Bigr)\Bigl|\nabla\Bigl(\frac{\rho}{\rho_\infty}\Bigr)\Bigr|^2\rho_\infty\,dx$$
$$\le C\,\|D\|_{L^\infty(\Omega)}\int_\Omega\Bigl(1+\frac{\rho}{\rho_\infty}\Bigr)\bigl(4|\nabla\sqrt\rho|^2+|\nabla A|^2\rho\bigr)\,dx, \qquad (2.66)$$

which is finite since $D(x)$, $\rho_\infty^{-1}(x)$ and $\nabla A(x)$ are in $L^\infty_{loc}(\mathbb{R}^n)$. Hence $\rho\in S$.

Since $C^\infty_0(\mathbb{R}^n)$ is dense in $L^2(\mathbb{R}^n,\rho_\infty^{-1}(dx))$, $\rho_N$ can indeed be approximated by $\{\rho_{NM}\}\subset S$ in $L^2(\mathbb{R}^n,\rho_\infty^{-1}(dx))$. Lemma 2.5 then shows

$$\lim_{M\to\infty} e_\psi(\rho_{NM}|\rho_\infty) = e_\psi(\rho_N|\rho_\infty). \qquad (2.67)$$

From these two approximations we extract a normalized `diagonal' sequence $\{\rho_{NM(N)}\}\subset S$ with

$$\rho_{NM(N)} \xrightarrow{N\to\infty} \rho_I \quad\text{in } L^1(\mathbb{R}^n), \qquad (2.68a)$$
$$\lim_{N\to\infty} e_\psi(\rho_{NM(N)}|\rho_\infty) = e_\psi(\rho_I|\rho_\infty). \qquad (2.68b)$$

For the approximations $\rho_{NM(N)}$ we can apply (2.59):

$$e_\psi(\rho_{NM(N)}(t)|\rho_\infty) \le e^{-2\lambda_1 t}\,e_\psi(\rho_{NM(N)}|\rho_\infty), \qquad t>0, \qquad (2.69)$$

where $\rho_{NM(N)}(t)$ denotes the solution of (2.1a) with initial data $\rho_{NM(N)}$. From the entropy bound (2.69) and from the Dunford-Pettis theorem [Ger87] we easily conclude

$$\frac{\rho_{NM(N)}(t)}{\rho_\infty} \xrightarrow{N\to\infty} \frac{\rho(t)}{\rho_\infty} \quad\text{in } L^1(\mathbb{R}^n,\rho_\infty(dx)) \text{ weakly}.$$

We now use the lower semi-continuity of $e_\psi(\,\cdot\,|\rho_\infty)$ to finish the proof:

$$\int_{\mathbb{R}^n}\psi\Bigl(\frac{\rho(t)}{\rho_\infty}\Bigr)\rho_\infty(dx) \le \liminf_{N\to\infty}\int_{\mathbb{R}^n}\psi\Bigl(\frac{\rho_{NM(N)}(t)}{\rho_\infty}\Bigr)\rho_\infty(dx) \le e^{-2\lambda_1 t}\,e_\psi(\rho_I|\rho_\infty), \qquad t>0.$$

Due to the above density argument we do not have to make the "algebra hypothesis" of [BaEm84] and §2 of [Bak94]: there one assumes that there exists a core for $L$ that is stable under the evolution $e^{Lt}$ and under composition with $C^\infty$ functions. Using different techniques such a density argument was also given in §6 of [DeSt84].

2.4 Convex Sobolev inequalities in entropy version

(2.60) is the entropy version of a convex Sobolev inequality

$$\int_{\mathbb{R}^n}\psi\Bigl(\frac{\rho}{\rho_\infty}\Bigr)\rho_\infty(dx) \le \frac{1}{2\lambda_1}\int_{\mathbb{R}^n}\psi''\Bigl(\frac{\rho}{\rho_\infty}\Bigr)\nabla\Bigl(\frac{\rho}{\rho_\infty}\Bigr)^{\!\top} D\,\nabla\Bigl(\frac{\rho}{\rho_\infty}\Bigr)\rho_\infty(dx) \qquad (2.70a)$$

$$\forall\,\rho\in L^1_+(\mathbb{R}^n) \text{ with } \int_{\mathbb{R}^n}\rho\,dx = \int_{\mathbb{R}^n}\rho_\infty\,dx. \qquad (2.70b)$$

This inequality, of course, does not require our usual normalization $\int\rho(x)\,dx=1$. The relation of (2.70) to other versions of this inequality will be discussed in §3. Note that $L^1_+(\mathbb{R}^n)$ in (2.70b) can be replaced by $L^1(\mathbb{R}^n)$ if $\psi$ is quadratic.

The desired $L^1$-convergence of $\rho(t)$ to $\rho_\infty$ is now a direct consequence of Theorem 2.11 and the Csiszar-Kullback inequality (2.24), (2.26):

Corollary 2.12. Let $e_\psi$ be an admissible relative entropy and assume that $e_\psi(\rho_I|\rho_\infty)<\infty$. Let the coefficients $A(x)$ and $D(x)$ satisfy condition (A3). Then the solution of (2.1) satisfies

$$\|\rho(t)-\rho_\infty\|_{L^1(\mathbb{R}^n)} \le e^{-\lambda_1 t}\,\sqrt{\frac{2}{\kappa_2}\,e_\psi(\rho_I|\rho_\infty)}, \qquad t>0, \qquad (2.71)$$

with the notation $\kappa_2=\psi''(1)$.

To complete the line of argumentation we shall now show that the Hamiltonian $H$ has a spectral gap if $A$ and $D$ satisfy (A3). This condition implies the logarithmic Sobolev inequalities (2.70) for any admissible relative entropy $e_\psi$. We start by rewriting the Sobolev inequality (2.70) for $\psi(\sigma)=\sigma\ln\sigma-\sigma+1$ by setting $\rho/\rho_\infty = |g|^2/\|g\|^2_{L^2(\mathbb{R}^n,\rho_\infty(dx))}$ (cf. also §3). We obtain after a simple calculation

$$\int_{\mathbb{R}^n}|g|^2\ln\frac{|g|^2}{\|g\|^2_{L^2(\mathbb{R}^n,\rho_\infty(dx))}}\,\rho_\infty(dx) \le \frac{2}{\lambda_1}\int_{\mathbb{R}^n}\nabla g^\top D\,\nabla g\,\rho_\infty(dx) \qquad (2.72)$$

for all $g\in L^2(\mathbb{R}^n,\rho_\infty(dx))$. The well-known Rothaus-Simon mass gap theorem ([Rot81], [Sim76], [Gro93]) gives

$$\int_{\mathbb{R}^n}\nabla g^\top D\,\nabla g\,\rho_\infty(dx) \ge \lambda_1\,\|g\|^2_{L^2(\mathbb{R}^n,\rho_\infty(dx))} \qquad (2.73a)$$

if

$$\int_{\mathbb{R}^n} g\,\rho_\infty(dx) = 0. \qquad (2.73b)$$

Simple calculations show that

$$H = \frac{1}{\sqrt{\rho_\infty}}\,L\bigl(\sqrt{\rho_\infty}\,\cdot\,\bigr) : D_Q\subset L^2(\mathbb{R}^n,dx)\to L^2(\mathbb{R}^n,dx),$$

where $H$ is given by (2.6), and

$$\int_{\mathbb{R}^n}\nabla g^\top D\,\nabla g\,\rho_\infty(dx) = \int_{\mathbb{R}^n}\sqrt{\rho_\infty}\,g\;H\bigl(\sqrt{\rho_\infty}\,g\bigr)\,dx.$$

Using the spectral theorem it is easy to conclude that the spectral gap $\lambda_0$ of $H$ satisfies $\lambda_0\ge\lambda_1$, i.e.

$$\sigma(H)\cap(0,\lambda_1) = \emptyset.$$

In many cases the logarithmic Sobolev constant $\lambda_1$ is indeed smaller than the spectral gap $\lambda_0$ (see, e.g., example (1.10), §4 of [DiSC96], and the discussions in [Rot81], [Bak94], [DeSt90]).

Remark 2.13. Assume that $D(x)$ is pointwise in $\mathbb{R}^n$ bounded below by a symmetric positive definite matrix $D_1(x)$, i.e. $D_1(x)\le D(x)$, $x\in\mathbb{R}^n$ (in the sense of positive definite matrices), and that the Fokker-Planck operator $L_1(\rho):=\operatorname{div}(D_1(\nabla\rho+\rho\nabla A))$ satisfies the Bakry-Emery condition (A3) (with $D$ replaced by $D_1$). Then the convex Sobolev inequality (2.70) holds with $D$ replaced by $D_1$. Since $\psi''\ge0$ we also have

$$\int_{\mathbb{R}^n}\psi''\Bigl(\frac{\rho}{\rho_\infty}\Bigr)\nabla\Bigl(\frac{\rho}{\rho_\infty}\Bigr)^{\!\top}D_1\,\nabla\Bigl(\frac{\rho}{\rho_\infty}\Bigr)\rho_\infty(dx) \le \int_{\mathbb{R}^n}\psi''\Bigl(\frac{\rho}{\rho_\infty}\Bigr)\nabla\Bigl(\frac{\rho}{\rho_\infty}\Bigr)^{\!\top}D\,\nabla\Bigl(\frac{\rho}{\rho_\infty}\Bigr)\rho_\infty(dx), \qquad (2.74)$$

and (2.70) follows. Consequently, the decay statements in Lemma 2.8', Theorem 2.11 and Corollary 2.12 and the above mass gap argument hold for $L$. Note that this settles the case $0<D_0(x)\,\mathrm{I}\le D(x)$, $x\in\mathbb{R}^n$, where $D_0(x)$ and $A(x)$ satisfy (A1). In particular, for a uniformly positive definite diffusion matrix $d\,\mathrm{I}\le D(x)$, $x\in\mathbb{R}^n$ (with $d\in\mathbb{R}^+$), and a uniformly convex potential $A$ ((A2) holds) we obtain the Sobolev inequality (2.70), where $1/(2\lambda_1)$ has to be replaced by $1/(2d\lambda_1)$.
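The mass gap inequality (2.73) can be probed by quadrature for the standard Gaussian measure ($D\equiv\mathrm{I}$, $a=1$, so $\lambda_1=1$): for mean-zero $g$ the Dirichlet form dominates $\lambda_1\|g\|^2$, with equality for $g(x)=x$, the first Hermite polynomial (an eigenfunction of $H$). A sketch with two sample test functions (arbitrary choices):

```python
import math

h, L = 0.001, 10.0
xs = [-L + h * k for k in range(int(2 * L / h) + 1)]
# quadrature weights for the standard Gaussian measure
w = [math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi) * h for x in xs]

def ratio(g, dg):
    # Dirichlet form divided by the variance of g
    mean = sum(g(x) * wi for x, wi in zip(xs, w))
    var = sum((g(x) - mean) ** 2 * wi for x, wi in zip(xs, w))
    dirichlet = sum(dg(x) ** 2 * wi for x, wi in zip(xs, w))
    return dirichlet / var

r1 = ratio(lambda x: x, lambda x: 1.0)                          # He_1: equality
r3 = ratio(lambda x: x ** 3 - 3 * x, lambda x: 3 * x ** 2 - 3)  # He_3: ratio 3
print(r1, r3)
```

Both ratios are bounded below by $\lambda_1=1$, and the spectral gap is attained on the linear eigenfunction.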

2.5 Non-symmetric Fokker-Planck equations

At the end of this Section we extend the above analysis to the following class of Fokker-Planck type equations with certain rotational contributions to the drift coefficient. We consider

$$\rho_t = \operatorname{div}\bigl(D(\nabla\rho+\rho(\nabla A+\widetilde F))\bigr), \qquad x\in\mathbb{R}^n,\ t>0, \qquad (2.75)$$
$$\rho(t=0)=\rho_I, \qquad (2.76)$$

with the above conditions on $\rho_I$, $D$ and $A$. Additionally we assume for $\widetilde F=\widetilde F(x,t)$ sufficient regularity and

$$\operatorname{div}\bigl(D\widetilde F\rho_\infty\bigr)=0 \quad\text{on } \mathbb{R}^n\times(0,\infty), \qquad (2.77)$$

such that $\rho_\infty=e^{-A}$ is still a stationary state of (2.75). Our objective is again to analyze the rate of convergence of $\rho(t)$ towards $\rho_\infty$. For the symmetric problem (2.1) we proceeded in §2.3 by deriving a differential inequality for the entropy production (see (2.48)). In contrast, we shall here only apply the convex Sobolev inequality (2.60) that was obtained for the corresponding symmetric problem (i.e. (2.75) with $\widetilde F=0$).

With the transformation $\mu=\rho/\rho_\infty$ (cp. (2.11)), (2.75) may be rewritten as

$$\mu_t = \rho_\infty^{-1}\operatorname{div}\bigl(D\rho_\infty\nabla\mu\bigr) + \widetilde F^\top D\,\nabla\mu,$$

where the second term of the r.h.s. is skew-symmetric in $L^2(\mathbb{R}^n,\rho_\infty\,dx)$. We first calculate

$$\frac{d}{dt}\,e_\psi(\rho(t)|\rho_\infty) = I_\psi(\rho(t)|\rho_\infty) + T,$$

with $I_\psi$ as in (2.45) and

$$T = \int_{\mathbb{R}^n}\psi'(\mu)\,\widetilde F^\top D\,\nabla\mu\,\rho_\infty\,dx = \int_{\mathbb{R}^n}(D\widetilde F)\cdot\nabla\bigl(\psi(\mu)\bigr)\,\rho_\infty\,dx.$$

An integration by parts finally gives, by (2.77),

$$T = -\int_{\mathbb{R}^n}\psi(\mu)\operatorname{div}\bigl(D\widetilde F\rho_\infty\bigr)\,dx = 0.$$

By the convex Sobolev inequality (2.70) we obtain

$$\frac{d}{dt}\,e_\psi(\rho(t)|\rho_\infty) \le -2\lambda_1\,e_\psi(\rho(t)|\rho_\infty),$$

and

$$e_\psi(\rho(t)|\rho_\infty) \le e^{-2\lambda_1 t}\,e_\psi(\rho_I|\rho_\infty)$$

follows again.

As an example of (2.75) we consider the kinetic Fokker-Planck type equation for the distribution function $f(x,v,t)$:

$$f_t + \{A,f\} = \operatorname{div}_{(x,v)}\bigl[e^{-A(x,v)}\,\nabla_{(x,v)}\bigl(e^{A(x,v)}f\bigr)\bigr], \qquad t>0, \qquad (2.78a)$$
$$f(t=0) = f_I \in L^1_+(\mathbb{R}^{2n}), \qquad (2.78b)$$

with the position variable $x\in\mathbb{R}^n$ and the velocity variable $v\in\mathbb{R}^n$. Here

$$\{A,f\} := \nabla_v A\cdot\nabla_x f - \nabla_x A\cdot\nabla_v f$$

denotes the Poisson bracket. For the sake of simplicity we set $D\equiv\mathrm{I}$. With the choice $\widetilde F=(\nabla_v A,\,-\nabla_x A)^\top$ the convergence of $f(t)$ to its steady state $f_\infty(x,v)=e^{-A(x,v)}$ follows from the above calculations:

Theorem 2.14. Let $f_I\in L^1_+(\mathbb{R}^{2n})$, $A\in W^{2,\infty}_{loc}(\mathbb{R}^{2n})$ and $\int_{\mathbb{R}^{2n}}f_I(x,v)\,dx\,dv = \int_{\mathbb{R}^{2n}}e^{-A(x,v)}\,dx\,dv = 1$. Let $e_\psi$ be an admissible relative entropy and assume that $e_\psi(f_I|f_\infty)<\infty$. Let $A(x,v)$ be uniformly convex, i.e.

$$\exists\,\lambda_1>0 \text{ such that } \frac{\partial^2A}{\partial(x,v)^2} \ge \lambda_1\,\mathrm{I} \qquad \forall\,x,v\in\mathbb{R}^n.$$

Then the relative entropy converges to $0$ exponentially:

$$e_\psi(f(t)|f_\infty) \le e^{-2\lambda_1 t}\,e_\psi(f_I|f_\infty), \qquad t>0. \qquad (2.79)$$

3 Sobolev Inequalities

3.1 Three versions of convex Sobolev inequalities

We shall now discuss in detail the convex Sobolev inequality (2.70). In particular we shall rewrite (2.70) in various equivalent forms and discuss its relation to other known inequalities.

At first we set $u=\rho/\rho_\infty$. Then (2.70) becomes

$$\int_{\mathbb{R}^n}\psi(u)\,\rho_\infty(dx) \le \frac{1}{2\lambda_1}\int_{\mathbb{R}^n}\psi''(u)\,\nabla u^\top D\,\nabla u\,\rho_\infty(dx) \qquad (3.1a)$$

for all $u\in L^1_+(\mathbb{R}^n)$ ($L^1(\mathbb{R}^n)$ if $\psi$ is quadratic) which satisfy

$$\int_{\mathbb{R}^n}u\,\rho_\infty(dx) = \int_{\mathbb{R}^n}\rho_\infty(dx). \qquad (3.1b)$$

Assume for the following $\int_{\mathbb{R}^n}\rho_\infty(dx)=1$. Hence, setting $u=v/\int_{\mathbb{R}^n}v\,\rho_\infty(dx)$, we obtain

$$\int_{\mathbb{R}^n}\psi\Bigl(\frac{v}{\int_{\mathbb{R}^n}v\,\rho_\infty(dx)}\Bigr)\rho_\infty(dx) \le \frac{1}{2\lambda_1}\int_{\mathbb{R}^n}\psi''\Bigl(\frac{v}{\int_{\mathbb{R}^n}v\,\rho_\infty(dx)}\Bigr)\frac{\nabla v^\top D\,\nabla v}{\bigl[\int_{\mathbb{R}^n}v\,\rho_\infty(dx)\bigr]^2}\,\rho_\infty(dx) \qquad (3.2)$$

for all $v\in L^1_+(d\rho_\infty):=L^1_+(\mathbb{R}^n,\rho_\infty(dx))$ ($v\in L^1(d\rho_\infty)$ if $\psi$ is quadratic).

The most common form of convex Sobolev inequalities is obtained by setting $v=f^2$ in (3.2). This gives the so-called steady state measure version:

$$\int_{\mathbb{R}^n}\psi\Bigl(\frac{f^2}{\|f\|^2_{L^2(d\rho_\infty)}}\Bigr)\rho_\infty(dx) \le \frac{2}{\lambda_1}\int_{\mathbb{R}^n}\frac{f^2}{\|f\|^4_{L^2(d\rho_\infty)}}\,\psi''\Bigl(\frac{f^2}{\|f\|^2_{L^2(d\rho_\infty)}}\Bigr)\nabla f^\top D\,\nabla f\,\rho_\infty(dx) \qquad (3.3)$$

for all $f\in L^2(d\rho_\infty)$.

The inequalities (3.1)-(3.3) hold for all functions $\rho_\infty=e^{-A}$ (with $L^1(dx)$-norm equal 1), $\rho_\infty>0$ on $\mathbb{R}^n$, and symmetric positive definite matrices $D=D(x)$, which are sufficiently smooth (cf. Section 2) and which satisfy (A1) (if $D(x)=D(x)\,\mathrm{I}$), or (A2) (if $D(x)\equiv\mathrm{I}$), or (A3), or for which there exists a symmetric positive definite matrix $D_1=D_1(x)\le D(x)$ such that $A$ and $D_1$ satisfy (A3).

If $D(x)\equiv\mathrm{I}$, $\rho_\infty(x)=M_a(x):=\frac{1}{(2\pi a)^{n/2}}\exp\bigl(-\frac{|x|^2}{2a}\bigr)$ for some $a>0$, then (A2) holds with $\lambda_1=1/a$ and we obtain the celebrated Gross logarithmic Sobolev inequality [Gro75]:

$$\int_{\mathbb{R}^n}f^2\ln\Bigl(\frac{f^2}{\|f\|^2_{L^2(dM_a)}}\Bigr)M_a(dx) \le 2a\int_{\mathbb{R}^n}|\nabla f|^2\,M_a(dx) \qquad (3.4)$$

for all $f\in L^2(dM_a)$, where we set $\psi(\sigma)=\sigma\ln\sigma-\sigma+1$ (see also (2.72)).

A second example is provided by the same choices of $\rho_\infty$ and $D$, setting $\psi(\sigma)=(\sigma-1)^2$. We then have

$$\int_{\mathbb{R}^n}v^2\,M_a(dx) - \Bigl(\int_{\mathbb{R}^n}v\,M_a(dx)\Bigr)^2 \le a\int_{\mathbb{R}^n}|\nabla v|^2\,M_a(dx) \qquad (3.5)$$

for all $v\in L^1(dM_a)$. This is an old inequality, and as remarked by Beckner [Bec89], in one dimension it was probably known to both mathematicians and physicists in the 1940's [Wey49]. It has been a useful tool in different subjects, like partial differential equations [Nas58] and statistics [Che81].

Finally let $\psi=\psi_p(\sigma)=\sigma^p-1-p(\sigma-1)$, see (2.16b). Then we obtain the inequality

$$\int_{\mathbb{R}^n}v^p\,M_a(dx) - \Bigl(\int_{\mathbb{R}^n}v\,M_a(dx)\Bigr)^p \le 2a\,\frac{p-1}{p}\int_{\mathbb{R}^n}\bigl|\nabla(v^{p/2})\bigr|^2\,M_a(dx) \qquad (3.6)$$

for all $v\in L^1_+(dM_a)$, $1<p<2$ ($v\in L^1(dM_a)$ for $p=2$). Setting $|u|=v^{p/2}$ gives the generalized Poincare-type inequality by Beckner [Bec89]:

$$\frac{p}{p-1}\Bigl[\int_{\mathbb{R}^n}u^2\,M_a(dx) - \Bigl(\int_{\mathbb{R}^n}|u|^{2/p}\,M_a(dx)\Bigr)^p\Bigr] \le 2a\int_{\mathbb{R}^n}|\nabla u|^2\,M_a(dx) \qquad (3.7)$$

for all $u\in L^{2/p}(dM_a)$, $1<p\le2$. We remark that the inequalities (3.7) interpolate in a very sharp way between the Poincare-type inequality (3.5) ($p=2$) and the logarithmic Sobolev inequality (3.4), which is obtained from (3.7) in the limit $p\to1$. (3.4) and (3.7) represent a hierarchy of convex Sobolev inequalities, with the logarithmic Sobolev inequality (3.4) being the `strongest'. This, however, cannot be seen directly from (3.4), (3.7) and requires a more involved line of argument (see [Bak94], p. 51, where the spectral gap inequality (3.5) is compared to the logarithmic Sobolev inequality (3.4)). In [Led92] this interpolation is discussed for the Ornstein-Uhlenbeck process on $\mathbb{R}^n$ and for the heat semigroup on spheres.

Note that the inequalities (3.4), (3.5), (3.6), (3.7) also hold when the Gaussian measure $M_a$ is replaced by a general steady state $\rho_\infty=e^{-A}$ such that $(A,D)$ satisfies the Bakry-Emery condition (A3). The quadratic form on the r.h.s. is then replaced by $\frac{2}{\lambda_1}\int\nabla u^\top D\,\nabla u\,\rho_\infty(dx)$.

Let us now discuss briefly some consequences of inequalities (3.4), (3.5), (3.6), (3.7).
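The interpolating family (3.7) can be sanity-checked by quadrature: for a fixed test function the gap between the two sides shrinks monotonically as $p\downarrow1$ (for exponential tilts $u=e^{\mu x}$ the $p\to1$ limit (3.4) is in fact saturated). A sketch for $a=1$, $n=1$, with a sample $u$ (both arbitrary choices, not from the text):

```python
import math

h, L = 0.001, 12.0
xs = [-L + h * k for k in range(int(2 * L / h) + 1)]
# quadrature weights for the Gaussian measure M_1
w = [math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi) * h for x in xs]
u = [math.exp(0.2 * x) for x in xs]
du = [0.2 * math.exp(0.2 * x) for x in xs]

rhs = 2.0 * sum(d * d * wi for d, wi in zip(du, w))  # 2a * Dirichlet integral
gaps = []
for p in (1.25, 1.5, 2.0):
    lhs = (p / (p - 1.0)) * (
        sum(v * v * wi for v, wi in zip(u, w))
        - sum(abs(v) ** (2.0 / p) * wi for v, wi in zip(u, w)) ** p
    )
    gaps.append(rhs - lhs)
print(gaps)
```

All gaps are positive (the inequality holds), and they increase with $p$, illustrating that the logarithmic Sobolev end of the hierarchy is the sharpest for this profile.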
In spite of the fact that the class of admissible entropies and steady state measures which generate inequalities (3.4), (3.5), (3.6), (3.7) is very wide, Sobolev inequalities with respect to the Lebesgue measure follow only if $\psi(\sigma)=\sigma\ln\sigma-\sigma+1$.

Actually, in this case the choice $g^2=f^2M_a$ leads to the inequality

$$\int_{\mathbb{R}^n}g^2\ln\Bigl(\frac{g^2}{\|g\|^2_{L^2(dx)}}\Bigr)dx + \Bigl[n+\frac n2\ln(2\pi a)\Bigr]\|g\|^2_{L^2(dx)} \le 2a\int_{\mathbb{R}^n}|\nabla g|^2\,dx \qquad (3.8)$$

for all $g\in L^2(\mathbb{R}^n)$, $a>0$ (cf. [Car91]). Inequality (3.8) is the logarithmic Sobolev inequality for the heat kernel $H_0=-\Delta$. In Davies' book [Dav87] inequality (3.8) is contained in the more restrictive Sobolev inequalities framework of ultracontractive operators (see Theorem 2.2.3 and the rest of the Chapter). To obtain Davies' form, we rewrite (3.8) as a family of logarithmic Sobolev inequalities for $\varepsilon>0$:

$$\int_{\mathbb{R}^n}g^2\ln g\,dx \le \varepsilon\int_{\mathbb{R}^n}|\nabla g|^2\,dx + M_G(\varepsilon)\,\|g\|^2_{L^2(dx)} + \|g\|^2_{L^2(dx)}\ln\|g\|_{L^2(dx)} \qquad (3.9)$$

for all $0\le g\in L^2(\mathbb{R}^n)$. As will be shown in Theorem 3.10 below,

$$M_G(\varepsilon) = -\frac12\Bigl[n+\frac n2\ln(2\pi\varepsilon)\Bigr]$$

is the sharp constant for all $\varepsilon>0$. Now, let us compare the function $M_G$ of (3.9) with the analogous one given by Davies' conditions. In his framework, inequality (3.9) follows if

$$\bigl\|e^{-H_0t}g\bigr\|_{L^\infty(dx)} \le \|g\|_{L^2(dx)}\,e^{M(t)}, \qquad t>0, \qquad (3.10)$$

where $M(t)$ is a monotonically decreasing continuous function of $t$. In the present case, $H_0=-\Delta$,

$$e^{-H_0t}g = K_t * g, \qquad (3.11)$$

where $K_t(x)$ is the Gaussian $(4\pi t)^{-n/2}\exp\bigl(-\frac{|x|^2}{4t}\bigr)$. Then Young's inequality gives

$$\bigl\|e^{-H_0t}g\bigr\|_{L^\infty(dx)} \le \|g\|_{L^2(dx)}\,\|K_t\|_{L^2(dx)}, \qquad (3.12)$$

and we obtain $M(t)=-\frac n4\ln(8\pi t)$. Thus

$$M(\varepsilon)-M_G(\varepsilon) = \frac n2\,(1-\ln2) > 0, \qquad (3.13)$$

and the function $M(\varepsilon)$ is not optimal. In the above example we were able to rewrite (3.4) as (3.9) with arbitrarily small principal coefficient $\varepsilon$. For general potentials $A(x)$ that satisfy (A2), the possibility of doing so is deeply connected to the difference between hypercontractivity and ultracontractivity (see §5 of [Gro93], §2 of [Dav87]). As the heat kernel example shows, the entropy approach of §2 could lead to better constants.
and the function M () is not optimal. In the above example we were able to rewrite (3.4) as (3.9) with arbitrarily small principal coecient . For general potentials A(x) that satisfy (A2), the possibility of doing so is deeply connected to the dierence between hypercontractivity and ultracontractivity (see x5 of Gro93], x2 of Dav87]). As the heat kernel example shows, the entropy approach of x2 could lead to better constants. 34

3.2 Perturbation lemmata for the potential A(x) In Section 2 we derived the convex Sobolev inequality (2.70) corresponding to Fokker-Planck operators L = ;div(D(r + rA)) that satisfy the Bakry-Emery condition (A3). Next we will present two perturbation results to extend this inequality to a larger class of operators. First we shall consider bounded perturbations of the `potential' A(x). Our result generalizes the perturbation lemma of Holley and Stroock HolSto87], Gro90] from the logarithmic entropy to all admissible relative entropies e from Denition 2.1: ;A(x)  f ;Ae(x) 2 L1 (IRn ) with R n 1 dx = Theorem 3.1. Let ( x ) = e ( x ) = e 1 1 + IR R 1 dx = 1 and IRn f

Ae(x) = A(x) + v(x) 0 < a  e;v(x)  b < 1

x 2 IRn:

(3.14)

Let the symmetric locally uniformly positive denite matrix D(x) be such that the convex Sobolev inequality (3.3) (with the admissible entropy generator ) holds for all f 2 L2 (d 1): Then, a convex Sobolev inequality also holds for the perturbed measure f1 : 

Z

IRn



!

f2

kf kL 2

df1 )

2(

f1(dx)

(3.15)

2 Z 2 2 b b  1 max( a2  a ) IRn kf k4f2 00 L (df1 )

for all f 2 L2(d f1) = L2 (d 1):



!

f2

kf kL 2

df1 )

2(

r>f Drf f1(dx)

Proof: We introduce the notations

2 2 k f k2L2(d1 ) f ( x )  f ( x ) 1  1 (x) := kf k2   :=  = kf k2  0 (x) := kf k2 0 L2 (d1 ) L2 (df1 ) L2 (df1 )

and because of (3.14) we have 1    1: b a

Below we shall need the estimate  00 00 (0 )  00 (() 11) 1 0

1 < 0  1  0 : 35

(3.16) (3:17a) (3:17b)

The rst line follows from 000  0 (see (2.20)), and the second line follows from (2.38). We shall now rst prove the assertion (3.15) for the case   1: In the following chain of estimates we use, in this sequence, (2.36), (3.14), (3.3), (3.14), (3.17b), (3.16): Z

(1 ) f1(dx) n

IR



Z

(0)2 + 2( ; 1)( ; 1) f1(dx)



IRZn



Z

(0) f1(dx)  b n (0) 1(dx) IR 2 f 2  b2 1 IRn kf k4 2 00(0)r>f Drf 1(dx) (3.18) L (d1 ) Z 2  ab 21 IRn kf k4f2 00(0)r>f Drf f1(dx) L (df1 ) Z 2  ab 21  IRn kf k4f2 00(1)r>f Drf f1(dx) L (df 1) Z 2  ab2 21 IRn kf k4f2 00(1)r>f Drf f1(dx) L (df1 ) which is the result if 1  0 : In the case  < 1 we proceed similarly and rst apply (2.37) to obtain: Z

IRn

(1 ) f1(dx) 

= 

Z

IRn

2

2

IRnZ

 (0 ) + 2( ; 1)( ; 1)] f1(dx) = 

Z

IRn

(0) f1(dx): (3.19)

Now we again use (3.14), (3.3), (3.14), (3.17a), (3.16) and nally obtain the result Z Z 2 b 2 f 2 00( )r>f Drf f (dx): (  ) f ( dx )  (3.20) 1 1 1 1 a 1 IRn kf k4L2(df1 ) IRn We remark that the `perturbation constant' in (3.15) is not as good as the one 2 b b in the proof of HolSto87] (max( a2  a ) versus ab ): This is due to the fact that the proof of Holley and Stroock exploits homogeneity properties of and 0, in the case () =  ln  ;  +1: Thus, their original proof (with the constant ab ) can be extended to entropy generators of the form () = p ; 1 ; p( ; 1) 1 < p  2, but not to the general case of Theorem 3.1. Example 3.2. Set D(x) = I and consider the `double well potential' A(x) = c1jxj4 ; c2 jxj2(c12 > 0): Then, the perturbation theorem 3.1 yields convex Sobolev inequalities as A(x) can be written as a bounded perturbation of a uniformly convex potential that satises (A2). 36

Next we shall specialize our discussion to the one-dimensional situation and derive convex Sobolev inequalities under very mild assumptions on the potential A(x): In particular we shall prove that an appropriate choice of the diusion coe!cient D compensates lack of convexity of the potential A. 21 Theorem 3.3. Let A 2 Wloc (IR) be bounded below and satisfy 1(x) = e;A(x) 2 R L1+(IR) with IR 1(dx) = 1 and let generate an admissible relative entropy. Then there exists a 1 > 0 and a function D = D(x) with D(x)  D0  x 2 IR for some constant D0 > 0 such that:      2  Z Z 1 00 (3.21) 1(dx)  2 D 1(dx) 1 1 IR 1 1 x IR Z

8 2 L (IR) with 1 +

IR

dx = 1:

The idea of the proof is to construct a bounded function D(x) such that (A D) satises (A1). To this end we need the following Lemma 3.4. Let A(x) satisfy the conditions of Theorem 3.3. Then there exists a > 0 such that the ODE 1 Dx2 ; 1 D + 1 D A + DA = (3.22) xx 4 D 2 xx 2 x x admits at least one global solution that satises D(x)  D0 > 0 x 2 IR: Note that the left hand side of (3.22) is the 1 ; d version of the left hand side of (A1) such that Theorem 3.3 follows. Proof: We transform D(x) = y (x)2 and solve yxx = Axxy + Axyx ; y  x 2 IR (3.24) y(0) = eA(0)  y0(0) = Ax(0)eA(0) : (3.25) The (possibly only local) solution of (3.23) satises

y(x) = eA(x)

;

eA(x)

Z x

e;A(z)

0

Z z 0

y( );1d



dz  eA(x) :

(3.26)

Due to this upper bound, y(x) can only break down at a nite x0 if y(x0) = 0: Locally, y(x) can be obtained through a xed point iteration starting with y0(x) := eA(x) :

yk+1(x) =

eA(x)



eA(x)

;



eA(x)

1 ; k

Z x 0

e;A

e;A(z)

Z z 0

yk

( );1d



dz

Z x  ; 1 yk ( ) d  L1 (IR)

k

0

37

(3.27)

k  0:

Starting with y0(x)  eA(x)  an iterative application of the estimate (3.27) yields

y(x)  eA(x) > 0

with

 = 1;

1;

(3.28) r



1 ; ...

= 1 + 1 ;  2 4

whenever  14 : By induction one shows that (yk ) is decreasing. Hence, (3.27) converges to y(x) 8x 2 IR: From (3.28) we obtain D(x)   2 e2A0 =: D0 x 2 IR with A0 denoting the lower bound of A(x): Note that D = e2A satises (3.22) with = 0: Hence the couple (A D = e2A) violates the Bakry-Emery condition (A1), nevertheless the convex Sobolev inequality (3.21) holds with D = e2A and 1 = 41 (due to the estimate (3.26)). Obviously, for A uniformly convex D can be chosen as a constant (cf. (A2)). We shall now show that for nonuniformly convex A in general (A1) cannot be satised with a uniformly bounded function D. Lemma 3.5. Let A(x) satisfy the conditions of Theorem (3.3) a) If 0 < D(x)  D1 < 1 x 2 IR then limx! 1Ax (x) = 1:

b) If 0 < D0  D(x)  D1 < 1 x 2 IR then A(x) grows at least quadratically.

Proof: a) Using D = y 2 we rewrite the dierential inequality (A1) as

yxx + y + f = (Axy)x

for some f (x)  0: Solving for A gives 



Z x Z x dz 1 Ax(x) = y(x) C + y(z) + f (z)dz + yx(x)  0 0

(3.29)

for some C 2 IR: Since y is bounded on IR we have limx!1yx(x) > ;1: The result follows from Z x dz  px  x  0: D1 0 y (z ) 38

b) Integrating (3.29) gives 



Z x Z x Z x Z z 2 dz dz y ( x ) 1 + C y(z) + ln y(0) + y(z) f ( )ddz A(x) = A(0) + 2 0 y (z ) 0 0 0 and the result follows from the boundedness assumptions on D. We now nish our discussion of perturbation results on convex Sobolev inequalities with an example. It demonstrates that the diusion coe!cient D constructed in the proof of Lemma 3.4 in many cases has a much stronger growth at x = 1 than necessary to satisfy the Bakry-Emery condition (A1). Example 3.6. Let A 2 C 2 (IR) satisfy A(x) = cjxj2 for jxj > L and 0 <   1. Then a pair (D ), > 0 satisfying (A1) can be constructed such that

D(x)

( ;  + + 1; 2 c + c x 1 2 = ; ; ; 1;2

c1 + c2 jxj

 x > L1   x < ;L1

(3.30)

for some c 1 2 IR, c 2 > 0, L1 > L. The construction of D proceeds as follows: On the interval (;L1  L1 ) where L1 will be determined later, we choose D = y 2 where y is the solution (3.26) of the IVP (3.23). Outside of this interval we extend D by (3.30) in a C 1 -way, thus xing c 12 (in dependence of and L1 ). By a perturbation argument around = 0 one easily sees that > 0 can be chosen small enough and L1 large enough such that yx (x)(L1 ) > 1 and such that (A1) holds for jxj > L1 .

3.3 Poincar e-type inequalities As an application of Theorem 3.1 we shall now derive Poincare-type inequalities. Remark 3.7. Poincare-type inequalities on bounded, uniformly convex domains are readily obtained from convex Sobolev inequalities. For simplicity's sake, let B = fx 2 IRnjjxj < 1g be the unit ball in IRn and let 0 = 0 (x),  0 (x)  on B , be the density of a probability measure on B . We dene the C 1-function A~" = A~"(jxj) on IRn for " > 0 by

d2 A~"(r) = dr2



We set

A"(x) =

1 0 < r < 1 1 " r > 1

(

d A~"(0) = 0:  A~"(0) = dr

2 A~"(jxj) ; jx2j ; ln 0 (x) x 2 B A~"(jxj) x 2= B

39

and

"

1 (x) = exp(

;

A" (x))=

Obviously,

"

1 (x)

;! " !0



Z

IR

n

exp(;A"(y))dy:

0 (x) x 2 B : 0 x 2= B

Since A" := ; ln "1 is an L1- perturbation (uniformly as " ! 0) of the uniformly convex function A~"(jxj), which satises @ 2 A~"  I on IRn 2

@x

we can apply the perturbation result Theorem 3.1 and obtain the convex Sobolev inequality (with D  I) Z





Z v " 00 R d  c 1 " IRn IRn IRn vd 1



v jrvj2 d "1 R R " 2 " IRn vd 1 ( IRn vd 1) 

for all admissible entropies and all functions v 2 L1+ (IRn  d "1 ) (v 2 L1 (IRn  d "1) if is quadratic on IR). Here c is independent of ". Passing to the limit " ! 0 gives the Poincare-type inequality Z





Z v 00 R d 0  c B B B vd 0



v jrvj2 d 0 R R 2 B vd 0 ( B vd 0 ) 

for all v 2 L1+ (B d 0 ) (v 2 L1 (B d 0 ) if is quadratic on IR). In the latter case () = ( ; 1)2 with 0 = vol(1B) we obtain the classical Poincare inequality Z 



Z Z 2 1 v ; vol(B ) vdx dx  2c jrvj2dx v 2 L1 (B ): B B B Obviously, the above limit argument can easily be carried over to general uniformly convex domains.

3.4 Sharpness results We now turn to the analysis of the saturation of the convex Sobolev inequalities, i.e. we shall answer the question for which function the inequality (2.70) becomes an equality. In particular we shall nd necessary and su!cient conditions on the entropy generator and on 1, which imply the existence of an admissible 40

function 6= 1 such that (2.70) (under the assumption D = I) becomes an equality. We remark that this question was completely answered in Car91] for the Gross logarithmic Sobolev inequality (i.e. () =  ln  ;  + 1, 1 = Ma ) by a technique dierent from the one presented in the sequel. The same problem was treated in Tos97a] and Led92] using a method which we shall generalize below. At rst we observe that the derivation of the convex Sobolev inequalities given in Section 2 is based on writing the entropy equation d e ( (t)j ) = I (e(t)j ) (3.31) 1 1 dt the equation for the entropy production d I ( (t)j ) = ;2 I ( (t)je ) + r ( (t)) (3.32) 1 1 1 dt and proving r ( (t))  0. Explicitly, we obtain by inserting (3.31) into the right hand side of (3.32) and by integrating with respect to t: Z 1 1 1 e ( I j 1) = ; 2 I ( I j 1) ; 2 r ( (s))ds (3.33) 1 1 0 where (t) in the Fokker-Planck trajectory which `connects' the initial state I with the steady state 1. r  0 then gives the convex Sobolev inequality e ( I j 1)  21 jI ( I j 1)j 1 which becomes an equality i Z 1 0

r ( (s))ds = 0

() r ( (t)) = 0

a.e. in IR+t :

(3.34)

The precise form of the remainder r can easily be extracted from the proof of Lemma 2.8. Assuming henceforth

DI

we obtain Z



(3.35) !

n X @ul )2 e;A dx @u IV A 4 000 00 A A > r ( ) = n (e )juj + 4 (e )u @x u + 2 (e ) ( @x m IR lm=1  2  Z +2 n 00 (eA )u> @@xA2 ; 1I u e;Adx (3.36) IR

41

where we recall u = r(eA ). Obviously, r ( ) = 0 if u  0, which (taking into account the normalization) implies  1 and gives the trivial case of equality in the convex Sobolev inequality. Thus, assume u 6 0. Then, using the admissibility conditions (2.12) for the entropy generator and (A2), we conclude that r ( ) = 0 holds i the subsequent four conditions are satised: 000(eA )2 = 12 00(eA ) IV (eA ) (3.37a)  n X

!1

@u uj = ju> @x

(3.37b)

000 A 4 00(eA )u> @u @x u = ; (e )juj 

(3.37c)

juj

2

@ul )2 ( @x m lm=1



2



@ 2 A ; I u = 0: (3.37d) @x2 1 Since (t) := eA (t) satises (2.11) we have by the maximum-minimum principle u>

A  eA(x) (x t)  sup eA inf e I I n n IR IR

a.e. in ℝⁿ × ℝ_t^+. We conclude from (3.37a):

Lemma 3.8. If the convex Sobolev inequality (2.70) with D ≡ I becomes an equality for ρ ≢ ρ∞, then ψ(σ) = ν(σ) or ψ(σ) = φ(σ) holds for σ ∈ (inf_{ℝⁿ} ρ_I e^A, sup_{ℝⁿ} ρ_I e^A), where ν and φ are given by (2.16a) and (2.16c), resp.

Nontrivial saturation can therefore only occur for the "minimal" and "maximal" entropies. At first we investigate the case of the maximal (quadratic) entropy. Note that now any ρ_I ∈ L¹(ℝⁿ, dx) is admissible; positivity is not required.

Theorem 3.9. The convex Sobolev inequality (2.70) with D ≡ I and ψ(σ) = (σ − 1)² becomes an equality iff the following two conditions hold:

(i) there exist Cartesian coordinates y = (y1, …, yn)ᵀ = y(x) on ℝⁿ such that for some μ ∈ ℝ

    A(x(y)) = (λ1/2) y1² + μ y1 + B(y2, …, yn);                                 (3.38)

(ii) ρ_I satisfies for some λ ∈ ℝ:

    ρ_I(x(y)) = (1 + λ y1) e^{−A(x(y))}.                                        (3.39)

Proof: Since ψ''' ≡ 0, we conclude from (3.37b) and (3.37c):
    ∂u_l/∂x_m = 0,   l, m = 1, …, n                                             (3.40)

(u ≢ 0 !). Thus u(x,t) = ∇_x(ρ(x,t) e^{A(x)}) ≡ C(t), where C(t) is constant in x. Then ρ(x,t) = (C(t)·x + C1(t)) e^{−A(x)} follows with C1 real valued. Inserting into the Fokker-Planck equation gives Ċ·x + Ċ1 = −∇A·C, and Ċ = −(∂²A/∂x²)C follows. (3.37d) gives C(t)ᵀ(∂²A/∂x² − λ1 I)C(t) = 0, and since ∂²A/∂x² − λ1 I ≥ 0 we conclude (∂²A/∂x²)C(t) = λ1 C(t). We obtain Ċ = −λ1 C and C(t) = C0 e^{−λ1 t}. Therefore (∂²A/∂x² − λ1 I)C0 = 0, and we have ∇A·C0 = λ1 C0·x + μ, μ ∈ ℝ. We insert C(t) = C0 e^{−λ1 t} into Ċ·x + Ċ1 = −∇A·C and find C1(t) = (μ/λ1) e^{−λ1 t} + C2, C2 ∈ ℝ. This gives

    ρ(x,t) = (C0·x + μ/λ1) e^{−λ1 t − A(x)} + C2 e^{−A(x)},

and t → ∞ implies C2 = 1. Summing up, we found, for some C0 ∈ ℝⁿ, μ ∈ ℝ:

    ρ(x,t) = (C0·x + μ/λ1) e^{−λ1 t − A(x)} + e^{−A(x)}                         (3.41a)

and

    ∇(A − (λ1/2)|x|²) · C0 ≡ μ.                                                 (3.41b)

Note that C0 = 0 implies μ = 0, and ρ(x,t) ≡ e^{−A(x)} follows. We therefore assume C0 ≠ 0 from now on. Set ω1 = C0/|C0|, choose orthonormal vectors ω2, …, ωn ∈ {ω1}^⊥ and define the change of coordinates x ↔ y by x = y1ω1 + … + ynωn. We have ∇_x f(x)·C0 = (∂f(x(y))/∂y1)|C0|, and thus

    A(x(y)) = (λ1/2) y1² + (μ/|C0|) y1 + B(y2, …, yn)

follows from (3.41b), and this finishes the proof.

Next we treat the "minimal" entropy case.
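The saturation described in Theorem 3.9 is easy to observe numerically. The following sketch (our own illustration, not part of the paper) takes the one-dimensional steady state ρ∞ = e^{−A}/√(2π) with A(x) = x²/2 (so λ1 = 1) and the perturbation ρ = (1 + λ y1)ρ∞ of (3.39), and checks by quadrature that the quadratic entropy e_ψ(ρ|ρ∞) equals |I_ψ(ρ|ρ∞)|/(2λ1), i.e. that (2.70) is saturated; the value λ = 0.3 is an arbitrary choice.

```python
import numpy as np

# One-dimensional check of the equality case in Theorem 3.9 (sketch, our own choices):
# A(x) = x^2/2, so lambda_1 = 1 and rho_inf = exp(-A)/sqrt(2*pi).
lam = 0.3                                  # arbitrary perturbation strength
x = np.linspace(-12.0, 12.0, 200001)
rho_inf = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

mu = 1 + lam * x                           # mu = rho/rho_inf, cf. (3.39); may change sign,
dmu = np.full_like(x, lam)                 # which is admissible for the quadratic entropy

def integrate(f):                          # simple trapezoidal rule
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

# psi(s) = (s-1)^2:  e_psi   = int (mu-1)^2 rho_inf dx,
#                    |I_psi| = int psi''(mu) |mu'|^2 rho_inf dx = 2 int |mu'|^2 rho_inf dx
e_psi = integrate((mu - 1)**2 * rho_inf)
I_abs = integrate(2 * dmu**2 * rho_inf)

ratio = e_psi / (I_abs / 2.0)              # equality in (2.70) means ratio == 1
print(e_psi, I_abs, ratio)
```

Here e_ψ = λ² and |I_ψ| = 2λ², so the ratio is 1 up to quadrature error; a perturbation that is not affine in x would give a ratio strictly below 1.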

Theorem 3.10. The convex Sobolev inequality (2.70) with D ≡ I and ψ(σ) = σ ln σ − σ + 1 becomes an equality iff the following two conditions hold:

(i) there exist Cartesian coordinates y = (y1, …, yn)ᵀ = y(x) on ℝⁿ such that for some μ ∈ ℝ

    A(x(y)) = (λ1/2) y1² + μ y1 + B(y2, …, yn);

(ii) ρ_I satisfies for some λ ∈ ℝ

    ρ_I(x(y)) = exp( −A(x(y)) + λ y1 − λ²/(2λ1) + λμ/λ1 ).                      (3.42)

Proof: We set z = ∇ ln(ρe^A) in (3.36) and calculate (using the specific form of ψ):

    r_ψ(ρ) = 2 ∫_{ℝⁿ} ρ Σ_{l,m=1}^n (∂z_l/∂x_m)² dx
             + 2 ∫_{ℝⁿ} ρ zᵀ( ∂²A/∂x² − λ1 I ) z dx.

Assume z ≢ 0. Then ∂z/∂x ≡ 0 follows, and z(x,t) ≡ C(t) ∈ ℝⁿ. This gives ρ(x,t) = C1(t) exp(C(t)·x − A(x)) with C1(t) > 0. Inserting into the Fokker-Planck equation and proceeding as in the proof of the previous Theorem implies

    ρ(x,t) = exp( (C0·x + β(t)) e^{−λ1 t} − A(x) ),                             (3.43)

where C0 and μ satisfy (3.41b) and the scalar function β(t) is determined by the normalization of ρ. Setting t = 0 in (3.43) proves the assertion.

Note that ρ_I in (3.42) is obtained by a shift in the y1-coordinate of ρ∞ (cf. [Car91]). In one dimension (n = 1), or for constant matrices D, the analogous result to the Theorems 3.9 and 3.10 is obtained by the coordinate transformation of Remark 2.10. For D(x) ≢ I, however, nontrivial saturation of the convex Sobolev inequality (2.70) is not always possible for any A(x): even for scalar diffusion D(x) = D(x)I, the analogue of (3.40) implies an integrability condition on D(x), which is not satisfied in general.
4 A generalized Csiszar-Kullback Inequality

We shall now be concerned with proving generalized Csiszar-Kullback inequalities, which are estimates for the L¹-distance of two functions in terms of their relative entropy. These inequalities have at least a 30 years history in probability and information theory [Csi63, Csi66, Csi67, Kul59, KuLei51, Per57, Per65, BaNi64]. In [Csi67] (Theorem 3.1, page 309, and Section 4, page 314) the following result was obtained:

Theorem 4.1. Let ψ̃: (0,∞) → ℝ be bounded below, convex on (0,∞), strictly convex at 1 with ψ̃(1) = 0. Then there exists a function W_ψ̃: [0,∞) → [0,∞) such that

1. W_ψ̃ is increasing.
2. W_ψ̃(0) = 0.
3. W_ψ̃ is continuous at 0.
4. For all non-negative u ∈ L¹(Ω, S, μ) (where (Ω, S, μ) is a probability space) with ∫ u dμ = 1 the Csiszar-Kullback inequality holds:

    ‖u − 1‖_{L¹(dμ)} ≤ W_ψ̃( ẽ_ψ̃(u) ),                                          (4.1)

where ẽ_ψ̃(u) := ∫ ψ̃(u) dμ.
For the generating function ψ̃(σ) = σ ln σ − σ + 1, e.g., it is known [Csi67] that

    W_ψ̃(c) ≤ √(2c).                                                            (4.2)

By a rescaling argument the inequality (4.1) can be extended to functions u ∈ L¹(dμ), u ≥ 0, which are not necessarily probability densities: Excluding the case u ≡ 0, we replace u by u/(∫ u dμ), apply (4.1) and multiply by ∫ u dμ to get

    ‖ u − ∫ u dμ ‖_{L¹(dμ)} ≤ ( ∫ u dμ ) W_ψ̃( ẽ_φ(u) ),
where φ(σ) := ψ̃( σ / ∫ u dμ ).

Throughout this Section we shall make use of the following assumptions and notations:

B.1 (Ω, S, μ) is a measure space and μ is a probability measure on S, i.e. μ(Ω) = 1, μ ≥ 0.

B.2 ψ: J → ℝ is a strictly convex, continuous, non-negative function on the (possibly unbounded) interval J, which has a non-empty interior J°. If α := inf J ∉ J, then lim_{σ→α} ψ(σ) = ∞. If β := sup J ∉ J, then lim_{σ→β} ψ(σ) = ∞.

For functions u ∈ L¹(dμ) := L¹(Ω; dμ) we define the entropy

    e_ψ(u) := { ∫ ψ(u) dμ − ψ([u])   if u(x) ∈ J μ-a.e.,
              { ∞                    else,                                      (4.3)

with the notation [u] := ∫ u dμ.

Remark 4.2. Since ψ is continuous on J, the mapping x ↦ (ψ∘u)(x) is S-measurable whenever u is S-measurable. Furthermore, due to Jensen's inequality we have for all u ∈ L¹(dμ) with u(x) ∈ J μ-a.e.

    ∫ ψ(u) dμ ≥ ψ( ∫ u dμ ).

Thus the right-hand side of (4.3) has a well-defined value in [0, +∞].
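On a finite probability space the quantities in (4.3) are elementary sums, and Jensen's inequality can be checked directly. The following sketch (our own; the weights and the function u are arbitrary choices) evaluates e_ψ(u) for the generator ψ(σ) = σ ln σ − σ + 1:

```python
import numpy as np

# Discrete probability space Omega = {0,...,4} with mu_i > 0, sum(mu) = 1 (B.1).
mu = np.array([0.10, 0.30, 0.20, 0.25, 0.15])
u  = np.array([0.50, 1.40, 2.00, 0.80, 1.10])   # values in J = (0, inf), arbitrary

def psi(s):                                     # admissible generator on J = (0, inf)
    return s * np.log(s) - s + 1

u_mean = float(np.dot(mu, u))                   # [u] = int u dmu
e_psi = float(np.dot(mu, psi(u))) - psi(u_mean) # entropy (4.3)
print(u_mean, e_psi)
```

By Jensen's inequality e_ψ(u) ≥ 0, with equality iff u is μ-a.e. constant; here u is non-constant, so e_ψ(u) is strictly positive.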

Remark 4.3. a) In contrast to Definition (2.1), the generating function ψ does not have to satisfy the normalization ψ(1) = ψ'(1) = 0.
b) The required continuity of ψ as well as the assumed behaviour of ψ(σ) as σ tends to the boundary of J deserve a few more comments. Let ψ̂: J → ℝ be convex on the interval J, which has a non-empty interior. Then ψ̂ is continuous on J°, and possible points of discontinuity are contained in ∂J ∩ J. If α ∈ ℝ and if ψ̂(σ) remains bounded as σ approaches α, then we can (re-)define ψ̂(α) := lim_{σ→α+} ψ̂(σ). We get a convex function, less or equal to the original one (i.e. the entropy decreases), which is continuous at α. The same procedure applies in cases where β is finite and ψ̂(σ) remains bounded as σ approaches β. The cases α = −∞ or β = ∞ are discussed in c).
c) Not every strictly convex, continuous function ψ̂: J → ℝ on the (possibly unbounded) interval J with non-empty interior satisfies B.2. However, we can normalize the generating function ψ̂ as in Remark 2.2: Due to the strict convexity of ψ̂ there exists for each σ0 ∈ J° a constant b ∈ ℝ such that the function ψ(σ) := ψ̂(σ) − ψ̂(σ0) − b(σ − σ0) is non-negative on J. Furthermore we have for all u ∈ L¹(dμ) the equality e_ψ(u) = e_ψ̂(u), which shows that ψ and ψ̂ generate the same entropy.

Remark 4.4. Let us assume that Ω ⊆ ℝⁿ, n ∈ ℕ, is a Borel set, S is the Borel sigma algebra on Ω, and μ is a probability measure on S which is absolutely continuous with respect to the Lebesgue measure. In this case the Radon-Nikodym derivative of μ with respect to the Lebesgue measure exists:

    g(x) := dμ(x)/dx.

To keep things simple, let us assume that g(x) > 0 for almost all x ∈ Ω. Now consider a function f ∈ L¹(Ω, dx) which satisfies

    u(x) := f(x)/g(x) ∈ J   a.e. on Ω.

Then e_ψ(u) is — as in the previous sections — the relative entropy of f w.r.t. g:

    e_ψ(u) = e_ψ(f|g) = ∫ ψ(f/g) g dx − ψ( ∫ f dx ).

In contrast to Definition 2.1, however, ∫ f dx need not be normalized here. In the case of ψ(σ) = σ ln σ − σ + 1 and f(x) ≥ 0 with ∫ f dx = 1, the inequalities (4.1) and (4.2) combine to (2.24).
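For discrete densities f, g as in Remark 4.4, the combination of (4.1) and (4.2), i.e. ‖f − g‖_{L¹} ≤ √(2 e_ψ(f|g)) with ψ(σ) = σ ln σ − σ + 1, can be tested directly; the random distributions below are arbitrary choices (our own sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
psi = lambda s: s * np.log(s) - s + 1          # logarithmic entropy generator

for _ in range(200):
    # two strictly positive probability vectors on 8 points (arbitrary test data)
    g = rng.random(8) + 0.1; g /= g.sum()
    f = rng.random(8) + 0.1; f /= f.sum()

    e_psi = float(np.sum(g * psi(f / g)))      # relative entropy e_psi(f|g); since
                                               # sum(f) = 1 this is the KL divergence
    l1 = float(np.abs(f - g).sum())
    assert l1 <= np.sqrt(2 * e_psi) + 1e-12    # Csiszar-Kullback inequality (2.24)
print("ok")
```

The bound holds for every sampled pair; Section 4.5 below discusses why the square-root rate is tied to ψ''(1) > 0.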

The subsequent analysis aims at generalizing the above entropy estimates. In particular we will establish a generalized Csiszar-Kullback inequality of the form

    ‖ u − ∫ u dμ ‖_{L¹(dμ)} ≤ U_ψ( e_ψ(u), ∫ u dμ ),                            (4.4)

where the function U_ψ only depends on ψ and u belongs to L¹(dμ). We shall present a transparent (and almost elementary) construction of U_ψ and thus extend the results of the above cited references to a larger class of functions u (u does not have to be normalized or non-negative here). Then we shall prove the optimality of the estimate and construct U_ψ explicitly for certain convex functions ψ. The explicit construction of U_ψ is somewhat lengthy and thus deferred to the Appendix.

In the sequel we shall need the following ψ-dependent constants. For any κ ∈ J° we define:

    Q_ψ⁺(κ) := lim_{z→∞} [ψ(κ + z) − ψ(κ)]/z ∈ (0, ∞]   if β = ∞,

    Q_ψ⁻(κ) := lim_{z→∞} [ψ(κ − z) − ψ(κ)]/z ∈ (0, ∞]   if α = −∞,

    c0(κ) := { ψ(α) − ψ(κ) + (κ − α) Q_ψ⁺(κ)                  if α, ψ(α) ∈ ℝ, β = ∞,
             { ψ(β) − ψ(κ) + (β − κ) Q_ψ⁻(κ)                  if β, ψ(β) ∈ ℝ, α = −∞,
             { [(κ − α)ψ(β) + (β − κ)ψ(α)]/(β − α) − ψ(κ)     if α, β, ψ(α), ψ(β) ∈ ℝ,
             { ∞                                              else

(with the convention κ·∞ = ∞ for all κ > 0), and

    Usup(κ) := { 2(κ − α)                   if α ∈ ℝ, β = ∞,
               { 2(β − κ)                   if α = −∞, β ∈ ℝ,
               { 2(κ − α)(β − κ)/(β − α)    if α, β ∈ ℝ,
               { ∞                          if α = −∞, β = ∞.

We now collect the essential properties of U_ψ in the following theorem; its elementary proof is given in the Appendix.

Theorem 4.5. Under the assumption B.2 there exists a function U_ψ: [0,∞) × J → ℝ such that

P1. U_ψ is continuous.
P2. For all κ ∈ J: U_ψ(0, κ) = 0.
P3. For all κ ∈ J: the mapping c ↦ U_ψ(c, κ), c ∈ [0,∞), is increasing.
P4. For all κ ∈ ∂J ∩ J and all c ∈ [0,∞) the identity U_ψ(c, κ) = 0 holds.
P5. For all κ ∈ J°: the mapping c ↦ U_ψ(c, κ), c ∈ [0,∞), is strictly increasing on [0, c0(κ)).
P6. For all κ ∈ J°: lim_{c→∞} U_ψ(c, κ) = Usup(κ).
P7. For all κ ∈ J° and all c ∈ [c0(κ), ∞): U_ψ(c, κ) = Usup(κ).
We note that ∂J ∩ J (see P4.) may be empty. In P7. we use the convention [∞,∞) = ∅. The main result of this Section is:

Theorem 4.6. Assume B.1–B.2. Then we have for all u ∈ L¹(dμ) with e_ψ(u) < ∞:

    ‖ u − ∫ u dμ ‖_{L¹(dμ)} ≤ U_ψ( e_ψ(u), ∫ u dμ ).

Proof: We first shall introduce some abbreviations. Recall that [u] := ∫ u dμ. Since e_ψ(u) < ∞ we have u(x) ∈ J μ-a.e. Hence [u] ∈ J. We set

    Ω₀⁺ := {x ∈ Ω : u(x) ≥ [u]},   Ω⁻ := {x ∈ Ω : u(x) < [u]},

and define

    w := u − [u],   a := ∫_{Ω₀⁺} w dμ,   θ := μ(Ω₀⁺).                            (4.5)

We note that θ ∈ (0,1] and μ(Ω⁻) = 1 − θ. Since ∫_{Ω⁻} w dμ = −a and ∫ w dμ = 0, we have

    ‖ u − ∫ u dμ ‖_{L¹(dμ)} = ∫ |w| dμ = ∫_{Ω₀⁺} w dμ − ∫_{Ω⁻} w dμ = 2a.

(This factor 2 also appears in the definition of U_ψ.) We proceed by a case-distinction. The first case is probably the most interesting one.
Case 1: θ ≠ 1. If [u] ∈ ∂J ∩ J, then we would get u = [u] and therefore θ = 1. This is a contradiction since we have assumed θ < 1. Hence [u] ∈ J°. Furthermore we get from the definition of a, [u] and θ:

    a ≤ min{ θ(β − [u]), (1 − θ)([u] − α) },

and therefore

    a ≤ Usup([u])/2,   θ ∈ [ a/(β − [u]), 1 − a/([u] − α) ] ∩ (0,1).

Now we apply Jensen's inequality:

    ∞ > e_ψ(u) = ∫ [ψ(u) − ψ([u])] dμ
               = ∫ [ψ(w + [u]) − ψ([u])] dμ
               = ∫_{Ω₀⁺} [ψ(w + [u]) − ψ([u])] dμ + ∫_{Ω⁻} [ψ(w + [u]) − ψ([u])] dμ
               ≥ θ [ ψ([u] + a/θ) − ψ([u]) ] + (1 − θ) [ ψ([u] − a/(1−θ)) − ψ([u]) ]
               =: Φ̂([u], a, θ)
               ≥ inf { Φ̂([u], a, θ') : θ' ∈ J([u], a) ∩ (0,1) },

where

    J([u], a) := { [ a/(β − [u]), 1 − a/([u] − α) ]   if α, β ∈ J,
                 { [ a/(β − [u]), 1 − a/([u] − α) )   if α ∉ J, β ∈ J,
                 { ( a/(β − [u]), 1 − a/([u] − α) ]   if α ∈ J, β ∉ J,
                 { ( a/(β − [u]), 1 − a/([u] − α) )   if α, β ∉ J.

We have to distinguish between three cases.
if   2 J if  2= J  2 J if  2 J  2= J if   2= J

Case 1a: 0 < a < Usup(u])=2. In this case the interval J (u] a) \ (0 1) has a nonempty interior. Due to the fact that ()  (+) (if  2 J ) and ( )  ( ;) (if  2 J ) we have



n

o

inf ^(u] a #) : # 2 J (u] a) \ (0 1)    a a ^ = inf (u] a #) : # 2  ; u]  1 ; u] ;   and therefore with the notations of the proof of Theorem 4.5 ku ; u]kL1(d)  I (e (u) u])  a = 2 which gives due to the denition of U

e (u)

U(e (u) u])  ku ; u]kL1(d) : Case 1b: a = 0. This case needs not to be considered here because a = 0 implies u = u] and therefore # = 1. But this is a contradiction because it is assumed that # < 1. Case 1c: a = Usup(u])=2. Since a 2 IR we have  2 IR or  2 IR. If  2 IR and  = 1 we will get the contradiction # 2 0 0) = , and if  = ;1 and  2 IR we will get the contradiction # 2 1 1) = . Hence, the only remaining case is   2 IR. We get # = (u] ; )=( ; ) which gives + ( ; u]) () ; (u]) e (u)  (u] ; ) () ;   (u] ; ) (;) ;+ ( ; u]) (+) ; (u]) = c0(u]) 50

and therefore by the denition of U and Usup(u])

U(e (u]) u]) = U(c0 (u]) u]) = Usup(u]) = 2a = ku ; u]kL1(d) :

Case 2: # = 1. In this case we easily get u = u]. Hence, e (u) = 0 = ku ; u]kL1(d) and therefore

U(e (u) u]) = U(0 u]) = 0 = ku ; u]kL1(d) :

We shall now discuss inequality (4.4).

4.1 Optimality

A natural question in connection with (4.4) is the following. Is it possible to improve (4.4), i.e. is there a function U*: {(c, κ) : κ ∈ J°, c ∈ [0, c0(κ))} → ℝ (for reasons explained in Remark 4.8 below we exclude the cases c ∈ [c0(κ), ∞) and κ ∈ ∂J ∩ J here) such that

1. for all u ∈ L¹(dμ) with e_ψ(u) < ∞: ‖u − [u]‖_{L¹(dμ)} ≤ U*( e_ψ(u), [u] ),
2. there exists a u* ∈ L¹(dμ) with e_ψ(u*) < ∞ and U*( e_ψ(u*), [u*] ) < U_ψ( e_ψ(u*), [u*] )?

Under the assumption B.3 (see below) the answer to this question is no.

Theorem 4.7. Assume B.1, B.2, and furthermore

B.3 For all θ ∈ (0,1) there exists a set S_θ ∈ S such that μ(S_θ) = θ.

Then for all κ ∈ J° and c ∈ [0, c0(κ)) there exists a sequence (u_m)_{m∈ℕ} in L¹(dμ) such that

1. For all m ∈ ℕ: [u_m] = κ.
2. For all m ∈ ℕ: e_ψ(u_m) = c.
3. lim_{m→∞} ‖u_m − [u_m]‖_{L¹(dμ)} = U_ψ(c, κ).
Proof: Let us consider the case 0 < c < c0(κ) first. In this case there exists a (uniquely determined) a ∈ (0, Usup(κ)/2) such that

    c = inf { Φ̂(κ, a, θ) : θ ∈ ( a/(β − κ), 1 − a/(κ − α) ) },

where Φ̂ is defined as in the proof of Theorem 4.6. Clearly, 2a = U_ψ(c, κ). We note that the mapping θ ↦ Φ̂(κ, a, θ) is not constant, and the mapping a ↦ inf_θ Φ̂(κ, a, θ) is increasing. Furthermore, Φ̂ is continuous. We can therefore choose a sequence (θ_m)_{m∈ℕ} in (0,1) and a sequence (a_m)_{m∈ℕ} in (0, a) such that Φ̂(κ, a_m, θ_m) = c and lim_{m→∞} a_m = a. Now we choose a sequence (S_m)_{m∈ℕ} in S such that μ(S_m) = θ_m for all m ∈ ℕ. We set for all m ∈ ℕ

    u_m: Ω → ℝ,   x ↦ { κ + a_m/θ_m         if x ∈ S_m,
                      { κ − a_m/(1 − θ_m)   if x ∈ Ω ∖ S_m.

We easily verify that [u_m] = κ and e_ψ(u_m) = c. Furthermore,

    lim_{m→∞} ‖u_m − [u_m]‖_{L¹(dμ)} = 2a = U_ψ(c, κ).

This finishes the proof for the case 0 < c < c0(κ). If c = 0 we choose u_m ≡ κ, which gives [u_m] = κ, e_ψ(u_m) = 0 = c and

    ‖u_m − [u_m]‖_{L¹(dμ)} = 0 = U_ψ(0, κ) = U_ψ(c, κ).
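The two-level functions used in this proof can be written down concretely. The sketch below (ours) takes Ω = [0,1] with Lebesgue measure — which satisfies B.3 — the generator ψ(σ) = σ ln σ − σ + 1 and arbitrary admissible values of κ, a, θ, and confirms the identities [u] = κ, ‖u − [u]‖_{L¹(dμ)} = 2a and e_ψ(u) = Φ̂(κ, a, θ) on which the construction rests:

```python
import math

def psi(s):                                    # generator on J = (0, inf)
    return s * math.log(s) - s + 1

# hypothetical parameters with a < min(theta*(beta-kappa), (1-theta)*(kappa-alpha));
# here alpha = 0 and beta = inf, so only a < (1-theta)*kappa matters
kappa, a, theta = 1.0, 0.2, 0.4

hi = kappa + a / theta                         # value of u on a set S of measure theta
lo = kappa - a / (1 - theta)                   # value of u on the complement

mean = theta * hi + (1 - theta) * lo                           # [u]
l1   = theta * (hi - kappa) + (1 - theta) * (kappa - lo)       # ||u - [u]||_L1
e    = theta * psi(hi) + (1 - theta) * psi(lo) - psi(kappa)    # = Phi(kappa, a, theta)
print(mean, l1, e)
```

Minimizing e over θ at fixed (κ, a) yields the function Ĥ(κ, a) of the Appendix, and inverting that map gives U_ψ.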

Remark 4.8. In Theorem 4.7 the cases κ ∈ ∂J ∩ J, and κ ∈ J° with c ∈ [c0(κ), ∞), c0(κ) < ∞, are not included. Indeed, if κ ∈ ∂J ∩ J, then we would have u = [u]. Hence e_ψ(u) = 0, and in this case the estimate (4.4) is optimal, too:

    ‖u − [u]‖_{L¹(dμ)} = 0 = U_ψ(0, κ) = U_ψ( e_ψ(u), [u] ).

If κ ∈ J° and e_ψ(u) ≥ c0(κ), we trivially get for all u ∈ L¹(dμ) with u(x) ∈ J μ-a.e. and [u] = κ

    ‖u − [u]‖_{L¹(dμ)} ≤ Usup([u]) = U_ψ( c0([u]), [u] ) = U_ψ(c, [u]).

We note that this inequality is strict in cases where α = −∞ or β = ∞.

4.2 An extension of the inequality

A closer screening of the proof of Theorems 4.5 and 4.6 shows that the core of all estimates is the fact that for all κ ∈ J° and all a ∈ (0, Usup(κ)/2) the strict estimate

    inf { Φ̂(κ, a, θ) : θ ∈ ( a/(β − κ), 1 − a/(κ − α) ) } > 0

holds. For fixed κ ∈ J° this is the case iff ψ is strictly convex at κ, i.e. iff there exists a constant b ∈ ℝ such that the strict estimate

    ψ(σ) > ψ(κ) + b(σ − κ)

holds for all σ ∈ (α, β), σ ≠ κ. We can therefore extend Theorems 4.5 and 4.6 in the following way:

Theorem 4.9. Assume B.1 and
B.2' ψ̃: J → ℝ is a convex, continuous, non-negative function on the (possibly unbounded) interval J, which has a non-empty interior J°. If α ∉ J, then lim_{σ→α} ψ̃(σ) = ∞. If β ∉ J, then lim_{σ→β} ψ̃(σ) = ∞. Furthermore assume that ψ̃ is strictly convex at κ ∈ J°.

Then:

a. There exists a function U^κ: [0,∞) → ℝ such that

P1. U^κ is continuous.
P2. U^κ(0) = 0.
P3. U^κ is increasing.
P4. U^κ is strictly increasing on [0, c0(κ)).
P5. lim_{c→∞} U^κ(c) = Usup(κ).
P6. For all c ∈ [c0(κ), ∞): U^κ(c) = Usup(κ).

b. Let u ∈ L¹(dμ). If e_ψ̃(u) < ∞ and [u] = κ, then

    ‖ u − ∫ u dμ ‖_{L¹(dμ)} ≤ U^κ( e_ψ̃(u) ).

We note that an optimality result as in Theorem 4.7 also holds for U^κ. As an example consider the function ψ̃(σ) = |σ|, which is strictly convex at κ = 0. In this case we get e_{|·|}(u) = ∫ |u| dμ and U⁰(c) = c, which trivially gives for all u ∈ L¹(dμ) with [u] = 0

    ‖u‖_{L¹(dμ)} ≤ U⁰( e_{|·|}(u) ) = U⁰( ∫ |u| dμ ) = ∫ |u| dμ.

4.3 Comparison with the classical Csiszar-Kullback inequality

Inequality (4.4) deepens the insight into the classical Csiszar-Kullback inequality in four ways. Firstly, inequality (4.4) is valid for all functions u with e_ψ(u) < ∞, i.e. no normalizing condition ∫ u dμ = 1 is assumed, and u may be negative. Secondly, we have an optimality result for the function U_ψ. Thirdly, it has been proven that U_ψ is strictly increasing on [0, c0(κ)). Fourthly, the definition of U_ψ is rather straightforward and requires no measure-theoretic background but only Jensen's inequality.

4.4 An example

The calculation of U_ψ involves a minimizing procedure. If ψ is differentiable this will lead to the problem of finding roots of transcendental equations. Hence there is no hope for an explicit presentation of U_ψ in general. In some cases, however, the function U_ψ can be found explicitly, at least for special values of [u] = ∫ u dμ. If ψ(σ) = |σ − 1|^{2p}, p ∈ [1,∞), on J = ℝ, we have for all admissible u with [u] = 1:

    ‖u − 1‖_{L¹(dμ)} ≤ ( ∫ |u − 1|^{2p} dμ )^{1/(2p)}.

In the situation described in Remark 4.4 this becomes, with the additional requirement ∫ f dx = 1:

    ‖f − g‖_{L¹(dx)} ≤ ( ∫ |f − g|^{2p} g^{1−2p} dx )^{1/(2p)}.

For the special case of p = 1 and [u] = 1 the above two inequalities read:

    ‖u − 1‖_{L¹(dμ)} ≤ ( ∫ (u² − 1) dμ )^{1/2}

and, respectively,

    ‖f − g‖_{L¹(dx)} ≤ ( ∫ (f − g)² g^{−1} dx )^{1/2}.
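The p = 1 bound is simply the Cauchy-Schwarz inequality applied to |f − g| = (|f − g| g^{−1/2}) g^{1/2}; a quick discrete check with arbitrary distributions (our own sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
g = rng.random(16) + 0.05; g /= g.sum()     # strictly positive reference density
f = rng.random(16) + 0.05; f /= f.sum()     # both normalized to total mass 1

l1  = float(np.abs(f - g).sum())
chi = float(np.sum((f - g)**2 / g))         # discrete analogue of int (f-g)^2 g^{-1} dx
print(l1, np.sqrt(chi))
assert l1 <= np.sqrt(chi) + 1e-12           # ||f - g||_1 <= (int (f-g)^2 g^{-1} dx)^{1/2}
```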

4.5 Asymptotic behavior of U_ψ as e_ψ(u) → 0

As mentioned before, U_ψ can in general not be expressed in terms of elementary functions. For `small' values of e_ψ(u), however, one can obtain an asymptotic expansion of U_ψ. Let us recall the following result of [Csi67]: If ψ ∈ C²(J), 1 ∈ J° and ψ''(1) > 0, then the Csiszar-Kullback inequality holds:

    ∃ K1, K2 > 0 : ∀ u ∈ L¹(dμ), u ≥ 0, [u] = 1 :
    if e_ψ(u) ≤ K2, then ‖u − 1‖_{L¹(dμ)} ≤ K1 √(e_ψ(u)).                       (4.6)

In [Csi67] the investigations are specialized to the case ψ(σ) = σ ln σ. A stronger result holds in this case: for all u ∈ L¹(dμ) with u ≥ 0, ∫ u dμ = 1:

    ‖u − 1‖_{L¹(dμ)} ≤ √( 2 ∫ u ln u dμ ).

We note that ψ''(1) > 0 was essential to get (4.6). Otherwise one can, in general, not get an estimate of the form ‖u − 1‖_{L¹(dμ)} ≤ K (e_ψ(u))^γ, K > 0, γ > 0, even for small values of e_ψ(u) (to keep things simple we consider only the case ∫ u dμ = 1).

Example 4.10. Let Ω = [0,1], dμ = dx. Let

    ψ: ℝ → ℝ,   σ ↦ ∫_1^σ φ(s) ds,

with

    φ: ℝ → ℝ,   σ ↦ { e^{−1}(σ − 1)    if σ ∈ (−∞, 0],
                    { −e^{−1/(1−σ)}    if σ ∈ (0, 1),
                    { 0                if σ = 1,
                    { e^{−1/(σ−1)}     if σ ∈ (1, 2),
                    { e^{−1}(σ − 1)    if σ ∈ [2, ∞).

Certainly, ψ is bounded below, strictly convex and belongs to C²(ℝ) with ψ(1) = ψ'(1) = ψ''(1) = 0. For k ∈ ℕ let

    u_k: [0,1] → ℝ,   x ↦ { 1 − (1/k)   if x ∈ [0, 1/2],
                          { 1 + (1/k)   if x ∈ (1/2, 1].

Then [u_k] = 1 and ‖u_k − 1‖_{L¹(dx)} = 1/k, whereas e_ψ(u_k) = ½[ψ(1 − 1/k) + ψ(1 + 1/k)] tends to zero faster than any power of 1/k, since ψ vanishes to infinite order at σ = 1. Hence no estimate of the above form can hold.

0, only depending on I and on the parameters of the problem (5.1) such that

k (t) ; ~1kL

dxdP ())

1(

 Ke;t 8t > 0:

Clearly, ~1 is a solution of the equation Z ~ ) 0 (x  % 1 R ~1(x  ) = (y  % ~ )dy n I (y  )dy: 1 IR IRn 0 66

(5.15)

If W (x) = W (;x) 8x 2 IRn holds, then a solution of (5.15) can be easily constructed. We set z = sm = 0 and 

~1(x  ) = N exp ; D1 (W (x) + 2 jxj2 )

Z

IRn

I (y  )dy

(5.16)

where N = IRn exp ; D1 (W (y) + 2 jyj2) dy ;1. It is immediately veried that this function solves the equation (5.15). We remark that convergence of (t) to a steady state ~1 without a convergence rate can still be shown if the single oscillator potential W (x) is not uniformly convex but satises only certain mild growth conditions (cf. ArBoMa95]). Then !1 it was shown in ArBoMa95] that I (t) t;! 0 still holds (without a rate). The method used above implies immediately ;R

;





!1 0 e( (t)j 1) t;!

!1 and (t) t;! ~1 in L1 (dxdP ( )) follows.

5.2 The drift-di usion-Poisson model Secondly, we consider the drift-diusion model with Poisson-coupling for an electron gas Mar86], MaRiSc90]

t = div



r + r jx2j 

2

 

+ V  x 2 IRn t > 0

(t = 0) = I  0 on

IRn

;&V = :

Z

IRn

I (x)dx = 1

(5.17a) (5.17b) (5.17c)

Here jx2j acts as a conning potential. For the sake of simplicity we only consider the case of at least 3 dimensions, i.e. n  3 and take the Newtonian solution of (5.17c): 2

Z 1 V (x t) = (n ; 2)S n jx ;(yyjtn);2 dy (5.17d) n IR with Sn being the surface area of the unit sphere in IRn: An existence-uniqueness result (for a global solution) of (5.17) is easily obtained by proceeding in analogy

67

to the bounded domain case (cf. e.g. Gaj85]). The steady state of (5.17) is the unique solution of the mean-eld equation

1(x) := exp



 Z j x j ; 2 ; V1(x) IR 2

exp n

 j y j ; 2 ; V1(y) dy



2

Z 1 V1(x) = (n ; 2)S n jx ;1y(yjn);2 dy: n IR

(5.18a) (5.18b)

A detailed analysis of equations of the type (5.18) can be found in Dol91]. We shall now show that (t) converges exponentially to 1 by using logarithmic Sobolev inequalities. We start by dening the relative entropy type functional Z





Z 1 e := n ln dx + 2 n jr(V ; V1)j2dx (5.19) 1 IR IR (cf. Gaj85]) and compute its time-derivative:   d e(t) = ;Z (t)jr ln (t) j2 dx (5.20) dt N (t) IRn where we denoted the t-local state   Z   2 2 j x j j y j N (t) = exp ; 2 ; V (x t) exp ; 2 ; V (y t) dy: (5.21) IRn We remark that the calculation which leads to (5.20) is completely analogous to Gaj85] making appropriate use of the Poisson equations (5.17d), (5.18b). Assume now that V (t) is in L1(IRn) uniformly in t, i.e. there is a K > 0 such that

kV (t)kL1 IR  K t  0 ( n)

(5.22)

(which shall be proven later on under additional assumptions on I ). Then the logarithmic Sobolev inequality (2.70) and the perturbation result of Theorem 3.1 imply the existence of 1 > 0 (independent of t) such that Z Z

1 ln N dx  2 jr ln( N )j2dx t  0: (5.23) 1 We use (5.23) in (5.20) and obtain 



d e(t)  ;2 Z ln dx = ;2 Z ln dx ; 2 Z ln 1 dx: 1 1 1 dt N 1 N IRn IRn IRn 68

Since





ln 1 = V (t) ; V1 + ln N (t)

Z

IRn

eV1;V (t)

1 dx



we have     d e(t)  ;2 Z ln dx ; 2 Z (V (t) ; V ) dx + 2 ; ln(Z eV1 ;V (t) dx) 1 1 1 1 dt 1 IRn IRn IRn R (using IRn (y t)dy = 1 8t > 0): The convexity of f (u) = ; ln u implies

; ln

Z

IRn

and



eV1 ;V (t)

1 dx





Z

IRn

(V (t) ; V1) 1dx



d e(t)  ;2 Z ln dx ; 2 Z (V (t) ; V )( (t) ; )dx 1 1 1 1 dt 1 IRn IRn follows. Now we use (t) ; 1 = ;&(V (t) ; V1) such that   d e(t)  ;2 Z (t) ln (t) dx ; 2 Z jr(V (t) ; V )j2 dx 1 1 1 dt 1 IRn  ;2 1e(t): We conclude exponential convergence in relative entropy:   Z Z ( t ) 1 (t) ln dx + 2 n jr(V (t) ; V1)j2dx n 1 IR Z  IR  Z 1 I ; 21 t 2 (5.24)  e I ln dx + 2 n jr(VI ; V1)j dx n 1

IR

IR

(where, obviously, VI is the Newtonian solution of ;&VI = I ) and exponential L1 (IRn)-convergence of (t) to 1 follows from the Csiszar-Kullback inequality. We are left with proving the bound (5.22). Therefore we proceed as in FaStr] and multiply (5.17a) by q  q 2 IN: Integration by parts and using (5.17c) gives 1 Z q+1 dx = ; 4q Z jr( q+12 )j2dx + qn Z q+1 dx ; q Z q+2 dx: q + 1 IRn q + 1 IRn q + 1 IRn q + 1 IRn Applying the Nash-inequality Nas58] Z

1+ 2

f dx n 2

IR

n

 an

Z

4 Z n

jf jdx IR n

69

jrf j dx IR 2

n

for some an > 0 to the rst term on the right hand side (with f = q+12 ) gives

d z  ; 4q (zq+1)1+ n2 + nqz  q+1 dt q+1 an (z q+1 ) n4 2

where we set

zq :=

Z

q dx: n

IR

It is easy to conclude that zq+1 (t) is uniformly bounded for t > 0 if zq+1(0) < 1: By interpolation we nd (t) 2 Lp+ (IRn) uniformly for t > 0 if I 2 Lp+ (IRn) \ L1+(IRn): Since the Newtonian potential of a function in Lp(IRn) is in L1(IRn) if p > n2 we obtain Theorem 5.2. Let I 2 L1+(IRn) \ Lp+(IRn) for some p > n2 : Then there is a constant 1 > 0 such that the exponential convergence (5.24) of the relative entropy holds for the solution ( (t) V (t)) of (5.17). We refer to ArMaTo98] for a detailed analysis of bipolar models of the form (5.17) (i.e. with two types of carriers).

6 Appendix: Proofs of Section 4 Proof of Theorem 4.5:

Step 1: We shall rst construct an auxiliary function . We introduce M := f( a #) 2 J  (0 1)  (0 1) : a < minf#( ; ) (1 ; #)( ; )gg: We note that M 6=  consists exactly of those ( a #) 2 J  (0 1)  (0 1) for which  + (a=#) and  ; (a=(1 ; #)) belongs to J . We dene  : M ! IR    



a a ( a #) 7! #  + ; () + (1 ; #)  ; ; () : # 1;# For xed  2 J let Pra(M % ) be the projection of M onto the a-axes and for xed  2 J and xed a 2 Pra (M % ) let Pr#(M %  a) be the projection of M onto the #-axis. Then we can easily verify that     a a U sup ( )  Pr# (M %  a) =  ;   1 ;  ;  : Pra (M % ) = 0 2 70

Due to the assumed strict convexity of we observe that for all ( a) 2 J 

0 Usup2 ( ) :

( a #) !  ;   + #a ; ()  ; 1;a # ; () = #2Prinf(M  a) a + a a # # 1;# > 0 inf

#2Pr# (M  a)  ;

Step 2: We dene



H^ : J  0 Usup2 ( ) ( a)



! !7

(0 1) inf ( a #) #2Pr (M  a) #

We note that H^ is dened as the inmum of a bounded-below, continuous function (recall that each convex function is continuous on the interior of its domain) over an open set where two entries are xed. Hence H^ is continuous. Furthermore, by the strict convexity of , the function

Usup ( ) ^ ! (0 1) H ( :) : 0 2 a 7! H^ ( a) is for all xed  2 J strictly increasing. It is not very di!cult to check that for all  2 J : lim H^ ( a) = 0  lim H^ ( a) = c0 (): Usup () a!0 a!

For xed  BeLo76])

2

2 J we can therefore dene the generalized inverse mapping (see 



I (: ) : (0 1) ! 0 Usup2() 8

;1 > ^ H (  : ) (c) if c 2 (0 c0()) < c 7! > U () sup : if c 2 c0 ( 1) 2 We note that I (: ) is increasing, strictly increasing on (0 c0()) and that the mapping ( c) 7! I (c ),  2 J , c 2 (0 1) is continuous. Furthermore it is easy to check that limc!0 I (c ) = 0. Let us note that Usup() : lim I ( c  ) = c!c0( ) 2 71

Step 3: Now we are in the position to dene U.

U : 0 1)  J

! !7

IR 8

2I (c ) if c 2 0 c0())  2 J Usup() if c 2 c0 () 1)  2 J (c ) : 0 if  2 @J \ J The verication of P1. -P7. follows from the previous remarks and can therefore be left to the reader.
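The construction of Steps 1–3 can be carried out numerically. The sketch below (ours) does this for the logarithmic generator ψ(σ) = σ ln σ − σ + 1 on J = (0,∞) (so α = 0, β = ∞, Usup(κ) = 2κ, and c0(κ) = ∞ since Q_ψ⁺(κ) = ∞): it approximates Ĥ(κ, a) by minimizing Φ over a grid in θ, inverts a ↦ Ĥ(κ, a) by bisection to obtain I_ψ(c, κ), and compares U_ψ(c, 1) = 2 I_ψ(c, 1) with the classical Csiszar-Kullback bound √(2c) of (4.2):

```python
import math

def psi(s):                                     # psi(sigma) = sigma*ln(sigma) - sigma + 1
    return s * math.log(s) - s + 1

def H(kappa, a, m=4000):
    """Grid approximation of H^(kappa, a) = inf_theta Phi(kappa, a, theta);
    here alpha = 0 and beta = inf, so theta ranges over (0, 1 - a/kappa)."""
    best = float("inf")
    top = 1.0 - a / kappa
    for i in range(1, m):
        theta = top * i / m
        lo = kappa - a / (1.0 - theta)
        if lo <= 0.0:
            continue
        phi = (theta * (psi(kappa + a / theta) - psi(kappa))
               + (1.0 - theta) * (psi(lo) - psi(kappa)))
        best = min(best, phi)
    return best

def U(c, kappa=1.0):
    """U_psi(c, kappa) = 2 * I_psi(c, kappa); the inverse of a -> H(kappa, a)
    is found by bisection on a in (0, Usup(kappa)/2) = (0, kappa)."""
    lo, hi = 0.0, kappa * (1.0 - 1e-9)
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if H(kappa, mid) < c else (lo, mid)
    return lo + hi                              # = 2 * midpoint

for c in (0.01, 0.1, 0.5):
    u_val, classical = U(c), math.sqrt(2 * c)
    print(c, u_val, classical)
    assert 0.0 < u_val <= classical + 1e-3      # (4.4) is at least as sharp as (4.2)
```

Since the grid infimum over-estimates Ĥ, the computed U_ψ slightly under-estimates the exact value; in all cases it stays below the classical bound √(2c), consistent with the optimality of Theorem 4.7.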