CONVEX FUNCTIONS A Contemporary Approach 2nd Edition Constantin P. Niculescu and Lars-Erik Persson Preliminary version, July 11, 2017
Contents
1 Convex Functions on Intervals
1.1 Convex Functions at First Glance
1.2 Consequences of Young’s Inequality
1.3 Log-convex Functions
1.4 Smoothness Properties of Convex Functions
A Generalized Convexity on Intervals
A.1 Means
A.2 Convexity According to a Pair of Means
A.3 A Case Study: Convexity According to the Geometric Mean
Bibliography
Index
Chapter 1
Convex Functions on Intervals

The study of convex functions of one real variable offers an excellent glimpse of the beauty and fascination of advanced mathematics. The reader will find here a large variety of results based on simple and intuitive arguments that have remarkable applications. At the same time they provide the starting point of deep generalizations in the setting of several variables, that will be discussed in the next chapters.
1.1
Convex Functions at First Glance
Throughout this book, I denotes a nondegenerate interval (that is, an interval containing infinitely many points).

1.1.1 Definition A function f : I → R is called convex if
\[ f((1 - \lambda)x + \lambda y) \le (1 - \lambda) f(x) + \lambda f(y) \tag{1.1} \]
for all points x and y in I and all λ ∈ [0, 1]. It is called strictly convex if the inequality (1.1) holds strictly whenever x and y are distinct points and λ ∈ (0, 1). If −f is convex (respectively, strictly convex) then we say that f is concave (respectively, strictly concave). If f is both convex and concave, then f is said to be affine.

The affine functions on intervals are precisely the functions of the form mx + n, for suitable constants m and n. One can easily prove that the following three functions are convex (though not strictly convex): the positive part $x^+ = \max\{x, 0\}$, the negative part $x^- = \max\{-x, 0\}$, and the absolute value $|x| = \max\{-x, x\}$. Together with the affine functions they provide the building blocks for the entire class of convex functions on intervals.
Simple computations show that the square function $x^2$ is strictly convex on R and the square root function $\sqrt{x}$ is strictly concave on R+. In many cases of interest the convexity is established via the second derivative test. Some other criteria of convexity related to basic theory of convex functions will be presented in what follows.

The convexity of a function f : I → R means geometrically that the points of the graph of $f|_{[u,v]}$ are under (or on) the chord joining the endpoints (u, f(u)) and (v, f(v)), for all u, v ∈ I, u < v; see Fig. 1.1. Thus the inequality (1.1) is equivalent to
\[ f(x) \le f(u) + \frac{f(v) - f(u)}{v - u}\,(x - u) \tag{1.2} \]
for all x ∈ [u, v], and all u, v ∈ I, u < v.
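The chord condition (1.2) lends itself to a quick numerical sanity check. The sketch below is an illustration only and is not part of the original text; the helper names `chord_gap` and `is_below_chords` are hypothetical, and a finite sample of distinct points can refute convexity but never prove it.

```python
def chord_gap(f, u, v, x):
    """Chord value at x minus f(x); nonnegative when (1.2) holds at x."""
    slope = (f(v) - f(u)) / (v - u)
    return f(u) + slope * (x - u) - f(x)


def is_below_chords(f, points, tol=1e-12):
    """Test inequality (1.2) on every triple u < x < v of the sample.

    Assumes the sample points are distinct. Returns False as soon as
    one point of the graph lies strictly above a chord.
    """
    pts = sorted(points)
    for i, u in enumerate(pts):
        for k in range(i + 2, len(pts)):
            v = pts[k]
            for x in pts[i + 1:k]:
                if chord_gap(f, u, v, x) < -tol:
                    return False
    return True
```

For instance, on a grid of [−2, 2] the square function and the absolute value pass the test, while the cube fails it (take u = −2, v = 0, x = −1).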
Figure 1.1: Convex function: the graph is under the chord.

This remark shows that the convex functions are majorized by affine functions on any compact subinterval. The existence of affine minorants will be discussed in Section 1.5.

Every convex function f is bounded on each compact subinterval [u, v] of its interval of definition. In fact, f(x) ≤ M = max{f(u), f(v)} on [u, v] and writing an arbitrary point x ∈ [u, v] as x = (u + v)/2 + t for some t with |t| ≤ (v − u)/2, we easily infer that
\[ f(x) = f\Big(\frac{u+v}{2} + t\Big) \ge 2 f\Big(\frac{u+v}{2}\Big) - f\Big(\frac{u+v}{2} - t\Big) \ge 2 f\Big(\frac{u+v}{2}\Big) - M. \]

1.1.2 Theorem A convex function f : I → R is continuous at each interior point of I.

Proof. Suppose that a ∈ int I and choose ε > 0 such that [a − ε, a + ε] ⊂ I. Then
\[ f(a) \le \frac{1}{2} f(a - \varepsilon) + \frac{1}{2} f(a + \varepsilon), \]
and
\[ f(a \pm t\varepsilon) = f((1 - t)a + t(a \pm \varepsilon)) \le (1 - t) f(a) + t f(a \pm \varepsilon) \]
for every t ∈ [0, 1]. Therefore
\[ t\,(f(a \pm \varepsilon) - f(a)) \ge f(a \pm t\varepsilon) - f(a) \ge -t\,(f(a \mp \varepsilon) - f(a)), \]
which yields
\[ |f(a \pm t\varepsilon) - f(a)| \le t \max\{|f(a - \varepsilon) - f(a)|,\ |f(a + \varepsilon) - f(a)|\} \]
for every t ∈ [0, 1]. The continuity of f at a is now clear.

Simple examples such as f(x) = 0 if x ∈ (0, 1), and f(0) = f(1) = 1, show that upward jumps can appear at the endpoints of the interval of definition of a convex function. Fortunately, these possible discontinuities are removable:

1.1.3 Proposition If f : [a, b] → R is a convex function, then the limits $f(a+) = \lim_{x \searrow a} f(x)$ and $f(b-) = \lim_{x \nearrow b} f(x)$ exist in R and
\[ \tilde f(x) = \begin{cases} f(a+) & \text{if } x = a \\ f(x) & \text{if } x \in (a, b) \\ f(b-) & \text{if } x = b \end{cases} \]
is a continuous convex function.

This result is a consequence of the following:

1.1.4 Lemma If f : I → R is convex, then either f is monotonic on int I, or there exists a point ξ ∈ int I such that f is nonincreasing on the interval (−∞, ξ] ∩ I and nondecreasing on the interval [ξ, ∞) ∩ I.

Proof. Choose a < b arbitrarily among the interior points of I and put m = inf{f(x) : x ∈ [a, b]}. Since f is continuous on [a, b], this infimum is attained at a point ξ ∈ [a, b], that is, m = f(ξ). If a ≤ x < y < ξ, then y is a convex combination of x and ξ, precisely,
\[ y = \frac{\xi - y}{\xi - x}\, x + \frac{y - x}{\xi - x}\, \xi. \]
Since f is convex,
\[ f(y) \le \frac{\xi - y}{\xi - x}\, f(x) + \frac{y - x}{\xi - x}\, f(\xi) \le f(x) \]
and thus f is nonincreasing on the interval [a, ξ]. If ξ < b, a similar argument shows that f is nondecreasing on [ξ, b]. The proof ends with a gluing process (to the left of a and the right of b), observing that the property of convexity makes impossible the existence of three points u < v < w in I such that f(u) < f(v) > f(w).
1.1.5 Corollary (a) Every convex function f : I → R which is not monotonic on int I has an interior global minimum.
(b) If a convex function f : R → R is bounded from above, then it is constant.

Attaining supremum at endpoints is not a property characteristic of convex functions, but a variant of it does the job.

1.1.6 Theorem (S. Saks [339]) Let f be a real-valued function defined on an interval I. Then f is (strictly) convex if and only if for every compact subinterval J of I, and every affine function L, the supremum of f + L on J is attained at an endpoint (and only there).

This statement remains valid if the perturbations L are supposed to be linear (that is, of the form L(x) = mx for suitable m ∈ R). See Exercise 1 for examples of convex functions derived from Theorem 1.1.6.

Proof. We will restrict ourselves to the case of convex functions. The case of strictly convex functions can be treated in the same manner.

Necessity: If f is convex, so is the sum F = f + L. Since every point of a subinterval J = [x, y] is a convex combination z = (1 − λ)x + λy of x and y, we have
\[ \sup_{z \in J} F(z) = \sup_{\lambda \in [0,1]} F((1 - \lambda)x + \lambda y) \le \sup_{\lambda \in [0,1]} \big[(1 - \lambda)F(x) + \lambda F(y)\big] = \max\{F(x), F(y)\}. \]

Sufficiency: Given a compact subinterval J = [x, y] of I, there exists an affine function L(x) = mx + n which agrees with f at the two endpoints x and y. Then
\[ \sup_{\lambda \in [0,1]} \big[(f - L)((1 - \lambda)x + \lambda y)\big] = 0, \]
which yields
\[ 0 \ge f((1 - \lambda)x + \lambda y) - L((1 - \lambda)x + \lambda y) = f((1 - \lambda)x + \lambda y) - (1 - \lambda)L(x) - \lambda L(y) = f((1 - \lambda)x + \lambda y) - (1 - \lambda)f(x) - \lambda f(y) \]
for every λ ∈ [0, 1].

1.1.7 Remark Call a function f : I → R quasiconvex if
\[ f((1 - \lambda)x + \lambda y) \le \max\{f(x), f(y)\} \]
for all x, y ∈ I and λ ∈ [0, 1]. For example, so are the monotonic functions and the functions that admit interior points c ∈ I such that they are nonincreasing on (−∞, c] ∩ I and nondecreasing on [c, ∞) ∩ I. Unlike the case of convex functions, the sum of a quasiconvex function and a linear function may not be quasiconvex.
See the case of the sum $x^3 - 3x$. Related to quasiconvex functions are the quasiconcave functions, characterized by the property
\[ f((1 - \lambda)x + \lambda y) \ge \min\{f(x), f(y)\} \]
for all x, y ∈ I and λ ∈ [0, 1]. See Exercises 11 and 12 for more information.

The following characterization of convexity within the class of continuous functions also proves useful in checking convexity.

1.1.8 Theorem (J. L. W. V. Jensen [172]) A function f : I → R is convex if and only if it verifies the following two conditions:
(a) f is continuous at each interior point of I; and
(b) f is midpoint convex, that is,
\[ f\Big(\frac{x + y}{2}\Big) \le \frac{f(x) + f(y)}{2} \quad \text{for all } x, y \in I. \]

Proof. The necessity follows from Theorem 1.1.2. The sufficiency is proved by reductio ad absurdum. If f were not convex, then there would exist a subinterval [a, b] such that the graph of $f|_{[a,b]}$ is not under the chord joining (a, f(a)) and (b, f(b)); that is, the function
\[ \varphi(x) = -f(x) + f(a) + \frac{f(b) - f(a)}{b - a}\,(x - a), \quad x \in [a, b], \]
verifies γ = inf{φ(x) : x ∈ [a, b]} < 0. Notice that −φ is midpoint convex, continuous and φ(a) = φ(b) = 0. Put c = inf{x ∈ [a, b] : φ(x) = γ}; then necessarily φ(c) = γ and c ∈ (a, b). By the definition of c, for every h > 0 for which c ± h ∈ (a, b) we have
\[ \varphi(c - h) > \varphi(c) \quad \text{and} \quad \varphi(c + h) \ge \varphi(c), \]
so that
\[ -\varphi(c) > \frac{-\varphi(c - h) - \varphi(c + h)}{2}, \]
in contradiction with the fact that −φ is midpoint convex.
1.1.9 Corollary Let f : I → R be a continuous function. Then f is convex if and only if
\[ f(x + h) + f(x - h) - 2f(x) \ge 0 \]
for all x ∈ I and all h > 0 such that both x + h and x − h are in I.

Notice that both Theorem 1.1.8 and its Corollary 1.1.9 above have straightforward variants in the case of strictly convex functions. Corollary 1.1.9 allows us to check immediately the strict convexity/concavity of some very common functions, such as the exponential, the logarithmic function, the restriction of the sine function to [0, π]. Indeed, in the first case, the fact that
\[ a, b > 0,\ a \ne b, \quad \text{implies} \quad \frac{a + b}{2} > \sqrt{ab} \]
is equivalent to $e^{x+h} + e^{x-h} - 2e^{x} > 0$ for all x ∈ R and all h > 0. Many other examples can be deduced using the closure under functional operations with convex/concave functions.

1.1.10 Proposition (The operations with convex functions)
(a) Adding two convex functions (defined on the same interval) we obtain a convex function; if one of them is strictly convex, then the sum is also strictly convex.
(b) Multiplying a (strictly) convex function by a positive scalar we obtain also a (strictly) convex function.
(c) Suppose that f and g are two positive convex functions defined on an interval I. Then their product is convex provided that they are synchronous in the sense that (f(x) − f(y))(g(x) − g(y)) ≥ 0 for all x, y ∈ I; for example, this condition occurs if f and g are both nonincreasing or both nondecreasing.
(d) The restriction of every (strictly) convex function to a subinterval of its domain is also a (strictly) convex function.
(e) If f : I → R is a convex (respectively a strictly convex) function and g : R → R is a nondecreasing (respectively an increasing) convex function, then g ◦ f is convex (respectively strictly convex).
(f) Suppose that f is a bijection between two intervals I and J. If f is increasing, then f is (strictly) convex if and only if $f^{-1}$ is (strictly) concave. If f is a decreasing bijection, then f and $f^{-1}$ are of the same type of convexity.
(g) The maximum of two convex functions f, g : I → R, max{f, g}(x) = max{f(x), g(x)}, is also a convex function.
(h) The superposition f(ax + b), of a convex function f and an affine function ax + b, is a convex function.

What if a function is convex? A first way to emphasize the usefulness of convexity is the extension of the inequality of convexity (1.1) to arbitrarily long finite families of points. The basic remark in this respect is the fact that the intervals are closed under arbitrary convex combinations, that is,
\[ \sum_{k=1}^{n} \lambda_k x_k \in I \]
for all $x_1, \dots, x_n \in I$ and all $\lambda_1, \dots, \lambda_n \in [0, 1]$ with $\sum_{k=1}^{n} \lambda_k = 1$. This can be proved by induction on the number n of points involved in the convex
combinations. The case n = 1 is trivial, while for n = 2 it follows from the definition of a convex set. Assuming the result is true for all convex combinations with at most n ≥ 2 points, let us pass to the case of combinations with n + 1 points, $x = \sum_{k=1}^{n+1} \lambda_k x_k$. The nontrivial case is when all coefficients $\lambda_k$ lie in (0, 1). But in this case, due to our induction hypothesis, x can be represented as a convex combination of two elements of I,
\[ x = (1 - \lambda_{n+1}) \Big( \sum_{k=1}^{n} \frac{\lambda_k}{1 - \lambda_{n+1}}\, x_k \Big) + \lambda_{n+1} x_{n+1}, \]
hence x belongs to I.

The above remark has a notable counterpart for convex functions:

1.1.11 Lemma (The discrete case of Jensen’s inequality) A real-valued function f defined on an interval I is convex if and only if for all points $x_1, \dots, x_n$ in I and all scalars $\lambda_1, \dots, \lambda_n$ in [0, 1] with $\sum_{k=1}^{n} \lambda_k = 1$ we have
\[ f\Big( \sum_{k=1}^{n} \lambda_k x_k \Big) \le \sum_{k=1}^{n} \lambda_k f(x_k). \]
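If f is strictly convex, the above inequality is strict if the points $x_k$ are not all equal and the scalars $\lambda_k$ are positive.

Proof. The first assertion follows by mathematical induction. As concerns the second assertion, suppose that the function f is strictly convex and
\[ f\Big( \sum_{k=1}^{n} \lambda_k x_k \Big) = \sum_{k=1}^{n} \lambda_k f(x_k) \tag{1.3} \]
for some points $x_1, \dots, x_n \in I$ and some scalars $\lambda_1, \dots, \lambda_n \in (0, 1)$ that sum up to 1. If $x_1, \dots, x_n$ are not all equal, the set $S = \{k : x_k < \max\{x_1, \dots, x_n\}\}$ will be a proper subset of {1, ..., n} and $\lambda_S = \sum_{k \in S} \lambda_k \in (0, 1)$. Since f is strictly convex, we get
\[
\begin{aligned}
f\Big( \sum_{k=1}^{n} \lambda_k x_k \Big)
&= f\Big( \lambda_S \sum_{k \in S} \frac{\lambda_k}{\lambda_S}\, x_k + (1 - \lambda_S) \sum_{k \notin S} \frac{\lambda_k}{1 - \lambda_S}\, x_k \Big) \\
&< \lambda_S\, f\Big( \sum_{k \in S} \frac{\lambda_k}{\lambda_S}\, x_k \Big) + (1 - \lambda_S)\, f\Big( \sum_{k \notin S} \frac{\lambda_k}{1 - \lambda_S}\, x_k \Big) \\
&\le \lambda_S \sum_{k \in S} \frac{\lambda_k}{\lambda_S}\, f(x_k) + (1 - \lambda_S) \sum_{k \notin S} \frac{\lambda_k}{1 - \lambda_S}\, f(x_k) = \sum_{k=1}^{n} \lambda_k f(x_k),
\end{aligned}
\]
which contradicts our hypothesis (1.3). Here the first inequality is strict because the two inner convex combinations are distinct points, while the second one follows from the first assertion. Therefore all points $x_k$ should coincide.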
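The discrete Jensen inequality of Lemma 1.1.11 can be illustrated numerically. The following sketch is not part of the original text; the function name `jensen_gap` is hypothetical, and the sample points and weights used with it are arbitrary choices.

```python
import math


def jensen_gap(f, xs, weights):
    """Return sum(w_k * f(x_k)) - f(sum(w_k * x_k)).

    The weights must be nonnegative and sum to 1. By Lemma 1.1.11 the
    result is nonnegative when f is convex and nonpositive when f is
    concave.
    """
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be nonnegative and sum to 1")
    mean = sum(w * x for w, x in zip(weights, xs))
    return sum(w * f(x) for w, x in zip(weights, xs)) - f(mean)
```

With the points 0.5, 1, 2, 4 and the weights 0.1, 0.2, 0.3, 0.4, the gap is strictly positive for the exponential (strictly convex) and strictly negative for the logarithm (strictly concave); it vanishes, up to rounding, when all points coincide.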
The extension of Lemma 1.1.11 to the case of arbitrary finite measure spaces will be discussed in Section 1.6. An immediate consequence of Lemma 1.1.11 (when applied to the exponential function) is the following result which extends the well-known AM-GM inequality (that is, the arithmetic mean–geometric mean inequality):

1.1.12 Theorem (The weighted form of the AM-GM inequality; L. J. Rogers [334]) If $x_1, \dots, x_n \in (0, \infty)$ and $\lambda_1, \dots, \lambda_n \in (0, 1)$, $\sum_{k=1}^{n} \lambda_k = 1$, then
\[ \sum_{k=1}^{n} \lambda_k x_k > x_1^{\lambda_1} \cdots x_n^{\lambda_n} \]
unless $x_1 = \cdots = x_n$.

Replacing $x_k$ by $1/x_k$ in the last inequality we get
\[ x_1^{\lambda_1} \cdots x_n^{\lambda_n} > 1 \Big/ \sum_{k=1}^{n} \frac{\lambda_k}{x_k} \]
unless $x_1 = \cdots = x_n$. This represents the weighted form of the geometric mean–harmonic mean inequality (that is, of GM-HM inequality). For $\lambda_1 = \cdots = \lambda_n = 1/n$ we recover the usual AM-GM-HM inequality, which asserts that for every family $x_1, \dots, x_n$ of positive numbers, not all equal to each other, we have
\[ \frac{x_1 + \cdots + x_n}{n} > \sqrt[n]{x_1 \cdots x_n} > \frac{n}{\frac{1}{x_1} + \cdots + \frac{1}{x_n}}. \]

A stronger concept of convexity, called logarithmic convexity (or log-convexity) will be discussed in Section 1.3.
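As a small worked illustration of Theorem 1.1.12 and of the AM-GM-HM chain above, the three weighted means can be computed side by side. This is a sketch added for illustration; the name `weighted_means` is hypothetical, and the geometric mean is computed through logarithms, which assumes strictly positive inputs.

```python
import math


def weighted_means(xs, weights):
    """Return the weighted arithmetic, geometric and harmonic means.

    Assumes every x_k > 0 and weights that are positive and sum to 1.
    """
    am = sum(w * x for w, x in zip(weights, xs))
    gm = math.exp(sum(w * math.log(x) for w, x in zip(weights, xs)))
    hm = 1.0 / sum(w / x for w, x in zip(weights, xs))
    return am, gm, hm
```

For x = (1, 2, 4) with equal weights 1/3 the three means are 7/3, 2 and 12/7, in strictly decreasing order; for equal points they coincide.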
Exercises

1. Infer from Theorem 1.1.6 that the following differentiable functions are strictly convex:
(a) − sin x on [0, π] and − cos x on [−π/2, π/2];
(b) $x^p$ on [0, ∞) if p > 1; $x^p$ on (0, ∞) if p < 0; $-x^p$ on [0, ∞) if p ∈ (0, 1);
(c) $(1 + x^p)^{1/p}$ on [0, ∞) if p > 1.

2. (New from old) Assume that f(x) is a (strictly) convex function on an interval I ⊂ (0, ∞). Prove that xf(1/x) is (strictly) convex on any interval J for which x ∈ J implies 1/x ∈ I. Then infer that x log x is a strictly convex function on [0, ∞) and x sin(1/x) is strictly concave on [1/π, ∞).

3. Suppose that f : R → R is a convex function and P and Q are distinct points of its graph. Prove that:
(a) if we modify the graph of f by replacing the portion joining P and Q with the corresponding linear segment, the result is still the graph of a convex function g;
(b) all points of the graph of g are on or above the secant line joining P and Q.

4. Suppose that $f_1, \dots, f_n$ are positive concave functions with the same domain of definition. Prove that $(f_1 \cdots f_n)^{1/n}$ is also a concave function.

5. (From discrete to continuous) Prove that the discrete form of Jensen’s inequality implies (and is implied by) the following integral form of this inequality: If f : [c, d] → R is a convex function and g : [a, b] → [c, d] is a Riemann integrable function, then
\[ f\Big( \frac{1}{b - a} \int_a^b g(x)\,dx \Big) \le \frac{1}{b - a} \int_a^b f(g(x))\,dx. \]

6. (Optimization without calculus) The AM–GM inequality offers a very convenient way to minimize a sum when the product of its terms is constant (or to maximize a product whose factors sum to a constant). Infer from Theorem 1.1.12 that
\[ \min_{x, y > 0} \Big( x + y + \frac{1}{x^2 y} \Big) = 4/\sqrt{2} \quad \text{and} \quad \max_{6x + 5y + z = 4} x^2 y z = \frac{1}{45}. \]

7. (a) Prove that Theorem 1.1.8 remains true if the condition of midpoint convexity is replaced by the following one: f((1 − α)x + αy) ≤ (1 − α)f(x) + αf(y) for some fixed parameter α ∈ (0, 1), and for all x, y ∈ I.
(b) Prove that Theorem 1.1.8 remains true if the condition of continuity is replaced by boundedness from above on every compact subinterval.
8. Let f : I → R be a convex function and let $x_1, \dots, x_n \in I$ (n ≥ 2). Prove that
\[ (n - 1) \Big[ \frac{f(x_1) + \cdots + f(x_{n-1})}{n - 1} - f\Big( \frac{x_1 + \cdots + x_{n-1}}{n - 1} \Big) \Big] \]
cannot exceed
\[ n \Big[ \frac{f(x_1) + \cdots + f(x_n)}{n} - f\Big( \frac{x_1 + \cdots + x_n}{n} \Big) \Big]. \]

9. (Two more proofs of the AM–GM inequality) Let $x_1, \dots, x_n > 0$ (n ≥ 2) and for each 1 ≤ k ≤ n put
\[ A_k = \frac{x_1 + \cdots + x_k}{k} \quad \text{and} \quad G_k = (x_1 \cdots x_k)^{1/k}. \]
(a) (T. Popoviciu) Prove that
\[ \Big( \frac{A_n}{G_n} \Big)^{n} \ge \Big( \frac{A_{n-1}}{G_{n-1}} \Big)^{n-1} \ge \cdots \ge \Big( \frac{A_1}{G_1} \Big)^{1} = 1. \]
(b) (R. Rado) Prove that
\[ n(A_n - G_n) \ge (n - 1)(A_{n-1} - G_{n-1}) \ge \cdots \ge 1 \cdot (A_1 - G_1) = 0. \]
[Hint: Apply the result of Exercise 8 respectively to f = − log and f = exp.]

10. (The power means in the discrete case) Let $x = (x_1, \dots, x_n)$ and $\lambda = (\lambda_1, \dots, \lambda_n)$ be two n-tuples of strictly positive numbers such that $\sum_{k=1}^{n} \lambda_k = 1$. The (weighted) power mean of order t is defined as
\[ M_t(x; \lambda) = \Big( \sum_{k=1}^{n} \lambda_k x_k^{t} \Big)^{1/t} \quad \text{for } t \in \mathbb{R} \setminus \{0\} \]
and
\[ M_0(x; \lambda) = \prod_{k=1}^{n} x_k^{\lambda_k}. \tag{1.4} \]
We also define $M_{-\infty}(x; \lambda) = \inf\{x_k : k\}$ and $M_{\infty}(x; \lambda) = \sup\{x_k : k\}$. Notice that $M_1$ is the arithmetic mean, $M_0$ is the geometric mean and $M_{-1}$ is the harmonic mean. Moreover, $M_{-t}(x; \lambda) = M_t(x^{-1}; \lambda)^{-1}$. Prove that:
(a) $M_s(x; \lambda) \le M_t(x; \lambda)$ whenever s ≤ t in R;
(b) $\lim_{t \to 0} M_t(x; \lambda) = \prod_{k=1}^{n} x_k^{\lambda_k}$;
(c) $\lim_{t \to -\infty} M_t(x; \lambda) = M_{-\infty}(x; \lambda)$ and $\lim_{t \to \infty} M_t(x; \lambda) = M_{\infty}(x; \lambda)$.
[Hint: (a) According to the weighted form of the AM-GM inequality, $M_0(x; \lambda) \le M_t(x; \lambda)$ for all t ≥ 0. Then apply Jensen’s inequality to the
function $x^{t/s}$ to infer that $M_s(x; \lambda) \le M_t(x; \lambda)$ whenever 0 < s ≤ t. To end the proof of (a) use the formula $M_{-t}(x; \lambda) = M_t(x^{-1}; \lambda)^{-1}$.]

Remark. The power means discussed above are a special case of integral power means, that will be presented in Section ??, Exercise 1. Indeed, an n-tuple $\lambda = (\lambda_1, \dots, \lambda_n)$ of strictly positive numbers that sum to unity can be identified with the discrete probability measure on the set Ω = {1, 2, ..., n} defined by $\lambda(A) = \sum_{k \in A} \lambda_k$ for A ∈ P(Ω). The λ-integrable functions x : Ω → R, $x(k) = x_k$, are nothing but the strings $x = (x_1, \dots, x_n)$ of real numbers, so that
\[ M_t(x; \lambda) = \Big( \int_{\Omega} x^{t}(k)\,d\lambda(k) \Big)^{1/t} \quad \text{for } t \in \mathbb{R} \setminus \{0\} \]
and
\[ M_0(x; \lambda) = \exp\Big( \int_{\Omega} \log x(k)\,d\lambda(k) \Big). \]

11. Prove that:
(a) a function f : I → R is quasiconvex if and only if all sublevel sets $L_\lambda = \{x : f(x) \le \lambda\}$ are intervals;
(b) the product of two positive concave functions is a quasiconcave function;
(c) if f is positive and convex, then −1/f is quasiconvex;
(d) if f is negative and quasiconvex, then 1/f is quasiconcave;
(e) if f is a positive convex function and g is a positive concave function, then f/g is quasiconvex.

12. Suppose that f is a continuous real-valued function defined on an interval I. Prove that f is quasiconvex if and only if it is either monotonic or there exists an interior point c ∈ I such that f is nonincreasing on (−∞, c] ∩ I and nondecreasing on [c, ∞) ∩ I.

13. A sequence of real numbers $a_0, a_1, \dots, a_n$ (with n ≥ 2) is said to be convex provided that
\[ \Delta^2 a_k = a_k - 2a_{k+1} + a_{k+2} \ge 0 \]
for all k = 0, ..., n − 2; it is said to be concave provided $\Delta^2 a_k \le 0$ for all k.
(a) Solve the system
\[ \Delta^2 a_k = b_k \quad \text{for } k = 0, \dots, n - 2 \]
(in the unknowns $a_k$) to prove that the general form of a convex sequence $a = (a_0, a_1, \dots, a_n)$ with $a_0 = a_n = 0$ is given by the formula
\[ a = \sum_{j=1}^{n-1} c_j w^{j}, \]
where $c_j = 2a_j - a_{j-1} - a_{j+1}$ and $w^{j}$ has the components
\[ w^{j}_{k} = \begin{cases} k(n - j)/n, & \text{for } k = 0, \dots, j \\ j(n - k)/n, & \text{for } k = j, \dots, n. \end{cases} \]
(b) Prove that the general form of a convex sequence $a = (a_0, a_1, \dots, a_n)$ is $a = \sum_{j=0}^{n} c_j w^{j}$, where $c_j$ and $w^{j}$ are as in the case (a) for j = 1, ..., n − 1. The other coefficients and components are: $c_0 = a_0$, $c_n = a_n$, $w^{0}_{k} = (n - k)/n$ and $w^{n}_{k} = k/n$ (for k = 0, ..., n).

Remark. The theory of convex sequences can be subordinated to that of convex functions. If f : [0, n] → R is a convex function, then $(f(k))_k$ is a convex sequence; conversely, if $(a_k)_k$ is a convex sequence, then the piecewise linear function f : [0, n] → R obtained by joining the points $(k, a_k)$ is convex too.
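The power means of Exercise 10 are easy to experiment with before attempting the proofs. The sketch below is an illustration and not part of the original text; the function name `power_mean` is hypothetical, and the orders ±∞ are encoded with `math.inf` as a convenience.

```python
import math


def power_mean(xs, weights, t):
    """Weighted power mean M_t(x; lambda) for strictly positive x_k.

    t = 0 gives the geometric mean (1.4); t = +inf and t = -inf give
    the largest and the smallest entry, respectively.
    """
    if t == math.inf:
        return max(xs)
    if t == -math.inf:
        return min(xs)
    if t == 0:
        return math.exp(sum(w * math.log(x) for w, x in zip(weights, xs)))
    return sum(w * x ** t for w, x in zip(weights, xs)) ** (1.0 / t)
```

Evaluating `power_mean` on a fixed pair (x, λ) for an increasing list of orders t shows the monotonicity claimed in part (a), and taking t close to 0 illustrates the limit in part (b).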
1.2
Consequences of Young’s Inequality
The following special case of the weighted form of the AM–GM inequality is known as Young’s inequality:
\[ ab \le \frac{a^p}{p} + \frac{b^q}{q} \quad \text{for all } a, b \ge 0, \tag{1.5} \]
whenever p, q ∈ (1, ∞) and 1/p + 1/q = 1; the equality holds if and only if $a^p = b^q$.

Young’s inequality can be also obtained as a consequence of strict convexity of the exponential function. In fact,
\[ ab = e^{\log ab} = e^{(1/p) \log a^p + (1/q) \log b^q} < \frac{1}{p}\, e^{\log a^p} + \frac{1}{q}\, e^{\log b^q} = \frac{a^p}{p} + \frac{b^q}{q} \]
for all a, b > 0 with $a^p \ne b^q$.

Yet another argument is provided by the study of variation of the differentiable function
\[ F(a) = \frac{a^p}{p} + \frac{b^q}{q} - ab, \quad a \ge 0, \]
where b ≥ 0 is a parameter. This function attains at $a = b^{q/p}$ the strict global minimum, which yields $F(a) > F(b^{q/p}) = 0$ for all a ≥ 0, $a \ne b^{q/p}$. A refinement of the inequality (1.5) is presented in Exercise 2.

W. H. Young [387] actually proved a much more general inequality which covers (1.5) for $f(x) = x^{p-1}$:
1.2.1 Theorem (Young’s inequality) Suppose that f : [0, ∞) → [0, ∞) is an increasing continuous function such that f(0) = 0 and $\lim_{x \to \infty} f(x) = \infty$. Then
\[ uv \le \int_0^u f(x)\,dx + \int_0^v f^{-1}(y)\,dy \]
for all u, v ≥ 0, and equality occurs if and only if v = f(u).
Figure 1.2: The areas of the two curvilinear triangles exceed the area of the rectangle with sides u and v.
Proof. Using the definition of the derivative one can easily prove that the function
\[ F(u) = \int_0^u f(x)\,dx + \int_0^{f(u)} f^{-1}(y)\,dy - u f(u), \quad u \in [0, \infty), \]
is differentiable, with F′ identically 0. Thus F(u) = F(0) = 0 for all u ≥ 0. If u, v ≥ 0 and v ≥ f(u), then
\[
\begin{aligned}
uv &= u f(u) + u(v - f(u)) \\
&= \int_0^u f(x)\,dx + \int_0^{f(u)} f^{-1}(y)\,dy + u(v - f(u)) \\
&= \int_0^u f(x)\,dx + \int_0^{v} f^{-1}(y)\,dy + \Big[ u(v - f(u)) - \int_{f(u)}^{v} f^{-1}(y)\,dy \Big] \\
&\le \int_0^u f(x)\,dx + \int_0^{v} f^{-1}(y)\,dy.
\end{aligned}
\]
The last step uses the fact that $f^{-1}(y) \ge u$ for y ≥ f(u). The other case, where v ≤ f(u), can be treated in a similar way.

A pictorial proof of Theorem 1.2.1 is suggested by Fig. 1.2. Applications of this result to Legendre duality will be presented in Exercise 7, Section 1.6.

Young’s inequality (1.5) is the source of many important integral inequalities, most of them valid in the context of arbitrary measure spaces. The interested reader may find a thorough presentation of the basic elements of measure theory in the monograph of Hewitt and Stromberg [158], Sections 12 and 13.
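Theorem 1.2.1 can be explored numerically by approximating the two integrals. The sketch below is an illustration that is not part of the original text: the names `midpoint_integral` and `young_gap` are hypothetical, the midpoint rule with a fixed number of nodes is an arbitrary choice of quadrature, and the caller is assumed to supply a correct inverse of f.

```python
def midpoint_integral(g, a, b, n=20000):
    """Approximate the integral of g over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def young_gap(f, f_inv, u, v):
    """Right-hand side of Young's inequality minus u*v.

    Assumes f is increasing and continuous on [0, inf) with f(0) = 0,
    and that f_inv is its inverse. The result should be nonnegative,
    and approximately zero when v = f(u).
    """
    return midpoint_integral(f, 0.0, u) + midpoint_integral(f_inv, 0.0, v) - u * v
```

With $f(x) = x^2$ (that is, p = 3 and q = 3/2 in (1.5)) and $f^{-1}(y) = \sqrt{y}$, the gap equals $u^3/3 + (2/3)v^{3/2} - uv$; it vanishes for v = f(u) and is strictly positive otherwise.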
1.2.2 Theorem (The Rogers–Hölder inequality for p > 1) Let p, q ∈ (1, ∞) with 1/p + 1/q = 1, and let $f \in L^p(\mu)$ and $g \in L^q(\mu)$. Then fg is in $L^1(\mu)$ and we have
\[ \Big| \int_{\Omega} fg\,d\mu \Big| \le \int_{\Omega} |fg|\,d\mu \tag{1.6} \]
and
\[ \int_{\Omega} |fg|\,d\mu \le \|f\|_{L^p} \|g\|_{L^q}. \tag{1.7} \]
As a consequence,
\[ \Big| \int_{\Omega} fg\,d\mu \Big| \le \|f\|_{L^p} \|g\|_{L^q}. \tag{1.8} \]

The above result extends in a straightforward manner to the pairs p = 1, q = ∞ and p = ∞, q = 1. In the complementary domain, p ∈ (−∞, 1)\{0} and 1/p + 1/q = 1, the inequality sign in (1.6)–(1.8) should be reversed. See Exercises 3 and 4.

From the Rogers–Hölder inequality it follows that for all p, q, r ∈ (1, ∞) with 1/p + 1/q = 1/r and all $f \in L^p(\mu)$ and $g \in L^q(\mu)$ we have $fg \in L^r(\mu)$ and
\[ \|fg\|_{L^r} \le \|f\|_{L^p} \|g\|_{L^q}. \]
The special case of inequality (1.8) where p = q = 2 is known as the Cauchy–Bunyakovsky–Schwarz inequality. See Remark 1.2.6.

Proof. The first inequality is trivial. If f or g is zero µ-almost everywhere, then the second inequality is trivial. Otherwise, using Young’s inequality, we have
\[ \frac{|f(x)|}{\|f\|_{L^p}} \cdot \frac{|g(x)|}{\|g\|_{L^q}} \le \frac{1}{p} \cdot \frac{|f(x)|^p}{\|f\|_{L^p}^p} + \frac{1}{q} \cdot \frac{|g(x)|^q}{\|g\|_{L^q}^q} \]
for all x in Ω, so that $fg \in L^1(\mu)$. Integrating over Ω, the right-hand side becomes 1/p + 1/q = 1. Thus
\[ \frac{1}{\|f\|_{L^p} \|g\|_{L^q}} \int_{\Omega} |fg|\,d\mu \le 1 \]
and the proof of (1.7) is done. The inequality (1.8) is immediate.

1.2.3 Remark (Conditions for equality in Theorem 1.2.2) The basic observation is the fact that
\[ f \ge 0 \ \text{ and } \int_{\Omega} f\,d\mu = 0 \ \text{ imply } \ f = 0 \ \ \mu\text{-almost everywhere.} \]
Consequently we have equality in (1.6) if and only if
\[ f(x)g(x) = e^{i\theta} |f(x)g(x)| \]
for some real constant θ and for µ-almost every x.
Suppose that p, q ∈ (1, ∞) and f and g are not zero µ-almost everywhere. In order to get equality in (1.7) it is necessary and sufficient to have
\[ \frac{|f(x)|}{\|f\|_{L^p}} \cdot \frac{|g(x)|}{\|g\|_{L^q}} = \frac{1}{p} \cdot \frac{|f(x)|^p}{\|f\|_{L^p}^p} + \frac{1}{q} \cdot \frac{|g(x)|^q}{\|g\|_{L^q}^q} \]
almost everywhere. The equality case in Young’s inequality shows that this is equivalent to $|f(x)|^p / \|f\|_{L^p}^p = |g(x)|^q / \|g\|_{L^q}^q$ almost everywhere, that is,
\[ A|f(x)|^p = B|g(x)|^q \quad \text{almost everywhere} \]
for some positive numbers A and B.

If p = 1 and q = ∞, we have equality in (1.7) if and only if there is a constant λ ≥ 0 such that |g(x)| ≤ λ almost everywhere, and |g(x)| = λ almost everywhere in the set {x : f(x) ≠ 0}.

1.2.4 Theorem (Minkowski’s inequality) For 1 ≤ p < ∞ and $f, g \in L^p(\mu)$ we have
\[ \|f + g\|_{L^p} \le \|f\|_{L^p} + \|g\|_{L^p}. \tag{1.9} \]

In the discrete case, using the notation of Exercise 10 in Section 1.1, this inequality reads
\[ M_p(x + y, \lambda) \le M_p(x, \lambda) + M_p(y, \lambda). \tag{1.10} \]
In this form, it extends to the complementary range 0 < p < 1, with the inequality sign reversed. See Section 2.4, Exercise 2, for the case p = 0. The integral analogue for p < 1 is presented in Section 2.6.

Proof. For p = 1, the inequality (1.9) follows immediately by integrating the inequality |f + g| ≤ |f| + |g|. For p ∈ (1, ∞) we have
\[ |f + g|^p \le (|f| + |g|)^p \le (2 \sup\{|f|, |g|\})^p \le 2^p (|f|^p + |g|^p), \]
which shows that $f + g \in L^p(\mu)$. Moreover, according to Theorem 1.2.2,
\[
\begin{aligned}
\|f + g\|_{L^p}^p &= \int_{\Omega} |f + g|^p\,d\mu \le \int_{\Omega} |f + g|^{p-1} |f|\,d\mu + \int_{\Omega} |f + g|^{p-1} |g|\,d\mu \\
&\le \Big( \int_{\Omega} |f|^p\,d\mu \Big)^{1/p} \Big( \int_{\Omega} |f + g|^{(p-1)q}\,d\mu \Big)^{1/q} + \Big( \int_{\Omega} |g|^p\,d\mu \Big)^{1/p} \Big( \int_{\Omega} |f + g|^{(p-1)q}\,d\mu \Big)^{1/q} \\
&= (\|f\|_{L^p} + \|g\|_{L^p})\, \|f + g\|_{L^p}^{p/q},
\end{aligned}
\]
where 1/p + 1/q = 1, and it remains to observe that p − p/q = 1.
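For the counting measure on a finite set, the inequalities (1.8) and (1.9) become statements about finite sums, which makes them easy to test. The sketch below is an illustration and not part of the original text; the names `p_norm`, `holder_gap` and `minkowski_gap` are hypothetical, and the code assumes p > 1 so that the conjugate exponent q = p/(p − 1) is finite.

```python
def p_norm(xs, p):
    """The l^p norm of a finite sequence of real or complex numbers."""
    return sum(abs(x) ** p for x in xs) ** (1.0 / p)


def holder_gap(xs, ys, p):
    """||x||_p * ||y||_q - |sum x_k y_k| with 1/p + 1/q = 1 (assumes p > 1)."""
    q = p / (p - 1.0)
    inner = abs(sum(x * y for x, y in zip(xs, ys)))
    return p_norm(xs, p) * p_norm(ys, q) - inner


def minkowski_gap(xs, ys, p):
    """||x||_p + ||y||_p - ||x + y||_p (assumes p >= 1)."""
    total = [x + y for x, y in zip(xs, ys)]
    return p_norm(xs, p) + p_norm(ys, p) - p_norm(total, p)
```

Both gaps are nonnegative; they vanish (up to rounding) in the equality cases described in Remarks 1.2.3 and 1.2.5, for example when y is a nonnegative multiple of x and p = 2.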
1.2.5 Remark If p = 1, we obtain equality in (1.9) if and only if there is a positive measurable function φ such that f(x)φ(x) = g(x) µ-almost everywhere on the set {x : f(x)g(x) ≠ 0}. If p ∈ (1, ∞) and f is not 0 almost everywhere, then we have equality in (1.9) if and only if g = λf almost everywhere, for some λ ≥ 0.

In the particular case where (Ω, Σ, µ) is the measure space associated with the counting measure on a finite set,
\[ \mu : \mathcal{P}(\{1, \dots, n\}) \to \mathbb{N}, \quad \mu(A) = |A|, \]
we retrieve the classical discrete forms of the above inequalities. For example, the discrete version of the Rogers–Hölder inequality can be read
\[ \Big| \sum_{k=1}^{n} \xi_k \eta_k \Big| \le \Big( \sum_{k=1}^{n} |\xi_k|^p \Big)^{1/p} \Big( \sum_{k=1}^{n} |\eta_k|^q \Big)^{1/q} \]
for all $\xi_k, \eta_k \in \mathbb{C}$, k ∈ {1, ..., n}. On the other hand, a moment’s reflection shows that we can pass immediately from these discrete inequalities to their integral analogues, corresponding to finite measure spaces.

1.2.6 Remark (More on Cauchy-Bunyakovsky-Schwarz inequality) Cauchy, in his famous Cours d’Analyse [78], derived the discrete case of this inequality from Lagrange’s algebraic identity,
\[ \Big( \sum_{k=1}^{n} a_k^2 \Big) \Big( \sum_{k=1}^{n} b_k^2 \Big) = \sum_{1 \le j < k \le n} (a_j b_k - a_k b_j)^2 + \Big( \sum_{k=1}^{n} a_k b_k \Big)^2. \]
Thus [...]

Exercises

1. (a) [...] x > −1 we have
\[ (1 + x)^{\alpha} \ge 1 + \alpha x \quad \text{if } \alpha \in (-\infty, 0] \cup [1, \infty) \]
and
\[ (1 + x)^{\alpha} \le 1 + \alpha x \quad \text{if } \alpha \in [0, 1]; \]
if α ∉ {0, 1}, the equality occurs only for x = 0.
(b) The substitution 1 + x → x/y followed by a multiplication by y leads us to Young’s inequality (for full range of parameters). Show that this inequality can be written as
\[ xy \ge \frac{x^p}{p} + \frac{y^q}{q} \quad \text{for all } x, y > 0 \]
in the domain p ∈ (−∞, 1)\{0} and 1/p + 1/q = 1.

2. (J. M. Aldaz [8]) Use calculus to prove the following refinement of the inequality (1.5): If 1 < p ≤ 2 and 1/p + 1/q = 1, then for all a, b ≥ 0
\[ \frac{1}{q} \big( a^{p/2} - b^{q/2} \big)^2 \le \frac{a^p}{p} + \frac{b^q}{q} - ab \le \frac{1}{p} \big( a^{p/2} - b^{q/2} \big)^2. \]
This fact yields a refinement of the Rogers–Hölder inequality and also a new proof of the uniform convexity of real $L^p$-spaces for 1 < p < ∞.

Remark. Another refinement of Young’s inequality was proposed by Y. Manasrah and F. Kittaneh [229]:
\[ (1 - \mu)a + \mu b \ge a^{1-\mu} b^{\mu} + \min\{\mu, 1 - \mu\} \cdot \big( \sqrt{a} - \sqrt{b} \big)^2 \]
for all a, b ∈ [0, ∞) and µ ∈ [0, 1].
3. (A symmetric form of the discrete Rogers–Hölder inequality) Let p, q, r be nonzero real numbers such that 1/p + 1/q = 1/r.
(a) Prove that the inequality
\[ \Big( \sum_{k=1}^{n} \lambda_k |a_k b_k|^r \Big)^{1/r} \le \Big( \sum_{k=1}^{n} \lambda_k |a_k|^p \Big)^{1/p} \Big( \sum_{k=1}^{n} \lambda_k |b_k|^q \Big)^{1/q} \]
holds in each of the following three cases:
p > 0, q > 0, r > 0; p < 0, q > 0, r < 0; p > 0, q < 0, r < 0.
(b) Prove that the opposite inequality holds in each of the following cases:
p > 0, q < 0, r > 0; p < 0, q > 0, r > 0; p < 0, q < 0, r < 0.
Here $\lambda_1, \dots, \lambda_n > 0$, $\sum_{k=1}^{n} \lambda_k = 1$, and $a_1, \dots, a_n, b_1, \dots, b_n \in \mathbb{C} \setminus \{0\}$, $n \in \mathbb{N}^*$.
(c) Formulate the above inequalities in terms of power means and then prove they still work for r = pq/(p + q) if p and q are not both zero, and for r = 0 if p = q = 0.

4. (a) Young’s inequality extends to more than two variables. Prove that if $p_1, \dots, p_n \in (1, \infty)$ and $1/p_1 + \cdots + 1/p_n = 1$, then
\[ \prod_{k=1}^{n} x_k \le \sum_{k=1}^{n} \frac{x_k^{p_k}}{p_k} \]
for all $x_1, \dots, x_n \ge 0$.
(b) Infer the following generalization of the Rogers–Hölder inequality: If (Ω, Σ, µ) is a measure space and $f_1, \dots, f_n$ are functions such that $f_k \in L^{p_k}(\mu)$ for some $p_k \in [1, \infty]$, and $\sum_{k=1}^{n} 1/p_k = 1$, then
\[ \Big| \int_{\Omega} \Big( \prod_{k=1}^{n} f_k \Big) d\mu \Big| \le \prod_{k=1}^{n} \|f_k\|_{L^{p_k}}. \]

5. (Hilbert’s inequality) Suppose that p, q ∈ (1, ∞), 1/p + 1/q = 1 and f, g are two positive functions such that $f \in L^p(0, \infty)$ and $g \in L^q(0, \infty)$. Infer from the Rogers–Hölder inequality that
\[ \int_0^{\infty} \int_0^{\infty} \frac{f(x)\,g(y)}{x + y}\,dx\,dy \le \frac{\pi}{\sin(\pi/p)} \Big( \int_0^{\infty} f^p(x)\,dx \Big)^{1/p} \Big( \int_0^{\infty} g^q(y)\,dy \Big)^{1/q}, \]
the constant π/ sin(π/p) being sharp.
[Hint: Notice that the left hand side equals
\[ \int_0^{\infty} \int_0^{\infty} \frac{f(x)}{(x + y)^{1/p}} \Big( \frac{x}{y} \Big)^{1/pq} \cdot \frac{g(y)}{(x + y)^{1/q}} \Big( \frac{y}{x} \Big)^{1/pq}\,dx\,dy. ] \]
6. (A general form of Minkowski’s inequality) Suppose that (X, M, µ) and (Y, N, ν) are two σ-finite measure spaces, f is a positive function on X × Y which is µ × ν-measurable, and let p ∈ [1, ∞). Then
\[ \Big( \int_X \Big( \int_Y f(x, y)\,d\nu(y) \Big)^p d\mu(x) \Big)^{1/p} \le \int_Y \Big( \int_X f(x, y)^p\,d\mu(x) \Big)^{1/p} d\nu(y). \]

7. (The one-dimensional case of Poincaré inequality) Infer from the integral version of Lagrange’s algebraic identity the following inequality: If f : [0, 1] → R is a function of class $C^1$ that verifies the condition $\int_0^1 f\,dx = 0$, then
\[ \int_0^1 f^2\,dx \le \frac{1}{2} \int_0^1 f'^{\,2}(s)\,ds. \]
The n-dimensional extension of Poincaré inequality plays a major role in partial differential equations and their applications. See [115].

8. (A matrix extension of Cauchy–Bunyakovsky–Schwarz inequality) Suppose that $A, B \in M_N(\mathbb{R})$ are two positive matrices; recall that a symmetric matrix $C \in M_N(\mathbb{R})$ is positive if ⟨Cx, x⟩ ≥ 0 for all $x \in \mathbb{R}^N$.
(a) Prove that
\[ |\langle Ax, By \rangle|^2 \le \langle Ax, x \rangle \langle By, y \rangle \]
for all vectors $x, y \in \mathbb{R}^N$.
(b) Suppose that $A \in M_N(\mathbb{R})$ is positive and invertible and $y \in \mathbb{R}^N$. Prove that
\[ \max_{x \ne 0} \frac{|\langle x, y \rangle|^2}{\langle Ax, x \rangle} = \langle A^{-1} y, y \rangle, \]
which is attained for $x = cA^{-1}y$ with c ≠ 0 arbitrary. This solves a filtering problem in signal processing. See Byrne [68], pp. 10-11.

9. Let (Ω, Σ, µ) be a measure space and let f : Ω → C be a measurable function, which belongs to $L^t(\mu)$ for t in a subinterval I of (0, ∞). Infer from the Cauchy–Bunyakovsky–Schwarz inequality that the function $t \to \log \int_{\Omega} |f|^t\,d\mu$ is convex on I.

Remark. The result of this exercise is equivalent to Lyapunov’s inequality [219]: If a ≥ b ≥ c, then
\[ \Big( \int_{\Omega} |f|^b\,d\mu \Big)^{a-c} \le \Big( \int_{\Omega} |f|^c\,d\mu \Big)^{a-b} \Big( \int_{\Omega} |f|^a\,d\mu \Big)^{b-c} \]
(provided the integrability aspects are fixed). Equality holds if and only if one of the following conditions holds:
(a) f is constant on some subset of Ω and 0 elsewhere;
(b) a = b;
(c) b = c;
(d) c(2a − b) = ab.
10. Denote by $\mathrm{Prob}(n) = \{(\lambda_1, \dots, \lambda_n) \in (0, \infty)^n : \sum_{k=1}^n \lambda_k = 1\}$ the set of all probability distributions of order $n$ and suppose there are given two such distributions $Q = (q_1, \dots, q_n)$ and $P = (p_1, \dots, p_n)$. The Kullback–Leibler divergence from $Q$ to $P$ is defined by the formula
\[
D_{KL}(P \| Q) = \sum_{k=1}^n p_k \log \frac{p_k}{q_k}
\]
and represents a measure of how one probability distribution diverges from a second expected probability distribution. From Jensen’s inequality it follows that DKL (P ||Q) ≥ 0 (with equality if and only if P = Q). This is known as Gibbs’ inequality. A better lower bound is provided by Pinsker’s inequality, formulated by Borwein and Lewis [55], p. 63, as follows:
\[
2 \sum_{k=1}^n p_k \log \frac{p_k}{q_k} \ge 3 \sum_{k=1}^n \frac{(p_k - q_k)^2}{p_k + 2q_k} \ge \left( \sum_{k=1}^n |p_k - q_k| \right)^2.
\]
(a) Derive the right hand side inequality from the Cauchy–Bunyakovsky–Schwarz inequality.
(b) Prove that $3(x - 1)^2 \le 2(x + 2)(x \log x - x + 1)$ for all $x > 0$ and infer from it the inequality $3(u - v)^2 \le 2(u + 2v)\left( u \log \frac{u}{v} - u + v \right)$ for all $u, v > 0$. Then put $u = p_k$ and $v = q_k$ and sum up for $k$ from 1 to $n$ to get the left hand side inequality.
11. (a) Prove the discrete Favard inequality:
\[
\frac{1}{n+1} \sum_{k=0}^n a_k \ge \left( \frac{3(n-1)}{4(n+1)} \right)^{1/2} \left( \frac{1}{n+1} \sum_{k=0}^n a_k^2 \right)^{1/2}
\]
for every concave sequence $a_0, a_1, \dots, a_n$ of positive numbers.
(b) Derive from (a) the integral form of this inequality: if $f$ is a concave positive function on $[a, b]$, then
\[
\left( \frac{1}{b-a} \int_a^b f^2(x)\, dx \right)^{1/2} \le \frac{2}{\sqrt{3}} \left( \frac{1}{b-a} \int_a^b f(x)\, dx \right).
\]
[Hint: By Minkowski’s inequality (Theorem 1.2.4 above), if the Berwald inequality works for two concave sequences, then it also works for all linear combinations of them, with positive coefficients. Then apply the assertion (b) of Exercise 13, Section 1.1. ]
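The chain of inequalities in Exercise 10 can be sanity-checked numerically before attempting a proof. The following Python sketch is only an illustration: the helper names (`kl_divergence`, `pinsker_chain`, `random_distribution`) are ours, not the book's, and the check tests the chain $2 D_{KL}(P\|Q) \ge 3 \sum_k (p_k - q_k)^2/(p_k + 2q_k) \ge (\sum_k |p_k - q_k|)^2$ on randomly generated distributions, up to a small floating-point tolerance.

```python
import math
import random


def kl_divergence(p, q):
    """Kullback-Leibler divergence D_KL(P||Q) = sum_k p_k * log(p_k / q_k)."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q))


def random_distribution(n, rng):
    """Random element of Prob(n): strictly positive weights summing to 1."""
    weights = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


def pinsker_chain(p, q):
    """Return (2*D_KL, 3*sum (p-q)^2/(p+2q), (sum |p-q|)^2)."""
    left = 2.0 * kl_divergence(p, q)
    middle = 3.0 * sum((pk - qk) ** 2 / (pk + 2.0 * qk) for pk, qk in zip(p, q))
    right = sum(abs(pk - qk) for pk, qk in zip(p, q)) ** 2
    return left, middle, right


rng = random.Random(0)
for _ in range(1000):
    p = random_distribution(5, rng)
    q = random_distribution(5, rng)
    left, middle, right = pinsker_chain(p, q)
    # Tolerance absorbs rounding errors when P and Q are very close.
    assert left >= middle - 1e-12
    assert middle >= right - 1e-12
```

A numerical check of this kind proves nothing, but it is a cheap way to detect a misread constant or exponent.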
Figure 1.3: The graph of the gamma function.
1.3 Log-convex Functions
This section is devoted to a brief discussion of a stronger concept of convexity.
1.3.1 Definition A strictly positive function $f : I \to (0, \infty)$ is called log-convex (respectively log-concave) if $\log f$ (respectively $-\log f$) is a convex function.
Equivalently, the condition of log-convexity of $f$ means
\[
x, y \in I \text{ and } \lambda \in (0, 1) \implies f((1 - \lambda)x + \lambda y) \le f(x)^{1-\lambda} f(y)^{\lambda}.
\]
The following result collects some immediate consequences of the generalized AM–GM inequality:
1.3.2 Lemma (a) The multiplicative inverse of any strictly positive concave function is a log-convex function.
(b) Every log-convex function is also convex.
(c) Every strictly positive concave function is also log-concave.
According to this lemma, $1/x$ and $1/\sin x$ (as well as their exponentials) are log-convex, respectively on $(0, \infty)$ and $(0, \pi)$. Notice that the function $e^{-x^2}$ is log-concave but not concave.
An important example of a log-convex function is the gamma function,
\[
\Gamma : (0, \infty) \to \mathbb{R}, \quad \Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\, dt \quad \text{for } x > 0.
\]
See Choudary and Niculescu [84], pp. 352-360, for its basic theory. A sketch of the graph of gamma function is shown in Fig. 1.3.
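The fact that Γ is log-convex is a consequence of the Rogers–Hölder inequality. Indeed, for every $x, y > 0$ and $\lambda \in [0, 1]$ we have
\[
\Gamma((1 - \lambda)x + \lambda y) = \int_0^\infty t^{(1-\lambda)x + \lambda y - 1} e^{-t}\, dt = \int_0^\infty (t^{x-1} e^{-t})^{1-\lambda} (t^{y-1} e^{-t})^{\lambda}\, dt
\]
\[
\le \left( \int_0^\infty t^{x-1} e^{-t}\, dt \right)^{1-\lambda} \left( \int_0^\infty t^{y-1} e^{-t}\, dt \right)^{\lambda} = \Gamma^{1-\lambda}(x)\, \Gamma^{\lambda}(y).
\]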
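The log-convexity of Γ is easy to observe numerically. The sketch below is an illustration under our own naming (`gamma_log_convexity_gap` is a hypothetical helper); it relies on Python's standard `math.lgamma`, which returns $\log|\Gamma(x)|$, and evaluates the difference between the two sides of $\log\Gamma((1-\lambda)x + \lambda y) \le (1-\lambda)\log\Gamma(x) + \lambda\log\Gamma(y)$ at random points.

```python
import math
import random


def gamma_log_convexity_gap(x, y, lam):
    """(1-lam)*log Gamma(x) + lam*log Gamma(y) - log Gamma((1-lam)*x + lam*y).

    The value is nonnegative exactly when the log-convexity inequality
    holds at the triple (x, y, lam).
    """
    lhs = math.lgamma((1.0 - lam) * x + lam * y)
    rhs = (1.0 - lam) * math.lgamma(x) + lam * math.lgamma(y)
    return rhs - lhs


rng = random.Random(1)
gaps = [
    gamma_log_convexity_gap(rng.uniform(0.05, 20.0), rng.uniform(0.05, 20.0), rng.random())
    for _ in range(10_000)
]
# A small negative tolerance absorbs rounding errors when x is close to y.
assert min(gaps) >= -1e-9
```

For instance, with $x = 1$, $y = 3$ and $\lambda = 1/2$ the gap equals $\frac{1}{2}\log 2$, since $\Gamma(1) = \Gamma(2) = 1$ and $\Gamma(3) = 2$.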
Remarkably, Γ is the unique log-convex extension of the factorial function.
1.3.3 Theorem (H. Bohr and J. Mollerup [47], [17]) The gamma function is the only function $f : (0, \infty) \to \mathbb{R}$ that satisfies the following three conditions:
(a) $f(x + 1) = x f(x)$ for all $x > 0$;
(b) $f(1) = 1$;
(c) $f$ is log-convex.
An important class of log-convex functions is that of completely monotonic functions. A function $f : (0, \infty) \to \mathbb{R}$ is called completely monotonic if $f$ has derivatives of all orders and satisfies $(-1)^n f^{(n)}(x) \ge 0$ for all $x > 0$ and $n \in \mathbb{N}$. Necessarily, every completely monotonic function is positive, decreasing and convex. One can prove easily that the functions
\[
e^{-x}, \quad e^{1/x}, \quad \frac{1}{(1 + x)^2} \quad \text{and} \quad \frac{\ln(1 + x)}{x}
\]
(as well as their sums, products and derivatives of even order) are completely monotonic on $(0, \infty)$. More interesting examples can be found in the papers of Merkle [244] and Miller and Samko [249].
S. N. Bernstein proved in 1928 that a necessary and sufficient condition for a function $f(x)$ to be completely monotonic is the existence of a representation of the form
\[
f(x) = \int_0^\infty e^{-xt}\, d\mu(t), \tag{1.11}
\]
where µ is a positive Borel measure on $(0, \infty)$ and the integral converges for $0 < x < \infty$. A simple proof based on the Krein–Milman Theorem (see Appendix A, Corollary 3.2) can be found in Simon [352], pp. 143-152. His argument also covers the fact (known to Bernstein) that every completely monotonic function is the restriction to $(0, \infty)$ of an analytic function in the right half-plane $\operatorname{Re} z > 0$.
1.3.4 Theorem (A. M. Fink [124]) Every completely monotonic function is log-convex.
Proof. It suffices to prove that every completely monotonic function $f$ verifies the inequality $f(x) f''(x) \ge (f'(x))^2$. See Exercise 1. Taking into account
the aforementioned integral representation due to Bernstein, this inequality is equivalent to
\[
\int_0^\infty e^{-xt_1}\, d\mu(t_1) \int_0^\infty t_2^2 e^{-xt_2}\, d\mu(t_2) \ge \int_0^\infty t_1 e^{-xt_1}\, d\mu(t_1) \int_0^\infty t_2 e^{-xt_2}\, d\mu(t_2)
\]
and also to
\[
\int_0^\infty \int_0^\infty t_2^2 e^{-x(t_1 + t_2)}\, d\mu(t_1)\, d\mu(t_2) \ge \int_0^\infty \int_0^\infty t_1 t_2 e^{-x(t_1 + t_2)}\, d\mu(t_1)\, d\mu(t_2),
\]
which is clear because by symmetry
\[
\int_0^\infty \int_0^\infty t_2^2 e^{-x(t_1 + t_2)}\, d\mu(t_1)\, d\mu(t_2) = \frac{1}{2} \int_0^\infty \int_0^\infty \left( t_1^2 + t_2^2 \right) e^{-x(t_1 + t_2)}\, d\mu(t_1)\, d\mu(t_2)
\]
and $t_1^2 + t_2^2 \ge 2 t_1 t_2$.
The convex functions, log-convex functions and quasiconvex functions are all just special cases of a considerably more general concept, that of convex function with respect to a pair of means. This is the subject of Appendix A.
Exercises
1. (The second derivative test of log-convexity) Prove that a twice differentiable function $f : I \to (0, \infty)$ is log-convex if and only if
\[
f(x) f''(x) \ge (f'(x))^2 \quad \text{for all } x \in I.
\]
Infer from this result the log-convexity of the gamma function.
2. (The log-convex analogue of Theorem 1.1.6) Prove that a strictly positive continuous function $f$ defined on an interval $I$ is log-convex if and only if
\[
f\left( \frac{x + y}{2} \right) \le \sqrt{f(x) f(y)} \quad \text{for all } x, y \in I.
\]
3. (Operations with log-convex functions) (a) Clearly, a product of log-convex functions is also a log-convex function. Prove that the same happens for sums.
(b) Suppose that $x = (x_1, \dots, x_n)$ and $\lambda = (\lambda_1, \dots, \lambda_n)$ are n-tuples of strictly positive numbers such that $\sum_{k=1}^n \lambda_k = 1$. Infer from the assertion (a) the convexity of the function $t \to t \log M_t(x; \lambda)$ on $\mathbb{R}$ and of the function $t \to \log M_t(x; \lambda)$ on $(0, \infty)$. The last assertion yields
\[
M_{\alpha s + \beta t}(x; \lambda) \le M_s^{\alpha}(x; \lambda)\, M_t^{\beta}(x; \lambda)
\]
whenever $\alpha, \beta \in [0, 1]$, $\alpha + \beta = 1$ and $s, t > 0$.
(c) Suppose that $f$ is a convex function and $g$ is an increasing log-convex function. Prove that $g \circ f$ is a log-convex function.
(d) Prove that the harmonic mean of two log-concave functions is also log-concave.
[Hint: (a) Note that this assertion is equivalent to the following inequality for positive numbers: $a^{\alpha} b^{\beta} + c^{\alpha} d^{\beta} \le (a + c)^{\alpha} (b + d)^{\beta}$.]
4. (Operations with log-concave functions) (a) Prove that the product of log-concave functions is also log-concave.
(b) Prove that the convolution preserves log-concavity, that is, if $f$ and $g$ are two integrable log-concave functions defined on $\mathbb{R}$, then the function
\[
(f * g)(x) = \int_{\mathbb{R}} f(x - y) g(y)\, dy
\]
is log-concave too.
Remark. The statement (b) of Exercise 4 is a particular case of the Prékopa–Leindler inequality.
5. (P. Montel [260]) Let $I$ be an interval. Prove that the following assertions are equivalent for every function $f : I \to (0, \infty)$:
(a) $f$ is log-convex;
(b) the function $x \to e^{\alpha x} f(x)$ is convex on $I$ for all $\alpha \in \mathbb{R}$;
(c) the function $x \to [f(x)]^{\alpha}$ is convex on $I$ for all $\alpha > 0$.
[Hint: For (c) ⇒ (a), note that $([f(x)]^{\alpha} - 1)/\alpha$ is convex for all $\alpha > 0$ and $\log f(x) = \lim_{\alpha \to 0+} ([f(x)]^{\alpha} - 1)/\alpha$. The limit of a sequence of convex functions is itself convex.]
6. Prove that the function
\[
\frac{e^x\, \Gamma(x)}{x^x}
\]
is log-convex.
7. (Some geometric consequences of log-concavity) (a) A convex quadrilateral $ABCD$ is inscribed in the unit circle. Its sides satisfy the inequality $AB \cdot BC \cdot CD \cdot DA \ge 4$. Prove that $ABCD$ is a square.
(b) Suppose that $A, B, C$ are the angles of a triangle, expressed in radians. Prove that
\[
\sin A \sin B \sin C < \left( \frac{3\sqrt{3}}{2\pi} \right)^3 ABC < \left( \frac{\sqrt{3}}{2} \right)^3,
\]
unless $A = B = C$.
[Hint: Note that the sine function is log-concave, while $x/\sin x$ is log-convex on $(0, \pi)$.]
8. (E. Artin [17]) Let $U$ be an open convex subset of $\mathbb{R}^n$ and let µ be a Borel measure on an interval $I$. Consider the integral transform
\[
F(x) = \int_I K(x, t)\, d\mu(t),
\]
where the kernel $K(x, t) : U \times I \to [0, \infty)$ satisfies the following two conditions:
(a) $K(x, t)$ is µ-integrable in $t$ for each fixed $x$;
(b) $K(x, t)$ is log-convex in $x$ for each fixed $t$.
Prove that $F$ is log-convex on $U$.
[Hint: Apply the Rogers–Hölder inequality, noticing that $K((1 - \lambda)x + \lambda y, t) \le (K(x, t))^{1-\lambda} (K(y, t))^{\lambda}$.]
Remark. The Laplace transform of a function $f \in L^1(0, \infty)$ is given by the formula $(Lf)(x) = \int_0^\infty f(t) e^{-tx}\, dt$. By Exercise 8, the Laplace transform of any positive function is log-convex. In the same way one can show that the moment $\mu_\alpha = \int_0^\infty t^{\alpha} f(t)\, dt$ of any random variable with probability density $f$ is a log-convex function in α on each subinterval of $[0, \infty)$ where it is finite.
9. (Combinatorial properties of sequences) Call a sequence $(a_n)_n$ of strictly positive numbers log-concave (respectively log-convex) if $a_{n-1} a_{n+1} \le a_n^2$ for all $n \ge 1$ (respectively $a_{n-1} a_{n+1} \ge a_n^2$ for all $n \ge 1$).
(a) Let $(a_n)_n$ be a strictly positive and strictly decreasing sequence of numbers and let $A_n = \sum_{k=0}^n a_k$. Prove that $(A_n)_n$ is log-concave. Illustrate this result using the Maclaurin expansion of $\log(1 - x)$ on $(0, 1)$.
(b) Prove that the sequence of binomial coefficients $\binom{n}{k}$ is log-concave in $k$ for fixed $n$.
(c) (I. Newton) Suppose that $P(x) = \sum_{k=0}^n \binom{n}{k} a_k x^k$ is a real polynomial with real zeros. Prove that $a_0, a_1, \dots, a_n$ is a log-concave sequence.
(d) (Davenport–Pólya) If both $(x_n)_n$ and $(y_n)_n$ are log-convex, then so is the sequence of their binomial convolution,
\[
z_n = \sum_{k=0}^n \binom{n}{k} x_k y_{n-k}.
\]
Remark. More information on log-convex/log-concave sequences is available in the papers of Lily L. Liu and Yi Wang [218] and R. P. Stanley [354]. See also Appendix C.
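The statements of Exercise 9 about sequences can be tested with exact integer and rational arithmetic. The Python sketch below is only an illustration: the helper names are ours, and the particular sequences ($a_k = 1/(k+1)$, $x_n = n!$, $y_n = 2^n$) are arbitrary choices made for the test, not examples taken from the text.

```python
from fractions import Fraction
from itertools import accumulate
from math import comb, factorial


def is_log_concave(seq):
    """True if a[n-1] * a[n+1] <= a[n]**2 at every interior index n."""
    return all(seq[n - 1] * seq[n + 1] <= seq[n] ** 2 for n in range(1, len(seq) - 1))


def is_log_convex(seq):
    """True if a[n-1] * a[n+1] >= a[n]**2 at every interior index n."""
    return all(seq[n - 1] * seq[n + 1] >= seq[n] ** 2 for n in range(1, len(seq) - 1))


def binomial_convolution(x, y):
    """z_n = sum_{k=0}^{n} C(n, k) * x_k * y_{n-k}, for n below the shorter length."""
    m = min(len(x), len(y))
    return [sum(comb(n, k) * x[k] * y[n - k] for k in range(n + 1)) for n in range(m)]


# (a) Partial sums of a strictly decreasing positive sequence, here a_k = 1/(k+1).
partial_sums = list(accumulate(Fraction(1, k + 1) for k in range(50)))
assert is_log_concave(partial_sums)

# (b) Rows of Pascal's triangle are log-concave in k.
assert all(is_log_concave([comb(n, k) for k in range(n + 1)]) for n in range(1, 30))

# (d) Binomial convolution of two log-convex sequences.
x = [factorial(n) for n in range(15)]
y = [2 ** n for n in range(15)]
assert is_log_convex(x) and is_log_convex(y)
assert is_log_convex(binomial_convolution(x, y))
```

Working with `int` and `Fraction` avoids rounding issues, which matters here because equality can occur in the defining inequalities (for example for the geometric sequence $2^n$).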
1.4 Smoothness Properties of Convex Functions
The starting point is the following restatement of Definition 1.1.1. A function $f : I \to \mathbb{R}$ is convex if and only if
\[
f(x) \le \frac{b - x}{b - a} \cdot f(a) + \frac{x - a}{b - a} \cdot f(b) \tag{1.12}
\]
equivalently,
\[
\begin{vmatrix} 1 & a & f(a) \\ 1 & x & f(x) \\ 1 & b & f(b) \end{vmatrix} \ge 0, \tag{1.13}
\]
whenever $a < x < b$ in $I$. Indeed, every point $x$ belonging to an interval $[a, b]$ can be written uniquely as a convex combination of $a$ and $b$, more precisely,
\[
x = \frac{b - x}{b - a} \cdot a + \frac{x - a}{b - a} \cdot b.
\]
Subtracting $f(a)$ from both sides of the inequality (1.12) and repeating the operation with $f(b)$ instead of $f(a)$, we obtain that every convex function $f : I \to \mathbb{R}$ verifies the three chords inequality,
\[
\frac{f(x) - f(a)}{x - a} \le \frac{f(b) - f(a)}{b - a} \le \frac{f(b) - f(x)}{b - x} \tag{1.14}
\]
whenever $a < x < b$ in $I$. Clearly, this inequality actually characterizes the convexity of $f$. Moreover, the three chords inequality with strict inequalities provides a characterization of strict convexity.
Very close to these remarks is Galvani’s characterization of convexity. See Exercise 3.
1.4.1 Remark The three chords inequality can be strengthened as follows: If $f : I \to \mathbb{R}$ is a convex function and $x, y, a, b$ are points in the interval $I$ such that $x \le a$, $y \le b$, $x \ne y$ and $a \ne b$, then
\[
\frac{f(x) - f(y)}{x - y} \le \frac{f(a) - f(b)}{a - b}.
\]
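The three chords inequality (1.14) compares the slopes of the chords over $[a, x]$, $[a, b]$ and $[x, b]$, and it can be probed numerically for any candidate function. The Python sketch below is a minimal illustration with helper names of our own choosing (`slope`, `three_chords_holds`); it uses a small tolerance because the slopes are computed in floating point.

```python
import math
import random


def slope(f, u, v):
    """Slope of the chord of f between the points u and v (u != v)."""
    return (f(v) - f(u)) / (v - u)


def three_chords_holds(f, a, x, b, tol=1e-9):
    """Check the three chords inequality for a < x < b, up to the tolerance tol."""
    s_ax, s_ab, s_xb = slope(f, a, x), slope(f, a, b), slope(f, x, b)
    return s_ax <= s_ab + tol and s_ab <= s_xb + tol


rng = random.Random(2)
for _ in range(10_000):
    a, x, b = sorted(rng.uniform(-3.0, 3.0) for _ in range(3))
    if x - a < 1e-3 or b - x < 1e-3:
        continue  # skip nearly degenerate triples, where rounding dominates
    assert three_chords_holds(math.exp, a, x, b)  # strictly convex
    assert three_chords_holds(abs, a, x, b)       # convex, not strictly convex
```

For a function that fails to be convex, such as the sine function on $[0, 2]$, the same check returns `False` for suitable triples, for example $a = 0$, $x = 1$, $b = 2$.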
Appendix A
Generalized Convexity on Intervals
An important feature underlying the notion of convex function is the comparison of means under the action of a function. Indeed, by considering the weighted arithmetic mean of $n$ variables,
\[
A(x_1, \dots, x_n; \lambda_1, \dots, \lambda_n) = \sum_{k=1}^n \lambda_k x_k,
\]
the convex sets are precisely the sets invariant under the action of these means, while the convex functions are the functions that increase the value of means of any order:
\[
f(A(x_1, \dots, x_n; \lambda_1, \dots, \lambda_n)) \le A(f(x_1), \dots, f(x_n); \lambda_1, \dots, \lambda_n). \tag{AA}
\]
This motivates calling the usual concept of convexity also (A, A)-convexity.
What happens if we consider other means? The interval $(0, \infty)$ is closed under the action of the weighted geometric mean of $n$ variables,
\[
G(x_1, \dots, x_n; \lambda_1, \dots, \lambda_n) = \prod_{k=1}^n x_k^{\lambda_k},
\]
a fact that allowed us to consider in Sections 1.1 and 1.3 the log-convex functions, that is, the functions $f : U \to (0, \infty)$ (defined on convex sets) that verify the inequalities
\[
f(A(x_1, \dots, x_n; \lambda_1, \dots, \lambda_n)) \le G(f(x_1), \dots, f(x_n); \lambda_1, \dots, \lambda_n); \tag{AG}
\]
this motivates the name of (A, G)-convex for the log-convex functions. The quasiconvex functions also fit a similar description as they coincide with the $(A, M_\infty)$-convex functions, where the max mean is defined by
\[
M_\infty(x_1, \dots, x_n; \lambda_1, \dots, \lambda_n) = \max\{x_1, \dots, x_n\}.
\]
The aim of this appendix is to give a short account of the subject of convexity according to a pair of means (acting respectively on the domain and the codomain) and to bring to attention other classes of convex-like functions.
A.1 Means
By a mean we understand a procedure $M$ to associate to each discrete random variable $X : \{1, \dots, n\} \to \mathbb{R}$ having the probability distribution
\[
P(\{X(k) = x_k\}) = \lambda_k \quad \text{for } k = 1, \dots, n,
\]
a number $M(x_1, \dots, x_n; \lambda_1, \dots, \lambda_n)$ such that
\[
M(x_1, \dots, x_n; \lambda_1, \dots, \lambda_n) \in [\min X, \max X].
\]
We make the convention to omit the weights $\lambda_k$ when they are equal to each other, that is, we put
\[
M(x_1, \dots, x_n) = M(x_1, \dots, x_n; 1/n, \dots, 1/n).
\]
A mean is called:
symmetric if $M(x_1, \dots, x_n) = M(x_{\sigma(1)}, \dots, x_{\sigma(n)})$ for every finite family $x_1, \dots, x_n$ of elements of $I$ and every permutation σ of $\{1, \dots, n\}$ with $n \ge 1$;
continuous when all functions $M(x_1, \dots, x_n; \lambda_1, \dots, \lambda_n)$ are globally continuous;
increasing/strictly increasing when all functions $M(x_1, \dots, x_n; \lambda_1, \dots, \lambda_n)$ with strictly positive weights are increasing/strictly increasing in each of the variables $x_k$ (when the others are kept fixed).
When $I$ is one of the intervals $(0, \infty)$, $[0, \infty)$ or $(-\infty, \infty)$, it is usual to consider homogeneous means, that is, means for which
\[
M(\alpha x_1, \dots, \alpha x_n; \lambda_1, \dots, \lambda_n) = \alpha M(x_1, \dots, x_n; \lambda_1, \dots, \lambda_n)
\]
whenever $\alpha > 0$.
From a probabilistic point of view, means represent generalizations of the concept of expectation (or expected value) of a random variable.
The best known class of symmetric, continuous and homogeneous means on $(0, \infty)$ is that of Hölder’s means (also known as the power means):
\[
M_p(x_1, \dots, x_n; \lambda_1, \dots, \lambda_n) =
\begin{cases}
\min\{x_1, \dots, x_n\} & \text{if } p = -\infty \\
\left( \sum_{k=1}^n \lambda_k x_k^p \right)^{1/p} & \text{if } p \in \mathbb{R} \setminus \{0\} \\
\prod_{k=1}^n x_k^{\lambda_k} & \text{if } p = 0 \\
\max\{x_1, \dots, x_n\} & \text{if } p = \infty.
\end{cases} \tag{A.1}
\]
This family brings together three of the most important means in mathematics: the arithmetic mean $A = M_1$, the geometric mean $G = M_0$ and the harmonic
mean $H = M_{-1}$. Notice that all Hölder’s means of real index are strictly increasing and $M_0(s, t) = \lim_{p \to 0} M_p(s, t)$. Besides, all Hölder’s means of odd index $p \ge 1$ make sense for all families of real numbers.
Hölder’s means admit natural extensions to all positive random variables (discrete or continuous); see Exercise 1, Section 1.7.
A somewhat more general class of means is that of quasi-arithmetic means. In the discrete case, the quasi-arithmetic means are associated to strictly monotonic and continuous functions $\varphi : I \to \mathbb{R}$ via the formulas
\[
M_\varphi(x_1, \dots, x_n; \lambda_1, \dots, \lambda_n) = \varphi^{-1}\left( \sum_{k=1}^n \lambda_k \varphi(x_k) \right);
\]
in the case of an arbitrary random variable, they can be defined as Riemann–Stieltjes integrals,
\[
M_\varphi(X) = \varphi^{-1}\left( \int \varphi(x)\, dF(x) \right),
\]
where $F$ represents the cumulative distribution function of $X$.
Kolmogorov [195] developed the first axiomatic theory of means, establishing the ubiquity of quasi-arithmetic means. A nice account of his contribution, as well as of some recent results concerning the probabilistic applications of quasi-arithmetic means, can be found in the paper of M. de Carvalho [76].
A.1.1 Lemma (K. Knopp and B. Jessen; see [154], p. 66) Suppose that φ and ψ are two continuous functions defined in an interval $I$ such that φ is strictly monotonic and ψ is strictly increasing. Then
\[
M_\varphi(x_1, \dots, x_n; \lambda_1, \dots, \lambda_n) = M_\psi(x_1, \dots, x_n; \lambda_1, \dots, \lambda_n)
\]
for every family $x_1, \dots, x_n$ of elements of $I$ and every family $\lambda_1, \dots, \lambda_n$ of nonnegative numbers with $\sum_{k=1}^n \lambda_k = 1$ ($n \in \mathbb{N}^\star$) if and only if $\psi \circ \varphi^{-1}$ is affine, that is, $\psi = \alpha\varphi + \beta$ for some constants α and β, with $\alpha \ne 0$.
An immediate consequence of Lemma A.1.1 is the fact that every power mean $M_p$ of real index is a quasi-arithmetic mean $M_\varphi$, where $\varphi(x) = \log x$, if $p = 0$, and $\varphi(x) = (x^p - 1)/p$, if $p \ne 0$.
A.1.2 Theorem (M. Nagumo, B. de Finetti and B. Jessen; see [154], p. 68) Let φ be a continuous increasing function on $(0, \infty)$ such that the quasi-arithmetic mean $M_\varphi$ is positively homogeneous. Then $M_\varphi$ is a power mean.
Proof. By Lemma A.1.1, we can replace φ by $\varphi - \varphi(1)$, so we may assume that $\varphi(1) = 0$. The same argument yields two functions α and β such that
\[
\varphi(cx) = \alpha(c)\varphi(x) + \beta(c)
\]
for all $x > 0$, $c > 0$. The condition $\varphi(1) = 0$ shows that $\beta = \varphi$, so for reasons of symmetry,
\[
\varphi(cx) = \alpha(c)\varphi(x) + \varphi(c) = \alpha(x)\varphi(c) + \varphi(x).
\]
Fixing $c \ne 1$, we obtain that α is of the form $\alpha(x) = 1 + k\varphi(x)$ for some constant $k$. Then φ verifies the functional equation
\[
\varphi(xy) = k\varphi(x)\varphi(y) + \varphi(x) + \varphi(y)
\]
for all $x > 0$, $y > 0$. When $k = 0$ we find that $\varphi(x) = C \log x$ for some constant $C$, so $M_\varphi = M_0$. When $k \ne 0$ we notice that $\chi = k\varphi + 1$ verifies $\chi(xy) = \chi(x)\chi(y)$ for all $x > 0$, $y > 0$. This leads to $\varphi(x) = (x^p - 1)/k$, for some $p \ne 0$, hence $M_\varphi = M_p$.
Some authors (such as P. S. Bullen, D. S. Mitrinović and P. M. Vasić [65]) consider a more general concept of mean, associated to functions $M : I \times I \to I$ with the single property that
\[
\min\{x, y\} \le M(x, y) \le \max\{x, y\} \quad \text{for all } x, y \in I.
\]
One can interpret $M(x, y)$ as the mean $M(x, y; 1/2, 1/2)$ of a random variable taking only two values, $x$ and $y$, with probability 1/2. Unfortunately, the problem of indicating "natural" extensions of such means to the entire class of discrete random variables remains open. The difficulty of this problem is illustrated by the case of logarithmic and identric means, which are defined respectively by
\[
L(x, y) = \frac{x - y}{\log x - \log y} \quad \text{and} \quad I(x, y) = \frac{1}{e} \left( \frac{y^y}{x^x} \right)^{1/(y - x)}
\]
for $x, y > 0$, $x \ne y$, and $L(x, x) = I(x, x) = x$ for $x > 0$. The inequalities noticed in Example ??, Section 1.9, can be completed as follows:
\[
G(x, y) < L(x, y) < M_{1/3}(x, y) < M_{2/3}(x, y) < I(x, y) < A(x, y)
\]
for all $x, y > 0$, $x \ne y$. See Tung-Po Lin [215] and K. Stolarsky [363] for the inner inequalities and also for the fact that the logarithmic mean and the identric mean are not power means. Since the logarithmic mean is positively homogeneous, Theorem A.1.2 allows us to conclude that this mean is not quasi-arithmetic. A satisfactory extension of these means for arbitrary random variables was obtained by E. Neuman [268], [269]. See also the comments by Niculescu [286].
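The chain $G < L < M_{1/3} < M_{2/3} < I < A$ is rather tight, which makes it a good candidate for a numerical illustration. The Python sketch below is ours (the function names are hypothetical helpers, not notation from the text); it implements the unweighted two-variable versions of the power, logarithmic and identric means exactly as defined above and checks the chain on a small grid of pairs $x < y$ that are not too close to each other, so that rounding errors do not mask the strict inequalities.

```python
import math


def power_mean(x, y, p):
    """Unweighted Hölder mean M_p(x, y) of two positive numbers (p real)."""
    if p == 0:
        return math.sqrt(x * y)
    return ((x ** p + y ** p) / 2.0) ** (1.0 / p)


def logarithmic_mean(x, y):
    """L(x, y) = (x - y) / (log x - log y), with L(x, x) = x."""
    if x == y:
        return x
    return (x - y) / (math.log(x) - math.log(y))


def identric_mean(x, y):
    """I(x, y) = (1/e) * (y^y / x^x)^(1/(y - x)), with I(x, x) = x.

    Computed through logarithms to avoid overflow in y^y.
    """
    if x == y:
        return x
    return math.exp((y * math.log(y) - x * math.log(x)) / (y - x) - 1.0)


def mean_chain(x, y):
    """The six means G, L, M_{1/3}, M_{2/3}, I, A in the order of the chain."""
    return [
        power_mean(x, y, 0),
        logarithmic_mean(x, y),
        power_mean(x, y, 1.0 / 3.0),
        power_mean(x, y, 2.0 / 3.0),
        identric_mean(x, y),
        power_mean(x, y, 1),
    ]


for x in (0.1, 1.0, 7.5):
    for ratio in (1.5, 2.0, 10.0, 100.0):
        chain = mean_chain(x, x * ratio)
        assert all(lo < hi for lo, hi in zip(chain, chain[1:]))
```

For $x = 1$, $y = 2$ the two inner gaps are of order $10^{-5}$: $L = 1/\log 2 \approx 1.442695$ against $M_{1/3} \approx 1.442747$, and $M_{2/3} \approx 1.471467$ against $I = 4/e \approx 1.471518$.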
A.2 Convexity According to a Pair of Means
According to G. Aumann [20], if $M$ and $N$ are means defined respectively on the intervals $I$ and $J$, a function $f : I \to J$ is called (M, N)-midpoint convex if
\[
f(M(x, y)) \le N(f(x), f(y)) \quad \text{for all } x, y \in I;
\]
it is called (M, N)-midpoint concave if the inequality works in the reverse way, and (M, N)-midpoint affine if the inequality sign is replaced by equality.
The condition of midpoint affinity is essentially a functional equation and this explains why the theory of generalized convexity has much in common with the subject of functional equations.
In what follows we will be interested in a concept of convexity dealing with weighted means.
A.2.1 Definition A function $f : I \to J$ is called (M, N)-convex if
\[
f(M(x, y; 1 - \lambda, \lambda)) \le N(f(x), f(y); 1 - \lambda, \lambda)
\]
for all $x, y \in I$ and all $\lambda \in [0, 1]$. It is called (M, N)-strictly convex when the inequality is strict whenever $x$ and $y$ are distinct points and $\lambda \in (0, 1)$. If $-f$ is (M, N)-convex (respectively, strictly (M, N)-convex) then we say that $f$ is (M, N)-concave (respectively, (M, N)-strictly concave).
Jensen’s criterion of convexity (Theorem 1.1.8 above) can be extended easily to the context of power means (and even to that of quasi-arithmetic means). Thus, in their case, every (M, N)-midpoint convex and continuous function $f : I \to J$ is also (M, N)-convex. Jensen’s inequality works as well, providing the possibility to deal with rather general random variables.
All these facts are consequences of the following simple remark, which reduces the convexity with respect to a pair of quasi-arithmetic means to the usual convexity of a function derived via a change of variable and a change of function.
A.2.2 Lemma (J. Aczél [3]) Suppose that φ and ψ are two continuous and strictly monotonic functions defined respectively on the intervals $I$ and $J$. Then:
(a) if ψ is strictly increasing, a function $f : I \to J$ is $(M_{[\varphi]}, M_{[\psi]})$-convex/concave if and only if $\psi \circ f \circ \varphi^{-1}$ is convex/concave on $\varphi(I)$;
(b) if ψ is strictly decreasing, a function $f : I \to J$ is $(M_{[\varphi]}, M_{[\psi]})$-convex/concave if and only if $\psi \circ f \circ \varphi^{-1}$ is concave/convex on $\varphi(I)$.
A.2.3 Corollary Suppose that $\varphi, \psi : I \to \mathbb{R}$ are two strictly monotonic continuous functions. If ψ is strictly increasing, then $M_{[\varphi]} \le M_{[\psi]}$ if and only if $\psi \circ \varphi^{-1}$ is convex.
Corollary A.2.3 has important consequences. For example, as was noticed by J. Lamperti [204], it yields Clarkson’s inequalities. His basic remark is as follows:
A.2.4 Lemma Suppose that $\Phi : [0, \infty) \to \mathbb{R}$ is a continuous increasing function with $\Phi(0) = 0$ and $\Phi(\sqrt{t})$ convex. Then
\[
\Phi(|z + w|) + \Phi(|z - w|) \ge 2\Phi(|z|) + 2\Phi(|w|), \tag{A.2}
\]
for all $z, w \in \mathbb{C}$, while if $\Phi(\sqrt{t})$ is concave, the reverse inequality is true. Provided the convexity or concavity is strict, equality holds if and only if $zw = 0$.
Clarkson’s inequalities follow for $\Phi(t) = t^p$; in this case the function $\Phi(\sqrt{t}) = t^{p/2}$ is concave for $1 < p \le 2$ and convex for $2 \le p < \infty$ (strictly so when $p \ne 2$).
Proof. When $\Phi(\sqrt{t})$ is convex, we infer from Corollary A.2.3 and the parallelogram law that
\[
\Phi^{-1}\left\{ \frac{\Phi(|z + w|) + \Phi(|z - w|)}{2} \right\} \ge \left\{ \frac{|z + w|^2 + |z - w|^2}{2} \right\}^{1/2} = (|z|^2 + |w|^2)^{1/2}. \tag{A.3}
\]
On the other hand, taking into account the three chords inequality (as stated in Section 1.4), we infer from the convexity of $\Phi(\sqrt{t})$ and the fact that $\Phi(0) = 0$ the increasing monotonicity of $\Phi(\sqrt{t})/t$; the monotonicity is strict provided that the convexity of $\Phi(\sqrt{t})$ is strict. Then
\[
|z|^2 \frac{\Phi(|z|)}{|z|^2} + |w|^2 \frac{\Phi(|w|)}{|w|^2} \le \left( |z|^2 + |w|^2 \right) \frac{\Phi\left( (|z|^2 + |w|^2)^{1/2} \right)}{|z|^2 + |w|^2},
\]
and the inequality is strict when $\Phi(\sqrt{t})$ is strictly convex and $zw \ne 0$. Therefore
\[
\Phi^{-1}(\Phi(|z|) + \Phi(|w|)) \le (|z|^2 + |w|^2)^{1/2}
\]
and this fact together with (A.3) ends the proof in the case where $\Phi(\sqrt{t})$ is convex. The case where $\Phi(\sqrt{t})$ is concave can be treated in a similar way.
According to Lemma A.2.2, a function $f : (0, \alpha) \to (0, \infty)$ is (H, G)-convex (concave) on $(0, \alpha)$ if and only if $\log f(1/x)$ is convex (concave) on $(1/\alpha, \infty)$. Clearly, the "strict" variant also works.
The following example of a (H, G)-strictly concave function is due to D. Borwein, J. Borwein, G. Fee and R. Girgensohn [54].
A.2.5 Example Given $\alpha > 1$, the function $V_\alpha(p) = 2^{\alpha} \dfrac{\Gamma(1 + 1/p)^{\alpha}}{\Gamma(1 + \alpha/p)}$ verifies the inequality
\[
V_\alpha^{1-\lambda}(p)\, V_\alpha^{\lambda}(q) < V_\alpha\left( \frac{1}{\frac{1-\lambda}{p} + \frac{\lambda}{q}} \right),
\]
for all $p, q > 0$ with $p \ne q$ and all $\lambda \in (0, 1)$.
This example has a geometric motivation. When $\alpha > 1$ is an integer number and $p \ge 1$ is a real number, $V_\alpha(p)$ represents the volume of the ellipsoid $\{x \in \mathbb{R}^{\alpha} : \|x\|_{L^p} \le 1\}$.
According to a remark above, it suffices to prove that the function
\[
U_\alpha(x) = -\log(V_\alpha(1/x)/2^{\alpha}) = \log \Gamma(1 + \alpha x) - \alpha \log \Gamma(1 + x)
\]
is strictly convex on $(0, \infty)$ for every $\alpha > 1$. Using the psi function,
\[
\mathrm{Psi}(x) = \frac{d}{dx} \log \Gamma(x),
\]
we have
\[
U_\alpha''(x) = \alpha^2 \frac{d}{dx} \mathrm{Psi}(1 + \alpha x) - \alpha \frac{d}{dx} \mathrm{Psi}(1 + x).
\]
The condition $U_\alpha''(x) > 0$ on $(0, \infty)$ is equivalent to $(x/\alpha) U_\alpha''(x) > 0$ on $(0, \infty)$, and the latter holds if the function $x \to x \frac{d}{dx} \mathrm{Psi}(1 + x)$ is strictly increasing. Now, a classical formula in special function theory [16], [140] asserts that
\[
\frac{d}{dx} \mathrm{Psi}(1 + x) = \int_0^\infty \frac{u e^{-ux}}{e^u - 1}\, du,
\]
whence we infer that
\[
\frac{d}{dx} \left( x \frac{d}{dx} \mathrm{Psi}(1 + x) \right) = \int_0^\infty \frac{u[(u - 1)e^u + 1] e^{-ux}}{(e^u - 1)^2}\, du > 0.
\]
The result now follows.
Notice that the volume function $V_n(p)$ is neither convex nor concave for $n \ge 3$.
The following example is devoted to the presence of generalized convexity in the framework of hypergeometric functions.
A.2.6 Example The Gaussian hypergeometric function (of parameters $a, b, c > 0$) is defined via the formula
\[
F(x) = {}_2F_1(x; a, b, c) = \sum_{n=0}^\infty \frac{(a, n)(b, n)}{(c, n)\, n!} x^n \quad \text{for } |x| < 1,
\]
where $(a, n) = a(a + 1) \cdots (a + n - 1)$ if $n \ge 1$ and $(a, 0) = 1$. Anderson, Vamanamurthy and Vuorinen [15] proved that if $a + b \ge c > 2ab$ and $c \ge a + b - 1/2$, then the function $1/F(x)$ is concave on $(0, 1)$. This implies
\[
F\left( \frac{x + y}{2} \right) \le \frac{1}{\frac{1}{2}\left( \frac{1}{F(x)} + \frac{1}{F(y)} \right)} \quad \text{for all } x, y \in (0, 1),
\]
whence it follows that the hypergeometric function is (A, H)-convex.
Last but not least, Lemma A.2.2 offers a simple but powerful tool for deriving the entire theory of $(M_r, M_s)$-convex functions from that of usual convex functions. We will illustrate this fact in the next section, by discussing the multiplicative analogue of usual convexity.
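The (H, G)-strict concavity asserted in Example A.2.5 can be illustrated numerically. The Python sketch below is ours and rests on one stated assumption: that $V_\alpha(p) = 2^{\alpha}\, \Gamma(1 + 1/p)^{\alpha} / \Gamma(1 + \alpha/p)$, which is the classical formula for the volume of the unit ball of the $\ell^p$ norm in dimension α. The helper names (`log_volume`, `weighted_harmonic_mean`, `hg_concavity_gap`) are hypothetical; the computation is carried out on the logarithmic scale with the standard function `math.lgamma`.

```python
import math


def log_volume(alpha, p):
    """log V_alpha(p), assuming V_alpha(p) = 2^alpha * Gamma(1+1/p)^alpha / Gamma(1+alpha/p)."""
    return (
        alpha * math.log(2.0)
        + alpha * math.lgamma(1.0 + 1.0 / p)
        - math.lgamma(1.0 + alpha / p)
    )


def weighted_harmonic_mean(p, q, lam):
    """H(p, q; 1 - lam, lam) = 1 / ((1 - lam)/p + lam/q)."""
    return 1.0 / ((1.0 - lam) / p + lam / q)


def hg_concavity_gap(alpha, p, q, lam):
    """log V(H(p, q)) - [(1 - lam) log V(p) + lam log V(q)].

    A strictly positive value means that the inequality of Example A.2.5
    holds strictly at (p, q, lam).
    """
    mid = log_volume(alpha, weighted_harmonic_mean(p, q, lam))
    return mid - ((1.0 - lam) * log_volume(alpha, p) + lam * log_volume(alpha, q))


values = (0.5, 1.0, 2.0, 5.0, 20.0)
smallest_gap = min(
    hg_concavity_gap(alpha, p, q, k / 10.0)
    for alpha in (1.5, 2.0, 3.0, 10.0)
    for p in values
    for q in values
    if p != q
    for k in range(1, 10)
)
assert smallest_gap > 0.0
```

As a consistency check on the assumed formula, $V_2(2) = 4\, \Gamma(3/2)^2 / \Gamma(2) = \pi$, the area of the unit disc.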
A.3 A Case Study: Convexity According to the Geometric Mean
The following relative of usual convexity was first considered by P. Montel [260] in a beautiful paper discussing different analogues of convex functions in several variables.
A.3.1 Definition A function $f : I \to J$, acting on subintervals of $(0, \infty)$, is called multiplicatively convex (equivalently, (G, G)-convex) if it verifies the condition
\[
f(x^{1-\lambda} y^{\lambda}) \le f(x)^{1-\lambda} f(y)^{\lambda} \tag{GG}
\]
whenever $x, y \in I$ and $\lambda \in [0, 1]$.
The related notions of multiplicatively strictly convex function, multiplicatively concave function, multiplicatively strictly concave function and multiplicatively affine function can be introduced in the natural manner.
The basic tool in translating the results known for convex functions into results valid for multiplicatively convex functions is the following particular case of Lemma A.2.2:
A.3.2 Lemma Suppose that $I$ is a subinterval of $(0, \infty)$ and $f : I \to (0, \infty)$ is a multiplicatively (strictly) convex function on $I$. Then
\[
F = \log \circ f \circ \exp : \log(I) \to \mathbb{R}
\]
is a (strictly) convex function. Conversely, if $J$ is an interval of $\mathbb{R}$ and $F : J \to \mathbb{R}$ is a (strictly) convex function, then
\[
f = \exp \circ F \circ \log : \exp(J) \to (0, \infty)
\]
is a multiplicatively (strictly) convex function.
The corresponding "concave" variant of this lemma also works.
This approach of multiplicative convexity follows C. P. Niculescu [272], [274].
According to Lemma A.3.2, every multiplicatively convex function $f : I \to (0, \infty)$ has finite lateral derivatives at each interior point of $I$ (and the set of all points where $f$ is not differentiable is at most countable). As a consequence, every multiplicatively convex function is continuous at the interior points of its domain.
The following result represents the multiplicative analogue of Jensen’s criterion of convexity (Theorem 1.1.8 above).
A.3.3 Theorem Suppose that $I$ is a subinterval of $(0, \infty)$ and $f : I \to (0, \infty)$ is a function continuous on the interior of $I$. Then $f$ is multiplicatively convex if and only if
\[
f(\sqrt{xy}) \le \sqrt{f(x) f(y)} \quad \text{for all } x, y \in I.
\]
A large class of strictly multiplicatively convex functions is indicated by the following result:
A.3.4 Proposition (G. H. Hardy, J. E. Littlewood and G. Pólya [154, Theorem 177, p. 125]) Every polynomial $P(x)$ with nonnegative coefficients is a multiplicatively convex function on $(0, \infty)$. More generally, every real analytic function $f(x) = \sum_{n=0}^\infty c_n x^n$ with nonnegative coefficients is a multiplicatively convex function on $(0, R)$, where $R$ denotes the radius of convergence.
Moreover, except for the case of functions $Cx^n$ (with $C > 0$ and $n \in \mathbb{N}$), the above examples exhibit strictly multiplicatively convex functions (which are also increasing and strictly convex). In particular,
exp, sinh and cosh on $(0, \infty)$;
tan, sec, csc and $\frac{1}{x} - \cot x$ on $(0, \pi/2)$;
arcsin on $(0, 1]$;
$-\log(1 - x)$ and $\frac{1 + x}{1 - x}$ on $(0, 1)$.
See the table of series in I. S. Gradshteyn and I. M. Ryzhik [140].
Proof. By continuity, it suffices to prove only the first assertion. Suppose that $P(x) = \sum_{n=0}^N c_n x^n$. According to Theorem A.3.3, we have to prove that
\[
x, y > 0 \text{ implies } (P(\sqrt{xy}))^2 \le P(x) P(y),
\]
or, equivalently,
\[
x, y > 0 \text{ implies } (P(xy))^2 \le P(x^2) P(y^2).
\]
The latter implication is an easy consequence of the Cauchy–Bunyakovsky–Schwarz inequality.
The following remark collects a series of useful facts concerning the multiplicative convexity of concrete functions:
A.3.5 Remark (a) If a function is log-convex and nondecreasing, then it is multiplicatively convex.
(b) If a function $f$ is multiplicatively convex, then the function $1/f$ is multiplicatively concave (and vice versa).
(c) If a function $f$ is multiplicatively convex, increasing and one-to-one, then its inverse is multiplicatively concave (and vice versa).
(d) If a function $f$ is multiplicatively convex, so is $x^{\alpha} [f(x)]^{\beta}$ (for all $\alpha \in \mathbb{R}$ and all $\beta > 0$).
(e) If $f$ is continuous, and one of the functions $f(x)^x$ and $f(e^{1/\log x})$ is multiplicatively convex, then so is the other.
(f) The general form of multiplicatively affine functions is $Cx^{\alpha}$ with $C > 0$ and $\alpha \in \mathbb{R}$.
The indefinite integral of a multiplicatively convex function has the same nature:
A.3.6 Proposition (P. Montel [260]) Let $f : [0, a) \to [0, \infty)$ be a continuous function which is multiplicatively convex on $(0, a)$. Then
\[
F(x) = \int_0^x f(t)\, dt
\]
is also continuous on [0, a) and multiplicatively convex on (0, a).
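Proof. Due to the continuity of $F$, it suffices to show that
\[
(F(\sqrt{xy}))^2 \le F(x) F(y) \quad \text{for all } x, y \in [0, a),
\]
which is a consequence of the corresponding inequality at the level of integral sums,
\[
\left[ \frac{\sqrt{xy}}{n} \sum_{k=0}^{n-1} f\left( k \frac{\sqrt{xy}}{n} \right) \right]^2 \le \left[ \frac{x}{n} \sum_{k=0}^{n-1} f\left( k \frac{x}{n} \right) \right] \left[ \frac{y}{n} \sum_{k=0}^{n-1} f\left( k \frac{y}{n} \right) \right],
\]
that is, of the inequality
\[
\left[ \sum_{k=0}^{n-1} f\left( k \frac{\sqrt{xy}}{n} \right) \right]^2 \le \left[ \sum_{k=0}^{n-1} f\left( k \frac{x}{n} \right) \right] \left[ \sum_{k=0}^{n-1} f\left( k \frac{y}{n} \right) \right].
\]
To see that the latter inequality holds, first notice that
\[
\left[ f\left( k \frac{\sqrt{xy}}{n} \right) \right]^2 \le \left[ f\left( k \frac{x}{n} \right) \right] \left[ f\left( k \frac{y}{n} \right) \right]
\]
and then apply the Cauchy–Bunyakovsky–Schwarz inequality.
According to Proposition A.3.6, the logarithmic integral,
\[
\mathrm{Li}(x) = \int_2^x \frac{dt}{\log t}, \quad x \ge 2,
\]
is multiplicatively convex. This function is important in number theory. For example, if $\pi(x)$ counts the number of primes $p$ such that $2 \le p \le x$, then an equivalent formulation of the Riemann hypothesis is the existence of a function $C : (0, \infty) \to (0, \infty)$ such that
\[
|\pi(x) - \mathrm{Li}(x)| \le C(\varepsilon)\, x^{1/2 + \varepsilon} \quad \text{for all } x \ge 2 \text{ and all } \varepsilon > 0.
\]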
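The midpoint criterion of Theorem A.3.3 lends itself to a quick numerical experiment. The Python sketch below is only an illustration with names of our own choosing: `poly` is a hypothetical polynomial with nonnegative coefficients (as in Proposition A.3.4), `poly_integral` is its antiderivative vanishing at 0 (as in Proposition A.3.6), and `gg_midpoint_gap` measures $\sqrt{f(x) f(y)} - f(\sqrt{xy})$, which should be nonnegative for a multiplicatively convex function.

```python
import math


def gg_midpoint_gap(f, x, y):
    """sqrt(f(x) * f(y)) - f(sqrt(x * y)); nonnegative under the midpoint condition."""
    return math.sqrt(f(x) * f(y)) - f(math.sqrt(x * y))


def poly(t):
    """Hypothetical test polynomial with nonnegative coefficients: 1 + 2t + 3t^4."""
    return 1.0 + 2.0 * t + 3.0 * t ** 4


def poly_integral(t):
    """Antiderivative of poly vanishing at 0: t + t^2 + (3/5) t^5."""
    return t + t ** 2 + 0.6 * t ** 5


points = [0.01, 0.3, 1.0, 2.5, 10.0, 40.0]
for f in (poly, poly_integral, math.exp, math.sinh):
    for x in points:
        for y in points:
            gap = gg_midpoint_gap(f, x, y)
            scale = math.sqrt(f(x) * f(y))
            # Relative tolerance: the values of f range over many orders of magnitude.
            assert gap >= -1e-12 * scale
```

For a monomial $Cx^n$ the gap vanishes identically, in agreement with the fact that these are the multiplicatively affine functions.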
Since the function tan is continuous on $[0, \pi/2)$ and strictly multiplicatively convex on $(0, \pi/2)$, a repeated application of Proposition A.3.6 shows that the Lobacevski’s function
\[
L(x) = -\int_0^x \log \cos t\, dt
\]
is strictly multiplicatively convex on $(0, \pi/2)$.
Starting with $t/(\sin t)$ (which is strictly multiplicatively convex on $(0, \pi/2]$) and then switching to $(\sin t)/t$, a similar argument leads us to the fact that the integral sine function,
\[
\mathrm{Si}(x) = \int_0^x \frac{\sin t}{t}\, dt,
\]
is strictly multiplicatively concave on $(0, \pi/2]$.
A.3.7 Proposition Γ is a strictly multiplicatively convex function on $[1, \infty)$.
Proof. In fact, $\log \Gamma(1 + x)$ is strictly convex and increasing on $(1, \infty)$. Moreover, an increasing strictly convex function of a strictly convex function is strictly convex. Hence, $F(x) = \log \Gamma(1 + e^x)$ is strictly convex on $(0, \infty)$ and thus
\[
\Gamma(1 + x) = \exp F(\log x)
\]
is strictly multiplicatively convex on $[1, \infty)$. As $\Gamma(1 + x) = x\Gamma(x)$, we conclude that Γ itself is strictly multiplicatively convex on $[1, \infty)$.
As was noted by T. Trif [369], the result of Proposition A.3.7 can be improved: the gamma function is strictly multiplicatively concave on $(0, \alpha]$ and strictly multiplicatively convex on $[\alpha, \infty)$, where $\alpha = 0.21609...$ is the unique positive solution of the equation $\mathrm{Psi}(x) + x \frac{d}{dx} \mathrm{Psi}(x) = 0$.
D. Gronau and J. Matkowski [141] have proved the following multiplicative analogue of the Bohr–Mollerup Theorem: If $f : (0, \infty) \to (0, \infty)$ verifies the functional equation
\[
f(x + 1) = x f(x),
\]
the normalization condition $f(1) = 1$ and $f$ is multiplicatively convex on an interval $(a, \infty)$, for some $a > 0$, then $f = \Gamma$.
Another application of Proposition A.3.7 is the fact that the function $\Gamma(2x + 1)/\Gamma(x + 1)$ is strictly multiplicatively convex on $[1, \infty)$. This can be seen by using the Gauss–Legendre duplication formula (see [84], Theorem 10.3.10, p. 356).
A.3.8 Remark The following result due to Matkowski [235] connects convexity, multiplicative convexity and (L, L)-convexity: every convex and multiplicatively convex function is (L, L)-convex. The "concave" variant of this result also works. As a consequence, the exponential function is an example of an (L, L)-convex function on $(0, \infty)$, while the tangent function is an (L, L)-convex function on $(0, \pi/2)$.
Exercises
1. Let $f : I \to (0, \infty)$ be a differentiable function defined on a subinterval $I$ of $(0, \infty)$. Prove that the following assertions are equivalent:
(a) $f$ is multiplicatively convex;
(b) the function $x f'(x)/f(x)$ is nondecreasing;
(c) $f$ verifies the inequality
\[
\frac{f(x)}{f(y)} \ge \left( \frac{x}{y} \right)^{y f'(y)/f(y)} \quad \text{for all } x, y \in I.
\]
A similar statement works for the multiplicatively concave functions. Illustrate this fact by considering the restriction of $\sin(\cos x)$ to $(0, \pi/2)$.
2. Let $f : I \to (0, \infty)$ be a twice differentiable function defined on a subinterval $I$ of $(0, \infty)$. Prove that $f$ is multiplicatively convex if and only if it verifies the differential inequality
\[
x[f(x) f''(x) - f'^2(x)] + f(x) f'(x) \ge 0 \quad \text{for all } x > 0.
\]
Infer that the integral sine function is multiplicatively concave.
3. (A multiplicative variant of Popoviciu’s inequality) Prove that
\[
\Gamma(x)\Gamma(y)\Gamma(z)\, \Gamma^3(\sqrt[3]{xyz}) \ge \Gamma^2(\sqrt{xy})\, \Gamma^2(\sqrt{yz})\, \Gamma^2(\sqrt{zx})
\]
for all $x, y, z \ge 1$; the equality occurs only for $x = y = z$.
4. (An estimate of Jensen gap in the multiplicative case [272]) Suppose that $f$ is a twice differentiable multiplicatively convex function defined on a subinterval $I$ of $(0, \infty)$ and put
\[
\alpha(f) = \inf_{x \in \log(I)} \frac{d^2}{dx^2} \log f(e^x) \quad \text{and} \quad \beta(f) = \sup_{x \in \log(I)} \frac{d^2}{dx^2} \log f(e^x).
\]
Prove that ( α(f ) exp 2n2
∑
(log xj − log xk )2
1≤j