A NOTE ON THE MONGE-KANTOROVICH PROBLEM IN THE PLANE

COMMUNICATIONS ON PURE AND APPLIED ANALYSIS Volume 14, Number 2, March 2015

doi:10.3934/cpaa.2015.14. pp. –


Zuo Quan Xu Department of Applied Mathematics The Hong Kong Polytechnic University, Kowloon, Hong Kong

Jia-An Yan Academy of Mathematics and Systems Science Chinese Academy of Sciences, Beijing, China

(Communicated by Zhen Lei)

Abstract. The Monge-Kantorovich mass-transportation problem has been shown in recent years to be fundamental for various basic problems in analysis and geometry. Shen and Zheng propose a probabilistic method to transform the celebrated Monge-Kantorovich problem in a bounded region of the Euclidean plane into a Dirichlet boundary problem associated with a nonlinear elliptic equation. Their results are original and sound; however, many steps of the arguments leading to the main results are omitted, which makes them difficult to follow. In the present paper, we adopt a different approach and give a short, easy-to-follow and detailed proof of their main results.

1. Introduction. The optimal transportation problem was first raised by Monge in 1781 [1]. Let $X$ and $Y$ be two separable metric spaces, and let $c : X \times Y \to [0, \infty]$ be a Borel-measurable function, where $c(x, y)$ is the cost of transportation from $x$ to $y$. Given probability measures $\mu$ on $X$ and $\nu$ on $Y$, Monge's formulation of the optimal transportation problem is to find a transport map $T : X \to Y$ that realizes the infimum
\[
\inf\left\{ \int_X c(x, T(x))\, d\mu(x) : T^{-1}(\mu) = \nu \right\},
\]

where $T^{-1}(\mu) = \nu$ means that $\nu(A) = \mu(T^{-1}(A))$ for every Borel set $A$ in $Y$. This is sometimes written as $\mu T^{-1} = \nu$. A map $T$ that attains this infimum is called an optimal transport map. Monge's formulation of the optimal transportation problem can be ill-posed, because sometimes there is no $T$ satisfying $T^{-1}(\mu) = \nu$. In [2], Kantorovich reformulated this problem as follows: find a probability measure $\gamma$ on $X \times Y$ that attains the infimum
\[
\inf\left\{ \int_{X \times Y} c(x, y)\, d\gamma(x, y) : \gamma \in \Gamma(\mu, \nu) \right\},
\]

where $\Gamma(\mu, \nu)$ denotes the collection of all probability measures on $X \times Y$ with marginal measures $\mu$ on $X$ and $\nu$ on $Y$. It is known that a minimizer for this problem always exists when the cost function $c$ is lower semi-continuous and $\Gamma(\mu, \nu)$ is a tight collection of measures. Such an optimization problem is called the Monge-Kantorovich problem.

2000 Mathematics Subject Classification. Primary: 35J62; Secondary: 49K20.
Key words and phrases. Monge-Kantorovich problem, transportation problem, calculus of variations, Dirichlet boundary problem, comonotonic random variable.

2. 1-dimensional case. Let us consider a special case: $X$ and $Y$ are both 1-dimensional Euclidean domains, and $c(x, y) = |x - y|^2$. Let $(\Omega, \mathcal{F}, P)$ be a nonatomic probability space. We denote by $L(F, G)$ the set of all 2-dimensional random variables whose marginal distributions are $F$ and $G$, respectively. Then the Monge-Kantorovich problem can be reformulated as follows: find an optimal coupling $(X, Y) \in L(F, G)$ that realizes the infimum
\[
\inf\left\{ E[(X - Y)^2] : (X, Y) \in L(F, G) \right\}.
\]
It is well known and easily proved (see, e.g., [3, 4]) that if $(\tilde{X}, \tilde{Y}) \in L(F, G)$, and $\tilde{X}$ and $\tilde{Y}$ are comonotonic$^1$, then $(\tilde{X}, \tilde{Y})$ is an optimal coupling, that is,
\[
(\tilde{X}, \tilde{Y}) = \operatorname*{argmin}_{(X,Y) \in L(F,G)} E[(X - Y)^2],
\]
and the minimum value is
\[
\min_{(X,Y) \in L(F,G)} E[(X - Y)^2] = E[(\tilde{X} - \tilde{Y})^2] = \int_0^1 |F^{(-1)}(t) - G^{(-1)}(t)|^2\, dt,
\]

where $F^{(-1)}(\cdot)$ and $G^{(-1)}(\cdot)$ denote the left-continuous inverse functions of $F(\cdot)$ and $G(\cdot)$, respectively.

$^1$ Two real-valued random variables $X$ and $Y$ are said to be comonotonic if $(X(\omega') - X(\omega))(Y(\omega') - Y(\omega)) \geq 0$ almost surely under $P \otimes P$.

3. Main idea of [5]. Shen and Zheng consider the Monge-Kantorovich problem in the Euclidean plane in [5]. Given two 2-dimensional distribution functions $F$ and $G$, the Monge-Kantorovich problem reduces to finding an optimal coupling $(\mathcal{X}, \mathcal{Y})$ whose marginal distributions are $F$ and $G$, respectively, such that
\[
E[\|\mathcal{X} - \mathcal{Y}\|^2] := E[(X_1 - Y_1)^2 + (X_2 - Y_2)^2]
\]
attains the infimum over all such couplings, where $\mathcal{X} = (X_1, X_2)$ and $\mathcal{Y} = (Y_1, Y_2)$. Denote $\mathcal{Z} = (X_1, Y_2)$; then
\[
E[\|\mathcal{X} - \mathcal{Y}\|^2] = E[(X_1 - Y_1)^2 + (X_2 - Y_2)^2] = E[\|\mathcal{Z} - \mathcal{Y}\|^2] + E[\|\mathcal{X} - \mathcal{Z}\|^2].
\]
Assuming that the random vectors $\mathcal{X}$, $\mathcal{Y}$ and $\mathcal{Z}$ all have smooth and strictly positive density functions, Shen and Zheng use the random vector $\mathcal{Z}$ to reduce the dimension of the decision variable, turning the original optimal coupling problem on $(\mathcal{X}, \mathcal{Y})$ into an optimization problem on the distribution of $\mathcal{Z}$. It is assumed in [5] that the random vectors take values in a bounded region, and the problem is reformulated there in a new probability space $(\tilde{\Omega}, \tilde{\mathcal{F}}, \tilde{P})$ with $\tilde{\Omega} = [0, 1] \times [0, 1]$ and $\tilde{P}$ being the Lebesgue measure. In fact, as we will see below, this restriction and reformulation are not needed.

The main approach in [5] consists of two steps. First, for each fixed pair $(X_1, Y_2)$, they adopt a probabilistic approach to find the best $X_2$ and $Y_1$ to minimize $E[(X_2 - Y_2)^2]$ and $E[(X_1 - Y_1)^2]$ under the constraint that the joint distribution of $(X_1, X_2)$ is the given distribution $F$ and that of $(Y_1, Y_2)$ is $G$. After this step, the optimal coupling problem boils down to an optimization problem over all possible joint probability density functions of $(X_1, Y_2)$. They then propose a calculus of variations method to solve the above optimization problem.

4. Reduction in the dimension. As in [5], our first step is to construct two functions $g(\cdot, \cdot)$ and $h(\cdot, \cdot)$ satisfying $(X_1, g(X_1, Y_2)) \sim (X_1, X_2)$ and $(h(X_1, Y_2), Y_2) \sim (Y_1, Y_2)$. Here $X \sim Y$ means that $X$ and $Y$ have the same distribution. Let $f(\cdot, \cdot)$ be the probability density function of the 2-dimensional random vector $\mathcal{X} = (X_1, X_2)$. Then the conditional distribution of $X_2$ given $X_1 = x$ is
\[
F_{X_2|X_1}(y|x) = P(X_2 \leq y \mid X_1 = x) = \frac{\int_{-\infty}^{y} f(x, t)\, dt}{\int_{\mathbb{R}} f(x, t)\, dt}.
\]
For each fixed $x$, denote the inverse function of $F_{X_2|X_1}(\cdot|x)$ by $G(x, \cdot)$, that is, $F_{X_2|X_1}(G(x, \cdot)|x) = \cdot$. Let $p(\cdot, \cdot)$ be the probability density function of the 2-dimensional random vector $\mathcal{Z} = (X_1, Y_2)$. Then the conditional distribution of $Y_2$ given $X_1 = x$ is
\[
F_{Y_2|X_1}(y|x) = P(Y_2 \leq y \mid X_1 = x) = \frac{\int_{-\infty}^{y} p(x, t)\, dt}{\int_{\mathbb{R}} p(x, t)\, dt} = \frac{\int_{-\infty}^{y} p(x, t)\, dt}{\int_{\mathbb{R}} f(x, t)\, dt},
\]
where we used the fact that
\[
\int_{\mathbb{R}} p(x, t)\, dt = F'_{X_1}(x) = \int_{\mathbb{R}} f(x, t)\, dt
\]

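in the last identity. Now define
\[
g(x, y) := G(x, F_{Y_2|X_1}(y|x)).
\]
Similarly, we define
\[
h(x, y) := \tilde{G}(F_{X_1|Y_2}(x|y), y),
\]
where, for each fixed $y$, $\tilde{G}(\cdot, y)$ is the inverse function of
\[
F_{Y_1|Y_2}(x|y) = P(Y_1 \leq x \mid Y_2 = y) = \frac{\int_{-\infty}^{x} \tilde{f}(u, y)\, du}{\int_{\mathbb{R}} \tilde{f}(u, y)\, du},
\]
and $\tilde{f}(\cdot, \cdot)$ is the probability density function of the 2-dimensional random vector $\mathcal{Y} = (Y_1, Y_2)$. Then $F_{X_1|Y_2}(x|y)$ is given by
\[
F_{X_1|Y_2}(x|y) = P(X_1 \leq x \mid Y_2 = y) = \frac{\int_{-\infty}^{x} p(u, y)\, du}{\int_{\mathbb{R}} p(u, y)\, du} = \frac{\int_{-\infty}^{x} p(u, y)\, du}{\int_{\mathbb{R}} \tilde{f}(u, y)\, du},
\]
where we used the fact that
\[
\int_{\mathbb{R}} p(u, y)\, du = F'_{Y_2}(y) = \int_{\mathbb{R}} \tilde{f}(u, y)\, du
\]
in the last identity. Let
\[
\widehat{\mathcal{X}} := (X_1, g(X_1, Y_2)), \qquad \widehat{\mathcal{Y}} := (h(X_1, Y_2), Y_2).
\]
In [5], Shen and Zheng claim that $\widehat{\mathcal{X}} \sim \mathcal{X}$ and $\widehat{\mathcal{Y}} \sim \mathcal{Y}$ without giving a proof. For the reader's convenience, we give a proof here.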
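The construction of $g$ can be illustrated numerically. The following is only a sketch under an assumed Gaussian setup that is not taken from the paper: the target law of $(X_1, X_2)$ is a standard bivariate normal with correlation `RHO`, and $\mathcal{Z} = (X_1, Y_2)$ is generated with $Y_2 \mid X_1 = x \sim N(Ax, S^2)$. The names `RHO`, `A`, `S`, `sample_hat_x` and `second_moments` are hypothetical. In this setup the samples of $(X_1, g(X_1, Y_2))$ should reproduce the second moments of $(X_1, X_2)$.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal: provides cdf and inv_cdf

# Illustrative Gaussian setup (an assumption, not from the paper):
#   target law of (X1, X2): standard bivariate normal with correlation RHO,
#     so X2 | X1 = x  ~  N(RHO * x, 1 - RHO**2);
#   law of Z = (X1, Y2): Y2 | X1 = x  ~  N(A * x, S**2).
RHO, A, S = 0.5, -0.3, 2.0


def cond_cdf_y2_given_x1(y, x):
    """F_{Y2|X1}(y|x) for the illustrative Gaussian Z."""
    return STD.cdf((y - A * x) / S)


def G(x, v):
    """Inverse of F_{X2|X1}(.|x), evaluated at v in (0, 1)."""
    v = min(max(v, 1e-12), 1.0 - 1e-12)  # keep inv_cdf away from 0 and 1
    return RHO * x + math.sqrt(1.0 - RHO ** 2) * STD.inv_cdf(v)


def g(x, y):
    """g(x, y) := G(x, F_{Y2|X1}(y|x))."""
    return G(x, cond_cdf_y2_given_x1(y, x))


def sample_hat_x(n, seed=0):
    """Draw n samples of (X1, g(X1, Y2)) with (X1, Y2) distributed as Z."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        y2 = A * x1 + S * rng.gauss(0.0, 1.0)
        out.append((x1, g(x1, y2)))
    return out


def second_moments(pairs):
    """Return (variance of 1st coordinate, variance of 2nd, covariance)."""
    n = len(pairs)
    m1 = sum(a for a, _ in pairs) / n
    m2 = sum(b for _, b in pairs) / n
    v1 = sum((a - m1) ** 2 for a, _ in pairs) / n
    v2 = sum((b - m2) ** 2 for _, b in pairs) / n
    cov = sum((a - m1) * (b - m2) for a, b in pairs) / n
    return v1, v2, cov


if __name__ == "__main__":
    # For the target law one expects roughly v1 = 1, v2 = 1, cov = RHO.
    print(second_moments(sample_hat_x(20000)))
```

Matching second moments is of course weaker than equality in distribution; the proof that follows establishes the latter in general.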


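In fact, for any bounded Borel function $B$ on $\mathbb{R}^2$, we have
\[
\begin{aligned}
E[B(X_1, g(X_1, Y_2))] &= \int_{\mathbb{R}} \int_{\mathbb{R}} B(x, g(x, y))\, p(x, y)\, dx\, dy \\
&= \int_{\mathbb{R}} \int_{\mathbb{R}} B\big(x, G(x, F_{Y_2|X_1}(y|x))\big)\, p(x, y)\, dx\, dy \\
&= \int_{\mathbb{R}} \int_{\mathbb{R}} B\left(x, G\left(x, \frac{\int_{-\infty}^{y} p(x, t)\, dt}{\int_{\mathbb{R}} f(x, t)\, dt}\right)\right) p(x, y)\, dx\, dy.
\end{aligned}
\]
For each fixed $x$, applying the change of variable
\[
v = \frac{\int_{-\infty}^{y} p(x, t)\, dt}{\int_{\mathbb{R}} f(x, t)\, dt},
\]
we have
\[
dv = \frac{p(x, y)}{\int_{\mathbb{R}} f(x, t)\, dt}\, dy,
\]
and consequently,
\[
\begin{aligned}
E[B(X_1, g(X_1, Y_2))] &= \int_{\mathbb{R}} \int_{\mathbb{R}} B\left(x, G\left(x, \frac{\int_{-\infty}^{y} p(x, t)\, dt}{\int_{\mathbb{R}} f(x, t)\, dt}\right)\right) p(x, y)\, dy\, dx \\
&= \int_{\mathbb{R}} \int_0^1 B(x, G(x, v)) \int_{\mathbb{R}} f(x, t)\, dt\, dv\, dx.
\end{aligned}
\]
For each fixed $x$, applying the change of variable $u = G(x, v)$, we obtain
\[
F_{X_2|X_1}(u|x) = F_{X_2|X_1}(G(x, v)|x) = v, \qquad dv = F'_{X_2|X_1}(u|x)\, du = \frac{f(x, u)}{\int_{\mathbb{R}} f(x, t)\, dt}\, du,
\]
and consequently,
\[
\begin{aligned}
E[B(X_1, g(X_1, Y_2))] &= \int_{\mathbb{R}} \int_0^1 B(x, G(x, v)) \int_{\mathbb{R}} f(x, t)\, dt\, dv\, dx \\
&= \int_{\mathbb{R}} \int_{\mathbb{R}} B(x, u)\, f(x, u)\, du\, dx = E[B(X_1, X_2)].
\end{aligned}
\]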

This indicates $\widehat{\mathcal{X}} \sim \mathcal{X}$. Similarly, one can prove that $\widehat{\mathcal{Y}} \sim \mathcal{Y}$.

In [5], Shen and Zheng claim that "if $(\mathcal{X}, \mathcal{Y})$ is the optimal coupling, then the above vector $(\widehat{\mathcal{X}}, \widehat{\mathcal{Y}})$ have the same optimal joint distribution. Thus we have $E[\|\mathcal{X} - \mathcal{Y}\|^2] = E[\|\widehat{\mathcal{X}} - \widehat{\mathcal{Y}}\|^2]$." Unfortunately, we are not able to prove that $(\mathcal{X}, \mathcal{Y})$ and $(\widehat{\mathcal{X}}, \widehat{\mathcal{Y}})$ have the same joint distribution. Fortunately, we will show that
\[
E[\|\mathcal{X} - \mathcal{Y}\|^2] \geq E[\|\widehat{\mathcal{X}} - \widehat{\mathcal{Y}}\|^2],
\]
which implies that if $(\mathcal{X}, \mathcal{Y})$ is an optimal coupling, so is $(\widehat{\mathcal{X}}, \widehat{\mathcal{Y}})$.

In fact, given $Y_2 = y$, the conditional distributions of $X_1$ and $Y_1$ are $F_{X_1|Y_2}(\cdot|y)$ and $F_{Y_1|Y_2}(\cdot|y)$, respectively. Therefore,
\[
E[(X_1 - Y_1)^2 \mid Y_2 = y] \geq \inf_{(X,Y) \in \mathcal{B}} E[(X - Y)^2],
\]
where $\mathcal{B}$ is the set of all 2-dimensional random vectors whose marginal distributions are $F_{X_1|Y_2}(\cdot|y)$ and $F_{Y_1|Y_2}(\cdot|y)$, respectively. Note that the above optimization problem is nothing but a 1-dimensional Monge-Kantorovich problem, so if $(\tilde{X}, \tilde{Y}) \in \mathcal{B}$ and $\tilde{X}$ and $\tilde{Y}$ are comonotonic, then $(\tilde{X}, \tilde{Y})$ is an optimal coupling:
\[
\min_{(X,Y) \in \mathcal{B}} E[(X - Y)^2] = E[(\tilde{X} - \tilde{Y})^2].
\]
It is an easy exercise to show that $(\tilde{X}, \tilde{Y}) \in \mathcal{B}$ and $\tilde{X}$ and $\tilde{Y}$ are comonotonic if and only if $\tilde{Y} = \alpha(\tilde{X})$, where
\[
\alpha(x) := F_{\tilde{Y}}^{(-1)}(F_{\tilde{X}}(x)) = F_{Y_1|Y_2}^{(-1)}(F_{X_1|Y_2}(x|y)|y) = \tilde{G}(F_{X_1|Y_2}(x|y), y) = h(x, y),
\]
and $F_{\tilde{Y}}^{(-1)}(\cdot)$ and $F_{Y_1|Y_2}^{(-1)}(\cdot|y)$ denote the left-continuous inverse functions of $F_{\tilde{Y}}(\cdot)$ and $F_{Y_1|Y_2}(\cdot|y)$, respectively. Therefore,
\[
\begin{aligned}
E[(X_1 - Y_1)^2 \mid Y_2 = y] &\geq \inf_{(X,Y) \in \mathcal{B}} E[(X - Y)^2] = E[(\tilde{X} - \tilde{Y})^2] \\
&= E[(\tilde{X} - h(\tilde{X}, y))^2] = E[(X_1 - h(X_1, y))^2 \mid Y_2 = y] \\
&= E[(X_1 - h(X_1, Y_2))^2 \mid Y_2 = y],
\end{aligned}
\]
where we used the fact that the conditional distribution of $X_1$ given $Y_2 = y$ is the same as the distribution of $\tilde{X}$, that is, $F_{X_1|Y_2}(\cdot|y)$. Now we obtain
\[
E[(X_1 - Y_1)^2] = E\big[E[(X_1 - Y_1)^2 \mid Y_2]\big] \geq E\big[E[(X_1 - h(X_1, Y_2))^2 \mid Y_2]\big] = E[(X_1 - h(X_1, Y_2))^2].
\]
Similarly, one can prove that
\[
E[(X_2 - Y_2)^2] = E\big[E[(X_2 - Y_2)^2 \mid X_1]\big] \geq E\big[E[(g(X_1, Y_2) - Y_2)^2 \mid X_1]\big] = E[(g(X_1, Y_2) - Y_2)^2].
\]
Adding them up, we get
\[
\begin{aligned}
E[\|\mathcal{X} - \mathcal{Y}\|^2] &= E[(X_1 - Y_1)^2] + E[(X_2 - Y_2)^2] \\
&\geq E[(X_1 - h(X_1, Y_2))^2] + E[(g(X_1, Y_2) - Y_2)^2] = E[\|\widehat{\mathcal{X}} - \widehat{\mathcal{Y}}\|^2].
\end{aligned}
\]
Thus, we have proved that if $(\mathcal{X}, \mathcal{Y})$ is an optimal coupling, so is $(\widehat{\mathcal{X}}, \widehat{\mathcal{Y}})$.

Note that
\[
\begin{aligned}
E[\|\widehat{\mathcal{X}} - \widehat{\mathcal{Y}}\|^2] &= E[(X_1 - h(X_1, Y_2))^2] + E[(g(X_1, Y_2) - Y_2)^2] \\
&= E\big[(X_1 - \tilde{G}(F_{X_1|Y_2}(X_1|Y_2), Y_2))^2\big] + E\big[(G(X_1, F_{Y_2|X_1}(Y_2|X_1)) - Y_2)^2\big] \\
&= \int_{\mathbb{R}} \int_{\mathbb{R}} \left(s - \tilde{G}\left(\frac{\int_{-\infty}^{s} p(u, y)\, du}{\int_{\mathbb{R}} \tilde{f}(u, y)\, du}, y\right)\right)^2 p(s, y)\, ds\, dy \\
&\quad + \int_{\mathbb{R}} \int_{\mathbb{R}} \left(t - G\left(x, \frac{\int_{-\infty}^{t} p(x, v)\, dv}{\int_{\mathbb{R}} f(x, v)\, dv}\right)\right)^2 p(x, t)\, dt\, dx. \qquad (1)
\end{aligned}
\]
The optimal coupling problem in the Euclidean plane boils down to minimizing the right hand side of (1) over $\mathcal{H}$, the set of all probability density functions $p(\cdot, \cdot)$ satisfying $\int_{\mathbb{R}} p(\cdot, t)\, dt = \int_{\mathbb{R}} f(\cdot, t)\, dt$ and $\int_{\mathbb{R}} p(u, \cdot)\, du = \int_{\mathbb{R}} \tilde{f}(u, \cdot)\, du$.

In [5], Shen and Zheng propose a calculus of variations method to solve the above optimization problem. However, many steps of their argument are omitted, which makes it difficult to follow. The main objective of this note is to modify their method and give a detailed proof of their main results.
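The one-dimensional fact used in the reduction above (a comonotonic coupling minimizes the quadratic cost) has a simple discrete analogue that can be checked by brute force. The following is a minimal sketch for two empirical samples of equal size, where couplings are restricted to pairings of the sample points; the function names are hypothetical and chosen only for illustration.

```python
import itertools
import random


def quadratic_cost(xs, ys):
    """Average of (x - y)^2 over the pairing (xs[i], ys[i])."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def comonotone_cost(xs, ys):
    """Cost of the comonotone pairing: i-th smallest x with i-th smallest y."""
    return quadratic_cost(sorted(xs), sorted(ys))


def brute_force_min_cost(xs, ys):
    """Minimum cost over all pairings (only feasible for tiny samples)."""
    return min(quadratic_cost(xs, perm) for perm in itertools.permutations(ys))


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [rng.gauss(0.0, 1.0) for _ in range(6)]
    ys = [rng.expovariate(1.0) for _ in range(6)]
    # The two printed values should agree up to rounding.
    print(comonotone_cost(xs, ys), brute_force_min_cost(xs, ys))
```

Pairing the sorted samples corresponds to $\int_0^1 |F^{(-1)}(t) - G^{(-1)}(t)|^2\, dt$ evaluated for the two empirical distribution functions.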


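5. Solving the Problem (1): calculus of variations.

Lemma 5.1. If $\beta$ is continuous in a neighbourhood of $(a, b) \in \mathbb{R}^2$, then
\[
\lim_{\varepsilon \to 0+} \frac{1}{\varepsilon^2} \int_b^{b+\varepsilon} \int_a^{a+\varepsilon} \beta(x, y)\, dx\, dy = \beta(a, b).
\]
Proof. This follows immediately from the mean value theorem.

Lemma 5.2. If $\beta$ is twice continuously differentiable in a neighbourhood of $(a, b) \in \mathbb{R}^2$, then
\[
\lim_{b_1 \to b+} \lim_{a_1 \to a+} \lim_{\varepsilon \to 0+} \frac{1}{\varepsilon^2 (a_1 - a)(b_1 - b)} \int_{\mathbb{R}} \int_{\mathbb{R}} \beta(x, y)\, \eta_\varepsilon(x, y)\, dx\, dy = \beta_{xy}(a, b),
\]
where
\[
\eta_\varepsilon(x, y) := \big(1_{[a_1, a_1+\varepsilon]}(x) - 1_{[a, a+\varepsilon]}(x)\big)\big(1_{[b_1, b_1+\varepsilon]}(y) - 1_{[b, b+\varepsilon]}(y)\big), \qquad (2)
\]
and $1_A$ is the indicator function of the set $A$.

Proof. Note that
\[
\int_{\mathbb{R}} \int_{\mathbb{R}} \beta(x, y)\, \eta_\varepsilon(x, y)\, dx\, dy = \left( \int_b^{b+\varepsilon} \int_a^{a+\varepsilon} + \int_{b_1}^{b_1+\varepsilon} \int_{a_1}^{a_1+\varepsilon} - \int_b^{b+\varepsilon} \int_{a_1}^{a_1+\varepsilon} - \int_{b_1}^{b_1+\varepsilon} \int_a^{a+\varepsilon} \right) \beta(x, y)\, dx\, dy.
\]
Applying Lemma 5.1 to each term above, we obtain
\[
\lim_{\varepsilon \to 0+} \frac{1}{\varepsilon^2} \int_{\mathbb{R}} \int_{\mathbb{R}} \beta(x, y)\, \eta_\varepsilon(x, y)\, dx\, dy = \beta(a, b) + \beta(a_1, b_1) - \beta(a_1, b) - \beta(a, b_1).
\]
Therefore,
\[
\begin{aligned}
\lim_{a_1 \to a+} \lim_{\varepsilon \to 0+} \frac{1}{\varepsilon^2 (a_1 - a)} \int_{\mathbb{R}} \int_{\mathbb{R}} \beta(x, y)\, \eta_\varepsilon(x, y)\, dx\, dy
&= \lim_{a_1 \to a+} \frac{\beta(a, b) + \beta(a_1, b_1) - \beta(a_1, b) - \beta(a, b_1)}{a_1 - a} \\
&= \beta_x(a, b_1) - \beta_x(a, b),
\end{aligned}
\]
and consequently,
\[
\lim_{b_1 \to b+} \lim_{a_1 \to a+} \lim_{\varepsilon \to 0+} \frac{1}{\varepsilon^2 (a_1 - a)(b_1 - b)} \int_{\mathbb{R}} \int_{\mathbb{R}} \beta(x, y)\, \eta_\varepsilon(x, y)\, dx\, dy = \lim_{b_1 \to b+} \frac{\beta_x(a, b_1) - \beta_x(a, b)}{b_1 - b} = \beta_{xy}(a, b).
\]
The proof is complete.

By (1), the Monge-Kantorovich problem in the Euclidean plane boils down to minimizing the functional $L(\cdot)$ over $\mathcal{H}$, where
\[
\begin{aligned}
L(p) := {} & \int_{\mathbb{R}} \int_{\mathbb{R}} \left(t - G\left(x, \int_{-\infty}^{t} \frac{p(x, v)}{f_1(x)}\, dv\right)\right)^2 p(x, t)\, dt\, dx \\
& + \int_{\mathbb{R}} \int_{\mathbb{R}} \left(s - \tilde{G}\left(\int_{-\infty}^{s} \frac{p(u, y)}{f_2(y)}\, du, y\right)\right)^2 p(s, y)\, ds\, dy,
\end{aligned}
\]
and
\[
f_1(x) := \int_{\mathbb{R}} f(x, v)\, dv, \qquad f_2(y) := \int_{\mathbb{R}} \tilde{f}(u, y)\, du.
\]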
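Lemma 5.2, which drives the variational argument below, can be sanity-checked numerically. The following is a rough sketch, not part of the original argument: it approximates the normalized integral of $\beta\, \eta_\varepsilon$ by a midpoint rule, with small but fixed values of $\varepsilon$, $a_1 - a$ and $b_1 - b$ standing in for the iterated limits. The helper names `box_average` and `mixed_quotient` are hypothetical.

```python
import math


def box_average(beta, x0, y0, eps, n=20):
    """Midpoint-rule value of (1/eps^2) times the integral of beta over
    the square [x0, x0 + eps] x [y0, y0 + eps]."""
    h = eps / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += beta(x0 + (i + 0.5) * h, y0 + (j + 0.5) * h)
    return total / (n * n)


def mixed_quotient(beta, a, b, a1, b1, eps):
    """Approximates (1 / (eps^2 (a1 - a)(b1 - b))) times the double
    integral of beta * eta_eps, with eta_eps as in (2)."""
    num = (box_average(beta, a, b, eps) + box_average(beta, a1, b1, eps)
           - box_average(beta, a1, b, eps) - box_average(beta, a, b1, eps))
    return num / ((a1 - a) * (b1 - b))


if __name__ == "__main__":
    # Test function beta(x, y) = sin(x) * exp(y), so beta_xy = cos(x) * exp(y).
    beta = lambda x, y: math.sin(x) * math.exp(y)
    a, b = 0.3, 0.2
    approx = mixed_quotient(beta, a, b, a + 1e-2, b + 1e-2, 1e-3)
    exact = math.cos(a) * math.exp(b)
    print(approx, exact)
```

With these step sizes the two printed values should agree only to a couple of decimal places, since the increments are finite rather than taken to the limit.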


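Suppose $p > 0$ minimizes the functional $L(\cdot)$ over $\mathcal{H}$. Let $\eta$ be any bounded function with compact support on $\mathbb{R}^2$ satisfying
\[
\int_{\mathbb{R}} \eta(x, \cdot)\, dx = \int_{\mathbb{R}} \eta(\cdot, y)\, dy = 0.
\]
Then $p_\varepsilon := p + \varepsilon \eta \in \mathcal{H}$ when $|\varepsilon|$ is small enough. Because $p_0 = p$ minimizes the functional $L(\cdot)$ over $\{p_\varepsilon : -1 < \varepsilon < 1\} \cap \mathcal{H}$, the first order condition reads
\[
\begin{aligned}
\frac{\partial}{\partial \varepsilon} \Bigg[ & \int_{\mathbb{R}} \int_{\mathbb{R}} \left(t - G\left(x, \int_{-\infty}^{t} \frac{p_\varepsilon(x, v)}{f_1(x)}\, dv\right)\right)^2 p_\varepsilon(x, t)\, dt\, dx \\
& + \int_{\mathbb{R}} \int_{\mathbb{R}} \left(s - \tilde{G}\left(\int_{-\infty}^{s} \frac{p_\varepsilon(u, y)}{f_2(y)}\, du, y\right)\right)^2 p_\varepsilon(s, y)\, ds\, dy \Bigg] \Bigg|_{\varepsilon = 0} = 0. \qquad (3)
\end{aligned}
\]
Let us compute the first term in (3). Here $G_y$ denotes the partial derivative of $G$ with respect to its second argument, and in the third equality below we interchange the order of integration in $t$ and $v$:
\[
\begin{aligned}
& \frac{\partial}{\partial \varepsilon} \int_{\mathbb{R}} \int_{\mathbb{R}} \left(t - G\left(x, \int_{-\infty}^{t} \frac{p_\varepsilon(x, v)}{f_1(x)}\, dv\right)\right)^2 p_\varepsilon(x, t)\, dt\, dx \Bigg|_{\varepsilon = 0} \\
&= \int_{\mathbb{R}} \int_{\mathbb{R}} \frac{\partial}{\partial \varepsilon} \left(t - G\left(x, \int_{-\infty}^{t} \frac{p_\varepsilon(x, v)}{f_1(x)}\, dv\right)\right)^2 \Bigg|_{\varepsilon = 0} p(x, t)\, dt\, dx \\
&\quad + \int_{\mathbb{R}} \int_{\mathbb{R}} \left(t - G\left(x, \int_{-\infty}^{t} \frac{p(x, v)}{f_1(x)}\, dv\right)\right)^2 \eta(x, t)\, dt\, dx \\
&= \int_{\mathbb{R}} \int_{\mathbb{R}} 2 \left(t - G\left(x, \int_{-\infty}^{t} \frac{p(x, v)}{f_1(x)}\, dv\right)\right) \left(-G_y\left(x, \int_{-\infty}^{t} \frac{p(x, v)}{f_1(x)}\, dv\right)\right) \left(\int_{-\infty}^{t} \frac{\eta(x, v)}{f_1(x)}\, dv\right) p(x, t)\, dt\, dx \\
&\quad + \int_{\mathbb{R}} \int_{\mathbb{R}} \left(t - G\left(x, \int_{-\infty}^{t} \frac{p(x, v)}{f_1(x)}\, dv\right)\right)^2 \eta(x, t)\, dt\, dx \\
&= \int_{\mathbb{R}} \int_{\mathbb{R}} \left[ \int_{v}^{\infty} 2 \left(t - G\left(x, \int_{-\infty}^{t} \frac{p(x, u)}{f_1(x)}\, du\right)\right) \left(-G_y\left(x, \int_{-\infty}^{t} \frac{p(x, u)}{f_1(x)}\, du\right)\right) \frac{p(x, t)}{f_1(x)}\, dt \right] \eta(x, v)\, dv\, dx \\
&\quad + \int_{\mathbb{R}} \int_{\mathbb{R}} \left(t - G\left(x, \int_{-\infty}^{t} \frac{p(x, v)}{f_1(x)}\, dv\right)\right)^2 \eta(x, t)\, dt\, dx \\
&= \int_{\mathbb{R}} \int_{\mathbb{R}} \varphi(x, y)\, \eta(x, y)\, dy\, dx,
\end{aligned}
\]
where
\[
\begin{aligned}
\varphi(x, y) := {} & \int_{y}^{\infty} 2 \left(t - G\left(x, \int_{-\infty}^{t} \frac{p(x, u)}{f_1(x)}\, du\right)\right) \left(-G_y\left(x, \int_{-\infty}^{t} \frac{p(x, u)}{f_1(x)}\, du\right)\right) \frac{p(x, t)}{f_1(x)}\, dt \\
& + \left(y - G\left(x, \int_{-\infty}^{y} \frac{p(x, u)}{f_1(x)}\, du\right)\right)^2.
\end{aligned}
\]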


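Similarly,
\[
\frac{\partial}{\partial \varepsilon} \int_{\mathbb{R}} \int_{\mathbb{R}} \left(s - \tilde{G}\left(\int_{-\infty}^{s} \frac{p_\varepsilon(u, y)}{f_2(y)}\, du, y\right)\right)^2 p_\varepsilon(s, y)\, ds\, dy \Bigg|_{\varepsilon = 0} = \int_{\mathbb{R}} \int_{\mathbb{R}} \psi(x, y)\, \eta(x, y)\, dy\, dx,
\]
where
\[
\begin{aligned}
\psi(x, y) := {} & \int_{x}^{\infty} 2 \left(s - \tilde{G}\left(\int_{-\infty}^{s} \frac{p(u, y)}{f_2(y)}\, du, y\right)\right) \left(-\tilde{G}_x\left(\int_{-\infty}^{s} \frac{p(u, y)}{f_2(y)}\, du, y\right)\right) \frac{p(s, y)}{f_2(y)}\, ds \\
& + \left(x - \tilde{G}\left(\int_{-\infty}^{x} \frac{p(u, y)}{f_2(y)}\, du, y\right)\right)^2.
\end{aligned}
\]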

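Applying the first order condition (3), we deduce that
\[
\int_{\mathbb{R}} \int_{\mathbb{R}} (\varphi(x, y) + \psi(x, y))\, \eta(x, y)\, dy\, dx = 0.
\]
Let us take $\eta \equiv \eta_\varepsilon$ defined in (2) in the above equation. It then follows from Lemma 5.2 that
\[
0 = \lim_{b_1 \to b+} \lim_{a_1 \to a+} \lim_{\varepsilon \to 0+} \frac{1}{\varepsilon^2 (a_1 - a)(b_1 - b)} \int_{\mathbb{R}} \int_{\mathbb{R}} (\varphi(x, y) + \psi(x, y))\, \eta_\varepsilon(x, y)\, dy\, dx = \varphi_{xy}(a, b) + \psi_{xy}(a, b).
\]
It is not hard to show that
\[
\begin{aligned}
\varphi_y(x, y) &= -2 \left(y - G\left(x, \int_{-\infty}^{y} \frac{p(x, u)}{f_1(x)}\, du\right)\right) \left(-G_y\left(x, \int_{-\infty}^{y} \frac{p(x, u)}{f_1(x)}\, du\right)\right) \frac{p(x, y)}{f_1(x)} \\
&\quad + 2 \left(y - G\left(x, \int_{-\infty}^{y} \frac{p(x, u)}{f_1(x)}\, du\right)\right) \left(1 - G_y\left(x, \int_{-\infty}^{y} \frac{p(x, u)}{f_1(x)}\, du\right) \frac{p(x, y)}{f_1(x)}\right) \\
&= 2 \left(y - G\left(x, \int_{-\infty}^{y} \frac{p(x, u)}{f_1(x)}\, du\right)\right),
\end{aligned}
\]
and
\[
\begin{aligned}
\psi_x(x, y) &= -2 \left(x - \tilde{G}\left(\int_{-\infty}^{x} \frac{p(u, y)}{f_2(y)}\, du, y\right)\right) \left(-\tilde{G}_x\left(\int_{-\infty}^{x} \frac{p(u, y)}{f_2(y)}\, du, y\right)\right) \frac{p(x, y)}{f_2(y)} \\
&\quad + 2 \left(x - \tilde{G}\left(\int_{-\infty}^{x} \frac{p(u, y)}{f_2(y)}\, du, y\right)\right) \left(1 - \tilde{G}_x\left(\int_{-\infty}^{x} \frac{p(u, y)}{f_2(y)}\, du, y\right) \frac{p(x, y)}{f_2(y)}\right) \\
&= 2 \left(x - \tilde{G}\left(\int_{-\infty}^{x} \frac{p(u, y)}{f_2(y)}\, du, y\right)\right).
\end{aligned}
\]
Since $(a, b)$ is arbitrary, differentiating $\varphi_y$ with respect to $x$ and $\psi_x$ with respect to $y$ and substituting into $\varphi_{xy} + \psi_{xy} = 0$, we deduce the main theorem in [5].

Theorem 5.3. Suppose $p > 0$ minimizes $L(\cdot)$ over $\mathcal{H}$. Denote
\[
H(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} p(u, v)\, dv\, du.
\]
Then
\[
\frac{\partial}{\partial x} \left[ G\left(x, \frac{1}{f_1(x)} H_x(x, y)\right) \right] + \frac{\partial}{\partial y} \left[ \tilde{G}\left(\frac{1}{f_2(y)} H_y(x, y), y\right) \right] = 0.
\]
Moreover,
\[
H(x, -\infty) = 0, \qquad H(-\infty, y) = 0, \qquad H(x, +\infty) = \int_{-\infty}^{x} f_1(u)\, du, \qquad H(+\infty, y) = \int_{-\infty}^{y} f_2(v)\, dv.
\]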

Acknowledgments. The first author acknowledges financial support from Hong Kong Early Career Scheme (No. 533112), Hong Kong General Research Fund (No. 529711) and Hong Kong Polytechnic University. The second author acknowledges financial support from National Natural Science Foundation of China (No. 11371350), Key Laboratory of Random Complex Structures and Data Science, CAS (No. 2008DP173182), Hong Kong General Research Fund (No. 529711), and Department of Applied Mathematics, Hong Kong Polytechnic University, during his visit in December 2012. Both authors thank Prof. Weian Zheng for valuable discussions during the preparation of this note.

REFERENCES

[1] G. Monge, Mémoire sur la théorie des déblais et des remblais, Histoire de l'Académie Royale des Sciences de Paris, avec les Mémoires de Mathématique et de Physique pour la même année, (1781), 666–704.
[2] L. Kantorovich, On the translocation of masses, C. R. (Doklady) Acad. Sci. URSS (N. S.), 37 (1942), 199–201.
[3] S. T. Rachev and L. Rüschendorf, Mass Transportation Problems, Volume I: Theory (Probability and its Applications), Springer-Verlag, 1998.
[4] S. T. Rachev and L. Rüschendorf, Mass Transportation Problems, Volume II: Applications (Probability and its Applications), Springer-Verlag, 1998.
[5] Y. F. Shen and W. A. Zheng, On Monge-Kantorovich problem in the plane, C. R. Acad. Sci. Paris, Ser. I, 348 (2010), 267–271.

Received March 2014; revised September 2014. E-mail address: [email protected] E-mail address: [email protected]