On Dominated ℓ₁ Metrics
Jiří Matoušek
Department of Applied Mathematics, Charles University, Malostranské nám. 25, 118 00 Praha 1, Czech Republic
Yuri Rabinovich
CS Department, Haifa University, Haifa 31905, Israel
e-mail: [email protected]
Rev. Oct/11/1999
Abstract. We introduce and study a class ℓ₁^dom(ρ) of ℓ₁-embeddable metrics corresponding to a given metric ρ. This class is defined as the set of all convex combinations of ρ-dominated line metrics. Such metrics were implicitly used before in several constructions of low-distortion embeddings into ℓ_p-spaces, such as Bourgain's embedding of an arbitrary metric ρ on n points with O(log n) distortion. Our main result is that the gap between the distortions of embedding a finite metric ρ of size n into ℓ₂ versus into ℓ₁^dom(ρ) is at most O(√log n), and that this bound is essentially tight. A significant part of the paper is devoted to proving lower bounds on the distortion of such embeddings. We also discuss some general properties and concrete examples.
1  Introduction

The study of approximate embeddings of finite metric spaces into ℓ_p-spaces was at first motivated mainly by problems in Functional Analysis (the Local Theory of Banach spaces; see, for instance, [6], [7], [13], [4]). It has since evolved into a rich research area with intimate links to Functional Analysis, to classical questions of Combinatorial Optimisation, and to the design of approximation algorithms. In the last area it has offered new paradigms that made it possible to solve long-standing open problems. A partial list of results and applications includes [15, 14, 2, 9, 17, 11].

Let (X, ρ) be a metric space. If f: X → Z is an embedding of X into a normed space (Z, ‖·‖), we say that f has distortion at most D if

(1/D)·ρ(x, y) ≤ ‖f(x) − f(y)‖ ≤ ρ(x, y)  for all x, y ∈ X

(that is, f is non-expanding and contracts each distance by a factor of at most D). For (X, ρ) finite, we define c_p(ρ) as the infimum of D ≥ 1 such that there is an embedding of (X, ρ) with distortion at most D into ℓ_p^m (for some natural number m). Here ℓ_p^m denotes the space R^m with the norm ‖x‖_p = (Σ_{i=1}^m |x_i|^p)^{1/p}. For (X, ρ) infinite, we define c_p(ρ) as the supremum of c_p(ρ′), where ρ′ ranges over the restrictions of ρ to finite subspaces of X (in this way, we avoid a discussion of infinite-dimensional ℓ_p-spaces).

Research by J. M. supported by Charles University grants No. 158/99 and 159/99. Part of the work by Y. R. was done during his visit at Charles University in Prague, partially supported by these grants, by the grant GAČR 201/99/0242, and by a Haifa University grant for the Promotion of Research.
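In concrete terms, the distortion of a given non-expanding map of a finite metric space can be computed directly from the definition. The following is a small illustrative sketch of ours (not part of the paper); the toy metric and the map are hypothetical:

```python
import itertools

def distortion(points, rho, f, norm):
    """Distortion of a non-expanding embedding f of a finite metric space (X, rho):
    the largest factor by which f contracts any distance."""
    D = 1.0
    for x, y in itertools.combinations(points, 2):
        image_dist = norm([a - b for a, b in zip(f(x), f(y))])
        D = max(D, rho(x, y) / image_dist)
    return D

# Toy check: halving every coordinate is non-expanding and has distortion 2.
pts = [0.0, 1.0, 3.0]
D = distortion(pts,
               lambda x, y: abs(x - y),          # original metric on the line
               lambda x: (x / 2.0,),             # non-expanding map into R^1
               lambda v: sum(abs(t) for t in v)) # l1 norm in the target
assert abs(D - 2.0) < 1e-12
```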
The most important cases, both theoretically and practically, are p = 1 and p = 2. Much of the existing theory deals with finding good bounds on c₁(ρ) and c₂(ρ) under various conditions on ρ, and with constructing mappings achieving optimal or near-optimal distortions.

In this paper we introduce and study certain subclasses of ℓ₁-metrics. For each finite metric space (X, ρ), we define a class ℓ₁^dom(ρ) of metrics on X. Such metrics have been used implicitly before. They appear, e.g., in Bourgain's proof [4] showing that each n-point metric can be embedded into ℓ₂ with distortion O(log n), and in Rao's [17] improvement of this construction for metrics of a special type (see Section 6 below for an example of such a construction). The metrics in ℓ₁^dom(ρ) have a relatively simple structure, yet many interesting properties, and a better understanding of such metrics may well prove rewarding.

Let (X, ρ) be a finite metric space. A ρ-dominated line metric on X is a metric induced by a nonexpanding (1-Lipschitz) embedding of (X, ρ) into the real line. Explicitly, ν(x, y) = |φ(x) − φ(y)| ≤ ρ(x, y) for some nonexpanding mapping φ: X → R. The class ℓ₁^dom(ρ) consists of all metrics on the ground set X that are convex combinations of ρ-dominated line metrics on X; that is,

ℓ₁^dom(ρ) = { Σ_{i=1}^m λ_i ν_i : λ_i ≥ 0, Σ_i λ_i = 1, ν₁, …, ν_m ρ-dominated line metrics }.
Clearly, each metric ν ∈ ℓ₁^dom(ρ) can also be viewed as induced by a nonexpanding embedding ψ: (X, ρ) → ℓ₁^m, whose ith coordinate (ψ)_i is given by λ_i times the nonexpanding embedding of (X, ρ) into R inducing ν_i. We define

c₁^dom(ρ) = inf{ D : there is a ν ∈ ℓ₁^dom(ρ) such that the identity (X, ρ) → (X, ν) is a D-embedding }.

In order to define c₁^dom(ρ) for an infinite metric space, we again take the supremum over all finite subspaces; see also the remark at the end of this section.

Two important properties of c₁^dom(ρ) were established by Bourgain [4]. First, his proof gives c₁^dom(ρ) ≤ O(log n), where n, as usual, stands for the size of the underlying space. The bound is tight, since by [14], the shortest-path metric ρ of a constant-degree expander graph achieves c₁(ρ) = Ω(log n).¹ The second property of c₁^dom(ρ) is

c₁^dom(ρ) ≥ c₂(ρ).

(Note that we have the reverse inequality for unrestricted embeddings into ℓ₁: for all ρ, c₁(ρ) ≤ c₂(ρ); see, e.g., [14].) For completeness, here is an explanation. Let ν be the metric in ℓ₁^dom(ρ) closest to ρ (i.e., attaining c₁^dom(ρ)), and assume that ν is a convex combination of ρ-dominated line metrics ν_i:

ν = Σ_{i=1}^m λ_i ν_i.

Write x_i = φ_i(x), where φ_i: X → R is the nonexpanding map inducing ν_i. Then the mapping

f: x ↦ (λ₁^{1/2} x₁, …, λ_m^{1/2} x_m) ∈ R^m,  x ∈ X,

from (X, ρ) to ℓ₂^m has distortion at most c₁^dom(ρ). Indeed, on the one hand,

‖f(x) − f(y)‖₂² = Σ_{i=1}^m λ_i |x_i − y_i|² ≤ ρ(x, y)²,

since each |x_i − y_i| ≤ ρ(x, y) and Σ_i λ_i = 1, while, on the other hand, by the Cauchy–Schwarz inequality,

‖f(x) − f(y)‖₂ = (Σ_{i=1}^m λ_i |x_i − y_i|²)^{1/2} ≥ Σ_{i=1}^m λ_i |x_i − y_i| = ν(x, y) ≥ (1/c₁^dom(ρ))·ρ(x, y).

¹ The notation f = Ω(g) is equivalent to g = O(f), and f = Θ(g) means that both f = O(g) and f = Ω(g).
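As a small illustration of the definition (our own sketch, not from the paper): for any fixed a ∈ X, the map φ_a(x) = ρ(x, a) is 1-Lipschitz by the triangle inequality, so each φ_a induces a ρ-dominated line metric, and any convex combination of these lies in ℓ₁^dom(ρ). The sketch below builds such a metric for the shortest-path metric of the 5-cycle (a hypothetical running example) and checks the domination ν ≤ ρ:

```python
import itertools

def line_metric(points, phi):
    """Metric induced by a map phi: X -> R (rho-dominated when phi is 1-Lipschitz)."""
    return {(x, y): abs(phi(x) - phi(y)) for x, y in itertools.product(points, repeat=2)}

# Shortest-path metric of the 5-cycle.
points = list(range(5))
rho = {(x, y): min((x - y) % 5, (y - x) % 5) for x, y in itertools.product(points, repeat=2)}

# Frechet-style maps phi_a = rho(., a) are 1-Lipschitz, hence rho-dominated.
nus = [line_metric(points, lambda x, a=a: rho[(x, a)]) for a in points]

# Uniform convex combination: an element of l1dom(rho).
nu = {p: sum(n[p] for n in nus) / len(nus) for p in rho}

# Domination check: nu <= rho pointwise.
assert all(nu[p] <= rho[p] + 1e-12 for p in rho)
```

By symmetry, ν depends only on the cycle distance: here ν = 0.8 on edges and 1.2 on the longer pairs, so this particular convex combination distorts ρ by a factor 2/1.2 ≈ 1.67.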
We remark that a similar argument gives c₁^dom(ρ) ≥ c_p(ρ) for all p, 1 ≤ p ≤ ∞.

The central issue of this paper is the estimation of the gap between c₂(ρ) and c₁^dom(ρ). As was mentioned above, c₂(ρ) ≤ c₁^dom(ρ). We show that for a finite metric space (X, ρ) with |X| = n, we always have c₁^dom(ρ) = O(√log n)·c₂(ρ), and that this bound is essentially tight. Using a probabilistic construction, we exhibit a family of metric spaces ρ_n for which c₁^dom(ρ_n) = Ω(√(log n / log log n))·c₂(ρ_n).

In order to obtain the lower bound, it is first shown that c₁^dom(S^n) = Ω(√n), where S^n is the Euclidean n-sphere equipped with its geodesic metric. Then a discrete analogue of this result is obtained for a suitable finite random subset of S^n. Finally, we also discuss two examples of non-obvious embeddings of special metric spaces into ℓ₁^dom.
Remarks on the definition of c₁^dom for infinite metric spaces. For an infinite metric space (X, ρ), we have defined c₁^dom(ρ) as the supremum of c₁^dom(ρ′) over finite submetrics ρ′ of ρ; thus, we have postulated a "compactness" property. A natural way of introducing the corresponding set ℓ₁^dom(ρ) of metrics seems to be the following. Let Δ(ρ) denote the set of all ρ-dominated line metrics. Consider Δ(ρ) as a subspace of the space R^{X×X} with the product topology; in other words, with the topology of pointwise convergence of functions on X×X. We define ℓ₁^dom(ρ) as the closure of the convex hull of Δ(ρ) in this topology (for X finite, conv(Δ(ρ)) is closed, and so the new definition agrees with the one given earlier for finite spaces). Having defined ℓ₁^dom(ρ) for an infinite ρ, we can now define c₁^dom(ρ) as the infimum of the distortions of the identity mapping (X, ρ) → (X, ν̃), where ν̃ ∈ ℓ₁^dom(ρ).

Let us check that this agrees with the previous definition of c₁^dom(ρ) via finite subspaces. Since the new definition gives no new metrics on finite subspaces, the distortion according to the new definition cannot be smaller than the one from the old definition. To see the opposite inequality, we first note that ℓ₁^dom(ρ) is compact, being a closed subspace of the product of the closed intervals [0, ρ(x, y)], x, y ∈ X. Suppose that c₁^dom(ρ′) < D for all finite subspaces (Y, ρ′) of (X, ρ). For each such Y, define F_Y as the set of all ν ∈ ℓ₁^dom(ρ) that, when restricted to Y, distort ρ by at most D. We observe that any ρ-dominated line metric on Y can be extended to a ρ-dominated line metric on X; this follows from McShane's theorem, stating that any 1-Lipschitz real function defined on a subspace of a metric space can be extended to a 1-Lipschitz function on the whole space (see, e.g., [19]). It follows that F_Y ≠ ∅. Each F_Y is closed, and for finite Y₁, Y₂, …, Y_k, we have F_{Y₁} ∩ F_{Y₂} ∩ ⋯ ∩ F_{Y_k} ⊇ F_{Y₁ ∪ ⋯ ∪ Y_k} ≠ ∅. Hence, by compactness, the family {F_Y} has a nonempty intersection, and any metric in the intersection distorts ρ by at most D. □
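The McShane extension invoked above is completely explicit: given a 1-Lipschitz f on a subspace Y, the formula F(x) = inf_{y∈Y} (f(y) + d(x, y)) defines a 1-Lipschitz extension to all of X. A quick sketch of ours, with a toy path metric (a hypothetical instance), checking both the extension and the Lipschitz property:

```python
import itertools

def mcshane_extend(Y, f, d, X):
    """McShane extension F(x) = min over y in Y of f(y) + d(x, y).
    If f is 1-Lipschitz on Y w.r.t. d, then F is 1-Lipschitz on X and F = f on Y."""
    return {x: min(f[y] + d[(x, y)] for y in Y) for x in X}

# Toy example: the path metric on five points 0-1-2-3-4.
X = list(range(5))
d = {(x, y): abs(x - y) for x, y in itertools.product(X, repeat=2)}
Y = [0, 3]
f = {0: 0.0, 3: 2.0}  # 1-Lipschitz on Y: |f(0) - f(3)| = 2 <= d(0,3) = 3

F = mcshane_extend(Y, f, d, X)
assert all(abs(F[y] - f[y]) < 1e-12 for y in Y)                          # extends f
assert all(abs(F[x] - F[z]) <= d[(x, z)] + 1e-12 for x in X for z in X)  # 1-Lipschitz
```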
For some of the subsequent proofs, it is convenient to note that ℓ₁^dom(ρ) as defined above contains "infinite convex combinations" of the ρ-dominated line metrics. Namely, if μ is a Borel probability measure supported on Δ(ρ), then the metric ν̃ given by ν̃(x, y) = ∫_{Δ(ρ)} ν(x, y) dμ(ν) lies in ℓ₁^dom(ρ).
Finally, we note that if (X, ρ) is compact, then ℓ₁^dom(ρ) coincides with the closure of conv(Δ(ρ)) in the supremum norm on C(X×X) (uniform convergence). Indeed, the elements of conv(Δ(ρ)), regarded as functions X×X → R, are 1-Lipschitz with respect to the product metric on X×X derived from ρ (the distance of the pairs (x, y) and (z, t) is ρ(x, z) + ρ(y, t)). Thus, if (X, ρ) is compact, pointwise convergence of these functions is the same as uniform convergence. We omit a more detailed discussion of these "infinite-dimensional" aspects, since we are mainly interested in finite metrics.
2  The geodesic metric of S^n

We start with a simple observation about the n-dimensional Euclidean space. Let S^n denote the n-dimensional unit sphere in R^{n+1}, and let μ_n be the uniform probability measure on S^n.

Claim 2.1: c₁^dom(ℓ₂^n) = O(√n).
Proof. For a point θ ∈ S^n, let ν_θ be the ρ-dominated line metric on ℓ₂^n obtained by the orthogonal projection of R^n onto the line determined by the center of S^n and θ. Consider the metric ν̃ defined by

ν̃ = ∫_{S^n} ν_θ dμ_n(θ).

This is an element of ℓ₁^dom(ρ), where ρ is the Euclidean metric on ℓ₂^n (see the remarks at the end of Section 1). By rotational symmetry, the identity mapping contracts each distance, with respect to ρ and ν̃, by the same factor, namely the expected length of the projection of a random unit vector onto a fixed line. It is well known (see, e.g., [16], Section 2) that this expectation is Θ(1/√n), so the distortion is O(√n), implying the claim. □

Combining Claim 2.1 with the simple observation that the Euclidean metric of S^n shrinks its geodesic metric by a factor of at most π/2, one gets
Corollary 2.2: c₁^dom(S^n) = O(√n).
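The Θ(1/√n) expected projection length is easy to check numerically. The following is our own Monte Carlo sketch (the dimension and sample count are arbitrary choices); it uses the standard Gaussian-coordinate approximation E|x₁| ≈ √(2/(π·dim)):

```python
import math
import random

random.seed(0)

def random_unit_vector(dim):
    """Uniform point on the unit sphere in R^dim, via a normalized Gaussian vector."""
    v = [random.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(t * t for t in v))
    return [t / norm for t in v]

dim, samples = 100, 20000
# Expected length of the projection of a random unit vector onto a fixed line:
# by symmetry this is E|x_1|, which is ~ sqrt(2 / (pi * dim)) for large dim.
mean_proj = sum(abs(random_unit_vector(dim)[0]) for _ in range(samples)) / samples
predicted = math.sqrt(2 / (math.pi * dim))
assert abs(mean_proj - predicted) < 0.2 * predicted  # Theta(1/sqrt(n)) behaviour
```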
Proving lower bounds is harder. Using the standard approach in such situations, let us first write c₁^dom(ρ) in a dual form, more suited for the task. For simplicity, we use the discrete notation. By the duality of linear programming (Farkas' Lemma, or separation of disjoint convex sets by a hyperplane), we have

Claim 2.3: For any finite metric space (X, ρ),

c₁^dom(ρ) = max_L min_ν L(ρ)/L(ν),

where the maximum is taken over all nonnegative linear forms L(σ) = Σ_{i,j∈X} a_{ij} σ_{ij}, a_{ij} ≥ 0, and the minimum is taken over all ρ-dominated line metrics ν.

Thus, in order to obtain a lower bound on c₁^dom(ρ), it suffices to exhibit a single linear form L such that L(ρ) is sufficiently bigger than max_ν L(ν). The main difficulty of this approach lies in bounding the latter expression.
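To make the dual approach concrete, here is a small brute-force illustration of ours (this example is not from the paper). Take the shortest-path metric ρ of the 4-cycle C₄ and let L be the sum of all pairwise distances, so L(ρ) = 8. A search over 1-Lipschitz maps φ on a half-integer grid (which, for this tiny example, can be checked to contain an optimal map, e.g. φ = (0, 1, 2, 1)) gives max_ν L(ν) = 6, certifying c₁^dom(C₄) ≥ 8/6 = 4/3 by the easy direction of the duality:

```python
import itertools

# Shortest-path metric rho of the 4-cycle C4 on vertices 0, 1, 2, 3.
V = list(range(4))
rho = {(x, y): min((x - y) % 4, (y - x) % 4) for x in V for y in V}
pairs = list(itertools.combinations(V, 2))

L_rho = sum(rho[p] for p in pairs)  # 4 edges of length 1 + 2 diagonals of length 2 = 8

# Maximize L(nu) over rho-dominated line metrics, with phi on a half-integer grid.
grid = [k / 2.0 for k in range(5)]  # 0.0, 0.5, ..., 2.0
best = 0.0
for phi in itertools.product(grid, repeat=4):
    if all(abs(phi[x] - phi[y]) <= rho[(x, y)] for x, y in pairs):
        best = max(best, sum(abs(phi[x] - phi[y]) for x, y in pairs))

# Easy direction of the duality: L(rho)/max L(nu) lower-bounds c1dom(rho).
lower_bound = L_rho / best
assert best == 6.0 and abs(lower_bound - 4.0 / 3.0) < 1e-12
```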
Our next goal is to obtain a lower bound on the distortion for the geodesic metric of S^n. We shall use the approach outlined above (in a version for an infinite metric space). In order to make it work, we use the classical Lévy's Lemma about the metric structure of S^n. We follow [16], Chapter 2. Given a set K ⊆ S^n and ε ≥ 0, define K^ε as the set of all points x ∈ S^n that are at (geodesic) distance at most ε from K.
Lemma 2.4 (Lévy's Lemma): Let f: S^n → R be a continuous function, and denote by M_f its median, i.e., a number such that both the sets {x : f(x) ≤ M_f} and {x : f(x) ≥ M_f} have measure at least 1/2. Let A = {x : f(x) = M_f}. Then

μ_n(A^ε) ≥ 1 − √(π/2)·e^{−ε²(n−1)/2}.
We recall the key ideas of the proof, since they might be useful in the derivation of similar results for other metric spaces. All the details can be completed from [16]. The proof is based on the following two results (which we will also need later):
Theorem 2.5 (Isoperimetric inequality for the sphere): For any ε ≥ 0 and any K ⊆ S^n with μ_n(K) = a, 0 ≤ a ≤ 1, we have μ_n(K^ε) ≥ μ_n(C_a^ε), where C_a denotes a spherical cap of measure a.
Lemma 2.6: The measure of a spherical cap of geodesic radius π/2 + ε satisfies

μ_n(C_{1/2}^ε) ≥ 1 − √(π/8)·e^{−ε²(n−1)/2}.
This is a "spherical analogue" of the well-known Chernoff inequality for the binomial distribution, and it can be proved by more or less straightforward calculations.
Proof of Lévy's Lemma (Lemma 2.4): Define A₋ = {x : f(x) ≤ M_f} and A₊ = {x : f(x) ≥ M_f}. Since μ_n(A₊) ≥ 1/2, we have μ_n(A₊^ε) ≥ μ_n(C_{1/2}^ε) ≥ 1 − √(π/8)·e^{−ε²(n−1)/2}, and similarly for A₋. It remains to observe that A^ε = A₋^ε ∩ A₊^ε. □

Corollary 2.7: Let f: S^n → R be a nonexpanding map. Then, for every t ≥ 0, there exists an interval I_t ⊆ R of length t such that

μ_n(f^{−1}(I_t)) ≥ 1 − √(π/2)·e^{−t²(n−1)/8}.

The corollary follows from Lemma 2.4: let I_t be the (t/2)-neighbourhood of f(A) = {M_f} ⊆ R, and note that f(A^{t/2}) ⊆ I_t since f is nonexpanding.
Corollary 2.8: Let f: S^n → R be a nonexpanding map. Then, for every t ≥ 0,

Prob_{μ_n×μ_n}[ |f(x) − f(y)| ≥ t ] ≤ √(2π)·e^{−t²(n−1)/8}.

The corollary follows from Corollary 2.7 by bounding the probability that two randomly and independently chosen points of S^n are not both mapped into I_t by f.

We can now prove a lower bound on the distortion of embedding S^n into ℓ₁^dom, and show that Corollary 2.2 is tight:
Theorem 2.9: c₁^dom(S^n) ≥ √(n−1)/4.
Proof. Following the general approach for establishing lower bounds via Farkas' Lemma, consider the following linear functional L(σ) over metrics σ on S^n:

L(σ) = ∫_{S^n} ∫_{S^n} σ(x, y) dμ_n(x) dμ_n(y) = E_{μ_n×μ_n}[ σ(x, y) ].

The value of L on the geodesic metric ρ of S^n is, clearly, π/2. Consider now an arbitrary nonexpanding map f: S^n → R and the dominated line metric ν induced by f on S^n. By a standard argument,

L(ν) = E[ ν(x, y) ] = ∫₀^∞ Prob[ ν(x, y) ≥ t ] dt.

Using Corollary 2.8 we get

L(ν) ≤ ∫₀^∞ √(2π)·e^{−t²(n−1)/8} dt = √(8π/(n−1)) ∫₀^∞ e^{−y²/2} dy = 2π/√(n−1).

If ν̃ is a convex combination of ρ-dominated line metrics, the just derived inequality implies that ν̃ distorts ρ by at least (π/2)/(2π/√(n−1)) = √(n−1)/4 (we use the "easy direction" of Farkas' lemma; it is obvious that if ν̃ distorts ρ by at most D then L(ν̃) ≥ L(ρ)/D). Finally, any metric in ℓ₁^dom(ρ) is a pointwise limit of such convex combinations ν̃ (see the end of Section 1), and so its distortion is at least √(n−1)/4 as well. □
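As a numerical sanity check (our own sketch; the dimension n = 80 is an arbitrary choice), one can estimate E[ν(x, y)] by Monte Carlo for the concrete nonexpanding map f(x) = x₁ (a coordinate is 1-Lipschitz even for the geodesic metric, since Euclidean distance is dominated by geodesic distance) and compare it with the 2π/√(n−1) bound; the value of the integral above can also be verified by simple quadrature:

```python
import math
import random

random.seed(1)

n = 80          # the sphere S^n sits in R^(n+1)
dim = n + 1

def random_sphere_point(d):
    v = [random.gauss(0.0, 1.0) for _ in range(d)]
    s = math.sqrt(sum(t * t for t in v))
    return [t / s for t in v]

# Monte Carlo estimate of E|f(x) - f(y)| for the line metric given by f(x) = x_1.
samples = 20000
est = sum(abs(random_sphere_point(dim)[0] - random_sphere_point(dim)[0])
          for _ in range(samples)) / samples

bound = 2 * math.pi / math.sqrt(n - 1)
assert est <= bound  # the proof's bound on L(nu) holds for this particular f

# Quadrature check: integral of sqrt(2*pi)*exp(-t^2 (n-1)/8) over [0, inf) = bound.
step, total, t = 1e-4, 0.0, 0.0
while t < 10.0:
    total += math.sqrt(2 * math.pi) * math.exp(-t * t * (n - 1) / 8) * step
    t += step
assert abs(total - bound) < 1e-3
```

For a single coordinate map the estimate is in fact far below the bound (about Θ(1/√n) rather than the worst-case Θ(1)·2π/√(n−1) constant), which is consistent with Corollary 2.8 being a worst-case tail bound.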
3  Finite metrics

This section is the heart of the paper; it deals with establishing sharp bounds on the gap between c₂(ρ) and c₁^dom(ρ) as a function of the size of the underlying metric space. We start with the simpler upper bound.

Theorem 3.1: Let (X, ρ) be a finite metric space with |X| = n. Then c₁^dom(ρ) = O(√log n)·c₂(ρ).

Proof. Clearly, it suffices to show that if the metric space (X, ρ) is ℓ₂-embeddable, then c₁^dom(ρ) = O(√log n). Assume that this is the case. By the fundamental Johnson–Lindenstrauss Lemma [13], (X, ρ) is also embeddable in ℓ₂^{O(log n)} with a constant distortion. Applying Claim 2.1 to ℓ₂^{O(log n)}, we deduce the theorem. □
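A minimal sketch of the Johnson–Lindenstrauss step (our illustration, not the paper's construction): a random Gaussian projection to k dimensions preserves all pairwise distances of an n-point set up to a constant factor with high probability once k = O(log n); the concrete sizes below are arbitrary, and we only check a constant distortion bound empirically:

```python
import math
import random

random.seed(2)

def gaussian_projection(points, k):
    """Random Gaussian projection R^d -> R^k, scaled by 1/sqrt(k) so that
    squared distances are preserved in expectation."""
    d = len(points[0])
    G = [[random.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(d)] for _ in range(k)]
    return [[sum(row[i] * p[i] for i in range(d)) for row in G] for p in points]

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

n, d, k = 50, 500, 120
points = [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
proj = gaussian_projection(points, k)

# Distortion of the projection: max over min of (projected / original) distance ratios.
ratios = [dist(proj[i], proj[j]) / dist(points[i], points[j])
          for i in range(n) for j in range(i + 1, n)]
distortion = max(ratios) / min(ratios)
assert distortion < 2.5  # constant distortion, independent of the original dimension d
```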
The lower bound is obtained via a discretization of Theorem 2.9. The line of reasoning will mimic that of the previous section. Denote by Cap(a, r) ⊆ S^n the spherical cap of geodesic radius r centered at a. The following lemma allows us to use the ideas from the continuous case in the discrete setting:
Lemma 3.2: Let r be a positive number with r < r₀ for a suitable universal constant r₀ > 0. For any natural n, there exist N = Θ(1/r)^{n+1} and an N-point set X ⊆ S^n with the following properties:

a. The family of caps { Cap(a, r) : a ∈ X } covers S^n.
b. No point of S^n is covered by more than Δ = O(n(log n + log(1/r))) caps in this family.
c. The average geodesic distance between the points of X is π/2 + o(1).

We shall delay the proof of the lemma until later, and show how it is used. Let (X, ρ) be a subset of S^n with the properties ensured by Lemma 3.2 for r = 1/√n; then N = Θ(n)^{0.5(n+1)} and Δ = O(n log n). The size of X is N, and ρ is induced by the geodesic metric of S^n. Consider the linear form

L(σ) = (1/N²) Σ_{i,j∈X} σ_{ij} = E[ σ(x, y) ],

expressing the expected distance between two random points of X. By property c., L(ρ) = π/2 + o(1). Our goal is to show that the value of L(ν) on any ρ-dominated line metric ν is at most O(√((log n)/n)) = O(√(log log N / log N)). This, of course, would imply the theorem.
Claim 3.3: Let K be a subset of X. Then

(1/Δ)·(|K|/N) ≤ μ_n(K^r) ≤ Δ·(|K|/N),

where, as before, K^r stands for the r-neighbourhood of K.

Proof. Let v_r denote the μ_n-measure of a spherical cap of geodesic radius r. By the properties a. and b., we have 1 ≤ N·v_r ≤ Δ. Since K^r is the union of the caps Cap(a, r), a ∈ K, and no point of S^n lies in more than Δ of them,

μ_n(K^r) ≥ (|K|/Δ)·v_r ≥ (1/Δ)·(|K|/N).

On the other hand,

μ_n(K^r) ≤ |K|·v_r ≤ Δ·(|K|/N). □
Claim 3.4: Let Q ⊆ X be a subset of X of size at least N/2, and let P ⊆ X be a set of points such that ρ(Q, P) ≥ ε + 2r + δ, where δ = √(2 ln(2Δ)/(n−1)). Then

|P|/N ≤ Δ·e^{−ε²(n−1)/2}.

Proof. By Claim 3.3, we have

μ_n(Q^r) ≥ (1/Δ)·(|Q|/N) ≥ 1/(2Δ).

Using Lemma 2.6 on the complement of the cap C_{1/(2Δ)}, we calculate that the radius of C_{1/(2Δ)} is at least π/2 − δ, and by the Isoperimetric Inequality 2.5, we obtain

μ_n(Q^{r+ε+δ}) ≥ μ_n(C_{1/(2Δ)}^{ε+δ}) ≥ v_{π/2+ε} ≥ 1 − √(π/8)·e^{−ε²(n−1)/2}.

Since P^r and Q^{r+ε+δ} have disjoint interiors, we have

μ_n(P^r) ≤ 1 − μ_n(Q^{r+ε+δ}) ≤ √(π/8)·e^{−ε²(n−1)/2}.

The claim now follows from Claim 3.3. □

Finally, we can show a lower bound on the gap between c₁^dom(ρ) and c₂(ρ) for our (X, ρ).
Theorem 3.5: c₁^dom(ρ) = Ω(√(log N / log log N))·c₂(ρ).

Proof. As we have seen before, it suffices to show that the expected distance between two random points of X in any ρ-dominated line metric ν is O(√((log n)/n)). Recall that N = Θ(n)^{0.5(n+1)}. Arguing as in the derivation of Corollary 2.7 from Lévy's Lemma (our Lemma 2.4), and keeping in mind that r = 1/√n, we conclude from Claim 3.4 that all but, say, an n^{−2} fraction of the points of f(X) can be covered by an interval of length B√((log n)/n) for a suitable constant B. Consequently, the probability that the ν-distance between two randomly chosen points of X exceeds B√((log n)/n) is O(n^{−2}). Keeping in mind that the maximal ν-distance between any two points of X is at most π, we conclude that E_{X×X}[ ν(x, y) ] is at most

B√((log n)/n) + π·Prob[ ν(x, y) ≥ B√((log n)/n) ] = B√((log n)/n) + O(n^{−2}) = O(√((log n)/n)).

This concludes the proof of the theorem. □
4  Proof of Lemma 3.2

Proof. The proof is essentially the same as in Rogers [18] and Erdős and Rogers [8]. Since we have no explicit reference for the exact result we need here, we include a proof. The calculations are somewhat simpler than those in [18, 8], since a less precise bound is sufficient for our application. In the sequel, we may suppose that n is sufficiently large.

Let r be given, and set γ = r/n². Let v_x denote, as before, the μ_n-measure of a spherical cap of geodesic radius x. Define N as the smallest integer such that N·v_{r−γ} ≥ 14 ln(n/v_γ). Let X = {a₁, a₂, …, a_N} be a subset of S^n obtained by choosing N points independently and uniformly at random. We show that X satisfies the requirements of the lemma with probability close to 1.

We start with a. Let B ⊆ S^n be the set of points not covered by any of the caps Cap(a_i, r−γ), i = 1, 2, …, N. The expected μ_n-measure of B equals the probability of a fixed point x ∈ S^n being missed by all Cap(a_i, r−γ). By the definition of N, the latter probability is

(1 − v_{r−γ})^N < e^{−N·v_{r−γ}} ≤ n^{−14}·v_γ.
Thus, by Markov's Inequality, the measure of B is smaller than v_γ with probability at least 1 − n^{−14}. In that case, the caps Cap(a_i, r) cover the entire S^n: if some x were not covered by any Cap(a_i, r), we would have Cap(x, γ) ⊆ B, implying μ_n(B) ≥ v_γ.

We continue with b. Set Δ = 35·N·v_{r+γ}. As we shall see later, this is indeed of the correct order of magnitude. The expected measure of the set of points of S^n covered more than Δ times by the caps Cap(a_i, r+γ) equals the probability of a fixed x ∈ S^n being covered more than Δ times. The number of (r+γ)-caps covering x behaves as a sum of N independent 0/1 random variables, each of them attaining the value 1 with probability v_{r+γ}. By Chernoff's Bound (as in Alon and Spencer [1]; see the remark above Theorem A.12), the probability of more than Δ caps covering x is bounded above by

e^{−2N·v_{r+γ}/27} ≤ e^{−2N·v_{r−γ}/27} ≤ e^{−ln(n/v_γ)} = v_γ/n.
Thus, by Markov's Inequality, with probability at least 1 − 1/n, the fraction of the points of S^n that are covered more than Δ times by the caps Cap(a_i, r+γ) is no more than v_γ. Consequently, no point x is covered more than Δ times by the r-caps Cap(a_i, r), since otherwise the entire cap Cap(x, γ) would be covered more than Δ times by the caps Cap(a_i, r+γ).

Finally, we show c. Define σ_ij as the distance between a_i and a_j, i < j. Then each σ_ij is a random variable with values in the interval [0, π] and with distribution symmetric about π/2. Due to the uniformity of S^n, the σ_ij are pairwise independent. Therefore, E[(Σ_{i<j}(σ_ij − π/2))²] = Σ_{i<j} E[(σ_ij − π/2)²]