Lower Bound Techniques for Multiparty Communication Complexity

Qin Zhang, Indiana University Bloomington
Based on works with Jeff Phillips, Elad Verbin, and David Woodruff
The multiparty number-in-hand (NIH) communication model – a model for proving lower bounds

k players hold inputs x1 = 010011, x2 = 111011, x3 = 111111, ..., xk = 100011.
They want to jointly compute f(x1, x2, ..., xk). Goal: minimize the total bits of communication.

Two variants:
• Blackboard (BB): one player speaks, everyone hears.
• Message-passing (MP): if the player holding x1 talks to the player holding x2, the others cannot hear. Equivalently drawn as a coordinator C connected to sites S1, S2, S3, ..., Sk.
Previously, for the multiparty NIH communication model

Well studied? YES and NO.

Quite a few papers in the BB model: [Alon et al. '96], [Bar-Yossef et al. '04], ...
But almost nothing in the MP model before 2012. Why?
• Too easy?
• Lack of motivation?
Too easy?

It could be more difficult than BB, since in MP each player observes only a piece of the protocol transcript. We do not want to analyze a k-dimensional communication matrix directly.

The communication matrix of the 2-party AND function:

          B = 0   B = 1
  A = 0     0       0
  A = 1     0       1

Even for the 2-party Gap-Hamming problem — Alice and Bob each hold an n-bit vector, and they want to decide whether the Hamming distance of the two vectors is bigger than n/2 + √n or smaller than n/2 − √n — it took researchers in this area about 10 years and dozens of papers ([IW '03], [Woodruff '04], ..., [Sherstov '12]) to fully understand its 2-dimensional communication matrix!
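As a concrete reference point, here is a small sketch of the Gap-Hamming decision itself, evaluated with both inputs in one place (so it is not a communication protocol); the function name is my own, and the promise thresholds n/2 ± √n follow the statement above.

```python
def gap_hamming(x, y):
    """Decide the 2-party Gap-Hamming promise problem given both bit vectors.

    Returns 1 if the Hamming distance is >= n/2 + sqrt(n),
    0 if it is <= n/2 - sqrt(n), and None when the promise fails.
    """
    n = len(x)
    dist = sum(a != b for a, b in zip(x, y))  # Hamming distance
    if dist >= n / 2 + n ** 0.5:
        return 1
    if dist <= n / 2 - n ** 0.5:
        return 0
    return None  # inside the gap: "don't care"

print(gap_hamming([0] * 16, [1] * 16))  # 1 (distance 16 >= 8 + 4)
print(gap_hamming([0] * 16, [0] * 16))  # 0 (distance 0 <= 8 - 4)
```

Note that two uniformly random vectors typically land inside the gap, which is one reason the promise version is the right problem to study.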
Lack of motivation?

(Continuous) distributed monitoring: communication → energy, bandwidth, ...
Examples: network routers, sensor networks, data in the cloud.
Lack of motivation? (cont.)

Data streams: communication → space.
(Figure: a streaming machine with memory and a CPU; splitting the stream into pieces P1, P2, ..., Pk turns a small-space streaming algorithm into a low-communication k-player protocol.)
Lack of motivation? (cont.)

Distributed computation for big data: communication → bandwidth, ...
• The MapReduce model (Input → Map → Shuffle → Reduce → Output). E.g., Hadoop, Google MapReduce.
• The BSP model. E.g., Apache Giraph, Pregel.
Interesting problems

Statistical problems
• Frequency moments Fp (F0: #distinct elements; F2: size of self-join)
• Heavy hitters
• Quantiles
• Entropy
• ...

Graph problems
• Connectivity
• Bipartiteness
• Counting triangles
• Matching
• Minimum spanning tree
• ...

Numerical linear algebra
• Lp regression
• Low-rank approximation
• ...

DB queries
• Conjunctive queries
• ...

Distributed learning
• Classifiers
• ...

Geometry problems
• ...
This talk

We will talk about lower bounds. Most of these lower bounds match the upper bounds.
We introduce two general techniques for proving lower bounds, which work in both the BB model and the MP model, with examples. We conclude with several future directions.
Our new ideas and techniques

Ideas
1. Reduce a k-player problem to a 2-player problem.
2. Decompose a complicated problem into several simpler problems.

In fact, we are exploring the underlying patterns of communication matrices.

Concrete techniques
1. Symmetrization (with Phillips and Verbin, 2012). Example: graph connectivity.
2. Composition (with Woodruff, 2012). Example: distinct elements.
Communication complexity

I choose not to introduce the formal definitions and concepts of communication complexity. I just want to mention Yao's lemma: to prove lower bounds for randomized protocols, we will usually pick some input distribution and then prove lower bounds for any deterministic protocol that succeeds with probability 0.99 under that input distribution.
1st Technique: Symmetrization
Example: graph connectivity (k-CONN)

k sites, each holding a set of edges of a graph.
Goal: compute whether the graph is connected.
A trivial UB: O(nk log n) bits. We prove that this is tight up to a log factor.
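The trivial upper bound can be read off from a sketch like the following: every site forwards its edges to the coordinator (sending only a spanning forest per site yields the O(nk log n) bound), and the coordinator checks connectivity with union-find. This is an illustrative sketch, not a construction from the slides.

```python
class UnionFind:
    """Union-find with path halving; enough for a connectivity check."""
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, a):
        while self.parent[a] != a:
            self.parent[a] = self.parent[self.parent[a]]  # path halving
            a = self.parent[a]
        return a
    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def trivial_conn_protocol(edge_sets, n):
    """Coordinator protocol for k-CONN: each of the k sites sends its edges
    (a spanning forest per site suffices, i.e. <= n-1 edges of O(log n) bits
    each, giving the O(nk log n) total from the slides); the coordinator
    merges them and reports whether the n-vertex graph is connected."""
    uf = UnionFind(n)
    for edges in edge_sets:  # one message per site
        for u, v in edges:
            uf.union(u, v)
    root = uf.find(0)
    return all(uf.find(v) == root for v in range(n))

# Two sites whose edges together form a path on 4 vertices:
print(trivial_conn_protocol([{(0, 1), (1, 2)}, {(2, 3)}], 4))  # True
```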
Set disjointness (2-DISJ)

Alice holds X ⊆ {1, ..., n}, Bob holds Y ⊆ {1, ..., n}. Is X ∩ Y = ∅?

There exists a hard distribution µβ under which |X ∩ Y| = 1 (a YES instance) w.p. β and |X ∩ Y| = 0 (a NO instance) w.p. 1 − β.

Lemma (generalization of [Razborov '90, BJKS '04]): E_{µβ}[CC^{β/100}(2-DISJ)] = Ω(n).
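The slides do not spell out µβ. The following is only an illustrative sampler with the stated intersection property (|X ∩ Y| = 1 with probability β, otherwise 0), not the actual hard distribution of [Razborov '90]; the set sizes are an arbitrary choice.

```python
import random

def sample_mu_beta(n, beta):
    """Illustrative stand-in for the hard 2-DISJ distribution mu_beta:
    Alice and Bob get disjoint random quarters of the universe, and with
    probability beta a single common element is planted."""
    universe = list(range(n))
    random.shuffle(universe)
    X = set(universe[: n // 4])
    Y = set(universe[n // 4 : n // 2])  # disjoint from X by construction
    if random.random() < beta:          # YES instance: |X ∩ Y| = 1
        common = universe[n // 2]
        X.add(common)
        Y.add(common)
    return X, Y

X, Y = sample_mu_beta(100, 0.5)
assert len(X & Y) in (0, 1)
```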
2-DISJ ⇒ k-CONN

Alice and Bob want to solve 2-DISJ on (X, Y) ∼ µβ (set β = 1/k²), by running a protocol for k-CONN:
• Alice plays a random site S_I with input X_I = X.
• Bob plays the other k − 1 sites, and constructs inputs for them using Y: he chooses X_i ∼ µβ | Y for each i ≠ I. Note: X_1, ..., X_k are i.i.d. conditioned on Y.
• Vertices: u_1, ..., u_k and v_1, ..., v_n. An edge (u_i, v_j) exists iff X_{i,j} = 1. Let ν be the distribution of the resulting graph.
2-DISJ ⇒ k-CONN (cont.)

(Figure: a bipartite graph with top vertices u_1, u_2, u_3, ..., u_k and bottom vertices v_{j_1}, v_{j_2}, ..., v_{j_n}, split into the v_j with j ∈ Y and the v_j with j ∈ [n]\Y; an edge (u_i, v_j) exists if and only if X_{i,j} = 1. Deciding connectivity reveals whether X_I ∩ Y = ∅.)
2-DISJ ⇒ k-CONN (cont.)

Recall the setup: Alice and Bob solve 2-DISJ on (X, Y) ∼ µβ (with β = 1/k²) by running a k-CONN protocol. Alice plays a random site S_I with input X_I = X; Bob plays the other k − 1 sites, choosing X_i ∼ µβ | Y for i ≠ I (so X_1, ..., X_k are i.i.d. conditioned on Y), and the graph on u_1, ..., u_k, v_1, ..., v_n with edges (u_i, v_j) iff X_{i,j} = 1 is distributed according to ν.

Correctness: solving k-CONN w.p. 1 − β^{1.5} under ν also solves 2-DISJ w.p. 1 − β/100 under µβ.

Cost: since Alice plays one random site out of k,

    E_{µβ}[CC^{β/100}(2-DISJ)] = O( (1/k) · CC^{β^{1.5}}_ν(k-CONN) ).

The left-hand side is Ω(n), so CC^{β^{1.5}}_ν(k-CONN) = Ω(nk).
A useful message

We can show that trivial UBs are tight for many statistical and graph problems in the MP model (e.g., #distinct elements, bipartiteness, cycle/triangle-freeness, ...) when exact answers are wanted (Woodruff, Z. 2013a). Approximation is often necessary if we want communication-efficient protocols.

Example: computing the diameter of a graph exactly needs Ω(kn²) bits (when m = n²) in the MP model. However, there is a protocol that computes the diameter up to an additive error of 2 using Õ(kn^{3/2}) bits.
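For concreteness, the exact computation in question — graph diameter via all-pairs BFS — looks like this in the centralized setting. This is an illustrative sketch of the target quantity, not the distributed protocol; it assumes a connected graph given as adjacency lists.

```python
from collections import deque

def diameter(adj):
    """Exact diameter of a connected graph (adjacency lists) by running a
    BFS from every vertex; the distributed version of this exact problem
    needs Omega(k n^2) bits in the MP model when m = n^2."""
    n = len(adj)
    best = 0
    for s in range(n):
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        best = max(best, max(dist.values()))
    return best

# Path on 4 vertices (0-1-2-3): diameter 3.
print(diameter([[1], [0, 2], [1, 3], [2]]))  # 3
```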
Other problems using symmetrization

• k-bitwise-MAJ
• k-bitwise-OR/AND/XOR, with applications in:
  – distinct elements reporting
  – ε-kernels (geometry)
  – heavy hitters
• Testing bipartiteness, testing cycle-freeness, testing triangle-freeness, ...

Key: find the right 2-player problem for the reduction.
2nd Technique: Composition
Example – the F0 (distinct elements) problem

k sites, each holding a set X_i (i ∈ {1, 2, ..., k}).
Goal: compute #distinct elements(∪_{i=1}^k X_i) up to a (1 + ε)-approximation.

Current best UB: Õ(k/ε²) [Cormode, Muthukrishnan, Yi '08]; holds in the MP model.
Previous LBs: Ω(k), and Ω(1/ε²) by a direct reduction from Gap-Hamming.
New LB: Ω(k/ε²) [Woodruff, Z. '12, '13b]; holds in the MP model. There is a better UB in the BB model.
The proof framework

Step 1: Find two modular problems, k-GAP-MAJ and 2-DISJ, of simpler structure, such that F0 can be thought of as a composition of them.
Step 2: Analyze the complexities of k-GAP-MAJ and 2-DISJ.
Step 3: Compose the two modular problems so that
    Complexity(F0) = Complexity(k-GAP-MAJ) × Complexity(2-DISJ).

The proof presented here is for the special case k = 2/ε².
k-GAP-MAJ

k sites, each holding a bit Z_i chosen uniformly at random from {0, 1}.
Goal: compute

    k-GAP-MAJ(Z_1, Z_2, ..., Z_k) = 0, if Σ_{i∈[k]} Z_i ≤ k/2 − √k,
                                    1, if Σ_{i∈[k]} Z_i ≥ k/2 + √k,
                                    don't care, otherwise.

Lemma: Any protocol Π that computes k-GAP-MAJ correctly w.p. 0.9999 has to learn Ω(k) of the Z_i's well, that is, H(Z_i | Π) ≤ H_b(1/100) for Ω(k) of the indices i. In other words, I(Π; Z_1, ..., Z_k) = Ω(k).
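The function itself can be transcribed directly from the definition above, with None standing in for the "don't care" outcome:

```python
import math

def k_gap_maj(Z):
    """k-GAP-MAJ on bits Z_1..Z_k: 0 if the sum is at most k/2 - sqrt(k),
    1 if it is at least k/2 + sqrt(k), and None ("don't care") when the
    sum falls inside the gap."""
    k = len(Z)
    s = sum(Z)
    if s <= k / 2 - math.sqrt(k):
        return 0
    if s >= k / 2 + math.sqrt(k):
        return 1
    return None

print(k_gap_maj([1] * 70 + [0] * 30))  # 1  (70 >= 50 + 10)
print(k_gap_maj([1] * 50 + [0] * 50))  # None (inside the gap)
```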
Set disjointness (2-DISJ) – recap

Alice holds X ⊆ {1, ..., n}, Bob holds Y ⊆ {1, ..., n}. Is X ∩ Y = ∅?

There exists a hard distribution µβ under which |X ∩ Y| = 1 (a YES instance) w.p. β and |X ∩ Y| = 0 (a NO instance) w.p. 1 − β.

Lemma (generalization of [Razborov '90, BJKS '04]): E_{µβ}[CC^{β/100}(2-DISJ)] = Ω(n).
The proof sketch

The coordinator C holds Y; the sites S_1, S_2, S_3, ..., S_k hold X_1, X_2, X_3, ..., X_k, where (X_i, Y) ∼ µ_{1/2} (the hard input distribution for 2-DISJ) for each i ∈ [k].

Then Z_i = |X_i ∩ Y| is 1 w.p. 1/2 and 0 w.p. 1/2.

F0(X_1, X_2, ..., X_k) ⇐⇒ k-GAP-MAJ(Z_1, Z_2, ..., Z_k) (with Z_i = |X_i ∩ Y|)
  ⇒ any protocol must learn Ω(k) of the Z_i's well
  ⇒ it needs Ω(nk) bits (using an embedding argument).

For the reduction, set the universe size n = Θ(1/ε²); we get the Ω(k/ε²) LB. Q.E.D.
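A small simulation of the composition step. The sampler below is only an illustrative stand-in for X_i ∼ µ_{1/2} | Y (its set sizes are arbitrary), but it has the property the proof uses: each Z_i = |X_i ∩ Y| is an unbiased coin, and whether ΣZ_i clears k/2 ± √k is exactly the k-GAP-MAJ question an F0 protocol must implicitly answer.

```python
import math
import random

def sample_site(Y, universe, beta=0.5):
    """Illustrative stand-in for X_i ~ mu_{1/2} | Y: X_i is disjoint from Y,
    except that with probability beta it contains exactly one element of Y."""
    outside = [e for e in universe if e not in Y]
    Xi = set(random.sample(outside, len(outside) // 2))
    if random.random() < beta:
        Xi.add(random.choice(sorted(Y)))  # plant one common element
    return Xi

universe = range(200)            # n = Theta(1/eps^2) in the actual reduction
Y = set(range(0, 200, 2))
k = 32
Z = []
for _ in range(k):
    Xi = sample_site(Y, universe)
    Z.append(len(Xi & Y))        # Z_i = |X_i ∩ Y| is 1 w.p. 1/2, else 0
assert all(z in (0, 1) for z in Z)

# Whether sum(Z) clears k/2 ± sqrt(k) is the k-GAP-MAJ instance hiding
# inside the F0 instance (X_1, ..., X_k).
print(sum(Z), k / 2 - math.sqrt(k), k / 2 + math.sqrt(k))
```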
Other problems using composition

• Frequency moments Fp (p > 1)
• Entropy
• ℓp
• Heavy hitters
• Quantile

Key: find the right modular problems.
Future directions
Future directions – Problem space

The current understanding:

                      statistics   graph     linear algebra   DB queries
  Exact computation   well         medium    limited          limited
  Approximation       well         limited   limited          limited

Concretely, e.g.: what is the CC of computing a constant-factor approximation of the size of the matching?
Future directions – Problem space (cont.)

Communication complexity of distributed property testing. This can also be used for proving LBs for traditional property testing, by extending a framework of Blais et al. 2011.

Communication complexity of relations (instead of functions; useful for database queries).

We need to understand more primitive problems (building blocks), e.g., breadth-first search (BFS), pointer chasing, etc. So far the building blocks are quite limited: mainly 2-DISJ, 2-GAP-HAMMING, k-GAP-MAJ, k-OR and their variants.
Future directions – Model/Technique space

Can we relax or generalize the symmetrization technique? E.g., not requiring "totally symmetric distributions".

Consider round complexity (practical needs). Consider players with limited space (practical needs).
The end

THANK YOU

Questions?