probability that a δ-dense tile of size |x| ⥠a and |y| ⥠b exists is no larger than. ( n a. )( m b. ) exp(â2ab(δ â p)2) . (1). Proof (sketch): Hoeffding's inequality yields.
The Trustworthy Pal: Controlling the False Discovery Rate in Boolean Matrix Factorization Sibylle Hess, Katharina Morik and Nico Piatkowski
Given a Binary Data Matrix..
Depending on the data domain, one could ask for ▶
groups of users which like the same set of movies,
▶
groups of patients having a similar set of gene mutations,
▶
groups of costumers buying a similar set of items.
Finding a Factorization Solve min |D − Y ⊙ X ⊤ | for binary matrices X and Y X,Y
=
≈
⊙
⊗
⊕
⊕
The False Discovery Rate (FDR)
“
Boy called wolf once – Type 1 error Boy called wolf twice – Type 2 error
The FDR is controlled at level q if (v ) E ≤q r
v : false alarms r : alarms in total
”
False Discoveries and BMF Given a factorization of rank r, define { 1 if Y·s X·s⊤ covers (mostly) noise Zs = 0 if Y·s X·s⊤ is a part of the model The FDR is controlled at level q if P(Zs = 1) < q: E
(v ) r
1∑ = P(Zs = 1) r r
s=1
Properties of Outer Products If the binary matrices solve (X, Y ) ∈ arg min |D − Y ⊙ X ⊤ | then any outer product A = D ◦ Y·s X·s⊤ has a high density δ=
Y·s⊤ DX·s 1 ≥ |X·s ||Y·s | 2
and a high coherence η=
max ⟨A·i , A·k ⟩ > δ|Y·s |
1≤i̸=k≤n
δ|X·s | − 1 |X·s | − 1
Theorem (Density Bound) Suppose N is an m × n Bernoulli matrix with parameter p. The probability that a δ-dense tile of size |x| ≥ a and |y| ≥ b exists is no larger than ( )( ) n m exp(−2ab(δ − p)2 ) . (1) a b Proof (sketch): Hoeffding’s inequality yields ) ( ⊤ ∑ y Nx 1 xi yj Nji − p ≥ δ − p ≥ δ = P P |x||y| ab i,j
≤ exp(−2ab(δ − p)2 ), The union bound yields the final result.
Theorem (Coherence Bound) Let N be an m × n Bernoulli matrix with parameter p and let √ µ > p2 . The function value of η satisfies η ((1/ m)N ) ≥ µ with probability no larger than ( ) 3 (µ − p2 )2 n(n − 1) exp − m 2 . (2) 2 2 2p + µ Proof (sketch): The Bernstein inequality yields ( ) 3 (µ − p2 )2 P(⟨N·i , N·k ⟩ ≥ mµ) ≤ exp − m 2 , 2 2p + µ The union bound yields the final result.
Applying the Bounds Given Y ⊙ X ⊤ ≈ D, calculate for 1 ≤ s ≤ r the density δs and coherence ηs . Toss (X·s , Y·s ) if all of the following bounds are larger than q.
Corollary )( ) n m exp(−2|X·s ||Y·s |(δs − p)2 ) P(Zs = 1) ≤ |X·s | |Y·s | ( ) 3 (ηs/m − p2 )2 P(Zs = 1) ≤ exp − m 2 2 2p + ηs/m (
How Good are These Bounds?
|Y·s | m
(n, m) = (1000, 800)
(n, m) = (1600, 500)
0.2
0.2
0.1
0.1
0
0
0.02
0.04
|X·s | n
density bound
0
0
0.02 |X·s | n
coherence bound
0.04
Synthetic Experiments in PalTiling (n, m) = (1600, 500)
F
1 0.8 0.6 0.4
r
(n, m) = (1000, 800)
30 20 10 0
0
0
10
10
20
20
1 0.8 0.6 0.4 30 20 10 0
0
10
20
0
10
20
p± [%] TrustPal dens
p± [%] TrustPal coh
Primp
Take Home ▶
Establishing quality guarantees for unsupervised tasks is pretty interesting
▶
Concentration inequalities (e.g., Hoeffding bound) are powerful stuff,
▶
the binary nature of BMF enables satisfying solutions to problems which are much harder in fuzzier tasks like NMF