Marginal Inference in MRFs using Frank-Wolfe
David Belanger, Daniel Sheldon, Andrew McCallum
School of Computer Science, University of Massachusetts, Amherst
{belanger,sheldon,mccallum}@cs.umass.edu
December 10, 2013
Table of Contents
1 Markov Random Fields
2 Frank-Wolfe for Marginal Inference
3 Optimality Guarantees and Convergence Rate
4 Beyond MRFs
5 Fancier FW
Markov Random Fields
$\Phi_\theta(x) = \sum_{c \in C} \theta_c(x_c)$
$P(x) = \exp(\Phi_\theta(x)) / Z = \exp(\Phi_\theta(x) - \log Z)$
$x \to \mu, \qquad \Phi_\theta(x) \to \langle \theta, \mu \rangle$
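The definitions on this slide can be checked numerically on a tiny model. The sketch below is not from the talk: it assumes a hypothetical 3-variable binary chain with made-up log-potentials, and it assumes that $\mu$ is the usual overcomplete indicator vector with one 0/1 entry per (clique, joint value) pair. Under those assumptions, the clique-sum score and the inner product $\langle \theta, \mu \rangle$ coincide, and normalizing by $Z$ yields a distribution.

```python
import itertools
import math

# Hypothetical toy model (not from the slides): binary chain 0 - 1 - 2.
cliques = [(0, 1), (1, 2)]
# theta[c][v] is the log-potential of clique c taking joint value v.
theta = {
    (0, 1): {(0, 0): 1.0, (0, 1): -0.5, (1, 0): -0.5, (1, 1): 1.0},
    (1, 2): {(0, 0): 0.3, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.8},
}

def score(x):
    """Phi_theta(x): sum of clique log-potentials."""
    return sum(theta[c][tuple(x[i] for i in c)] for c in cliques)

def indicator(x):
    """Assumed overcomplete encoding mu(x): one 0/1 entry per (clique, value)."""
    return {(c, v): 1.0 if tuple(x[i] for i in c) == v else 0.0
            for c in cliques for v in theta[c]}

def inner(theta, mu):
    """<theta, mu> over all (clique, value) coordinates."""
    return sum(theta[c][v] * mu[(c, v)] for c in cliques for v in theta[c])

# Brute-force partition function; only feasible for toy models.
states = list(itertools.product([0, 1], repeat=3))
log_Z = math.log(sum(math.exp(score(x)) for x in states))

def prob(x):
    """P(x) = exp(Phi_theta(x) - log Z)."""
    return math.exp(score(x) - log_Z)
```

Enumerating all states is exponential in the number of variables, which is why the rest of the talk turns to variational formulations.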
Marginal Inference
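$\mu_{\mathrm{MARG}} = \mathbb{E}_{P_\theta}[\mu]$
$\mu_{\mathrm{MARG}} = \arg\max_{\mu \in \mathcal{M}} \langle \mu, \theta \rangle + H_{\mathcal{M}}(\mu)$
$\bar{\mu}_{\mathrm{approx}} = \arg\max_{\mu \in \mathcal{L}} \langle \mu, \theta \rangle + H_B(\mu)$
$H_B(\mu) = \sum_{c \in C} W_c H(\mu_c)$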
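A small numerical check may make the variational form concrete. The sketch below reuses a hypothetical 3-variable binary chain with made-up potentials. The slide does not say what the counting numbers $W_c$ are, so the code assumes the standard Bethe choice: weight 1 for each edge and weight 1 minus the degree for each single-node clique. On a tree this entropy is exact, so the objective evaluated at the true marginals should equal $\log Z$.

```python
import itertools
import math

# Hypothetical toy model (not from the slides): binary chain 0 - 1 - 2.
edges = [(0, 1), (1, 2)]
theta = {
    (0, 1): {(0, 0): 1.0, (0, 1): -0.5, (1, 0): -0.5, (1, 1): 1.0},
    (1, 2): {(0, 0): 0.3, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.8},
}

def score(x):
    return sum(theta[e][tuple(x[i] for i in e)] for e in edges)

states = list(itertools.product([0, 1], repeat=3))
log_Z = math.log(sum(math.exp(score(x)) for x in states))
P = {x: math.exp(score(x) - log_Z) for x in states}

def marginal(variables):
    """Exact marginal over a subset of variables, i.e. E_P[mu] on that clique."""
    m = {}
    for x, p in P.items():
        key = tuple(x[i] for i in variables)
        m[key] = m.get(key, 0.0) + p
    return m

def entropy(dist):
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

edge_marg = {e: marginal(e) for e in edges}
node_marg = {i: marginal((i,)) for i in range(3)}
degree = {i: sum(i in e for e in edges) for i in range(3)}

# <mu, theta> evaluated at the exact marginals.
expected_score = sum(theta[e][v] * edge_marg[e][v]
                     for e in edges for v in theta[e])
# Assumed Bethe counting numbers: W_edge = 1, W_node = 1 - degree.
bethe_entropy = (sum(entropy(edge_marg[e]) for e in edges)
                 + sum((1 - degree[i]) * entropy(node_marg[i]) for i in range(3)))
objective = expected_score + bethe_entropy
```

On graphs with cycles the same formula is only an approximation, which is the situation the $\bar{\mu}_{\mathrm{approx}}$ problem on this slide addresses.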
MAP Inference
$\mu_{\mathrm{MAP}} = \arg\max_{\mu \in \mathcal{M}} \langle \mu, \theta \rangle$
[Diagram: $\theta$ → Black Box MAP Solver → $\mu_{\mathrm{MAP}}$; $\theta$ → Gray Box MAP Solver → $\mu_{\mathrm{MAP}}$]
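To illustrate what a black-box MAP solver's interface might look like, here is a minimal brute-force sketch. The function name `map_solver`, the toy potentials, and the dictionary encoding of $\mu$ are all hypothetical; a real solver would use something far more efficient than enumeration. The point is only that the solver takes $\theta$ and returns a vertex of $\mathcal{M}$.

```python
import itertools

# Hypothetical toy model (not from the slides): binary chain 0 - 1 - 2.
theta = {
    (0, 1): {(0, 0): 1.0, (0, 1): -0.5, (1, 0): -0.5, (1, 1): 1.0},
    (1, 2): {(0, 0): 0.3, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.8},
}

def map_solver(theta, num_vars=3):
    """Brute-force stand-in for a black-box MAP solver.

    Returns the highest-scoring assignment and its indicator vector mu,
    which is a vertex of the marginal polytope.
    """
    def score(x):
        return sum(theta[c][tuple(x[i] for i in c)] for c in theta)

    best = max(itertools.product([0, 1], repeat=num_vars), key=score)
    mu = {(c, v): 1.0 if tuple(best[i] for i in c) == v else 0.0
          for c in theta for v in theta[c]}
    return best, mu

x_map, mu_map = map_solver(theta)
```

Because the caller only sees $\theta$ going in and $\mu_{\mathrm{MAP}}$ coming out, any exact MAP routine could be substituted without changing the surrounding code.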
Marginal → MAP Reductions
Hazan and Jaakkola [2012]; Ermon et al. [2013]
Generic FW with Line Search
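$y_t = \arg\min_{x \in X} \langle x, \nabla f(x_{t-1}) \rangle$
$\gamma_t = \arg\min_{\gamma \in [0,1]} f((1-\gamma) x_{t-1} + \gamma y_t)$
$x_t = (1-\gamma_t) x_{t-1} + \gamma_t y_t$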
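The iteration can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' implementation: the feasible set is the probability simplex, the objective is a made-up convex quadratic, and the line search is a simple ternary search over $\gamma \in [0,1]$, which is valid because $f$ is convex along the segment. The names `frank_wolfe` and `simplex_lmo` are hypothetical.

```python
def simplex_lmo(g):
    """Linear minimization over the simplex: the vertex at the smallest g_i."""
    k = min(range(len(g)), key=lambda i: g[i])
    return [1.0 if i == k else 0.0 for i in range(len(g))]

def frank_wolfe(f, grad, lmo, x0, num_iters=200):
    """Frank-Wolfe with a ternary line search; assumes f is convex."""
    x = list(x0)
    for _ in range(num_iters):
        # y = argmin over the feasible set of <s, grad f(x)>
        y = lmo(grad(x))

        def step(gamma):
            return [(1 - gamma) * xi + gamma * yi for xi, yi in zip(x, y)]

        # Ternary search for the best step size in [0, 1].
        lo, hi = 0.0, 1.0
        for _ in range(60):
            m1 = lo + (hi - lo) / 3
            m2 = hi - (hi - lo) / 3
            if f(step(m1)) <= f(step(m2)):
                hi = m2
            else:
                lo = m1
        x = step(0.5 * (lo + hi))
    return x

# Made-up objective: squared distance to a target point inside the simplex.
target = [0.2, 0.5, 0.3]

def f(x):
    return 0.5 * sum((xi - ti) ** 2 for xi, ti in zip(x, target))

def grad(x):
    return [xi - ti for xi, ti in zip(x, target)]

x_star = frank_wolfe(f, grad, simplex_lmo, [1.0, 0.0, 0.0])
```

Every iterate is a convex combination of feasible points, so no projection is needed. In the MRF setting, the linear step would presumably be a MAP call with parameters $-\nabla f(x_{t-1})$, since minimizing $\langle \mu, \nabla f \rangle$ is the same as maximizing $\langle \mu, -\nabla f \rangle$.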
$x_t$
Compute Gradient
$\nabla f(x_{t-1})$
Linear&& Minimiza