Lifted Aggregation in Directed First-order Probabilistic Models
Jacek Kisyński and David Poole
Laboratory of Computational Intelligence, Department of Computer Science, University of British Columbia
SRL Workshop, July 3, 2009
J. Kisyński & D. Poole (UBC CS) | Lifted Aggregation | 1 / 31
What's the Plan?

1. First-order Probabilistic Models and Lifted Inference
2. Aggregation in Directed First-order Models
3. Lifted Aggregation
4. Experiments
5. Summary
Where are we? (Section 1: First-order Probabilistic Models and Lifted Inference)
First-order logic + probabilistic graphical models = first-order probabilistic models

First-order probabilistic models (probabilistic relational models) combine desired features of
- probabilistic graphical models: the ability to represent (in)dependencies between random variables;
- first-order logic: existential and universal quantification, and the ability to represent relations.

There are many different first-order probabilistic modeling frameworks and programming languages. This talk is not tied to any particular approach; in particular, we discuss inference at the level of data structures.
Parameterized random variable: a common building block of first-order probabilistic models

A parameterized random variable is a random variable parameterized by logical variables; it represents a set of random variables.

Examples:
- played(Person) with range {false, true}. If the domain of the logical variable Person has 2,000,000 elements, played(Person) represents 2,000,000 random variables, each with range {false, true}.
- matched(Person) with range {0, 1, 2, 3, 4, 5, 6}. With the same domain size, matched(Person) represents 2,000,000 random variables, each with range {0, 1, 2, 3, 4, 5, 6}.
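A parameterized random variable can be pictured as a template that, once the domains of its logical variables are fixed, expands into one random variable per assignment. A minimal Python sketch (the `ground` helper and the tiny three-person domain are hypothetical, standing in for the 2,000,000-person example above):

```python
from itertools import product

def ground(name, domains, rv_range):
    """Expand a parameterized random variable into the set of random
    variables it represents: one per assignment of its logical variables."""
    return {"%s(%s)" % (name, ", ".join(assignment)): rv_range
            for assignment in product(*domains)}

# A three-person domain stands in for the 2,000,000-person example.
people = ["jan", "sylwia", "magda"]
rvs = ground("played", [people], ("false", "true"))
# rvs has one entry per person, e.g. "played(jan)" -> ("false", "true")
```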
Inference in first-order probabilistic models

[Diagram, shown three times to highlight each route: a first-order model can be propositionalized into a propositional model; propositional inference on that model yields a propositional posterior, which corresponds, again via propositionalization, to the first-order posterior. Lifted inference instead maps the first-order model directly to the first-order posterior.]

The three routes are knowledge-based model construction, compilation techniques, and lifted inference.
Lifted inference in a nutshell

Idea behind lifted inference: given the same information about a parameterized random variable,
- manipulate it as a single entity;
- avoid reasoning about every single instance of the parameterized random variable.

The C-FOVE lifted inference algorithm (Milch et al., 2008)
- is a lifted variable elimination algorithm (Zhang and Poole, 1994);
- is based on work by de Salvo Braz et al. (2007) and Poole (2003);
- performs exact inference;
- is focused on undirected models.

There is also earlier work on structured variable elimination by Koller and Pfeffer (1997), as well as recent work on approximate lifted inference by Singla and Domingos (2008), Sen et al. (2009), and Kersting et al. (2009).
Where are we? (Section 2: Aggregation in Directed First-order Models)
Aggregation occurs frequently in directed models

[Figure: FIRST-ORDER model with parent played(Person) and child jackpot won(), next to the PROPOSITIONAL model in which played(jan), played(sylwia), played(magda) are all parents of jackpot won().]

A parent random variable is parameterized by a logical variable that is not present in the child random variable. The number of instances of the parent parameterized random variable equals the size of the domain of Person, and their common effect aggregates in the child random variable.
Representing aggregation - desiderata

[Same figure as on the previous slide.]

The length of the representation should be independent of the sizes of the domains of logical variables. The cost of inference should be logarithmic or linear in the sizes of the domains of logical variables.
Representing aggregation - tabular representations

[Same figure as above.]

A tabular representation of the conditional probability of the child variable given its parents is not adequate: the length of such a representation is exponential in the size of the domain of the logical variable Person.
Representing aggregation - counting formulas

Counting formulas (Milch et al., 2008) are a special case of parameterized random variables: we need to know how many instances of a parameterized random variable have a particular value; we do not care which instances have this value.

[Table: the range of #played(Person) is the set of histograms (#false, #true): (n, 0), (n-1, 1), ..., (1, n-1), (0, n).]

A counting formula can be used to represent aggregation, but the range of the counting formula depends on the size of the domain of the extra logical variable.
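When the instances are exchangeable and independent, the distribution over such a histogram is binomial, which is one way to see why counting keeps the representation small. A hedged Python sketch (the function name and parameters are illustrative, not part of C-FOVE):

```python
from math import comb

def counting_formula_dist(n, p_true):
    """Distribution over histograms (#false, #true) for n exchangeable,
    independent binary instances, each true with probability p_true.
    This is the binomial distribution, keyed by histogram."""
    return {(n - k, k): comb(n, k) * p_true**k * (1 - p_true)**(n - k)
            for k in range(n + 1)}

dist = counting_formula_dist(4, 0.1)
# The range has n + 1 = 5 histograms: linear in n, not exponential.
```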
Representing aggregation - noisy MAX/MIN factorization

The noisy MAX/MIN factorization of Díez and Galán (2003) can be used to compactly represent aggregation.

[Figure: NOISY-OR with parents played(jan), played(sylwia), played(magda) and child jackpot won(), next to the FACTORIZED NOISY-OR, which chains auxiliary jackpot won variables.]

It is limited to these two operators.
Causal independence comes to the rescue

We use a commutative and associative deterministic binary operator over the range of the child variable as the aggregation operator: OR/MAX, AND/MIN, SUM, ...

Given probabilistic input to the parent variable, we can construct any causal independence model (Zhang and Poole, 1996).
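These operators can be written down directly. The sketch below (hypothetical parent values; not the paper's implementation) aggregates deterministic parent values with OR, MAX, and a capped SUM:

```python
from functools import reduce

# Commutative, associative aggregation operators over the child's range.
OR = lambda a, b: a or b
MAX = max
def capped_sum(cap):
    return lambda a, b: min(a + b, cap)

got_6 = [False, True, False, False, True]  # hypothetical parent values
matched = [3, 5, 1, 6, 2]

jackpot_won = reduce(OR, got_6)                           # True
best_match = reduce(MAX, matched)                         # 6
jackpot_winners = reduce(capped_sum(3), map(int, got_6))  # 2
```

Commutativity and associativity are what make the decomposition later in the talk possible: the parents can be combined pairwise in any order.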
OR-based aggregation

[Figure: FIRST-ORDER model played(Person) → got 6(Person), with got 6(Person) aggregated into jackpot won(); PROPOSITIONAL model with got 6(jan), got 6(sylwia), got 6(magda) as parents of jackpot won().]

Logical disjunction is used as the aggregation operator.
MAX-based aggregation

[Figure: FIRST-ORDER model played(Person) → matched(Person), with matched(Person) aggregated into best match(); PROPOSITIONAL model with matched(jan), matched(sylwia), matched(magda) as parents of best match().]

Maximum is used as the aggregation operator.
SUM-based aggregation

[Figure: FIRST-ORDER model played(Person) → got 6(Person), with got 6(Person) aggregated into jackpot winners(); PROPOSITIONAL model with got 6(jan), got 6(sylwia), got 6(magda) as parents of jackpot winners().]

Arithmetic addition (with a "cap") is used as the aggregation operator.
Where are we? (Section 3: Lifted Aggregation)
We can decompose the aggregation

[Figure: a balanced binary tree of the operator ⊗ applied to got 6(jan), got 6(sylwia), ..., got 6(magda); the first level produces intermediate results c_{1,1}, c_{1,2}, ..., c_{1,n/2}, the next-to-last level produces c_{(log2 n)-1,1} and c_{(log2 n)-1,2}, and the root is jackpot won() = c_{log2 n,1}.]

The binary operator ⊗ is commutative and associative. In the general case, we use a square-and-multiply method (Piṅgala, c. 200 B.C.).
Square-and-multiply

Combining two identical parent factors with the OR operator:

    got 6(jan):    F: 0.9   T: 0.1
    got 6(sylwia): F: 0.9   T: 0.1

    c_{1,1} (OR):  F: 0.9 · 0.9 = 0.81
                   T: 0.9 · 0.1 + 0.1 · 0.9 + 0.1 · 0.1 = 0.19
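This combination step, and its square-and-multiply repetition, can be sketched in Python. This is an illustrative reimplementation under an i.i.d. assumption, not the authors' code; `combine` computes the distribution of op(X, Y) for independent X and Y, and `aggregate` repeats it using O(log n) combine calls:

```python
def combine(dist_a, dist_b, op):
    """Distribution of op(X, Y) for independent X ~ dist_a, Y ~ dist_b."""
    out = {}
    for va, pa in dist_a.items():
        for vb, pb in dist_b.items():
            v = op(va, vb)
            out[v] = out.get(v, 0.0) + pa * pb
    return out

def aggregate(dist, n, op):
    """Square-and-multiply: distribution of the op-aggregate of n
    i.i.d. copies of dist, using O(log n) calls to combine."""
    result, power = None, dist
    while n:
        if n & 1:  # "multiply" step for each set bit of n
            result = power if result is None else combine(result, power, op)
        power = combine(power, power, op)  # "square" step
        n >>= 1
    return result

got_6 = {False: 0.9, True: 0.1}
c_1_1 = combine(got_6, got_6, lambda a, b: a or b)
# c_1_1 == {False: 0.81, True: 0.19}
```

The cost of `aggregate` is logarithmic in the domain size, matching the desideratum stated earlier in the talk.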
What if instances of a parent variable are dependent?

[Figure: as before, but with a context variable big jackpot() as an additional parent of played(Person) in the FIRST-ORDER model, and of each of played(jan), played(sylwia), played(magda) in the PROPOSITIONAL model.]
Square-and-multiply with context variables

The parent factors now also involve the context variable big jackpot(), and the combination is carried out separately for each of its values:

    got 6(jan) given big jackpot() = F:  F: 0.9   T: 0.1
    got 6(jan) given big jackpot() = T:  F: 0.95  T: 0.05

    c_{1,1} (OR) given big jackpot() = F:  F: 0.9 · 0.9
                                           T: 0.9 · 0.1 + 0.1 · 0.9 + 0.1 · 0.1
    c_{1,1} (OR) given big jackpot() = T:  F: 0.95 · 0.95
                                           T: 0.95 · 0.05 + 0.05 · 0.95 + 0.05 · 0.05
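With a shared context variable, the same combination is simply performed once per context value. A hypothetical sketch (names illustrative, not the authors' code) reproducing these numbers:

```python
def combine(dist_a, dist_b, op):
    """Distribution of op(X, Y) for independent X ~ dist_a, Y ~ dist_b."""
    out = {}
    for va, pa in dist_a.items():
        for vb, pb in dist_b.items():
            v = op(va, vb)
            out[v] = out.get(v, 0.0) + pa * pb
    return out

# Parent factor for got 6, conditioned on the context variable big_jackpot.
got_6_given_ctx = {False: {False: 0.9, True: 0.1},
                   True: {False: 0.95, True: 0.05}}

# The context variable is shared by all instances, so each combine step
# is performed separately for every value of big_jackpot.
c_1_1_given_ctx = {ctx: combine(d, d, lambda a, b: a or b)
                   for ctx, d in got_6_given_ctx.items()}
# big_jackpot = False: {False: 0.81, True: 0.19}
# big_jackpot = True:  {False: 0.9025, True: 0.0975}
```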
Lifted factorization

The noisy MAX/MIN factorization of Díez and Galán (2003) can be lifted.

[Figure: NOISY-OR over played(jan), played(sylwia), played(magda), next to the LIFTED FACTORIZED NOISY-OR, which uses the parameterized random variable played(Person).]

It is limited to these two operators.
Where are we? (Section 4: Experiments)
Experiments

We compared the following algorithms:
- variable elimination (VE) (Zhang and Poole, 1994);
- variable elimination with noisy MAX/MIN factorization (VE-FCT) (Díez and Galán, 2003);
- C-FOVE (Milch et al., 2008);
- C-FOVE with lifted noisy MAX/MIN factorization (C-FOVE-FCT);
- C-FOVE with operator-based aggregation (AC-FOVE).

How much time does it take to run out of 1 GB of memory?
Experiments - OR-based aggregation

[Plot: time [ms] vs. n = |D(Person)|, both axes logarithmic (n up to 10^6), for VE, VE-FCT, C-FOVE, AC-FOVE, and C-FOVE-FCT.]
Experiments - MAX-based aggregation

[Plot: time [ms] vs. n = |D(Person)|, both axes logarithmic (n up to 10^6), for VE, VE-FCT, C-FOVE, AC-FOVE, and C-FOVE-FCT.]
Experiments - SUM-based aggregation

[Plot: time [ms] vs. n = |D(Person)|, both axes logarithmic (n up to 10^6), for VE, C-FOVE, and AC-FOVE.]
Experiments - social network modeling

[Plot: time [ms] vs. n, both axes logarithmic, for VE, VE-FCT, C-FOVE, AC-FOVE, and C-FOVE-FCT.]
Summary

- Aggregation is an important component of directed first-order probabilistic models.
- Causal independence can be used to incorporate aggregation into these models.
- The square-and-multiply method enables efficient inference.
References

- Rodrigo de Salvo Braz, Eyal Amir, and Dan Roth. Lifted first-order probabilistic inference. In Lise Getoor and Ben Taskar, editors, Introduction to Statistical Relational Learning, chapter 15, pages 433–450. MIT Press, 2007.
- Francisco J. Díez and Severino F. Galán. Efficient computation for the noisy MAX. International Journal of Intelligent Systems, 18(2):165–177, 2003.
- Kristian Kersting, Babak Ahmadi, and Sriraam Natarajan. Counting belief propagation. In 25th UAI, 2009.
- Daphne Koller and Avi Pfeffer. Object-oriented Bayesian networks. In 13th UAI, pages 302–313, 1997.
- Brian Milch, Luke S. Zettlemoyer, Kristian Kersting, Michael Haimes, and Leslie Pack Kaelbling. Lifted probabilistic inference with counting formulas. In 23rd AAAI, pages 1062–1068, 2008.
- Piṅgala. Chandaḥ-sūtra, c. 200 B.C.
- David Poole. First-order probabilistic inference. In 18th IJCAI, pages 985–991, 2003.
- Prithviraj Sen, Amol Deshpande, and Lise Getoor. Bisimulation-based approximate lifted inference. In 25th UAI, 2009.
- Parag Singla and Pedro Domingos. Lifted first-order belief propagation. In 23rd AAAI, pages 1094–1099, 2008.
- Nevin Lianwen Zhang and David Poole. Exploiting causal independence in Bayesian network inference. Journal of Artificial Intelligence Research, 5:301–328, 1996.
- Nevin Lianwen Zhang and David Poole. A simple approach to Bayesian network computations. In 10th CAI, pages 171–178, 1994.