of this tutorial. 1 Rationale. 2 Basic notions. 3 Semantics ... starting from the 1990's, growing number of applications in AI, information fusion, classification ...
Belief functions for the working scientist F. Cuzzolin and T. Denoeux
Belief functions for the working scientist A UAI 2015 Tutorial
Rationale Basic notions Semantics
F. Cuzzolin1 and T. Denoeux2
Inference Conditioning Computation
1 Department
of Computing and Communication Technologies Oxford Brookes University, UK
Propagation Decisions BFs on reals Sister theories Advances
2 Université
de Technologie de Compiegne, France HEUDIASYC (UMR CNRS 7253)
UAI 2015
Toolbox Applications Future trends F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
1 / 229
Belief functions for the working scientist
Outline of this tutorial
F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Decisions BFs on reals Sister theories
1 Rationale
5 Conditioning
10 Sister theories
6 Computation
11 Advances
7 Propagation
12 Toolbox
2 Basic notions 3 Semantics 8 Decisions 4 Inference 9 BFs on reals
13 Applications 14 Future trends
Advances Toolbox Applications Future trends F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
2 / 229
Rationale Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Decisions BFs on reals Sister theories Advances Toolbox
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Applications Future trends F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
3 / 229
Rationale Belief functions for the working scientist
Some history
F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Decisions BFs on reals Sister theories
• a mathematical framework for representing and reasoning with uncertain information
• also known as Dempster-Shafer (DS) theory or Evidence theory • originates from the work of Dempster (1968) in the context of statistical inference
• formalized by Shafer (1976) as a theory of evidence • popularized and developed by Smets in the 1980’s and 1990’s under the name Transferable Belief Model.
• starting from the 1990’s, growing number of applications in AI, information fusion, classification, reliability and risk analysis, etc.
Advances Toolbox Applications Future trends F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
4 / 229
Rationale Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale
Rationale 1 a modeling language for representing elementary items of
evidence and combining them, in order to form a representation of our beliefs about certain aspects of the world
Basic notions Semantics Inference Conditioning Computation
2 the theory of belief function subsumes both the set-based and
probabilistic approaches to uncertainty: • a belief function may be viewed both as a generalized set and as a non additive measure
• basic mechanisms for reasoning with belief functions extend both
Propagation
probabilistic operations (such as marginalization and conditioning) and set-theoretic operations (such as intersection and union)
Decisions BFs on reals Sister theories Advances
3 DS reasoning produces the same results as probabilistic reasoning
or interval analysis when provided with the same information 4 however, its greater expressive power allows us to handle more
general forms of information
Toolbox Applications Future trends
F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
5 / 229
Basic notions Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Belief functions Dempster’s rule Families of frames
Semantics Inference Conditioning Computation Propagation Decisions BFs on reals Sister theories Advances
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Toolbox Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
6 / 229
Basic notions
Belief functions
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Belief functions Dempster’s rule Families of frames
Semantics Inference Conditioning Computation Propagation Decisions BFs on reals Sister theories Advances
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Toolbox Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
7 / 229
Basic notions Belief functions for the working scientist F. Cuzzolin and T. Denoeux
Belief functions
Mass functions or “Basic Probability Assignments”
• let ω be an unknown quantity with possible values in a finite domain Ω, called the frame of discernment
Rationale Basic notions Belief functions Dempster’s rule Families of frames
• a piece of evidence about ω may be represented by a mass function m on Ω, defined as a function 2Ω → [0, 1], such that: P m(∅) = 0 A⊆Ω m(A) = 1
Semantics Inference Conditioning Computation Propagation Decisions BFs on reals
• P(Ω) = 2Ω is the set of all subsets of Ω • any subset A of Ω such that m(A) > 0 is called a focal element (FE) or “focal set” of m
• special cases: • a logical (or “categorical”) mass function has one focal set (∼ set) • a Bayesian mass function has only focal sets of cardinality one (∼ probability distribution)
Sister theories Advances
• complete ignorance is represented by the vacuous mass function defined by mΩ (ω) = 1
Toolbox Applications
F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
8 / 229
Basic notions Belief functions for the working scientist
Belief functions
Mass function Example
F. Cuzzolin and T. Denoeux
• a mass function encodes evidence directly supporting Rationale Basic notions Belief functions Dempster’s rule Families of frames
Semantics Inference Conditioning Computation Propagation Decisions BFs on reals
propositions [Shafer, 1976] - let us see an example
• a murder has been committed. There are three suspects: Ω = {Peter , John, Mary }
• available evidence: a witness saw the murderer going away, but he is short-sighted and he only saw that it was a man. We know that the witness is drunk 20 % of the time
• if the witness was not drunk, we know that ω ∈ {Peter , John} • otherwise, we only know that ω ∈ Ω. The first case holds with probability 0.8
• the corresponding mass function is:
Sister theories Advances
m({Peter , John}) = 0.8,
m(Ω) = 0.2
Toolbox Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
9 / 229
Basic notions Belief functions for the working scientist
Belief functions
Random set interpretation of belief functions
F. Cuzzolin and T. Denoeux Rationale Basic notions Belief functions Dempster’s rule Families of frames
Semantics Inference Conditioning Computation Propagation Decisions BFs on reals Sister theories Advances
• a mass assignment on Ω is induced by a probability distribution P on a different domain C, via a multi-valued mapping Γ : C → 2Ω
• Γ maps elements c ∈ C (“codes”) to subsets of Ω • this is a random set, i.e, a set-valued random variable • in the example, the source of the evidence is the probability P1 that the witness is drunk (or not)
• Γ1 maps {not drunk } ∈ C1 to {Peter , John} ⊂ Ω
Toolbox Applications
F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
10 / 229
Basic notions Belief functions for the working scientist
Belief (BFs) and plausibility functions
F. Cuzzolin and T. Denoeux Rationale Basic notions Belief functions Dempster’s rule Families of frames
Semantics
Belief functions
induced by a mass function
• for any A ⊆ Ω, we can define: • the total degree of support (belief) in A as the probability that the evidence implies A: X Bel(A) = P({c ∈ C|Γ(c) ⊆ A}) = m(B)
Inference Conditioning Computation
B⊆A
• the plausibility of A as the probability that the evidence does not contradict A:
Propagation
Pl(A) = P({c ∈ C|Γ(c) ∩ A 6= ∅}) = 1 − Bel(A)
Decisions BFs on reals Sister theories
• the uncertainty on the truth value of the proposition “ω ∈ A” is represented by two numbers: Bel(A) and Pl(A), with
Advances
Bel(A) ≤ Pl(A)
Toolbox Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
11 / 229
Basic notions Belief functions for the working scientist
Special cases of belief functions they generalise both probabilities and possibilities
F. Cuzzolin and T. Denoeux Rationale Basic notions Belief functions Dempster’s rule Families of frames
Semantics Inference
Belief functions
• if all focal sets of m are singletons, then m is said to be Bayesian • it is equivalent to a probability distribution, and Bel = Pl is a probability measure
• if the focal sets of m are nested, then m is said to be consonant • in that case Pl is a possibility measure, i.e.,
Conditioning
Pl(A ∪ B) = max(Pl(A), Pl(B)),
Computation Propagation Decisions BFs on reals
∀A, B ⊆ Ω,
and Bel is the dual necessity measure
• the contour function pl(ω) = Pl({ω}) corresponds to the possibility distribution (membership function)
Sister theories Advances Toolbox Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
12 / 229
Basic notions
Dempster’s rule
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Belief functions Dempster’s rule Families of frames
Semantics Inference Conditioning Computation Propagation Decisions BFs on reals Sister theories Advances
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Toolbox Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
13 / 229
Basic notions Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Belief functions Dempster’s rule Families of frames
Semantics Inference Conditioning Computation Propagation Decisions BFs on reals Sister theories Advances
Dempster’s rule
Combination of evidence Murder example continued
• when we have separate bodies of evidence, each represented by a belief function, can we combine them in order to estimate the state of the world, or make a decision?
• the first item of evidence gave us: m1 ({Peter , John}) = 0.8, m1 (Ω) = 0.2
• new piece of evidence: a blond hair has been found • also, there is a probability 0.6 that the room has been cleaned before the crime
• this second body of evidence is encoded by the mass assignment m2 ({John, Mary }) = 0.6, m2 (Ω) = 0.4
• how to combine these two pieces of evidence? • again, an answer can be given within the “random set” interpretation of belief functions
Toolbox Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
14 / 229
Basic notions Belief functions for the working scientist
Dempster’s rule
Combination of evidence
F. Cuzzolin and T. Denoeux
drunk (0.2)
(C1, P1)
not drunk (0.8)
Rationale
Γ1
Ω
Peter
Basic notions
Mary
Belief functions
(C2, P2)
Dempster’s rule Families of frames
Semantics
Γ2
Inference
not cleaned (0.4)
Conditioning Computation Propagation Decisions BFs on reals Sister theories Advances Toolbox
John
cleaned (0.6)
• if codes c1 ∈ C1 and c2 ∈ C2 were selected, ω ∈ Γ1 (c1 ) ∩ Γ2 (c2 ) • if the codes are selected independently, then the probability that the pair (c1 , c2 ) is selected is P1 ({c1 })P2 ({c2 })
• if Γ1 (c1 ) ∩ Γ2 (c2 ) = ∅, (c1 , c2 ) cannot be selected, hence: • the joint probability distribution on C1 × C2 must be conditioned to eliminate such pairs
Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
15 / 229
Basic notions
Dempster’s rule
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Belief functions Dempster’s rule Families of frames
Dempster’s rule Definition
• under these assumptions we get Dempster’s rule of combination • let m1 and m2 be two mass functions on the same frame Ω, induced by two independent pieces of evidence
• their combination using Dempster’s rule is defined as:
Semantics
(m1 ⊕ m2 )(A) =
Inference
X 1 m1 (B)m2 (C), 1−κ
Conditioning Computation
where κ=
Propagation
Sister theories Advances
X
m1 (B)m2 (C)
B∩C=∅
Decisions BFs on reals
∀∅ 6= A ⊆ Ω,
B∩C=A
is the degree of conflict between m1 and m2
• their Dempster’s sum m1 ⊕ m2 exists iff κ < 1 • can be easily extended to any number of BFs
Toolbox Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
16 / 229
Basic notions Belief functions for the working scientist F. Cuzzolin and T. Denoeux
Dempster’s rule
Dempster’s rule A simple numerical example
Rationale Basic notions Belief functions Dempster’s rule Families of frames
Semantics Inference Conditioning Computation Propagation Decisions BFs on reals Sister theories Advances Toolbox Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
17 / 229
Basic notions Belief functions for the working scientist
Dempster’s rule
Dempster’s rule Properties
F. Cuzzolin and T. Denoeux Rationale Basic notions Belief functions Dempster’s rule Families of frames
Semantics Inference
• Dempster’s rule has some interesting properties: • commutativity, associativity, existence of a neutral element: the vacuous BF mΩ
• it generalises set-theoretical intersection: if mA and mB are logical mass functions and A ∩ B 6= ∅, then
Conditioning
mA ⊕ mB = mA∩B
Computation Propagation Decisions BFs on reals Sister theories
• it generalises probabilistic conditioning via Bayes’ rule: if m is a Bayesian mass function and mA is a logical mass function, then m ⊕ mA is a Bayesian mass function that corresponding to Bayes’ conditioning of m by A
Advances Toolbox Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
18 / 229
Basic notions
Families of frames
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Belief functions Dempster’s rule Families of frames
Semantics Inference Conditioning Computation Propagation Decisions BFs on reals Sister theories Advances
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Toolbox Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
19 / 229
Basic notions Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Belief functions Dempster’s rule Families of frames
Semantics Inference Conditioning Computation
Families of frames
Families of frames Refinements and coarsenings
• the theory allows us to handle evidence impacting on different but related domains
• assume we are interested in the nature of an object in a road scene. We could describe it, e.g., in the frame Θ = {vehicle, pedestrian}, or in the finer frame Ω = {car, bicycle, motorcycle, pedestrian}
• other example: different image features in pose estimation • a frame Ω is a refinement of a frame Θ (or, equivalently, Θ is a coarsening of Ω) if elements of Ω can be obtained by splitting some or all of the elements of Θ
Propagation Decisions
Θ
BFs on reals Sister theories Advances
ρ
θ1
Ω
θ2 θ3
Toolbox Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
20 / 229
Basic notions Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Belief functions Dempster’s rule Families of frames
Semantics Inference Conditioning Computation Propagation Decisions BFs on reals Sister theories Advances Toolbox
Families of frames
Compatible frames Families of frames
• when Ω is a refinement for a collection Θ1 , ..., ΘN of other frames it is called their common refinement
• two frames are said to be compatible if they do have a common refinement
• compatible frames can be associated with different variables/attributes/features: • let ΩX = {red, blue, green} and ΩY = {small, medium, large} be the domains of attributes X and Y describing, respectively, the color and the size of an object • in such a case the common refinement ΩX × ΩY is ΩX and ΩY
• or, they can be descriptions of the same variable at different levels of granularity (as in the road scene example)
• evidence can be moved from one frame to another within a family of compatible frames
Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
21 / 229
Basic notions
Families of frames
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale
Marginalization • let ΩX and ΩY be two compatible frames • let mXY be a mass function on ΩX × ΩY • it can be expressed in the coarser frame ΩX by transferring each mass mXY (A) to the projection of A on ΩX :
Basic notions Belief functions Dempster’s rule Families of frames
Semantics Inference Conditioning Computation Propagation Decisions
• we obtain a marginal mass function on ΩX : mXY ↓X (B) =
BFs on reals Sister theories Advances Toolbox
X
mXY (A) ∀B ⊆ ΩX
{A⊆ΩXY ,A↓ΩX =B}
• (again, it generalizes both set projection and probabilistic marginalization)
Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
22 / 229
Basic notions Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale
Families of frames
Vacuous extension • the “inverse” of marginalization • a mass function mX on ΩX can be expressed in ΩX × ΩY by transferring each mass mX (B) to the cylindrical extension of B:
Basic notions Belief functions Dempster’s rule Families of frames
Semantics Inference Conditioning Computation
• this operation is called the vacuous extension of mX in ΩX × ΩY :
Propagation
(
Decisions
m
BFs on reals Sister theories Advances Toolbox
X ↑XY
(A) =
mX (B) if A = B × ΩY 0 otherwise
• a strong feature of belief theory: the vacuous belief function (our representation of ignorance) is left unchanged when moving from one compatible frame to another
Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
23 / 229
Semantics Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
Inference Conditioning Computation Propagation Decisions BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Sister theories Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
24 / 229
Semantics Belief functions for the working scientist
The multiple semantics
F. Cuzzolin and T. Denoeux
• being complex objects, belief functions have a number of
of belief functions (sometimes conflicting) semantics and mathematical interpretations
Rationale Basic notions Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
Inference Conditioning Computation
• original one [Dempster 1967]: lower probabilities induced by a multivalued mapping • the mathematical representation: random set framework
• Shafer’s (1976): representations of pieces of evidence in favour of propositions within someone’s subjective state of belief • represented as set functions on a finite domain Ω
• as convex sets of probability measures, in a robust Bayesian interpretation • mathematically, a credal set whose lower and upper envelopes are
Propagation Decisions BFs on reals Sister theories Advances
belief and plausibility functions
• other equivalent mathematical formulations: • as non-additive (generalised) probabilities • as monotone capacities • as inner measures (linked to the rough set idea)
Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
25 / 229
Semantics
Lower probabilities
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
Inference Conditioning Computation Propagation Decisions BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Sister theories Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
26 / 229
Semantics Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions
Lower probabilities
Upper and lower probabilities induced by multi-valued mappings
• Dempster has shown that mapping a probability distribution via a multi-valued map yields an object more general than a probability distribution: a belief function
Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
Inference Conditioning Computation Propagation Decisions BFs on reals Sister theories
• belief and plausibility values are interpreted as lower and upper bounds to the values of an unknown, underlying probability measure: Bel(A) ≤ P(A) ≤ Pl(A) for all A ⊆ Ω
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
27 / 229
Semantics
Credal sets
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
Inference Conditioning Computation Propagation Decisions BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Sister theories Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
28 / 229
Semantics Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Lower probabilities Credal sets Set functions Generalised probabilities
Belief functions as credal sets or convex sets of probabilities
• each focal element A of mass m(A) as the indication of the existence of a mass m(A) “floating" inside A
• constraint on the probability measure on Ω: a distribution is “consistent" with Bel if it is obtained by redistributing the mass of each focal element to its singletons
• set of probabilities consistent with b:
Random sets
n o . P[Bel] = P ∈ P : P(A) ≥ Bel(A) ∀A ⊆ Ω
Inference Conditioning Computation Propagation Decisions BFs on reals Sister theories
Credal sets
• it is a polytope in the probability simplex, with vertices induced by permutations of the elements of Ω
• Shafer disavowed any probability-bound interpretation • also criticized by Walley as incompatible with Dempster’s rule of combination
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
29 / 229
Semantics
Set functions
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
Inference Conditioning Computation Propagation Decisions BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Sister theories Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
30 / 229
Semantics Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale
Set functions
Belief functions as set functions • Shafer’s definition in terms of set functions [Aigner] • a belief function Bel : 2Ω → [0, 1] is such that,
Basic notions
Bel(A) =
Semantics
X
m(B)
B⊆A
Lower probabilities Credal sets Set functions Generalised probabilities Random sets
where m : 2Ω → [0, 1] is a basic probability assignment s.t. X m(∅) = 0, m(A) = 1, m(A) ≥ 0 ∀A ⊆ Ω
Inference Conditioning Computation Propagation Decisions BFs on reals Sister theories
A⊆Ω
• operating with belief functions reduces to manipulating their focal elements
• in Shafer’s framework, the mass assignment is derived by “impact of evidence” associated with propositions via an exponential-like relation
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
31 / 229
Semantics
Generalised probabilities
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
Inference Conditioning Computation Propagation Decisions BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Sister theories Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
32 / 229
Semantics Belief functions for the working scientist F. Cuzzolin and T. Denoeux
Generalised probabilities
As non-additive probabilities or “generalised” probabilities
Probability measure A function P : F → [0, 1] over a σ-field F ⊂ 2Ω such that
Rationale Basic notions Semantics
• P(∅) = 0, P(Ω) = 1; • if A ∩ B = ∅, A, B ∈ F then P(A ∪ B) = P(A) + P(B) (additivity).
Lower probabilities Credal sets Set functions Generalised probabilities
• if we relax the third constraint to allow the function to meet additivity only as a lower bound we obtain a:
Random sets
Inference
Belief function
Conditioning
A function Bel : 2Ω → [0, 1] from the power set 2Ω to [0, 1] such that:
Computation Propagation
• Bel(∅) = 0, Bel(Ω) = 1; • for every n and for every collection A1 , ..., An ∈ 2Ω we have that:
Decisions BFs on reals Sister theories Advances Toolbox F. Cuzzolin and T. Denoeux
Bel(A1 ∪ ... ∪ An )
≥
X
Bel(Ai ) −
i
X
Bel(Ai ∩ Aj ) + · · ·
i and indifference ∼ • goal: to build a belief function Bel such that A· > B iff Bel(A) > Bel(B) and A ∼ B iff Bel(A) = Bel(B)
• exists if · > is a weak order and ∼ an equivalence relation
From preferences
Conditioning Computation Propagation Decisions BFs on reals Sister theories Advances Toolbox
• Algorithm 1 consider all propositions that appear in the preference relations as potential focal elements (FEs) 2 elimination: if A ∼ B for some B ⊂ A then A is not a FE 3 a perceptron algorithm is used to generate the mass m by solving the system of remaining equalities and disequalities
• however: it selects arbitrarily one solution over many • does not address possible inconsistency in the given preferences
Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
52 / 229
Inference Belief functions for the working scientist
From preferences
Ben Yaghlane’s constrained optimisation approach
F. Cuzzolin and T. Denoeux
Building belief functions from preferences
Rationale Basic notions Semantics Inference Dempster’s approach Likelihood-based From preferences
Conditioning Computation Propagation
• uses preferences and indifferences as in Wong and Lingras, with same axioms..
• .. but converts them into a constrained optimisation problem • objective function: maximise the entropy/uncertainty of the BF to generate (least informative result)
• constraints derived from input preferences/indifferences, i.e. A· > B ↔ Bel(A) − Bel(B) ≥ ,
A ∼ B ↔ |Bel(A) − Bel(B)| ≤
Decisions BFs on reals Sister theories
• is a constant specified by the expert • various uncertainty measures can be plugged in (see Advances)
Advances Toolbox Applications F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
53 / 229
Conditioning Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Propagation Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
54 / 229
Conditioning Belief functions for the working scientist
Conditional belief functions A variety of proposals
F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation
• many different approaches to conditioning belief functions have been proposed
• a non-exhaustive list: • original Dempster’s conditioning • lower and upper envelopes of conditional probabilities [Fagin and • • • • •
Halpern] geometric conditioning [Suppes] unnormalized conditional belief functions [Smets] generalised Jeffrey’s rules [Smets] sets of equivalent events under multi-valued mappings [Spies] conditioning by distance minimisation [Cuzzolin]
• implications of the notion of conditional belief function: • generalised Bayes theorem [Smets] • the generalisation of the total probability theorem
Propagation Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
55 / 229
Conditioning
Dempster’s conditioning
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Propagation Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
56 / 229
Conditioning Belief functions for the working scientist
Dempster’s conditioning
Dempster’s conditioning Conditioning
F. Cuzzolin and T. Denoeux
• Dempster’s rule of combination is associated with a conditioning Rationale Basic notions Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
operator
• suppose we have an “a-priori” BF Bel • given a new event A, the “logical” belief function such that m(A) = 1 can be defined ...
• ... and combined with Bel using Dempster’s rule • the resulting BF is the conditional belief function given A, a la Dempster
Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation Propagation Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
57 / 229
Conditioning
Lower conditional envelopes
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Propagation Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
58 / 229
Conditioning Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions
Lower conditional envelopes Conditioning
• Fagin and Halpern proposed an approach based on interpretation of a belief function as the lower envelope of the family of probabilities consistent with it (robust Bayesian)
Semantics
Bel(A) =
Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Lower conditional envelopes
inf
P∈P[Bel]
P(A)
• they define the conditional belief as the lower envelope (that is, the infimum) of the family of conditional probability functions P(A|B), where P is consistent with Bel: . Bel(A|B) =
Conditional events as equivalence classes
inf
P∈P[Bel]
P(A|B),
. Pl(A|B) =
sup P(A|B) P∈P[Bel]
Generalised Bayes theorem The total belief theorem
Computation Propagation
• trivially generalises conditional probability • have been considered by other authors too, e.g. Dempster 1967 and Walley 1981
Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
59 / 229
Conditioning Belief functions for the working scientist F. Cuzzolin and T. Denoeux
Lower conditional envelopes
Lower conditional envelopes versus Dempster’s
• the authors provide a closed-form expression for it:
Rationale
Bel(A|B) =
Bel(A∩B) , ¯ Bel(A∩B+Pl(A∩B)
Pl(A|B) =
Pl(A∩B) ¯ Pl(A∩B)+Bel(A∩B)
Basic notions Semantics
• lower/upper envelopes of arbitrary sets of probabilities are not in
Inference
general belief functions, but these actually are, as Fagin and Halpern have proven
Conditioning Dempster’s conditioning
• they are quite different from Dempster’s conditioning:
Lower conditional envelopes
Bel⊕ (A|B) =
Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem
Propagation Decisions
Pl⊕ (A|B) =
Pl(A∩B) Pl(B)
• in fact, they provide a more conservative estimate: Bel(A|B) ≤ Bel⊕ (A|B) ≤ Pl⊕ (A|B) ≤ Pl(A|B)
The total belief theorem
Computation
¯ Bel(A∪B) ¯ , 1−Bel(B)
• Fagin and Halpern argue that Dempster’s conditioning behaves unreasonably on their “three prisoners” example
BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
60 / 229
Conditioning
Unnormalised vs geometric conditioning
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Propagation Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
61 / 229
Conditioning Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale
Unnormalised vs geometric conditioning
Revision versus focussing in belief as opposed to probability theory
Focussing No new information is introduced, we merely focus on a specific subset of the original set.
Basic notions Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation Propagation Decisions
Belief revision A state of belief is modified to take into account a new piece of information.
• in probability theory, both are expressed by Bayes’ rule, but they are conceptually different operations
• in belief theory, these principles lead to different conditioning rules
• the application of revision and focussing to belief theory has been explored by Smets in his Transferable Belief Model (TBM)
• here we are not assuming any random set generating Bel, nor any underlying convex sets of probabilities
BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
62 / 229
Conditioning Belief functions for the working scientist
Unnormalised vs geometric conditioning
Suppes’ geometric conditioning Conditioning
F. Cuzzolin and T. Denoeux
• the geometric conditioning proposed by Suppes and Zanotti Rationale Basic notions
BelG (A|B) =
Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Bel(A ∩ B) , Bel(B)
is indeed a consequence of the focussing idea
• (this was proved by Smets using the “probability of provability” interpretation of belief functions, yes, yet another one!)
• somewhat dual to Dempster’s conditioning, as it replaces probability with belief in Bayes’ rule
• remember that Dempster’s rule dually replaces probability with plausibility in Bayes’ rule
Computation
Pl⊕ (A|B) =
Pl(A∩B) Pl(B)
↔
BelG (A|B) =
Bel(A∩B) Bel(B)
Propagation Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
63 / 229
Conditioning Belief functions for the working scientist
Unnormalised vs geometric conditioning
Smets’ unnormalised rule of conditioning
F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics
• to rebuke Bayesian criticisms, in his TBM Smets rejects the existence of a probability measure on a parent space C
• Smet’s (Dempster’s) unnormalized conditional belief function:
Inference Conditioning
mU (.|B) =
Dempster’s conditioning
X m(A ∪ X )
if A ⊆ B
X ⊆B c
0
elsewhere
Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation
• (in the TBM BFs which assign mass to ∅ can exist, under the “open world” assumption)
• in terms of plausibilities: PlU (A|B) = Pl(A ∩ B) - in the TBM the mass m(A) is transferred by conditioning on B to A ∩ B
• it is a consequence of belief revision principles [Gilboa, Perea]
Propagation Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
64 / 229
Conditioning
Conditional events as equivalence classes
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Propagation Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
65 / 229
Conditioning Belief functions for the working scientist
Conditional events as equivalence classes
Spies’ sets of equivalent events under multi-valued mappings
F. Cuzzolin and T. Denoeux
Conditioning
Rationale Basic notions Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes
• intriguing approach to conditioning, in the random set interpretation with (C, F, P) and Γ : C → 2Ω
• null sets for P(.|A): N (P(.|A)) = {B ∈ A : P(B|A) = 0} ¯ ∪ (A ¯ ∩ B) • let 4 be the symmetric difference A4B = (A ∩ B) • two events have the same conditional probability if they both are the symmetric difference between a same event and some null set
• a conditional event [B|A] with A, B ⊆ Ω is a set of events with the same conditional probability P(B|A):
Generalised Bayes theorem
[B|A] = B4N (PA )
The total belief theorem
Computation
¯ ∪ B} • you can prove that [B|A] = {C : A ∩ B ⊆ C ⊆ A
Propagation Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
66 / 229
Conditioning Belief functions for the working scientist F. Cuzzolin and T. Denoeux
Conditional events as equivalence classes
Conditional belief functions Spies’ approach
• by applying to conditional events a multivalued mapping Spies gave a new definition of conditional belief function
Rationale Basic notions Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
• conditional multivalued mapping for B ⊆ Ω: ΓB (c) = [Γ(c)|B], where Γ : C → 2Ω
• (if A = Γ(c), ΓB maps c to [A|B]) • consequence: to all elements of each conditioning event (an equivalence class) must be assigned equal belief/plausibility
• a conditional belief function is then a “second-order” BF with values on collections of focal elements (the conditional events)
Conditional events as equivalence classes
Bel([C|B]) = P({c : ΓB (c) = [C|B]}) =
Generalised Bayes theorem
1 K
X
m(A)
A∈[C|B]
The total belief theorem
Computation Propagation Decisions
• it is not a BF on the sub-algebra {Y = C ∩ B, C ⊆ Ω} • Spies’ conditional belief functions are closed under Dempster’s rule of combination
BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
67 / 229
Conditioning Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Jeffrey’s rule of conditioning or Total Probability Theorem
• suppose P is defined on a σ-algebra A • there is a new prob measure P 0 on a sub-algebra B of A, and the updated probability P 00 has to:
1 meet the prob values specified by P 0 for events in B 2 be such that ∀ B ∈ B, X , Y ⊂ B, X , Y ∈ A ( P(X ) P 00 (X ) if P(Y ) > 0 P(Y ) = 00 P (Y ) 0 if P(Y ) = 0
• there is a unique solution:
Conditional events as equivalence classes
P 00 (A) =
Generalised Bayes theorem The total belief theorem
Computation Propagation Decisions
Conditional events as equivalence classes
X
P(A|B)P 0 (B)
B∈B
• meaning: the initial probability stands corrected by the second one on a number of events
• generalises conditioning (obtained when P 0 (B) = 1 for some B)
BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
68 / 229
Conditioning Belief functions for the working scientist F. Cuzzolin and T. Denoeux
Conditional events as equivalence classes
Jeffrey’s rule generalised Jeffrey’s rule for belief functions
Rationale
1 Let Π = {B1 , ..., Bn } a disjoint partition of Ω;
Basic notions
2 m1 , ..., mn the BPAs of BFs conditional on B1 , ..., Bn respectively;
Semantics Inference
3 mB an unconditional belief function on the coarsening associated
with the partition Π
Conditioning Dempster’s conditioning
Then the belief function Beltot (A) =
Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation
X
mB ⊕ ⊕ni mBi (C) is a marginal
C⊆A
belief function on Ω, and if all BFs are probabilities is reduces to the result of Jeffrey’s rule of total probability
• combining the a-priori with all the conditionals we get a marginal • is this the only solution? we will discuss this later (total belief theorem)
Propagation Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
69 / 229
Conditioning
Conditional events as equivalence classes
Belief functions for the working scientist
Conditioning approaches
F. Cuzzolin and T. Denoeux
• various approaches to conditioning have been proposed:
A summary • Dempster’s (normalised) conditioning:
Rationale
Bel⊕ (A|B) =
Basic notions Semantics
Dempster’s conditioning
Bel(A|B) =
Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation Propagation Decisions
Pl(A∩B) Pl(B)
Bel(A∩B) , ¯ Bel(A∩B+Pl(A∩B)
Pl(A|B) =
Pl(A∩B) ¯ Pl(A∩B)+Bel(A∩B)
• geometric conditioning:
Lower conditional envelopes Unnormalised vs geometric conditioning
Pl⊕ (A|B) =
• lower and upper conditional envelopes:
Inference Conditioning
¯ Bel(A∪B) ¯ , 1−Bel(B)
BelG (A|B) =
Bel(A ∩ B) , Bel(B)
• Smets’ unnnormalised rule: PlU (A|B) = Pl(A ∩ B) X • Spies’ conditioning: Bel[A|B] ∝ m(X ) ¯ X :A∩B⊆X ⊆A∪B
• derived from different revision principles • follow different semantic interpretations (TBM rather than random set, open versus closed world assumption, robust Bayesian)
BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
70 / 229
Conditioning
Generalised Bayes theorem
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Propagation Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
71 / 229
Conditioning
Generalised Bayes theorem
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference
Bayes’ theorem generalised to belief functions
• have a conditional probability P(x|θi ) over observations x ∈ X , and an a-priori probability P0 over a set of hidden variables θi ∈ Θ
• (for instance, x is a symptom and θi a disease) • after observing x, the probability distribution on Θ is updated to the posterior via Bayes’s theorem:
Conditioning
P(x|θi )P0 (θi ) P(θi |x) = P j P(x|θj )P0 (θj )
Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation Propagation
∀θj ∈ Θ
• the GBT is a generalisation of Bayes’ theorem for conditional BFs, when the a-priori BF on Θ is vacuous
• Dempster’s normalised/unnormalised conditioning is assumed • (a further generalisation for non-vacuous priors is proposed in Smets’ work) [Smets 1993]
Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
72 / 229
Conditioning Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale
Cognitive independence • consider a belief function over the product space X × Y • the two variables are cognitively independent if
Basic notions
plX ×Y (x ∩ y ) = plX (x)plY (y ) ∀x ⊆ X , y ⊆ Y
Semantics Inference Conditioning Dempster’s conditioning
• cognitive independence extends stochastic independence • conditional cognitive independence reads as
Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes
Generalised Bayes theorem
plX ×Y (x ∩ y |θi ) = plX (x|θi )plY (y |θi ) ∀x, y , θi and implies that the ratio of plausibility/belief on X does not depend on Y :
Generalised Bayes theorem The total belief theorem
Computation
plX (x1 |y ) plX (x1 ) = , plX (x2 |y ) plX (x2 )
BelX (x1 |y ) BelX (x1 ) = BelX (x2 |y ) BelX (x2 )
Propagation Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
73 / 229
Conditioning Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale
Generalised Bayes theorem
Likelihood principle Edwards, 1929
• the likelihood of an hypothesis given the data amounts to the conditional probability of the data given the hypothesis: l(θi |x) = p(x|θi )
Basic notions Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
and, for unions of singleton hypotheses: n o l(θ = {θ1 , ..., θk }|x) = max l(θi |x) : θi ∈ θ
• Shafer’s somewhat similar proposal for statistical inference (see inference-likelihood method):
Conditional events as equivalence classes
pl(θ|x) = max pl(θi |x)0
Generalised Bayes theorem The total belief theorem
Computation Propagation
θi ∈θ
was rejected by Smets, for not satisfying the condition that, if two pieces of evidence are conditionally independent, BelΘ (.|x, y ) is the conjunctive combination of BelΘ (.|x) and BelΘ (.|y )
Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
74 / 229
Conditioning Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics
Generalised Bayes theorem
Generalised Likelihood Principle Smets’ Generalised Likelihood Principle (GLP) 1 plΘ (θ|x) = plX (x|θ) 2 For all x, θ the plausibility of data pl(x|θ) given a compound
hypothesis θ = {θ1 , ..., θm } is a function of only {pl(x|θi ), pl(x¯ |θi ) : θi ∈ θ}
Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation Propagation
• the form of the function is not assumed (not necessarily the max) • both pl(x|θi ) and pl(x¯ |θi ) are necessary because of the non-addivitivity of belief functions
• justified by the following requirements: • pl(x|θ) remains the same on the coarsening of X formed by just x and x¯ • plausibilities for θj 6∈ θ are irrelevant for pl(x|θ)
Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
75 / 229
Conditioning Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics
Generalised Bayes theorem
Generalised Bayesian Theorem and the disjunctive rule of combination
• under conditional cognitive independence and the Generalised Likelihood Principle (2), BelX (.|θ), θ ⊂ Θ is generated from the {BelX (.|θi ), θi ∈ Θ} by disjunctive rule of combination Y Y PlX (x|θ) = 1 − (1 − PlX (x|θi )), BelX (x|θ) = BelX (x|θi )
Inference
θi ∈θ
θi ∈θ
Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
• then, condition (1) of the GLP plΘ (θ|x) = plX (x|θ) implies the generalised Bayes theorem: Y 1 PlΘ (θ|x) = 1− (1 − plX (x|θi )) K θi ∈θ Y 1Y BelΘ (θ|x) = BelX (x¯ |θi ) − BelX (x¯ |θi ) K ¯ θi ∈ θ
θi ∈Θ
Computation Propagation
where K = 1 −
Q
θi ∈Θ (1
− plX (x|θi ))
Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
76 / 229
Conditioning
The total belief theorem
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Propagation Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
77 / 229
Conditioning Belief functions for the working scientist
The total belief theorem Generalising total probability to belief functions
F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem
The total belief theorem
Theorem Suppose Θ and Ω are two frames of discernment, and ρ : 2Ω → 2Θ the unique refining between them. Let Bel0 be a belief function defined over Ω = {ω1 , ..., ω|Ω| }. Suppose there exists a collection of belief functions Beli : 2Πi → [0, 1], where Π = {Π1 , ..., Π|Ω| }, Πi = ρ({ωi }), is the partition of Θ induced by its coarsening Ω. Then, there exists a belief function Bel : 2Θ → [0, 1] such that: 1 Bel0 is the restriction of Bel to Ω 2 Bel ⊕ BelΠi = Beli ∀i = 1, ..., |Ω|, where BelΠi is the logical belief
function with mass
The total belief theorem
mΠi (A) = 1 A = Πi , 0 otherwise
Computation Propagation Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
78 / 229
Conditioning Belief functions for the working scientist
The total belief theorem
The total belief theorem Visual representation
F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation Propagation
• pictorial representation of the total belief theorem
Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
79 / 229
Conditioning Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions
The total belief theorem
Structure of the focal elements of the total belief function
• restricted total belief theorem: Bel0 has only disjoint FEs • pictorial representation of the structure of the FEs of a total BF Bel lying in the image of a focal element of Bel0 of cardinality 3
Semantics Inference Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation Propagation Decisions BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
80 / 229
Conditioning Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference
The total belief theorem
Graph of solutions • potential solutions correspond to square linear systems, and form a graph whose nodes are linked by linear transformations of columns X X e 7→ e0 = −e + ei − ej i∈C
j∈S
where C is a covering set for e (i.e., every component of e is covered by at least one of them), S a set of selection columns
• at each transformation, the most negative component decreases
Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
Computation Propagation Decisions
• general solution based on simplex-like optimisation?
BFs on reals F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
81 / 229
Computation Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Probability transformation Possibility transformation Monte-Carlo methods
Propagation Decisions BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Sister theories
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
82 / 229
Computation Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Probability transformation Possibility transformation Monte-Carlo methods
Propagation Decisions BFs on reals Sister theories Advances
Efficient computation with belief functions
• as belief functions are set functions, their complexity is exponential on the size n of the domain they are defined on
• combinining belief functions via Dempster’s rule is also exponential: (2n )N , where N is the number of BFs involved
• efficient approaches based on approximating the original evidence in particular, approaches that transform a belief function into a less complex uncertainty measure • probability (Bayesian) transformation • possibility (consonant) transformation
• approaches based on the local propagation of evidence • • • •
Barnett’s approximation hierarchical evidence Cano’s propagation on DAGs Shafer-Shenoy architecture
• Monte-Carlo methods [Wilson and Moral]
Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
83 / 229
Computation
Probability transformation
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Probability transformation Possibility transformation Monte-Carlo methods
Propagation Decisions BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Sister theories
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
84 / 229
Computation Belief functions for the working scientist F. Cuzzolin and T. Denoeux
Probability transformation
Probability transformations Mapping belief functions to probabilities
• probability transform of belief functions: an operator pt : B → P, b 7→ pt[b] mapping belief measures onto probability distributions
Rationale Basic notions Semantics Inference Conditioning Computation Probability transformation
• (not necessarily an element of the corresponding credal set) • a number of transforms proposed, either as efficient implementations of ToE or tools for decision making • pignistic transform, central in the TBM [Smets] • plausibility and belief transform [Voorbraak, Cobb & Shenoy] • orthogonal projection and intersection probability [Cuzzolin]
Possibility transformation Monte-Carlo methods
Propagation Decisions BFs on reals Sister theories Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
85 / 229
Computation Belief functions for the working scientist
Transformation in the TBM: the pignistic transform
F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference
Probability transformation
• Smets’ Transferable Belief Model -> decisions made via “pignistic transform” ..
• .. resulting in a pignistic probability:
Conditioning
BetP[b](x) =
Computation
Monte-Carlo methods
Propagation Decisions BFs on reals Sister theories
X mb (A) , |A|
A⊇{x}
Probability transformation Possibility transformation
Probability transformation
• its purpose is to allow decision making at the level of probabilities, typically in an expected utility framework
• the result of a redistribution process in which the mass of each focal element A is re-assigned to all its elements x ∈ A on an equal basis
• it commutes with affine combination and is the center of mass of the credal set of consistent probabilities
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
86 / 229
Computation Belief functions for the working scientist F. Cuzzolin and T. Denoeux
Probability transformation
Plausibility and belief transform Probability transformation
• plausibility transform [Voorbraak 89]: maps a belief function to the relative belief of singletons:
Rationale
pl(x) ˜ pl(x) = P y ∈Θ pl(y )
Basic notions Semantics Inference Conditioning Computation Probability transformation Possibility transformation Monte-Carlo methods
˜ is a perfect representative of Bel • relative plausibility of singletons pl when combined with other probabilities by Dempster’s rule ⊕
• meets a number of properties w.r.t. ⊕ [Coob&Shenoy 03] • the relative belief transform maps each belief function to the corresponding relative belief of singletons:
Propagation
Bel({x}) ˜ bel(x) = P y ∈Θ Bel({y })
Decisions BFs on reals Sister theories Advances
• first proposed by Daniel in 2006, its geometry and that of plausibility transform analyzed in [Cuzzolin 2010 AMAI]
Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
87 / 229
Computation Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale
Probability transformation
Geometric transformations Probability transformation
• the probability transformation problem can be posed in geometric terms [IEEE SMC-B07]
Basic notions Semantics Inference Conditioning Computation Probability transformation Possibility transformation Monte-Carlo methods
Propagation Decisions BFs on reals
• orthogonal projection π[b] -> minimises the L2 distance from P
• intersection probability p[b] -> intersection with P of the belief-plausibility line
Sister theories Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
88 / 229
Computation
Possibility transformation
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Probability transformation Possibility transformation Monte-Carlo methods
Propagation Decisions BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Sister theories
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
89 / 229
Computation Belief functions for the working scientist
Possibility transformation
Possibility transformations Outer approximations
F. Cuzzolin and T. Denoeux
• necessity measures have as counterparts in the ToE consonant Rationale Basic notions Semantics Inference
belief functions, whose focal elements are nested: A1 ⊂ · · · ⊂ Am , Ai ⊆ Θ
• outer consonant approximations [Dubois&Prade 90]: consonant BFs co which are dominated by the original BF on all events:
Conditioning
co(A) ≤ Bel(A)
Computation Probability transformation Possibility transformation Monte-Carlo methods
• for each possible maximal chain A1 ⊂ · · · ⊂ An , |Ai | = i of focal elements the maximal outer consonant approximation has mass mmax (Ai ) = Bel(Ai ) − Bel(Ai−1 )
Propagation Decisions BFs on reals Sister theories
∀A ⊆ Θ
• mirrors the behavior of the vertices of the credal set of probabilities dominating a belief function [Chateauneuf, Miranda& Grabish]
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
90 / 229
Computation Belief functions for the working scientist
Possibility transformation
Possibility transformation Isopignistic approximation
F. Cuzzolin and T. Denoeux
• completely different approximation in Smets’ Transferable Belief Rationale Basic notions Semantics Inference Conditioning
Model [Smets94,05]
• isopignistic" approximation: the unique consonant belief function whose pignistic probability BetP is identical to that of Bel [Dubois, Aregui]
• its contour function is:
Computation Probability transformation
pliso (x) =
Possibility transformation Monte-Carlo methods
Propagation
Sister theories
n o min BetP(x), BetP(x 0 )
x 0 ∈Θ
• mass assignment: miso (Ai ) = i · (BetP(xi ) − BetP(xi+1 ))
Decisions BFs on reals
X
where {xi } = Ai \ Ai−1
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
91 / 229
Computation
Monte-Carlo methods
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Probability transformation Possibility transformation Monte-Carlo methods
Propagation Decisions BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Sister theories
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
92 / 229
Computation Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Probability transformation Possibility transformation Monte-Carlo methods
Propagation Decisions BFs on reals Sister theories Advances
Monte-Carlo methods
A simple Monte-Carlo approach to Dempster’s combination - Wilson, 1989
• we seek Bel = Bel1 ⊕ ... ⊕ Belm on Ω, where the evidence is induced by probability distributions Pi on Ci via Γi : Ci → 2Ω
• Monte-Carlo algorithm simulates the random set interpretation of belief functions: Bel(A) = P(Γ(c) ⊆ A|Γ(c) 6= ∅) for a large number of trials n = 1 : N do randomly pick c ∈ C such that Γ(c) 6= ∅ for i = 1 : m do randomly pick an element ci of Ci with probability Pi (ci ) end for let c = (c1 , ..., cm ) if Γ(c) = ∅ then restart trial end if if Γ(c) ⊆ A then trial succeeds, T = 1 end if end for
Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
93 / 229
Computation Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation
A Monte-Carlo approach Wilson, 1989
• the proportion of trials which succeed converges to Bel(A): E[T¯ ] = Bel(A), Var [T¯ ] ≤
constant
• testing if xj ∈ Γ(c) takes less then Bm, constant B • expected time of the algorithm is N m · (A + B|Ω|) 1−κ
Possibility transformation
Propagation Decisions BFs on reals Sister theories Advances
1 4N
• we say algorithms has accuracy k if 3σ[T¯ ] ≤ k • picking c ∈ C involves m random numers so it takes A · m, A
Probability transformation
Monte-Carlo methods
Monte-Carlo methods
where κ is Shafer’s conflict measure
• expected time to achieve accuracy k is then
9 m 4(1−κ)κ2
· (A + C|Ω|)
for constant C, better for simple support functions
• conclusion: unless κ is close to 1 (highly conflicting evidence) Dempster’s combination is feasible for large values of m (number of BFs to combine) and large Ω (hypothesis space)
Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
94 / 229
Computation Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics
Monte-Carlo methods
Markov-Chain Monte-Carlo Wilson and Moral, 1996
• trials are not independent but form a Markov chain • non-deterministic OPERATIONi : changes at most the i-th coordinate c 0 (i) of c 0 to y , with chance Pi (y ) Pr (OPERATIONi (c 0 ) = c) ∝ Pi (c(i)) if c(i) = c 0 (i), 0 otherwise
Inference Conditioning Computation
• MCMC algorithm which returns a value BELN (c0 ) which is the proportion of time in which Γ(cc ) ⊆ X
Probability transformation Possibility transformation Monte-Carlo methods
Propagation Decisions BFs on reals Sister theories Advances Toolbox F. Cuzzolin and T. Denoeux
cc = c0 S=0 for n = 1 : N do for i = 1 : m do cc = OPERATIONi (cc ) if Γ(cc ) ⊆ X then S =S+1 end if end for end for S return Nm Belief functions for the working scientist
UAI 2015
95 / 229
Computation Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions
Monte-Carlo methods
Importance sampling Wilson and Moral, 1996
Theorem If C is connected (i.e., any c, c 0 are linked by a chain of OPERATIONi ) then given , δ there exist K 0 , N 0 s.t. for all K ≥ K 0 and N ≥ N 0 and c0 : Pr (|BELNK (c0 )| < ) ≥ 1 − δ
Semantics Inference Conditioning Computation Probability transformation Possibility transformation Monte-Carlo methods
Propagation Decisions BFs on reals Sister theories
• further step: importance sampling -> pick samples c 1 , ..., c N according to an “easy to handle” probability distribution P ∗
• assign to each sample a weight wi =
P(c) P ∗ (c)
• if P(c) > 0 implies P ∗ (c) > 0 then the average
P
Γ(c i )⊆X
N
wi
is an
unbiased estimator of Bel(X )
• try to use P ∗ as close as possible to the real one P • strategies are proposed to compute P(C) = c P(c)
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
96 / 229
Propagation Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
Decisions BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Sister theories Advances Toolbox
F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
97 / 229
Propagation Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
Decisions BFs on reals Sister theories Advances Toolbox
Graphical models for belief functions • to tackle complexity, a number of local computation schemes have been proposed • • • •
Barnett’s computational scheme Gordon and Shortliffe’s diagnostic trees Shafer and Logan’s hierarchical evidence Shafer-Shenoy architecture
• later on, these developed into graphical models for reasoning with conditional belief functions: • Cano et al - propagating uncertainty in directed acyclic networks • Xu and Smets - Evidential networks with conditional belief functions • Shenoy - graphical representation of valuation-based systems (VBS), called valuation networks • Ben Yaghlane and Mellouli - Directed Evidential Networks
F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
98 / 229
Propagation
Barnett’s method
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
Decisions BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Sister theories Advances Toolbox
F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
99 / 229
Propagation Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning
Barnett’s method
Barnett’s scheme, 1981 • computations linear in the size of Ω if all BFs to combine are simple support focused on singletons or their complements
• simple support function -> as focal elements only A or Ω • assume we have a Belω with as FEs only {ω, ω ¯ , Ω} for all ω, and we want to combine them
• uses the fact that the plausibility of the combined BF is a function of P their input BFs’ commonalities Q(A) =
Computation
Pl(A) =
Propagation
X
(−1)|B|+1
B⊆A,B6=∅
Barnett’s method
B⊇A
Y
m(B):
Qω (B)
ω∈Ω
Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
Decisions BFs on reals Sister theories Advances Toolbox
• we get that Pl(A) = K
1+
X ω∈A
Y Belω (¯ Belω (ω) ω) − 1 − Belω (ω) 1 − Belω (ω)
!
ω∈A
• the computation of a specific plausibility value Pl(A) is linear in the size of Ω (only elements of A and not subsets are involved)
• however, the number of events A themselves is still exponential
F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
100 / 229
Propagation
Diagnostic trees
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
Decisions BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Sister theories Advances Toolbox
F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
101 / 229
Propagation Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics
Diagnostic trees
Gordon and Shortliffe’s scheme based on diagnostic trees
• they are interested in computing degrees of belief only for events forming a hierarchy (diagnostic tree)
• (in some applications certain events are not relevant, e.g. classes of diseases)
Inference Conditioning Computation Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
Decisions BFs on reals
• combine simple support functions focused on or against the nodes
Sister theories
• produces good approximations, unless evidence is highly conflicting
Advances Toolbox
F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
102 / 229
Propagation Belief functions for the working scientist
Gordon and Shortliffe’s scheme based on diagnostic trees
F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics
• however, intersection of complements produces FEs not in the tree • approximated algorithm: 1 first we combine all simple functions focussing on the node
events (by Dempster’s rule)
Inference Conditioning
2 then, we successively (working down the tree) combine those
focused on the complements of the nodes
Computation Propagation
3 tricky bit: when we do that, we replace each intersection of
FEs with the smallest node in the tree that contains it
Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
Decisions BFs on reals
Diagnostic trees
• results depends on the order of the combination in phase 2 • again approximation can be poor, also no degrees of belief are assigned to complements of nodes
• therefore, we cannot compute their plausibilities!
Sister theories Advances Toolbox
F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
103 / 229
Propagation
Shafer-Shenoy architecture
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
Decisions BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Sister theories Advances Toolbox
F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
104 / 229
Propagation Belief functions for the working scientist F. Cuzzolin and T. Denoeux
Shafer-Shenoy architecture
Shafer-Shenoy architecture Qualitative Conditional Independence
• uses qualitative Markov trees, which generalise both diagnostic trees and causal trees (Pearl). Extend Pearl’s idea to BFs
Rationale Basic notions
• partitions Ψ1 , ..., Ψn of a frame are qualitatively conditionally independent (QCI) given the partition Ψ if
Semantics
P ∩ P1 ∩ ... ∩ Pn 6= ∅
Inference Conditioning Computation Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
Decisions BFs on reals
whenever P ∈ Ψ, Pi ∈ Ψi and P ∩ Pi 6= ∅ for all i
• example: {θ1 } × {θ2 } × Θ3 and Θ1 × {θ2 } × {θ3 } are QCI on Θ1 × Θ2 × Θ3 given Θ1 × {θ2 } × Θ3 for all θi ∈ Θi
• does not involve probability, but only logical independence • stochastic conditional independence does imply the above • if two BFs Bel1 and Bel2 are carried by partitions Ψ1 , Ψ2 which are QCI given Ψ then
Sister theories Advances Toolbox
F. Cuzzolin and T. Denoeux
(Bel1 ⊕ Bel2 )Ψ = (Bel1 )Ψ ⊕ (Bel2 )Ψ Belief functions for the working scientist
UAI 2015
105 / 229
Propagation Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale
Shafer-Shenoy architecture
Qualitative Markov trees Shafer-Shenoy architecture
• given a tree, deleting a node and all incident edges yields a forest denote the collection of nodes of the j-th subtree by αm (j)
Basic notions Semantics Inference Conditioning Computation Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
Decisions BFs on reals Sister theories
• a qualitative Markov tree is a tree of partitions s.t. for every node i the minimal refinements of partitions in αm (j) for j = 1, ..., k are QCI given Ψi
• a Bayesian causal tree becomes a qualitative Markov tree whenever we associate each node B with the partition ΨB associated with the random variable vB
• a QMT remains such if we insert between parent and child their common refinement
• can also be constructed from a diagnostic tree (left), same interpolation property holds
Advances Toolbox
F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
106 / 229
Propagation Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference
Shafer-Shenoy architecture
Propagating belief functions on qualitative Markov trees
• each BF to combine has to be carried by a partition in the tree • idea: replace Dempster’s combination over the whole frame with multiple implementations over partitions
• a processor located at each node Ψi combines BFs using Ψi as a frame and projects BFs to its neighbours
Conditioning Computation Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
Decisions BFs on reals Sister theories
1 send Beli to its neighbours 2 whenever it gets a new input, computes
(Bel T )Ψi ← (⊕{(Belx )Ψi : x ∈ N(i)} ⊕ Beli )Ψi 3 computes Beli,y ← (⊕{(Belx )Ψi : x ∈ N(i) \ {y }} ⊕ Beli )Ψy and sends it to its neighbour y , for each neighbour
• inputting new BFs in the tree can take place asynchronously • final result of each processor: coarsening to that partition of the combination of all inputted BFs: (⊕j∈J Belj )Ψi
Advances Toolbox
F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
107 / 229
Propagation Belief functions for the working scientist
Shafer-Shenoy architecture
Graphical representation Shafer-Shenoy architecture
F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
Decisions BFs on reals Sister theories
• total time to reach equilibrium is proportional to the tree’s diameter
Advances Toolbox
F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
108 / 229
Propagation
Directed Evidential Networks
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
Decisions BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Sister theories Advances Toolbox
F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
109 / 229
Propagation Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
Decisions BFs on reals Sister theories Advances Toolbox
Directed Evidential Networks
Directed evidential networks Ben Yaghlane and Mellouli, 2008
• Evidential networks with conditional belief functions (ENC) were originally proposed by Xu and Smets for the propagation of beliefs • (Dempster’s conditioning is adopted)
• ENCs contain a directed acyclic graph with conditional beliefs defined in a different manner from conditional probabilities in Bayesian networks (BNs) • edges represent the existence of a conditional BF (no form of independence assumed)
• initially defined only for binary (conditional) relationships
• Ben Yaghlane and Mellouli generalised ENCs to any number of nodes - directed evidential network (DEVN) • a directed acyclic graph (DAG) in which directed arcs describe the conditional dependence relations expressed by conditional BFs for each node given its parents • new observations introduced in the network are represented by belief functions allocated to some nodes
F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
110 / 229
Propagation Belief functions for the working scientist
Directed evidential networks Ben Yaghlane and Mellouli, 2008
F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
Decisions BFs on reals
Directed Evidential Networks
• problem: given n BFs Bel1 , ..., Beln over X1 , ..., Xn we seek the marginal on Xi of their joint belief function
• uses the generalised Bayesian theorem (GBT) to compute the posterior Bel(x|y ) given the conditional Bel(y |x)
• the marginal is computed for each node by combining all the messages received from its neighbors and its own prior belief: P Bel X = Bel0X ⊕ BelY →X , BelY →X (x) = y ⊆ΘY m0 (y )Bel(x|y ) where Bel(x|y ) is given by GBT
• another application of the message-passing idea to belief functions • propose a simplified scheme for simply directed networks • extension to DEVNs by first transforming them to binary join trees
Sister theories Advances Toolbox
F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
111 / 229
Decisions Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Decisions Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Sister theories
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
112 / 229
Decisions Belief functions for the working scientist
Decision making with belief functions
F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Decisions Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
BFs on reals Sister theories
An overview
• natural application of belief function representation of uncertainty • problem: selecting an act f from an available list F (making a “’decision’), which optimises a certain objective function
• various approaches to decision making • decision making in the TBM is based on expected utility via pignistic transform • Strat has proposed something similar in his “cloaked carnival wheel” scenario • generalised expected utility [Gilboa] based on classical expected utility theory [Savage,von Neumann]
• a lot of interest in multicriteria decision making (based on a number of attributes)
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
113 / 229
Decisions Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Decisions
Expected utility approach Decision making under uncertainty
• a decision problem can be formalized by defining: • a set Ω of states of the world; • a set X of consequences; • a set F of acts, where an act is a function f : Ω → X
• let < be a preference relation on F, such that f < g means that f is at least as desirable as g
• Savage (1954) has showed that < verifies some rationality requirements iff there exists a probability measure P on Ω and a utility function u : X → R s.t.
Decision making in the TBM
∀f , g ∈ F,
Strat’s decision apparatus Upper and lower expected utilities
BFs on reals Sister theories Advances
f < g ⇔ EP (u ◦ f ) ≥ EP (u ◦ g)
where EP denotes the expectation w.r.t. P
• P and u are unique up to a positive affine transformation • does that mean that basing decisions on belief functions is irrational?
Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
114 / 229
Decisions
Decision making in the TBM
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Decisions Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Sister theories
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
115 / 229
Decisions Belief functions for the working scientist
Decision making in the TBM Expected utility using the pignistic probability
F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Decisions
Decision making in the TBM
• in the TBM, decision making is done by maximising the expected utility of actions based on the pignistic transform
• (as opposed to computing upper and lower expected utilities directly from (Bel, Pl) via Choquet integral, as we will see later)
• the set of possible actions F and the set Ω of possible outcomes are distinct, and the utility function is defined on F × Ω
• Smets proves the necessity of the pignistic transform by maximizing
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
E[u] =
X
u(f , ω)Pign(ω)
ω∈Ω
BFs on reals Sister theories Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
116 / 229
Decisions
Strat’s decision apparatus
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Decisions Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Sister theories
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
117 / 229
Decisions Belief functions for the working scientist
Strat’s decision apparatus
Strat’s decision apparatus [UAI 1990]
F. Cuzzolin and T. Denoeux
• Strat’s decision apparatus is based on computing intervals of expected values
Rationale Basic notions
• assumes that the decision frame Ω
Semantics
is itself a set of scalar values (e.g. dollar values, see left) - does not distinguish between utilities and elements of Ω (returns)
Inference Conditioning Computation Propagation Decisions Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
• .. so that an expected value interval can be computed: E(Ω) = [E∗ (Ω), E ∗ (Ω)], where . X . X E∗ (Ω) = inf(A)m(A), E ∗ (Ω) = sup(A)m(A) A⊆Ω
A⊆Ω
BFs on reals Sister theories Advances
• not good enough to make a decision, e.g.: should we pay a 6$ ticket when the expected interval is [5$, 8$]?
Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
118 / 229
Decisions Belief functions for the working scientist
Strat’s decision apparatus A probability of favourable outcome
F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation
Strat’s decision apparatus
• Strat identifies ρ as the probability that the value assigned to the hidden sector is the one the player would choose
• 1 − ρ is the probability that the sector is chosen by the carnival hawker
Theorem The expected value of the mass function of the wheel is E(Ω) = E∗ (Ω) + ρ(E ∗ (Ω) − E∗ (Ω))
Decisions Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
BFs on reals
• to decide whether to play the game we only need to assess ρ • basically, this amounts to a specific probability transform (like the pignistic one)
• Lesh, 1986 had also proposed a similar approach
Sister theories Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
119 / 229
Decisions
Upper and lower expected utilities
Belief functions for the working scientist F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation Propagation Decisions Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
BFs on reals
Outline Conditional events as equivalence classes Generalised Bayes theorem The total belief theorem
1 Rationale 2 Basic notions Belief functions Dempster’s rule Families of frames
6 Computation
3 Semantics Lower probabilities Credal sets Set functions Generalised probabilities Random sets
4 Inference
Probability transformation Possibility transformation Monte-Carlo methods
7 Propagation Barnett’s method Diagnostic trees Shafer-Shenoy architecture Directed Evidential Networks
8 Decisions
Dempster’s approach Likelihood-based From preferences
5 Conditioning Dempster’s conditioning Lower conditional envelopes Unnormalised vs geometric conditioning
Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
9 BFs on reals Allocations of probabilitity Random sets
Sister theories
Random closed intervals
10 Sister theories Imprecise probability Fuzzy sets and possibility p-Boxes
11 Advances Matrix representation Geometry Distances Combinatorics Uncertainty measures
12 Toolbox Classification Ranking aggregation
13 Applications A brief survey Climate change Pose estimation
14 Future trends
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
120 / 229
Decisions Belief functions for the working scientist F. Cuzzolin and T. Denoeux
Upper and lower expected utilities
Savage’s axioms • Savage has proposed 7 axioms, 4 of which are considered as meaningful (the others are rather technical)
Rationale
Semantics
• let us examine the first two axioms: • Axiom 1: < is a total preorder (complete, reflexive and transitive)
Inference
• Axiom 2 [Sure Thing Principle]. Given f , h ∈ F and E ⊆ Ω, let fEh
Basic notions
Conditioning
denote the act defined by
Computation
( f (ω) if ω ∈ E (fEh)(ω) = h(ω) if ω 6∈ E
Propagation Decisions Decision making in the TBM Strat’s decision apparatus
• then the Sure Thing Principle states that ∀E, ∀f , g, h, h0 , fEh < gEh ⇒ fEh0 < gEh0
Upper and lower expected utilities
BFs on reals Sister theories
• this axiom seems reasonable, but it is not verified empirically!
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
121 / 229
Decisions Belief functions for the working scientist F. Cuzzolin and T. Denoeux
Ellsberg’s paradox • suppose you have an urn containing 30 red balls and 60 balls, either black
Rationale Basic notions Semantics Inference Conditioning
Upper and lower expected utilities
• • •
Computation
or yellow. Consider the following gambles: • f1 : you receive 100 euros if you draw a red ball • f2 : you receive 100 euros if you draw a black ball • f3 : you receive 100 euros if you draw a red or yellow ball • f4 : you receive 100 euros if you draw a black or yellow ball in this example Ω = {R, B, Y }, fi : Ω → R and X = R the four acts are the mappings in the left table empirically it is observed that most people strictly prefer f1 to f2 , but they strictly prefer f4 to f3
Propagation Decisions Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
BFs on reals Sister theories Advances
f1 f2 f3 f4
R 100 0 100 0
B 0 100 0 100
Y 0 0 100 100
Now, pick E = {R, B}: by definition f1 {R, B}0 = f1 , f1 {R, B}100 = f3 ,
f2 {R, B}0 = f2 f2 {R, B}100 = f4
• since f1 < f2 , i.e. f1 {R, B}0 < f2 {R, B}0 the Sure Thing principle would imply f1 {R, B}100 < f2 {R, B}100, i.e., f3 < f4
• empirically the Sure Thing Principle is violated!
Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
122 / 229
Decisions Belief functions for the working scientist F. Cuzzolin and T. Denoeux
Upper and lower expected utilities
Gilboa’s theorem • Gilboa (1987) proposed a modification of Savage’s axioms with, in particular, a weaker form of Axiom 2
Rationale Basic notions Semantics
• a preference relation < meets these weaker requirements iff there exists a (non necessarily additive) measure µ and a utility function u : X → R such that
Inference
∀f , g ∈ F ,
Conditioning Computation Propagation Decisions Decision making in the TBM Strat’s decision apparatus Upper and lower expected utilities
BFs on reals Sister theories
f < g ⇔ Cµ (u ◦ f ) ≥ Cµ (u ◦ g),
where Cµ is the Choquet integral, defined for X : Ω → R as Z +∞ Z 0 Cµ (X ) = µ(X (ω) ≥ t)dt + [µ(X (ω) ≥ t) − 1]dt. 0
−∞
• given a belief function Bel on Ω and a utility function u, this theorem supports making decisions based on the Choquet integral of u with respect to Bel or Pl
Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
123 / 229
Decisions Belief functions for the working scientist
Lower and upper expected utilities
F. Cuzzolin and T. Denoeux Rationale
Upper and lower expected utilities
• for finite Ω, it can be shown that
Basic notions
CBel (u ◦ f ) =
Semantics
X
m(B) min u(f (ω))
B⊆Ω
ω∈B
Inference
CPl (u ◦ f ) =
Conditioning
Decisions
m(B) max u(f (ω))
B⊆Ω
Computation Propagation
X
ω∈B
• let P(Bel) as usual be the set of probability measures P compatible with Bel, i.e., such that Bel ≤ P. Then, it can be shown that
Decision making in the TBM Strat’s decision apparatus
CBel (u ◦ f ) =
min EP (u ◦ f ) = E(u ◦ f )
P∈P(Bel)
Upper and lower expected utilities
BFs on reals
CPl (u ◦ f ) = max EP (u ◦ f ) = E(u ◦ f ) P∈P(Bel)
Sister theories Advances Toolbox F. Cuzzolin and T. Denoeux
Belief functions for the working scientist
UAI 2015
124 / 229
Decisions Belief functions for the working scientist
Upper and lower expected utilities
Decision making Strategies
F. Cuzzolin and T. Denoeux Rationale Basic notions Semantics Inference Conditioning Computation
• for each act f we have two expected utilities E(f ) and E(f ). How do we make a decision?
• possible decision criteria based on interval dominance: 1 2 3 4
Propagation
Decision making in the TBM
Sister theories
iff E(u ◦ f ) ≥ E(u ◦ g) (conservative strategy) iff E(u ◦ f ) ≥ E(u ◦ g) (pessimistic strategy) iff E(u ◦ f ) ≥ E(u ◦ g) (optimistic strategy) iff
for some α ∈ [0, 1] called a pessimism index (Hurwicz criterion)
Strat’s decision apparatus
BFs on reals