Bayesian Networks vs. Evidential Networks: An Application to Convoy Detection

Evangeline Pollard, Michèle Rombaut, and Benjamin Pannetier

ONERA - BP72 - 29 avenue de la Division Leclerc FR-92322 Châtillon Cedex, France
GIPSA-Lab 961 rue de la Houille Blanche - BP 46 - 38402 Grenoble Cedex, France
[email protected],
[email protected],
[email protected] http://www.onera.fr, http://www.gipsa-lab.inpg.fr/
Abstract. In this article, an evidential network is combined with a temporal credal filter in order to incorporate time information and to describe how information propagates from one node to another. We then describe an application to convoy detection and propose a complex simulated scenario. The results are compared with those of our previous approach based on Bayesian networks.

Keywords: Bayesian networks, Evidential networks, Convoy detection.
1 Introduction
Graphical models, first formalized by Pearl [1], are commonly used in many applications such as medical diagnosis, situation assessment [2] or biological applications [3]. Also called Bayesian networks, they merge graph theory with probability theory. They provide a powerful formalism for reasoning under uncertainty because they compute the variable trends of a system, which is intuitively represented by a directed acyclic graph. This graph is composed of nodes and edges. The nodes correspond to the set of random variables representing the system evolution, and the edges represent the dependencies between random variables, quantified by a set of conditional probability distributions. The main benefit is to limit the computational complexity by using the following fundamental factorization:

p(x_1, . . . , x_n) = ∏_{i=1}^{n} p(x_i | Pa(x_i))    (1)
where Pa(x_i) represents the set of parent nodes of node x_i. This kind of graphical model is especially effective when a very complete statistical description of the modeled system is available. If not, the use of priors can strongly influence the final results. Evidential networks, which are graphical models in an evidential context, are less well known, but have similar applications in system analysis [4] or threat assessment [5]. They were first formalized by Xu and Smets [6,7] by using a generalization of Bayes' theorem where all conditional probabilities are

E. Hüllermeier, R. Kruse, and F. Hoffmann (Eds.): IPMU 2010, Part I, CCIS 80, pp. 31-39, 2010. © Springer-Verlag Berlin Heidelberg 2010
replaced by conditional belief functions. They limit the use of priors and are consequently more flexible for modeling knowledge. On the same principle, evidential networks merge graph theory with evidential theory, also called Dempster-Shafer theory. The main idea in evidential theory is to consider a larger frame of discernment than in probability theory, called the power set. Indeed, instead of strictly considering probability distributions on Ω = {ω_1, . . . , ω_n}, where each ω_i represents a single hypothesis, beliefs are also computed on the subsets of Ω. In this paper, we adopt the Transferable Belief Model (TBM) representation, which proposes to manage uncertainty at two levels: the credal level, where beliefs are maintained, and the pignistic level, where beliefs are used to make decisions. The main difference between Dempster-Shafer theory and the TBM representation is that, with the latter, belief functions are unnormalized and the mass on the conflict m(∅) can be non-empty. In this case, it raises the question of the origin of this conflict (unreliable sources, missing hypothesis, open world. . . ) [8]. In this paper, we first review the Transferable Belief Model formalism. In the second part, we describe the theoretical implementation of evidential networks, before describing our specific application of convoy detection in the third part. Finally, we give some simulation results using a complex simulated scenario.
2 The Transferable Belief Model (TBM) Framework

2.1 Background
The main idea of the Transferable Belief Model is to attribute a belief distribution on a variable over a larger state space than with probability. The power set of Ω, denoted 2^Ω, is the set composed of the hypotheses and of their disjunctions, and is of size 2^|Ω|. The basic belief assignment m^Ω (bba) is defined such that:

m^Ω : 2^Ω → [0, 1], B ↦ m^Ω(B)    (2)

with

∑_{B∈2^Ω} m^Ω(B) = 1    (3)

The belief given to a hypothesis or to a disjunction of hypotheses can also be expressed through other elementary functions, called the plausibility function pl, the commonality function q and the implicability function b, where ∀A ∈ 2^Ω:

bel^Ω(A) = ∑_{B⊆A, B≠∅} m^Ω(B)    (4)

pl^Ω(A) = ∑_{A∩B≠∅} m^Ω(B)    (5)

b^Ω(A) = bel^Ω(A) + m^Ω(∅)    (6)

q^Ω(A) = ∑_{B⊇A} m^Ω(B)    (7)
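As a sketch of definitions (4)-(7), the elementary functions can be computed directly from a bba on a small binary frame. The frame {x, not_x} and all mass values below are illustrative assumptions, not values from the paper:

```python
# A bba on the binary frame {x, not_x}; subsets are frozensets and the
# mass values are purely illustrative.
omega = frozenset({"x", "not_x"})
m = {frozenset(): 0.0,
     frozenset({"x"}): 0.6,
     frozenset({"not_x"}): 0.1,
     omega: 0.3}

def bel(m, A):   # Eq. (4): mass of the non-empty subsets of A
    return sum(v for B, v in m.items() if B and B <= A)

def pl(m, A):    # Eq. (5): mass of the sets intersecting A
    return sum(v for B, v in m.items() if B & A)

def b(m, A):     # Eq. (6): implicability, bel(A) plus the mass on the empty set
    return bel(m, A) + m.get(frozenset(), 0.0)

def q(m, A):     # Eq. (7): commonality, mass of the supersets of A
    return sum(v for B, v in m.items() if B >= A)

A = frozenset({"x"})
print(round(bel(m, A), 3), round(pl(m, A), 3), round(q(m, A), 3))  # 0.6 0.9 0.9
```

Note that with the illustrative values above, bel and pl behave as the minimal and maximal support given to A, as stated in the text.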
These elementary functions bel, pl, q, b are in one-to-one correspondence with m. While bel and pl can easily be understood as the minimal and maximal likelihood
admitted to a proposition A ∈ 2^Ω, q and b are less intuitive, but their formalisms are convenient for the propagation mechanism, as we will see in the next parts.

2.2 The Combination Rules
The Conjunctive Rule of Combination (CRC). It is an associative and commutative operation that combines belief functions coming from reliable and independent sources:

m^Ω_{1∩2}(A) = (m^Ω_1 ∩ m^Ω_2)(A) = ∑_{B∩C=A} m^Ω_1(B) · m^Ω_2(C)    (8)

The same combination can be more conveniently expressed with the commonality:

q^Ω_{1∩2}(A) = q^Ω_1(A) · q^Ω_2(A)    (9)

The Disjunctive Rule of Combination (DRC). The DRC is defined as the combination rule for sources that may be unreliable. It can also be seen as a combination rule which avoids generating conflict:

m^Ω_{1∪2}(A) = (m^Ω_1 ∪ m^Ω_2)(A) = ∑_{B∪C=A} m^Ω_1(B) · m^Ω_2(C)    (10)

The same combination can be more conveniently expressed with the implicability:

b^Ω_{1∪2}(A) = b^Ω_1(A) · b^Ω_2(A)    (11)
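A minimal sketch of the two rules, with illustrative mass values: both rules distribute the mass products m_1(B)·m_2(C), the CRC onto intersections and the DRC onto unions. The example also checks numerically that the CRC reduces to a pointwise product of commonalities, as in equation (9):

```python
def combine(m1, m2, op):
    """Distribute the mass products m1(B)*m2(C) onto op(B, C)."""
    out = {}
    for B, v1 in m1.items():
        for C, v2 in m2.items():
            A = op(B, C)
            out[A] = out.get(A, 0.0) + v1 * v2
    return out

def crc(m1, m2):  # Eq. (8): conjunctive rule, focal sets intersect
    return combine(m1, m2, frozenset.intersection)

def drc(m1, m2):  # Eq. (10): disjunctive rule, focal sets unite
    return combine(m1, m2, frozenset.union)

def q(m, A):      # commonality, Eq. (7)
    return sum(v for B, v in m.items() if B >= A)

omega = frozenset({"x", "not_x"})
m1 = {frozenset({"x"}): 0.7, omega: 0.3}      # illustrative sources
m2 = {frozenset({"not_x"}): 0.5, omega: 0.5}

m12 = crc(m1, m2)
A = frozenset({"x"})
# Eq. (9): the commonality of the CRC is the product of the commonalities.
assert abs(q(m12, A) - q(m1, A) * q(m2, A)) < 1e-9
print(round(m12[frozenset()], 3))  # conflict mass m(empty) produced by the CRC
```

With these two partially contradictory sources, the CRC puts mass on ∅ (conflict), while the DRC sends all of it to Ω, which illustrates why the DRC avoids generating conflict.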
2.3 Generalized Bayes Theorem (GBT)
The GBT performs the same task as Bayes' theorem, but within the TBM framework. Given the set of conditional basic belief assignments m^Ω[θ_i], ∀θ_i ∈ Θ, then ∀ω ⊆ Ω and ∀θ ⊆ Θ:

pl^Θ[ω](θ) = 1 − ∏_{θ_i∈θ} (1 − pl^Ω[θ_i](ω))    (12)
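Equation (12) can be sketched for a two-hypothesis case. The conditional plausibility values below are illustrative assumptions, not values from the paper:

```python
def gbt_pl(pl_cond, theta):
    """Eq. (12): pl[w](theta) = 1 - prod_{theta_i in theta} (1 - pl[theta_i](w)).

    pl_cond maps each elementary hypothesis theta_i to the conditional
    plausibility pl[theta_i](w) of the observed w.
    """
    prod = 1.0
    for theta_i in theta:
        prod *= 1.0 - pl_cond[theta_i]
    return 1.0 - prod

# Illustrative conditional plausibilities of the observation under theta1, theta2.
pl_cond = {"theta1": 0.8, "theta2": 0.4}
print(round(gbt_pl(pl_cond, {"theta1", "theta2"}), 3))  # 1 - 0.2*0.6 = 0.88
```

The disjunction {θ_1, θ_2} is more plausible than either singleton, which matches the disjunctive structure of the GBT.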
2.4 Temporal Belief Filter
A temporal belief filter is proposed, as in [9,10], in order to ensure temporal consistency: the presence of objects of interest can be based on a long-term detection. The predicted belief on X_i at time k can be written as:

m̂^Ωi(X_i^k) = F^Ωi · m^Ωi(X_i^{k−1})    (13)

where m^Ωi(X_i^{k−1}) is the belief function at time k−1, m̂^Ωi(X_i^k) is the predicted belief function at time k and F^Ωi is the temporal evolution model.
Assuming that each node X_i is a binary node (Ω_i = {X_i, X̄_i}), the following vector notation for the belief distribution is used:

m^Ωi = [m^Ωi(∅)  m^Ωi(X_i)  m^Ωi(X̄_i)  m^Ωi(Ω_i)]^T    (14)

Finally, the temporal evolution model F^Ωi, of size 2^|Ωi| × 2^|Ωi|, is written column-wise as:

F^Ωi = [F^Ωi(∅)  F^Ωi(X_i)  F^Ωi(X̄_i)  F^Ωi(Ω_i)]    (15)

with F^Ωi(∅) = [1 0 0 0]^T and F^Ωi(Ω_i) = [0 0 0 1]^T, because all conflict/doubt is transferred onto itself. F^Ωi(X_i) (resp. F^Ωi(X̄_i)) represents the model evolution of the node X_i if its value is true (resp. false). In this case, the belief on X_i (resp. X̄_i) is partly transferred onto X_i (resp. X̄_i) according to a certain confidence α_T (resp. α_F) as:

F^Ωi(X_i) = [0  α_T  0  1−α_T]^T,   F^Ωi(X̄_i) = [0  0  α_F  1−α_F]^T    (16)
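The prediction step (13) with the column structure of (15)-(16) can be sketched as follows; the confidences α_T, α_F and the input belief vector are illustrative values:

```python
alpha_T, alpha_F = 0.9, 0.8          # illustrative model confidences

# Columns F(empty), F(X), F(Xbar), F(Omega) of Eqs. (15)-(16), with the
# state ordering [empty, X, Xbar, Omega] of Eq. (14).
cols = [
    [1.0, 0.0, 0.0, 0.0],              # F(empty): conflict stays on itself
    [0.0, alpha_T, 0.0, 1 - alpha_T],  # F(X): part of X leaks to doubt
    [0.0, 0.0, alpha_F, 1 - alpha_F],  # F(Xbar): part of Xbar leaks to doubt
    [0.0, 0.0, 0.0, 1.0],              # F(Omega): doubt stays on itself
]
F = [[cols[j][i] for j in range(4)] for i in range(4)]  # columns -> rows

m_prev = [0.0, 0.7, 0.1, 0.2]        # illustrative belief at time k-1
# Eq. (13): the predicted belief is the matrix-vector product F * m.
m_hat = [sum(F[i][j] * m_prev[j] for j in range(4)) for i in range(4)]
print([round(x, 3) for x in m_hat])
```

With these values the prediction moves part of the mass from X and X̄ to the doubt Ω, while the total mass stays equal to 1: without a confirming measurement, the filter progressively forgets the past state.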
Finally, the obtained belief at time k is combined with the measured belief distribution. This combination is made with a CRC (cf. 2.2), which highlights the conflict between the prediction m̂^Ω(X^k) and the measurement m̃^Ω(X^k), and is written as:

m^Ω(X^k) = m̂^Ω(X^k) ∩ m̃^Ω(X^k)    (17)

2.5 Discounting a Belief Function
The discounting process is used to reduce the influence of a source of information. The new basic belief assignment ^α m^Ω is computed from m^Ω using the parameter α ∈ [0, 1]:

^α m^Ω(A) = (1 − α) m^Ω(A)   ∀A ⊂ Ω
^α m^Ω(Ω) = (1 − α) m^Ω(Ω) + α    (18)
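Equation (18) is straightforward to sketch; the mass values and α below are illustrative:

```python
def discount(m, alpha, frame):
    """Eq. (18): scale each mass by (1 - alpha) and transfer alpha to the frame."""
    out = {A: (1 - alpha) * v for A, v in m.items()}
    out[frame] = out.get(frame, 0.0) + alpha
    return out

omega = frozenset({"x", "not_x"})
m = {frozenset({"x"}): 0.8, omega: 0.2}     # illustrative masses
md = discount(m, 0.1, omega)                # illustrative alpha = 0.1
print(round(md[frozenset({"x"})], 3), round(md[omega], 3))  # 0.72 0.28
```

Discounting keeps the total mass equal to 1 while moving belief toward total ignorance Ω, which is exactly how a weakly reliable source is weakened before combination.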
3 Dynamic Evidential Networks
The principle of a Dynamic Evidential Network is illustrated by the basic network of Figure 1, where X1 and X2 are parent nodes and the belief distribution on X3 is computed at each iteration k. But, before describing the inference mechanism, it is necessary to develop the two initialization steps:
Fig. 1. A very simple example of evidential network (parent nodes X1 and X2 with child node X3, shown at times k−1 and k)
1. Prior mass belief establishment: With a Bayesian approach, the first step would be to establish p(X3|X1, X2), of size 2 × 2, but whose size can quickly increase depending on the number of feasible states for X1 and X2 and, more generally, on the number of parent nodes. With an evidential network, the conditional beliefs m^Ω3[X_i] are established for each parent node i, ∀i ∈ {1, 2}, independently of the others, according to the knowledge of the system. However, only the conditional belief functions on X3, knowing that X_i is in the state X_i or X̄_i, can be established directly. The belief knowing that the node X_i is in the state Ω_i cannot be intuitively established, but is computed ∀i ∈ {1, 2} by using the DRC as in equation (11):

b^Ω3[Ω_i](X3) = b^Ω3[X_i](X3) · b^Ω3[X̄_i](X3)    (19)
2. Discounting coefficient establishment: When a node depends on many other nodes, it is possible to modify the importance of each node by using discounting coefficients. Another point of view is that the parent nodes can be seen as independent sources which are strongly or weakly weighted, depending on their reliability.

The inference mechanism is now decomposed into simple operations for the basic dynamic network illustrated in Figure 1.

1. Data transformation: Data are transformed into a belief distribution m̃^Ωi for each root node X_i. This transformation can be made by using fuzzy sets or Rayleigh distributions, as done in [11].

2. Propagation: The information from the parent nodes X_i is propagated to the node X3. The obtained belief distributions are denoted m^Ω3_{i→3}. They are computed with plausibilities using the GBT equation (12) as, ∀X3 ⊆ Ω3:

pl^Ω3_{i→3}(X3) = ∑_{X_i⊆Ω_i} pl^Ω3[X_i](X3) · m̃^Ωi(X_i)    (20)

3. Discounting: If the discounting coefficient is α_i for each node X_i, formula (18) is applied to the propagated belief distributions m^Ω3_{i→3} to obtain ^{α_i}m^Ω3_{i→3}.

4. Combination: The discounted propagated belief distributions are finally combined by using the CRC with commonalities, as in equation (9):

q^Ω3(X3) = ^{α_1}q^Ω3_{1→3}(X3) · ^{α_2}q^Ω3_{2→3}(X3)    (21)
5. Time propagation: Assuming that the node X3 evolves according to a model F^Ω3, it is possible to predict the belief m̂^Ω3(X3) and to combine it with the measured belief function m̃^Ω3(X3) according to equation (17).

Note that this propagation algorithm can only be applied to naive networks. With more complex networks, inference algorithms must be applied, as in [12].
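The propagation step of equation (20) can be sketched for one binary parent. The conditional plausibilities and the measured masses below are illustrative assumptions, not values from the paper:

```python
# Conditional plausibilities pl[Xi](A) on the subsets A of Omega3, for the
# three possible states of a binary parent: Xi true, Xi false, or total
# ignorance Omegai; all numerical values are illustrative.
pl_cond = {
    "Xi":     {"X3": 0.9, "notX3": 0.3, "Omega3": 1.0},
    "notXi":  {"X3": 0.2, "notX3": 0.8, "Omega3": 1.0},
    "Omegai": {"X3": 0.92, "notX3": 0.86, "Omega3": 1.0},
}
m_parent = {"Xi": 0.6, "notXi": 0.1, "Omegai": 0.3}  # measured belief on the parent

# Eq. (20): each conditional plausibility is weighted by the parent's mass.
pl_out = {A: sum(m_parent[s] * pl_cond[s][A] for s in m_parent)
          for A in ("X3", "notX3", "Omega3")}
print({A: round(v, 3) for A, v in pl_out.items()})
```

The resulting plausibilities on X3 are then discounted (step 3) and combined across parents as commonality products (step 4), following equations (18) and (21).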
4 Application to the Convoy Detection

4.1 Application Description
A human expert describes a convoy as a set of vehicles evolving with approximately the same dynamics over a long time. These vehicles are moving on the road at a limited velocity (