Automated reasoning with merged contradictory information whose reliability depends on topics

Laurence Cholvy
ONERA-CERT, 2 avenue Edouard Belin, 31055 Toulouse, France
[email protected]

Abstract

This paper presents a theorem prover for reasoning with information which is provided by several information sources and which may be contradictory. This prover allows the user to assume that the different sources are more or less reliable, depending on the topics of the information. Theorems which can be proved by this prover are of the form: if the sources were ordered in such a way for such a topic, then such a formula would be deducible.

1 Introduction

This paper addresses the problem of reasoning with merged information which is contradictory. Such a problem arises in multi-database systems [10] [13] [23], where one wants to access several distributed databases, or when one wants to build a new database from several existing databases which have been independently developed. The problem of merging information also arises when adding a new piece of information to an initial set of information. This problem has been studied in many ways: it was initially studied in the database area [1], [17], [25], [14], [24]. It has also been studied in the area of belief revision [15], [21], [16]. Finally, the problem of merging information arises in multi-source environments [2], [3], [9], [6], [5].

Roughly speaking, our solution to the problem of merging contradictory information consists in considering the relative reliability of the information, i.e. the relative reliability of the sources which provide the information. In a first step, we considered that this reliability only depends on the sources. To be more realistic, we now consider that the reliability of an information source also depends on what the information is about, i.e. its topics. So our present solution consists in considering as many orders between the sources as there are topics of information. Each order, associated with a topic, represents the relative reliability of the sources with regard to the information which belongs to this topic.

Let us take an example. Consider a police inspector who collects information provided by two witnesses of a crime, a woman and a man. They both provide information about what they have seen. The woman says that she saw a girl, wearing a Chanel suit, jumping into a sport Volkswagen car. The man says that he saw a girl wearing a dress. He assumed that she jumped into a car that he did not see, but he heard that it was a diesel. The two accounts are contradictory: did the girl wear a dress or a suit? Was the car a sport car or not?

To solve these contradictions, the inspector may use the fact that, when speaking about "clothes", women are generally more expert than men, and when speaking about "mechanics", men are generally more expert than women. This leads the inspector to assume two orders, depending on the two topics "clothes" and "mechanics": the woman is more reliable than the man with regard to "clothes", and the man is more reliable than the woman with regard to "mechanics". Using such orders, the inspector can adopt two attitudes, which we called "trustful" and "suspicious" [6]. Adopting a "trustful" attitude, the inspector would conclude that there was a girl, wearing a Chanel suit, jumping into a diesel car which was a Volkswagen. Adopting a "suspicious" attitude, the inspector would conclude that there was a girl, wearing a Chanel suit, jumping into a diesel car whose make is unknown.

In section 2, we recall the semantics we have defined, in previous works, for reasoning with merged information using topic-dependent orders and according to the "trustful" attitude. In section 3, we define a theorem prover for automatically reasoning according to this semantics. This prover, which takes into account the reliability of the sources depending on the topics, constitutes the main contribution of our paper.

2 A semantics for merging information sources with topic-dependent orders

2.1 Modelling of topics for the problem of merging

The notion of topic has been investigated to characterize sets of sentences from the point of view of their meaning, independently of their truth values. For example, in the context of Cooperative Answering, topics can be used to extend an answer to other facts related to the same topic [8, 4]. In the context of Knowledge Representation [18], topics are used to represent all that an agent believes about a given topic. In other works [22, 12], the formal definition of the notion of "aboutness" is investigated in general.

The purpose of this paper is not to define a logic for reasoning about the links between a sentence and a topic in general, but to define a logic that is based on source orders which depend on topics. We assume that the underlying propositional language $L$ is associated with a finite number of topics, which are sets of literals such that: each literal of $L$ belongs to a topic; topics may intersect; if a literal of $L$ belongs to a topic, its negation belongs to this topic too.

Definition 1

Let $t_1 \ldots t_m$ be the topics of $L$. Let $O_1 \ldots O_m$ be total orders on the bases, associated with the topics $t_1 \ldots t_m$. $O_1 \ldots O_m$ are $(t_1 \ldots t_m)$-compatible iff

$$\forall k = 1 \ldots m,\ \forall r = 1 \ldots m,\quad t_k \cap t_r \neq \emptyset \implies O_k = O_r.$$

Intuitively, this definition characterizes orders which, in some sense, "agree on" the structure of the topics.
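The compatibility condition of Definition 1 can be checked mechanically. The following is a minimal sketch, assuming topics are encoded as Python sets of literal names and orders as tuples of source names (most reliable first); the function name `compatible`, the `~` prefix for negation and the sample data are illustrative choices, not part of the original formalism.

```python
from itertools import combinations


def compatible(topics, orders):
    """Return True iff any two topics that share a literal carry the same order.

    topics: dict mapping a topic name to a set of literals (assumed encoding)
    orders: dict mapping a topic name to a tuple of sources, most reliable first
    """
    return all(
        orders[a] == orders[b]
        for a, b in combinations(topics, 2)
        if topics[a] & topics[b]  # the two topics intersect
    )


# Two disjoint topics may carry different orders.
disjoint_topics = {
    "clothes": {"dress", "~dress"},
    "mechanics": {"diesel", "~diesel"},
}
disjoint_orders = {"clothes": ("woman", "man"), "mechanics": ("man", "woman")}

# A hypothetical third topic overlapping "clothes" must carry the same order.
overlapping_topics = dict(disjoint_topics, fashion={"dress", "~dress", "Chanel", "~Chanel"})
overlapping_orders = dict(disjoint_orders, fashion=("man", "woman"))

print(compatible(disjoint_topics, disjoint_orders))
print(compatible(overlapping_topics, overlapping_orders))
```

The second call fails the check because the hypothetical topic "fashion" intersects "clothes" while being ordered differently.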

2.2 Semantics of the logic

The databases we consider are finite sets of literals which are satisfiable but not necessarily complete. We note them $1 \ldots n$. Let us note $L$ the underlying propositional language and let $t_1 \ldots t_m$ be the topics on $L$. We define a language $L'$ by adding to language $L$ a finite number of modalities noted $[O_1 \ldots O_m]$, where the $O_i$'s are orders on subsets of the databases, which are assumed to be total and $(t_1 \ldots t_m)$-compatible.

Let $O_1, \ldots, O_m$ be total and $(t_1 \ldots t_m)$-compatible orders on $k$ databases. Then the formula $[O_1 \ldots O_m] F$ means that $F$ is deducible in the database obtained by merging the $k$ databases when their relative reliability is expressed by the orders $O_1 \ldots O_m$. Notice that the general form of these modalities allows us to represent the particular case when $k = 1$. In such a case, there is only one base to order, say $i_0$, thus $O_1 = \ldots = O_m = i_0$.

Definition 2

Let $m$ be an interpretation of $L$ and let $t$ be a topic. We define the projection of $m$ on $t$, noted $m|t$, by:

$$m|t \stackrel{def}{=} \{ l : l \in m \text{ and } l \in t \}$$

Definition 3

Let $E$ be a set of interpretations of $L$ and let $t$ be a topic. We define:

$$E|t \stackrel{def}{=} \{ m|t : m \in E \}$$
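The two projections can be illustrated with a few lines of code. This is a minimal sketch, assuming an interpretation is encoded as a set of literal names with `~` marking negation; the names `project` and `project_set` and the sample interpretations are hypothetical.

```python
def project(m, t):
    """Projection m|t: the literals of interpretation m that belong to topic t."""
    return frozenset(l for l in m if l in t)


def project_set(E, t):
    """Projection E|t: the set of projections of the interpretations in E."""
    return {project(m, t) for m in E}


clothes = {"tailleur", "~tailleur", "dress", "~dress"}

# Two hypothetical interpretations that differ only outside the topic "clothes".
m1 = frozenset({"tailleur", "~dress", "diesel"})
m2 = frozenset({"tailleur", "~dress", "~diesel"})

print(sorted(project(m1, clothes)))
print(len(project_set({m1, m2}, clothes)))
```

Because `m1` and `m2` agree on every literal of the topic, their projections coincide and the projected set contains a single element.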

Definition 4

Let $t$ be a topic and $O$ the total order $(i_1 > \cdots > i_k)$ on $k$ databases, relative to topic $t$. We define:

$$R_t(O) \stackrel{def}{=} f_{i_k,t}(\cdots(f_{i_2,t}(R(i_1)|t))\cdots), \quad \text{with} \quad f_{i_j,t}(E) = \bigcup_{m \in R(i_j)|t} \mathrm{Min}(E, \leq_m)$$

Definition 5

An interpretation of FUSION-T is a pair $M = (W, r)$, where $W$ is the set of all the interpretations of $L$, and $r$ is a finite set of subsets of $W$ such that every modality $[O_1 \ldots O_m]$ is associated with one of these subsets, noted $R(O_1 \ldots O_m)$. The sets $R(O_1 \ldots O_m)$ are recursively defined by:

- $R(i \ldots i)$ is a non-empty subset of $W$;
- $R(O_1 \ldots O_m) \stackrel{def}{=} \{ w : w = w_1 \cup \ldots \cup w_m \}$, where $\forall t \in [1..m],\ w_t \in R_t(O_t)$ and $\forall l \in L,\ l \notin w$ or $\neg l \notin w$.

Definition 6 (Satisfaction of formulas)

Let $F$ be a formula of $L$. Let $F_1$ and $F_2$ be two formulas of $L'$. Let $O_1 \ldots O_m$ be $(t_1 \ldots t_m)$-compatible total orders on some databases. Let $(W, r)$ be an interpretation of FUSION-T and let $w \in W$. The satisfaction of formulas is defined by:

- FUSION-T,r,w $\models F$ iff $w \models F$
- FUSION-T,r,w $\models [O_1 \ldots O_m] F$ iff $\forall w'$, if $w' \in R(O_1 \ldots O_m)$ then $w' \models F$
- FUSION-T,r,w $\models \neg F_1$ iff FUSION-T,r,w $\not\models F_1$
- FUSION-T,r,w $\models F_1 \wedge F_2$ iff (FUSION-T,r,w $\models F_1$) and (FUSION-T,r,w $\models F_2$)

Definition 7 (Valid formulas)

Let $F$ be a formula of $L'$. $F$ is a valid formula in FUSION-T iff $\forall M = (W, r)$, $\forall w \in W$, FUSION-T,r,w $\models F$. (We note this FUSION-T $\models F$.)

Definition 8

We note $\psi$ the conjunction, for any database, of all the literals it believes and the clauses it does not believe:

$$\psi = \bigwedge_{i=1}^{n} \Big( \bigwedge_{l \in i} [i]\, l \;\wedge\; \bigwedge_{i \nvdash c} \neg [i]\, c \Big)$$

where $l$ is a literal, $c$ is a clause and the $i$'s are the databases.

We are interested in finding valid formulas of the form $(\psi \rightarrow [O_1 \ldots O_m] F)$, i.e. finding formulas $F$ which are deducible in the database obtained by merging several databases ordered by topic-dependent orders $O_1 \ldots O_m$.

Let $M_0$ be the interpretation of $L'$ in which each set $R(i \ldots i)$ is the set of all the models of the $i$th base. Let $O_1 \ldots O_m$ be topic-compatible orders. We have proved, in previous works, that the set $R(O_1 \ldots O_m)$ associated in $M_0$ with the modality $[O_1 \ldots O_m]$ is not empty. This guarantees that, when the databases to be merged are satisfiable sets of literals and when the orders are topic-compatible, the semantics defines a database whose set of models, $R(O_1 \ldots O_m)$, is never empty. Thus, this database is satisfiable even if the merged databases are contradictory.

We have also proved that FUSION-T $\models (\psi \rightarrow [O_1 \ldots O_m] F)$ if and only if $\forall w \in R(O_1 \ldots O_m)$, $w \models F$. So, to prove that a formula $F$ is deducible in the database obtained by merging several databases ordered by topic-compatible orders $O_1 \ldots O_m$, we just have to compute $R(O_1 \ldots O_m)$ by Definition 5, assuming that the sets $R(i \ldots i)$ are the sets of all the models of the $i$th base. Let us add that a sound and complete axiomatics has been given for this semantics [7].

2.3 Example

Let us come back to the police inspector example and consider the two topics "clothes" and "mechanics". The orders, relative to these topics, were $O_{clothes} = (woman > man)$ and $O_{mechanics} = (man > woman)$. Here are some deductions the inspector can make, where the proposition tailleur stands for the suit:

- FUSION-T $\models (\psi \rightarrow [O_{clothes}\, O_{mechanics}]\, (Chanel \wedge tailleur \wedge \neg dress))$
- FUSION-T $\models (\psi \rightarrow [O_{clothes}\, O_{mechanics}]\, (diesel \wedge \neg sport\text{-}car \wedge Volkswagen))$
- FUSION-T $\models (\psi \rightarrow \neg [O_{clothes}\, O_{mechanics}]\, (\neg Volkswagen))$

In other terms, when assuming that the woman is more reliable than the man on "clothes" and that the man is more reliable than the woman on "mechanics", the inspector can deduce that the girl was wearing a Chanel suit, and not a dress, and that she jumped into a diesel car, and not a sport car, which was a Volkswagen.

3 Automated deduction

In this section, we deal with the implementation aspects of the logic FUSION-T introduced in the previous section. We describe a theorem prover which allows us to answer questions of the form: is formula $F$ deducible in the database obtained from trustfully merging several databases ordered by topic-dependent orders $O_1 \ldots O_m$? This theorem prover is an extension of the one described in [5] since, instead of considering only one order on information sources, it considers several orders depending on topics. It is specified at the meta-level and implemented in a PROLOG-like language.

3.1 The meta-language

Let us consider a meta-language ML, based on language $L$, defined by:

Constants:
- Propositions of $L$ are constants of ML.
- There is a constant noted nil which represents the empty order.
- There are as many constants as databases $1 \ldots n$.

Functions:
- A unary function noted $\neg$. The term $\neg l$ represents the negation of literal $l$.
- A function noted $>$. The term $O > i$ represents the extension of the order $O$ with $i$. For instance, $(i_1 > i_2) > i_3$ is the order $(i_1 > i_2 > i_3)$. By definition, the order $(i_1 > \ldots > i_n > nil)$ is the order $(i_1 > \ldots > i_n)$.
- A function noted $-$. By definition, the term $(O - i)$ represents the order obtained from $O$ by deleting $i$. For instance, $((i_1 > i_2 > i_3) - i_2)$ is the order $(i_1 > i_3)$.
- An $m$-ary function noted $()$. The term $(O_1 \ldots O_m)$ is a set of $m$ orders.

Predicates:
- The predicate symbols of ML are TFUSION, EMPTY and TOPIC. Their intuitive semantics is the following:
- TFUSION($(O_1 \ldots O_m)$, $l$) means that literal $l$ is true in the database obtained by merging the databases according to the orders $O_1 \ldots O_m$.
- EMPTY($O_1, \ldots, O_m$) is true if and only if all the orders $O_i$ are empty.
- TOPIC($l$, $t$) means that literal $l$ belongs to the topic $t$.
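The order-building terms of ML have a direct counterpart in ordinary data structures. The sketch below assumes an order is encoded as a Python tuple listing sources from most to least reliable, with the empty tuple playing the role of nil; the names `extend`, `delete` and `empty` are illustrative stand-ins for the function $>$, the deletion function and the predicate EMPTY.

```python
NIL = ()  # assumed encoding of the empty order


def extend(order, source):
    """Term (O > i): append source i as the least reliable element of O."""
    return order + (source,)


def delete(order, source):
    """Term obtained from O by deleting i: O without source i, order preserved."""
    return tuple(s for s in order if s != source)


def empty(orders):
    """Predicate EMPTY(O1, ..., Om): true iff every order is the empty order."""
    return all(o == NIL for o in orders)


o = extend(extend(("i1",), "i2"), "i3")   # (i1 > i2) > i3
print(o)
print(delete(o, "i2"))
print(empty([delete(("i1",), "i1"), NIL]))
```

Tuples are used here because orders are never modified in place: each term denotes a new order.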

3.1.1 The meta-program

Let us consider META, the following set of ML formulas:

(0) TOPIC($l$, $k$), for any literal $l$ belonging to topic $k$

(1) TFUSION($(i \ldots i)$, $l$), for any literal $l$ in database $i$

(2) TOPIC($l$, $k$) $\wedge$ TFUSION($((O_1 - i) \ldots (O_{k-1} - i)\ O_k\ (O_{k+1} - i) \ldots (O_m - i))$, $l$) $\wedge$ $\neg$ EMPTY($(O_1 - i), \ldots, (O_{k-1} - i), O_k, (O_{k+1} - i), \ldots, (O_m - i)$) $\rightarrow$ TFUSION($(O_1 \ldots O_{k-1}\ (O_k > i)\ O_{k+1} \ldots O_m)$, $l$)

(3) TOPIC($l$, $k$) $\wedge$ TFUSION($(i \ldots i)$, $l$) $\wedge$ $\neg$ TFUSION($((O_1 - i) \ldots (O_{k-1} - i)\ O_k\ (O_{k+1} - i) \ldots (O_m - i))$, $\neg l$) $\wedge$ $\neg$ EMPTY($(O_1 - i), \ldots, (O_{k-1} - i), O_k, (O_{k+1} - i), \ldots, (O_m - i)$) $\rightarrow$ TFUSION($(O_1 \ldots O_{k-1}\ (O_k > i)\ O_{k+1} \ldots O_m)$, $l$)

(4) EMPTY($nil \ldots nil$)
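To make the behaviour of rules (0) to (4) concrete, here is a minimal Python sketch that reads them top-down, in the way a PROLOG interpreter with negation-as-failure would. It is not the author's implementation. The encoding is an assumption: literals are strings with `~` for negation, orders are tuples (most reliable source first) over the same set of sources, and the contents of the two witness databases are one plausible reading of the police example, since the original text does not list them explicitly.

```python
def neg(lit):
    """Return the complementary literal; '~p' encodes the negation of 'p'."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def tfusion(orders, lit, dbs, topics):
    """Decide TFUSION((O1...Om), lit) by reading rules (0)-(4) top-down.

    orders: dict topic -> tuple of sources, most reliable first
    dbs:    dict source -> set of literals
    topics: dict topic -> set of literals (closed under negation)
    """
    for k, members in topics.items():
        if lit not in members or not orders[k]:   # rule (0): TOPIC(lit, k)
            continue
        i = orders[k][-1]                          # O_k has the form (O'_k > i)
        # Delete i from every order; for topic k this leaves the prefix O'_k.
        rest = {t: tuple(s for s in o if s != i) for t, o in orders.items()}
        if all(not o for o in rest.values()):      # rule (4): EMPTY holds
            if lit in dbs[i]:                      # rule (1): base case (i...i)
                return True
            continue
        if tfusion(rest, lit, dbs, topics):        # rule (2)
            return True
        # Rule (3); "not" plays the role of negation-as-failure.
        if lit in dbs[i] and not tfusion(rest, neg(lit), dbs, topics):
            return True
    return False


def with_negations(atoms):
    """Close a set of atoms under negation, as required of topics."""
    return set(atoms) | {neg(a) for a in atoms}


# Assumed encoding of the police example.
topics = {
    "clothes": with_negations({"Chanel", "tailleur", "dress"}),
    "mechanics": with_negations({"diesel", "sport-car", "Volkswagen"}),
}
dbs = {
    "woman": {"Chanel", "tailleur", "~dress", "sport-car", "Volkswagen"},
    "man": {"dress", "~tailleur", "diesel", "~sport-car"},
}
orders = {"clothes": ("woman", "man"), "mechanics": ("man", "woman")}

for lit in ["Chanel", "tailleur", "~dress", "diesel", "~sport-car",
            "Volkswagen", "~Volkswagen"]:
    print(lit, tfusion(orders, lit, dbs, topics))
```

Under this encoding, the literal Volkswagen is obtained through rule (3): the less reliable source on "mechanics" asserts it and the more reliable one does not assert its negation.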

Proposition (soundness and completeness)

Let $l$ be a literal of $L$ and let $O_1 \ldots O_m$ be $m$ topic-compatible total orders on a subset of $\{1 \ldots n\}$. Using negation-as-failure on the meta-program META, PROLOG succeeds in proving TFUSION($(O_1 \ldots O_m)$, $l$) iff FUSION-T $\models (\psi \rightarrow [O_1 \ldots O_m]\, l)$. It fails iff FUSION-T $\models (\psi \rightarrow \neg [O_1 \ldots O_m]\, l)$.

Sketch of proof

We first introduce an intermediate meta-program where the negations of the literals TFUSION and EMPTY are explicitly represented by new predicates nonTFUSION and nonEMPTY. PROLOG without negation-as-failure can be used on this meta-program, and we can show that it proves TFUSION($(O_1 \ldots O_m)$, $l$) iff FUSION-T $\vdash (\psi \rightarrow [O_1 \ldots O_m]\, l)$. It proves nonTFUSION($(O_1 \ldots O_m)$, $l$) iff FUSION-T $\vdash (\psi \rightarrow \neg [O_1 \ldots O_m]\, l)$. In a second step, we optimize this intermediate program by only using the predicates TFUSION and EMPTY and applying PROLOG with negation-as-failure.

3.2 Extension to any propositional formula

We consider four new meta-axioms and a new function, noted $\cup$, for the management of conjunctions and disjunctions:

(5) CFUSION($(O_1 \ldots O_m)$, nil)

(6) DFUSION($(O_1 \ldots O_m)$, $d$) $\wedge$ CFUSION($(O_1 \ldots O_m)$, $c$) $\rightarrow$ CFUSION($(O_1 \ldots O_m)$, $d \cup c$)

(7) TFUSION($(O_1 \ldots O_m)$, $l$) $\rightarrow$ DFUSION($(O_1 \ldots O_m)$, $l \cup d$)

(8) DFUSION($(O_1 \ldots O_m)$, $d$) $\rightarrow$ DFUSION($(O_1 \ldots O_m)$, $l \cup d$)
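Axioms (5) to (8) reduce a formula in conjunctive normal form to tests on single literals: a conjunction holds when each of its disjunctions does, and a disjunction holds when one of its literals does. The sketch below assumes a CNF formula is encoded as a list of clauses, each clause being a list of literal strings, and takes the literal-level test as a parameter `holds` standing in for TFUSION under fixed orders. The names `dfusion`, `cfusion` and the set `derivable` are hypothetical.

```python
def dfusion(holds, clause):
    """Rules (7)-(8): a disjunction is derivable iff one of its literals is."""
    return any(holds(lit) for lit in clause)


def cfusion(holds, cnf):
    """Rules (5)-(6): the empty conjunction is derivable; otherwise each
    disjunction of the conjunction must be derivable."""
    return all(dfusion(holds, clause) for clause in cnf)


# Hypothetical set of literals already established at the TFUSION level.
derivable = {"Chanel", "tailleur", "~dress", "diesel", "~sport-car", "Volkswagen"}
holds = derivable.__contains__

print(cfusion(holds, [["Chanel"], ["dress", "diesel"]]))
print(cfusion(holds, [["dress"]]))
print(cfusion(holds, []))
```

Passing the literal test as a function keeps this layer independent of how the orders and databases are represented.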

Proposition

Let $F$ be a formula in conjunctive normal form in which no disjunction is a tautology. Let $(O_1 \ldots O_m)$ be $m$ topic-compatible total orders on a subset of $\{1 \ldots n\}$. Using negation-as-failure on the meta-program obtained by adding these four axioms to META, PROLOG proves the goal CFUSION($(O_1 \ldots O_m)$, $F$) iff FUSION-T $\models (\psi \rightarrow [(O_1 \ldots O_m)]\, F)$; it fails iff FUSION-T $\models (\psi \rightarrow \neg [(O_1 \ldots O_m)]\, F)$.

Sketch of proof

This result is essentially due to the fact that these new axioms are definite Horn clauses with no function symbols in the left side, and to the fact that the databases we consider are sets of literals.

4 Discussion

Let us notice that the notion of reliability we have introduced in our work, expressed by an order between the databases, is a relative notion. Indeed, we only assume that a database is more reliable than another one. We do not assume that even the most reliable database provides information which is true in the real world. In other terms, none of the databases, even the most reliable, is assumed to be safe [20], [11], i.e. to tell the truth. In our work, the database obtained by merging several information sources still remains a collection of beliefs.

Recently, in [19], A. Motro also attacked the problem of multiple information sources. Like us, he is interested in providing a way to answer queries addressed to a collection of information sources, and particularly answers which are "inconsistent". However, his notion of inconsistency is not exactly the same as ours: he considers that two information sources are inconsistent as soon as they describe the same portion of the real world differently. This does not mean that they are necessarily contradictory: two information sources may be inconsistent according to Motro if, for instance, one is more precise than the other. He assumes that the information is provided by the sources with a degree of "goodness" (or, in the previous terminology, a degree of safety) which estimates its relationship to the true information of the real world. The main problem is then to integrate information into a set of the highest goodness. Again, the problem we attacked here is a bit different, since we do not consider in our work the relation between the information stored in the databases and the information of the real world it is supposed to represent.

Finally, we think that the notion of topics we have used in our work to order the information sources topic by topic could also be applied to the particular problem of belief revision in the following way: instead of considering that the new belief is always more reliable than the initial set of beliefs (this is what one of the postulates of revision, named R1 in [16], expresses), we could consider that for some topics the new belief is more reliable, but for some other topics it is the initial set of beliefs which is more reliable.

References

[1] S. Abiteboul and G. Grahne. Update semantics for incomplete databases. In Proceedings of VLDB, pages 1-12, 1985.
[2] J. Minker C. Baral, S. Kraus and V.S. Subrahmanian. Combining multiple knowledge bases. IEEE Trans. on Knowledge and Data Engineering, 3(2), 1991.
[3] J. Minker C. Baral, S. Kraus and V.S. Subrahmanian. Combining knowledge bases consisting of first order theories. Computational Intelligence, 8(1), 1992.
[4] S. Cazalens and R. Demolombe. Intelligent access to data and knowledge bases via users' topics of interest. In Proceedings of IFIP Conference, pages 245-251, 1992.
[5] L. Cholvy. Proving theorems in a multi-sources environment. In Proceedings of IJCAI, pages 66-71, 1993.
[6] L. Cholvy. A logical approach to multi-sources reasoning. In Lecture Notes in Artificial Intelligence, number 808, pages 183-196. Springer-Verlag, 1994.
[7] L. Cholvy and R. Demolombe. Reasoning with information sources ordered by topics. In Proc of AIMSA, 1994.
[8] F. Cuppens and R. Demolombe. Cooperative Answering: a methodology to provide intelligent access to Databases. In Proc of Expert Database Systems, 1988.
[9] J. Lang D. Dubois and H. Prade. Dealing with multi-source information in possibilistic logic. In Proceedings of ECAI, pages 38-42, 1992.
[10] L. G. Demichiel. Resolving database incompatibility: an approach to performing relational operations over mismatched domains. IEEE Transactions on Knowledge and Data Engineering, 1(4), 1989.
[11] R. Demolombe and A. Jones. Deriving answers to safety queries. In Proc of International workshop on nonstandard queries and answers (Toulouse), 1991.
[12] R.L. Epstein. The Semantic Foundations of Logic, Volume 1: Propositional Logic. Kluwer Academic, 1990.
[13] Y. Breitbart et al. Panel: interoperability in multidatabases: semantic and systems issues. In Proc of VLDB, pages 561-562, 1991.
[14] L. Farinas and A. Herzig. Reasoning about database updates. In Workshop of Foundations of deductive databases and logic programming.
[15] P. Gardenfors. Knowledge in Flux: Modeling the Dynamics of Epistemic States. The MIT Press, 1988.
[16] H. Katsuno and A. Mendelzon. Propositional knowledge base revision and minimal change. Artificial Intelligence, 52, 1991.
[17] G.M. Kuper, J.D. Ullman, and M. Vardi. On the equivalence of logical databases. In Proc of ACM-PODS, 1984.
[18] G. Lakemeyer. All they know about. In Proc. of AAAI-93, 1993.
[19] A. Motro. A formal framework for integrating inconsistent answers from multiple information sources. Technical Report ISSE-TR-93-106, George Mason University.
[20] A. Motro. Integrity = validity + completeness. In ACM TODS, volume 14(4).
[21] B. Nebel. A knowledge level analysis of belief revision. In Proc of KR'89.
[22] R. Demolombe S. Cazalens and A. Jones. A logic for reasoning about is about. Technical report, ESPRIT Project MEDLAR, 1992.
[23] M. Siegel and S. E. Madnick. A metadata approach to resolving semantic conflicts. In Proceedings of VLDB, pages 133-146, 1991.
[24] M. Winslett. Updating Logical Databases. Cambridge University Press, 1990.
[25] M. Winslett-Wilkins. A model theoretic approach to updating logical databases. In Proceedings of International Conference on Database Theory, Rome, 1986.