Abduction As Belief Revision: A Model of Preferred Explanations

Craig Boutilier and Verónica Becher
Department of Computer Science
University of British Columbia
Vancouver, British Columbia, CANADA, V6T 1Z2
email: cebly,[email protected]

Abstract

We propose a natural model of abduction based on the revision of the epistemic state of an agent. We require that explanations be sufficient to induce belief in an observation in a manner that adequately accounts for factual and hypothetical observations. Our model will generate explanations that nonmonotonically predict an observation, thus generalizing most current accounts, which require some deductive relationship between explanation and observation. It also provides a natural preference ordering on explanations, defined in terms of normality or plausibility. We reconstruct the Theorist system in our framework, and show how it can be extended to accommodate our predictive explanations and semantic preferences on explanations.

1 Introduction

A number of different approaches to abduction have been proposed in the AI literature that model the concept of abduction as some sort of deductive relation between an explanation and the explanandum, the "observation" it purports to explain (e.g., Hempel's (1966) deductive-nomological explanations). Theories of this type are, unfortunately, bound to the unrelenting nature of deductive inference. There are two directions in which such theories must be generalized. First, we should not require that an explanation deductively entail its observation (even relative to some background theory). There are very few explanations that do not admit exceptions. Second, while there may be many competing explanations for a particular observation, certain of these may be relatively implausible. Thus we require some notion of preference to choose among these potential explanations. Both of these problems can be addressed using, for example, probabilistic information (Hempel 1966; de Kleer and Williams 1987; Poole 1991; Pearl 1988): we might simply require that an explanation render the observation sufficiently probable and that the most likely explanations be preferred. Explanations might thus be nonmonotonic in the sense

that α may explain β, but α ∧ γ may not (e.g., P(β|α) may be sufficiently high while P(β|α ∧ γ) may not). There have been proposals to address these issues in a more qualitative manner using "logic-based" frameworks as well. Peirce (see Rescher (1978)) discusses the "plausibility" of explanations, as do Quine and Ullian (1970). Consistency-based diagnosis (Reiter 1987; de Kleer, Mackworth and Reiter 1990) uses abnormality assumptions to capture the context dependence of explanations, and preferred explanations are those that minimize abnormalities. Poole's (1989) assumption-based framework captures some of these ideas by explicitly introducing a set of default assumptions to account for the nonmonotonicity of explanations. We propose a semantic framework for abduction that captures the spirit of probabilistic proposals, but in a qualitative fashion, and in such a way that existing logic-based proposals can be represented as well. Our account will take as central subjunctive conditionals of the form A ⇒ B, which can be interpreted as asserting that, if an agent were to believe A, it would also believe B. This is the cornerstone of our notion of explanation: if believing A is sufficient to induce belief in B, then A explains B. This determines a strong, predictive sense of explanation. Semantically, such conditionals are interpreted relative to an ordering of plausibility or normality over worlds. Our conditional logic, described in earlier work as a representation of belief revision and default reasoning (Boutilier 1991; 1992b; 1992c), has the desired nonmonotonicity and induces a natural preference ordering on sentences (hence explanations). In the next section we describe our conditional logics and the necessary logical preliminaries. In Section 3, we discuss the concept of explanation, its epistemic nature, and its definition in our framework. We also introduce the notion of preferred explanations, showing how the same conditional information used to represent the defeasibility of explanations induces a natural preference ordering. To demonstrate the expressive power of our model, in Section 4 we show how Poole's

Theorist framework (and Brewka’s (1989) extension) can be captured in our logics. This reconstruction explains semantically the non-predictive and paraconsistent nature of explanations in Theorist. It also illustrates the correct manner in which to augment Theorist with a notion of predictive explanation and how one should capture semantic preferences on explanations. These two abilities have until now been unexplored in this canonical abductive framework. We conclude by describing directions for future research, and how consistency-based diagnosis also fits in our system.


2 Conditionals and Belief Revision

The problem of revising a knowledge base or belief set when new information is learned has been well-studied in AI. One of the most influential theories of belief revision is the AGM theory (Alchourrón, Gärdenfors and Makinson 1985; Gärdenfors 1988). If we take an agent to have a (deductively closed) belief set K, adding new information A to K is problematic if K ⊢ ¬A. Intuitively, certain beliefs in K must be retracted before A can be accepted. The AGM theory provides a set of constraints on acceptable belief revision functions ∗. Roughly, using K*_A to denote the belief set resulting when K is revised by A, the theory maintains that the least "entrenched" beliefs in K should be given up and then A added to this contracted belief set. Semantically, this process can be captured by considering a plausibility ordering over possible worlds.

As described in (Boutilier 1992b; Boutilier 1992a), we can use a family of logics to capture the AGM theory of revision. The modal logic CO is based on a propositional language (over variables P) augmented with two modal operators $\Box$ and $\overleftarrow{\Box}$. L_CPL denotes the propositional sublanguage of this bimodal language L_B. The sentence $\Box\alpha$ is read as usual as "α is true at all equally or more plausible worlds." In contrast, $\overleftarrow{\Box}\alpha$ is read "α is true at all less plausible worlds." A CO-model is a triple M = ⟨W, ≤, ϕ⟩, where W is a set of worlds with valuation function ϕ and ≤ is a plausibility ordering over W. If w ≤ v then w is at least as plausible as v. We insist that ≤ be transitive and connected (that is, either w ≤ v or v ≤ w for all w, v). CO-structures consist of a totally ordered set of clusters of worlds, where a cluster is simply a maximal set of worlds C ⊆ W such that w ≤ v for each w, v ∈ C (that is, no extension of C enjoys this property). This is evident in Figure 1(b), where each large circle represents a cluster of equally plausible worlds. Satisfaction of a modal formula at w is given by:

1. $M \models_w \Box\alpha$ iff for each v such that v ≤ w, $M \models_v \alpha$.

2. $M \models_w \overleftarrow{\Box}\alpha$ iff for each v such that not v ≤ w, $M \models_v \alpha$.

We define several new connectives as follows: $\Diamond\alpha \equiv_{df} \neg\Box\neg\alpha$; $\overleftarrow{\Diamond}\alpha \equiv_{df} \neg\overleftarrow{\Box}\neg\alpha$; $\overleftrightarrow{\Box}\alpha \equiv_{df} \Box\alpha \wedge \overleftarrow{\Box}\alpha$; and $\overleftrightarrow{\Diamond}\alpha \equiv_{df} \neg\overleftrightarrow{\Box}\neg\alpha$.

[Figure 1: CT4O and CO models. Panel (a) shows a CT4O-model and panel (b) a CO-model; worlds, labeled by the propositions among A, B, C they satisfy, are grouped into clusters ordered by plausibility.]

It is easy to verify that these connectives have the following truth conditions: $\Diamond\alpha$ ($\overleftarrow{\Diamond}\alpha$) is true at a world iff α holds at some equally or more plausible (less plausible) world; $\overleftrightarrow{\Box}\alpha$ ($\overleftrightarrow{\Diamond}\alpha$) holds iff α holds at all (some) worlds, whether more or less plausible.

The modal logic CT4O is a weaker version of CO, where we weaken the condition of connectedness to simple reflexivity. This logic is based on models whose structure is that of a partially ordered set of clusters (see Figure 1(a)). Both logics can be extended by requiring that the set of worlds in a model include every propositional valuation over P (so that every logically possible state of affairs is possible). The corresponding logics are denoted CO* and CT4O*. Axiomatizations for all logics may be found in (Boutilier 1992b; Boutilier 1992a).

For a given model, we define the following notions. We let ‖α‖ denote the set of worlds satisfying formula α (and also use this notation for sets of formulae K). We use min(α) to denote the set of most plausible α-worlds:¹

min(α) = {w : w ⊨ α, and v < w implies v ⊭ α}

The revision of a belief set K can be represented using CT4O or CO-models that reflect the degree of plausibility accorded to worlds by an agent in such a belief state. To capture revision of K, we insist that any such K-revision model be such that ‖K‖ = min(⊤); that is, ‖K‖ forms the (unique) minimal cluster in the model. This reflects the intuition that all and only K-worlds are most plausible (Boutilier 1992b). The CT4O-model in Figure 1(a) is a K-revision model for K = Cn(¬A, B), while the CO-model in Figure 1(b) is suitable for K = Cn(¬A).

[Footnote 1: We assume, for simplicity, that such a (limiting) set exists for each α ∈ L_CPL, though the following technical developments do not require this (Boutilier 1992b).]
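As a concrete (if simplistic) illustration of these definitions, the following Python sketch encodes a small totally ordered model over two variables, with the plausibility preorder given by an integer rank. It is our own illustration, not part of the paper; the particular ranking (roughly in the spirit of a K-revision model for K = Cn(¬A, B)) is an assumption.

```python
from itertools import product

# A small totally ordered (CO-style) model over P = {A, B}.  Worlds are
# valuations; the preorder "at least as plausible as" is encoded by an
# integer rank (lower = more plausible).  The single rank-0 world makes
# K = Cn(~A, B).  All of this is an illustrative assumption.
P = ("A", "B")
WORLDS = [dict(zip(P, v)) for v in product([True, False], repeat=len(P))]
RANK = {(False, True): 0, (True, True): 1, (True, False): 2, (False, False): 2}

def rank(w):
    return RANK[(w["A"], w["B"])]

def leq(w, v):
    """w <= v: w is at least as plausible as v."""
    return rank(w) <= rank(v)

def box(alpha, w):
    """Box-alpha at w: alpha holds at every equally or more plausible world."""
    return all(alpha(v) for v in WORLDS if leq(v, w))

def back_box(alpha, w):
    """Backward box at w: alpha holds at every world NOT at least as plausible as w."""
    return all(alpha(v) for v in WORLDS if not leq(v, w))

def min_worlds(alpha):
    """min(alpha): the alpha-worlds not strictly dominated by another alpha-world."""
    sat = [w for w in WORLDS if alpha(w)]
    return [w for w in sat if not any(leq(v, w) and not leq(w, v) for v in sat)]

w_ab = {"A": True, "B": True}                 # the A & B world (rank 1)
print(min_worlds(lambda w: True))             # [{'A': False, 'B': True}] = ||K||
print(box(lambda v: v["B"], w_ab))            # True: B holds at all worlds of rank <= 1
print(back_box(lambda v: not v["B"], w_ab))   # True: ~B holds at all strictly less plausible worlds
```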

To revise K by A, we construct the revised set K*_A by considering the set min(A) of most plausible A-worlds in M. In particular, we require that ‖K*_A‖ = min(A); thus B ∈ K*_A iff B is true at each of the most plausible A-worlds. We can define a conditional connective ⇒ such that A ⇒ B is true in just such a case:

$(A \Rightarrow B) \equiv_{df} \overleftrightarrow{\Box}(A \supset \Diamond(A \wedge \Box(A \supset B)))$

Both models in Figure 1 satisfy A ⇒ B, since B holds at each world in the shaded regions, min(A), of the models. Using the Ramsey test for acceptance of conditionals (Stalnaker 1968), we equate B ∈ K*_A with M ⊨ A ⇒ B. Indeed, for both models we have that K*_A = Cn(A, B). If the model in question is a CO*-model then this characterization of revision is equivalent to the AGM model (Boutilier 1992b). Simply using CT4O*, the model satisfies all AGM postulates (Gärdenfors 1988) but the eighth. Properties of this conditional logic are described in Boutilier (1990; 1991).

We briefly describe the contraction of K by ¬A in this semantic framework. To retract belief in ¬A, we adopt the belief state determined by the set of worlds ‖K‖ ∪ min(A). The belief set K−_¬A does not contain ¬A, and this operation captures the AGM model of contraction. In Figure 1(a) K−_¬A = Cn(B), while in Figure 1(b) K−_¬A = Cn(A ⊃ B).
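Continuing the illustration, here is a sketch (again ours, with an assumed ranking modeled loosely on the CO-model of Figure 1(b), where K = Cn(¬A)) of how the conditional A ⇒ B, revision and contraction all reduce to computations over min(A):

```python
from itertools import product

# Worlds and an assumed ranking in the spirit of the CO-model of Figure 1(b):
# the two ~A-worlds form the minimal cluster (so K = Cn(~A)), and among the
# A-worlds those satisfying B are more plausible.
P = ("A", "B")
WORLDS = [dict(zip(P, v)) for v in product([True, False], repeat=len(P))]
RANK = {(False, True): 0, (False, False): 0, (True, True): 1, (True, False): 2}

def rank(w):
    return RANK[(w["A"], w["B"])]

def min_worlds(alpha):
    """min(alpha): most plausible alpha-worlds."""
    sat = [w for w in WORLDS if alpha(w)]
    best = min((rank(w) for w in sat), default=None)
    return [w for w in sat if rank(w) == best]

def conditional(a, b):
    """A => B: B holds at every most plausible A-world (vacuously true if A is impossible)."""
    return all(b(w) for w in min_worlds(a))

def revise(a):
    """||K*_A|| = min(A)."""
    return min_worlds(a)

def contract(x):
    """||K-_X|| = ||K|| union min(~X): retract X by adding the best ~X-worlds."""
    return min_worlds(lambda w: True) + min_worlds(lambda w: not x(w))

A = lambda w: w["A"]
B = lambda w: w["B"]
print(conditional(A, B))   # True, so (Ramsey test) B is in K*_A
print(revise(A))           # [{'A': True, 'B': True}]: K*_A = Cn(A, B)
print(all(not w["A"] or w["B"] for w in contract(lambda w: not w["A"])))
# True: the material conditional A ⊃ B holds throughout ||K-_{~A}||,
# matching K-_{~A} = Cn(A ⊃ B) for the model of Figure 1(b).
```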



A key distinction between CT4O and CO-models is illustrated in Figure 1: in a CO-model, all worlds in min(A) must be equally plausible, while in CT4O this need not be the case. Indeed, the CT4O-model shown has two maximally plausible sets of A-worlds (the shaded regions), yet these are incomparable. We denote the set of such incomparable subsets of min(A) by Pl(A), so that min(A) = ∪Pl(A).² Taking each such subset to be a plausible revised state of affairs rather than their union, we can define a weaker notion of revision using the following connective. It reflects the intuition that at some element of Pl(A), C holds:

$(A \to C) \equiv_{df} \overleftrightarrow{\Box}(\neg A) \vee \overleftrightarrow{\Diamond}(A \wedge \Box(A \supset C))$

The model in Figure 1(a) shows the distinction: it satisfies neither A ⇒ C nor A ⇒ ¬C, but both A → C and A → ¬C. There is a set of comparable most plausible A-worlds that satisfies C and one that satisfies ¬C. Notice that this connective is paraconsistent in the sense that both C and ¬C may be "derivable" from A, but C ∧ ¬C is not. However, → and ⇒ are equivalent in CO, since min(A) must lie within a single cluster.

[Footnote 2: Pl(A) = {min(A) ∩ C : C is a cluster}.]
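The weak conditional can be illustrated with a hand-built partially ordered model in the spirit of the CT4O-model of Figure 1(a). The three worlds and the preorder below are our own assumptions, chosen so that min(A) splits into two incomparable elements of Pl(A):

```python
# A hand-built partially ordered (CT4O-style) model with three worlds: w0 is
# the single K-world, and w1, w2 are incomparable maximally plausible A-worlds.
# The worlds and preorder are assumed for illustration only.
WORLDS = {
    "w0": {"A": False, "B": True,  "C": False},
    "w1": {"A": True,  "B": True,  "C": True},
    "w2": {"A": True,  "B": True,  "C": False},
}
# "At least as plausible as", given explicitly as a reflexive, transitive relation.
LEQ = {("w0", "w0"), ("w1", "w1"), ("w2", "w2"), ("w0", "w1"), ("w0", "w2")}

def leq(w, v):
    return (w, v) in LEQ

def min_worlds(alpha):
    """min(alpha): alpha-worlds not strictly dominated by another alpha-world."""
    sat = [w for w in WORLDS if alpha(WORLDS[w])]
    return [w for w in sat if not any(leq(v, w) and not leq(w, v) for v in sat)]

def pl(alpha):
    """Pl(alpha): min(alpha) partitioned into clusters of mutually comparable worlds."""
    clusters = []
    for w in min_worlds(alpha):
        for c in clusters:
            if leq(w, c[0]) and leq(c[0], w):
                c.append(w)
                break
        else:
            clusters.append([w])
    return clusters

def strong(a, c):
    """A => C: C holds at every world of min(A)."""
    return all(c(WORLDS[w]) for w in min_worlds(a))

def weak(a, c):
    """A -> C: C holds throughout SOME element of Pl(A) (vacuously true if A is impossible)."""
    return not pl(a) or any(all(c(WORLDS[w]) for w in cluster) for cluster in pl(a))

A = lambda w: w["A"]
C = lambda w: w["C"]
not_C = lambda w: not w["C"]
print(strong(A, C), strong(A, not_C))   # False False: neither A => C nor A => ~C
print(weak(A, C), weak(A, not_C))       # True True: both A -> C and A -> ~C (paraconsistent)
```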

Finally, we define the plausibility of a proposition. A is at least as plausible as B just when, for every B-world w, there is some A-world that is at least as plausible as w. This is expressed in L_B as $\overleftrightarrow{\Box}(B \supset \Diamond A)$. If A is (strictly) more plausible than B, then as we move away from ‖K‖, we will find an A-world before a B-world; thus, A is qualitatively "more likely" than B. In each model in Figure 1, A ∧ B is more plausible than A ∧ ¬B.
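In a totally ordered (CO) model this comparison can be checked directly against the ranking. The sketch below (ours, with an assumed ranking as in the earlier sketches) tests the quantified condition literally:

```python
from itertools import product

# Plausibility comparison of propositions in a small ranked model (assumed
# ranking): A is at least as plausible as B iff every B-world has an A-world
# at least as plausible as it.
P = ("A", "B")
WORLDS = [dict(zip(P, v)) for v in product([True, False], repeat=len(P))]
RANK = {(False, True): 0, (True, True): 1, (True, False): 2, (False, False): 2}

def rank(w):
    return RANK[(w["A"], w["B"])]

def at_least_as_plausible(a, b):
    """For every b-world w there is an a-world at least as plausible as w."""
    return all(any(rank(v) <= rank(w) for v in WORLDS if a(v))
               for w in WORLDS if b(w))

A_and_B = lambda w: w["A"] and w["B"]
A_not_B = lambda w: w["A"] and not w["B"]
print(at_least_as_plausible(A_and_B, A_not_B))   # True
print(at_least_as_plausible(A_not_B, A_and_B))   # False: A & B is strictly more plausible than A & ~B
```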

3 Epistemic Explanations

Often explanations are postulated relative to some background theory, which together with the explanation entails the observation. Our notion of explanation will be somewhat different from the usual ones. We define an explanation relative to the epistemic state of some agent (or program). An agent's beliefs and judgements of plausibility will be crucial in its evaluation of what counts as a valid explanation (see Gärdenfors (1988)). We assume a deductively closed belief set K along with some set of conditionals that represent the revision policies of the agent. These conditionals may represent statements of normality or simply subjunctives (see below).

There are two types of sentences that we may wish to explain: beliefs and non-beliefs. If β is a belief held by the agent, it requires a factual explanation, some other belief α that might have caused the agent to accept β. This type of explanation is clearly crucial in most reasoning applications. An intelligent program will provide conclusions of various types to a user; but a user should expect a program to be able to explain how it reached such a "belief," to justify its reasoning. The explanation should clearly be given in terms of other (perhaps more fundamental) beliefs held by the program. This applies to advice systems, intelligent databases, tutorial systems, or a robot that must explain its actions.

A second type of explanation is hypothetical. Even if β is not believed, we may want a hypothetical explanation for it, some new belief the agent could adopt that would be sufficient to ensure belief in β. This counterfactual reading turns out to be quite important in AI, for instance, in diagnosis tasks (see below), planning, and so on (Ginsberg 1986). For example, if A explains B in this sense, it may be that ensuring A will bring about B. If α is to count as an explanation of β in this case, we must insist that α is also not believed. If it were, it would hardly make sense as a predictive explanation, for the agent has already adopted belief in α without committing to β. This leads us to the following condition on epistemic explanations: if α is an explanation for β then α and β must have the same epistemic status for the agent. In other words, α ∈ K iff β ∈ K and ¬α ∈ K iff ¬β ∈ K.³

[Footnote 3: This is at odds with one prevailing view of explanation, which takes only non-beliefs to be valid explanations: to offer a current belief α as an explanation is uninformative; abduction should be an "inference process" allowing the derivation of new beliefs. We take a somewhat different view, assuming that observations are not (usually) accepted into a belief set until some explanation is found and accepted. In the context of its other beliefs, β is unexpected. An explanation relieves this dissonance when it is accepted (Gärdenfors 1988). After this process both explanation and observation are believed. Thus, the abductive process should be understood in terms of hypothetical explanations: when it is realized what could have caused belief in an (unexpected) observation, both observation and explanation are incorporated. Factual explanations are retrospective in the sense that they (should) describe "historically" what explanation was actually adopted for a certain belief. In (Boutilier and Becher 1993) we explore a weakening of this condition on epistemic status. Preferences on explanations (see below) then play a large role in ruling out any explanation whose epistemic status differs from that of the observation.]

Since our explanations are to be predictive, there has to be some sense in which α is sufficient to cause acceptance of β. On our interpretation of conditionals (using the Ramsey test), this is the case just when the agent believes the conditional α ⇒ β. So for α to count as an explanation of β (in this predictive sense, at least) this conditional relation must hold.⁴ In other words, if the explanation were believed, so too would be the observation.

Unfortunately, this conditional is vacuously satisfied when β is believed, once we adopt the requirement that α be believed too. Any α ∈ K is such that α ⇒ β; but surely arbitrary beliefs cannot count as explanations. To determine an explanation for some β ∈ K, we want to (hypothetically) suspend belief in β and, relative to this new belief state, evaluate the conditional α ⇒ β. This hypothetical belief state should simply be the contraction of K by β. The contracted belief set K−_β is constructed as described in the last section. We can think of it as the set of beliefs held by the agent before it came to accept β.⁵

In general, the conditionals an agent accepts relative to the contracted set need not bear a strong relation to those in the original set. Fortunately, we are only interested in those conditionals α ⇒ β where α ∈ K. The AGM contraction operation ensures that ¬α ∉ K−_β. This means that we can determine the truth of α ⇒ β relative to K−_β by examining conditionals in the original belief set. We simply need to check whether ¬β ⇒ ¬α relative to K. This is our final criterion for explanation: if the observation had been absent, so too would the explanation.

We assume, for now, the existence of a model M that captures an agent's objective belief set K and its revision policies (e.g., M completely determines K−_A, K*_A and accepted conditionals A ⇒ B). When we mention a belief set K, we have in mind also the appropriate model M. All conditionals are evaluated with respect to K unless otherwise indicated. We can summarize the considerations above:

Definition A predictive explanation of β ∈ L_CPL relative to belief set K is any α ∈ L_CPL such that: (1) α ∈ K iff β ∈ K and ¬α ∈ K iff ¬β ∈ K; (2) α ⇒ β; and (3) ¬β ⇒ ¬α.

[Footnote 4: See below for a discussion of non-predictive explanations.]
[Footnote 5: We do not require that this must actually be the case.]
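The definition can be read as a simple checking procedure. The sketch below is our own illustration (the propositions R, W and the ranking are assumptions); it verifies conditions (1)-(3) for a hypothetical explanation:

```python
from itertools import product

# Checking the definition with R = rain and W = wet grass (a toy model:
# K = Cn(~R, ~W), and if it were to rain the grass would normally be wet).
P = ("R", "W")
WORLDS = [dict(zip(P, v)) for v in product([True, False], repeat=len(P))]
RANK = {(False, False): 0, (True, True): 1, (False, True): 1, (True, False): 2}

def rank(w):
    return RANK[(w["R"], w["W"])]

def min_worlds(alpha):
    sat = [w for w in WORLDS if alpha(w)]
    best = min((rank(w) for w in sat), default=None)
    return [w for w in sat if rank(w) == best]

def believed(alpha):
    """alpha is in K iff alpha holds throughout ||K|| = min(True)."""
    return all(alpha(w) for w in min_worlds(lambda w: True))

def conditional(a, b):
    """a => b: b holds at all most plausible a-worlds."""
    return all(b(w) for w in min_worlds(a))

def explains(alpha, beta):
    """Conditions (1)-(3): same epistemic status, alpha => beta, and ~beta => ~alpha."""
    neg = lambda f: (lambda w: not f(w))
    same_status = (believed(alpha) == believed(beta) and
                   believed(neg(alpha)) == believed(neg(beta)))
    return same_status and conditional(alpha, beta) and conditional(neg(beta), neg(alpha))

R = lambda w: w["R"]
W = lambda w: w["W"]
print(explains(R, W))   # True: rain is a (hypothetical) predictive explanation of wet grass
print(explains(W, R))   # False: wet grass does not explain rain here (W => R fails)
```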

[Figure 2: Explanations for "Wet Grass." Two models over the propositions R, S, W, C, with worlds labeled by the literals they satisfy and ordered by plausibility; the first illustrates hypothetical explanations of W, the second factual explanations.]

As a consequence of this definition, we have the following property of factual explanations:

Proposition 1 If α, β ∈ K then α explains β iff α ⇒ β is accepted in K−_β.

Thus factual explanations satisfy our desideratum regarding contraction by β. Furthermore, for both factual and hypothetical explanations, only one of conditions (2) or (3) needs to be tested, the other being superfluous:

Proposition 2 (i) If α, β ∈ K then α explains β iff ¬β ⇒ ¬α; (ii) If α, β ∉ K then α explains β iff α ⇒ β.

Figure 2 illustrates both factual and hypothetical explanations. In the first model, wet grass (W) is explained by rain (R), since R ⇒ W holds in that model. Similarly, sprinkler S explains W, as does S ∧ R. Thus, there may be competing explanations; we discuss preferences on these below. Intuitively, α explains β just when β is true at the most plausible situations in which α holds. Thus, explanations are defeasible: W is explained by R; but R together with C (the lawn is covered) does not explain wet grass, for R ∧ C ⇒ ¬W. Notice that R alone still explains W, since the "exceptional" condition C is normally false given R (or otherwise), and thus need not be stated. This defeasibility is a feature of explanations that has been given little attention in many logic-based approaches to abduction.

The second model illustrates factual explanations for W. Since W is believed, explanations must also be believed. R and ¬S are candidates, but only R satisfies the condition on factual explanations: if we give up belief in W, adding R is sufficient to get it back. In other words, ¬W ⇒ ¬R. This does not hold for ¬S because ¬W ⇒ S is false. Notice that if we relax the condition on epistemic status, we might accept S as a hypothetical explanation for factual belief R. This is explored in (Boutilier and Becher 1993).
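The defeasibility just described can be reproduced mechanically. In the following sketch (ours; the three-variable model with R, C, W and its ranking are assumptions, and the sprinkler S of Figure 2 is omitted for brevity), R explains W while R ∧ C does not:

```python
from itertools import product

# A hypothetical wet-grass model in the spirit of Figure 2: R = rain,
# C = the lawn is covered, W = wet grass.  Lower rank = more plausible;
# K = Cn(~R, ~C, ~W).  The ranking is an illustrative assumption.
P = ("R", "C", "W")
WORLDS = [dict(zip(P, v)) for v in product([True, False], repeat=3)]
RANK = {(False, False, False): 0,
        (True,  False, True):  1,   # rain normally wets the (uncovered) grass
        (False, True,  False): 1,
        (True,  False, False): 2,
        (True,  True,  False): 2,   # rain on a covered lawn normally leaves it dry
        (False, False, True):  2,
        (True,  True,  True):  3,
        (False, True,  True):  3}

def rank(w):
    return RANK[(w["R"], w["C"], w["W"])]

def min_worlds(alpha):
    sat = [w for w in WORLDS if alpha(w)]
    best = min((rank(w) for w in sat), default=None)
    return [w for w in sat if rank(w) == best]

def believed(alpha):
    return all(alpha(w) for w in min_worlds(lambda w: True))

def conditional(a, b):
    return all(b(w) for w in min_worlds(a))

def explains(alpha, beta):
    neg = lambda f: (lambda w: not f(w))
    same_status = (believed(alpha) == believed(beta) and
                   believed(neg(alpha)) == believed(neg(beta)))
    return same_status and conditional(alpha, beta) and conditional(neg(beta), neg(alpha))

R = lambda w: w["R"]
W = lambda w: w["W"]
RC = lambda w: w["R"] and w["C"]
print(explains(R, W))    # True:  rain explains wet grass
print(explains(RC, W))   # False: rain on a covered lawn does not (R & C => ~W)
```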

Semantic Preferences: Predictive explanations are very general, for any α that induces belief in β satisfies our conditions. Of course, some such explanations should be ruled out on grounds of implausibility (e.g., a tanker truck exploding in front of my house explains wet grass). In probabilistic approaches to abduction, one might prefer the most probable explanations. In consistency-based diagnosis, explanations with the fewest abnormalities are preferred on the grounds that (say) multiple component failures are unlikely. Preferences can easily be accommodated within our framework. We assume that the β to be explained is not (yet) believed and rank possible explanations for β.⁶ An adopted explanation is not one that simply makes an observation less surprising, but one that is itself as unsurprising as possible. We use the plausibility ranking described in the last section.

Definition If α and α′ both explain β then α is at least as preferred as α′ (written α ≤P α′) iff $M \models \overleftrightarrow{\Box}(\alpha' \supset \Diamond\alpha)$. The preferred explanations of β are those α such that no explanation α′ of β is strictly preferred to α (that is, there is no α′ with α′ <P α).
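In a ranked (CO) model, the comparison ≤P amounts to comparing the ranks of the most plausible α- and α′-worlds. The sketch below (ours; the model and ranking are assumptions) selects the preferred explanation of W from two candidates that both explain it:

```python
from itertools import product

# Ranking candidate explanations for W (wet grass) in an assumed model with
# R = rain and S = sprinkler: both explain W, but rain is the more plausible cause.
P = ("R", "S", "W")
WORLDS = [dict(zip(P, v)) for v in product([True, False], repeat=3)]
RANK = {(False, False, False): 0,   # K = Cn(~R, ~S, ~W)
        (True,  False, True):  1,   # rain-caused wet grass
        (False, True,  True):  2,   # sprinkler-caused wet grass (a bit less plausible)
        (True,  False, False): 3,
        (False, True,  False): 3,
        (True,  True,  True):  3,
        (False, False, True):  3,
        (True,  True,  False): 4}

def rank(w):
    return RANK[(w["R"], w["S"], w["W"])]

def best_rank(alpha):
    ranks = [rank(w) for w in WORLDS if alpha(w)]
    return min(ranks) if ranks else None

def at_least_as_preferred(a, a_prime):
    """a <=_P a': a is at least as plausible as a' (a rank comparison in a CO-model)."""
    return best_rank(a) <= best_rank(a_prime)

def preferred(candidates):
    """Keep the candidates to which no other candidate is strictly preferred."""
    return [name for name, a in candidates.items()
            if not any(at_least_as_preferred(b, a) and not at_least_as_preferred(a, b)
                       for b in candidates.values())]

R = lambda w: w["R"]
S = lambda w: w["S"]
# Both R and S are predictive explanations of W in this model (checked as in the
# earlier sketches); the preference ordering singles out the more plausible one.
print(preferred({"rain": R, "sprinkler": S}))   # ['rain']
```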
