Towards Commonsense Reasoning in Fuzzy Logic in Broader Sense∗
Vilém Novák and Antonín Dvořák
University of Ostrava, Institute for Research and Applications of Fuzzy Modeling
30. dubna 22, 701 03 Ostrava 1, Czech Republic
email: {Vilem.Novak, Antonin.Dvorak}@osu.cz
December 15, 2010
Abstract In this paper, we present a formal logical model of reasoning based on a detective story inspired by an episode of the famous TV series about Lt. Columbo. It demonstrates the power of fuzzy logic in broader sense for modeling human reasoning about everyday situations. It should also serve as a methodology for developing further similar kinds of reasoning.
Keywords: Fuzzy logic in broader sense; fuzzy type theory; fuzzy IF-THEN rules; nonmonotonic logic; evaluative linguistic expressions; precisiated natural language; commonsense reasoning
1 Introduction
The aim of this paper is to show that a reasonably working formalism can be developed for modeling commonsense knowledge and the reasoning performed by people. The proposed formalism is provided by fuzzy logic in broader sense (FLb), an extension of fuzzy logic in narrow sense whose goal is to develop a formal theory of natural human reasoning, which characteristically uses natural language. Thus, besides a full-fledged logical formalism that also covers classical logic, FLb offers a model of the meaning of a part of natural language, namely the part that has an extraordinary position in human commonsense reasoning. For our presentation, we chose a detective story inspired by an episode of the famous TV series about Lt. Columbo, because it is a typical example of human reasoning that uses natural language, knowledge of common things and situations and, of course, standard logical deduction. A crucial role in our model is played by natural language. There have been many attempts to translate natural language into a form better suited for (automated) deduction. A recurring drawback of many proposed formalisms is that they neglect vagueness, which is inherently present in natural language semantics (cf. [7]). One attempt that overcomes this drawback was presented by L. A. Zadeh, who developed a methodology called PNL (Precisiated Natural Language) [22, 23]. The main ideas of PNL are also implemented in this paper. Besides the methodology of PNL, we accept the requirement that our model give a reasonable interpretation of the meaning of natural language, without claiming that it is adequate in all cases from the point of view of linguistic theory. We concentrate instead on the features that matter for the inferential abilities of natural language and develop formal systems that are models (approximations) of natural language.
The formal frame of FLb is the fuzzy type theory (FTT), a higher-order fuzzy logic of Henkin style, chosen because its formalism is very flexible and powerful. Thus, analogously to [21], we employ FTT as a general frame on the basis of which we try to develop a special formal theory suitable for commonsense reasoning based on natural language. Our approach can be viewed as a part of logic-based artificial intelligence, which originated in the pioneering work of John McCarthy [13]. It aims at formalizing commonsense reasoning, that is, the reasoning used by people when solving everyday problems. The use of logic is motivated by the fact that logical formalization should help us understand the reasoning problem itself. Logical sentences are used to represent an agent's knowledge, goals and situation. In the classical approach, these are sentences of first-order logic, though other logical systems have also been proposed. It is widely argued that any working formalization of commonsense reasoning should be nonmonotonic. This means, on the one hand, that the original conclusion can be modified or even rejected on the basis of some new fact and, on the other hand, that the reasoning is based only on some selected facts. The selection is necessary because otherwise all possible facts taken together may lead to a contradiction, simply because some of them are irrelevant for the given case and yet logically imply it. Therefore, we also adopt some ideas from the theory of nonmonotonic reasoning [1, 3]; namely, we consider an epistemic state characterizing Lt. Columbo's solution of the plot. The role of vagueness in commonsense reasoning has several aspects. First of all, it enables us to understand a complicated surrounding world that cannot be known in all its details. Vagueness thus enables us to reduce the necessary amount of information and to focus only on its relevant constituents. Further, vagueness is quite often a feature of the information at our disposal, because more precise information is not available (or is too expensive to obtain). We must thus cope with the lack of details and find relevant conclusions despite it.
∗ The research was supported by the project MSM 6198898701 of the MŠMT ČR.
It is important to stress that vagueness is often useful, because avoiding unnecessary details helps us to orient ourselves better in the problem. We want to show the power of fuzzy logic in broader sense, which is based on fuzzy type theory, on a non-trivial example. We provide a mathematical model of human reasoning. We expect that similar reasoning can be used in a variety of applications where inherently vague notions play a crucial role (for example, in economic analyses). Therefore, our paper should also serve as methodological guidance for the formal analysis of other, possibly more complicated, problems of human reasoning.
2 Informal presentation of the detective story
Our story is inspired by an episode of the famous TV series about Lt. Columbo. Of course, we do not retell the whole episode but select only the moments essential for our analysis.
2.1 The story
Mr. John Smith has been shot dead in his house. He was found by his friend, Mr. Robert Brown. Lt. Columbo suspects Mr. Brown of being the murderer. Mr. Brown's testimony is the following: I left my home at about 6:30, arrived at John's house at about 7, found John dead and went immediately to the phone box to call the police. They told me to wait and came immediately. Lt. Columbo has found the following evidence about the dead Mr. Smith: He wore a high quality suit and a broken wristwatch stopped at 5:45. There was no evidence of a strong strike on his body. Lt. Columbo touched the engine of Mr. Brown's car and found it to be more or less cold. Lt. Columbo concluded that Mr. Brown lied, for the following reasons.
(a) Mr. Brown's car engine is more or less cold, but it would have to be hot because he drove for a long time (more than about 30 minutes). Therefore, he could not have arrived and called the police right away (the police came immediately after his call). He must have stayed there longer.
(b) The wristwatch is broken, but a high quality wristwatch does not break after a not too strong strike. A man having a high quality suit and a luxurious house is supposed to have a high quality wristwatch as well. However, the wristwatch found on John Smith is of low quality, and so it does not belong to him. Consequently, it does not show the time of death.
2.2 Commentary
Note that there is no direct evidence of Mr. Brown's crime. However, Mr. Brown lied about the time of his arrival, and he could have killed Mr. Smith because the time of Mr. Smith's death is unknown. Hence, Lt. Columbo's conjecture based on indirect evidence is justified. Our goal is to develop a formal logical mechanism that can mimic Lt. Columbo's reasoning. The principal idea is that the knowledge coming out of the story contains a contradiction. We will also show that a slight change in the evidence may make the contradiction disappear even though the basic knowledge is not changed. This is a consequence of the vagueness of perceptions, which is captured in our model. Our method is described in Section 4. Section 5.3 contains a logical analysis of the reasoning, including formal proofs of the statements.
3 Preliminaries
3.1 Mathematical fuzzy logic in narrow and broader sense
Mathematical fuzzy logic is commonly classified into fuzzy logic in narrow sense (FLn) and fuzzy logic in broader sense (FLb)∗). The former is mathematical fuzzy logic (see [10, 17]), a generalization of classical mathematical logic: it has clearly distinguished syntax and semantics, the latter always being many-valued. The syntax consists of precise definitions of formula, proof, formal theory, model, provability, etc. Many formalisms fall into the realm of FLn. They usually differ in the assumed structure of truth values, which then determines the properties of the given calculus. It is argued in [17] that the most distinguished calculi, which are also important for the development of FLb, are the IMTL-, BL-, Łukasiewicz- and ŁΠ-fuzzy logics. All these calculi have been formally developed up to higher order. Mathematical fuzzy logic in broader sense is an extension of FLn whose aim is to develop a formal theory of human reasoning that includes a mathematical model of the meaning of special expressions of natural language and of generalized quantifiers, with regard to their vagueness. This program was initiated by V. Novák in 1995 in [14]. It overlaps with two other paradigms proposed in the literature, namely commonsense reasoning and precisiated natural language. The main drawback of the existing formalizations of commonsense reasoning is, in our opinion, that they neglect the vagueness present in the meaning of natural language expressions (cf. [5] and the citations therein). PNL (see [23]) is based on the premise that much of world knowledge is perception-based and that this knowledge is intrinsically fuzzy. It is important to stress that the term “precisiated natural language” means, above all, a reasonably working formalization of the semantics of natural language, without any pretension to capture it in all its detail and subtlety. Its goal is to provide an acceptable and applicable technical solution.
∗) Let us emphasize that this is a special mathematical theory, which must not be mistaken for fuzzy logic in wide sense.
It should also be noted that the term “perception” is not considered here as a psychological term but rather as the result of a human, intrinsically imprecise measurement. In our formal theory, we technically identify perceptions with evaluative expressions of natural language characterizing certain values on an ordered scale. The PNL methodology requires the presence of a World Knowledge Database and a Multiagent, Modular Deduction Database. The former contains all the necessary information, including perception-based propositions describing knowledge acquired by direct human experience, which can be used in the deduction process. The latter contains various rules of deduction. No exact formalization of PNL, however, has been developed so far. Hence, it should be taken mainly as a reasonable methodology. Our concept of FLb is thus a glue between the two paradigms above that should take the best of each. So far, FLb consists of the following theories:
(a) formal theory of evaluative linguistic expressions,
(b) formal theory of fuzzy IF-THEN rules,
(c) formal theory of perception-based logical deduction,
(d) formal theory of intermediate quantifiers.
In this paper, we heavily use the formalism of FTT. Its syntax is a generalized lambda calculus in which the fundamental connective is the fuzzy equality $\equiv$. An important role is also played by the $\Delta$ connective, which sends all truth values smaller than 1 to 0 and thus “extracts” the boolean part of FTT. Each formula has a certain type, denoted by a Greek letter subscript. The elementary types are $o$, which represents truth values, and $\epsilon$, which represents elements (objects). A formula $A_{\beta\alpha}$ of type $\beta\alpha$ is interpreted by a function assigning elements of type $\beta$ to elements of type $\alpha$. The structure of truth values is supposed in this paper to be the standard Łukasiewicz$_\Delta$ algebra, because FTT based on it (we will use the short name Ł-FTT) has some useful properties needed for our model of commonsense reasoning below. For lack of space, we refer the reader to [15, 19], where all the necessary details can be found. Recall that a formula $A_o$ is crisp if $\vdash A_o \vee \neg A_o$. Equivalently, $A_o$ is crisp iff $\vdash A_o \equiv \Delta A_o$. The following special crisp formulas are important:
$$\Upsilon_{oo} := \lambda z_o \cdot \neg\Delta(\neg z_o) \qquad \text{(nonzero truth value)}$$
$$\hat{\Upsilon}_{oo} := \lambda z_o \cdot \neg\Delta(z_o \vee \neg z_o) \qquad \text{(general truth value)}$$
It is easy to prove that in every model $\mathcal{M}$ and assignment $p$, $\mathcal{M}_p(\Upsilon z_o) = 1$ iff $\mathcal{M}_p(z_o) > 0$, and $\mathcal{M}_p(\hat{\Upsilon} z_o) = 1$ iff $1 > \mathcal{M}_p(z_o) > 0$.
Lemma 1 (a) If $T \vdash z_o \Rightarrow x_o$ then $T \vdash \Upsilon z_o \Rightarrow \Upsilon x_o$. (b) If either $T \vdash \Upsilon z_o \,\&\, \Delta(z_o \Rightarrow y_o)$ or $T \vdash \Upsilon z_o \,\&\, (z_o \Rightarrow y_o)$ then $T \vdash \Upsilon y_o$.
A formal theory $T$ is contradictory if there is a formula $A_o$ such that
$$T \vdash A_o \,\&\, \neg A_o. \qquad (1)$$
It can be easily proved that a theory $T$ is contradictory iff $T \vdash A_o$ holds for all formulas of type $o$ (the type of truth values).
Lemma 2 Let $A \in Form_o$ be an arbitrary formula and $T \vdash \Upsilon z_o \,\&\, (z_o \Rightarrow A \,\&\, \neg A)$. Then $T$ is contradictory.
Let us briefly discuss the role of FTT in commonsense reasoning. Several formalisms are used in the theory of commonsense reasoning (a nice overview can be found in [5]). In most cases, these formalisms are extensions of classical predicate logic. FTT, however, is a higher-order fuzzy logic that generalizes classical logic. Using the $\Delta$ connective, we can prove in FTT everything that is provable in classical logic, and even much more. For example, the Versatile Event Logic presented in [2] can be interpreted in FTT, and similarly the formalization of [9]. Thus, FTT has the potential to relate to reality better than classical logic. Because of its great expressive power, it can capture many subtleties of the semantics of natural language (cf. [12]).
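The truth-value operations used above can be tried out numerically. The following is a minimal Python sketch of the standard Łukasiewicz algebra on [0, 1] extended by Δ, together with semantic counterparts of Υ and Υ̂. The function names are illustrative assumptions and do not come from [15, 19].

```python
# Sketch: standard Lukasiewicz algebra on [0, 1] with the Delta operation.
# All function names are illustrative assumptions, not notation from the paper.

def neg(a: float) -> float:
    """Negation: 1 - a."""
    return 1.0 - a


def conj(a: float, b: float) -> float:
    """Strong conjunction &: max(0, a + b - 1)."""
    return max(0.0, a + b - 1.0)


def impl(a: float, b: float) -> float:
    """Implication (residuum): min(1, 1 - a + b)."""
    return min(1.0, 1.0 - a + b)


def delta(a: float) -> float:
    """Delta keeps the value 1 and sends every smaller value to 0."""
    return 1.0 if a == 1.0 else 0.0


def upsilon(z: float) -> float:
    """Semantic counterpart of Upsilon: not Delta(not z)."""
    return neg(delta(neg(z)))


def upsilon_hat(z: float) -> float:
    """Semantic counterpart of Upsilon-hat: not Delta(z or not z), 'or' as max."""
    return neg(delta(max(z, neg(z))))


if __name__ == "__main__":
    for z in (0.0, 0.3, 1.0):
        print(z, upsilon(z), upsilon_hat(z))
```

Up to floating-point rounding for values extremely close to 0 or 1, `upsilon(z)` is 1 exactly for z > 0 and `upsilon_hat(z)` is 1 exactly for 0 < z < 1, in line with the characterization above.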
3.2 A formal theory of evaluative linguistic expressions
This is the principal constituent of FLb. Recall that evaluative (linguistic) expressions are expressions of natural language such as small, medium, big, about twenty five, roughly one hundred, very short, more or less deep, not very tall, roughly warm or medium hot, quite roughly strong, roughly medium important, and many others. They form a small but very important constituent of natural language that is used in practically any commonsense speech. The reason is that by evaluating the phenomena around us we can modify our subsequent behavior. Evaluative expressions thus determine our decisions and help us in learning, understanding and many other activities. The gradable adjectives small, medium, big are taken as canonical. Of course, they can be replaced by other suitable adjectives depending on the context. In this paper, we deal only with so-called simple trichotomous evaluative expressions, which have the form
⟨linguistic hedge⟩⟨evaluative adjective⟩ (2)
where ⟨linguistic hedge⟩ is a special intensifying adverb such as very, significantly, more or less, roughly, etc., and ⟨evaluative adjective⟩ is one of the canonical adjectives small, medium, big. We distinguish between evaluative expressions and evaluative predications. The latter are expressions of natural language of the form X is A, where A is an evaluative expression. Examples are “temperature is high”, “speed is extremely low”, “quality is very high”, etc. The semantic model of evaluative expressions in FLb distinguishes between their intension (a property) and their extension in a given context of use (possible world; cf. [12]). We follow Carnap's idea of modeling an intension as a function defined on a set of contexts. In our model, this enables us to state that the expression “high” is the name of a property of some feature of objects, i.e. of their height. Its meaning can be 30 cm when a beetle needs to climb a straw, 30 m for an electrical pylon, but 4 km or more for a mountain. For lack of space, we refer to the paper [18], where all the details of the theory of evaluative expressions can be found, i.e., their definition, syntactic structure, logical analysis and the formal theory of their meaning. Let us only mention that we will often use a general metavariable $Ev \in Form_\varphi$ for the intension of an evaluative expression, where $\varphi = (o\alpha)\omega = (o\alpha)(\alpha o)$ is a special (meta-)type for intensions. We can thus formalize expressions such as “temperature is high”, “pressure is extremely low”, “a car is beautiful”, etc. in a uniform way.
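The distinction between intension and extension can be illustrated by a small Python sketch. It assumes that a context is a triple of left bound, central point and right bound, as in the context triples used later in the paper, and it uses hypothetical piecewise-linear shapes for “small” and “big”. The actual extensions defined in [18] have a different shape and take linguistic hedges into account, so the numbers below only illustrate the dependence on context.

```python
from typing import Callable, NamedTuple


class Context(NamedTuple):
    left: float    # smallest value considered in the context
    middle: float  # typical "medium" value
    right: float   # largest value considered in the context


Extension = Callable[[float], float]        # fuzzy set: value -> degree in [0, 1]
Intension = Callable[[Context], Extension]  # context -> extension


def big(w: Context) -> Extension:
    """Hypothetical piecewise-linear extension of 'big' in the context w."""
    def membership(x: float) -> float:
        if x <= w.middle:
            return 0.0
        if x >= w.right:
            return 1.0
        return (x - w.middle) / (w.right - w.middle)
    return membership


def small(w: Context) -> Extension:
    """Hypothetical piecewise-linear extension of 'small' in the context w."""
    def membership(x: float) -> float:
        if x >= w.middle:
            return 0.0
        if x <= w.left:
            return 1.0
        return (w.middle - x) / (w.middle - w.left)
    return membership


if __name__ == "__main__":
    minutes = Context(0, 5, 30)     # drive duration in minutes
    celsius = Context(0, 45, 100)   # engine temperature in degrees Celsius
    # The same number 30 is "big" in one context and not in the other.
    print(big(minutes)(30), big(celsius)(30))
```

The single intension `big` thus yields different extensions in different contexts, which is the point of modeling intensions as functions on contexts.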
3.3 Perception-based logical deduction
This is a special deduction method of FLb that enables us to derive conclusions from linguistic descriptions, i.e. sets of fuzzy/linguistic IF-THEN rules of the form
R := IF X is A THEN Y is B (3)
where A, B are evaluative expressions. Hence, whole rules are construed as conditional expressions of natural language. A list of fuzzy IF-THEN rules can be taken as a special text in natural language. We call it a linguistic description and distinguish its topic and focus (a general linguistic elaboration of this phenomenon can be found, e.g., in [11]). More formally, a linguistic description is a finite set $LD = \{Int(R_j) \mid j = 1, \ldots, m\}$ of fuzzy IF-THEN rules (3), where $Int(R_j)$ denotes the intension of the rule $R_j$ [18]. Its topic is the set of evaluative expressions $Topic^{LD} = \{Ev^A_j \mid j = 1, \ldots, m\}$ forming the antecedents of the rules, and its focus is the set $Focus^{LD} = \{Ev^C_j \mid j = 1, \ldots, m\}$ forming the consequents of the rules. The symbols $Ev^A$, $Ev^C$ denote the intensions of the predications in the antecedent and consequent, respectively. We refer to [16, 20] for precise definitions and for proofs of the basic theorems introduced below. We have to consider a partial ordering of sharpness between (intensions of) evaluative expressions. It is realized by a special formula $\prec$, so that $Ev_1 \prec Ev_2$ means that $Ev_1$ is sharper than $Ev_2$. Sharpness has the following meaning: let, for example, x be, at least partly, “very big” in all contexts. Then we know that x is also “big” in all these contexts. Hence, we conclude that $Int(X \text{ is very big}) \prec Int(X \text{ is big})$. The formula $Ev_1 \mid Ev_2$ expresses that the two evaluative expressions are incomparable. Let $\mathbf{c}$ be a fixed logical constant for some truth value $c > 0$. Then the crisp formula
$$CEval_{o(\varphi\alpha\omega)} \equiv \lambda w\, \lambda x_\alpha\, \lambda z_\varphi\, (\exists z_o)(\Upsilon z_o \,\&\, \Delta((\mathbf{c} \vee z_o) \Rightarrow z_\varphi w x_\alpha)) \qquad (4)$$
expresses that $x_\alpha$ is evaluated in the context $w$ by (the intension of) the expression $z_\varphi$ in a degree at least $c \vee z_o$, where $c$ is a threshold. We may also consider the weaker formula
$$Eval_{o(\varphi\alpha\omega)} \equiv \lambda w\, \lambda x_\alpha\, \lambda z_\varphi\, (\exists z_o)(\Upsilon z_o \,\&\, \Delta(z_o \Rightarrow z_\varphi w x_\alpha)).$$
We say that an element x in the context w is evaluated by the expression Ev if $T^{Ev} \vdash Eval\, w\, x\, Ev$.†) Here, $T^{Ev}$ denotes the formal theory of evaluative linguistic expressions [18].
†) A context can be informally understood as an interval of real numbers. For the formal definition, see again [18].
One of the principal paradigms of the concept of precisiated natural language is that world knowledge, i.e., the knowledge accumulated by people during their lives, is perception-based. As already noted, a perception is for us the result of a measurement that, when performed by people without precise measuring tools, is highly imprecise. We are interested in such values. Therefore, it seems reasonable to identify perceptions with evaluative expressions, i.e. they are formulas of $T^{Ev}$. The concept of perception is formalized as follows:
$$LPerc_{o(\varphi\alpha)} := \lambda w\, \lambda x\, \lambda z_\varphi \cdot Topic\, z_\varphi \,\&\, CEval\, w x z_\varphi \,\&\, (\forall z'_\varphi)((CEval\, w x z'_\varphi \,\&\, Topic\, z'_\varphi) \Rightarrow ((z'_\varphi w x < z_\varphi w x) \vee (z_\varphi \prec z'_\varphi) \vee (z'_\varphi \mid z_\varphi))). \qquad (5)$$
$LPerc\, w x_\alpha z_\varphi$ expresses that the intension $z_\varphi$ of some evaluative expression is a local perception of $x_\alpha \in Form_\alpha$ in the context $w$ with respect to the given set $Topic$ of linguistic expressions. The meaning of (5) is the following: the intension $z_\varphi$ of an evaluative expression is a local perception of $x \in Form_\alpha$ in $w$ with respect to the topic of the given linguistic description if $x_\alpha$ is evaluated by $z_\varphi$ in the given context $w$ and, for every other $z'_\varphi$ which also evaluates $x_\alpha$ in $w$, either the truth value of $z'_\varphi w x$ is smaller than that of $z_\varphi w x$, or $z_\varphi$ is sharper than $z'_\varphi$, or it is incomparable with the latter. Clearly, the adjective “local” relates to the given context $w$. Let LD be a linguistic description. Then we may consider the formula $LPerc^{LD}$ obtained from (5) by inserting $Topic^{LD}$ for $Topic$. The following lemma summarizes some of the main properties used in this paper.
Lemma 3 (a) Let $T^{Ev} \vdash Ev_1 \prec Ev_2$. Then $T^{Ev} \vdash LPerc\, w\, x\, Ev_1 \Rightarrow LPerc\, w\, x\, Ev_2$. (b) Let $T^{Ev} \vdash z_\varphi w x \Rightarrow z'_\varphi w' y$. Then $T^{Ev} \vdash Eval\, w x z_\varphi \Rightarrow Eval\, w' y z'_\varphi$.
The linguistic description LD characterizes in natural language some kind of dependence or relation between attributes of objects (and, consequently, between the objects themselves). People use it when they want to describe a certain situation or process that they do not know precisely.
Therefore, the most important role of a linguistic description is to provide a conclusion about the consequent objects Y when information about the antecedent objects X is given. Such information has the character of a perception of properties of the latter objects, and so the corresponding procedure is called perception-based logical deduction. We introduce a special inference rule of perception-based logical deduction. Let LD be a linguistic description consisting of rules of the form (3) and let $x \in Form_\alpha$, $y \in Form_\beta$, $w \in Form_{\alpha o}$, $w' \in Form_{\beta o}$. Then the following scheme is a specific inference rule:
$$r_{PbLD}: \quad \frac{LPerc^{LD}\, w x\, Ev^A_i, \quad LD}{Eval\, w' \hat{y}_i\, Ev^C_i} \qquad (6)$$
where $\hat{y}_i \equiv \iota\lambda y \cdot (Ev^A_i w x \Rightarrow Ev^C_i w' y)$‡), $i \in \{1, \ldots, m\}$, $T \vdash Topic^{LD}\, Ev^A_i$ and $T \vdash Focus^{LD}\, Ev^C_i$. Loosely speaking, from a local perception $Ev^A_i$ and a linguistic description we conclude, using $r_{PbLD}$, that the element $\hat{y}_i$ is evaluated by the evaluative expression $Ev^C_i$. The conclusion of this rule is the formula $(Eval\, w' \hat{y}_i\, Ev^C_i)$ of type $o$, stating that a special element $\hat{y}_i$, whose construction is given by (6), is evaluated by $Ev^C_i$. Thus, in every model $\mathcal{M}$ we can find its interpretation $\mathcal{M}_p(\hat{y}_i) = v \in M_\beta$ using the operation $\mathcal{M}_p(\iota_{\beta(o\beta)})$, which in fuzzy set theory is just the defuzzification function.
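The interplay of formula (5) and rule (6) can be sketched in Python over a finite topic. The sketch reads (5) literally: an expression is a local perception if its truth degree reaches the threshold c and every other candidate has a smaller degree, is less sharp, or is incomparable. It then fires the matching rule and picks a middle element of the kernel as a stand-in for the description operator ι. The membership shapes, the grid, the helper names and the choice of the middle kernel element are all assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

Membership = Callable[[float], float]


def impl(a: float, b: float) -> float:
    """Lukasiewicz implication."""
    return min(1.0, 1.0 - a + b)


def local_perceptions(degrees: Dict[str, float],
                      sharper: Set[Tuple[str, str]],
                      c: float = 0.9) -> List[str]:
    """Literal reading of (5) on a finite topic.

    degrees[z] is the truth degree of 'z w x'; (a, b) in sharper means
    that a is sharper than b (assumed to be transitively closed).
    """
    candidates = [z for z, d in degrees.items() if d >= c]  # CEval holds

    def incomparable(a: str, b: str) -> bool:
        return (a, b) not in sharper and (b, a) not in sharper

    return [
        z for z in candidates
        if all(degrees[other] < degrees[z]
               or (z, other) in sharper
               or incomparable(z, other)
               for other in candidates if other != z)
    ]


def pbld(degrees: Dict[str, float], sharper: Set[Tuple[str, str]],
         rules: Dict[str, str], consequents: Dict[str, Membership],
         grid: Sequence[float], c: float = 0.9) -> Dict[str, float]:
    """For each local perception, fire its rule and defuzzify the conclusion."""
    result = {}
    for ante in local_perceptions(degrees, sharper, c):
        consequent = consequents[rules[ante]]
        conclusion = [impl(degrees[ante], consequent(y)) for y in grid]
        top = max(conclusion)
        kernel = [y for y, d in zip(grid, conclusion) if d == top]
        result[ante] = kernel[len(kernel) // 2]  # assumed stand-in for iota
    return result


def ramp_up(lo: float, hi: float) -> Membership:
    """Hypothetical 'big'-like shape: 0 below lo, 1 above hi."""
    return lambda y: min(1.0, max(0.0, (y - lo) / (hi - lo)))


def ramp_down(lo: float, hi: float) -> Membership:
    """Hypothetical 'small'-like shape: 1 below lo, 0 above hi."""
    return lambda y: min(1.0, max(0.0, (hi - y) / (hi - lo)))


# Hypothetical rules: drive duration (antecedent) -> engine temperature.
RULES = {"Bi": "Bi", "Sm": "ML Sm"}
CONSEQUENTS = {"Bi": ramp_up(45, 85), "ML Sm": ramp_down(30, 60)}
GRID = list(range(0, 101, 5))  # engine temperature in degrees Celsius

if __name__ == "__main__":
    print(pbld({"Bi": 1.0, "Sm": 0.0}, set(), RULES, CONSEQUENTS, GRID))
```

With these assumed shapes, a drive duration perceived as Bi yields a temperature from the upper part of the grid, and one perceived as Sm yields a temperature from the lower part.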
3.4 Necessity of nonmonotonic reasoning
As mentioned in the introduction, our reasoning is based on a selection from all known facts. To formalize this idea, we adopt the concept of epistemic state introduced in the book by Bochman [3]. Let a language J of Ł-FTT be given. The epistemic state is a triple
$$E = \langle S, l, \lhd \rangle \qquad (7)$$
where $S$ is a set of objects called admissible belief states, $\lhd$ is a preference relation on $S$ and $l$ is a labeling function assigning a deductively closed theory of FLb (called an admissible belief set) to every belief state from $S$. An admissible belief state $s \in S$ is preferred if there is no $t \in S$ such that $s \lhd t$. In our case, we say that a belief state $s$ is preferred to $s'$, written $s' \lhd s$, if some of the consequences of $l(s')$ are necessary for inference in $l(s)$.
‡) The description operator ι picks a typical element from the kernel of the corresponding fuzzy set. Thus, it works as a defuzzification operator.
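As a data-structure illustration, an epistemic state over finitely many belief states can be sketched in Python as follows. This is only a sketch under simplifying assumptions: belief sets are represented by finite sets of formula names rather than by deductively closed theories, and the state names and formula names in the example are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Set, Tuple


@dataclass
class EpistemicState:
    states: Set[str]
    # Labeling function: belief state -> (finite stand-in for a) belief set.
    label: Dict[str, FrozenSet[str]]
    # (s, t) in preference means that t is preferred to s.
    preference: Set[Tuple[str, str]] = field(default_factory=set)

    def preferred_states(self) -> Set[str]:
        """Belief states s for which no t is preferred to s."""
        return {s for s in self.states
                if not any((s, t) in self.preference for t in self.states)}


def example_state() -> EpistemicState:
    """Hypothetical example: conclusions about habits feed the watch reasoning."""
    return EpistemicState(
        states={"s_habits", "s_watch", "s_engine"},
        label={
            "s_habits": frozenset({"wealth is Bi"}),
            "s_watch": frozenset({"wealth is Bi", "watch quality is Bi"}),
            "s_engine": frozenset({"engine temperature is ML Sm"}),
        },
        preference={("s_habits", "s_watch")},
    )


if __name__ == "__main__":
    print(sorted(example_state().preferred_states()))
```

In the example, `s_watch` is preferred to `s_habits` because it uses a consequence derived there, so the preferred states are `s_engine` and `s_watch`.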
4 General principles of our commonsense reasoning method
We follow the methodology of PNL, whose main idea is to construct the World Knowledge Database. For this task, we utilize ideas taken from the theory of commonsense reasoning and the formalization developed in FLb. Note that a first simple application of this methodology in economics was published in [6]. The general scheme of our method is depicted in Figure 1. Among our goals is to formalize it and to show how it can work.
[Figure 1: General scheme of reasoning applied to the Lt. Columbo problem. Blocks: Problem; World Knowledge Database (contexts, logical rules, knowledge from experience), formalized by models M, M', ..., formal theories and linguistic descriptions; Modular Deduction Database, formalized by formal deduction rules, PbLD and the epistemic state; Evidence, formalized by the model M and perceptions; Solution.]
We now explain the content of this figure. The world knowledge database consists of several components:
(i) Contexts of various typical features of objects (variables). This information comes from experience and general knowledge of the world. For example, we know that the context for heights of people in Europe is always ⟨40, 165, 220⟩ (in cm). In FLb this means that the known values should be inserted into a specific model $\mathcal{M}$ of FTT which corresponds to the given situation.
(ii) Logical rules are logical theorems of FTT and theorems of both the theory $T^{Ev}$ of evaluative expressions and the logical theory of fuzzy IF-THEN rules.
(iii) Customs of people.
(iv) Properties of products.
(v) Knowledge from physics and also from other areas of human activity.
(vi) Possibly some other commonsense knowledge necessary for the solution of the problem.
Items (iii)–(vi) are condensed in Figure 1 into “knowledge from experience”. Their formalization in FLb is in most cases expressed using sets of linguistic descriptions consisting of special fuzzy IF-THEN rules acquired by experience. Items (iv) and (v) correspond to knowledge that has been extensively discussed by Davis in [4] under the term “naive physics”. Our item (v) corresponds to the description of what is called a “microworld” in the cited paper. Note that various kinds of inferences discussed there, for example “If A is very much bigger than B then A+B is very close to A” (cf. Section 12), can be easily realized in FLb†). The Modular Deduction Database combines two sets of techniques: the techniques developed within FLb and the principles of nonmonotonic reasoning, among them preferential nonmonotonic reasoning (see the discussion in [3]). This enables us to structure the reasoning into separate blocks, some of them mutually independent and others following from each other, so that the results of one are used in the next. The structure of the reasoning thus forms a clear and transparent system. The evidence in the story comes from two sources:
1. evidence provided by Mr. Brown,
2. evidence found by Lt. Columbo.
Its formalization thus leads to the completion of the model $\mathcal{M}$ of FTT constructed according to item (i) and to the formation of specific perceptions (i.e., specific evaluative predications). In the rest of this paper, we formalize the blocks from Figure 1 and construct a special epistemic state E, as defined in (7), that is taken as a formal model of the knowledge and beliefs of Lt. Columbo. Its admissible belief sets contain pieces of Lt. Columbo's world knowledge and also perceptions concerning the case of Mr. Smith that are derived with respect to a specific model. This enables us to distinguish clearly the evidence obtained by Lt. Columbo from the testimony of Mr. Brown. Conclusions in preferential nonmonotonic reasoning are obtained by choosing preferred admissible belief states consistent with the facts. If one or more admissible belief sets l(s) are contradictory, then Lt. Columbo may conclude that Mr. Brown lies and accuse him of murder.
5 Formalization
5.1 Specific world knowledge database
The language J of FTT considered below contains all the elements of the language $J^{Ev}$, including all special formulas and constants of the theory $T^{Ev}$ defined further. Moreover, we include in J the following:
(i) $\vartheta \in Types$ represents a general feature of objects that can be characterized using grades. In our case, these can be wealth, quality, strike strength or temperature. In the model, a set of this type can be, e.g., a subset of the real numbers.
(ii) $\tau \in Types$ represents time, more specifically the length of a time period. Time is a very specific characteristic of human life that requires a more detailed elaboration. In this paper, however, we do not need any special treatment of it, because the length of a time period can be understood simply as a physical quantity and we do not need its other characteristics (ordering etc.).
(iii) $\beta, \gamma, \delta, \eta, \pi \in Types$ represent objects: wristwatch, suit, car, house and person, respectively.
(iv) $WatchQual, WatchState, WatchStrike \in Form_{\vartheta\beta}$, $SuitQual \in Form_{\vartheta\gamma}$, $PersonWealth \in Form_{\vartheta\pi}$, $DriveDur \in Form_{\tau\delta}$, $EngTemp \in Form_{\vartheta\delta}$, $HouseQual \in Form_{\vartheta\eta}$ are special formulas characterizing the quality of a wristwatch (objects of type $\beta$), its state and the strength of a strike into it, the quality of suits (objects of type $\gamma$), the wealth of persons (objects of type $\pi$), the drive duration of cars (objects of type $\delta$), the temperature of a car engine and the quality of houses (objects of type $\eta$), respectively.
(v) In correspondence with the previous item, we also need variables for the contexts: $w_{WatchQual}, w_{WatchState}, w_{WatchStrike}, w_{SuitQual}, w_{PersonWealth}, w_{EngTemp}, w_{HouseQual} \in Form_{\vartheta o}$ and also $w_{DriveDur} \in Form_{\tau o}$.
(vi) $Own \in Form_{(o\beta)\pi}$ is a predicate saying that a person, say $x_\pi$, owns a watch.
(vii) Special constants concerning the story, namely $c_{Brown} \in Form_\delta$ representing the car of Mr. Brown; $h_{Smith} \in Form_\beta$, $s_{Smith} \in Form_\gamma$, $m_{Smith} \in Form_\eta$ representing Mr. Smith's wristwatch, suit and house, respectively; and a constant $Smth \in Form_\pi$ representing Mr. Smith himself.
The set of types considered above will be denoted by $Types_{Col} = \{\beta, \gamma, \delta, \eta, \pi, \vartheta, \tau\}$. To reduce the burden of various kinds of symbols, we will use the formulas for contexts $w_{WatchQual}, \ldots$ both as variables (of the given type) and, later on, as constants for our story. We will also consider a specific model $\mathcal{M}$ given by the evidence above. The perceptions become axioms of the theories concerned.
†) In the field of commonsense reasoning, there are projects that aim at building large collections of commonsense facts consisting of millions of assertions. In the Lt. Columbo example, we provide the world knowledge manually, but automation is considered for the future.
Recall that they are, in fact, formulas true in the model $\mathcal{M}$ in the degree 1. The specific world knowledge concerning our story is formalized using specific linguistic descriptions. We will usually write only one concrete fuzzy IF-THEN rule (the most important one for our case).
(i) Contexts that are generally known:
(a) Drive duration to heat the engine: $w_{DriveDur} = \langle 0, 5, 30 \rangle$ (minutes).
(b) Temperature of engine: $w_{EngTemp} = \langle 0, 45, 100 \rangle$ (degrees Celsius).
(c) Abstract degrees†) of state and strike strength: $w_{WatchState} = w_{WatchStrike} = \langle 0, 0.4, 1 \rangle$.
(ii) Logical rules (see [18], Theorem 6):
IF X is Sm ν THEN X is ¬Bi (8)
IF X is Bi ν THEN X is ¬Sm (9)
where Sm ν, Bi ν are the intensions of “ν small” and “ν big”, respectively, for an arbitrary linguistic hedge ν and a variable X for objects of arbitrary type. Of course, many other rules based on the logical structure of Ł-FTT and $T^{Ev}$ are also considered.
(iii) Knowledge from physics: The linguistic description denoted by $LD^{Physics}$ consists of rules of the form
IF drive duration is Bi THEN engine temperature is Bi
IF drive duration is Sm THEN engine temperature is ML Sm
... (10)
†) This is a simplification that should be accepted because all the information is subjective, and so we can hardly assume that people are able to estimate, for example, the physical force in newtons necessary to break the wristwatch.
(iv) Habits of people: Characterization of the relation of the quality of clothes, house and wristwatch to the wealth of a person $x_\pi$. We consider two linguistic descriptions. The first linguistic description $LD^{Habits1}$ consists of rules
IF quality of $x_\pi$'s suit is $Bi$ AND quality of $x_\pi$'s house is $Ve\,Bi$ THEN wealth of $x_\pi$ is $Bi$
...    (11)

The second description $LD^{Habits2}$ consists of rules
IF $x_\pi$ owns wristwatch AND wealth of $x_\pi$ is $Bi$ THEN quality of wristwatch is $Bi$
...    (12)

(v) Properties of products: The linguistic description denoted by $LD^{Prod}$ consists of rules of the form
IF strike strength is $ML\,Sm$ AND state of wristwatch is $Sm$ THEN quality of wristwatch is $Sm$
...    (13)
(If a more or less small strike breaks the wristwatch, then its quality is low.)

Furthermore, we must construct a specific model. This will be based on a frame

$$\mathcal{M} = \langle (M_\alpha, =_\alpha)_{\alpha \in Types_{col}}, \mathcal{L}_\Delta \rangle \tag{14}$$

where $M_o = [0, 1]$, $M_\zeta = \mathbb{R}$ for $\zeta \in \{\vartheta, \tau\}$ and, furthermore, $M_\beta$ is a set of wristwatches, $M_\gamma$ a set of suits, $M_\delta$ a set of cars, $M_\eta$ a set of houses and $M_\pi$ a set of persons. Let us remark at this point, for example, that the set $M_\beta$ is not the set of all wristwatches, since such a set does not exist; nor does it necessarily contain wristwatches only. In fact, it can be an arbitrary (finite or even infinite) set of objects among which we can find the wristwatch of Mr. Smith. The same reasoning holds for the other sets of objects.
5.2 Evidence

5.2.1 Evidence by Lt. Columbo
To provide the evidence formally means to establish the model $\mathcal{M}$ in (14).

Evidence about context:
(ContC1) Wealth of a person: $w_{PersonWealth} = \langle 0, 1, 50 \rangle$ (millions of dollars; here we consider the typical value of real estate in the given region).
(ContC2) Quality of house: $w_{HouseQual} = \langle 0.05, 0.5, 20 \rangle$ (millions of dollars).
(ContC3) Quality of wristwatch: $w_{WatchQual} = \langle 0.1, 1, 10 \rangle$ (thousands of dollars).
(ContC4) Quality of suit: $w_{SuitQual} = \langle 0.1, 0.5, 3 \rangle$ (thousands of dollars).

Evidence leading to perceptions:
(PercC1) Touching Mr. Brown's car engine by hand does not burn, that is, its temperature is $ML\,Sm$ (it can typically be about 40 °C). With respect to the given model $\mathcal{M}$, this formally means that $\mathcal{M}_p(LPerc^{LD}\, w_{EngTemp}\, (EngTemp\ c_{Brown})\ ML\,Sm) = 1$.
(PercC2) Quality of Mr. Smith's house is $Ve\,Bi$ (more than about 18 million dollars). Formally, $\mathcal{M}_p(LPerc^{LD}\, w_{HouseQual}\, (HouseQual\ m_{Smith})\ Ve\,Bi) = 1$.
(PercC3) Mr. Smith's wristwatch is broken. We will express this by the observation that the state of Mr. Smith's wristwatch is $Ex\,Sm$ (abstract degrees). Due to the given linguistic description (13), it is sufficient to say that its state is small, i.e. formally we obtain the perception $\mathcal{M}_p(LPerc^{LD}\, w_{WatchState}\, (WatchState\ h_{Smith})\ Sm) = 1$.
(PercC4) Quality of Mr. Smith's suit is $Bi$ (more than about 2,500 dollars). Formally, $\mathcal{M}_p(LPerc^{LD}\, w_{SuitQual}\, (SuitQual\ s_{Smith})\ Bi) = 1$.
(PercC5) The strength of the strike to Mr. Smith's wristwatch is $ML\,Sm$ (less than about 0.1; abstract degrees), i.e. the strike was caused by the body falling down. Formally, $\mathcal{M}_p(LPerc^{LD}\, w_{WatchStrike}\, (WatchStrike\ h_{Smith})\ ML\,Sm) = 1$.

To apply the rule $r_{PbLD}$ in (6), we must find the perception. Evaluating a perception in the model $\mathcal{M}$ requires finding the truth degree in which the associated formula of the form $z_\varphi\, w\, x$ from the topic of the corresponding linguistic description is true in it (cf. formula (5)). The perception is then true in the degree 1 if that formula is true at least in the degree $c$ for some suitable, sufficiently high threshold truth value (e.g., $c = 0.9$; cf. definition (5)).

5.2.2 Evidence by Mr. Brown
Evidence leading to perceptions:
(PercB1) My drive duration was about 30 minutes. In the model $\mathcal{M}$ this formally means that $\mathcal{M}_p((DriveDur\ c_{Brown}) \equiv \top_{w_{Dd}}) = 1$.
(PercB2) The wristwatch on Mr. Smith's hand belongs to him. Formally, $\mathcal{M}_p(Own\ Smth\ h_{Smith}) = 1$.

An implicit consequence of (PercB2) is that the wristwatch displays the time of Mr. Smith's death.
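The threshold condition used in Section 5.2.1 to evaluate perceptions (a perception is true in the degree 1 once the associated formula reaches a truth degree of at least c) can be sketched as follows. The function name `perception` and the example truth degrees are hypothetical; in the paper the truth degrees come from the model, not from a hand-written table.

```python
def perception(truth_degree: float, threshold: float = 0.9) -> int:
    """Return 1 if the associated formula is true at least in the degree
    `threshold`, and 0 otherwise.

    The default 0.9 is the example value of c mentioned in the text.
    """
    if not 0.0 <= truth_degree <= 1.0:
        raise ValueError("truth degree must lie in [0, 1]")
    return 1 if truth_degree >= threshold else 0


# Hypothetical truth degrees of candidate expressions for one observation
# (illustrative numbers only, not computed from the model in (14)).
CANDIDATES = {"Sm": 0.55, "ML Sm": 0.95, "Bi": 0.0}

# Expressions whose perception is true in the degree 1 for these numbers.
ACCEPTED = [name for name, degree in CANDIDATES.items() if perception(degree) == 1]
```

For the illustrative numbers above only "ML Sm" passes the threshold, which mirrors the situation in (PercC1).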
5.3 Construction of epistemic state
The principles of preferential nonmonotonic reasoning require splitting the reasoning into a sequence of formal derivations, each proceeding in some special theory of FTT which includes parts of $T^{Ev}$ and whose special axioms and provable formulas are just those derived from the corresponding linguistic descriptions. Thus, only those formulas that relate directly to the given situation will be used in the reasoning. The structure of Lt. Columbo's reasoning is represented by the structure of admissible belief states described in Figure 2. Each of the above formal theories is a label of some admissible belief state $s$, in symbols $\ell(s)$. Recall that a belief state $s$ is preferred to $s'$, $s' \prec s$, if some of the consequences of $\ell(s')$ are necessary for inference in $\ell(s)$. The states $s_{Brown1}$, $s_{Brown2}$ correspond to the two pieces of evidence provided by Mr. Brown.
Figure 2: Structure of Lt. Columbo's epistemic state (diagram not reproduced; its nodes are the belief states $s_{Physics}$, $s_{Products}$, $s_{Habits}$, $s_{Columbo}$, $s_{Brown1}$, $s_{Brown2}$, $s_{both1}$ and $s_{both2}$).
By $T \subseteq T^{Ev}$ we denote a part of $T^{Ev}$ necessary for our reasoning. By $Th(\cdot)$ we denote a deductive closure (of a set of formulas) in FLb. The corresponding belief sets are defined as follows:

$$\ell(s_{Physics}) = Th(T \cup \{LD^{Physics}, (8), (9), LPerc^{LD}\, w_{EngTemp}\, (EngTemp\ c_{Brown})\ ML\,Sm\}), \tag{15}$$

$$\ell(s_{Prod}) = Th(\{LD^{Prod}, LPerc^{LD}\, w_{WatchState}\, (WatchState\ h_{Smith})\ Sm, LPerc^{LD}\, w_{WatchStrike}\, (WatchStrike\ h_{Smith})\ ML\,Sm\}), \tag{16}$$

$$\ell(s_{Habits}) = Th(T \cup \{LD^{Habits1}, LD^{Habits2}, LPerc^{LD}\, w_{SuitQual}\, (SuitQual\ s_{Smith})\ Bi, LPerc^{LD}\, w_{HouseQual}\, (HouseQual\ m_{Smith})\ Ve\,Bi, Eval\ w_{Wq}\, (WatchQual\ h_{Smith})\ Sm\}), \tag{17}$$

$$\ell(s_{Brown1}) = Th(T \cup \{(DriveDur\ c_{Brown}) \equiv \top_{w_{Dd}}\}), \tag{18}$$

$$\ell(s_{Brown2}) = Th(T \cup \{Own\ Smth\ h_{Smith}\}), \tag{19}$$

$$\ell(s_{Columbo}) = Th(\{\Upsilon(\neg Bi\, w_{Dd}\, (DriveDur\ c_{Brown})), Eval\ w_{Wq}\, (WatchQual\ h_{Smith})\ Sm, \neg Own\ Smth\ h_{Smith}\}), \tag{20}$$

$$\ell(s_{both1}) = Th(\ell(s_{Columbo}) \cup \ell(s_{Brown1})), \tag{21}$$

$$\ell(s_{both2}) = Th(\ell(s_{Columbo}) \cup \ell(s_{Brown2})). \tag{22}$$
For example, the formula $LPerc^{LD}\, w_{EngTemp}\, (EngTemp\ c_{Brown})\ ML\,Sm$ in (15) expresses that "perception of engine temperature of Mr. Brown's car is more or less small", and similarly for the other formulas.
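The dependence between belief states can be sketched as a small directed graph from which an admissible order of derivations is obtained. This is only an illustration: the edges in `USES` are an assumption read off the belief-set definitions and the later use of the conclusion from the products state inside the habits state; they are not a transcription of Figure 2, and the names `USES` and `derivation_order` are hypothetical.

```python
from graphlib import TopologicalSorter

# Each belief state maps to the states whose consequences it relies on.
# Assumed edges (not taken from Figure 2): the Columbo state collects the
# conclusions of the physics, products and habits states, and each "both"
# state joins Columbo's beliefs with one piece of Mr. Brown's evidence.
USES = {
    "sPhysics": set(),
    "sProd": set(),
    "sHabits": {"sProd"},
    "sBrown1": set(),
    "sBrown2": set(),
    "sColumbo": {"sPhysics", "sProd", "sHabits"},
    "sboth1": {"sColumbo", "sBrown1"},
    "sboth2": {"sColumbo", "sBrown2"},
}


def derivation_order(uses):
    """Return the states in an order where every state follows the states it uses."""
    return list(TopologicalSorter(uses).static_order())


ORDER = derivation_order(USES)
```

Any order returned this way processes a less preferred state before the more preferred states that depend on it, which is the sequencing that the preferential approach asks for.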
6 Formal solution of Lt. Columbo's case

We will formally demonstrate that Lt. Columbo's observations contradict Mr. Brown's evidence.

6.1 Proofs of contradiction
(FS1) $\ell(s_{Brown1}) \vdash Bi\, w_{DriveDur}\, (DriveDur\ c_{Brown})$
(Drive duration of Mr. Brown was big.)

proof: This follows immediately from the properties of the evaluative expression $Bi$. □

(FS2) $\ell(s_{Physics}) \vdash \Upsilon(\neg Bi\, w_{Dd}\, (DriveDur\ c_{Brown}))$
(It is true in a non-zero degree that the drive duration of Mr. Brown was not big.)

proof: From the intension of (10) we can derive by contraposition the intension of the rule

IF engine temperature is $\neg Bi$ THEN drive duration is $\neg Bi$.    (23)

Moreover, using (8) and Lemma 3(a), we conclude that

$$\vdash (\forall x)(\exists y)\Delta(\Upsilon(Sm\,\nu\, w_{Et}\, x) \Rightarrow \Upsilon(\neg Bi\, w_{Dd}\, y)). \tag{24}$$

Let us introduce a new constant $b_0$ such that $\vdash b_0 \equiv ML\,Sm\, w_{EngTemp}\, (EngTemp\ c_{Brown})$. Since we know that $\ell(s_{Physics}) \vdash LPerc^{LD}\, w_{EngTemp}\, (EngTemp\ c_{Brown})\ ML\,Sm$, we get from the provable property $T^{Ev} \vdash \Upsilon z_\varphi\, w\, x \equiv Eval\ w\, x\, z_\varphi$ that $\ell(s_{Physics}) \vdash \Upsilon b_0$. From this and (24), we obtain

$$\ell(s_{Physics}) \vdash b_0 \Rightarrow \neg Bi\, w_{Dd}\, (DriveDur\ c_{Brown}) \tag{25}$$

which implies $\ell(s_{Physics}) \vdash \Upsilon(\neg Bi\, w_{Dd}\, (DriveDur\ c_{Brown}))$ by Lemma 1(b). □
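The contraposition step that turns (10) into (23) can be checked numerically under one assumption that this section does not spell out: that implication and negation are interpreted by the standard Łukasiewicz operations on [0, 1]. The sketch below uses exact rational arithmetic to compare a ⇒ b with ¬b ⇒ ¬a on a grid; the function names are hypothetical.

```python
from fractions import Fraction


def luk_implies(a, b):
    """Lukasiewicz implication: min(1, 1 - a + b) (assumed connective)."""
    return min(Fraction(1), 1 - a + b)


def luk_not(a):
    """Lukasiewicz negation: 1 - a (assumed connective)."""
    return 1 - a


def contraposition_holds(steps=20):
    """Check a => b equals (not b) => (not a) for all grid points k/steps."""
    grid = [Fraction(k, steps) for k in range(steps + 1)]
    return all(
        luk_implies(a, b) == luk_implies(luk_not(b), luk_not(a))
        for a in grid
        for b in grid
    )
```

The equality holds identically, since 1 - (1 - b) + (1 - a) simplifies to 1 - a + b; the grid check only makes that visible on concrete truth degrees.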
(FS3) $\ell(s_{Prod}) \vdash Eval\ w_{Wq}\, (WatchQual\ h_{Smith})\ Sm$
(Quality of the wristwatch on Mr. Smith's hand is low.)

proof: Since we know that $\ell(s_{Prod}) \vdash LPerc^{LD}\, w_{WatchState}\, (WatchState\ h_{Smith})\ Sm$ as well as $\ell(s_{Prod}) \vdash LPerc^{LD}\, w_{WatchStrike}\, (WatchStrike\ h_{Smith})\ ML\,Sm$, using $r_{PbLD}$ we immediately obtain the conclusion. □
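The way the two perceptions select rule (13) in the proof of (FS3) can be pictured with a deliberately crude sketch. It matches perceptions against rule antecedents by exact label, which is a strong simplification of the rule $r_{PbLD}$ (the actual rule works with truth degrees and intensions); the class `Rule` and the function `fire` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A fuzzy IF-THEN rule reduced to labels (illustrative simplification)."""
    antecedents: tuple  # pairs (variable, evaluative expression)
    consequent: tuple   # pair (variable, evaluative expression)


# Rule (13) from the linguistic description of product properties.
RULE_13 = Rule(
    antecedents=(("strike strength", "ML Sm"), ("state of wristwatch", "Sm")),
    consequent=("quality of wristwatch", "Sm"),
)


def fire(rule, perceptions):
    """Return the consequent if every antecedent matches a perception, else None."""
    if all(perceptions.get(var) == expr for var, expr in rule.antecedents):
        return rule.consequent
    return None


# Perceptions (PercC5) and (PercC3), written as labels.
PERCEPTIONS = {"strike strength": "ML Sm", "state of wristwatch": "Sm"}
```

Calling `fire(RULE_13, PERCEPTIONS)` yields the pair for "quality of wristwatch is Sm", the label-level counterpart of the conclusion of (FS3).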
(FS4) $\ell(s_{Habits}) \vdash \neg Own\ Smth\ h_{Smith}$
(The wristwatch on Mr. Smith's hand does not belong to him.)

proof: From $LD^{Habits1}$ and the perceptions $LPerc^{LD}\, w_{SuitQual}\, (SuitQual\ s_{Smith})\ Bi$ and $LPerc^{LD}\, w_{HouseQual}\, (HouseQual\ m_{Smith})\ Ve\,Bi$ we obtain

$$\ell(s_{Habits}) \vdash Eval\ w_{WP}\, (PersonWealth\ Smth)\ Bi, \tag{26}$$

i.e. Mr. Smith was wealthy. Further derivation can proceed as follows:

(L.1) IF quality of wristwatch is $Sm$ THEN quality of wristwatch is $\neg Bi$    (rule (8))
(L.2) IF quality of wristwatch is $\neg Bi$ THEN wealth of $x_\pi$ is $\neg Bi$ OR $x_\pi$ does not own wristwatch    (contraposition of (12))
(L.3) $\ell(s_{Habits}) \vdash Eval\ w_{Wq}\, (WatchQual\ h_{Smith})\ Sm$    (axiom, conclusion from $\ell(s_{Prod})$)
(L.4) $\ell(s_{Habits}) \vdash \Upsilon \neg Bi\, w_{PersonWealth}\, (PersonWealth\ Smth) \vee \neg Own\ Smth\ h_{Smith}$    (L.2)
(L.5) $\ell(s_{Habits}) \vdash \neg Own\ Smth\ h_{Smith}$    (L.4 and (26), properties of FTT) □
(FS5) $\ell(s_{both1})$ is contradictory.

proof: This follows from $\ell(s_{both1}) \vdash Bi\, w_{DriveDur}\, (DriveDur\ c_{Brown}) \mathbin{\&} \Upsilon(\neg Bi\, w_{Dd}\, (DriveDur\ c_{Brown}))$ and Lemma 2. □

(FS6) $\ell(s_{both2})$ is contradictory.

proof: This follows immediately from $\ell(s_{both2}) \vdash \neg Own\ Smth\ h_{Smith}$ as well as $\ell(s_{both2}) \vdash Own\ Smth\ h_{Smith}$. □
Since both states $s_{both1}$ and $s_{both2}$ are preferred and the corresponding belief sets are contradictory, we conclude that Mr. Brown lies and that he had an opportunity to kill Mr. Smith. Therefore, Lt. Columbo (and we) concluded that Mr. Brown assassinated Mr. Smith.
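The contradiction in (FS6) is of the simplest kind: a formula and its negation are both derivable. The sketch below detects exactly that pattern in a set of formulas written as strings. It is an illustration only: it does not compute a deductive closure, and it would not detect the contradiction in (FS5), which relies on Lemma 2 rather than on a syntactic pair. The names `negate` and `contradictory_pairs` are hypothetical.

```python
NEG = "¬"


def negate(formula):
    """Return the syntactic negation of a formula given as a string."""
    if formula.startswith(NEG):
        return formula[len(NEG):].strip()
    return f"{NEG}{formula}"


def contradictory_pairs(formulas):
    """List the positive formulas whose negation also occurs in the set."""
    known = set(formulas)
    return sorted(f for f in known if not f.startswith(NEG) and negate(f) in known)


# Conclusion (FS4) on Columbo's side and (PercB2) on Mr. Brown's side.
COLUMBO = {"¬Own Smth hSmith"}
BROWN2 = {"Own Smth hSmith"}
BOTH2 = COLUMBO | BROWN2
```

Here `contradictory_pairs(BOTH2)` reports the ownership formula, while each of the two sets taken alone is free of such pairs.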
6.2 Discussion
Note that the reasoning described above is in many places close to that in classical logic. This is correct, since fuzzy logic generalizes but does not deny classical logic. The main point is that we work in situations where evidence varies around the boundaries of vaguely defined concepts. Our example has been constructed to show that we may arrive at a contradiction. In practice, the situation need not be as straightforward as in Columbo's detective story. For example, we have formulated the knowledge from physics in (10) according to deeper physical knowledge. People, however, quite often reverse cause and consequence, so that the linguistic description can look as follows:

IF engine temperature is $Sm$ THEN drive duration is $Ve\,Sm$
IF engine temperature is $Ve\,Bi$ THEN drive duration is $Ro\,Bi$
...    (27)
Since Columbo's perception of engine temperature is $ML\,Sm$ (see Evidence (PercC1)), the truth degree that the former is small is

$$\mathcal{M}_p(LPerc^{LD}\, w_{EngTemp}\, (EngTemp\ c_{Brown})\ Sm) < 1. \tag{28}$$
We can even specify a concrete truth value of (28) in the model $\mathcal{M}$. This means that the result of PbLD using (27) will not lead to the conclusion (FS2) and provides only an element $\widehat{DriveDur\ c_{Brown}}$ (after using (6)) whose interpretation is a value closer to Mr. Brown's testimony. Consequently, this branch of Columbo's reasoning would become less convincing and he would have to search for other arguments against Mr. Brown.

We see that a modification of the perception leads to a modification of the corresponding truth values (for example, the truth of "temperature of engine is small" can be higher or, vice versa, the truth of "temperature of engine is more or less big" can be nonzero), so that the conclusion can be made more, or less, convincing. Then further evidence may be necessary or, vice versa, more psychological pressure can be put on Mr. Brown, etc. (this is Lt. Columbo's typical method).

The previous discussion demonstrates that vagueness plays an important role in our reasoning and in drawing conclusions. Classical Boolean logic can provide solutions only at the price of imposing, quite often, improper precision and thus making the model of reasoning unrealistic. The introduction of truth degrees may help to overcome such restrictions, since we can balance them: the higher the truth values, the more convincing the conclusion, but the limit (i.e., full truth 1 or full falsity 0) is unnecessary.
7 Conclusion
In this paper, we have considered a complex case of the famous television character, detective Lt. Columbo, and used it to demonstrate the power of fuzzy logic in broader sense. At the same time, we also had to apply principles of nonmonotonic reasoning. We analyzed the structure of Lt. Columbo's reasoning and came (together with him) to the conclusion that the suspect, Mr. Brown, assassinated Mr. Smith. One of the main features of our theory is the intensive use of the formal theory of the meaning of evaluative linguistic expressions. Our analysis raised several questions. A principal one is how the given situation can be represented so that linguistic descriptions can be generated from it automatically. Another significant question is whether the reasoning can be automated. In further research, we will focus on these questions. Note that a similar methodology can be used elsewhere, for example in the analysis of economic situations (cf. [8]).
References

[1] G. Antoniou, Nonmonotonic Reasoning, MIT Press, Cambridge, Massachusetts, 1997.
[2] B. Bennett, A. Galton, A unifying semantics for time and events, Artificial Intelligence 153 (1-2) (2004) 13–48.
[3] A. Bochman, A Logical Theory of Nonmonotonic Inference and Belief Change, Springer, Berlin, 2001.
[4] E. Davis, The naive physics perplex, AI Magazine 19 (3) (1998) 51–79.
[5] E. Davis, L. Morgenstern, Introduction: Progress in formal commonsense reasoning, Artificial Intelligence 153 (2004) 1–12.
[6] B. Diaz, T. Takagi, PNL applied to economics, in: Proc. Int. Conf. FUZZ-IEEE'2005, Reno, USA, 2005.
[7] A. Dvořák, V. Novák, Fuzzy logic as a methodology for the treatment of vagueness, in: L. Běhounek, M. Bílková (eds.), The Logica Yearbook 2004, Filosofia, Prague, 2005, pp. 141–151.
[8] A. Dvořák, V. Novák, Towards automatic modeling of economic texts, Mathware & Soft Computing XIV (3) (2007) 217–231.
[9] A. Gordon, J. Hobbs, Formalizations of commonsense psychology, AI Magazine 25 (4) (2004) 49–62.
[10] P. Hájek, What is mathematical fuzzy logic, Fuzzy Sets and Systems 157 (2006) 597–603.
[11] E. Hajičová, B. H. Partee, P. Sgall, Topic-Focus Articulation, Tripartite Structures, and Semantic Content, Kluwer, Dordrecht, 1998.
[12] P. Materna, Concepts and Objects, Acta Philosophica Fennica 63, Helsinki, 1998.
[13] J. McCarthy, Programs with common sense, in: Proc. of the Teddington Conf. on Mechanization of Thought Processes, Her Majesty's Stationery Office, London, 1959.
[14] V. Novák, Towards formalized integrated theory of fuzzy logic, in: Z. Bien, K. Min (eds.), Fuzzy Logic and Its Applications to Engineering, Information Sciences, and Intelligent Systems, Kluwer, Dordrecht, 1995, pp. 353–363.
[15] V. Novák, On fuzzy type theory, Fuzzy Sets and Systems 149 (2005) 235–273.
[16] V. Novák, Perception-based logical deduction, in: B. Reusch (ed.), Computational Intelligence, Theory and Applications, Springer, Berlin, 2005, pp. 237–250.
[17] V. Novák, Which logic is the real fuzzy logic?, Fuzzy Sets and Systems 157 (2006) 635–641.
[18] V. Novák, A comprehensive theory of trichotomous evaluative linguistic expressions, Fuzzy Sets and Systems 159 (22) (2008) 2939–2969.
[19] V. Novák, EQ-algebra-based fuzzy type theory and its extensions, Logic Journal of the IGPL (2010) (to appear), DOI: 10.1093/jigpal/jzp087.
[20] V. Novák, S. Lehmke, Logical structure of fuzzy IF-THEN rules, Fuzzy Sets and Systems 157 (2006) 2003–2029.
[21] S. C. Shapiro, SNePS: A logic for natural language understanding and commonsense reasoning, in: L. Ivańska, S. C. Shapiro (eds.), Natural Language Processing and Knowledge Representation: Language for Knowledge and Knowledge for Language, AAAI Press/The MIT Press, MA, U.S.A., 2000, pp. 175–195.
[22] L. A. Zadeh, A note on web intelligence, world knowledge and fuzzy logic, Data & Knowledge Engineering 50 (2004) 291–304.
[23] L. A. Zadeh, Precisiated natural language, AI Magazine 25 (2004) 74–91.