Some Applications of Non Clausal Deduction

University at Albany, State University of New York at Albany COLLEGE OF ARTS AND SCIENCES The Dissertation submitted by Anavai G. Ramesh under the title

SOME APPLICATIONS OF NON CLAUSAL DEDUCTION

has been read by the undersigned. It is hereby recommended for acceptance to the Faculty of the University in partial fulfillment of the requirement for the degree of Doctor of Philosophy.

(Neil V. Murray)

(Date)

(Richard E. Stearns)

(Date)

(Paliath Narendran)

(Date)

(Erik Rosenthal)

(Date)

Recommended by the Department of Computer Science. , Chair. (Signed) Recommendation accepted by the Dean of Graduate Studies for the Graduate Academic Council. (Signed) (Dated)

SOME APPLICATIONS OF NON CLAUSAL DEDUCTION

by

ANAVAI G. RAMESH

© Copyright, 1995

SOME APPLICATIONS OF NON CLAUSAL DEDUCTION

by

ANAVAI G. RAMESH

A Dissertation Submitted to the University at Albany, State University of New York at Albany in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

College of Arts and Sciences Department of Computer Science 1995

Abstract

In this thesis it is shown that by using negation normal form for representing propositional formulas, rather than clause forms such as conjunctive and disjunctive normal forms, reasoning systems that are more efficient for many classes of formulas can be built. This is due to the fact that the process of converting arbitrary propositional formulas into clause forms is an expensive computational task. Algorithms for two related problems in artificial intelligence, namely computing prime implicates and implicants, and computing minimal diagnoses, are developed and implemented. These algorithms use negation normal form for representing propositional formulas and are based on dissolution, an inference rule for negation normal form. Through theoretical and experimental analysis it is shown that these algorithms are superior to many clause-based algorithms. Anti-links are defined and certain operations based on them are introduced. By performing these operations, many non-prime implicants/implicates and many non-minimal diagnoses can be eliminated without doing expensive subsumption checks. Experimental results showing significant improvements obtained by using these operations are also given. An algorithm for computing prime implicants and implicates of multiple-valued logics is also developed. This algorithm is based on signed dissolution, an inference rule for multiple-valued logics.

Acknowledgments

I am deeply indebted to my thesis advisor, Professor Neil V. Murray, for suggesting the topic to me, and for providing suggestions and comments which were crucial in the development of this thesis. He also introduced me to the exciting topic of automated reasoning. His constructive criticism helped in improving my writing and presentation skills. I would also like to thank Professor Richard Stearns, Professor Paliath Narendran and Professor Erik Rosenthal of The University of New Haven for serving on the thesis committee. I would like to thank Dr. Reiner Hähnle and Bernhard Beckert of the University of Karlsruhe, Germany for providing some of the initial ideas in the thesis. I thank Professor S.S. Ravi for his friendship, encouragement and support. I thank the Ph.D. students of the computer science department for making my stay at Albany a pleasant one. In particular, I would like to thank George Becker, Madhav Marathe, Venkatesh Radhakrishnan, Tushar Saxena, and Sreenivas Rao. I thank Pat Keller and Joan Nelhaus for helping me with the administrative issues. Last but not least, I would like to thank my wife Shoba and my parents for all the support and encouragement they have given me over the years. I would like to dedicate this thesis to my son Varun, whose birth coincided with the completion of this thesis. This work was supported in part by National Science Foundation grants CCR-9101208 and CCR-9404338.

Contents

1 Introduction
2 Preliminaries
    2.1 The Language of Propositional Logic
        2.1.1 Syntax of Propositional Logic
    2.2 Semantics of Propositional Logic
    2.3 Normal Forms
    2.4 Semantic Graphs
        2.4.1 Subgraphs
        2.4.2 Blocks
    2.5 Path Dissolution
    2.6 Fully Dissolved Formulas
3 Generating Prime Implicates/Implicants
    3.1 Prime Implicates/Implicants
    3.2 PI - An Algorithm to Compute Prime Implicates
    3.3 Experiments on NNF Formulas
    3.4 Formulas Intractable for CNF/DNF Based Approaches
    3.5 Alternative Approaches
        3.5.1 Structure preserving transformations
        3.5.2 Using minterms as input
        3.5.3 Comparison with BDDs
    3.6 Other prime implicate/implicant generation algorithms
        3.6.1 Algorithms based on minterm representation
        3.6.2 Algorithms based on formulas
    3.7 Applications of prime implicate/implicant algorithms
        3.7.1 Knowledge compilation
        3.7.2 Assumption based truth maintenance systems
        3.7.3 Other applications
4 Eliminating Subsumed Paths Without Subsumption Checking
    4.1 Subsumed paths and Anti-links
    4.2 Redundant Anti-links
    4.3 An Anti-Link Operator
    4.4 Correctness of DADV
    4.5 Simplifications
    4.6 Disjunctive Anti-Links and Factoring
    4.7 Conjunctive Anti-Links
    4.8 Complexity Considerations
    4.9 Some Benchmark Examples
    4.10 A Generalized Purity Principle
    4.11 Strictly Pure Full Blocks
    4.12 Multi-Pure Full Blocks
    4.13 More Examples
    4.14 Experimental results
5 Computing diagnoses
    5.1 Definitions
    5.2 Extensions to PI algorithm
    5.3 A Method to Find Minimal Diagnoses
        5.3.1 Diagnoses and c-paths
        5.3.2 An Example
        5.3.3 Purity Reductions
        5.3.4 Single Faults
        5.3.5 Diagnosis for Horn formulas
    5.4 Experimental results
    5.5 Other algorithms for computing diagnoses
        5.5.1 Consistency based algorithms
        5.5.2 Abduction based diagnosis
6 Computing Prime Implicates/Implicants for Multiple Valued Logics
    6.1 Preliminaries
        6.1.1 Syntax of multiple valued logics
        6.1.2 Semantics of multiple-valued logics
        6.1.3 Signed formulas
        6.1.4 -atomic formulas
        6.1.5 Regular logics
        6.1.6 Adding negation to signed formulas
        6.1.7 Notation for signed formulas
        6.1.8 Signed path dissolution
    6.2 Prime Implicants/Implicates and Signed Formulas
        6.2.1 Definitions
        6.2.2 Prime S-Implicants of -atomic Formulas
        6.2.3 Foundations
    6.3 Post Logics
    6.4 Other algorithms for computing prime implicates/implicants of formulas of multiple valued logics
7 Future work
    7.1 Semi-Resolution
    7.2 Improved data structures
8 Summary

List of Figures

3.1 The PI algorithm
3.2 Kean and Tsiknis algorithm for computing prime implicates
3.3 MM Algorithm
4.1 Semantic graph of Ngair's formulas before and after dissolution.
4.2 Ngair's formulas after dissolution and factoring.
4.3 Semantic graph of Kean & Tsiknis's formulas.
4.4 The full dissolvent of Km2 (left) and of Kmn, n > 2 (right).
5.1 Inverter example
5.2 System description and observation for inverter example
6.1 Signed semantic graph
6.2 Subgraphs in signed formulas
6.3 A signed formula
6.4 Signed formulas and signed implicants
6.5 Signed full dissolvent (with respect to d-links)
7.1 A trie
7.2 A d-trie
7.3 A small d-trie for Sn
7.4 A large d-trie for Sn

List of Tables

3.1 Experimental results for small NNF formulas
3.2 Experimental results for large NNF formulas
4.1 Number of subsumption checks needed for Kmn by Cltms, Gen-Pi, and Pi + anti-link.
4.2 Results of applying anti-link operators.
5.1 Running times for n-bit adder problem
5.2 Running time for single fault diagnosis
6.1 Example steps in the computation of prime implicants

Chapter 1 Introduction

Propositional logic is widely used as a framework for knowledge representation in Artificial Intelligence (AI) applications. (Note that terms appearing in italics in the introduction are defined later in this thesis.) Though propositional logic is computationally weaker than first and higher order logic, it has been shown to be sufficient for many applications. Many problems in propositional logic are computationally hard. Some problems are NP-complete, many are believed to be even harder, and some are provably exponential [17]. Given the difficulty of such problems, the development of efficient implementations is desirable. One source of potential inefficiency is the use of clause forms, such as conjunctive or disjunctive normal form (CNF/DNF), for representing propositional logic formulas by almost all AI systems. Formulas not in clause form must first be converted to clause form before using these systems. Converting arbitrary propositional formulas into equivalent clause forms can be very inefficient. Many systems overcome this by using efficient transformations which, though sufficient for theorem proving, do not guarantee the preservation of equivalence [44]. However, for many applications equivalence has to be preserved by transformations.

In this thesis we show that by not requiring clause-based representations for formulas, it is possible to build reasoning systems which are more efficient on certain classes of formulas than the corresponding clause based implementations. To demonstrate this we have chosen two related problems in AI which require reasoning: computing prime implicates/implicants, and computing minimal diagnoses. Both these problems have been shown to be of exponential time complexity. Our algorithms are based on Negation Normal Form (NNF), which is a more natural and less restrictive representation than conjunctive and disjunctive normal forms. Arbitrary formulas in propositional logic using $\wedge$, $\vee$, $\neg$, and $\Rightarrow$ can be converted into equivalent formulas in NNF in linear time. Our algorithms use dissolution, an inference rule for NNF formulas introduced in [38], and the notion of paths in NNF formulas. Dissolution has many advantages over clause based inference rules: repeated application to an NNF formula is guaranteed to terminate, and checking for termination is easy.

The consequences of a propositional formula, expressed as minimal implied clauses, are the formula's prime implicates, whereas the minimal conjunctions that imply it are its prime implicants. Prime implicates and prime implicants are duals of each other. Prime implicates/implicants have been used in many reasoning tasks such as nonmonotonic systems [21, 46], clause management systems [51], and machine learning [53]. Prime implicant algorithms were first used in conjunction with the design of boolean circuits. These algorithms use a truth table representation of the propositional formula to compute prime implicants. However, in AI applications we deal with propositional formulas rather than truth tables. Converting a propositional formula into a truth table is an expensive computational step. Therefore these algorithms are of very little use in AI systems. Many other algorithms have been proposed to compute the prime implicates of propositional formulas. The algorithms in [9, 28, 27, 30, 57] require clause forms, whereas the algorithm in [43] requires the input to be a conjunction of DNF formulas. The algorithm from [6] assumes a binary decision diagram (BDD) as the input.
Our techniques have been implemented (footnote 1). We have experimented by running our algorithm on many well known benchmark problems, and comparing the running times with that of deKleer's system CLTMS [9], which is probably the fastest known implemented clause based prime implicate/implicant generating system. Our system performs consistently better than deKleer's system on almost all of these benchmark examples. In addition, we have also discovered classes of formulas for which we can prove that our techniques are very efficient but would require exponential time in any CNF/DNF-based technique. Ngair has also independently discovered similar examples [43].

Footnote 1: The implementation has been developed in C/C++ and runs on a SPARC 10. The software is available for use via anonymous ftp at ftp://ftp.cs.albany.edu/nvm/pi.

The bottlenecks in any prime implicant/implicate generating system are the subsumption (minimality) checking routines; subsumption is performed every time a potential prime implicate/implicant is found, and between all currently computed potential prime implicates. Since the number of prime implicates/implicants is in general exponential in the size of the input formulas, performing subsumption checks is expensive. One way to speed up these routines is to use efficient data structures in their implementation. The data structure called discriminated tree (or trie), which was originally developed for performing dictionary search [32], has been very effectively used in building efficient prime implicate algorithms [9]. We make use of tries in our implementation also.

Another way to improve the performance of prime implicate/implicant algorithms is to reduce the number of subsumption checks to be done, thereby reducing the number of times the subsumption routines are called. We do that by restructuring the formulas so that many non-prime implicates are removed without doing subsumption checks at all. We define disjunctive and conjunctive anti-links (footnote 2) in NNF formulas, and we identify operations to remove such anti-links and their associated subsumed paths (potential implicates/implicants). This leaves fewer subsumption checks to be done later. Anti-link operations, when performed on NNF formulas, result in equivalent NNF formulas. Anti-link operations can also be applied to CNF (or DNF) formulas; however, their application may not result in CNF (or DNF) formulas. Although we use prime implicate/implicant generation as an example to demonstrate the utility of anti-link operations, they can be employed in any application that requires eliminating subsumed paths in an NNF formula. We have successfully used anti-link operations in our NNF based diagnosis system also (see Chapter 5).

Footnote 2: Anti-links and some associated operators were first proposed by Beckert and Hähnle (personal communication). The first motivation for studying anti-links arose in connection with regular clausal tableau calculi [33]. The anti-link rule as it will be defined later can be viewed as an implementation of the regularity condition in [33] for the propositional non-clausal case (Letz et al. considered the first-order clausal case). There, refinements of general inference rules are considered, whereas the anti-link rule is used as a preprocessing step here.

Next, we investigate the adaptability of prime implicant/implicate generating algorithms to multiple-valued logics (MVLs). MVLs are used in many AI applications such as nonmonotonic reasoning [20, 13] and hardware verification [25]. Our method is based on signed dissolution, an extension of dissolution to signed formulas, a classical logic that serves as a meta-logic for multiple-valued logics. Signed formulas capture, at a meta-level, queries about formulas of multiple-valued logics. The answer to such a query is either yes or no: yes if the corresponding signed formula is satisfiable, no if not. Satisfiability for signed formulas is defined in the classical way, but under certain restrictions allowing only those assignments that correspond to "reasonable" assignments over the multiple-valued language. Depending upon the deductive machinery employed, affirmative answers to such queries may be accompanied by a representation of those interpretations under which the query comes out true. It is this feature that is crucial for producing prime implicants/implicates of multiple valued logics. We first define the notion of prime s-implicant and prime s-implicate with respect to signed formulas that are normalized in a way analogous to NNF. These are generalizations to signed formulas of the classical notions of prime implicant/implicate.
Then, to find the prime implicants/implicates of multiple-valued logic formulas, we first form the corresponding signed formulas, find the prime s-implicants/s-implicates of these signed formulas, and finally translate them to prime implicants/implicates of the multiple-valued function. Our method therefore provides a way of finding prime implicants or prime implicates that is quite independent of the particular MVL employed (within the class of MVLs called regular logics). Many algorithms have been developed for computing prime implicates/implicants for multiple-valued logics [1, 26, 59] in which the formulas are represented as truth tables; as in the two-valued case, such a representation is of little interest to the AI community.

The next problem we address in this thesis is the problem of computing all minimal diagnoses in diagnosis from first principles. Diagnosis from first principles is a formal technique invented by Reiter [50] based on the informal methods of deKleer [11] and Genesereth [19]. In diagnosis from first principles, we are given a logic-based (footnote 3) description of some system (e.g., a circuit) and an observation of the system's behavior. We then try to find a set of components in the system which, when assumed to be abnormal, explains the discrepancy between the intended behavior of the system and the observation. Usually the number of diagnoses will be exponential in the total number of components. Therefore we attempt to compute only the set of minimal diagnoses, which is a much smaller set. Diagnosis systems have been used in digital and analog circuit troubleshooting [11, 19]. Recently they have also been applied to the automated debugging of programs written in PROLOG [5]. All currently known algorithms generate minimal conflicts as an intermediate step; the minimal diagnoses are then computed from them. Generating minimal conflicts can result in poor performance of the diagnosis algorithms. Our algorithm does not generate minimal conflicts. Anti-link operations developed for computing prime implicates/implicants are used to eliminate many non-minimal diagnoses without any checks for subsumption. In many situations one is only interested in single fault diagnoses (diagnoses where only one component is faulty). Our algorithm can be easily modified so as to eliminate multiple fault diagnoses as early as possible, thereby improving the performance of the algorithm.
We achieve this by identifying parts of the formulas which can never lead to single fault diagnoses, and then eliminating such parts of the formula. We have implemented (footnote 4) both the all-fault and single-fault diagnosis algorithms. We employ commonly used benchmarks to compare our system with some of the most recently published diagnosis systems. We show that there seems to be no advantage in using a clause based representation.

Footnote 3: In principle any form of logic (propositional, first order, default logic, etc.) can be used depending on the application. However, we assume the use of propositional logic only.

Footnote 4: The software is also available through anonymous ftp at ftp://ftp.cs.albany.edu/nvm/diag.

In Chapter 2 we give the background necessary to understand this thesis. We give a description of semantic graphs, a graphical notation used for representing NNF formulas. A description of dissolution and some of its basic properties are also given. In Chapter 3 we give a description of our algorithms for computing prime implicants/implicates and prove their correctness. Some applications of prime implicate/implicant computations are also discussed. In Chapter 4 we discuss techniques to eliminate subsumption checks based on anti-links. Complexity results for identifying anti-links and for eliminating subsumed paths are also derived. We then give some experimental results on the speed-up resulting from anti-link removal. Techniques to compute prime implicants/implicates for multiple valued logics are given in Chapter 6. These are extensions of the algorithms developed in Chapter 3 to signed formulas. This chapter also contains some basic definitions of signed formulas and signed dissolution. In Chapter 5 we describe the algorithm to compute minimal diagnoses and single fault diagnoses, and prove its correctness. We then give some experimental results comparing our algorithm with some of the recently developed algorithms. In Chapter 7 we suggest two possible improvements which can potentially speed up the computation of prime implicate/implicant and minimal diagnosis algorithms.

Chapter 2 Preliminaries

This chapter contains a brief review of propositional logic and path dissolution. Path dissolution was developed by Murray and Rosenthal [40] as an inference rule for propositional two-valued logic. It was then extended to multiple-valued logics using the notion of signed formulas. The definitions and notation used in this section are taken from [55, 37, 40]. In this thesis we deal with formulas in propositional logic only.

2.1 The Language of Propositional Logic

A language consists of a set of symbols, and a set of rules describing the set of strings which are legal sentences. The set of symbols and the set of rules form the syntax of the language. The meaning associated with the legal strings in the language is the semantics of the language.

2.1.1 Syntax of Propositional Logic

The legal sentences of propositional logic are referred to as well formed formulas (or simply as formulas).

The symbols used in constructing propositional formulas are:

Logical constants: true, false
Propositional symbols: $A_1, A_2, \ldots$
Propositional connectives: $\vee$, $\wedge$, $\neg$

The following rules are used to construct propositional formulas.

1. All constants and propositional symbols are formulas.
2. If $F$ is a formula then $\neg F$ is also a formula.
3. If $F_1$ and $F_2$ are formulas then $F_1 \wedge F_2$ and $F_1 \vee F_2$ are formulas.

An atom is either a logical constant or a propositional symbol. A literal is either an atom or a negated atom (i.e., is of the form $\neg A$ for some atom $A$).
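As an illustration only, the formation rules above can be mirrored by a small abstract syntax tree. The sketch below is in Python; the class names (`Const`, `Sym`, `Not`, `And`, `Or`) and the helpers `is_atom` and `is_literal` are hypothetical names chosen for this example and do not come from the thesis or its implementation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    value: bool          # the logical constants true / false


@dataclass(frozen=True)
class Sym:
    name: str            # a propositional symbol such as "A1"


@dataclass(frozen=True)
class Not:
    arg: "Formula"       # rule 2: if F is a formula, so is its negation


@dataclass(frozen=True)
class And:
    left: "Formula"      # rule 3: conjunction of two formulas
    right: "Formula"


@dataclass(frozen=True)
class Or:
    left: "Formula"      # rule 3: disjunction of two formulas
    right: "Formula"


Formula = Union[Const, Sym, Not, And, Or]


def is_atom(f) -> bool:
    """An atom is a logical constant or a propositional symbol."""
    return isinstance(f, (Const, Sym))


def is_literal(f) -> bool:
    """A literal is an atom or a negated atom."""
    return is_atom(f) or (isinstance(f, Not) and is_atom(f.arg))
```

With this encoding, `Not(Sym("A1"))` is a literal, while `Not(Not(Sym("A1")))` is a formula but not a literal.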

2.2 Semantics of Propositional Logic

An interpretation is a mapping from propositional symbols to the set {true, false}. Note that the meaning of true and false here is different from that of the constants used in building propositional formulas. Let $I$ be an interpretation. $I$ can be extended to $I'$, a mapping from propositional formulas to the set {true, false}, as follows.

1. $I'(\mathit{true}) = \mathit{true}$ and $I'(\mathit{false}) = \mathit{false}$
2. $I'(A) = I(A)$ for all atoms $A$
3. $I'(\neg F) = \begin{cases} \mathit{true} & \text{if } I'(F) = \mathit{false} \\ \mathit{false} & \text{otherwise} \end{cases}$
4. $I'(F_1 \vee F_2) = \begin{cases} \mathit{true} & \text{if } I'(F_1) = \mathit{true} \text{ or } I'(F_2) = \mathit{true} \\ \mathit{false} & \text{otherwise} \end{cases}$
5. $I'(F_1 \wedge F_2) = \begin{cases} \mathit{true} & \text{if } I'(F_1) = \mathit{true} \text{ and } I'(F_2) = \mathit{true} \\ \mathit{false} & \text{otherwise} \end{cases}$

An interpretation $I$ satisfies (falsifies) a formula $F$ if $I'(F) = \mathit{true}$ ($\mathit{false}$). A formula $F$ is valid (contradictory) if every interpretation satisfies (falsifies) it. A formula $F_1$ entails another formula $F_2$ (i.e., $F_1 \models F_2$) if the formula $\neg F_1 \vee F_2$ is valid. Two formulas $F_1$ and $F_2$ are equivalent (i.e., $F_1 \equiv F_2$) if $F_1 \models F_2$ and $F_2 \models F_1$.

We use the abbreviation $\bigvee_{i=1}^{n} F_i$ for $F_1 \vee (F_2 \vee \cdots (F_{n-1} \vee F_n))$ and $\bigwedge_{i=1}^{n} F_i$ for $F_1 \wedge (F_2 \wedge \cdots (F_{n-1} \wedge F_n))$.
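The extension of an interpretation to whole formulas, and the resulting notions of validity, $\models$, and equivalence, can be sketched by brute-force enumeration of interpretations. This is a minimal Python illustration, not the thesis's implementation; the nested-tuple encoding and the function names (`evaluate`, `symbols`, `is_valid`, `entails`, `equivalent`) are assumptions made for this example. Enumeration takes time exponential in the number of symbols, so it is only suitable for small formulas.

```python
from itertools import product

# Hypothetical encoding for this sketch:
#   True / False, a symbol name such as "A",
#   ("not", F), ("and", F1, F2), ("or", F1, F2)


def evaluate(f, interp):
    """Extend an interpretation (dict: symbol -> bool) to a whole formula."""
    if isinstance(f, bool):
        return f
    if isinstance(f, str):
        return interp[f]
    op = f[0]
    if op == "not":
        return not evaluate(f[1], interp)
    if op == "and":
        return evaluate(f[1], interp) and evaluate(f[2], interp)
    if op == "or":
        return evaluate(f[1], interp) or evaluate(f[2], interp)
    raise ValueError(f"unknown connective: {op!r}")


def symbols(f):
    """Set of propositional symbols occurring in f."""
    if isinstance(f, bool):
        return set()
    if isinstance(f, str):
        return {f}
    return set().union(*(symbols(g) for g in f[1:]))


def is_valid(f):
    """True if every interpretation of the symbols of f satisfies f."""
    syms = sorted(symbols(f))
    return all(evaluate(f, dict(zip(syms, vals)))
               for vals in product([False, True], repeat=len(syms)))


def entails(f1, f2):
    """f1 |= f2, checked as validity of (not f1) or f2."""
    return is_valid(("or", ("not", f1), f2))


def equivalent(f1, f2):
    return entails(f1, f2) and entails(f2, f1)
```

For instance, `entails(("and", "A", "B"), "A")` holds, whereas `entails("A", ("and", "A", "B"))` does not.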

2.3 Normal Forms

Restricting the syntactic structure of propositional formulas results in various normal forms. The most common normal forms used in the literature are Negation Normal Form (NNF), Conjunctive Normal Form (CNF), and Disjunctive Normal Form (DNF).

A formula in NNF contains only the connectives $\wedge$, $\vee$, and $\neg$, such that the $\neg$ connective appears only immediately preceding atomic formulas. An arbitrary formula can be converted to an equivalent NNF formula in linear time using the following equivalences.

1. $\neg(F_1 \wedge F_2) \equiv \neg F_1 \vee \neg F_2$
2. $\neg(F_1 \vee F_2) \equiv \neg F_1 \wedge \neg F_2$
3. $\neg\neg F \equiv F$

A formula $F$ is in CNF if it is a conjunction of disjunctions of literals, i.e., $F = \bigwedge_{i=1}^{n} \bigl( \bigvee_{j=1}^{m_i} L_{i,j} \bigr)$, where each $L_{i,j}$ is a literal.

A formula $F$ is in DNF if it is a disjunction of conjunctions of literals, i.e., $F = \bigvee_{i=1}^{n} \bigl( \bigwedge_{j=1}^{m_i} L_{i,j} \bigr)$, where each $L_{i,j}$ is a literal.

A given formula in NNF can be converted to an equivalent formula in CNF or in DNF using the following equivalences (distributive laws).

1. $F_1 \vee (F_2 \wedge F_3) \equiv (F_1 \vee F_2) \wedge (F_1 \vee F_3)$
2. $F_1 \wedge (F_2 \vee F_3) \equiv (F_1 \wedge F_2) \vee (F_1 \wedge F_3)$

Converting NNF formulas to equivalent CNF/DNF can result in an exponential increase in size.

A CNF formula $F$ is a Horn formula if every disjunction contains at most one positive literal. Horn formulas are weaker than formulas in NNF, CNF, or DNF, i.e., there are formulas for which there exists no equivalent Horn formula. For example, the formula $A \vee B$ has no Horn equivalent.
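The following Python sketch illustrates both conversions under a hypothetical nested-tuple encoding (symbol names as strings, `("not", F)`, `("and", F1, F2)`, `("or", F1, F2)`). The function `to_nnf` pushes negations to the atoms in one pass, and `cnf_clauses` applies the first distributive law. The helper `pairs` builds the family $(A_1 \wedge B_1) \vee \cdots \vee (A_n \wedge B_n)$, chosen here as one example of a formula whose CNF obtained by distribution has $2^n$ clauses. All names are assumptions made for this illustration.

```python
def to_nnf(f, negate=False):
    """Push negations down to the atoms in a single pass over the formula."""
    if isinstance(f, str):
        return ("not", f) if negate else f
    op = f[0]
    if op == "not":
        # Flipping the flag also removes double negations.
        return to_nnf(f[1], not negate)
    if negate:
        # De Morgan: a negated conjunction becomes a disjunction and vice versa.
        op = "or" if op == "and" else "and"
    return (op, to_nnf(f[1], negate), to_nnf(f[2], negate))


def cnf_clauses(f):
    """Clauses (frozensets of literals) of an NNF formula, by distribution."""
    if isinstance(f, str) or f[0] == "not":
        return {frozenset([f])}
    left, right = cnf_clauses(f[1]), cnf_clauses(f[2])
    if f[0] == "and":
        return left | right
    # Disjunction: distribute, pairing every left clause with every right clause.
    return {c | d for c in left for d in right}


def pairs(n):
    """(A1 and B1) or ... or (An and Bn): an NNF formula with 2n literals."""
    f = ("and", "A1", "B1")
    for i in range(2, n + 1):
        f = ("or", f, ("and", f"A{i}", f"B{i}"))
    return f


for n in (2, 4, 8):
    print(n, len(cnf_clauses(pairs(n))))   # clause count doubles with each pair
```

Each clause produced for `pairs(n)` picks one literal from every conjunct, which is why the count doubles whenever a conjunct is added while the NNF formula itself grows only linearly.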

2.4 Semantic Graphs

The notion of semantic graphs was first introduced by Murray and Rosenthal in [40]. Semantic graphs are a graphical representation of NNF formulas:

Definition 1 A semantic graph consists either of

1. one of the constants true and false,
2. a literal $A$ or $\overline{A}$,
3. a c-arc, which is a conjunction of two semantic graphs, or
4. a d-arc, which is a disjunction of two semantic graphs.

We use the notation $(X, Y)_c$ for the c-arc from $X$ to $Y$ and similarly use $(X, Y)_d$ for a d-arc; the subscript may be omitted when no confusion is possible.

Each semantic graph used in the construction of a semantic graph $G$ is called an explicit subgraph of $G$. If $G = (X, Y)_c$, then $X$ (resp. $Y$) is a fundamental subgraph of $G$ if $X$ ($Y$) is not a c-arc; otherwise the fundamental subgraphs of $X$ ($Y$) are fundamental subgraphs of $G$. Similarly, if $G = (X, Y)_d$, then $X$ ($Y$) is a fundamental subgraph of $G$ if $X$ ($Y$) is not a d-arc; otherwise the fundamental subgraphs of $X$ ($Y$) are fundamental subgraphs of $G$. The set of nodes of a semantic graph $G$ consists of all literal occurrences used in its construction; the same holds for the set of c-arcs of $G$ and the set of d-arcs of $G$; i.e., these sets include the nodes, c-arcs, and d-arcs, respectively, occurring in the explicit subgraphs of $G$.

In the following, we identify a semantic graph $G$ and the formula it represents (footnote 1); essentially, the only difference between the semantic graph and the formula is the point of view, and we will use either term depending upon the desired emphasis. For a more detailed exposition, see [40]. In addition, we identify a semantic graph and the triple consisting of its set of nodes, its set of c-arcs, and its set of d-arcs. The only exceptions, where this is not possible, are the semantic graphs true and false (both correspond to $(\emptyset, \emptyset, \emptyset)$). Note, however, that when a semantic graph contains occurrences of true and false, the obvious truth-functional reductions apply. Unless otherwise stated, we will assume that semantic graphs are automatically so reduced. In pictorial representations, c-arcs and d-arcs are indicated by the usual symbols for conjunction and disjunction; the arguments of a c-arc are placed vertically above each other, the arguments of a d-arc horizontally beside each other.

Footnote 1: true, false, and positive literals represent themselves; a negative literal $\overline{A}$ represents $\neg A$; $(X, Y)_c$ represents $X \wedge Y$; and $(X, Y)_d$ represents $X \vee Y$.

Example 1 Below, the formula

\[
G = (X \wedge Y) = ((\neg C \wedge A) \vee D \vee E) \wedge (\neg A \vee (B \wedge C)) \tag{2.1}
\]

is displayed as a semantic graph:

\[
\begin{array}{rc}
X & \boxed{\;\boxed{\begin{array}{c} \overline{C} \\ \wedge \\ A \end{array}} \;\vee\; D \;\vee\; E\;} \\
 & \wedge \\
Y & \boxed{\;\overline{A} \;\vee\; \boxed{\begin{array}{c} B \\ \wedge \\ C \end{array}}\;}
\end{array}
\tag{2.2}
\]

The boxes in (2.2) show the explicit subgraphs used in the construction of the semantic graph (since c-arcs and d-arcs are associative and commutative, we do not show the explicit subgraphs in subsequent pictorial representations).

Definition 2 If $A$ and $B$ are nodes in a graph, and if $(X, Y)_\alpha$ is an arc ($\alpha = c$ or $\alpha = d$) with $A$ in $X$ and $B$ in $Y$, we say that $(X, Y)_\alpha$ is the arc connecting $A$ and $B$, and that $A$ and $B$ are $\alpha$-connected.

Example 2 In (2.2), $C$ is c-connected to each of $B$, $A$, $\overline{C}$, $D$, and $E$, and is d-connected to $\overline{A}$.

Definition 3 Let G be a semantic graph. A partial c-path through G is a set of nodes such that any two are c-connected, and a c-path through G is a partial c-path that is not properly contained in any partial c-path. (Partial) d-paths are defined accordingly using d-arcs instead of c-arcs.

$\ell(p)$ denotes the set of literals of a path p.


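Example 3 Below, the semantic graph (2.2) is shown with lines indicating its c-paths (on the left) and its d-paths (on the right):

[Figure: two copies of the semantic graph (2.2), with lines tracing the c-paths (left) and the d-paths (right).]

The c-paths are $\{\overline{C}, A, \overline{A}\}$, $\{\overline{C}, A, B, C\}$, $\{D, \overline{A}\}$, $\{D, B, C\}$, $\{E, \overline{A}\}$, $\{E, B, C\}$; the d-paths are $\{\overline{C}, D, E\}$, $\{A, D, E\}$, $\{\overline{A}, B\}$, $\{\overline{A}, C\}$.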

The following lemma is obvious.

Lemma 2.4.1 Let G be a semantic graph. Then an interpretation I satisfies (falsifies) G iff I satisfies (falsifies) every literal on some c-path (d-path) through G.
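As an illustration of Lemma 2.4.1, the following sketch enumerates the literal sets of the c-paths and d-paths of formula (2.1) and checks the lemma by brute force. The tuple encoding of graphs, the "-" prefix for negative literals, and all function names are assumptions made for this example only; they are not taken from the implementation described in this thesis.

```python
from itertools import product

# Hypothetical encoding: a literal is a string ("A", or "-A" for its negation);
# ("and", g1, g2, ...) is a c-arc and ("or", g1, g2, ...) is a d-arc.
def paths(g, through):
    """Literal sets of the c-paths (through="and") or d-paths (through="or") of g."""
    if isinstance(g, str):
        return [frozenset([g])]
    sub = [paths(a, through) for a in g[1:]]
    if g[0] == through:
        # a path through this arc takes one path from every argument
        return [frozenset().union(*combo) for combo in product(*sub)]
    # otherwise a path stays inside a single argument
    return [p for ps in sub for p in ps]

def holds(lit, interp):
    """Truth value of a literal under an interpretation (dict name -> bool)."""
    return not interp[lit[1:]] if lit.startswith("-") else interp[lit]

def evaluate(g, interp):
    if isinstance(g, str):
        return holds(g, interp)
    vals = [evaluate(a, interp) for a in g[1:]]
    return all(vals) if g[0] == "and" else any(vals)

# Formula (2.1): ((-C & A) | D | E) & (-A | (B & C))
G = ("and", ("or", ("and", "-C", "A"), "D", "E"),
            ("or", "-A", ("and", "B", "C")))
c_paths = paths(G, "and")
d_paths = paths(G, "or")

# Lemma 2.4.1, checked over all 32 interpretations of A..E
names = ["A", "B", "C", "D", "E"]
for bits in product([False, True], repeat=len(names)):
    interp = dict(zip(names, bits))
    satisfied = any(all(holds(l, interp) for l in p) for p in c_paths)
    falsified = any(all(not holds(l, interp) for l in p) for p in d_paths)
    assert evaluate(G, interp) == satisfied
    assert (not evaluate(G, interp)) == falsified
```

Paths are represented here by their literal sets $\ell(p)$ rather than by sets of occurrences, which is sufficient for this check because no literal occurs twice in (2.1).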

2.4.1 Subgraphs

We will frequently find it useful to consider subgraphs of a semantic graph that are not explicit.

Definition 4 Given a semantic graph G and a non-empty subset N of the nodes of G, the subgraph of G that corresponds to N is that part of G that consists of nodes from N, where the logical structure of that part is preserved.

$G - N$ denotes the subgraph of G corresponding to the set of nodes of G that are not in N. Two subgraphs H and H′ of G meet each other if they have nodes in common.


A non-empty subset N of nodes corresponds unambiguously to one subgraph of G. The empty set corresponds to both true and false; true and false are subgraphs of all semantic graphs. For a more precise definition of subgraphs, see [40].

Example 4 Below, the subgraph of (2.2) that corresponds to the node set $\{A, D, \overline{A}\}$ is shown:

$$\begin{array}{c} A \vee D \\ \wedge \\ \overline{A} \end{array}$$

2.4.2 Blocks

The most important subgraphs are the blocks:

Definition 5 A c-block H is a subgraph of a semantic graph G with the property that any c-path p that includes at least one node from H passes through H, where p passes through H iff the subset of p consisting of nodes of H is a c-path through H; d-blocks are similarly defined using d-paths.

Example 5 In (2.2), the subgraph corresponding to the node set $\{A, D, E, \overline{A}, C\}$ is a c-block. However, it is not a d-block, since the d-path $\{\overline{A}, B\}$ restricted to the subgraph is $\{\overline{A}\}$, which is a proper sub-path of $\{\overline{A}, C\}$ in the subgraph.

Definition 6 A full block is a subgraph that is both a c-block and a d-block.

One way to envision a full block is to consider conjunction and disjunction as n-ary connectives. Then a full block is a subset of the arguments of one connective, i.e., of one explicit subformula. Full blocks may be treated as essentially explicit subgraphs (up to the order of arguments), and the Isomorphism Theorem from [37] assures us that they are the only structures that may be so treated.

Example 6 In (2.2), the subgraph corresponding to $\{\overline{C}, A, E\}$ is a full block. It can be written as $(\{\overline{C}, A\}, E)_d$; i.e., we can regard the upper part of the graph as $(\{\overline{C}, A, E\}, D)_d$. The fundamental subgraphs of the upper disjunction are $(\{\overline{C}\}, \{A\})_c$ and the literals D and E.

Definition 7 Let H be a full block; H is a conjunction or a disjunction of fundamental subgraphs of some explicit subgraph M. If the final arc of M is a conjunction, then we define the c-extension of H to be M and the d-extension of H to be H itself. The situation is reversed if the final arc of M is a d-arc.

We use the notation CE (H ) and DE (H ) for the c- and d-extensions, respectively, of H .

Example 7 In (2.2), $CE(\overline{A}) = \overline{A}$ and $DE(\overline{A}) = \overline{A} \vee (B \wedge C)$.

In this thesis, we compute c- and d-extensions of single nodes only. Single nodes are always full blocks and so testing for this property will be unnecessary. If we assume that formulas are represented as n-ary trees, computing these extensions can be done in constant time; we merely determine whether the given node's parent is a conjunction or a disjunction, and the appropriate extension is then either the node itself or the parent.

2.5 Path Dissolution

Path dissolution [40] is an inferencing mechanism for classical logic that has several interesting properties. It is an efficient generalization of the method of analytic tableaux [16], is strongly complete in the propositional case, and can produce a list of satisfying interpretations of a formula. The latter feature is particularly valuable in this or in any setting in which one wishes to make use of satisfying interpretations rather than merely to determine whether any exist.

Path dissolution works by selecting a link and restructuring the formula so that all paths through the link are eliminated. The nature of the restructuring is such that one cannot rely on CNF: Even if a formula starts out in CNF, a single dissolution step produces an unnormalized formula, unlike resolution [52]. One consequence of eliminating all paths through a link is strong completeness: Any sequence of dissolution steps will eventually create a linkless formula. The paths that remain may be interpreted as models (satisfying interpretations) of the formula.

Definition 8 A c-link is a complementary pair of c-connected nodes; d-connected complementary nodes form a d-link.

Unless stated otherwise, we use the term link to refer to a c-link. Path dissolution is in general applicable to collections of links; here we restrict attention to single links.

Example 8 Consider the link $\{A, \overline{A}\}$ in (2.2). Then the entire graph $G = (X \wedge Y)$ is the smallest full block containing the link.

Definition 9 Let X be a semantic graph and H an arbitrary subgraph.² The c-path complement of H with respect to X, written CC(H, X), is the subgraph of X consisting of all literals in X that lie on c-paths that do not contain nodes from H. If no such literal exists, CC(H, X) = false. The c-path extension of H with respect to X, written CPE(H, X), is the subgraph of X containing all literals that lie on c-paths that pass through H. If no such literal exists, CPE(H, X) = false.³

² H usually is, but does not have to be, a subgraph of X.

³ Note that CPE has two arguments whereas CE (Def. 7) has but one; intuitively, CE has an implicit second argument that is always the entire graph in which the explicit argument occurs.

In the development of anti-link operations, we will use operations that are the duals of CC and CPE. We use DC for the d-path complement and DPE for the d-path extension operators. Their definitions are straightforward by duality.

Example 9 In (2.2),

$$
\begin{aligned}
CC(A, X) &= (D \vee E) \\
CPE(A, X) &= (\overline{C} \wedge A) \\
CPE(A, G) &= (\overline{C} \wedge A \wedge Y) \\
CE(A) &= (\overline{C} \wedge A)
\end{aligned}
$$

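Proposition 1 Let G be a semantic graph and H an arbitrary subgraph. Then

$$
CPE(H, G) =
\begin{cases}
\text{false} & \text{if } H \text{ does not meet } G \\
\bigvee_{i=1}^{n} CPE(H_{F_i}, F_i) & \text{if the final arc of } G \text{ is a d-arc} \\
\bigwedge_{i=1}^{k} CPE(H_{F_i}, F_i) \wedge \bigwedge_{j=k+1}^{n} F_j & \text{if the final arc of } G \text{ is a c-arc}
\end{cases}
$$

$$
DPE(H, G) =
\begin{cases}
\text{true} & \text{if } H \text{ does not meet } G \\
\bigwedge_{i=1}^{n} DPE(H_{F_i}, F_i) & \text{if the final arc of } G \text{ is a c-arc} \\
\bigvee_{i=1}^{k} DPE(H_{F_i}, F_i) \vee \bigvee_{j=k+1}^{n} F_j & \text{if the final arc of } G \text{ is a d-arc}
\end{cases}
$$

$$
CC(H, G) =
\begin{cases}
G & \text{if } H \text{ does not meet } G \\
\text{false} & \text{if } H = G \\
\bigvee_{i=1}^{n} CC(H_{F_i}, F_i) & \text{if the final arc of } G \text{ is a d-arc} \\
\bigwedge_{i=1}^{k} CC(H_{F_i}, F_i) \wedge \bigwedge_{j=k+1}^{n} F_j & \text{if the final arc of } G \text{ is a c-arc}
\end{cases}
$$

$$
DC(H, G) =
\begin{cases}
G & \text{if } H \text{ does not meet } G \\
\text{true} & \text{if } H = G \\
\bigwedge_{i=1}^{n} DC(H_{F_i}, F_i) & \text{if the final arc of } G \text{ is a c-arc} \\
\bigvee_{i=1}^{k} DC(H_{F_i}, F_i) \vee \bigvee_{j=k+1}^{n} F_j & \text{if the final arc of } G \text{ is a d-arc}
\end{cases}
$$

where $F_i$ ($1 \leq i \leq k$) are the fundamental subgraphs of G that meet H, $F_i$ ($k + 1 \leq i \leq n$) are those that do not, and $H_{F_i}$ is the part of H that lies in $F_i$.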
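The recursion of Proposition 1 for CC and CPE can be sketched as follows. This is a minimal illustration under stated assumptions: graphs are nested tuples, a subgraph H is given as a set of literal names, every literal is assumed to occur at most once in the graph (so a name identifies a node), and Python's True and False stand for the graphs true and false. None of these names or conventions come from the thesis.

```python
def nodes(g):
    """Set of literals (nodes) of a graph given as a string or an ("and"/"or", ...) tuple."""
    return {g} if isinstance(g, str) else set().union(*[nodes(f) for f in g[1:]])

def mk(op, args):
    """Build an n-ary arc, applying the obvious truth-functional reductions."""
    absorbing, neutral = (False, True) if op == "and" else (True, False)
    args = [a for a in args if a is not neutral]
    if any(a is absorbing for a in args):
        return absorbing
    if not args:
        return neutral
    return args[0] if len(args) == 1 else (op, *args)

def cpe(h, g):
    """CPE(H, G): the part of G on c-paths that pass through H."""
    if not (h & nodes(g)):
        return False                      # H does not meet G
    if isinstance(g, str):
        return g                          # G is a single node of H
    if g[0] == "or":
        return mk("or", [cpe(h, f) for f in g[1:]])
    # c-arc: arguments that do not meet H are kept unchanged
    return mk("and", [cpe(h, f) if h & nodes(f) else f for f in g[1:]])

def cc(h, g):
    """CC(H, G): the part of G on c-paths that contain no node of H."""
    if not (h & nodes(g)):
        return g                          # H does not meet G
    if isinstance(g, str):
        return False                      # H = G
    # cc returns an argument unchanged when H does not meet it,
    # so both arc types reduce to a recursive call on every argument
    return mk(g[0], [cc(h, f) for f in g[1:]])

# Subgraphs X and Y of the semantic graph (2.2)
X = ("or", ("and", "-C", "A"), "D", "E")
Y = ("or", "-A", ("and", "B", "C"))
G = ("and", X, Y)
```

With these definitions, `cc({"A"}, X)` yields `("or", "D", "E")` and `cpe({"A"}, X)` yields `("and", "-C", "A")`, in agreement with Example 9; `cpe({"A"}, G)` yields the same conjunction nested with Y.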


Remark 1 If K is a node in a graph G, then CC(K, G) can be obtained by deleting CE(K), the c-extension of K, in G, i.e., by replacing it by false and then applying the appropriate reductions to the graph.

If K is a node in a graph G, then DC(K, G) can be obtained by deleting DE(K), the d-extension of K, in G, i.e., by replacing it by true and then applying the appropriate reductions to the graph.

The reader is referred to [40] for the proofs of the lemmas below.

Lemma 2.5.1 Let H be an arbitrary subgraph of G. The c-paths of CPE (H; G) are precisely the c-paths of G that pass through H .

Corollary 2.5.2 CPE (H; G) is exactly the subgraph of G relative to the set of nodes that lie on c-paths that pass through H .

Lemma 2.5.3 Let H be an arbitrary subgraph of G. The c-paths of CC (H; G) are precisely the c-paths of G that do not pass through H .

Corollary 1 CC (H; G) is exactly the subgraph of G relative to the set of nodes that lie on c-paths that do not pass through H .

Lemma 2.5.4 If H is a c-block, then CC (H; G) _ CPE (H; G) and G have the same c-paths.

The above lemmas and corollaries about CC and CPE all hold in dual form for DC and DPE. Suppose that we have literal occurrences A and $\overline{A}$ residing in conjoined subgraphs X and Y, respectively. It is intuitively clear that the c-paths through $(X \wedge Y)$ that do not contain the link $\{A, \overline{A}\}$ are those through $(CPE(A, X) \wedge CC(\overline{A}, Y))$ plus those through $(CC(A, X) \wedge CPE(\overline{A}, Y))$ plus those through $(CC(A, X) \wedge CC(\overline{A}, Y))$.

Definition 10 Let $H = \{A, \overline{A}\}$ be a link, and let $M = (X, Y)_c$ be the smallest full block containing H. DV(H, M), the dissolvent of H in M, is defined as follows:

If H is a single c-block, then $DV(H, M) = CC(A, M) = CC(\overline{A}, M) = \text{false}$. Otherwise (i.e., if H consists of two c-blocks),

$$
DV(H, M) = (CPE(A, X) \wedge CC(\overline{A}, Y)) \;\vee\; (CC(A, X) \wedge CPE(\overline{A}, Y)) \;\vee\; (CC(A, X) \wedge CC(\overline{A}, Y))
$$

The only way that H can be a single c-block is if H is a full block (it is trivially a d-block). In that case, H = M, and A and $\overline{A}$ must be (up to commutations and reassociations) arguments of the same conjunction. The following proposition follows from the corollaries and Lemma 2.5.4:

Proposition 2 Either of the two more compact graphs shown below has the same c-paths as DV(H, M), and may thus be used instead:

$$(X \wedge CC(\overline{A}, Y)) \;\vee\; (CC(A, X) \wedge CPE(\overline{A}, Y)) \tag{2.3}$$

$$(CC(A, X) \wedge Y) \;\vee\; (CPE(A, X) \wedge CC(\overline{A}, Y)) \tag{2.4}$$

The semantic graphs from the above proposition are not identical to DV(H, M) as graphs, but they do have identical c-paths: all those of the original full block M except those of $CPE(A, X) \wedge CPE(\overline{A}, Y)$, i.e., except those through the link.

Proposition 3 If X is A, then CC(A, X) is false, and hence DV(H, M) becomes $X \wedge CC(\overline{A}, Y)$. This special case of dissolution is referred to as unit dissolution.

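Example 10 If we dissolve on the link $\{A, \overline{A}\}$ in (2.2) (using the compact form (2.4) of dissolution from Proposition 2), the graph that results is:

$$
\left( (D \vee E) \wedge (\overline{A} \vee (B \wedge C)) \right) \;\vee\; \left( \overline{C} \wedge A \wedge B \wedge C \right)
$$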

Theorem 2.5.5 Let H be a link in a semantic graph G, and let M be the smallest full block containing H . Then M and DV (H; M ) are logically equivalent.
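The path-level reason behind Theorem 2.5.5 can be checked on the graph (2.2): a c-path through a link cannot be satisfied, so discarding exactly those paths leaves the set of models unchanged. The sketch below verifies this by enumeration; it works on literal sets of paths rather than constructing DV(H, M), and the tuple encoding and helper names are assumptions of this example.

```python
from itertools import product

def c_paths(g):
    """Literal sets of the c-paths of a graph given as nested ("and"/"or", ...) tuples."""
    if isinstance(g, str):
        return [frozenset([g])]
    sub = [c_paths(f) for f in g[1:]]
    if g[0] == "and":
        return [frozenset().union(*combo) for combo in product(*sub)]
    return [p for ps in sub for p in ps]

def holds(lit, interp):
    return not interp[lit[1:]] if lit.startswith("-") else interp[lit]

def some_path_satisfied(path_list, interp):
    return any(all(holds(l, interp) for l in p) for p in path_list)

# M = X & Y is the smallest full block containing the link {A, -A} in (2.2)
X = ("or", ("and", "-C", "A"), "D", "E")
Y = ("or", "-A", ("and", "B", "C"))
M = ("and", X, Y)

link = {"A", "-A"}
all_paths = c_paths(M)
kept = [p for p in all_paths if not link <= p]   # c-paths a dissolvent retains

names = ["A", "B", "C", "D", "E"]
same_models = all(
    some_path_satisfied(all_paths, dict(zip(names, bits)))
    == some_path_satisfied(kept, dict(zip(names, bits)))
    for bits in product([False, True], repeat=len(names))
)
```

Here `kept` has one path fewer than `all_paths`, which matches the observation that each dissolution step strictly decreases the number of c-paths.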

A proof of Theorem 2.5.5 (in a more general form) can be found in [40]. We may therefore select an arbitrary link H in G and replace the smallest full block containing H by its dissolvent, producing (in the ground case) an equivalent graph. We call the resulting graph the dissolvent of G with respect to H. Since the paths of the new graph are all that appeared in G except those that contained the link, this graph has strictly fewer c-paths than the old one. As a result, finitely many dissolutions (bounded above by the number of c-paths in the original graph) will yield a linkless equivalent graph. This proves:

Theorem 2.5.6 At the ground level, path dissolution is a strongly complete rule of inference.⁴

⁴ That is, applying dissolution repeatedly to an unsatisfiable semantic graph results in the graph false, independently of the choice of the link that is dissolved upon at each step.

2.6 Fully Dissolved Formulas

If we dissolve in a semantic graph G until it is linkless, we call the resulting graph the full dissolvent of G and denote it by FD(G). Observe that FD(G) is dependent on the order in which links are activated. However, the set of c-paths in FD(G) is unique: It is exactly the set of satisfiable c-paths in G. In a dual manner, we may define dissolution for disjunctive links; in that case, FD(G) has no disjunctive links.


Chapter 3 Generating Prime Implicates/Implicants

In this chapter we give an algorithm for computing prime implicates and implicants of propositional formulas in NNF.

3.1 Prime Implicates/Implicants

We briefly summarize basic definitions regarding implicates. The treatment for implicants is completely dual and is indicated by appropriate dual expressions in parentheses.

Definition 11 A disjunction (conjunction) D subsumes another disjunction D′ (conjunction D′) iff $D \models D'$ ($D' \models D$). true (false) is subsumed by all disjunctions (conjunctions).

A disjunction is called true iff it is equivalent to true. A conjunction is called false iff it is equivalent to false.

Lemma 3.1.1 If D′ is a disjunction (conjunction) that is not true (false), then D subsumes D′ iff $D \subseteq D'$.

A true disjunction (false conjunction) subsumes another true disjunction (false conjunction) only.

Definition 12 A disjunction (conjunction) P of literals is an implicate (implicant) of a formula G iff $G \models P$ ($P \models G$). A disjunction (conjunction) D is a prime implicate (prime implicant) of a formula G iff

1. D is not true (false).
2. D is an implicate (implicant) of G.
3. For all literals $A_i$ in D, $G \not\models (D - \{A_i\})$ ($(D - \{A_i\}) \not\models G$).
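The three conditions of Definition 12 can be tested directly by enumerating interpretations. The sketch below does this for implicates only; it is exponential in the number of variables and is meant purely as an executable reading of the definition, not as a practical method. The encoding (nested tuples, "-" prefix for negation) and the function names are assumptions of this example.

```python
from itertools import product

def holds(lit, interp):
    return not interp[lit[1:]] if lit.startswith("-") else interp[lit]

def evaluate(g, interp):
    if isinstance(g, str):
        return holds(g, interp)
    vals = [evaluate(f, interp) for f in g[1:]]
    return all(vals) if g[0] == "and" else any(vals)

def is_implicate(g, clause, names):
    """G |= clause: no interpretation satisfies G while falsifying every literal of clause."""
    for bits in product([False, True], repeat=len(names)):
        interp = dict(zip(names, bits))
        if evaluate(g, interp) and not any(holds(l, interp) for l in clause):
            return False
    return True

def is_prime_implicate(g, clause, names):
    clause = frozenset(clause)
    if any("-" + l in clause for l in clause):       # condition 1: D is not true
        return False
    if not is_implicate(g, clause, names):           # condition 2: D is an implicate
        return False
    # condition 3: dropping any literal destroys the implicate property
    return all(not is_implicate(g, clause - {l}, names) for l in clause)

# (A | B | C) & (-A | D)
G = ("and", ("or", "A", "B", "C"), ("or", "-A", "D"))
names = ["A", "B", "C", "D"]
```

For this G, the clause {B, C, D} passes all three conditions, while {A, B, C, D} is an implicate that fails condition 3.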

Note that the set of all prime implicates (implicants) of a formula G, when treated as a CNF (DNF) formula, is equivalent to G.

Definition 13 Let $\mathcal{D}$ be the set of all prime implicates of a formula G. A prime implicate D of G is essential if $\mathcal{D} \setminus \{D\}$ is not equivalent to G; otherwise D is inessential.

Definition 14 A set P of d-paths (c-paths) is minimal if it contains neither tautologies (contradictions) nor any paths subsumed by other paths of P.
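For d-paths represented by their literal sets, the largest minimal subset in the sense of Definition 14 can be computed naively as follows. This is a quadratic sketch for illustration only; the function names and the "-" prefix convention for negative literals are assumptions of this example.

```python
def is_tautology(path):
    """A d-path is a tautology if it contains a literal together with its complement."""
    return any("-" + lit in path for lit in path)

def minimal(paths):
    """Largest minimal subset of a collection of d-paths given as literal sets."""
    candidates = {frozenset(p) for p in paths if not is_tautology(p)}
    # by Lemma 3.1.1, q subsumes p iff q is a subset of p; q < p is the proper-subset test
    return {p for p in candidates if not any(q < p for q in candidates)}

example = [{"A", "-A"}, {"A", "B"}, {"A", "B", "C"}, {"D"}]
result = minimal(example)
```

In this example the tautology {A, -A} and the subsumed path {A, B, C} are dropped, leaving {A, B} and {D}.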

In the discussion that follows, we will often refer to subsumption of d- and c-paths rather than of disjuncts and conjuncts. Paths are defined as sets of literal occurrences, but with regard to subsumption, we consider the literal set $\ell(p)$ of a path p. In this way, no change in the standard definitions is necessary. In the proof of Theorem 3.1.3 below, we make use of the following lemma:

Lemma 3.1.2 Let G be a semantic graph, and let H be a subgraph of G such that all d-paths through H are only partial d-paths in G. Then there exists a c-path q through G that does not meet H.


Proof: By induction on the size of G. If G consists of a single literal L, H must be empty and L is the required c-path through G. So suppose G is a disjunction or a conjunction of X and Y, and let $H = H_X \cup H_Y$, where $H_X$ is H restricted to X and $H_Y$ is H restricted to Y.

If $G = X \wedge Y$, then all d-paths of $H_X$ and of $H_Y$ are only partial in X and in Y, respectively. By the induction hypothesis there is a c-path $q_X$ through X that does not meet $H_X$ and a c-path $q_Y$ through Y that does not meet $H_Y$. But then $q_X q_Y$ is a c-path through G that does not meet H.

If $G = X \vee Y$, then all d-paths of at least one of $H_X$ and $H_Y$ (say $H_X$) are only partial in (say) X. (Otherwise, we could choose a d-path from each of $H_X$ and $H_Y$ that is a d-path of X and of Y, respectively, from which we immediately obtain a d-path of H that is also a d-path through G, contrary to the hypothesis.) By the induction hypothesis there is a c-path $q_X$ through X that does not meet $H_X$. But $q_X$ is a c-path through G, and the proof is complete. □

Theorem 3.1.3 In any non-empty formula G in which no c-path contains a link, every implicate of G is subsumed by some d-path of G.

Proof: Let C be an implicate of G. Since $G \models C$, $G \wedge \neg C$ is unsatisfiable and is therefore spanned by its full set of (conjunctive) links, i.e., all c-paths contain a link. Let R be the set of all literals in G that are linked to $\neg C$. It suffices to show that $G_R$ contains a d-path through G, since $R \subseteq C$. So assume that all d-paths of $G_R$ are partial in G. By Lemma 3.1.2, there is a c-path q in G that does not meet $G_R$. But since $G \wedge \neg C$ is spanned, and since G is linkless, some literal in q is linked to a literal in $\neg C$, contrary to the definitions of q and R. Therefore, some d-path of $G_R$ is not partial and subsumes C. □

Corollary 3.1.4 Every implicate of a reduced DNF formula (one with no false conjuncts) is subsumed by some d-path in the formula.


This follows from Theorem 3.1.3, as such a DNF formula has no c-paths with links.

Theorem 3.1.5 In any non-empty formula in which no d-path contains a link, every implicant of the formula is subsumed by some c-path in the formula.

Proof: The proof is dual to that of Theorem 3.1.3. □

Corollary 3.1.6 Every implicant of a reduced CNF formula (one with no true disjuncts) is subsumed by some c-path in the formula.

This follows from Theorem 3.1.5 as such a CNF formula has no d-paths with links.

Lemma 3.1.7 The full dissolvent FD(G) (with respect to either c-links or d-links) of a graph G is equivalent to G.

Proof: The result is immediate from Theorems 2.5.5 and 2.5.6. □

We omit proofs for Lemmas 3.1.8 and 3.1.9, and for Theorems 3.1.10 and 3.1.11; they follow directly from Theorems 3.1.3 and 3.1.5, and from the definitions of implicant, implicate, and satisfiability of semantic graphs.

Lemma 3.1.8 Every d-path through a semantic graph is an implicate of the graph.

Lemma 3.1.9 Every c-path through a semantic graph is an implicant of the graph.

Theorem 3.1.10 Suppose G is a non-null graph without c-links; we define $\psi(G)$ to be the largest subset of the d-paths of G that is minimal. I.e.,

$$
\psi(G) = \{\, P \mid (P \text{ is a d-path through } G) \wedge (P \neq \text{true}) \wedge (\forall\ \text{d-paths } Q \text{ through } G,\ Q \not\subset P) \,\}.
$$

Then $\psi(G)$ is the set of all prime implicates of G.

Theorem 3.1.11 Suppose G is a non-null graph without d-links; we define $\varphi(G)$ to be the largest subset of the c-paths of G that is minimal. Then $\varphi(G)$ is the set of all prime implicants of G.

Note that Theorem 3.1.11 is a generalization (from CNF to NNF) of Nelson's Theorem [42].

3.2 PI - An Algorithm to Compute Prime Implicates

This algorithm is an extension to NNF formulas of Jackson and Pais' [28] MM algorithm, which requires CNF/DNF formulas. Depending upon whether the input formula is in CNF or DNF, and upon whether prime implicates or implicants are desired, the MM algorithm may have to be run twice. In situations where, if used alone, PI would also be run twice, dissolution may be used in place of the first run.

The PI algorithm computes $\psi(G)$, the d-paths of G which are neither tautologies nor subsumed by other d-paths of the graph. From Theorem 3.1.10, we know that the prime implicates of the formula G are exactly those d-paths in $\psi(FD(G))$ (where dissolution is applied to c-links). We first compute the full dissolvent FD(G), which is in NNF. If FD(G) is empty, then G is unsatisfiable and false is its only prime implicate. Otherwise, we submit FD(G) as input to PI.

The semantic graph of FD(G) (which is represented as a binary tree) is then recursively traversed by PI in a left-to-right, bottom-up manner, computing partial paths along the way and eliminating tautologies and subsumed paths when they are encountered. The input of each recursive call is a subgraph and the set of d-paths through the portion of the graph to the left of that subgraph in the tree. Note that PI is invoked initially on the singleton $\{\emptyset\}$ and the entire graph. We assume for expository purposes that all logical operators are binary. The algorithm can be trivially extended to operators of arbitrary arity.

When the subgraph is a single literal, each path is extended (if it does not form a tautology) by this literal. We say it is trivially extended if the literal is already in the partial path, and properly extended otherwise, just as in the MM algorithm. The subsumption check can be limited to testing properly extended paths against trivially extended paths, as in the MM algorithm.

In the case of a disjunct, PI first extends the paths with respect to the first operand, and then extends the resulting paths over the second. In the case of a conjunct, PI extends the paths through each of the operands separately, and then combines them while eliminating any subsumed paths. Note that, for a conjunct, any path which has been trivially extended by the first operand will subsume all extensions of this path by the second operand, and hence these paths are excluded in the call to PI on the second operand. In practice, the general test for this case is expensive. There is, however, a special case that can be detected easily: When the first operand is a literal, the trivially extended paths are exactly those that contain that literal. Hence they may be identified efficiently and be removed from the first argument of the recursive call to PI at line 17. In the experiments of Tables 3.1 and 3.2, the implementation did not include any removal (from the variable Paths) of paths that were trivially extended by line 16. The special case described above is being included in the next version of the system.

PI is defined as shown in Figure 3.1. To prove Theorem 3.2.2 we make use of the following lemma.

Lemma 3.2.1 Let D be a minimal set of d-paths, G be a non-null graph, and S be the set of d-paths of the form $p_D p_G$, where $p_D \in D$ and $p_G$ is a d-path in G. We denote by $S^*$ the largest minimal subset of S. Then PI(D, G) = $S^*$.

Proof: If $D = \emptyset$ then S and hence PI(D, G) must be $\emptyset$, which is true by line 1 of the algorithm. If $D \neq \emptyset$, the proof is by induction on the structure of G.

Base Case: Suppose G is a single literal. Then every path of S has the form $p_D G$. The for loop beginning at line 4 obviously surveys all the paths of S, so we must show that the paths in $S^*$ are precisely those collected in Paths″ at line 12.


PI(Paths, G)
 1: if (Paths = ∅) return ∅
 2: if (G is a literal) then
 3:     Paths″ := ∅ ; Paths′ := ∅
 4:     for P in Paths do
 5:         if (G ∈ P) then Paths′ := Paths′ ∪ {P}
 6:             // collect trivially extended paths in Paths′
 7:         else if (Ḡ ∉ P) then
 8:             Paths″ := Paths″ ∪ {P ∪ {G}}
 9:             // collect properly extended paths in Paths″
10:         endif
11:     endfor
12:     Paths″ := Paths′ ∪ {P ∈ Paths″ | ¬∃P′ (P′ ∈ Paths′ ∧ P′ ⊂ P)}
13:     // eliminate subsumed paths
14:     return Paths″
15: else if (G = (X, Y)c) then
16:     Paths′ := PI(Paths, X)   // extend every path in Paths through X
17:     Paths″ := PI((Paths − Paths′), Y)
18:     // extend every path in Paths − Paths′ through Y
19:     Paths″ := (Paths′ ∪ Paths″) − {P | P ∈ Paths′ ∧ ∃P′ ∈ Paths″ ∧ P′ ⊂ P}
20:                                 − {P | P ∈ Paths″ ∧ ∃P′ ∈ Paths′ ∧ P′ ⊂ P}
21:     // eliminate subsumed paths
22:     return Paths″
23: else if (G = (X, Y)d) then
24:     Paths′ := PI(Paths, X)   // first extend all paths in Paths
25:     Paths″ := PI(Paths′, Y)  // through X, then extend them in Y
26:     return Paths″
27: endif

Figure 3.1: The PI algorithm
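The pseudocode of Figure 3.1 can be transcribed almost line for line into Python. The version below is a sketch under assumptions that are not part of the thesis: graphs are binary nested tuples, a literal is a string with a "-" prefix for negation, and a path is a frozenset of literals. It also applies a single two-way subsumption filter where the figure uses two set differences; the outcome is the same because each operand set is already minimal.

```python
def neg(lit):
    """Complement of a literal written as "A" or "-A"."""
    return lit[1:] if lit.startswith("-") else "-" + lit

def pi(paths, g):
    """Sketch of PI(Paths, G); paths is a set of frozensets of literals."""
    if not paths:                                           # line 1
        return set()
    if isinstance(g, str):                                  # lines 2-14
        trivial = {p for p in paths if g in p}
        proper = {p | {g} for p in paths
                  if g not in p and neg(g) not in p}        # skip tautologies
        return trivial | {p for p in proper
                          if not any(t < p for t in trivial)}
    op, (x, y) = g[0], g[1:]                                # binary arcs only
    if op == "and":                                         # lines 15-22
        left = pi(paths, x)
        right = pi(paths - left, y)     # paths trivially extended by X are omitted
        both = left | right
        return {p for p in both if not any(q < p for q in both)}
    left = pi(paths, x)                                     # lines 23-26
    return pi(left, y)

# A linkless graph: (A & D) | (-A & (B | C))
G1 = ("or", ("and", "A", "D"), ("and", "-A", ("or", "B", "C")))
prime = pi({frozenset()}, G1)

# The same function on a graph that still contains the c-link {A, -A}
G0 = ("and", ("or", "A", ("or", "B", "C")), ("or", "-A", "D"))
not_all = pi({frozenset()}, G0)
```

On G1 the call returns the three paths {A, B, C}, {-A, D}, and {B, C, D}. On G0, which is equivalent to G1 but not free of c-links, it returns only the first two, which illustrates why the input has to be fully dissolved first.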

Consider a path $p_D$ of D containing G. It is added to Paths′ and therefore to Paths″ in line 12. (Although the corresponding path in S is technically $p_D G$, the algorithm omits the extra occurrence of G.) This path is obviously in $S^*$: It cannot be subsumed by another path being added to Paths′ since D is minimal, and it cannot be subsumed by a path $q_D G$ not added to Paths′ because by definition it is not subsumed by $q_D$. Now suppose that $p_D$ does not contain G. At line 8, $p_D G$ will be added to Paths″ only if $\overline{G} \notin p_D$, so tautologies are ruled out. Therefore, $p_D G$ is either in $S^*$ or it is subsumed by some path in $S^*$. In the latter case, the subsuming path must be from Paths′: Those from Paths″ are of the form $q_D G$ (where $G \notin q_D$), and since $q_D$ and $p_D$ do not subsume each other, neither do $q_D G$ and $p_D G$. Clearly, line 12 removes from Paths″ those paths that are subsumed by others in Paths′, and line 14 returns exactly $S^*$.

Induction step: There are 2 cases.

Case 1. G is a conjunct, i.e., $G = (X, Y)_c$.

All paths in S are of the form $p_D p_X$ or $p_D p_Y$. By the induction hypothesis, PI(Paths, X) and PI(Paths, Y) are minimal path sets of the form $p_D p_X$ and $p_D p_Y$, respectively. The paths of $S^*$ are also of this form, and hence $S^* \subseteq (PI(Paths, X) \cup PI(Paths, Y))$. At lines 19-20, the largest minimal subset of Paths′ ∪ Paths″ is computed; since Paths′ = PI(Paths, X), this will be $S^*$ provided that no member of $S^*$ is in (Paths ∩ Paths′), the paths omitted in the first argument of PI at line 17. But these paths are extensions (either trivial or non-trivial) of paths in Paths′. All such paths would be removed in the two-way subsumption check of lines 19-20. Therefore, the largest minimal subset of $(PI(Paths, X) \cup PI(Paths, Y))$ is equal to the largest minimal subset of $(PI(Paths, X) \cup PI(Paths - Paths', Y))$, where Paths′ = PI(Paths, X), and Case 1 is complete.

Case 2. G is a disjunct, i.e., $G = (X, Y)_d$.

All paths in S are of the form $p_D p_X p_Y$. By the induction hypothesis Paths′ is set to the largest minimal subset of paths of the form $p_D p_X$, and Paths″ is set to the largest minimal subset of paths of the form $p_D p_X p_Y$; this is just $S^*$, and the proof is complete. □

Theorem 3.2.2 Given a formula FD(G) in NNF, PI($\{\emptyset\}$, FD(G)) is the set of all and only the prime implicates of G.

Proof: The set $\{\emptyset\}$ has only one d-path, which consists of no literals. This set therefore contains no tautologies and has no path which is subsumed by another path. Any extension in FD(G) of the path $\emptyset$ is a d-path in FD(G). Therefore, by Lemma 3.2.1, PI($\{\emptyset\}$, FD(G)) contains exactly those d-paths of FD(G) that are not tautologies and are not subsumed by other d-paths in FD(G). The theorem follows from Lemma 3.1.7 and Theorem 3.1.10. □

Since $\neg$PI($\{\emptyset\}$, $\neg G$) is logically equivalent to FD(G) (dissolution removes linked paths while PI removes both linked and subsumed paths), the prime implicates of a formula G can be computed with any of the three following methods:

Option 1

1. Compute G′, the full dissolvent of G (with respect to c-links).
2. If G′ is empty, then the only prime implicate is false; else call PI($\{\emptyset\}$, G′).

Option 2

1. Compute G′ = $\neg$PI($\{\emptyset\}$, $\neg G$), which is logically equivalent to FD(G) but in general not identical to it.
2. If G′ is empty, then the only prime implicate is false; else call PI($\{\emptyset\}$, G′).

Option 3

1. Compute G′, the full dissolvent of G (with respect to c-links).
2. If G′ is empty, then the only prime implicate is false; else compute G″ = $\neg FD(\neg G')$ (with respect to d-links of G′).
3. If G″ is empty, then G″ (hence G) is a tautology and has no prime implicates; else call PI′($\{\emptyset\}$, G″), where PI′ is PI without the tautology check.

We illustrate the various options with an example. Consider the semantic graph G defined as

$$\begin{array}{c} A \vee B \vee C \\ \wedge \\ \overline{A} \vee D \end{array}$$

Note that G has only one c-link, $\{A, \overline{A}\}$. In Step 1, Option 1 produces G′, an equivalent graph which has no c-links. G′ is

$$
\begin{array}{ccc}
A & & \overline{A} \\
\wedge & \vee & \wedge \\
D & & B \vee C
\end{array}
$$

Step 2 of Option 1 then collects all the d-paths of G′ which are neither tautologies nor subsumed by other d-paths. These paths are the set of prime implicates: $\{A, B, C\}$, $\{\overline{A}, D\}$, $\{B, D, C\}$. Note that the prime implicate $\{B, D, C\}$ is not a d-path in G but is a d-path in G′. In Step 1 of Option 2 we get the graph G′

$$
\begin{array}{ccccc}
A & & \overline{A} & & \overline{A} \\
\wedge & \vee & \wedge & \vee & \wedge \\
D & & B & & C
\end{array}
$$

which is a DNF equivalent of the graph G. Step 2 of Option 2 then collects all the d-paths of G′ which are neither tautologies nor subsumed by other d-paths. Again, these paths are the prime implicates of the original graph.

The graph G′ produced in Step 1 of Option 3 is the same as the graph produced by Step 1 of Option 1. This graph has a d-path $\{A, \overline{A}\}$ which is a tautology. From this graph we get, in Step 2 of Option 3, a graph G″

$$
\begin{array}{c}
\left( \begin{array}{c} A \\ \wedge \\ D \end{array} \right) \vee B \vee C \\
\wedge \\
D \vee \overline{A}
\end{array}
$$

which does not have any d-paths which are tautologies. Step 3 then collects all the d-paths of G″ which are not subsumed by other d-paths. These are the prime implicates of the original formula.

We may compute prime implicants in the same manner as for prime implicates: The steps are the same except that dissolution is performed on d-links, true is replaced by false, and we use PI′, a modified version of PI. (It is the obvious dual: $(X, Y)_d$ is replaced by $(X, Y)_c$ and vice versa.)

Theorem 3.2.3 Given a formula G in NNF, where FD(G) is defined with respect to d-links, PI′($\{\emptyset\}$, FD(G)) contains all and only the prime implicants of G.

Proof: Similar to the proof of Theorem 3.2.2.


3.3 Experiments on NNF Formulas

Some initial experiments were performed to compare the options described in the previous section. The system used in the experiments was written in PASCAL (dissolution part) and in Kyoto Common Lisp (PI part). The interface between the two is through a driver written in C. The prime implicates were stored as linked lists.¹ The first two options described above have been implemented and tested on randomly generated NNF formulas. The third option, containing two rounds of dissolution, performed poorly compared with Options 1 and 2 and is not discussed.

¹ The current implementation is written completely in C/C++ and uses the trie data structure for storing the prime implicates.

Number of literal occurrences = 25 (running times in seconds)
Option 1: with dissolution; Option 2: without dissolution

    Alphabet size = 5       Alphabet size = 12      Alphabet size = 20
    Option 1   Option 2     Option 1   Option 2     Option 1   Option 2
    0.1        0.1          0.1        0.4          0.8        3.0
    0.1        0.1          0.4        2.4          0.1        0.6
    0.1        0.2          0.7        4.7          0.5        4.8
    0.1        0.1          0.4        0.4          1.2        4.6
    0.1        0.1          0.1        0.2          5.3        35.7
    0.1        0.1          0.2        1.4          0.2        2.7
    0.1        0.1          0.1        0.3          0.1        1.5
    0.1        0.1          0.1        0.1          0.8        14.0
    0.2        0.1          0.2        0.5          0.2        2.5
    0.1        0.1          0.5        0.2          0.1        0.2

Table 3.1: Experimental results for small NNF formulas

Number of literal occurrences = 50 (running times in seconds)
Option 1: with dissolution; Option 2: without dissolution

    Alphabet size = 5       Alphabet size = 25      Alphabet size = 45
    Option 1   Option 2     Option 1   Option 2     Option 1   Option 2
    0.1        0.1          29.7       508.0        3345.8     >46000
    0.1        0.2          151.2      19663.3      16170.5    >88000
    0.1        0.1          11.8       75.4         245.6      29437.8
    0.2        0.1          21249.0    >95000       299.3      32102.1
    0.1        0.1          14.0       206.5        118.9      20503.4
    0.2        0.2          4.9        90.7         491.1      9574.7
    0.1        0.2          33.2       709.8        168.3      6702.7
    0.1        0.1          89.2       152.5        48.7       13183.6
    0.1        0.1          35.7       1358.9       1118.9     2270.6
    0.2        0.2          11.1       746.1        30.4       5770.7

Table 3.2: Experimental results for large NNF formulas

Table 3.1 involves NNF formulas with 25 literal occurrences, and Table 3.2 involves formulas with 50 literal occurrences. The sub-cases are classified according to alphabet size; as the alphabet size decreases, the average number of links in a formula of fixed size increases. The times are given in seconds on a SUN SPARC 10. Using dissolution (Option 1) is better on the average. The most improvement over Option 2 occurs in cases where the alphabet size is high. However, Option 1 is not greatly inferior in cases involving smaller alphabets. Due to the poor performance of Option 2 we have not implemented it in the current version. For the performance of our system on commonly used benchmark problems, and a comparison of running times with other algorithms, see Section 4.14.

3.4 Formulas Intractable for CNF/DNF Based Approaches

Consider the class of formulas $F_k$ of the form $\bigwedge_{i=1}^{k} (A_i \equiv B_i)$, where $k \geq 1$. In NNF, each equivalence can be expressed as $((A_i \wedge B_i) \vee (\overline{A_i} \wedge \overline{B_i}))$. The size of these formulas is O(k). Each has $2 \cdot k$ prime implicates, and each prime implicate is of the form $(\overline{A_i} \vee B_i)$ or $(A_i \vee \overline{B_i})$, $i \leq k$.

Proposition 4 Using Option 1 we can get the set of prime implicates of $F_N$ in $O(N^2)$ time.

Proof: None of the formulas $F_N$ have any (conjunctive) links, and this can be verified in $O(N^2)$ time. The unaltered non-DNF output of Step 1 is acceptable as input to PI in Step 2. As a result, the implicates are collected in $O(N^2)$ time. □

Proposition 5 Option 2 requires $O(2^N)$ time to compute the prime implicates of $F_N$.

Proof: In Option 2 (and the algorithms of [28] and [57]) an intermediate DNF representation is created. Since the size of any DNF equivalent of the formula is $O(2^N)$, the $O(2^N)$ running time inevitably results. □
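The gap between Propositions 4 and 5 can be observed on small instances of $F_k$ by counting paths. The sketch below builds $F_k$ in NNF and compares the number of c-paths (the conjuncts of the DNF obtained by distribution) with the number of non-tautological d-paths, which for these linkless formulas are exactly the prime implicates. The tuple encoding and helper names are assumptions of this example, and path enumeration is used only for counting; it is not how PI traverses the graph.

```python
from itertools import product

def paths(g, through):
    """Literal sets of the c-paths (through="and") or d-paths (through="or") of g."""
    if isinstance(g, str):
        return [frozenset([g])]
    sub = [paths(a, through) for a in g[1:]]
    if g[0] == through:
        return [frozenset().union(*combo) for combo in product(*sub)]
    return [p for ps in sub for p in ps]

def f(k):
    """F_k: the conjunction of (A_i & B_i) | (-A_i & -B_i) for i = 1..k."""
    return ("and", *[("or", ("and", f"A{i}", f"B{i}"),
                            ("and", f"-A{i}", f"-B{i}"))
                     for i in range(1, k + 1)])

def is_tautology(path):
    return any("-" + lit in path for lit in path)

counts = {}
for k in range(1, 7):
    graph = f(k)
    n_c_paths = len(paths(graph, "and"))
    n_prime = len({p for p in paths(graph, "or") if not is_tautology(p)})
    counts[k] = (n_c_paths, n_prime)
```

For each k the first count is $2^k$ and the second is $2k$, so the d-paths that PI has to collect grow linearly while the DNF grows exponentially.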

Some algorithms do not create such an intermediate DNF representation, but instead generate all implicates directly through inference techniques such as resolution (e.g., [2,3]). The implicates of the formulas above can be computed efficiently using these techniques. However, in [36] a class of formulas $\Phi = \{\Phi_1, \Phi_2, \ldots\}$ was introduced for which conversion to either CNF or to DNF is expensive. Each $\Phi_i$ is unsatisfiable, contains $2 \cdot i^2$ literals, and has a minimal CNF equivalent consisting of $i^2 + i^i + 1$ literals. The DNF equivalent is the empty disjunction, but this can only be verified by discovering that all $i + i^i$ disjuncts have a link and can be deleted. Let $M_i$ be a literal not occurring in $\Phi_i$ and consider the formula $\Phi_i' = \Phi_i \vee M_i$. Obviously, $\Phi_i'$ has $M_i$ as its only implicate, but any method based on either CNF or on DNF would find $\Phi_i'$ intractable. Nevertheless, dissolution can handle $\Phi_i$ and hence $\Phi_i'$ in polynomial time [36]. Thus Option 1 can compute implicates for these formulas efficiently. Ngair gives similar examples in his paper [43].

3.5 Alternative Approaches

The techniques presented in this chapter are based on the processing of boolean functions represented directly as symbolic formulas. We have illustrated some potential disadvantages when such formulas are required to be in clause form. One possible alternative is to use a (non-equivalence preserving) transformation to clause form that is linear, apply clause based techniques, and then extract the required output. Another alternative is to assume that the input is given as the minterms of the boolean function in question. In this section, we briefly discuss both of these alternatives.


3.5.1 Structure preserving transformations

Theorem provers that require CNF or DNF input handle arbitrary formulas by preprocessing them into clause form. This may be done using either the standard laws of boolean algebra (distributive laws) or structure preserving transformations such as those of [44]. Application of the distributive laws can result in a clause set that is exponential in the size of the original formula. Structure preserving transformations produce clause forms with only a linear increase in size, and are therefore an apparently attractive option for processing arbitrary formulas. However, these transformations do not preserve equivalence of formulas. Our investigations reveal significant disadvantages for this approach.

The obvious way to avoid the exponential increase in converting to clause form is to employ an efficient structure preserving transformation, and then compute the prime implicants/implicates of the transformed formulas. Given an input formula $\mathcal{F}$, we proceed as follows:

1. Use structure preserving transformations to convert $\mathcal{F}$ into a CNF/DNF formula $\mathcal{F}_{clause}$.

2. Compute the implicants/implicates of $\mathcal{F}_{clause}$.

3. Extract the prime implicants/implicates of the original formula $\mathcal{F}$.

Since structure preserving transformations do not in general preserve equivalence, the prime implicants/implicates of $\mathcal{F}_{clause}$ will be different from those of $\mathcal{F}$. Therefore in Step 3 of the strategy above, the following issues must be addressed.

a) The relationship between the prime implicates/implicants of the normal form and of the original formula must be specified. This relationship depends on the type of structure preserving transformation used.

b) The computational cost of extracting the implicants/implicates from those of the structure preserving CNF/DNF forms must be minimized.

For a) we have not found any obvious relationship for the typical transformations such as that of [44], other than the obvious one in which the variables introduced by the transformation are replaced by their definitions. The issue raised in b) is not encouraging, since the number of prime implicants/implicates of $\mathcal{F}_{clause}$ can be exponentially greater than that of $\mathcal{F}$. For example, consider $D_N$, the disjunction of $N$ equivalences of the form $(\neg A_i \vee B_i) \wedge (A_i \vee \neg B_i)$, $i \leq N$. This formula has $2 \cdot N$ prime implicants, and each is of the form $(A_i \wedge B_i)$ or $(\neg A_i \wedge \neg B_i)$. Greenbaum's and Plaisted's [44] structure preserving transformation results in $D_N'$, a CNF formula containing $4 \cdot N + 2$ clauses; each clause has at most $N + 1$ literals. This formula also has $3 \cdot N + 1$ new variables $p_1, \ldots, p_{3N+1}$, and is a union of $N + 1$ sets of clauses. For each $i$, $i \leq N$, we have a set of clauses

$$S_i = \{ (\neg p_{3(i-1)+1} \vee \neg A_i \vee B_i),\; (\neg p_{3(i-1)+2} \vee A_i \vee \neg B_i),\; (\neg p_{3i} \vee p_{3(i-1)+1}),\; (\neg p_{3i} \vee p_{3(i-1)+2}) \}$$

Note that each of the above clause sets contains no literal in common with another. In addition, $D_N'$ includes the set of clauses given by

$$(p_3 \vee p_6 \vee \ldots \vee p_{3N} \vee \neg p_{3N+1}),\quad p_{3N+1}$$

Proposition 6 $D_N'$ has $2 \cdot N \cdot 9^{N-1}$ prime implicants.

Proof: The prime implicants of $D_N'$ correspond to the set of all c-paths that have no link and are not subsumed by any other. Since the literal $p_{3N+1}$ appears only once as a single clause, all prime implicants will contain this literal. Every prime implicant must also contain exactly one of $p_3, p_6, p_9, \ldots, p_{3N}$. If a prime implicant contains $p_{3i}$, $1 \leq i \leq N$, then it can be extended by contributions from the set $S_i$ in two ways, and in 9 ways by each of the other $N - 1$ sets. Hence there are exactly $2 \cdot 9^{N-1}$ prime implicants which contain $p_{3i}$. This gives a total of $2 \cdot N \cdot 9^{N-1}$ prime implicants. $\Box$
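The count in Proposition 6 can be checked for small $N$ with the following Python sketch. It assumes that each $S_i$ consists of the clauses $(\neg p_{3i-2} \vee \neg A_i \vee B_i)$, $(\neg p_{3i-1} \vee A_i \vee \neg B_i)$, $(\neg p_{3i} \vee p_{3i-2})$ and $(\neg p_{3i} \vee p_{3i-1})$, together with the clauses $(p_3 \vee \ldots \vee p_{3N} \vee \neg p_{3N+1})$ and $p_{3N+1}$; the polarities are an assumption of this sketch. The integer encoding and the function names are hypothetical. The sketch follows the proof: it extends link-free c-paths clause by clause and keeps only those not subsumed by another.

```python
def transformed_clauses(n):
    """Clause set assumed for D_N' (see the lead-in for the polarities).

    Encoding: p_j is the integer j (1 <= j <= 3n+1), A_i is 3n+1+i,
    B_i is 4n+1+i; a negative integer denotes a negated literal.
    """
    def a(i):
        return 3 * n + 1 + i

    def b(i):
        return 4 * n + 1 + i

    clauses = []
    for i in range(1, n + 1):
        p1, p2, p3 = 3 * i - 2, 3 * i - 1, 3 * i
        clauses += [(-p1, -a(i), b(i)), (-p2, a(i), -b(i)),
                    (-p3, p1), (-p3, p2)]
    top = 3 * n + 1
    clauses.append(tuple(3 * i for i in range(1, n + 1)) + (-top,))
    clauses.append((top,))
    return clauses

def prime_implicants_of_cnf(clauses):
    """Minimal link-free c-paths of a CNF formula, as sets of literals."""
    paths = {frozenset()}
    for clause in clauses:
        extended = {path | {lit}
                    for path in paths
                    for lit in clause
                    if -lit not in path}          # discard paths with a link
        # A partial path properly containing another can only yield subsumed paths.
        paths = {p for p in extended if not any(q < p for q in extended)}
    return paths

for n in (1, 2):
    print(n, len(prime_implicants_of_cnf(transformed_clauses(n))))
```

For comparison, the original formula $D_N$ has only $2 \cdot N$ prime implicants.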

An alternative possibility for employing structure preserving transformations is to discover an algorithm for computing the prime implicants/implicates of the original formula directly from the transformed formula. (Of course, the method discussed above is in effect such an algorithm, but one that is apparently no improvement over direct processing of the original formula by non-clausal techniques.) We feel that it may not be possible to do this in an efficient manner; we have not yet discovered any obvious relationship between the prime implicants/implicates of the equivalence formulas above and those of their efficient clause form translations.

3.5.2 Using minterms as input

The computation of prime implicants has been used as a first step in the minimization of Boolean functions. Such functions are typically specified in canonical disjunctive form (i.e., as a set of minterms, which is essentially the true lines of the function's truth table). Many algorithms have been developed for computing the prime implicants from such a canonical disjunctive representation. In [58], Strzemecki provides a thorough survey of these algorithms. He also gives the first algorithm which computes the prime implicants in polynomial time with respect to the number of minterms in the input.

Our investigations are concerned primarily with the case in which the input is a symbolic formula, not a truth table. Of course, the number of minterms of a boolean function can be exponential in the number of variables. In Section 6.1, we introduced the formulas $D_N$ of the form $\bigvee_{i=1}^{N} (A_i \equiv B_i)$, where $N \geq 1$. The number of minterms of $D_N$ is $O(2^N)$, yet $D_N$ has only $O(N)$ prime implicants.

We conclude this discussion by noting that the input domains of the minterm based and formula based algorithms are quite different. The minterm based algorithms, if forced to preprocess the above formulas to produce acceptable input, would do exponentially worse than PI+dissolution. However, if the only available input for PI were the minterms of such functions, a preprocessor would be required to create a formula representation. In that case, PI would fare no better than Strzemecki's algorithm, although, with the use of factoring [36], possibly no worse.

3.5.3 Comparison with BDDs

BDDs [4] are commonly used in verification of boolean circuits. Coudert and Madre [6] describe an algorithm which produces the prime implicates/implicants of a propositional formula represented as a BDD. Any algorithm, including theirs, which uses BDDs must perform large amounts of subsumption testing. Given any formula in NNF, a BDD based method would first construct the BDD and then extract the prime implicates/implicants from it. In contrast, our system would first compute the full dissolvent. But for either of these approaches, the next stage (extracting the prime implicates/implicants) requires extensive testing for subsumption.

The size of the BDD depends critically on the ordering (of variables) chosen. There are classes of NNF formulas [3] for which any BDD will be exponentially large in the formula size. These formulas do not have any c-links, so the dissolution phase of our method does not change the formula. Hence the input to PI would be a small formula, whereas the BDD based method would have to handle an exponentially larger intermediate representation. However, these formulas have many prime implicates and prime implicants, and subsumption checking is the bottleneck for both methods. In all likelihood, the relative performance of these two methods can be determined through experimental evaluation only; formulas will exist for which one method is superior, and vice versa.

3.6 Other prime implicate/implicant generation algorithms

3.6.1 Algorithms based on minterm representation

The Quine-McCluskey rule [47] is an algorithm for simplifying a propositional formula expressed as a set of minterms (all the true lines in a truth table). The algorithm consists of two parts. The first part computes the set of prime implicants of the propositional formula. The second part selects a minimal subset of the set of prime implicants. This minimal subset, when expressed as a DNF formula, is a simplification of the propositional formula. The Quine-McCluskey prime implicant computation algorithm, which is the earliest known algorithm, is based on iterative consensus. Tison [60] later generalized iterative consensus to generalized consensus. Just like the Quine-McCluskey algorithm, Tison's algorithm requires the input to be the set of minterms of the propositional formula. Tison's algorithm can be easily adapted to formulas represented as a set of CNF clauses, which is useful for AI applications (we give a description of an incremental version of this algorithm in the next section).

Minterm based algorithms have been mainly used by the VLSI community; many algorithms have been developed based on the Quine-McCluskey and Tison algorithms. The best known algorithm to date is the recent algorithm due to Strzemecki [58]. His algorithm can compute all the prime implicants of a propositional formula represented as a set of minterms, in time polynomial in the number of minterms.
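As an illustration of the first part of the minterm based approach, the following Python sketch repeatedly merges terms that differ in exactly one variable and collects the terms that cannot be merged further. It is a minimal textbook-style sketch, not the algorithm of [47], [60] or [58]; the string encoding (`'0'`, `'1'`, and `'-'` for an eliminated variable) and the function names are assumptions.

```python
def merge(t1, t2):
    """Merge two terms that differ in exactly one specified position."""
    diff = [i for i, (x, y) in enumerate(zip(t1, t2)) if x != y]
    if len(diff) == 1 and "-" not in (t1[diff[0]], t2[diff[0]]):
        i = diff[0]
        return t1[:i] + "-" + t1[i + 1:]
    return None

def prime_implicants_from_minterms(minterms, nvars):
    """Prime implicants of the function whose true lines are `minterms`."""
    terms = {format(m, f"0{nvars}b") for m in minterms}
    primes = set()
    while terms:
        merged, used = set(), set()
        for t1 in terms:
            for t2 in terms:
                m = merge(t1, t2)
                if m is not None:
                    merged.add(m)
                    used.update((t1, t2))
        primes |= terms - used      # terms that merged with nothing are prime
        terms = merged
    return primes

# Minterms of a single equivalence A = B (true lines 00 and 11).
print(sorted(prime_implicants_from_minterms({0, 3}, 2)))
```

Note that the input is the full set of minterms, so the cost of this procedure is measured against a representation that may itself be exponential in the number of variables.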

3.6.2 Algorithms based on formulas

Interest in algorithms based on propositional formulas increased sharply after Reiter and de Kleer's paper in 1987 [51], in which a compilation technique for clause management systems was proposed based on the concept of prime implicates. Since then many formula-based algorithms have been proposed for computing prime implicants/implicates. We can broadly classify these into two groups: those based on some inference rule, and those based on the notion of paths in formulas (see Section 2.3).

Inference based algorithms

In 1990, Kean and Tsiknis proposed an incremental algorithm IPIA [30] based on Tison's method. IPIA is an adaptation of Tison's algorithm to clauses. The input is a set of CNF clauses and the output is the set of prime implicates. If the input is a set of DNF clauses then the output produced will be the set of prime implicants. The algorithm is based on consensus, a resolution-like rule. Given two clauses $x \vee \alpha$ and $\neg x \vee \beta$, where $x$ is an atom and $\alpha$ and $\beta$ are the rest of the clauses, the clause $\alpha \vee \beta$ is the consensus of these clauses, provided it is not a tautology. The input is a set of current prime implicates $N$, and a set $S$ of new clauses. To run it in a non-incremental mode we call IPIA with $N$ being the empty set. Along with the basic algorithm, they provide certain optimizations which prevent many redundant consensus operations.

IPIA($N$, $S$)

1. Delete any $D \in N \cup S$ that is subsumed by another $D' \in N \cup S$.
2. If $S = \emptyset$ return $N$.
3. Remove the smallest clause $C$ from $S$.
4. For each literal $l$ in $C$, construct $\Delta_l$, which contains all clauses of $N$ which can form a consensus with $C$.
5. Let $\Sigma$ be the set $\{C\}$.
6. Perform the following for each literal $l$ of $C$:
   (a) For each clause in $\Sigma$ which is still in $N$, compute the consensus of it and every clause in $\Delta_l$ which is still in $N$.
   (b) For every new consensus, discard it if it has been subsumed by $N \cup S$. Otherwise, remove any clauses in $N \cup S$ subsumed by it. Add the clause to $N$ and $\Sigma$.
7. Go to step 2.

Figure 3.2: Kean and Tsiknis algorithm for computing prime implicates
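The consensus rule itself can be sketched in a few lines of Python. The code below is a naive, non-incremental saturation under consensus with subsumption deletion; it is not IPIA and contains none of the optimizations of Kean and Tsiknis. The integer encoding of literals and the function names are assumptions of this sketch.

```python
def consensus(c1, c2):
    """Consensus of two non-tautological clauses (frozensets of int literals).

    Returns None if the clauses do not clash on exactly one atom; with two
    or more clashes the result would be a tautology.
    """
    clashing = [lit for lit in c1 if -lit in c2]
    if len(clashing) != 1:
        return None
    lit = clashing[0]
    return (c1 - {lit}) | (c2 - {-lit})

def remove_subsumed(clauses):
    """Keep only clauses that have no proper subset in the collection."""
    return {c for c in clauses if not any(d < c for d in clauses)}

def prime_implicates(clauses):
    """Saturate a CNF clause set under consensus, deleting subsumed clauses."""
    current = {frozenset(c) for c in clauses}
    current = {c for c in current if not any(-lit in c for lit in c)}
    current = remove_subsumed(current)
    while True:
        new = set()
        for c1 in current:
            for c2 in current:
                r = consensus(c1, c2)
                if r is not None and not any(d <= r for d in current):
                    new.add(r)
        if not new:
            return current
        current = remove_subsumed(current | new)

# (~a | b) and (~b | c) yield the additional prime implicate (~a | c).
print(sorted(sorted(c) for c in prime_implicates([(-1, 2), (-2, 3)])))
```

Every pair of clauses is re-examined on each pass, which is exactly the kind of redundant work that the incremental organization of IPIA is meant to reduce.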

Jackson's algorithm PIGLET [27] is an improvement over Kean and Tsiknis' algorithm. He identifies conditions under which certain consensus operations will always result in a clause that will be subsumed by another clause produced later on in the computation. Therefore one can avoid generating such clauses. In the paper Jackson shows, by experimentation, that his algorithm is superior to Kean and Tsiknis' algorithm on many randomly generated formulas. He also shows that the use of some combinations of optimizations proposed by Kean and Tsiknis leads to incorrect results.

De Kleer's algorithm [9] is the most popular algorithm in the literature. His algorithm is essentially the same as that of Kean and Tsiknis, except that he uses a data structure called a trie. A trie is used to represent a collection of implicates in a tree-like fashion. His trie based implementation showed dramatic improvements in running times over a list based implementation (by a factor greater than $10^3$ for many problems). His algorithm was the first one that could handle non-trivial problems.

Ngair's algorithm [43], which was first published around the same time as the PI algorithm, does not require the input to be in the form of clauses. His algorithm is based on order theory. The input to the algorithm is a conjunction of DNF formulas. The output is the set of prime implicates. His paper identifies classes of formulas for which prime implicates can be generated efficiently using his algorithm, whereas any CNF based algorithm would take exponential time. We have also discovered similar results independently (see Section 3.4). He also shows that his implementation does as well as the clausal algorithm of de Kleer on clause formulas.

Path Based Approaches

In [28], Jackson and Pais give an algorithm (MM) to compute prime implicates/implicants based on the connection method of Bibel [2]. Given a set of CNF clauses, the algorithm enumerates the set of paths which are prime implicants (see theorem 4.4.2), using heuristics to eliminate subsumed paths as early as possible. The algorithm is shown in Figure 3.3 (at the end of the chapter). T is the input (i.e., the set of CNF clauses). See section 3.2 for the definitions of trivially extended paths and properly extended paths. Jackson and Pais show through experimental evaluation that this algorithm is better than the algorithm due to Slagle et al. [57]. If the input is a set of DNF clauses, the output will be the set of prime implicates. To compute the set of prime implicates of a given set of CNF clauses, we need to compute the set of prime implicants using the MM algorithm and then run the MM algorithm on the set of prime implicants to get the prime implicates.

Slagle et al. give an algorithm for computing prime implicants [57] based on the semantic tree, a conceptual representation used in proving completeness properties of resolution-like inference rules. Their algorithm was the first known algorithm which does not require minterms as inputs. The algorithm takes as input CNF formulas and produces prime implicants. Rymon's algorithm [54] is an improvement over the algorithm of Slagle et al. His algorithm is based on SE-HS trees, a generalization of the semantic tree used by Slagle et al.

In [6] Coudert and Madre give an algorithm which requires as input an ordered binary decision diagram (OBDD) representation of the formula [4]. An OBDD is a graph based representation of propositional formulas. Their algorithm produces the set of all prime implicants. They also use an OBDD to represent the collection of implicants via a characteristic function.


1. Initialize $D$ to the smallest clause in $T$.
   Initialize OPEN to the set $T - \{D\}$.
   Set PATHS to the set $\{\{l_1\}, \ldots, \{l_n\}\}$, where $D = \{l_1, \ldots, l_n\}$.
   Set POS to be the set of all positive literals in $D$ and NEG to the set of all negative literals in $D$.
2. If OPEN $= \emptyset$ return PATHS.
3. For every clause $D' \in$ OPEN compute $h(D') = \|n_1\| \cdot \|n_2\|$, where $n_1$ is the number of literals in $D'$ that are in the set POS $\cup$ NEG, and $n_2$ is the number of literals in $D'$ whose negation is in NEG $\cup$ POS.
4. Let $D$ be the clause with the smallest value for $h$.
   Update the sets POS and NEG with the positive and negative literals from $D$, respectively.
5. For each path $P \in$ PATHS:
   (a) Remove $P$ from PATHS.
   (b) Extend $P$ with literals from $D$ one at a time, eliminating paths that have a literal and its negation.
   (c) Collect all trivially extended paths in PATHS$'$ and properly extended paths in PATHS$''$.
6. Remove from PATHS$''$ all the paths that are subsumed by some path in PATHS$'$.
7. Set PATHS to PATHS$'$.
   Remove $D$ from OPEN.
   Go to step 2.

Figure 3.3: MM Algorithm

3.7 Applications of prime implicate/implicant algorithms

3.7.1 Knowledge compilation

In [56], Selman and Kautz propose a method of knowledge compilation (by Horn approximations) as an aid to faster inferencing in propositional knowledge bases. (A Horn clause is a disjunction of literals containing no more than one positive literal.) Given a set of propositional clauses $\Sigma$, two sets of propositional Horn clauses $\Sigma_{lub}$ and $\Sigma_{glb}$ are computed such that $\Sigma_{glb} \models \Sigma \models \Sigma_{lub}$, or equivalently $M(\Sigma_{glb}) \subseteq M(\Sigma) \subseteq M(\Sigma_{lub})$, where $M(\Sigma)$ is the set of models of $\Sigma$. In addition, there must be no set of Horn clauses $\Sigma'$ such that either $M(\Sigma_{glb}) \subset M(\Sigma') \subseteq M(\Sigma)$ or $M(\Sigma) \subseteq M(\Sigma') \subset M(\Sigma_{lub})$. $\Sigma_{glb}$ and $\Sigma_{lub}$ are known as the Horn lower bound and Horn upper bound respectively.

Such a compilation scheme can improve the efficiency of reasoning in knowledge bases. Given a query $\alpha$, the Horn lower and upper bounds can be used to answer the query with respect to $\Sigma$ (i.e., does $\Sigma \models \alpha$ hold?) as follows. If $\Sigma_{lub} \models \alpha$ we can conclude that $\Sigma \models \alpha$. If $\Sigma_{glb} \not\models \alpha$ then we conclude $\Sigma \not\models \alpha$. Otherwise we use the original theory $\Sigma$ to answer the query. Since $\Sigma_{lub}$ and $\Sigma_{glb}$ are sets of Horn clauses, testing if $\Sigma_{lub} \models \alpha$ or $\Sigma_{glb} \not\models \alpha$ can be done in time polynomial in the size of $\Sigma_{lub}$ and $\Sigma_{glb}$ respectively.

Selman and Kautz have shown that $\Sigma_{lub}$ is unique and is the same as the set of all prime implicates of $\Sigma$ that are Horn clauses. They also give an algorithm for computing $\Sigma_{lub}$ based on the prime implicate algorithm of Kean and Tsiknis (Figure 3.2). Though query evaluation in propositional knowledge bases is an NP-complete problem, use of knowledge compilation techniques results in efficient algorithms in practice [29]. In [12] del Val gives another knowledge compilation technique based on the notion of prime implicates.
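The three-way query procedure described above can be sketched in Python as follows. The sketch assumes that the query is a single clause and that the Horn bounds are already given; it does not compute them. Horn entailment is tested by unit propagation, and the fallback on the original theory is a brute-force truth table check used only for illustration. The encoding of literals as integers, the function names, and the small example theory are assumptions.

```python
from itertools import product

def horn_entails(horn_clauses, query):
    """Test whether a set of Horn clauses entails the clause `query`.

    Adds the negated query literals as facts and runs unit propagation,
    which is complete for refuting Horn clause sets.
    """
    true_lits = {-lit for lit in query}
    if any(-lit in true_lits for lit in true_lits):
        return True                      # the query is a tautology
    changed = True
    while changed:
        changed = False
        for clause in horn_clauses:
            if any(lit in true_lits for lit in clause):
                continue                 # clause already satisfied
            open_lits = [lit for lit in clause if -lit not in true_lits]
            if not open_lits:
                return True              # contradiction reached
            if len(open_lits) == 1:
                true_lits.add(open_lits[0])
                changed = True
    return False

def entails(clauses, query):
    """Brute-force entailment check over all truth assignments."""
    atoms = sorted({abs(lit) for c in list(clauses) + [query] for lit in c})
    for bits in product([False, True], repeat=len(atoms)):
        val = dict(zip(atoms, bits))
        holds = lambda lit: val[abs(lit)] == (lit > 0)
        if all(any(holds(l) for l in c) for c in clauses) and not any(holds(l) for l in query):
            return False
    return True

def answer(sigma, lub, glb, query):
    """Answer 'does sigma entail query?' using the Horn bounds first."""
    if horn_entails(lub, query):
        return True
    if not horn_entails(glb, query):
        return False
    return entails(sigma, query)         # bounds are inconclusive

# Hypothetical example with a=1, b=2, c=3, d=4.
sigma = [(-1, 3), (-2, 3), (1, 2)]       # (~a | c), (~b | c), (a | b)
lub = [(3,)]                             # the only Horn prime implicate: c
glb = [(1,), (3,)]                       # a Horn lower bound: a and c
print(answer(sigma, lub, glb, (3,)), answer(sigma, lub, glb, (4,)),
      answer(sigma, lub, glb, (1,)))
```

Only the third query reaches the expensive fallback; the first two are settled by the Horn bounds alone.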


3.7.2 Assumption based truth maintenance systems

A clause management system (CMS) [51] is an extension of the traditional truth maintenance system (TMS) [7] to general clauses. TMS's can handle Horn clauses only. TMS's have been used in many applications such as non-monotonic reasoning, abductive reasoning, and diagnosis. In [51], Reiter and de Kleer characterize the computations of CMS's in terms of prime implicate computations via minimal supports. Given a set of clauses $\Sigma$ and a clause $C$, the clause $S$ is a minimal support for $C$ with respect to $\Sigma$ if

1. $\Sigma \cup \neg S$ is consistent.
2. $\Sigma \models S \cup C$.
3. No proper subset of $S$ has properties 1 and 2.

Reiter and de Kleer show that the set $\{ S \mid S \in \Delta(C, \Sigma)$ and no clause of $\Delta(C, \Sigma)$ is a proper subset of $S \}$ is the set of all minimal supports of $C$ with respect to $\Sigma$, where $\Delta(C, \Sigma)$ is the set $\{ \Pi - C \mid \Pi$ is a prime implicate of $\Sigma$ and $\Pi \cap C \neq \emptyset \}$. They propose that the set of clauses provided to the CMS be compiled into prime implicates. The minimal supports can then be obtained from the prime implicates.
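Once the prime implicates are available, the characterization above amounts to a set difference followed by a minimality filter. The following Python sketch assumes the prime implicates are supplied as sets of integer literals; the function name and the example inputs are hypothetical.

```python
def minimal_supports(prime_implicates, clause):
    """Minimal supports of `clause`, computed from the prime implicates.

    Builds Delta = { Pi - C : Pi is a prime implicate and Pi meets C }
    and keeps the members of Delta with no proper subset in Delta.
    """
    c = frozenset(clause)
    delta = {frozenset(pi) - c for pi in prime_implicates if frozenset(pi) & c}
    return {s for s in delta if not any(t < s for t in delta)}

# Prime implicates (~p | q), (~q | r), (~p | r) with p=1, q=2, r=3; query clause r.
print(sorted(sorted(s) for s in minimal_supports([{-1, 2}, {-2, 3}, {-1, 3}], {3})))
```

No entailment test is performed at query time; all of the logical work has been moved into the compilation of the prime implicates.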

3.7.3 Other applications

Prime implicants and prime implicates have also been used in other AI applications such as computing circumscription [21, 46], machine learning [53], and computing diagnoses [10, 31]. In this thesis we develop an alternative method to compute diagnoses without using prime implicates and prime implicants.


Chapter 4 Eliminating Subsumed Paths Without Subsumption Checking

4.1 Subsumed paths and Anti-links

Our goal is to first identify as many subsumed paths in an NNF formula as possible in an efficient manner and then eliminate them. The presence of anti-links (both disjunctive and conjunctive) in a graph may indicate that subsumed d-paths are present in the graph. We now define anti-links and then discuss ways to identify and remove subsumed paths due to anti-links.

Definition 15 If $M = (X, Y)_d$ is a d-arc in a semantic graph $G$ and if $A_X$ and $A_Y$ are nodes (occurrences of literal $A$) in $X$ and in $Y$ respectively, then we call $\{A_X, A_Y\}$ a disjunctive anti-link. If $M = (X, Y)_c$ is a c-arc in $G$, then we call $\{A_X, A_Y\}$ a conjunctive anti-link.

Note that $M$ is the smallest full block containing the anti-link. The following theorem relates subsumed paths to anti-links. The theorem is immediate for CNF formulas; there is an obvious dual theorem regarding subsumed c-paths that is immediate for DNF formulas.

Theorem 4.1.1 Let $G$ be a semantic graph in which a d-path $p$ is subsumed by a distinct non-tautological d-path $p'$ in $G$. Then $G$ contains either a disjunctive anti-link or a conjunctive anti-link.

Proof: There are two distinct possibilities: either $\ell(p) = \ell(p')$ or $\ell(p) \supset \ell(p')$. Suppose $\ell(p) = \ell(p')$. Then there must be a literal having two different occurrences. These two occurrences must be either d- or c-connected and thereby constitute either a disjunctive or a conjunctive anti-link.

Suppose $\ell(p) \supset \ell(p')$. The proof is by induction on the structure of $G$.

Basis: $G$ is a literal. The result is vacuously true since there cannot be two distinct d-paths through $G$.

Induction step:

(a) Suppose $G = (X \wedge Y)$ for some $X$ and $Y$. Then $p$ and $p'$ are either both from the same explicit subgraph ($X$ or $Y$) or from different explicit subgraphs. If they lie in the same explicit subgraph then the result follows directly from the induction hypothesis. If they are from different subgraphs then every literal in $\ell(p')$ occurs at least once in $X$ and at least once in $Y$. Any two such occurrences of some literal in $\ell(p')$ constitute a conjunctive anti-link.

(b) Suppose $G = (X \vee Y)$. Let $p_X$ and $p_Y$ be the restriction of $p$ to $X$ and to $Y$ respectively. Let $p'_X$ and $p'_Y$ be the restriction of $p'$ to $X$ and to $Y$ respectively. Since $p$ and $p'$ are distinct, either $p_X$ and $p'_X$ must be distinct, or $p_Y$ and $p'_Y$ must be distinct (or both), so assume without loss of generality that $p_X$ and $p'_X$ are distinct. If either $p_X$ subsumes $p'_X$ or vice versa, then by the induction hypothesis, $X$ must have an anti-link and so does $G$. On the other hand, if $p_X$ and $p'_X$ do not subsume each other, then there must be some literal (say $L$) in $\ell(p'_X)$ which is not in $p_X$. But since $p'$ subsumes $p$, there must be an occurrence of $L$ in $p_Y$. The two occurrences of $L$, one in $p'_X$ and the other in $p_Y$, constitute a disjunctive anti-link. $\Box$

Unfortunately, the presence of anti-links does not imply the presence of subsumed paths, and hence the converse of the above theorem is not true.

4.2 Redundant Anti-links

We now identify those disjunctive anti-links which do imply the presence of subsumed paths.

Definition 16 A disjunctive anti-link $\{A_X, A_Y\}$ with respect to the graph $G$ is redundant if either $CE(A_X) \neq A_X$ or $CE(A_Y) \neq A_Y$.

Definition 17 Let $\{A_X, A_Y\}$ be a disjunctive anti-link in graph $G$, where $M = (X, Y)_d$ is the smallest full block containing the anti-link. Then $DP(\{A_X, A_Y\}, G)$ is the set of all d-paths of $M$ which pass through both $CE(A_X) - \{A_X\}$ and $A_Y$ or through both $CE(A_Y) - \{A_Y\}$ and $A_X$.

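Example 11 Consider the following graph $G = (X, Y)_d$:

$$G = \underbrace{\bigl((A_X \vee C) \wedge B\bigr)}_{X} \;\vee\; \underbrace{\bigl(A_Y \wedge (E \vee C)\bigr)}_{Y} \qquad (4.1)$$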

The two occurrences of $A$ form a disjunctive anti-link. Because $CE(A_Y) - \{A_Y\} = Y - \{A_Y\}$ and $DPE(A_X, X) = A \vee C$, $DP(\{A_X, A_Y\}, G)$ contains the d-path $p = \{A_X, C, E, C\}$. But since $CE(A_X) = A_X$, there are no paths through $CE(A_X) - \{A_X\}$; $p$ is the only member of $DP(\{A_X, A_Y\}, G)$. The anti-link is redundant, and $p$ is subsumed by $p' = \{A_X, C, A_Y\}$ (with literal set $\{A, C\}$). Notice that had $G$ been embedded in a larger graph $G'$, every d-path $q$ containing $p$ in $G'$ would be subsumed by a corresponding d-path $q'$ that differs from $q$ only in that $q'$ contains $p'$ instead of $p$.
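The d-paths of this example can be enumerated mechanically. The Python sketch below assumes the graph of Example 11 is $((A \vee C) \wedge B) \vee (A \wedge (E \vee C))$ and represents it as nested tuples; the representation and the function name are assumptions of this illustration. A d-path through a conjunction passes through exactly one conjunct, and a d-path through a disjunction combines one d-path from each disjunct.

```python
from itertools import product

def d_paths(g):
    """Return the d-paths of an NNF graph as tuples of literal occurrences."""
    if isinstance(g, str):
        return [(g,)]
    op, left, right = g
    if op == "and":
        # A d-path goes through exactly one argument of a conjunction.
        return d_paths(left) + d_paths(right)
    # op == "or": combine one d-path from each argument of the disjunction.
    return [p + q for p, q in product(d_paths(left), d_paths(right))]

X = ("and", ("or", "A", "C"), "B")
Y = ("and", "A", ("or", "E", "C"))
G = ("or", X, Y)

paths = d_paths(G)
literal_sets = [frozenset(p) for p in paths]
# A d-path is subsumed if another d-path has a proper subset of its literals.
subsumed = [p for p, s in zip(paths, literal_sets)
            if any(t < s for t in literal_sets)]
print(paths)
print(subsumed)
```

The only subsumed d-path found this way is the path through $A_X$, $C$, $E$, $C$, which is the single member of $DP(\{A_X, A_Y\}, G)$ identified in the example.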

In general, one or both of the literals in a redundant anti-link $\{A_X, A_Y\}$ is an argument of a conjunction, and $DP(\{A_X, A_Y\}, G) \neq \emptyset$. In the above example, the two occurrences of $C$ are both arguments of disjunctions, and thus comprise a non-redundant anti-link for which $DP(\{C_X, C_Y\}, G) = \emptyset$. Although only redundant disjunctive anti-links contribute directly to subsumed d-paths, non-redundant anti-links do not prohibit the existence of subsumed paths. However, such non-redundant anti-links do not themselves provide any evidence that such paths are in fact present.

Theorem 4.2.1 Let $\{A_X, A_Y\}$ be a redundant disjunctive anti-link in a semantic graph $G$. Then each d-path in $DP(\{A_X, A_Y\}, G)$ is properly subsumed by a d-path in $G$ that contains the anti-link.

Proof: Recall that a d-path (c-path) in a graph $G$ is said to pass through a subgraph $X$ of $G$ if the path, when restricted to the set of nodes in $X$, forms a d-path (c-path) in $X$. Let $p \in DP(\{A_X, A_Y\}, G)$, and assume without loss of generality that $p$ passes through both $CE(A_X) - \{A_X\}$ and $A_Y$. Note that $CE(A_X) - \{A_X\}$ is non-empty and that $M = (X \vee Y)$ is the smallest full block containing the anti-link. We may write $CE(A_X)$ as $(A \wedge C_1 \wedge \ldots \wedge C_n)$, where $n \geq 1$.

Let $p = p_X \cup p_Y \cup p_o$, where $p_X$ and $p_Y$ are $p$ restricted to $X$ and to $Y$, respectively, and $p_o$ is $p$ restricted to nodes outside of both $X$ and $Y$. By construction, $A_X \notin p_X$ and thus $p_X$ passes through some $C_i$, $1 \leq i \leq n$. So $p_X = p'_X \cup p_{C_i}$, where $p_{C_i}$ is $p_X$ restricted to $C_i$, and hence $p = p'_X \cup p_{C_i} \cup p_Y \cup p_o$. The d-path $p'_X \cup \{A_X\}$ clearly passes through $X$, and since $A_Y \in p_Y$, $p' = p'_X \cup \{A_X\} \cup p_Y \cup p_o$ subsumes $p$. $\Box$

4.3 An Anti-Link Operator

The identification of redundant disjunctive anti-links can be done easily by checking to see if either $CE(A_X) \neq A_X$ or $CE(A_Y) \neq A_Y$. After identifying a redundant anti-link, it is possible to remove it using the disjunctive anti-link dissolvent (DADV) operator defined below; in the process, all d-paths in $DP(\{A_X, A_Y\}, G)$ are eliminated, and the two occurrences of the anti-link literal are collapsed into one.

Definition 18 Let $\{A_X, A_Y\}$ be a disjunctive anti-link and let $M = (X, Y)_d$ be the smallest full block containing the anti-link. Then

$$
DADV(\{A_X, A_Y\}, M) =
\begin{array}{l}
(DC(A_X, X) \vee DC(A_Y, Y)) \\
\wedge\; (DC(CE(A_X), X) \vee DPE(A_Y, Y)) \\
\wedge\; (DPE(A_X, X) \vee CC(A_Y, Y))
\end{array}
$$

Example 12 Consider again the semantic graph (4.1) from Example 11. We have $DC(A_X, X) = B$ and $DC(A_Y, Y) = (E \vee C)$, so the upper conjunct in DADV is $(B \vee E \vee C)$. For the middle conjunct, $CE(A_X) = A_X$, $DC(CE(A_X), X) = B$, and $DPE(A_Y, Y) = A_Y$, so this conjunct is $(B \vee A)$. Finally, in the lower conjunct, $DPE(A_X, X) = (A \vee C)$ and $CC(A_Y, Y) = false$, so this reduces to $(A \vee C)$. The result is:

$$
DADV(\{A_X, A_Y\}, M) =
\begin{array}{l}
(B \vee E \vee C) \\
\wedge\; (B \vee A) \\
\wedge\; (A \vee C)
\end{array}
$$

We point out that although DADV produces a CNF formula in the above simple example, in general it does not. In particular, the above graph can be simplified as the consequence of easily recognizable conditions, and the resulting graph is not in CNF. For the details, see Case 1 of Section 4.5.
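The outcome of Example 12 can be checked by brute force. The Python sketch below assumes that graph (4.1) is $((A \vee C) \wedge B) \vee (A \wedge (E \vee C))$ and that the DADV result is $(B \vee E \vee C) \wedge (B \vee A) \wedge (A \vee C)$; it compares the two over all truth assignments. It does not implement the DADV operator itself, and the function names are hypothetical.

```python
from itertools import product

def graph_4_1(a, b, c, e):
    """Graph (4.1): ((A or C) and B) or (A and (E or C))."""
    return ((a or c) and b) or (a and (e or c))

def dadv_result(a, b, c, e):
    """The three conjuncts obtained in Example 12."""
    return (b or e or c) and (b or a) and (a or c)

equivalent = all(
    bool(graph_4_1(*v)) == bool(dadv_result(*v))
    for v in product([False, True], repeat=4)
)

# Literal sets of the d-paths of the result, which is in CNF in this example.
result_d_paths = [{"B", "E", "C"}, {"B", "A"}, {"A", "C"}]
removed_path = {"A", "C", "E"}           # literal set of the subsumed d-path p
print(equivalent, removed_path in result_d_paths)
```

The two graphs agree on all sixteen assignments, and the literal set of the subsumed d-path of Example 11 no longer appears among the d-paths of the result.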

4.4 Correctness of DADV

In Theorem 4.4.2 we show that $DADV(\{A_X, A_Y\}, G)$ is logically equivalent to $G$ and does not contain the d-paths of $DP(\{A_X, A_Y\}, G)$. First we have to prove the following lemma.

Lemma 4.4.1 If $G$ is a graph and $A$ is a literal occurrence in $G$, then $CC(A, G)$ is logically equivalent to

$$(DPE(A, G) - \{A\}) \wedge DC(CE(A), G).$$

Proof: We prove the lemma by showing that the formula on the left and the formula on the right possess exactly the same set of d-paths; the result then follows from Lemma 2.4.1. The proof is done via induction on the syntactic structure of $G$ (the lemma trivially holds if $G = true$ or $G = false$).

1. If $G$ is a literal, then $G = A$ and both the set of d-paths of $CC(A, G)$ and the set of d-paths of $(DPE(A, G) - \{A\}) \wedge DC(CE(A), G)$ are empty. Note that $DC(CE(A), G) = DC(A, A) = true$, but $(DPE(A, G) - \{A\}) = \{A\} - \{A\} = false = CC(A, A)$.

2. If $G = (X, Y)_d$, then without loss of generality assume $A$ belongs to $X$. Hence $CC(A, G) = (CC(A, X) \vee Y)$. By the induction hypothesis, the d-paths of $CC(A, X)$ are just those of $(DPE(A, X) - \{A\}) \wedge DC(CE(A), X)$. So $CC(A, G)$ has the same d-paths as $((DPE(A, X) - \{A\}) \wedge DC(CE(A), X)) \vee Y$. Now consider the right hand side of the equation. Since $A$ is in $X$,

$$DPE(A, G) = (DPE(A, X) \vee Y).$$

Therefore, $DPE(A, G) - \{A\} = ((DPE(A, X) - \{A\}) \vee Y)$. Also, $CE(A)$ will be disjoint from $Y$, and thus

$$DC(CE(A), G) = DC(CE(A), X) \vee Y.$$

Therefore we can write the right hand side of the equation as $((DPE(A, X) - \{A\}) \vee Y) \wedge (DC(CE(A), X) \vee Y)$. By factoring out the subgraph $Y$ we get an equivalent subgraph $((DPE(A, X) - \{A\}) \wedge DC(CE(A), X)) \vee Y$ having the same d-paths. But this is just the semantic graph that has been shown to have the same d-paths as the left hand side.

3. Finally suppose $G = (X, Y)_c$; again assume that $A$ is in $X$. Now there are two subcases to consider.

(a) If $CC(A, G) = false$ and thus has no d-paths, then $A$ in $X$ is not d-connected to any other subgraph in $X$. Hence $X$ is of the form $A \wedge C_1 \wedge \ldots \wedge C_n$ (where $n \geq 0$). But then $G = A \wedge C_1 \wedge \ldots \wedge C_n \wedge Y$, $CE(A) = G$, and $DPE(A, G) = DPE(A, X) = A$. As a result, both $DC(CE(A), G)$ and $DPE(A, G) - \{A\}$ have no d-paths.

(b) If $CC(A, G) \neq false$, then

$$CC(A, G) = CC(A, X) \wedge Y,$$

and thus $CC(A, X) \neq false$. Therefore, by the induction hypothesis, $CC(A, G)$ has the same d-paths as $(DPE(A, X) - \{A\}) \wedge DC(CE(A), X) \wedge Y$. Focusing now on the right hand side of the equation, $DPE(A, G) = DPE(A, X)$ by definition. The c-extension of $A$ can only include nodes from $X$ (otherwise, $CC(A, G) = false$, contrary to the subcase (b) condition). Therefore, $DC(CE(A), G) = DC(CE(A), X) \wedge Y$. Therefore the right hand side of the equation has the same d-paths as $(DPE(A, X) - \{A\}) \wedge (DC(CE(A), X) \wedge Y)$. This is just the result obtained for the left hand side in this subcase. $\Box$

Theorem 4.4.2 Let $M = (X, Y)_d$ be the smallest full block containing $\{A_X, A_Y\}$, a disjunctive anti-link in semantic graph $G$. Then $DADV(\{A_X, A_Y\}, M)$ is equivalent to $M$ and differs in d-paths from $M$ as follows: d-paths in $DP(\{A_X, A_Y\}, G)$ are not present, and any d-path of $M$ containing the anti-link is replaced by a path with the same literal set having only one occurrence of the anti-link literal.

Proof: Note that $A_X$ and $A_Y$ are literal occurrences (and hence d-blocks) in $X$ and in $Y$ respectively. By the dual of Lemma 2.5.4, $X$ is equivalent to $DC(A_X, X) \wedge DPE(A_X, X)$, and from the distributive law, $M$ is equivalent to

$$
\begin{array}{l}
(DC(A_X, X) \vee Y) \\
\wedge\; (DPE(A_X, X) \vee Y)
\end{array}
$$

Similarly, $Y$ is equivalent to $DC(A_Y, Y) \wedge DPE(A_Y, Y)$, and we expand the upper occurrence of $Y$ and distribute. Thus, $M$ is equivalent to

$$
\begin{array}{l}
(DC(A_X, X) \vee DC(A_Y, Y)) \\
\wedge\; (DC(A_X, X) \vee DPE(A_Y, Y)) \\
\wedge\; (DPE(A_X, X) \vee Y)
\end{array}
$$

By the duals of Lemmas 2.5.1 and 2.5.3, not only have we rewritten $M$ equivalently, but the d-paths of $M$ have been preserved. We will continue to rewrite $M$; our goal is to eventually put it in an equivalent form in which the d-paths of $DP(\{A_X, A_Y\}, M)$ have been omitted.

Consider the d-paths of $DC(A_X, X)$, that is, the d-paths in $X$ that miss $A_X$. They either miss $CE(A_X)$, the c-extension of $A_X$, or pass through $CE(A_X) - \{A_X\}$. Hence $DC(A_X, X)$ has the same d-paths as

$$DPE(CE(A_X) - \{A_X\}, X) \wedge DC(CE(A_X), X).$$

By replacing the lower occurrence of DC (AX ; X ) in the previous graph, we get the following graph M 0 which is equivalent to M and has the same d-paths as M :

DC (AX ; X ) _ DC (AY ; Y )

^ M0 =

DPE ((CE (AX ) ? fAX g); X )

^

DC (CE (AX ); X )

_ DPE (AY ; Y ) ^

DPE (AX ; X ) _ Y Every d-path in the subgraph DPE ((CE (AX ) ? fAX g); X ) _ DPE (AY ; Y ) is in DP (fAX ; AY g; M ). By Theorem 4.2.1, all these paths are subsumed by other dpaths. Therefore, we can remove the subgraph DPE ((CE (AX ) ?fAX g); X ) from M 0 while preserving equivalence to get the graph M 00 shown below. DC (AX ; X ) _ DC (AY ; Y )

^ M 00 = DC (CE (AX ); X ) _ DPE (AY ; Y )

^ DPE (AX ; X ) _ Y Again by using arguments dual to the one given earlier for X , we have that Y and DPE (AX ; Y )

^

DPE ((CE (AY ) ? fAY g); Y )

^

DC (CE (AY ); Y ) have identical d-paths. Replacing Y in M 00, we nd that every d-path in the subgraph DPE (AX ; X ) _ DPE ((CE (AY ) ?fAY g); Y ) is in DP (fAX ; AY g; M ). Again by Theorem 4.2.1, these 56

paths are also subsumed by other d-paths. Therefore we can remove the subgraph DPE ((CE (AY ) ? fAY g); Y ) and preserve equivalence; M 000 results.

\[ M''' = \bigl(\mathrm{DC}(A_X, X) \vee \mathrm{DC}(A_Y, Y)\bigr) \wedge \bigl(\mathrm{DC}(\mathrm{CE}(A_X), X) \vee \mathrm{DPE}(A_Y, Y)\bigr) \wedge \bigl(\mathrm{DPE}(A_X, X) \vee (\mathrm{DPE}(A_Y, Y) \wedge \mathrm{DC}(\mathrm{CE}(A_Y), Y))\bigr) \]

The d-paths in $M'''$ are those of $M$ excluding the d-paths in $\mathrm{DP}(\{A_X, A_Y\}, M)$. Consider now the d-paths of $\mathrm{DPE}(A_X, X) \vee \mathrm{DPE}(A_Y, Y)$ in $M'''$. They are exactly those of $M$ (and of $M'''$) that contain the anti-link: each contains two occurrences of the literal $A$. Hence we can remove the node $A_Y$ from $\mathrm{DPE}(A_Y, Y)$ and apply Lemma 4.4.1 to get $M''''$.

\[ M'''' = \bigl(\mathrm{DC}(A_X, X) \vee \mathrm{DC}(A_Y, Y)\bigr) \wedge \bigl(\mathrm{DC}(\mathrm{CE}(A_X), X) \vee \mathrm{DPE}(A_Y, Y)\bigr) \wedge \bigl(\mathrm{DPE}(A_X, X) \vee ((\mathrm{DPE}(A_Y, Y) - \{A_Y\}) \wedge \mathrm{DC}(\mathrm{CE}(A_Y), Y))\bigr) \]

Applying Lemma 4.4.1 to $M''''$ we get the following graph

\[ \bigl(\mathrm{DC}(A_X, X) \vee \mathrm{DC}(A_Y, Y)\bigr) \wedge \bigl(\mathrm{DC}(\mathrm{CE}(A_X), X) \vee \mathrm{DPE}(A_Y, Y)\bigr) \wedge \bigl(\mathrm{DPE}(A_X, X) \vee \mathrm{CC}(A_Y, Y)\bigr) \]

which is $\mathrm{DADV}(\{A_X, A_Y\}, M)$. In constructing $\mathrm{DADV}(\{A_X, A_Y\}, M)$ we have removed only subsumed d-paths and altered only d-paths that contain the anti-link by collapsing the double occurrence of the anti-link literal. Hence $\mathrm{DADV}(\{A_X, A_Y\}, M)$ is equivalent to $M$, does not contain the anti-link, and does not contain any d-path of $\mathrm{DP}(\{A_X, A_Y\}, M)$. $\Box$

Theorem 4.4.2 gives us a method to remove disjunctive anti-links and some subsumed d-paths: simply identify a redundant anti-link $H = \{A_X, A_Y\}$ and the smallest full block $M$ containing it, and then replace $M$ by $\mathrm{DADV}(H, M)$. The cost of this operation is proportional to the size of the graph replacing $M$, and this is linear in the size of $M$. Also, c-connected literals in $M$ do not become d-connected in $\mathrm{DADV}(H, M)$. Thus truly new disjunctive anti-links are not introduced. However, parts of the graph may be duplicated, and this may give rise to additional copies of anti-links not yet removed. Nevertheless, persistent removal of redundant disjunctive anti-links (in which case $\mathrm{DP}(\{A_X, A_Y\}, M) \neq \emptyset$) is a terminating process, because the number of d-paths is strictly reduced at each step. This proves:

Theorem 4.4.3 Finitely many applications of the DADV operation on redundant anti-links will result in a graph without redundant disjunctive anti-links, and termination of this process is independent of the choice of anti-link at each step. $\Box$

Although we can remove all the redundant disjunctive anti-links in the graph, this process can introduce new conjunctive anti-links. Such anti-links may indicate the presence of subsumed d-paths, but the situation is not as favorable as with disjunctive anti-links; see Section 4.7.

4.5 Simplifications

Obviously, $\mathrm{DADV}(\{A_X, A_Y\}, M)$ can be syntactically larger than $M = (X, Y)_d$. Under certain conditions we may use simplified alternative definitions for DADV. These definitions result in formulas which are syntactically smaller than those that result from the general definition. The following is a list of possible simplifications.

1. If $\mathrm{CE}(A_X) = A_X$ (and $\mathrm{CE}(A_X) \neq X$), then $\mathrm{DC}(\mathrm{CE}(A_X), X) = \mathrm{DC}(A_X, X)$. Therefore, by (possibly non-atomic) factoring on $\mathrm{DC}(A_X, X)$ and observing that $\mathrm{DC}(A_Y, Y) \wedge \mathrm{DPE}(A_Y, Y)$ has the same d-paths as $Y$, $\mathrm{DADV}(\{A_X, A_Y\}, M)$ becomes

\[ \bigl(\mathrm{DC}(A_X, X) \vee Y\bigr) \wedge \bigl(\mathrm{DPE}(A_X, X) \vee \mathrm{CC}(A_Y, Y)\bigr). \]

It turns out that this rule applies to (2.2) in Example 1. Since $\mathrm{CE}(A_X) = A_X$, the simplified rule for this case results in the following graph.

[Semantic graph over the literals A, B, C, and E; its two-dimensional layout is not recoverable here.]

2. If $\mathrm{CE}(A_X) = X$, then $\mathrm{DC}(\mathrm{CE}(A_X), X) = \mathit{true}$. Hence $\mathrm{DPE}(A_X, X) = A_X$ and $\mathrm{DC}(A_X, X) = (X - \{A_X\})$, and $\mathrm{DADV}(\{A_X, A_Y\}, M)$ becomes

\[ \bigl((X - \{A_X\}) \vee \mathrm{DC}(A_Y, Y)\bigr) \wedge \bigl(A_X \vee \mathrm{CC}(A_Y, Y)\bigr). \]

3. If both Case 1 and Case 2 apply, then $\mathrm{CE}(A_X) = X = A_X$, and the above formula simplifies to $A_X \vee \mathrm{CC}(A_Y, Y)$.

Note that in all the above versions of DADV, the roles of $X$ and $Y$ can be interchanged.

4.6 Disjunctive Anti-Links and Factoring

It is interesting to note that the DADV operation contains factoring (i.e., the ordinary application of the distributive law to a pair of conjunctions containing a common argument) as a special case. This is just the condition for Case 2 above, except that both $\mathrm{CE}(A_X) = X$ and $\mathrm{CE}(A_Y) = Y$ hold. Under these conditions, $\mathrm{DADV}(\{A_X, A_Y\}, M)$ becomes

\[ \bigl((X - \{A_X\}) \vee (Y - \{A_Y\})\bigr) \wedge A. \]

This is the graph obtained by disjunctive factoring [40].

The DADV operator also captures the absorption law (or merging). If $A_X$ and $A_Y$ are both arguments of the same disjunction, then $X = A_X$, $Y = A_Y$, and $\mathrm{DADV}(\{A_X, A_Y\}, M) = A_X$. Note, however, that technically the anti-link is not redundant in this case.

4.7 Conjunctive Anti-Links

There are conjunctive anti-links that always indicate the presence of d-paths that are subsumed by others, and they are easy to detect. However, the conditions to be met are much more restrictive than those for redundant disjunctive anti-links. Consider a conjunctive anti-link $\{A_X, A_Y\}$, where the smallest full block $M$ containing the anti-link is $(A_X, Y)_c$. Every d-path in $Y$ which passes through $A_Y$ will be subsumed by the d-path consisting of the single literal $A_X$. Hence we can replace $Y$ by $\mathrm{DC}(A_Y, Y)$. This is a kind of dual to Case 3 of the simplified versions of DADV discussed earlier. There, the anti-link $\{A_X, A_Y\}$ is disjunctive and $M = (A_X, Y)_d$. The simplified DADV operation just replaces $Y$ by $\mathrm{CC}(A_Y, Y)$. Note that the conjunctive anti-link operation above removes subsumed d-paths, whereas the Case 3 disjunctive anti-link operation can either remove paths or merely remove the second occurrence of the anti-link literal on paths that contain the anti-link. Both operations involve d-paths, and both have strictly dual operations that would affect c-paths instead.

4.8 Complexity Considerations

Eliminating all subsumed paths in a graph efficiently does not seem feasible. The following definition makes precise the notion of minimality with respect to subsumed d-paths. Then we show that it is NP-hard to achieve this property.

Definition 19 Let $G$ be a semantic graph; we say that a graph $G'$ is a d-minimal equivalent of $G$ if it satisfies the following conditions.
1. $G$ is logically equivalent to $G'$.
2. If $p'$ and $q'$ are two distinct d-paths in $G'$, then $p'$ does not subsume $q'$ and vice versa.
3. If $p'$ is a d-path in $G'$, then there is a d-path $p$ in $G$ such that $\ell(p) = \ell(p')$.
4. If $p$ is a minimal d-path in $G$, then there is a d-path $p'$ in $G'$ such that $\ell(p') = \ell(p)$.
The c-minimal equivalent of a graph is defined in the obvious dual way.

Note that Property 1 above is implied by Properties 3 and 4, and that $G'$ need not be unique. However, the d-paths of $G'$ will always include all essential (and possibly some inessential) prime implicates of $G$. Computing d-minimal equivalent graphs efficiently would be helpful for finding prime implicates. In a d-minimal equivalent graph of a full dissolvent, subsumption checks can be completely eliminated by Property 2 above. Hence to find the prime implicates of $G$, we can find a d-minimal equivalent $G'$ of the full dissolvent $\mathrm{FD}(G)$, and then simply enumerate the d-paths of $G'$. A d-minimal equivalent of a given graph $G$ can be trivially obtained by first enumerating all the d-paths of $G$ and then eliminating all the subsumed d-paths. This algorithm is exponential in the size of $G$, because $G'$ is being constructed in CNF. However, an NNF d-minimal equivalent $G'$ of $G$ may be small compared to a CNF d-minimal equivalent. Even so, the problem is NP-hard (proof follows) and hence is not likely to have an efficient algorithm.
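The trivial algorithm just described (enumerate all d-paths, then discard the subsumed ones) can be sketched as follows. This is only an illustration under stated assumptions: the tuple encoding of NNF graphs and the names `d_paths` and `remove_subsumed` are hypothetical and not taken from the thesis, and paths are represented by their literal sets.

```python
from itertools import product


def d_paths(graph):
    """Literal sets of the d-paths of an NNF graph.

    A graph is a literal (a string) or a tuple ("and", g1, ..., gk) or
    ("or", g1, ..., gk). This encoding is an assumption made for the sketch.
    """
    if isinstance(graph, str):
        return {frozenset([graph])}
    op, *args = graph
    parts = [d_paths(arg) for arg in args]
    if op == "and":
        # A d-path through a conjunction passes through exactly one conjunct.
        return set().union(*parts)
    # A d-path through a disjunction passes through every disjunct.
    return {frozenset().union(*combo) for combo in product(*parts)}


def remove_subsumed(paths):
    """Drop every literal set that has a proper subset in the collection."""
    return {p for p in paths if not any(q < p for q in paths)}


# (A or B) and (A or (B and C)) and A: the single-literal path {A}
# subsumes every other d-path of this graph.
g = ("and", ("or", "A", "B"), ("or", "A", ("and", "B", "C")), "A")
minimal = remove_subsumed(d_paths(g))
```

The enumeration is exponential in general, which is exactly the cost the text attributes to building the result in CNF.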

Theorem 4.8.1 The following problem (elimination of subsumed paths) is NP-hard. Given a graph $G$, find a d-minimal equivalent graph $G'$.

To show NP-hardness we reduce from satisfiability of CNF formulas. Let $C$ be an instance of the CNF satisfiability problem and $\{X_1, \ldots, X_n\}$ be the set of variables in $C$. Let $A, X_1', \ldots, X_n'$ be distinct variables not occurring in $\{X_1, \ldots, X_n\}$. Let $D$ be the semantic graph obtained by replacing $\overline{X_i}$ by $X_i'$, $1 \le i \le n$, in $\neg C$ (by which we denote the NNF of the negation of $C$). We construct the following semantic graph $G$.

\[ G = (A \vee D) \wedge (X_1 \vee X_1') \wedge \dots \wedge (X_n \vee X_n') \]

The size of the graph $G$ is no more than a constant factor of the size of $C$, and $G$ can therefore be constructed in linear time. It is easy to see that any d-path which includes the literal $A$ must pass through $D$ and vice versa. Let $G'$ be any graph which is a d-minimal equivalent of $G$. We will show that $C$ is satisfiable iff the literal $A$ occurs in $G'$.

Suppose $C$ is satisfiable; then $\neg C$ is falsifiable, and there are d-paths (in fact, clauses, since $\neg C$ is in DNF) in $\neg C$ that do not contain any disjunctive link $\{X_i, \overline{X_i}\}$. All such d-paths through $D$ do not contain any $\{X_i, X_i'\}$, $1 \le i \le n$; at least one such, say $p$, is not subsumed by another d-path through $D$. The d-path $pA$ cannot be subsumed by any other d-path in $G$, and hence there will be a path in $G'$ which has the same literal set as $pA$. Hence the literal $A$ must occur in $G'$.

If $C$ is not satisfiable, then $\neg C$ is valid. Therefore for every d-path $p$ in $D$ (and hence every d-path through $A$), there is some $i$, $1 \le i \le n$, such that the pair of literals $\{X_i, X_i'\} \subseteq \ell(p)$. But every such pair of literals forms a d-path in $G$, and hence every d-path containing $A$ will be subsumed by another d-path in $G$. Furthermore, the subsuming path will not contain the literal $A$. By definition of d-minimal equivalent and by construction of $G'$, no d-path in $G'$ can contain the literal $A$, and thus the literal $A$ cannot occur in $G'$.

If we can solve the elimination of subsumed paths problem in polynomial time, then we have the following algorithm which can solve the satisfiability of CNF in polynomial time: Given any instance $C$ of the CNF satisfiability problem, we can construct in polynomial time the graph $G$ as shown earlier. We then find the graph $G'$ using the algorithm for elimination of subsumed paths. The size of $G'$ will be polynomial in the size of $G$ (since computing it required only polynomial time). Now $C$ is satisfiable iff the literal $A$ occurs in $G'$, and this check can be done in polynomial time. $\Box$

By a completely dual construction we obtain the following corollary.
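The reduction can be exercised on small instances with the following sketch. It is not an efficient procedure (it enumerates d-paths explicitly, which is exponential); it only illustrates that the literal A survives the removal of subsumed d-paths exactly when C is satisfiable. The DIMACS-style integer encoding of clauses, the string names for literals, and the function names are assumptions made for this illustration, as is the reading that complemented variables in the negation of C are the ones renamed to primed variables.

```python
from itertools import product


def remove_subsumed(paths):
    """Keep only literal sets with no proper subset in the collection."""
    paths = set(paths)
    return {p for p in paths if not any(q < p for q in paths)}


def reduction_d_paths(cnf):
    """Literal sets of the d-paths of G = (A or D) and (X1 or X1') and ...

    cnf is a list of clauses, each a list of nonzero integers
    (i for X_i, -i for its complement); an assumed encoding.
    """
    variables = sorted({abs(lit) for clause in cnf for lit in clause})

    def rename(lit):
        # Literal of not-C: X_i stays X_i, the complement of X_i becomes X_i'.
        return f"X{lit}" if lit > 0 else f"X{-lit}'"

    # not-C is a DNF: each clause of C contributes the conjunction of its
    # complemented literals.
    terms = [[rename(-lit) for lit in clause] for clause in cnf]
    # A d-path through a disjunction of conjunctions picks one literal per term.
    paths_through_d = {frozenset(choice) for choice in product(*terms)}
    paths = {p | {"A"} for p in paths_through_d}
    paths |= {frozenset({f"X{v}", f"X{v}'"}) for v in variables}
    return paths


def satisfiable_via_reduction(cnf):
    """C is satisfiable iff A occurs on some unsubsumed d-path of G."""
    return any("A" in p for p in remove_subsumed(reduction_d_paths(cnf)))
```

For example, `satisfiable_via_reduction([[1, 2], [-1]])` examines the d-paths {A, X1, X1'} and {A, X1, X2'}; only the first is subsumed by a pair {Xi, Xi'}.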

Corollary 4.8.2 Given a graph $G$, finding a c-minimal equivalent of $G$ is NP-hard.

We have seen that the general problem of computing d- or c-minimal graphs is NP-hard. Nevertheless, redundant disjunctive anti-links are easily recognized, and eliminating their corresponding subsumed d-paths can be done without direct subsumption checks. On the other hand, recognizable subsumed d-paths due to conjunctive anti-links are not likely to be as plentiful, due to the strong restriction defining such useful anti-links. It is also difficult to find out whether an arbitrary conjunctive anti-link results in subsumed d-paths. In fact, this problem is NP-complete.

Theorem 4.8.3 The following problem is NP-complete. Given a conjunctive anti-link $\{A_X, A_Y\}$ in a graph $G$, determine whether there are two d-paths $p_X$ and $p_Y$ in $G$ such that $p_X$ passes through $A_X$, $p_Y$ passes through $A_Y$, and either $p_X$ subsumes $p_Y$ or vice versa.

It is easy to see that this problem is in NP. To show NP-hardness we reduce from satisfiability of CNF formulas. Let $C$ be an instance of the CNF satisfiability problem and $\{X_1, \ldots, X_n\}$ be the set of variables in $C$. Let $A, B, X_1', \ldots, X_n'$ be distinct variables not occurring in $\{X_1, \ldots, X_n\}$. Let $D$ be the semantic graph obtained by replacing $\overline{X_i}$ by $X_i'$, $1 \le i \le n$, in $\neg C$. We construct the following semantic graph $G$.

\[ G = (A_1 \vee D) \wedge \bigl(A_2 \vee B \vee (X_1 \wedge X_1') \vee \dots \vee (X_n \wedge X_n')\bigr) \]

The subgraph $D$ is a DNF formula, and $A_1$ and $A_2$ are two different occurrences of the literal $A$. These two literal occurrences form a conjunctive anti-link in $G$. Every d-path through $A_2$ contains the literal $B$, and only d-paths containing $A_1$ can possibly subsume d-paths through $A_2$. The size of the graph $G$ is no more than a constant factor of the size of $C$, and $G$ can therefore be constructed in linear time. We must show that $C$ is satisfiable iff there is a d-path through $A_1$ that subsumes another d-path through $A_2$.

Suppose first that $C$ is satisfiable. Then $\neg C$ is falsifiable, and there is at least one d-path in $\neg C$ that does not contain any disjunctive link $\{X_i, \overline{X_i}\}$. Therefore some d-path $p$ through $D$ does not contain any of the literal pairs $\{X_i, X_i'\}$, $1 \le i \le n$. Recall that $\ell(p)$ is defined to be the literal set of path $p$. It is easy to see that since $\ell(p)$ does not contain such a literal pair (corresponding to a d-link in $\neg C$), a d-path $p'$ can be chosen that passes through the $n$ rightmost disjuncts in the lower part of $G$, such that $\ell(p') \supseteq \ell(p)$. Clearly, $A_2 B p'$ is subsumed by $A_1 p$.

To show the if-part, suppose there is some d-path (say $p$) through $A_1$ that subsumes another d-path (say $p'$) through $A_2$. Then $p$ cannot contain any literal pair $\{X_i, X_i'\}$, because no such pair occurs on any d-path in the lower part of $G$. Therefore $p$, when restricted to $D$, will not have such a literal pair; $\neg C$ has a d-path without a d-link, and hence $C$ is satisfiable. $\Box$

Corollary 4.8.4 The problem dual to the one described in Theorem 4.8.3, involving disjunctive anti-links and c-paths, is NP-complete.


4.9 Some Benchmark Examples

Ngair [43] has investigated examples that prove difficult for many proposed prime implicate/implicant algorithms. In this section, we show that Pi + anti-links is effective for some of these examples. For other examples from [43], applying anti-link techniques appears not to produce as significant an improvement. We develop an additional technique based on strictly pure full blocks that results in a dramatic improvement for these latter examples.

In [43] a class of formulas is proposed for which reliance on an intermediate CNF form can result in an exponential increase in size and hence would be intractable for CNF-based algorithms. Dissolution + Pi also does poorly for these examples: although the full dissolvent can be computed quickly, a large number of subsumption checks must be performed by Pi. It turns out, however, that in this case the subsumed implicates correspond to easily recognizable anti-links of both the disjunctive and conjunctive kind. We show that if these anti-links are removed after dissolution is performed, dissolution + Pi can find all the implicates in polynomial time. Ngair's formulas are abbreviated with $F_n$ ($n \ge 1$) and are defined as:

\[ F_n = \Bigl(\bigwedge_{i=2}^{n} \overline{A}_{2i-1} \vee A_1\Bigr) \wedge \Bigl(\bigwedge_{i=2}^{n} \overline{A}_{2i} \vee A_2\Bigr) \wedge \bigvee_{i=1}^{n} (A_{2i-1} \wedge A_{2i}) \]

Figure 4.1: Semantic graph of Ngair's formulas before and after dissolution.

In the left part of Figure 4.1 we show the graph of $F_n$ for a fixed $n$. Clearly $F_n$ has $4n$ literals and $2n - 2$ c-links; dissolution can remove these links by performing $2n$ dissolution steps. The full dissolvent that results is depicted in the right part of Figure 4.1. The structure of the full dissolvent depends on the order in which links are selected for application of dissolution; the above dissolvent is the one obtained by the current version of our propositional dissolution prover Dissolver. (The compact version (2.3) from Proposition 2 of the dissolvent is used; $X$ is chosen to be the smaller of the two c-blocks.) We can now factor on all the occurrences of both $A_1$ and $A_2$ in the upper right hand part of the graph and on the two occurrences of $A_1$ in the lower left corner. The resulting graph is shown in Figure 4.2. (Since Dissolver is NNF-based, such factoring is not only feasible but is in fact implemented and routinely employed.)

Figure 4.2: Ngair's formulas after dissolution and factoring.

The two occurrences of $A_2$ at the bottom left part of the graph form a redundant disjunctive anti-link; they can be removed using the special case covered by Rule 1 for disjunctive anti-links. The two occurrences of $A_1$ on the left hand side of the graph form a conjunctive anti-link and can be removed using the conjunctive anti-link rule. This produces:

[Semantic graph whose layout is not recoverable here; it contains two occurrences each of $A_1$ and $A_2$ and the disjunction $(A_3 \wedge A_4) \vee \dots \vee (A_{2n-1} \wedge A_{2n})$.]

By factoring on $A_1$ and removing the conjunctive anti-link comprised of the two occurrences of $A_2$ (or by just factoring on $A_1 \wedge A_2$), the above graph reduces to $(A_1 \wedge A_2)$; the prime implicates are just $\{A_1\}$ and $\{A_2\}$. To get this graph, $n + 3$ factoring and 3 anti-link operations were required, which is obviously polynomial time. Hence dissolution + removal of anti-links + Pi can handle the above class of problems in polynomial time. Perhaps the most important point is that no subsumption checks whatsoever are required.
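The claim that the prime implicates of Ngair's formulas are just {A1} and {A2} can be checked by brute force for small n. The sketch below assumes one particular reading of the definition of F_n, namely that the literals A3, ..., A2n in the first two conjuncts are complemented, so that F_n is ((not A3 and ... and not A2n-1) or A1) and ((not A4 and ... and not A2n) or A2) and ((A1 and A2) or ... or (A2n-1 and A2n)). That reading, and all function names, are assumptions; the enumeration is exponential and unrelated to the dissolution-based method of the text.

```python
from itertools import product


def ngair(n, v):
    """Evaluate F_n (under the assumed reading) for assignment v: index -> bool."""
    odd = all(not v[2 * i - 1] for i in range(2, n + 1)) or v[1]
    even = all(not v[2 * i] for i in range(2, n + 1)) or v[2]
    pairs = any(v[2 * i - 1] and v[2 * i] for i in range(1, n + 1))
    return odd and even and pairs


def prime_implicates(n):
    """Brute-force prime implicates of F_n as sets of signed variable indices."""
    idx = range(1, 2 * n + 1)
    models = []
    for bits in product([False, True], repeat=2 * n):
        v = dict(zip(idx, bits))
        if ngair(n, v):
            models.append(v)
    implicates = set()
    # Each variable is absent (0), positive (1), or negative (-1) in a clause.
    for signs in product([0, 1, -1], repeat=2 * n):
        clause = frozenset(s * i for s, i in zip(signs, idx) if s)
        if clause and all(any(m[abs(l)] == (l > 0) for l in clause) for m in models):
            implicates.add(clause)
    # Prime implicates are the implicates with no proper sub-clause that is one.
    return {c for c in implicates if not any(d < c for d in implicates)}


primes_f3 = prime_implicates(3)  # positive integers stand for unnegated variables
```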

4.10 A Generalized Purity Principle

4.11 Strictly Pure Full Blocks

Recall that a full block is essentially an explicit subgraph; it is a subset of the arguments of a conjunction or disjunction, and, via commutations and reassociations, can in fact be made explicit.

Definition 20 A subgraph $M$ in a graph $G$ is pure iff all c-links or d-links that meet $M$ at all are totally within $M$. (This is just the obvious generalization of the concept of a pure literal as it is used in the literature on CNF-based automated deduction.) If, in addition, all conjunctive or disjunctive anti-links that meet $M$ at all are totally within $M$, we say that $M$ is strictly pure. (Simply put, $M$ shares no variables with the rest of $G$.) If $M$ is a full block in $G$ we speak of a (strictly) pure full block.

When factored, some of the examples from [43] contain surprisingly many strictly pure full blocks. Note that both factoring and recognizing strictly pure full blocks are polynomial operations. Intuitively, such full blocks can be replaced by single new variables, and the implicates of the resulting graph bear a strong relationship to those of the original. Of course, the full block in question must be satisfiable (since the new variable certainly is). At first, this may appear to be a heavy penalty. It is not, however, because the prime implicates of the full block itself must be computed anyway. In doing so, its satisfiability is determined as a byproduct.

The following theorems characterize the properties of strictly pure full blocks with respect to prime implicates. In them we employ the following notation: let $M$ be an explicit subgraph of a graph $G$ and let $X$ be a variable not occurring in $G$. By $G^X_M$ we denote the graph obtained by the substitution of $X$ for $M$ in $G$. Similarly, if $D$ is a disjunction of literal occurrences from $G$, we denote by $D_M$ the disjunction of literals that occur in $M$ and by $D_{G-M}$ the disjunction of literals that do not. Obviously, $D = (D_{G-M} \vee D_M)$ holds. Finally, we set

\[ D^X_M = \begin{cases} D_{G-M} \vee X & \text{if } D_M \neq \mathit{false} \text{ (empty disjunction)} \\ D_{G-M} & \text{otherwise} \end{cases} \]

Theorem 4.11.1 Let $M$ be a satisfiable strictly pure full block in a satisfiable semantic graph $G$ and let $D$ be a non-tautological disjunction of literals from $G$. If $D_M \neq \mathit{false}$, then the following statements are equivalent:
1. $D$ is a prime implicate of $G$.
2. $D^X_M$ is a prime implicate of $G^X_M$, and $D_M$ is a prime implicate of $M$. $\Box$

This theorem turns out to be a special case of Theorem 4.12.1, to be proved in the following subsection. If a graph $G$ contains several strictly pure full blocks $M_1, \ldots, M_n$, then the repeated application of Theorem 4.11.1 provides a potentially significant speedup in computing the prime implicates of $G$: Replace each strictly pure full block $M_i$ by a new variable $X_i$ ($1 \le i \le n$) and compute the prime implicates of the resulting graph $G^X_M$. Then, all substitutions of prime implicates of $M_i$ for $X_i$ in the prime implicates of $G^X_M$ result in prime implicates of $G$. The speedup is potentially dramatic: each subsumption test performed within some $M_i$ would otherwise be performed once for every d-path in $G^X_M$ that can be extended through $M_i$ to form a d-path in $G$. Observe that prime implicates of $G^X_M$ containing none of the variables $X_i$ ($1 \le i \le n$) are simply prime implicates of $G$ that do not contain literals from any of the blocks $M_i$.
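The substitution step described above can be sketched as follows: given the prime implicates of the graph in which each strictly pure full block has been replaced by a new variable, and the prime implicates of each block, every occurrence of the new variable is expanded by every prime implicate of its block. The clause encoding (frozensets of literal strings, with "~" marking negation), the function name, and the small example (two blocks S11 and S12, S21 and S22, with an assumed clause set) are hypothetical and chosen only for illustration.

```python
def substitute_block(prime_implicates, x, block_implicates):
    """Replace the variable x in each clause by every prime implicate of the block.

    Clauses are frozensets of literal strings; x is assumed to occur only
    unnegated, as is the case for a strictly pure full block.
    """
    result = set()
    for clause in prime_implicates:
        if x in clause:
            rest = clause - {x}
            result |= {rest | frozenset(dm) for dm in block_implicates}
        else:
            result.add(clause)
    return result


# Prime implicates of a graph in which two blocks were replaced by X1 and X2.
reduced = {
    frozenset({"~A1", "X1"}), frozenset({"~A2", "X2"}),
    frozenset({"A1", "A2"}), frozenset({"X1", "A2"}),
    frozenset({"A1", "X2"}), frozenset({"X1", "X2"}),
}
# Prime implicates of the blocks M1 = S11 and S12, M2 = S21 and S22.
blocks = {"X1": [{"S11"}, {"S12"}], "X2": [{"S21"}, {"S22"}]}

expanded = reduced
for x, block_pis in blocks.items():
    expanded = substitute_block(expanded, x, block_pis)
```

No subsumption test is performed during the expansion, which is the source of the savings the text describes.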

Several distinct strictly pure full blocks can be handled by repeated application of Theorem 4.11.1 as explained above. However, multiple occurrences $M^1, \ldots, M^n$ of a single full block $M$ in $G$ require an extended analysis. The problem is that the multiple occurrences themselves preclude any single $M^i$ from being strictly pure, even if $M$ shares no variables with the rest of $G$. Intuitively, we would expect that by replacing each of the occurrences $M^i$ in $G$ by the single new variable $X$, the prime implicates of the resulting graph $G^X_M$ would also bear a strong relationship to those of $G$. This is made precise in the next section.

4.12 Multi-Pure Full Blocks

Definition 21 Suppose that $M^1, \ldots, M^n$ are occurrences of full blocks in $G$ and that all of them are syntactically identical (up to associativity and commutativity of disjuncts and conjuncts). The subgraph $M^*$ formed by taking all the nodes of the blocks $M^i$ is not necessarily a full block (as the blocks $M^i$ could be single literals occurring in arbitrary positions, this is hardly surprising); but let $M^*$ be strictly pure. Then we call the $M^i$ multi-pure full blocks.

In addition, suppose that there are occurrences $M^{n+1}, M^{n+2}, \ldots, M^{n+m}$ of full blocks syntactically identical (up to associativity and commutativity of disjuncts and conjuncts) to $\overline{M}$, the NNF of the complement of any of the $M^i$ ($1 \le i \le n$). We call $M^1, \ldots, M^n, M^{n+1}, M^{n+2}, \ldots, M^{n+m}$ complementary multi-pure full blocks.

Note that each (complementary) multi-pure full block $M^i$ is not strictly pure, since it has anti-links (and possibly links) to its equivalent (complementary) full blocks in $G$. Observe that complementary non-atomic formulas must be recognized. For example, if $M = (A \vee B)$, then $\overline{M}$ could be $\overline{A} \wedge \overline{B}$ or $\overline{B} \wedge \overline{A}$. In fact, $\overline{M}$ could have been input as $\neg(A \vee B)$ or as $\neg(\neg A \rightarrow B)$. If NNF formulas are stored in an appropriate canonical way, complementarity is easily (that is, in polynomial time) detectable; the situation is also straightforward when complementary formulas have the form $M$ and $\neg M$ prior to conversion to NNF. In any case, a detailed treatment of this issue is beyond the scope of this thesis. We do note that in the absence of complements, multi-pure full blocks are recognizable in polynomial time via a canonical NNF representation. A modification of any algorithm for finding common subtrees (see [22] for one such algorithm) can be used for recognizing multi-pure full blocks.

It turns out that the results of Theorem 4.11.1 can be extended to the case in which a formula contains complementary multi-pure full blocks. We use the notation of Theorem 4.11.1 with the understanding that $M$ denotes any occurrence of an $M^1, M^2, \ldots, M^n$, and the $M^{n+1}, M^{n+2}, \ldots, M^{n+m}$ are treated as negated occurrences of $M$ (thus $G^X_M$ replaces the complementary occurrences $M^j$ by $\overline{X}$ as well). $M^*$ is defined as the subgraph of $G$ relative to the $M^i$ and $M^j$ ($1 \le i \le n < j \le n + m$). $G^X_M$, $D_M$, $D_{G-M}$, and $D^X_M$ are defined as before, but relative to $M^*$. Additionally, we define $D^{\overline{X}}_M$ in the obvious way. (Intuitively, we use $D^X_M$ when the literals of $D_M$ correspond to unnegated occurrences of $M$, and we use $D^{\overline{X}}_M$ when the literals in $D_M$ correspond to negated occurrences of $M$.)

Theorem 4.12.1 Let $M^1, M^2, \ldots, M^n$ and $M^{n+1}, M^{n+2}, \ldots, M^{n+m}$ be complementary multi-pure full blocks in a satisfiable semantic graph $G$, where all of the blocks $M^i$ and $M^j$ are satisfiable (we allow $m = 0$). Let $D$ be a non-tautological disjunction of literals from $G$. Then the following statements are equivalent:
1. $D$ is a prime implicate of $G$.
2. $D_M = \mathit{false}$ and $D$ is a prime implicate of $G^X_M$,
   or $D_M \neq \mathit{false}$ and $D^X_M$ is a prime implicate of $G^X_M$, and $D_M$ is a prime implicate of $M$,
   or $D_M \neq \mathit{false}$ and $D^{\overline{X}}_M$ is a prime implicate of $G^X_M$, and $D_M$ is a prime implicate of $\overline{M}$.


Let $G'^X_M$ be a graph without c-links that is equivalent to $G^X_M$ (for instance, $G'^X_M$ could be the full dissolvent of $G^X_M$). Similarly, let $M'$ be a c-linkless equivalent of $M$, and $\overline{M}'$ be a c-linkless equivalent of $\overline{M}$. Let $G'$ be the graph obtained from $G'^X_M$ by replacing $X$ by $M'$ and $\overline{X}$ by $\overline{M}'$. It is easy to see that $G'$ is equivalent to $G$ but has no c-links. By Theorem 3.1.3, every prime implicate of $G$ is present as a d-path in $G'$, and every prime implicate of $G^X_M$ is present as a d-path in $G'^X_M$.

To prove the only-if-part, let $D$ be a prime implicate of $G$. Then there must be an unsubsumed d-path $p$ in $G'$ such that $\ell(p) = D$. Since $M'$ and $\overline{M}'$ are complementary, $p$ can never meet (and thus pass through) both $M^i$ and $M^j$ for any $i$ and $j$.

Suppose first that $p$ does not pass through any of the occurrences of $M$ or $\overline{M}$. In this case, $p$ must also be a d-path in $G'^X_M$ (technically, $p$ is isomorphic to a d-path in $G'^X_M$). To prove that $p$ is not subsumed by another d-path in $G'^X_M$, assume otherwise, namely, that there is a path $p'$ in $G'^X_M$ that subsumes $p$. But $p'$ cannot contain $X$ or $\overline{X}$, and hence would also be a d-path in $G'$ that subsumes $p$, which is a contradiction. Thus $D$ is a prime implicate of $G'^X_M$ and hence of $G^X_M$.

Now suppose $p$ passes through the full blocks $M^{i_1}{}', \ldots, M^{i_q}{}'$. Let $p_j$, $1 \le j \le q$, be the restriction of $p$ to $M^{i_j}{}'$. Notice that for $1 \le j, k \le q$, $\ell(p_j) = \ell(p_k)$; otherwise, the d-path obtained by replacing $p_j$ by $p_k$ in $p$ would subsume $p$. Similarly, $p_j$ cannot be subsumed by another d-path in $M^{i_j}{}'$. Since $p_j$ is an unsubsumed d-path in $M^{i_j}{}'$, which is a linkless equivalent of $M$, $D_M = \ell(p_j)$ is a prime implicate of $M$. Now let $p_X$ be the d-path obtained by replacing each of the $p_j$ by $X$ in $p$; $p_X$ is a d-path in $G^X_M$. Furthermore, $p_X$ cannot be subsumed by another d-path in $G^X_M$ (again, such a subsuming path would induce a path in $G'$ that subsumes $p$). Therefore $D^X_M = \ell(p_X)$ is a prime implicate of $G^X_M$.

Finally, in the case that $p$ passes through the full blocks $M^{j_1}{}', \ldots, M^{j_r}{}'$, the argument is similar to that in the previous paragraph.

To prove the if-part, first suppose $D_M = \mathit{false}$ and $D$ is a prime implicate of $G^X_M$. Then there is an unsubsumed d-path (say $p$) in $G'^X_M$ such that $D = \ell(p)$. Since $D_M = \mathit{false}$, $p$ contains neither $X$ nor $\overline{X}$. Thus $p$ is an unsubsumed d-path of $G'$, and $D$ is a prime implicate of $G$.

Now suppose that $D^X_M$ is a prime implicate of $G^X_M$ and that $D_M$ is a prime implicate of $M$. Then there are unsubsumed d-paths $p_X$ in $G'^X_M$ and $p_M$ in $M'$ such that $\ell(p_X) = D^X_M$ and $\ell(p_M) = D_M$, respectively. In the present subcase, by definition, $D^X_M$ contains $X$ (and cannot contain $\overline{X}$). Let $p'$ be the result of replacing all occurrences of $X$ in $p_X$ by $p_M$; $p'$ is a d-path in $G'$. Since $p_X$ and $p_M$ are unsubsumed in $G'^X_M$ and $M'$, respectively, and since $M'$ does not share any variables with the rest of $G'$, $p'$ will also be unsubsumed in $G'$. Hence, by Theorem 3.1.3, $D = \ell(p')$ is a prime implicate of $G$.

Similarly, if $D^{\overline{X}}_M$ is a prime implicate of $G^X_M$ and $D_M$ is a prime implicate of $\overline{M}$, then $D$ is a prime implicate of $G$. $\Box$

It is straightforward to see that in the case when $n = 1$ and $m = 0$, Theorem 4.12.1 collapses into Theorem 4.11.1.

On the one hand, we expect to substitute new variables for multi-pure full blocks and achieve a savings in the computation of prime implicates comparable to that provided by Theorem 4.11.1. But note that some complementary occurrences of $M$ may be c-connected; this means that such occurrences play a role in whatever inference process is employed prior to computation of the implicates themselves. In particular, we may treat them as literals and dissolve (indeed, the set of individual links between such full blocks would satisfy the requirements of a multiple link dissolution chain as it is defined in [40]). That dissolving on two complementary full blocks accomplishes exactly what dissolving on all the corresponding single links would is clear: all c-paths through both full blocks are eliminated from the graph. But the former operation is much more efficient than the latter. Therefore, recognizing such complementary full blocks and performing inference directly on them, rather than on their constituent literals, is desirable.

Note also that for the inference phase of a prime implicate computation, complementary full blocks do not have to be multi-pure full blocks. This condition is necessary only for the extraction of implicates using Theorem 4.12.1 once all implicates are known to be present. Finally, the remarks above apply also to identical full blocks if they form appropriate non-atomic anti-links as discussed in Section 4.1.

4.13 More Examples

Kean & Tsiknis [30] provide a class of examples, referred to in the following as $K_{mn}$. They have $mn + 1$ input CNF clauses and $(m + 1)^n + mn$ prime implicates. This set of clauses can be factored to obtain a more compact representation in NNF, as shown in Figure 4.3.

\[ \bigl(\overline{A}_1 \vee (S_{11} \wedge \dots \wedge S_{1m})\bigr) \wedge \dots \wedge \bigl(\overline{A}_n \vee (S_{n1} \wedge \dots \wedge S_{nm})\bigr) \wedge (A_1 \vee \dots \vee A_n) \]

Figure 4.3: Semantic graph of Kean & Tsiknis's formulas.
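The count of prime implicates can be checked by brute force for a small instance. The sketch below assumes that the input clauses have the form (not A_i or S_ij) for all i, j together with the single clause (A_1 or ... or A_n); this reading of the clause set, the integer numbering of the variables, and the function names are assumptions made for the illustration. The enumeration is exponential and is only meant to confirm the formula (m + 1)^n + mn on a tiny case.

```python
from itertools import product


def kmn_clauses(m, n):
    """Assumed clause set of K_mn, plus the number of variables used.

    Variables 1..n stand for A_1..A_n; S_ij is numbered n + (i - 1) * m + j.
    """
    def s(i, j):
        return n + (i - 1) * m + j

    clauses = [frozenset({-i, s(i, j)})
               for i in range(1, n + 1) for j in range(1, m + 1)]
    clauses.append(frozenset(range(1, n + 1)))
    return clauses, n + n * m


def prime_implicates(clauses, nvars):
    """Brute-force prime implicates of a satisfiable clause set."""
    idx = range(1, nvars + 1)
    models = []
    for bits in product([False, True], repeat=nvars):
        v = dict(zip(idx, bits))
        if all(any(v[abs(l)] == (l > 0) for l in c) for c in clauses):
            models.append(v)
    implicates = set()
    # Each variable is absent (0), positive (1), or negative (-1) in a clause.
    for signs in product([0, 1, -1], repeat=nvars):
        c = frozenset(sg * i for sg, i in zip(signs, idx) if sg)
        if all(any(v[abs(l)] == (l > 0) for l in c) for v in models):
            implicates.add(c)
    return {c for c in implicates if not any(d < c for d in implicates)}


def expected_count(m, n):
    """Number of prime implicates stated in the text."""
    return (m + 1) ** n + m * n


clauses, nvars = kmn_clauses(2, 2)
found = prime_implicates(clauses, nvars)
```

For m = n = 2 the formula gives 3^2 + 4 = 13, and `expected_count(3, 3)` gives 73.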

Since the number of prime implicates is exponential, so is the number of subsumption checks required. The number of subsumption checks for the Cltms [9] and Gen-Pi [43] algorithms are shown in Table 4.1.

Table 4.1: Number of subsumption checks needed for $K_{mn}$ by Cltms, Gen-Pi, and Pi + anti-link.

Example   Cltms      Gen-Pi   Pi + anti-link
K33       5166       972      164
K44       506472     11600    887
K54       1730120    29074    887

For each $i$, the literals $S_{i1}, \ldots, S_{im}$ form a full block $M_i$, and all literals in it are strictly pure. Let $K'_{mn}$ be the graph obtained by replacing each full block $M_i$ by a new variable $X_i$. By the corollary of Theorem 4.11.1, we can get the prime implicates of $K_{mn}$ from the prime implicates of $K'_{mn}$. Since each of the subgraphs $M_i$ has no c-links, the prime implicates of $M_i$ are present as d-paths by Theorem 3.1.3. Since they also have no anti-links, by the contrapositive of Theorem 4.1.1, no subsumption checks are required to find these prime implicates either. Thus the number of subsumption checks to be done is exactly that required for computing the prime implicates of $K'_{mn}$, and this is significantly less than that needed for $K_{mn}$. Note that the number of prime implicates of $K'_{mn}$ is only $2^n + n$. For the problems in Table 4.1, we applied the above technique in combination with anti-link operations. For $K'_{mn}$, the full dissolvent depends only on $n$ and can be defined recursively. The full dissolvent of $K'_{m2}$ (basis) and of the general case of $K'_{mn}$ ($n > 2$) are shown in Figure 4.4. The number of subsumption checks required here is also shown in Table 4.1. Clearly, our techniques produce a significant reduction in the number of subsumption checks required. Note that for the problem $K_{mn}$, the number of subsumption checks depends only on $n$ and not on $m$, and is not reduced by applying the anti-link operations to the full dissolvent.

Figure 4.4: The full dissolvent of $K'_{m2}$ (left) and of $K'_{mn}$, $n > 2$ (right).

Our techniques are not limited to NNF formulas far from clause form. They can sometimes be used by other algorithms like Cltms and Gen-Pi which are based on CNF formulas. For example, $K'_{mn}$ turns out to be in CNF, and hence both Cltms and Gen-Pi can handle these formulas, thereby reducing the number of subsumption checks needed. However, normal forms like CNF provide very little scope for applying these techniques directly. For example, the literals $S_{i1}, \ldots, S_{im}$ in the unfactored form of $K_{mn}$ do not form a full block. Hence one cannot apply Theorem 4.11.1. They do form a full block after factoring. This provides stronger evidence that by avoiding less general normal forms like CNF/DNF, one can improve the performance of prime implicate algorithms.

4.14 Experimental results

Although all disjunctive anti-links can be removed via anti-link operations (see Theorem 4.4.3), removing all such anti-links can result in exponential growth of the formula, worsening the performance of PI. We therefore restrict our attention to anti-link operations that, when applied, will not increase the size of the graph. We found that even with this restriction we obtained a significant reduction in the number of subsumption checks performed. We have also implemented the conjunctive anti-link operator from Section 4.7. We have not implemented the full block replacement technique mentioned in Section 4.10. Such full blocks occurred in only one class of examples. The effect of such operations on the number of subsumption operations performed by our system is given in Table 4.1.

Table 4.2 contains the running times of PI, with and without anti-link operations, and of de Kleer's CLTMS. The tasks involved computing prime implicates of formulas taken from applications such as diagnosis and truth maintenance, and some artificially created problems. The table also contains the number of subsumption checks performed by each of the algorithms. The running times for CLTMS are taken from the paper [9]. Some of the timings reported in that paper are incorrect. We do not include those results in the table. The examples $K_{mn}$, adder, BD, and two-pipes are examples for which running times have been reported in [9]. The three-pipes example is an extension of the two-pipes example. The parity example is a 3-bit parity circuit diagnosis problem from [8].

We can see that PI, even without anti-link operations, has better running times than CLTMS on all except one of the examples used. With anti-link operations we achieve a further reduction in running time of up to 40 percent over the time taken without those operations. This is due to the reduction in the number of subsumption checks being performed. Note that running times for the $K_{mn}$ set of problems were not affected by the anti-link operations, because the full dissolvent had very few subsumed paths.


Problem       #PIs     CLTMS                  PI w/o anti-links      PI + anti-links
                       # subs        secs     # subs       secs      # subs       secs
k33           73       5166          0.03     596          0.02      596          0.02
k44           641      506472        0.46     7874         0.12      7874         0.12
k55           7801     9.1 × 10^7    8.6      184140       1.54      184140       1.54
k54           1316     1730120       1.1      64721        0.63      64721        0.63
k66           117685   2.4 × 10^10   224      3675886      36.26     3675886      36.26
k67           823585   1.3 × 10^15   2541     9075946      91.56     9075946      91.56
adder         9700     ?             1137     10859062     356.9     10416517     347.41
BD            1907     1.1 × 10^9    38       2601435      84.24     1353311      51.01
two-pipes     638      ?             ?        73903        1.23      48255        1.03
three-pipes   2360     ?             ?        211272       4.34      164523       4
parity        476      ?             ?        7130710      373.55    4903570      284.48

Table 4.2: Results of applying anti-link operators. A * indicates estimates, ? indicates unknown.


Chapter 5

Computing diagnoses

In this chapter we develop a new algorithm for computing the minimal diagnoses of a system using the dissolution rule and some path-based operations on NNF formulas.

5.1 Definitions

Definition 22 A system is a pair (SD, COMPONENTS) where SD, the system description, is a propositional formula, and COMPONENTS, the set of system components, is a finite set. An observation OBS is a propositional formula.

For each component c we have a propositional variable, denoted by ab(c), which is interpreted to mean that component c is abnormal. We use ABVARS to denote the set of these variables.

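Definition 23 A diagnosis for the system (SD, COMPONENTS) with observation OBS is a set $\Delta \subseteq \mathit{ABVARS}$ such that $F(SD, OBS, \Delta)$ is consistent, where $F(SD, OBS, \Delta)$ is the propositional formula

$$SD \wedge OBS \wedge \Big( \bigwedge_{ab(c) \in \Delta} ab(c) \Big) \wedge \Big( \bigwedge_{ab(c) \in \mathit{ABVARS} - \Delta} \neg ab(c) \Big).$$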

A diagnosis $\Delta$ is minimal if no proper subset of $\Delta$ is a diagnosis.


We use the standard definition of subsumption with respect to sets of literals from Chapter 3; i.e., literal set $S$ subsumes literal set $S'$ if $S \subseteq S'$.

Definition 24 Given a collection of literal sets D, we define SUBS(D) to be the subset of D whose members are not subsumed by any other member of D.
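As an illustration of Definition 24, the following minimal Python sketch computes SUBS(D) for a collection of literal sets. The function name `subs`, the string encoding of literals, and the sample collection are assumptions made for this example only; the sketch also assumes that a set is not counted as subsuming itself.

```python
def subs(collection):
    """Return the members of `collection` that are not subsumed by a
    different member. A literal set S subsumes S' when S is a subset of S'."""
    sets = {frozenset(s) for s in collection}      # exact duplicates collapse
    # t < s is Python's proper-subset test on frozensets
    return {s for s in sets if not any(t < s for t in sets)}


# Hypothetical collection of literal sets over abnormality variables.
D = [{"ab(c1)", "ab(c2)"}, {"ab(c1)"}, {"ab(c2)"}, {"ab(c3)", "ab(c4)"}]
unsubsumed = subs(D)
```

This quadratic scan is only meant to make the definition concrete; it is not the trie-based subsumption machinery of the PI implementation.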

Definition 25 A diagnosis $\Delta$ is a single fault diagnosis if $\Delta$ is a singleton set; otherwise it is a multiple fault diagnosis.

The above definitions are a modified version of those used by Reiter [50]. In this thesis we use only propositional formulas, whereas Reiter's original definition uses first order formulas. As a result, Reiter's procedure is more general but is not guaranteed to terminate.

5.2 Extensions to PI algorithm

In Chapter 3 we describe an algorithm PI which computes the following set:

$$\psi(F) = \{\, l(P) \mid (P \text{ is a c-path through } F) \wedge (P \neq \mathit{false}) \wedge (\forall \text{ c-paths } Q \text{ through } F,\ l(Q) \not\subset l(P)) \,\}.$$

Later in this chapter we will be computing the following sets, under the assumption that F has no unsatisfiable c-paths:

$$\psi'(F) = \{\, l(p) \mid (p \text{ is a c-path through } F) \wedge (\forall \text{ c-paths } q \text{ through } F,\ l(q) \not\subset l(p)) \,\}$$

and

$$\psi''(F) = \{\, l(p) \mid (p \text{ is a c-path through } F) \wedge \| l(p) \| = 1 \,\}.$$

PI can be very easily modified to compute the above sets. For $\psi'$, we can remove the check for unsatisfiability of a c-path (line 10 of the algorithm in Section 3.2). To compute $\psi''$, we can omit the satisfiability check and also eliminate c-paths whose literal sets have cardinality greater than 1; this can be done as soon as we detect them (at line 11 of the algorithm). In subsequent discussions in this chapter we will refer to all these computations as the PI algorithm. The specific version of PI being used can easily be inferred from the context.

5.3 A Method to Find Minimal Diagnoses

In this section we describe an algorithm to find the set of minimal diagnoses. We then give modifications to this algorithm that can be used to find the set of single fault diagnoses only. First we show the relation between c-paths and diagnoses.

5.3.1 Diagnoses and c-paths

Lemma 5.3.1 Let p be a satisfiable c-path in the propositional formula $SD \wedge OBS$. Then $l(p) \cap \mathit{ABVARS}$ is a diagnosis.

Proof: p can be trivially extended to form a satisfiable c-path through the formula $F(SD, OBS, l(p) \cap \mathit{ABVARS})$. $\Box$

Lemma 5.3.2 Let $\Delta$ be a diagnosis for a system (SD, COMPONENTS) with observation OBS. Then there is a satisfiable c-path p through the propositional formula $SD \wedge OBS$ such that $\Delta \supseteq l(p) \cap \mathit{ABVARS}$.

Proof: Since $\Delta$ is a diagnosis for (SD, COMPONENTS) with observation OBS, the formula $F(SD, OBS, \Delta)$ is satisfiable; i.e., there is a satisfiable c-path (say q) through it. Let p be q restricted to $SD \wedge OBS$; p is a satisfiable c-path in $SD \wedge OBS$. Let ab(c) be a member of $l(p) \cap \mathit{ABVARS}$. To prove that $ab(c) \in \Delta$, assume the contrary. Since p is part of q, ab(c) is in $l(q)$. Since q is a c-path in $F(SD, OBS, \Delta)$ and $ab(c) \notin \Delta$, we have $\neg ab(c) \in l(q)$. Therefore both ab(c) and $\neg ab(c)$ are in $l(q)$, which is impossible since q is satisfiable. $\Box$

Theorem 5.3.3 Let (SD, COMPONENTS) be a system with observation OBS, and let D be the set $\{\, l(p) \cap \mathit{ABVARS} \mid p \text{ is a satisfiable c-path in } OBS \wedge SD \,\}$. Then the set SUBS(D) contains all the minimal diagnoses of (SD, COMPONENTS) with observation OBS.

Proof: Follows directly from Lemmas 5.3.1 and 5.3.2. $\Box$

Corollary 5.3.4 Suppose $D'$ is the set $\{\, l(p) \cap \mathit{ABVARS} \mid p \text{ is a c-path in } FD(OBS \wedge SD) \,\}$. Then $SUBS(D')$ contains all the minimal diagnoses of (SD, COMPONENTS) with observation OBS.

Corollary 5.3.5 Suppose $D'$ is the set $\{\, l(p) \cap \mathit{ABVARS} \mid p \text{ is a c-path in } OBS \wedge SD \text{ and } \| l(p) \cap \mathit{ABVARS} \| \le 1 \,\}$. Then the set $SUBS(D')$ contains all the single fault diagnoses of (SD, COMPONENTS) with observation OBS.

To compute all minimal diagnoses we first compute $\psi(SD \wedge OBS)$ using PI; then we restrict each set in $\psi(SD \wedge OBS)$ to literals from the variable set ABVARS to get the set D. Eliminating all subsumed sets in D would produce the desired result. However, this is not feasible for sizable inputs, since the number of c-paths can be extremely large even if the set of diagnoses is small. If, instead, we use the full dissolvent $FD(SD \wedge OBS)$, we can ignore literals containing either variables not in ABVARS or negations of variables from ABVARS while enumerating c-paths. Even this improvement is not sufficient for our experimental results; our implementation uses a further refinement which is explained in the next section.

5.3.2 An Example

We now give an illustrative example. Consider a two-inverter circuit as shown in Figure 5.1.

[Figure 5.1: Inverter example. Inverter I1 has input In and output X; inverter I2 has input X and output Out.]

The behavior of the circuit can be described by the following propositional formula:

$$((In \Leftrightarrow X) \Rightarrow ab(I_1)) \wedge ((X \Leftrightarrow Out) \Rightarrow ab(I_2)).$$

Suppose we observe that with input 1 the output produced is 0. This observation can be modeled as $In \wedge \neg Out$. The formula $SD \wedge OBS$ is represented by the graph shown below on the left. On the right is the full dissolvent of this graph.

[Figure 5.2: System description and observation for inverter example. (a) Semantic graph of $SD \wedge OBS$. (b) Full dissolvent of the graph in (a).]

The graph of Figure 5.2(a) has 25 c-paths, 3 of which are satisfiable: $\{\neg X, In, ab(I_2), In, \neg Out\}$, $\{ab(I_1), ab(I_2), In, \neg Out\}$, and $\{ab(I_1), X, \neg Out, In, \neg Out\}$. Therefore the set of minimal diagnoses is $\{\{ab(I_1)\}, \{ab(I_2)\}\}$; the diagnosis $\{ab(I_1), ab(I_2)\}$ is not minimal. The graph of Figure 5.2(b) has exactly the same satisfiable c-paths as the graph of Figure 5.2(a). Therefore we get the same diagnoses from this graph also.

The problem with simply computing the full dissolvent is that it can be prohibitively expensive: the full dissolvent may be quite large, and so may the formula during the intermediate stages of the computation, even when the full dissolvent that results is not. Therefore we need some way of reducing formula growth. We now present some techniques to achieve that, and, to this end, we must first establish some results.
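The counts in the inverter example can be reproduced with a short brute-force sketch in Python. It enumerates the c-paths of the NNF graph of $SD \wedge OBS$ directly, without dissolution, so it only illustrates Theorem 5.3.3 on this small instance. The tuple encoding of graphs, the `~` prefix for negation, and all function names are assumptions made for this sketch.

```python
from itertools import product

def cpaths(g):
    """Yield the c-paths of an NNF graph as tuples of literal occurrences."""
    if isinstance(g, str):                 # a single literal
        yield (g,)
    elif g[0] == "and":                    # a c-path passes through every conjunct
        for parts in product(*(list(cpaths(h)) for h in g[1:])):
            yield tuple(lit for part in parts for lit in part)
    else:                                  # "or": a c-path passes through one disjunct
        for h in g[1:]:
            yield from cpaths(h)

def satisfiable(path):
    """A c-path is satisfiable iff it contains no complementary pair."""
    lits = set(path)
    return not any("~" + lit in lits for lit in lits)

def inverter(i, o, ab):
    """NNF of ((i <=> o) => ab): ((~i or ~o) and (i or o)) or ab."""
    return ("or", ("and", ("or", "~" + i, "~" + o), ("or", i, o)), ab)

# SD and OBS for the two-inverter circuit, with observation In and not Out.
G = ("and", inverter("In", "X", "ab(I1)"), inverter("X", "Out", "ab(I2)"),
     "In", "~Out")

paths = list(cpaths(G))
sat = [p for p in paths if satisfiable(p)]
# Restrict each satisfiable c-path to (positive) ABVARS literals, then apply SUBS.
candidates = {frozenset(l for l in p if l.startswith("ab(")) for p in sat}
minimal = {d for d in candidates if not any(e < d for e in candidates)}
```

On this input `paths` has 25 elements, `sat` has 3, and `minimal` consists of the two singleton diagnoses, matching the text above.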

5.3.3 Purity Reductions

In this section we assume that a literal is pure iff it is not a part of any c-link.¹

Lemma 5.3.6 Let K be a pure literal in graph G, where $K \notin \mathit{ABVARS}$ (note that literal K may be a negated variable from ABVARS). Then for any satisfiable c-path p in G that passes through DE(K), there is a satisfiable c-path q through K such that $l(q) \cap \mathit{ABVARS} \subseteq l(p) \cap \mathit{ABVARS}$.

Proof: If p contains K, then q = p satisfies the lemma. Otherwise let $p' = p \cap DE(K)$, and let $q = (p - p') \cup \{K\}$. Since p is satisfiable and K is pure, q is satisfiable. Since $K \notin \mathit{ABVARS}$, any member of $q \cap \mathit{ABVARS}$ is also a member of $p \cap \mathit{ABVARS}$. $\Box$

The above lemma shows that if we have a pure literal K, then the d-extension of K does not contribute to the set of minimal diagnoses and can therefore be deleted. Recall that this amounts to computing the d-path complement of K.

Theorem 5.3.7 Let K be a pure node in the graph G, where $K \notin \mathit{ABVARS}$. Let $D = \{\, l(p) \cap \mathit{ABVARS} \mid p \text{ is a satisfiable c-path in } G \,\}$, and let $D' = \{\, l(p) \cap \mathit{ABVARS} \mid p \text{ is a satisfiable c-path in } DC(K, G) \,\}$. Then $SUBS(D) = SUBS(D')$.

Proof: We show that for every set $S \in SUBS(D)$, there is a set $S' \in SUBS(D')$ such that $S' \subseteq S$, and for every set $S' \in SUBS(D')$, there is a set $S \in SUBS(D)$ such that $S \subseteq S'$. Since both $SUBS(D')$ and $SUBS(D)$ have no subsumed sets, we can conclude that $SUBS(D') = SUBS(D)$.

To prove the former, let $S \in SUBS(D)$. Then there is a satisfiable c-path (say p) in G such that $S = l(p) \cap \mathit{ABVARS}$. If p misses DE(K), then p is also a satisfiable c-path in DC(K, G); therefore S (or some $S'$ such that $S' \subseteq S$) must be in $SUBS(D')$. If p passes through DE(K), then by Lemma 5.3.6 there must be a satisfiable c-path q in G that passes through K, with $l(q) \cap \mathit{ABVARS} \subseteq S$. By definition of DC(K, G), $q - \{K\}$ is a satisfiable c-path in DC(K, G). Let $S'$ be $l(q - \{K\}) \cap \mathit{ABVARS}$. Since $K \notin \mathit{ABVARS}$, $S' = l(q) \cap \mathit{ABVARS}$. Now either $S'$ or some $S'' \subseteq S'$ will be in $SUBS(D')$. Similarly we can prove that for every set $S' \in SUBS(D')$, there is a set $S \in SUBS(D)$ such that $S \subseteq S'$. $\Box$

The DC operator strictly reduces the size of the graph to which it is applied. We apply this reduction whenever we discover a pure occurrence of a literal. Therefore, by applying this theorem during the process of dissolving, we can significantly reduce the combinatorial explosion. By definition, all the literal occurrences in a full dissolvent are pure. Therefore, at the end of the process we will be left with a graph which has literals from the set ABVARS only. We may now use PI to compute the set $\psi'(FD(SD \wedge OBS))$.

Consider the two-inverter example from Figure 5.2. If we apply the DC reduction operations on all pure occurrences of the variables not in ABVARS, then the full dissolvent of the previous example becomes $ab(I_1) \vee ab(I_2)$. This has only two c-paths, from which we get the two minimal diagnoses $\{\{ab(I_1)\}, \{ab(I_2)\}\}$.

¹ This is weaker than the definition used in Section 4.10, where a literal is considered to be pure iff it is not part of any c-link or any d-link.

5.3.4 Single Faults

To find all and only single faults, the algorithms that compute all minimal diagnoses can be used in one of two ways. The first method is to find all diagnoses and then discard the multiple fault diagnoses. This is not a good solution, since the difficulty of computing all diagnoses is worse than that of finding only single fault diagnoses. The second method is to augment the system description with axioms so that all multiple fault diagnoses are eliminated, leaving only the single fault diagnoses. For example, if a system has two components a and b, we can add the clause $(ab(a) \Rightarrow \neg ab(b))$ to the system description, which would rule out both a and b being abnormal at the same time. In general, if the system has n components, we have to add $\binom{n}{2}$ such clauses.

This method is similar to the way some theorem provers treat equality: by adding the equality axioms to the theory. But most theorem provers handle equality by augmenting the basic inference mechanisms with special rules for equality. Similarly, we augment dissolution and purity reduction with two techniques to handle single fault diagnosis more efficiently. During the process of dissolving, we restructure the graph so as to eliminate c-paths which have at least two different variables from the set ABVARS occurring in them. By eliminating such paths, potential multiple faults (see Corollary 5.3.5 to Theorem 5.3.3) are also eliminated. These operations strictly reduce the size of the graph and therefore improve the performance of the algorithm. The following theorem outlines one such restructuring operation.
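To make the cost of the axiom-based method concrete, the following small Python sketch generates the pairwise exclusion clauses for a list of components and confirms that their number is $\binom{n}{2}$. The function name `single_fault_axioms` and the textual clause syntax are assumptions made for this illustration.

```python
from itertools import combinations
from math import comb

def single_fault_axioms(components):
    """One clause ab(a) => ~ab(b) per unordered pair of distinct components."""
    return [f"(ab({a}) => ~ab({b}))" for a, b in combinations(components, 2)]

components = ["a", "b", "c", "d"]          # hypothetical component names
axioms = single_fault_axioms(components)
expected_count = comb(len(components), 2)  # n choose 2
```

For a 500-bit adder with 2500 gates this would already add more than three million clauses, which is one way to see why the restructuring operations described in this section are preferable.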

Theorem 5.3.8 Let M be a full block in the graph G such that M contains literals from the variable set ABVARS only. Let $G'$ be the graph obtained by replacing M by the graph $M' = \bigvee_{\{ab(c)\} \in \psi''(M)} ab(c)$ if $\psi''(M) \neq \emptyset$, or by false otherwise. Then for every c-path p in G having $\| l(p) \cap \mathit{ABVARS} \| = 1$, there is a c-path $p'$ in $G'$ such that $l(p') \cap \mathit{ABVARS} = l(p) \cap \mathit{ABVARS}$, and vice versa.

Proof: Suppose $\psi''(M) = \emptyset$. Then any c-path through M must have at least two literals from the variable set ABVARS. Therefore, by Corollary 5.3.5 to Theorem 5.3.3, these c-paths do not lead to single fault diagnoses. By replacing M by false, we eliminate all c-paths through M.

Suppose $\psi''(M) \neq \emptyset$. Every c-path through $M'$ has exactly one occurrence from the set ABVARS, and each such c-path has a corresponding c-path in M which has the same literal set. It is obvious that the theorem follows in this case. $\Box$

Corollary 5.3.9 The sets of single fault diagnoses of G and $G'$ are the same.

Thus while employing dissolution, if we find a full block M that has literals formed from only the variable set ABVARS, then we can replace M by false if $\psi''(M) = \emptyset$, or by the graph $\bigvee_{\{ab(c)\} \in \psi''(M)} ab(c)$ if $\psi''(M) \neq \emptyset$.

Some c-paths with more than two occurrences of literals from the variable set ABVARS can also be eliminated using the dissolution rule itself. If we find two c-connected literals ab(c) and ab(c') (where $c \neq c'$), we can eliminate all c-paths through them, just as we eliminate paths through a link using the dissolution rule; i.e., we can treat such literal pairs as links and apply dissolution to them. Dissolving on such pairs of literals can cause additional blow-up of the graph without producing sufficient computational benefit. However, the special case of dissolution known as unit dissolution strictly reduces the size of the graph. We can therefore apply the dissolution rule whenever we find a c-connected pair of literals {ab(c), ab(c')}, where $c \neq c'$, to which unit dissolution applies.

To find all the single fault diagnoses, first define $FD'(OBS \wedge SD)$ to be the graph obtained by applying dissolution, purity reduction, and the two reduction operations mentioned above to $OBS \wedge SD$ until it is link free. Then we use PI to compute the set $\psi''(FD'(OBS \wedge SD))$. Here, PI requires time linear in the size of $FD'(OBS \wedge SD)$.

5.3.5 Diagnosis for Horn formulas

If SD and OBS are both Horn formulas, then we can find the minimal diagnosis using a different technique based on minimal models.

Theorem 5.3.10 Let (SD, COMPONENTS) be a system and OBS be some observation about the system such that both SD and OBS are Horn formulas, and let M be the minimal model of $SD \wedge OBS$. Then $M \cap \mathit{ABVARS}$ is the only minimal diagnosis.

Proof: Since $F(SD, OBS, M \cap \mathit{ABVARS})$ is satisfiable, $M \cap \mathit{ABVARS}$ is a diagnosis. Suppose $M \cap \mathit{ABVARS}$ is not a minimal diagnosis; then there must be some $D \subset M \cap \mathit{ABVARS}$ which is a diagnosis. Therefore there is some model $M'$ of $SD \wedge OBS$ such that $D = M' \cap \mathit{ABVARS}$. Since $M \not\subseteq M'$, this contradicts the unique minimality of models of Horn formulas: the minimal model of a Horn formula is contained in every one of its models. Similarly we can show that there is no other minimal diagnosis. $\Box$

To find the unique minimal diagnosis, we first find M, the minimal model of $SD \wedge OBS$, and then find the minimal diagnosis by removing from M all literals not in ABVARS. Since the minimal model can be computed in time linear in the size of the formula (see [14] for a description of an algorithm), the minimal diagnosis can also be computed in linear time.
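The Horn case can be sketched in a few lines of Python. The sketch below computes the minimal model by naive forward chaining, which is simpler than, and not equivalent in cost to, the linear-time algorithm of [14]; the clause encoding, the function name `minimal_model`, and the toy two-buffer system are assumptions made for this illustration.

```python
def minimal_model(clauses):
    """Minimal model of a set of Horn clauses by forward chaining.

    Each clause is a pair (body, head): `body` is a set of atoms and `head`
    is an atom, or None for a clause with no positive literal.
    """
    model = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if body <= model:                  # every atom in the body is derived
                if head is None:
                    raise ValueError("formula is unsatisfiable")
                if head not in model:
                    model.add(head)
                    changed = True
    return model


# Hypothetical system: a buffer must be abnormal if its input is high and
# its output is low. The observation supplies the facts (empty bodies).
clauses = [
    ({"in_hi", "out_lo"}, "ab(b1)"),
    ({"mid_hi", "out2_lo"}, "ab(b2)"),
    (set(), "in_hi"),
    (set(), "out_lo"),
    (set(), "mid_hi"),
]
model = minimal_model(clauses)
diagnosis = {atom for atom in model if atom.startswith("ab(")}
```

Here only `ab(b1)` is forced, so the unique minimal diagnosis blames the first buffer alone.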

5.4 Experimental results

We have implemented both the single fault and multiple fault diagnosis algorithms. Our implementation is in C/C++ and runs on a Sun Sparc 5 workstation. All the timings reported in this chapter were obtained on this machine. The implementation uses the PI function from Chapter 3 for computing all the unsubsumed c-paths in a graph. This has been enhanced with the anti-link operations from Chapter 4 to reduce the number of subsumption checks. The implementation uses a trie to store minimal diagnoses.

We compare our algorithm with that of Mozetic and Holzbaur [34].² We have used the familiar n-bit carry adder example from [50]. The n-bit adder circuit is considered to be a difficult example for diagnosis systems. Table 5.1 gives the running times (in seconds) of our system on randomly chosen observations of the system for the 3-bit and 5-bit adder examples. We also give the running times of the Mozetic-Holzbaur algorithm. We can see that our algorithm outperforms theirs on examples which have many faults. The last five problems in the table are for one specific observation where all the inputs are 0 and all the outputs except the nth bit are 0. There are exactly $3n - 1$ minimal diagnoses. This task is considered to be a difficult task for diagnosis. For the other problems we use a random observation.

Problem        Gates   Mozetic-Holzbaur   Ours      # Diagnoses
3 bit adder    15      260.47             0.41      204
3 bit adder    15      16.36              0.02      74
3 bit adder    15      0.28               0.14      4
3 bit adder    15      0.01               0.35      2
5 bit adder    25      19834.98           1.27      722
5 bit adder    25      >9 Hrs             62.04     1784
2 bit adder    10      0.01               0.01      5
3 bit adder    15      0.13               0.47      8
4 bit adder    20      1.96               4.03      11
5 bit adder    25      55.54              53.29     14
6 bit adder    30      >2 Hrs             524.26    17

Table 5.1: Running times for n-bit adder problem

Table 5.2 gives the running time for computing the single fault diagnoses only. The observation used is the same as in the last five examples in the previous table. These results show that our algorithm can compute single fault diagnoses very quickly. De Kleer's algorithm [8] can find the set of single fault diagnoses for the 500-bit adder in 6 seconds. However, his algorithm is not purely symbolic: it requires additional input from the user in the form of measurements.

² An implementation of their algorithm is available by anonymous ftp. We obtained a copy, which was run on our Sparc 5 in our experiments.


Problem          Gates   Ours (secs)   # Single fault diagnoses
1 bit adder      5       0.01          2
2 bit adder      10      0.01          5
3 bit adder      15      0.05          5
4 bit adder      20      0.08          5
5 bit adder      25      0.11          5
6 bit adder      30      0.13          5
7 bit adder      35      0.17          5
100 bit adder    500     15.42         5
500 bit adder    2500    279.96        5

Table 5.2: Running time for single fault diagnosis

5.5 Other algorithms for computing diagnoses

There are two main approaches to diagnosis: consistency based diagnosis and abduction based diagnosis. The two approaches differ in the way diagnoses are defined and computed.

5.5.1 Consistency based algorithms

In this thesis we have concentrated on consistency based approaches. There are many algorithms for consistency based diagnosis. These fall into two categories: purely symbolic or probability based. Most of the symbolic algorithms are based on Reiter's theory of diagnosis. Reiter also gave an algorithm in his paper, which was superseded by another algorithm he developed along with de Kleer and Mackworth in [10]. The algorithm of [10] is shown below. SD is the system description, and OBS is the faulty observation. It is assumed that SD and OBS are in some decidable subset of first order logic.

1. Compute all the prime implicates of the formula $SD \wedge OBS$.
2. Let D be the minimal conflict set of $SD \wedge OBS$ (the set of prime implicates of $SD \wedge OBS$ that are composed of positive "ab" predicates³ only).
3. Compute the minimal hitting set of D (the set of prime implicants of D), which is the set of minimal diagnoses.

Due to the inherent difficulty in computing prime implicates, this algorithm was never implemented. Mozetic and Holzbaur have developed an algorithm, IDA [34], for computing minimal diagnoses, using PROLOG to represent the system description and observation. In IDA, Step 1 of Reiter's algorithm is not performed. Steps 2 and 3 are combined into one step. The algorithm computes the minimal hitting set and the conflict set simultaneously in an incremental manner.

De Kleer and Williams' algorithm (GDE) [11] is the most popular probability based algorithm. Probability based algorithms have fared better in practice because many improbable faults can be eliminated very early. Techniques known as probing [8] have also been used to make these algorithms more efficient. The main disadvantage of these algorithms is that the failure probability for each component in the system has to be known a priori. This limits the application to domains where such information is available, for example circuit diagnosis.
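Step 3 of the algorithm of [10] can be illustrated with a brute-force Python sketch that computes the minimal hitting sets of a given collection of conflict sets. The function name `minimal_hitting_sets`, the exhaustive search by increasing size, and the sample conflict sets are assumptions made for this example; this is not the incremental procedure used by IDA or by Reiter's hitting set tree.

```python
from itertools import combinations

def minimal_hitting_sets(conflicts):
    """All minimal sets that intersect every conflict set (exhaustive search)."""
    universe = sorted(set().union(*conflicts))
    found = []
    for size in range(len(universe) + 1):          # smaller candidates first
        for candidate in map(set, combinations(universe, size)):
            hits_all = all(candidate & conflict for conflict in conflicts)
            # a candidate containing an earlier hitting set is not minimal
            if hits_all and not any(h <= candidate for h in found):
                found.append(candidate)
    return found


# Hypothetical minimal conflict sets over components a, b, c.
conflicts = [{"a", "b"}, {"b", "c"}]
diagnoses = minimal_hitting_sets(conflicts)
```

For the two-inverter example of Section 5.3.2 the only minimal conflict set would be {I1, I2}, whose minimal hitting sets are the two single fault diagnoses.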

5.5.2 Abduction based diagnosis

An abductive diagnosis for the system (SD, COMPONENTS) with observation OBS is a set $D = \{\, mode(c) \mid (mode = ab \text{ or } mode = \neg ab) \wedge c \in \text{COMPONENTS} \,\}$ such that

1. $SD \wedge OBS \wedge \bigwedge_{l \in D} l$ is consistent
2. $SD \wedge OBS \wedge \bigwedge_{l \in D} l \models OBS$

An abductive diagnosis D is minimal if no proper subset of D is a diagnosis. Many algorithms have been proposed for computing diagnoses: Reggia et al. give an algorithm based on set covering [49], Geffner and Pearl give a probabilistic approach in [18], and Kean and Tsiknis give an algorithm based on prime implicates [31].

³ Since SD and OBS are first order formulas, predicates rather than variables are used to capture the notion of abnormality.


Chapter 6

Computing Prime Implicates/Implicants for Multiple Valued Logics

In this chapter we investigate the adaptability of techniques for classical propositional logic from Chapter 3 to multiple-valued logics. We first define the notion of prime s-implicant and prime s-implicate with respect to signed formulas that are normalized in a way analogous to NNF. These are generalizations to signed formulas of the classical notions of prime implicant/implicate. Then, to find the prime implicants/implicates of multiple-valued logic formulas, we first form the corresponding signed formulas, find the prime s-implicants/s-implicates of these signed formulas, and finally translate them to prime implicants/implicates of the multiple-valued formula. Our method therefore provides a way of finding prime implicants or prime implicates that is quite independent of the particular MVL employed.


6.1 Preliminaries

In this section we provide basic definitions necessary for understanding this chapter. Some of the terms defined in this chapter are used in Chapter 2 in a classical propositional logic context. Unless otherwise specified, we assume a multiple valued logic context when using such overloaded terms in this chapter.

6.1.1 Syntax of multiple valued logics

The language $\Lambda$ of a propositional multiple valued logic (MVL) consists of logical formulas built from a set A of atoms, a set $\Gamma$ of connectives, and a set of logical constants; $\Delta$ is the set of truth values. Associated with each connective $\Theta$ of arity n there is an n-ary function $\Theta : \Delta^n \to \Delta$. Associated with every logical constant c there is a truth value $c'$ in $\Delta$. The sentences in $\Lambda$ are referred to as multiple valued formulas. A multiple valued formula in $\Lambda$ can be constructed using the following rules.

1. Atoms and logical constants are formulas.
2. If $\Theta$ is a connective of arity n and if $F_1, F_2, \ldots, F_n$ are formulas, then so is $\Theta(F_1, F_2, \ldots, F_n)$.

6.1.2 Semantics of multiple-valued logics

Definition 26 An interpretation for $\Lambda$ is a function from A to $\Delta$.

Interpretations can be extended to mappings from formulas to $\Delta$. Let I be an interpretation. I can be extended to $I'$, a mapping from multiple-valued formulas to the set $\Delta$, as follows.

1. $I'(c) = c'$ if c is a logical constant
2. $I'(a) = I(a)$ for all atoms a
3. $I'(\Theta(F_1, \ldots, F_n)) = \Theta(I'(F_1), \ldots, I'(F_n))$

Example 13 In this thesis we will be using a family of multiple valued logics $P_n$ due to Post [45], where n is the number of truth values in the logic. $\Delta$, the set of truth values in $P_n$, is the set $\{0, 1, 2, \ldots, n-1\}$. $P_n$ has two binary connectives, $\wedge_p$ and $\vee_p$, and one unary connective $\sim$. $\wedge_p$ and $\vee_p$ correspond to the min and max functions, respectively. $\sim$ is the function $(i + 1) \bmod n$.
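The extension of an interpretation to arbitrary formulas can be made concrete for the Post logics with a small recursive evaluator in Python. The tuple encoding of formulas, the operator tags `"and"`, `"or"` and `"cyc"`, the treatment of integers as logical constants denoting themselves, and the sample formula are all assumptions made for this sketch.

```python
def evaluate(formula, interp, n):
    """Value in {0, ..., n-1} of a Post-logic formula under `interp`.

    A formula is an int (logical constant), a str (atom), or a tuple
    (op, arg, ...) with op in {"and", "or", "cyc"}.
    """
    if isinstance(formula, int):
        return formula                      # constant: denotes itself
    if isinstance(formula, str):
        return interp[formula]              # atom: look up the interpretation
    op, *args = formula
    values = [evaluate(arg, interp, n) for arg in args]
    if op == "and":
        return min(values)                  # Post conjunction is min
    if op == "or":
        return max(values)                  # Post disjunction is max
    if op == "cyc":
        return (values[0] + 1) % n          # unary cyclic connective
    raise ValueError(f"unknown connective: {op}")


# A sample formula over P4: (x1 or x2) and cyc(x3).
F = ("and", ("or", "x1", "x2"), ("cyc", "x3"))
value_a = evaluate(F, {"x1": 1, "x2": 3, "x3": 3}, n=4)   # min(3, 0)
value_b = evaluate(F, {"x1": 1, "x2": 3, "x3": 1}, n=4)   # min(3, 2)
```

Under the first interpretation the cyclic connective wraps 3 around to 0, so the whole formula evaluates to 0; under the second it evaluates to 2.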

6.1.3 Signed formulas

Signed formulas have been used by Murray and Rosenthal [39], [41] and Hähnle [24] to build inference systems for multiple valued logics. A review of signed formulas and of the corresponding propositional logic appears below; it is borrowed from [41].

Definition 27 A sign is any subset of $\Delta$ or any expression that denotes a subset of $\Delta$.

Definition 28 A signed expression is an expression of the form $S : F$, where S is a sign and F is a formula in a multiple valued logic $\Lambda$.

Definition 29 A signed atom is a signed expression of the form $S : A$, where S is a sign and A is an atom in $\Lambda$.

In this thesis we are interested in signed formulas because they represent queries of the form, "Are there interpretations under which F evaluates to a truth value in S?" To answer arbitrary queries, we map formulas in $\Lambda$ to formulas in a classical propositional logic $\Lambda_S$, called signed formulas.

Definition 30 A signed formula is a formula in classical logic built from signed expressions using the connectives $\wedge$ and $\vee$.

Note that a signed formula $S : F$ is an atom in $\Lambda_S$ regardless of the size or complexity of F. The set of truth values is of course {true, false}. An arbitrary interpretation for $\Lambda_S$ may make an assignment of true or false to any signed formula (i.e., to any atom) in the usual way. Our goal is to focus attention only on those interpretations that relate to the sign in a signed formula. To accomplish this we restrict attention to $\Lambda$-consistent interpretations.

Definition 31 Let I be an interpretation over $\Lambda$ and let $I'$ be the extension of I. $I_S$, the corresponding $\Lambda$-consistent interpretation, is defined by $I_S(S : F) = \mathit{true}$ if $I'(F) \in S$, and $I_S(S : F) = \mathit{false}$ if $I'(F) \notin S$.

Note that there is a 1-1 correspondence between the set of all interpretations over $\Lambda$ and the set of $\Lambda$-consistent interpretations over $\Lambda_S$. Intuitively, $\Lambda$-consistent means an assignment of true to all signed formulas whose signs are simultaneously achievable via some interpretation over the original language. Restricting attention to $\Lambda$-consistent interpretations yields a new consequence relation: if $F_1$ and $F_2$ are formulas in $\Lambda_S$, we write $F_1 \models F_2$ if whenever $I_S$ is a $\Lambda$-consistent interpretation and $I_S(F_1) = \mathit{true}$, then $I_S(F_2) = \mathit{true}$.

Definition 32 A $\Lambda$-consistent interpretation $I_S$ satisfies a signed formula F if $I_S(F)$ is true. A $\Lambda$-consistent interpretation $I_S$ falsifies a signed formula F if $I_S(F)$ is false. A signed formula F is valid if every $\Lambda$-consistent interpretation satisfies it. A signed formula F is unsatisfiable if every $\Lambda$-consistent interpretation falsifies it.

The following lemma is immediate.

Lemma 6.1.1 Let $I_S$ be a $\Lambda$-consistent interpretation, let A be an atom and F a formula in $\Lambda$, and let $S_1$ and $S_2$ be signs. Then:

i. $I_S(\emptyset : F) = \mathit{false}$;
ii. $I_S(\Delta : F) = \mathit{true}$;
iii. $S_1 \subseteq S_2$ if and only if $S_1 : F \models S_2 : F$ for all formulas F;
iv. There is exactly one $\delta \in \Delta$ such that $I_S(\{\delta\} : A) = \mathit{true}$. $\Box$

The next lemma follows immediately from part iv of Lemma 6.1.1. First, we say that two formulas $F_S$ and $F'_S$ in $\Lambda_S$ are $\Lambda$-equivalent if $I_S(F_S) = I_S(F'_S)$ for any $\Lambda$-consistent interpretation $I_S$; we write $F_S \equiv_\Lambda F'_S$. Observe that $F_S \equiv_\Lambda F'_S$ if and only if $F_S \models F'_S$ and $F'_S \models F_S$.

Lemma 6.1.2 (The Reduction Lemma). Let $S_1 : A$ and $S_2 : A$ be signed atoms in $\Lambda_S$; then $S_1 : A \wedge S_2 : A \equiv_\Lambda (S_1 \cap S_2) : A$ and $S_1 : A \vee S_2 : A \equiv_\Lambda (S_1 \cup S_2) : A$.

6.1.4 $\Lambda$-atomic formulas

We say that a formula in $\Lambda_S$ is $\Lambda$-atomic if for each atom $S : A$, A is an atom in $\Lambda$; i.e., if the only atoms in the formula are signed atoms. In [41], it was shown that a $\Lambda$-equivalent $\Lambda$-atomic version of any signed formula can in principle be computed when $\Delta$ is finite. The conversion is achieved through equivalences similar to those used in converting arbitrary formulas in classical logic into NNF (see Section 2.3). This conversion is also referred to as pushing signs inward. In general, computing the $\Lambda$-atomic equivalent of $S : F$ can be prohibitively expensive. Yet for many applications in which prime implicants/implicates are useful (e.g., hardware design), the logic employed is highly structured and turns out to be a regular logic as defined by Hähnle [23].

6.1.5 Regular logics

Intuitively, a regular logic has a finite linearly ordered truth domain, and its connectives behave monotonically with respect to each of their arguments; for details, see [23].

We assume that $\Delta$ is a finite linearly ordered set $\{0, 1, 2, \ldots, n-1\}$.

Definition 33 A sign S is a regular sign if it represents an interval of $\Delta$ containing either 0 or $n-1$.

Obviously, any regular sign can be represented as $\{< i\}$ or $\{> i\}$, where $i \in \Delta$ (or as $\Delta$ or $\emptyset$ in case they are needed). In this chapter we show how signed formulas can be used to compute prime implicates/implicants for regular logics. To compute prime implicates/implicants we initially start with a signed formula with a regular sign.

Definition 34 A connective $\Theta$ is a regular connective if, for any regular sign S, $S : \Theta(F_1, \ldots, F_n)$ is $\Lambda$-equivalent to a formula built from $S_1 : F_1, \ldots, S_n : F_n$, for some regular signs $S_1, \ldots, S_n$, using either $\wedge$ or $\vee$ (the classical 2-valued connectives); or, if $\Theta$ is a unary connective, if $S : \Theta(F) \equiv_\Lambda S' : F$, where S and $S'$ are regular signs.

A regular logic is a multiple valued logic that has regular connectives only.

When a regular sign is pushed inside a regular connective, the arguments of the connective become signed, and those resulting signs are also regular. The connectives $\wedge_p$ and $\vee_p$ of Post logics are regular connectives, whereas the connective $\sim$ is not. However, $\sim$ can be expressed using regular connectives: $\sim(i) = \sim'(i) \wedge_p J_{n-1}(i)$, where $\sim'(i) = \min(i + 1, n - 1)$ and $J_{n-1}(i) = n - 1 - i + (i \bmod (n - 1))$. Given a signed formula $S : F$, where S is regular and F is a formula from $P_n$, we have the following rules for driving S inward.

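$$\begin{array}{rcl}
\{< i\} : (G \wedge_p H) & \equiv_\Lambda & \{< i\} : G \ \vee\ \{< i\} : H \\
\{> i\} : (G \wedge_p H) & \equiv_\Lambda & \{> i\} : G \ \wedge\ \{> i\} : H \\
\{< i\} : (G \vee_p H) & \equiv_\Lambda & \{< i\} : G \ \wedge\ \{< i\} : H \\
\{> i\} : (G \vee_p H) & \equiv_\Lambda & \{> i\} : G \ \vee\ \{> i\} : H \\
\{> i\} : \sim'(F) & \equiv_\Lambda & \{> i - 1\} : F \\
\{< i\} : \sim'(F) & \equiv_\Lambda & \{< i - 1\} : F \\
\{> i\} : J_{n-1}(F) & \equiv_\Lambda & \{< n - 1\} : F \\
\{< i\} : J_{n-1}(F) & \equiv_\Lambda & \{> n - 2\} : F
\end{array}$$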

In applying the above rules, we assume the following simplifications: $\{> n-1\} : F$ is always false; $\{> -1\} : F$ is always true. Consider the following signed formula in $P_4$: $\{> 1\} : F$, where F is $((x_1 \vee_p x_2) \wedge_p \sim(x_3))$. After driving the signs inward we get the following $\Lambda$-atomic formula: $F' = ((\{> 1\} : x_1 \vee \{> 1\} : x_2) \wedge (\{> 0\} : x_3 \wedge \{< 3\} : x_3))$.
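The rules for driving signs inward can be checked exhaustively for small n. The Python sketch below tests the decomposition of the cyclic connective into min(i + 1, n - 1) and $J_{n-1}$, and then each rule over all truth values. The function names are assumptions made for this sketch, and the rules for signs of the form {> i} are checked only for i < n - 1 and those for {< i} only for i > 0, on the assumption that the trivially false signs have already been simplified away.

```python
def cyc(v, n):
    """The unary Post connective: (v + 1) mod n."""
    return (v + 1) % n

def cyc_prime(v, n):
    """The regular part of the decomposition: min(v + 1, n - 1)."""
    return min(v + 1, n - 1)

def J(v, n):
    """J_{n-1}: n - 1 for v < n - 1, and 0 for v = n - 1."""
    return n - 1 - v + (v % (n - 1))

def rules_hold(n):
    vals = range(n)
    # the cyclic connective equals the Post conjunction (min) of the two parts
    ok = all(cyc(v, n) == min(cyc_prime(v, n), J(v, n)) for v in vals)
    for g in vals:
        for h in vals:
            for i in vals:
                ok &= (min(g, h) < i) == (g < i or h < i)
                ok &= (min(g, h) > i) == (g > i and h > i)
                ok &= (max(g, h) < i) == (g < i and h < i)
                ok &= (max(g, h) > i) == (g > i or h > i)
    for f in vals:
        for i in range(n - 1):              # {> i} with i < n - 1
            ok &= (cyc_prime(f, n) > i) == (f > i - 1)
            ok &= (J(f, n) > i) == (f < n - 1)
        for i in range(1, n):               # {< i} with i > 0
            ok &= (cyc_prime(f, n) < i) == (f < i - 1)
            ok &= (J(f, n) < i) == (f > n - 2)
    return ok

results = {n: rules_hold(n) for n in range(2, 7)}
```

Every entry of `results` is true for n from 2 to 6, which supports the rule table on these small domains but is of course not a proof for general n.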

6.1.6 Adding negation to signed formulas

Let F be any signed formula. We introduce classical negation ($\neg$) into signed formulas by defining $\neg F$ as follows: if $I_S$ is a $\Lambda$-consistent interpretation, $I_S(\neg F) = \mathit{true}$ if $I_S(F) = \mathit{false}$; otherwise, $I_S(\neg F) = \mathit{false}$. The proofs of the identities listed below follow from $\Lambda$-consistency and well known properties of classical two-valued logic.

1. $\neg (S : F) \equiv_\Lambda \overline{S} : F$, where $\overline{S} = \Delta - S$.
2. $\neg (S_1 : F_1 \wedge S_2 : F_2) \equiv_\Lambda \neg S_1 : F_1 \vee \neg S_2 : F_2$.
3. $\neg (S_1 : F_1 \vee S_2 : F_2) \equiv_\Lambda \neg S_1 : F_1 \wedge \neg S_2 : F_2$.
4. $S_1 : F_1 \models S_2 : F_2$ iff $\neg S_1 : F_1 \vee S_2 : F_2$ is valid w.r.t. $\Lambda$-consistency.


6.1.7 Notation for signed formulas

The notion of semantic graphs was extended by Murray and Rosenthal to signed formulas in [40]. This section contains a description of the notation and definitions that are needed from their paper. The formulas in $\Lambda_S$ that we are interested in are in NNF. Such $\Lambda_S$ formulas can be represented as semantic graphs in the usual way (as in Section 2.4), the only difference being that signed literals replace literals. The definitions of c-paths and d-paths are identical to the classical case. As an example, the formula $((S_1 : C \wedge S_2 : A) \vee S_3 : C) \wedge (S_4 : A \vee (S_5 : B \wedge S_6 : C))$ is displayed graphically in Figure 6.1.

[Figure 6.1: Signed semantic graph]

The graph above contains four c-paths (maximal conjunctions of literal occurrences): $\{S_1 : C, S_2 : A, S_4 : A\}$, $\{S_3 : C, S_5 : B, S_6 : C\}$, $\{S_1 : C, S_2 : A, S_5 : B, S_6 : C\}$, and $\{S_3 : C, S_4 : A\}$. Since there is no negation symbol in $\Lambda$-atomic formulas, the definition of links is different from the propositional case.
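The four c-paths listed above can be reproduced mechanically. The following Python sketch enumerates the c-paths of the example formula; the nested-tuple encoding of the graph, the string form of signed literals, and the function name `cpaths` are assumptions made for this illustration.

```python
from itertools import product

def cpaths(g):
    """Return the c-paths of a signed NNF graph as tuples of signed literals."""
    if isinstance(g, str):                    # a signed literal such as "S1:C"
        return [(g,)]
    op, *args = g
    if op == "and":                           # one sub-path from every conjunct
        return [tuple(lit for part in parts for lit in part)
                for parts in product(*(cpaths(a) for a in args))]
    # "or": the c-paths of each disjunct, taken separately
    return [path for a in args for path in cpaths(a)]


# ((S1:C and S2:A) or S3:C) and (S4:A or (S5:B and S6:C))
G = ("and",
     ("or", ("and", "S1:C", "S2:A"), "S3:C"),
     ("or", "S4:A", ("and", "S5:B", "S6:C")))

paths = cpaths(G)
path_sets = {frozenset(p) for p in paths}
```

The d-paths could be enumerated by the same function with the roles of `"and"` and `"or"` exchanged.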

Definition 35 A c-link is a minimal set of mutually unsatisfiable pairwise c-connected signed atoms, and a d-link is a minimal set of mutually valid pairwise d-connected signed atoms.

It is easy to see that that the intersection of their signs of a link is ; for c-links and the union is  for d-links. Note that for the special case of classical logic, this reduces to the usual de nition of a pair of complementary literals. In S , however, links are not necessarily binary. Since dissolution was designed to operate on pairs of objects we have de ne partial links.

Definition 36 A partial c-link is a pair of c-connected literals that differ only in their signs, and a partial d-link is a pair of d-connected literals.

For example, in the above figure S2:A and S4:A form a partial c-link, and if S2 ∩ S4 = ∅, then they also form a c-link.

Lemma 6.1.3 Let G be a semantic graph. G is unsatisfiable (valid) if and only if every c-path (d-path) is unsatisfiable (valid), and a c-path (d-path) is unsatisfiable (valid) if and only if it contains a c-link (d-link).

The definition of a subgraph is the same as in the propositional case. The example of Figure 6.1 is shown below on the left; the subgraph relative to the set X = {S2:A, S4:A, S6:C} is the graph GX on the right.

[Figure 6.2: Subgraphs in signed formulas. Left: the graph G of Figure 6.1. Right: the subgraph GX = S2:A ∧ (S4:A ∨ S6:C).]

Note that every c-path (d-path) in GX will be a partial c-path (d-path) in G. The definitions of CC, CPE, DC, and DPE carry over from propositional logic (see Section 2.5).

6.1.8 Signed path dissolution

Signed dissolution results from generalizing the standard formulation to operate on partial c-links. Suppose literals S1:A and S2:A reside in conjoined subgraphs X and Y, respectively. The standard dissolvent contains all paths of X ∧ Y that do not contain both S1:A and S2:A. We define the signed path dissolvent to be the standard dissolvent disjoined with that subgraph whose paths are exactly those that contain both S1:A and S2:A. However, these latter paths can be simplified: the two literals in the partial link can be replaced by the single literal (S1 ∩ S2):A, by the Reduction Lemma. Formally, let M = X ∧ Y, where S1:A is in X and S2:A is in Y. Then the signed path dissolvent of L = {S1:A, S2:A} in M is:

\[
\mathrm{DV}(L, M) =
\bigl( X \wedge \mathrm{CC}(S_2{:}A, Y) \bigr)
\;\vee\;
\bigl( \mathrm{CC}(S_1{:}A, X) \wedge \mathrm{CPE}(S_2{:}A, Y) \bigr)
\;\vee\;
\bigl( (\mathrm{CPE}(S_1{:}A, X) - \{S_1{:}A\}) \wedge (\mathrm{CPE}(S_2{:}A, Y) - \{S_2{:}A\}) \wedge (S_1 \cap S_2){:}A \bigr)
\]

The number of c-paths does not decrease unless S1 ∩ S2 = ∅, in which case the rightmost disjunct in the dissolvent is dropped. Strong completeness, however, still holds (see [41]); a finite number of steps will yield a graph without partial links. We illustrate signed path dissolution with an example. Consider the Λ-atomic formula in Figure 6.3 below, where Δ = {0, 1, 2}. We dissolve on {{1,2}:A, {0,1}:A}. The dissolvent is:

({0,1}:C ∨ {0,1}:A) ∧ (({1}:B ∧ {1,2}:A) ∨ ({1}:C ∧ {0}:B))

Figure 6.3: A signed formula

(({0,1}:C ∨ {0,1}:A) ∧ {1}:C ∧ {0}:B) ∨ ({0,1}:C ∧ {1}:B ∧ {1,2}:A) ∨ ({1}:B ∧ {1}:A)
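Since dissolution preserves equivalence, the example can be cross-checked by brute force over all interpretations with Δ = {0, 1, 2}. The sketch below is not from the thesis. It assumes that Figure 6.3 is read as ({0,1}:C ∨ {0,1}:A) ∧ (({1}:B ∧ {1,2}:A) ∨ ({1}:C ∧ {0}:B)) and the dissolvent as (({0,1}:C ∨ {0,1}:A) ∧ {1}:C ∧ {0}:B) ∨ ({0,1}:C ∧ {1}:B ∧ {1,2}:A) ∨ ({1}:B ∧ {1}:A); the helper names `lit`, `AND`, `OR`, `holds`, and `equivalent` are hypothetical.

```python
from itertools import product

DELTA = (0, 1, 2)  # truth value set assumed for the example

def lit(sign, atom):
    return ("lit", frozenset(sign), atom)

def AND(*subformulas):
    return ("and", subformulas)

def OR(*subformulas):
    return ("or", subformulas)

def holds(formula, interp):
    """Evaluate a signed NNF formula under an assignment of atoms to values."""
    if formula[0] == "lit":
        return interp[formula[2]] in formula[1]
    results = [holds(sub, interp) for sub in formula[1]]
    return all(results) if formula[0] == "and" else any(results)

def equivalent(f, g, atoms):
    """True if f and g agree under every assignment of values to the atoms."""
    for values in product(DELTA, repeat=len(atoms)):
        interp = dict(zip(atoms, values))
        if holds(f, interp) != holds(g, interp):
            return False
    return True

X = OR(lit({0, 1}, "C"), lit({0, 1}, "A"))
Y = OR(AND(lit({1}, "B"), lit({1, 2}, "A")), AND(lit({1}, "C"), lit({0}, "B")))
M = AND(X, Y)

dissolvent = OR(
    AND(X, lit({1}, "C"), lit({0}, "B")),                  # X with the paths of Y that miss {1,2}:A
    AND(lit({0, 1}, "C"), lit({1}, "B"), lit({1, 2}, "A")),  # paths missing {0,1}:A but through {1,2}:A
    AND(lit({1}, "B"), lit({1}, "A")),                     # both link literals merged into {1}:A
)

print(equivalent(M, dissolvent, ("A", "B", "C")))
```

The third disjunct is where the Reduction Lemma is applied: {0,1}:A and {1,2}:A are replaced by ({0,1} ∩ {1,2}):A = {1}:A.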

If we continue dissolving until no partial links remain, a full dissolvent results:

[Graph of the full dissolvent; its literals are {1}:C, {0}:B, {0,1}:C, {0,1}:A, {1}:B, and {1}:A.]

The preceding development of dissolution has been conducted from a refutation viewpoint. Just as unsatisfiability can be dualized to validity, so can these connectives, and therefore dissolution itself, be dualized so as to focus on d-links and d-paths rather than on c-links and c-paths. Alternatively, such a disjunctive full dissolvent of G may be obtained using the version of dissolution defined above, operating on c-paths and partial c-links, simply by computing ¬FD(¬G). When there is a possibility of confusion, we will mention explicitly the type of link (and path) on which we are dissolving.

6.2 Prime Implicants/Implicates and Signed Formulas

6.2.1 Definitions

In this section we give basic definitions of prime implicants/implicates with respect to signed formulas. Note that the signed formula G mentioned in these definitions need not be Λ-atomic, although typically it is.

Definition 37 A conjunctive term is a conjunction of signed literals Si:xi where each xi appears exactly once.

Definition 38 A conjunctive term C subsumes another C' iff C' ⊨ C.

Definition 39 A conjunctive term C is an s-implicant of a signed formula G iff C ⊨ G.

Definition 40 A conjunctive term C is a prime s-implicant of a signed formula G iff
1. C is not false,
2. C is an s-implicant of G, and
3. there is no other conjunctive term C' which subsumes C and is an s-implicant of G.

Definition 41 A disjunctive term is a disjunction of signed literals Si:xi where each xi appears at most once.

Definition 42 A disjunctive term D subsumes another D' iff D ⊨ D'.

Definition 43 A disjunctive term D is an s-implicate of a signed formula G iff G ⊨ D.

Definition 44 A disjunctive term D is a prime s-implicate of a signed formula G iff
1. D is not true,
2. D is an s-implicate of G, and
3. there is no other disjunctive term D' which subsumes D and is an s-implicate of G.

Consider a c-path consisting of the literals (signed atoms) Si:xi, 1 ≤ i ≤ n. Essentially, it denotes the conjunction of its literal occurrences. Such a conjunction is more succinctly represented as follows: if there are distinct i and j such that xi = xj, then replace Si:xi and Sj:xj by (Si ∩ Sj):xi (by the Reduction Lemma). Iteration of this process will obviously produce a conjunctive term equivalent to the original c-path. If any of the resulting signs are empty, then the conjunctive term reduces to false; if any are equal to Δ, then that literal is dropped from the conjunctive term. Similarly, we can associate with any d-path an equivalent disjunctive term, where signs of literals over the same atom are combined by union rather than by intersection. If any of the resulting signs are equal to Δ, then the disjunctive term reduces to true; if any are empty, then that literal is dropped from the disjunctive term.
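The reduction of a path to a term can be sketched in a few lines. This is an illustrative sketch only: the representation of a path as a list of (sign, atom) pairs, the choice Δ = {0, 1, 2}, the use of `None` for the constants false and true, and the function names are all assumptions.

```python
DELTA = frozenset({0, 1, 2})  # assumed truth value set

def conjunctive_term(c_path, delta=DELTA):
    """Reduce a c-path to {atom: sign}; return None if the term is false."""
    term = {}
    for sign, atom in c_path:
        # Reduction Lemma: literals over the same atom are merged by intersection.
        term[atom] = term.get(atom, delta) & frozenset(sign)
    if any(len(sign) == 0 for sign in term.values()):
        return None  # an empty sign makes the whole conjunction false
    # A literal whose sign is all of delta is always true and is dropped.
    return {atom: sign for atom, sign in term.items() if sign != delta}

def disjunctive_term(d_path, delta=DELTA):
    """Reduce a d-path to {atom: sign}; return None if the term is true."""
    term = {}
    for sign, atom in d_path:
        # Dually, signs over the same atom are merged by union.
        term[atom] = term.get(atom, frozenset()) | frozenset(sign)
    if any(sign == delta for sign in term.values()):
        return None  # a full sign makes the whole disjunction true
    return {atom: sign for atom, sign in term.items() if sign}

print(conjunctive_term([({0, 1}, "A"), ({1, 2}, "A"), ({1}, "B")]))
```

For the path {0,1}:A, {1,2}:A, {1}:B the result is the term {1}:A ∧ {1}:B.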

Definition 45 A c-path (or d-path) P subsumes another c-path (or d-path) P' iff the conjunctive term (or disjunctive term) equivalent to P subsumes the corresponding term for P'.

Since c-paths and d-paths are essentially conjunctive and disjunctive terms, respectively, we will use the terms interchangeably. The test for subsumption between conjunctive and disjunctive terms is straightforward:

Lemma 6.2.1 A conjunctive term C (≠ false) is subsumed by another conjunctive term C' if for every S':x ∈ C', there is a literal S:x ∈ C such that S ⊆ S'.

Lemma 6.2.2 A disjunctive term D (≠ true) is subsumed by another disjunctive term D' if for every S':x ∈ D', there is a literal S:x ∈ D such that S ⊇ S'.

As an example, consider the signed formula in Figure 6.4, where Δ = {0, 1, 2}.

({0,1}:A ∧ {2}:B) ∨ ({0,2}:A ∧ {2}:C) ∨ {1,2}:A ∨ {0}:B

Figure 6.4: Signed formulas and signed implicants

An example of an s-implicant that is not a prime s-implicant is {0,2}:A ∧ {2}:C. However, {2}:C is a prime s-implicant. Similarly, {1,2}:A ∨ {0,2}:B ∨ {1,2}:C is an s-implicate which is not prime, whereas {1,2}:A ∨ {0,2}:B ∨ {2}:C is a prime s-implicate.
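The subsumption conditions of Lemmas 6.2.1 and 6.2.2 translate directly into set comparisons on signs. In the sketch below (an illustration, not the thesis implementation) a term is assumed to be a dictionary from atoms to signs, with the containment directions read as S ⊆ S' for conjunctive terms and S ⊇ S' for disjunctive terms; the function names are hypothetical.

```python
def conj_subsumed_by(c, c_prime):
    """Condition of Lemma 6.2.1: conjunctive term c is subsumed by c_prime."""
    return all(atom in c and c[atom] <= sign for atom, sign in c_prime.items())

def disj_subsumed_by(d, d_prime):
    """Condition of Lemma 6.2.2: disjunctive term d is subsumed by d_prime."""
    return all(atom in d and d[atom] >= sign for atom, sign in d_prime.items())

# Terms from the example above.
not_prime_implicant = {"A": frozenset({0, 2}), "C": frozenset({2})}
prime_implicant = {"C": frozenset({2})}
print(conj_subsumed_by(not_prime_implicant, prime_implicant))

not_prime_implicate = {"A": frozenset({1, 2}), "B": frozenset({0, 2}), "C": frozenset({1, 2})}
prime_implicate = {"A": frozenset({1, 2}), "B": frozenset({0, 2}), "C": frozenset({2})}
print(disj_subsumed_by(not_prime_implicate, prime_implicate))
```

Both checks succeed, which is why {0,2}:A ∧ {2}:C and {1,2}:A ∨ {0,2}:B ∨ {1,2}:C are not prime.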

6.2.2 Prime S-Implicants of Λ-atomic Formulas

We now have the tools necessary for computing prime s-implicants (or s-implicates) of any Λ-atomic formula. In addition, these techniques can handle any MVL in which formulas can be converted to Λ-atomic form. To find all prime s-implicants that force the condition that F evaluates to an element of S, find a Λ-atomic equivalent F' of S:F, then find a (disjunctive) linkless equivalent F'' of F', and finally collect the conjunctive terms of the c-paths of F'' that are not subsumed by others. In the subsection below, we establish the results that form the basis for this approach. In the following subsection, the s-implicant methods are defined precisely.

6.2.3 Foundations

Lemma 6.2.3 Every c-path in a Λ-atomic formula G corresponds to an s-implicant of G.

Proof: Given any c-path p in G, let C be the conjunctive term corresponding to p, and let Is be a Δ-consistent interpretation which satisfies C. For each literal S:x in C, there are literals S1:x, S2:x, ..., Sk:x in p such that S ⊆ Si, 1 ≤ i ≤ k. Furthermore, all literals in p are so related to some literal in C. As a result, Is satisfies every signed atom in p and hence satisfies G. □

Theorem 6.2.4 For any non-empty Λ-atomic formula G in which no d-path contains a partial link, every s-implicant of G is subsumed by some c-path of G.

Proof: Let C = S1:x1 ∧ ... ∧ Sn:xn be an s-implicant of G. Since C ⊨ G, G ∨ ¬C is valid with respect to Δ-consistency and is therefore spanned by its full set of (disjunctive) links, i.e., all d-paths contain a link. Moreover, only binary links need be considered because C is a conjunctive (and hence ¬C is a disjunctive) term, and G is assumed to be free of partial links. Note that ¬C = S̄1:x1 ∨ ... ∨ S̄n:xn. Therefore, all d-links are of the form {S̄i:xi, S'i:xi}, where S̄i:xi ∈ ¬C, S'i:xi is in G, and S̄i ∪ S'i = Δ.

Let R be the set of all literals in G that are linked to ¬C. From the observations above, we know that corresponding to each literal S'i:xi in R, there is a literal Si:xi ∈ C such that Si ⊆ S'i. So any c-path through R will subsume C, and, to prove the theorem, it suffices to show that GR contains a c-path through G. So suppose that every c-path in GR is partial in G. By the dual of Lemma 4.4.1 in Chapter 3 (though this lemma expresses a structural property of semantic graphs and was proved for classical logic, it still holds for signed formulas), there is a d-path p through G which does not meet GR. The path p is not linked to ¬C because it has no nodes from R; furthermore, p itself contains no links since G is linkless. Therefore, the d-path comprised of the nodes from both p and ¬C is a linkless d-path through G ∨ ¬C. But this is a contradiction: G ∨ ¬C is spanned by its d-links by definition. As a result, some c-path in GR is not partial in G and the proof is complete. □

Theorem 6.2.5 Suppose G is a non-null graph representing a Λ-atomic formula having no partial d-links, and let ψ(G) be defined as follows:

\[
\psi(G) = \{\, p \mid p \text{ is a c-path through } G,\ p \neq \mathit{false},\ \text{and no other c-path } q \text{ through } G \text{ subsumes } p \,\}
\]

Then ψ(G) is the set of all prime s-implicants of G.

Proof: Direct result of Lemma 6.2.3 and Theorem 6.2.4. □

We say that a signed formula is in reduced CNF if each of its clauses is a disjunctive term, i.e., if it has no true disjuncts, and every disjunct has at most one signed literal for each atom.

Corollary 6.2.6 If G is a signed formula in reduced CNF, then ψ(G) is the set of all prime s-implicants of G.

This follows from Theorem 6.2.5, as such a CNF formula has no d-paths with partial links.

Theorem 6.2.7 FD(F) (with respect to c-links or to d-links) is equivalent to F under all Δ-consistent interpretations.

Proof: See [41]. □

The classical versions of these theorems coincide with those in Chapter 3. We can now state the steps in finding all the prime s-implicants of a signed formula F.

• Step 1: If F is not Λ-atomic, push the sign inward, resulting in a Λ-atomic formula F.

• Step 2: Find F', a Λ-atomic formula equivalent to F that has no d-links, by computing FD(F) with respect to d-links or by converting to reduced CNF.

• Step 3: Find all the c-paths of F' which are not subsumed by other c-paths. The conjunctive terms corresponding to these c-paths are the prime s-implicants of F.

The correctness of the steps follows from Theorems 6.2.5 and 6.2.7. We illustrate the different steps by an example. Consider the semantic graph from Figure 6.4. This is in Λ-atomic form, so we do nothing in Step 1. In Step 2 we find the full dissolvent (with respect to d-links), which is shown below in Figure 6.5.

{1,2}:A ∨ {0,2}:B ∨ {2}:C

Figure 6.5: Signed full dissolvent (with respect to d-links)

Since none of the c-paths subsume each other, the prime s-implicants are exactly the c-paths of the graph in Figure 6.5: {1,2}:A, {0,2}:B, and {2}:C.

We have so far restricted our discussion to finding prime s-implicants only. However, we can dualize all the theorems and lemmas. This would give us a method to find prime s-implicates. Alternatively, given G, one could find the prime s-implicants of ¬G and then negate them to produce the prime s-implicates of G.
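The result of the worked example can be cross-checked without dissolution by enumerating every conjunctive term over A, B, C and testing it against the truth table. This brute-force sketch is not the method of this chapter and is exponential in the number of atoms; it only serves as an independent check. It assumes that Figure 6.4 is read as the disjunction ({0,1}:A ∧ {2}:B) ∨ ({0,2}:A ∧ {2}:C) ∨ {1,2}:A ∨ {0}:B, and all function names are hypothetical.

```python
from itertools import combinations, product

DELTA = (0, 1, 2)
ATOMS = ("A", "B", "C")

def G(v):
    """Figure 6.4, read as a disjunction of four terms (an assumption)."""
    return ((v["A"] in {0, 1} and v["B"] == 2)
            or (v["A"] in {0, 2} and v["C"] == 2)
            or v["A"] in {1, 2}
            or v["B"] == 0)

def nonempty_signs():
    return [frozenset(c) for r in range(1, len(DELTA) + 1)
            for c in combinations(DELTA, r)]

def is_implicant(term):
    """term maps every atom to a sign; the full sign stands for an absent atom."""
    return all(G(dict(zip(ATOMS, values)))
               for values in product(*(sorted(term[a]) for a in ATOMS)))

def prime_s_implicants():
    terms = [dict(zip(ATOMS, signs))
             for signs in product(nonempty_signs(), repeat=len(ATOMS))]
    implicants = [t for t in terms if is_implicant(t)]

    def properly_subsumed(t, u):
        # u subsumes t when every sign of t is contained in the matching sign of u.
        return u != t and all(t[a] <= u[a] for a in ATOMS)

    primes = [t for t in implicants
              if not any(properly_subsumed(t, u) for u in implicants)]
    full = frozenset(DELTA)
    # Drop literals with the full sign, as in the reduction of c-paths to terms.
    return {frozenset((a, s) for a, s in t.items() if s != full) for t in primes}

for term in prime_s_implicants():
    print(sorted((atom, sorted(sign)) for atom, sign in term))
```

Under this reading of the figure the enumeration returns exactly the three terms {1,2}:A, {0,2}:B, and {2}:C.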

6.3 Post Logics

In this section we describe a method to compute all the prime implicants of arbitrary formulas in Post logics, Pn. The key observation is that this can be accomplished using signed formulas in which all signs are regular. The set of truth values is the set Δ = {0, ..., n−1}, with ∨p, ∧p, and ∼ as the operators. The notational conventions for expressing prime implicants and prime implicates of formulas in Post logics differ from our general notation for MVLs. In particular, these notions are approached in the context of the MVL itself, whereas our approach is through the use of the meta-language of signed formulas. In the remainder of this section, we relate concepts and notation from Post logics to signed formulas. Many of the following definitions and notation are taken from [23].

Definition 46 A literal is a term of the form Xi(a, b), where Xi is any atom in Pn, and 0 ≤ a, b ≤ n−1, and for any interpretation I in Pn, the corresponding extension I' maps Xi(a, b) to n−1 if a ≤ I(Xi) ≤ b, and to 0 otherwise.

Definition 47 A product term is of the form C ∧p X1(a1, b1) ∧p ... ∧p Xm(am, bm), where 1 ≤ C ≤ n−1, and each Xi(ai, bi) is a literal. Therefore, any interpretation will make the product term evaluate to either C or to 0. If a product term does not include an atom Xj, then we can find an equivalent product term which includes Xj by adding the literal Xj(0, n−1). In the rest of this chapter we assume that product terms include all atoms under consideration.

Definition 48 A product term P is an implicant of a formula F if for every interpretation I, I'(P) ≤ I'(F).

Definition 49 Let P be the product term C ∧p x1(a1, b1) ∧p ... ∧p xm(am, bm), and let P' be the term C' ∧p x1(c1, d1) ∧p ... ∧p xm(cm, dm). We say that P subsumes P' if C' ≤ C and, for each i, ai ≤ ci ≤ di ≤ bi. A product term P is a prime implicant of F if no other implicant of F subsumes it.

Note that the intervals associated with the atoms of product terms do not necessarily correspond to regular signs. This is to be expected: although we can preserve regularity while generating a Λ-atomic formula, processing the formula with either signed dissolution or conversion to reduced CNF will produce non-regular signs. However, each such sign is a finite set, and can therefore be represented as a unique minimal set of disjoint intervals. For example, the sign {0,1,3,4,6} can be viewed as the set of intervals {(0,1), (3,4), (6,6)}; we say that (0,1), (3,4), and (6,6) are the intervals corresponding to the sign {0,1,3,4,6}.

Notation 6.3.1 Let {X | X is a prime s-implicant of {>i}:F} be denoted by S_{{>i}:F}. Let M = S1:x1 ∧ ... ∧ Sm:xm be an s-implicant of {>i}:F. We denote by P_M the set of all product terms of the form (i+1) ∧p x1(a1, b1) ∧p ... ∧p xm(am, bm), where for 1 ≤ j ≤ m, (aj, bj) is an interval corresponding to Sj. (If xj does not appear in M, then aj = 0 and bj = n−1.)

Example 14 Let {1,2,4}:x1 ∧ {3}:x2 be an s-implicant of {>3}:F. Then the corresponding product terms are 4 ∧p x1(1,2) ∧p x2(3,3) and 4 ∧p x1(4,4) ∧p x2(3,3). It is easy to see that none of the product terms in P_M subsume each other.
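The passage from a sign to its intervals, and from an s-implicant M to the set P_M, can be sketched as follows. The sketch is illustrative only: the function names, the pair representation (constant, {atom: (a, b)}) of a product term, and the value n = 5 used for Example 14 (the example does not state n) are assumptions.

```python
from itertools import product

def intervals(sign):
    """Minimal list of disjoint closed intervals covering a finite sign."""
    result = []
    for value in sorted(sign):
        if result and value == result[-1][1] + 1:
            result[-1] = (result[-1][0], value)   # extend the current run
        else:
            result.append((value, value))         # start a new run
    return result

def product_terms(i, s_implicant, atoms, n):
    """P_M for an s-implicant M of {>i}:F, as (constant, {atom: (a, b)}) pairs."""
    # An atom missing from M gets the full interval (0, n-1).
    choices = [intervals(s_implicant.get(x, range(n))) for x in atoms]
    return [(i + 1, dict(zip(atoms, combo))) for combo in product(*choices)]

print(intervals({0, 1, 3, 4, 6}))

M = {"x1": {1, 2, 4}, "x2": {3}}
for term in product_terms(3, M, ["x1", "x2"], n=5):
    print(term)
```

Each product term picks one interval per atom, so the size of P_M is the product of the numbers of intervals of the signs in M.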

Theorem 6.3.2 Let F be a formula, let S_{{>i}:F} be the set of all prime s-implicants of {>i}:F, and let P_{{>i}:F} be the union of all the sets P_M, where M ∈ S_{{>i}:F}. Then every product term in P_{{>i}:F} is an implicant of F.

Proof: Let P be a product term in P_{{>i}:F}, and let M be the corresponding s-implicant of {>i}:F. Every interpretation over Δ maps P either to 0 or to i+1. So suppose I'(P) = i+1. I' will also map each literal xj(aj, bj) in the product term to n−1, and therefore aj ≤ I'(xj) ≤ bj. Let Is be the corresponding Δ-consistent interpretation; by definition Is(Sj:xj) = true, since {aj, aj+1, ..., bj} ⊆ Sj. Hence Is(M) is true, and therefore Is({>i}:F) is also true (since M is a prime s-implicant of {>i}:F). Therefore I'(F) ≥ i+1, and P is an implicant of F. □

Theorem 6.3.3 Suppose F is a formula and P is an implicant of F, where c (c > 0) is the constant in P. Then there is some product term in P_{{>c−1}:F} which subsumes P.

Proof: Let P be c ∧p x1(a1, b1) ∧p ... ∧p xm(am, bm) and consider M = S1:x1 ∧ ... ∧ Sm:xm, where Sj is the set of elements in the range aj to bj. M is an s-implicant of {>c−1}:F and will be subsumed by some prime s-implicant S'1:x1 ∧ ... ∧ S'm:xm in S_{{>c−1}:F} (say M'), where S'j ⊇ Sj, 1 ≤ j ≤ m. Now consider the product term P' in P_{M'} that is of the form c ∧p x1(a'1, b'1) ∧p ... ∧p xm(a'm, b'm) such that a'j ≤ aj ≤ bj ≤ b'j for 1 ≤ j ≤ m. There must be such a product term since S'j ⊇ Sj for each j, and hence P' subsumes P. □

Theorem 6.3.4 Let F be a formula, and let

\[
K = \bigcup_{i=0}^{n-2} P_{\{>i\}:F}.
\]

Let I be the result of removing all product terms in K that are subsumed by others. Then I is the set of all prime implicants of F.

Proof: Follows directly from Theorems 6.3.2 and 6.3.3. □

We can now give a method to find the set of prime implicants for a given formula.

• Step 1: Set I to ∅.
• Step 2: For each i, from n−2 down to 0, do
  • Step 2a: Find S, the set of prime s-implicants of {>i}:F.
  • Step 2b: Find PS, the set of all product terms corresponding to the s-implicants in S.
  • Step 2c: Add to I all those product terms in PS not subsumed by other product terms in I.

At the end of Step 2, I contains exactly the prime implicants of F. The correctness of the steps follows from Theorem 6.3.4. We illustrate the above method with an example in P3. Consider the following formula: (x1 ∨ x2) ∧ (∼x3). The values of S, PS, and I at the end of each iteration of Step 2 are shown in the table below. The iteration terminates and I has all the prime implicants of F.

i = 1
  S:  {2}:x1 ∧ {1}:x3
      {2}:x2 ∧ {1}:x3
  PS: 2 ∧p x1(2,2) ∧p x3(1,1)
      2 ∧p x2(2,2) ∧p x3(1,1)
  I:  2 ∧p x1(2,2) ∧p x3(1,1)
      2 ∧p x2(2,2) ∧p x3(1,1)

i = 0
  S:  {1,2}:x1 ∧ {0,1}:x3
      {1,2}:x2 ∧ {0,1}:x3
  PS: 1 ∧p x1(1,2) ∧p x3(0,1)
      1 ∧p x2(1,2) ∧p x3(0,1)
  I:  2 ∧p x1(2,2) ∧p x3(1,1)
      2 ∧p x2(2,2) ∧p x3(1,1)
      1 ∧p x1(1,2) ∧p x3(0,1)
      1 ∧p x2(1,2) ∧p x3(0,1)

Table 6.1: Example steps in the computation of prime implicants
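The accumulation loop of Step 2 can be sketched as below. The sketch takes the prime s-implicants for each i as given input (here copied from Table 6.1) rather than computing them by dissolution, so it only illustrates Steps 2b and 2c together with the subsumption test of Definition 49. The data layout and all function names are assumptions.

```python
from itertools import product

def intervals(sign):
    """Minimal list of disjoint closed intervals covering a finite sign."""
    result = []
    for value in sorted(sign):
        if result and value == result[-1][1] + 1:
            result[-1] = (result[-1][0], value)
        else:
            result.append((value, value))
    return result

def product_terms(constant, term, atoms, n):
    """All product terms (constant, {atom: (a, b)}) for one s-implicant."""
    choices = [intervals(term.get(x, range(n))) for x in atoms]
    return [(constant, dict(zip(atoms, combo))) for combo in product(*choices)]

def subsumes(p, q):
    """Definition 49: p subsumes q."""
    (c_p, iv_p), (c_q, iv_q) = p, q
    return c_q <= c_p and all(
        iv_p[x][0] <= iv_q[x][0] and iv_q[x][1] <= iv_p[x][1] for x in iv_p)

def prime_implicants(s_sets, atoms, n):
    """s_sets[i] is assumed to hold the prime s-implicants of {>i}:F."""
    result = []                                   # Step 1
    for i in range(n - 2, -1, -1):                # Step 2
        for term in s_sets[i]:                    # Step 2a (given as input)
            for p in product_terms(i + 1, term, atoms, n):    # Step 2b
                if not any(subsumes(q, p) for q in result):   # Step 2c
                    result.append(p)
    return result

ATOMS = ["x1", "x2", "x3"]
S_SETS = {
    1: [{"x1": {2}, "x3": {1}}, {"x2": {2}, "x3": {1}}],
    0: [{"x1": {1, 2}, "x3": {0, 1}}, {"x2": {1, 2}, "x3": {0, 1}}],
}
for constant, ivs in prime_implicants(S_SETS, ATOMS, n=3):
    print(constant, ivs)
```

None of the four product terms subsumes another: the terms with constant 1 have wider intervals but a smaller constant, so all four remain in I, as in the table.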

6.4 Other algorithms for computing prime implicates/implicants of formulas of multiple-valued logics

All the current algorithms are based on the truth table representation of multiple-valued formulas, and are adapted from Tison's consensus-based algorithm for propositional logic. These include algorithms due to Allen [1], Smith [26], and Su and Cheung [59]. We do not know of any algorithm which is based on formulas, other than the algorithm given in this chapter.


Chapter 7 Future work

7.1 Semi-Resolution

In Chapter 4 we showed how many of the redundancies created by dissolution are manifested by anti-links (a pair of distinct occurrences of the same literal). Some such anti-links can be removed through the anti-link operator. Ideally, however, an inference operation would simply avoid as many such redundancies as possible, and yet have better termination properties. The semi-resolution operation introduced in this section is a tentative step in this direction.

Definition 50 Let H be an arbitrary subgraph of a semantic graph G. Then WS(H, G), the weak split, is defined as follows:

\[
\mathrm{WS}(H, G) =
\begin{cases}
G & \text{if } H = \emptyset \\
\mathit{false} & \text{if } H = G \\
\bigvee_{i=1}^{n} \mathrm{WS}(H_{F_i}, F_i) & \text{if the final arc of } G \text{ is a d-arc} \\
\bigvee_{i=1}^{k} \mathrm{WS}(H_{F_i}, F_i) & \text{if the final arc of } G \text{ is a c-arc}
\end{cases}
\]

where Fi (1 ≤ i ≤ k) are the fundamental subgraphs of G that meet H, and Fi (k+1 ≤ i ≤ n) are those that do not.

Definition 51 Let G be a semantic graph in which M = X ∧ Y is the smallest full block containing the link l = {A, Ā}, where A ∈ X and Ā ∈ Y. We define RS(l, M), the semi-resolvent of M with respect to l, to be

\[
X \wedge Y[\bar{A} / (\bar{A} \wedge \mathrm{WS}(A, X))]
\]

where Y[Ā / (Ā ∧ WS(A, X))] means that the Ā from the link is replaced by Ā ∧ WS(A, X) in Y.

Intuitively, a satisfying c-path of M cannot contain both A and Ā. As a result, if it does hit Ā in Y, it must miss A in X, and hence it must contain a c-path of WS(A, X). We may therefore replace M by its semi-resolvent RS(l, M) without affecting the semantics of G. It is noteworthy that only the nodes of WS(A, X) are duplicated; none from Y are. Furthermore, we may swap X and Y to minimize this duplication. Semi-resolution is like path resolution [37], but introduces less redundancy. To provide some intuition about semi-resolution and to help explain its potential usefulness in prime implicate computations, consider the following example.

(A1 ∧ B1) ∨ (A2 ∧ B2) ∨ ... ∨ (An ∧ Bn) ∨ ((C1 ∨ L) ∧ (C2 ∨ (L̄ ∧ C3)))

Methods based on clause form essentially cannot handle such formulas. Converting to clause form involves 3·2^n clauses; 2^n of these contain L, and another 2^n contain L̄. None can be eliminated by subsumption; so, for example, in a resolution-based system, at least 2^{2n} resolutions must be performed before we can begin to accumulate prime implicates.

Using dissolution on the {L, L̄} link produces the following linkless formula:

(A1 ∧ B1) ∨ (A2 ∧ B2) ∨ ... ∨ (An ∧ Bn) ∨ ((C1 ∨ L) ∧ C2) ∨ (C1 ∧ L̄ ∧ C3)

Since the formula above is linkless, we may begin to extract prime implicates from its 6·2^n d-paths. Nevertheless, the situation is far from ideal. The d-paths can be partitioned into six subsets of size 2^n corresponding to the six d-paths through the two rightmost disjuncts. It is easy to see that each d-path in two of these subsets (those corresponding to {C1, L, L̄} and {C1, L, C3}) is subsumed by a d-path in the subset corresponding to {C1, L, C1}. Such redundancies can be eliminated via the anti-link mechanism introduced in Chapter 4, but it would be preferable to simply avoid introducing them. Consider now semi-resolving on the {L, L̄} link, resulting in the formula below:

(A1 ∧ B1) ∨ (A2 ∧ B2) ∨ ... ∨ (An ∧ Bn) ∨ ((C1 ∨ L) ∧ (C2 ∨ (L̄ ∧ C1 ∧ C3)))

This formula has only 4·2^n d-paths; the redundant 2·2^n paths in the graph produced by dissolution are simply not present. Furthermore, although the L and L̄ literals are still c-connected after semi-resolving, they are no longer linked if links are deleted upon activation. Semi-resolution is sufficient for computing prime implicates/implicants, i.e., prime implicants/implicates can be computed using semi-resolution rather than dissolution.

We do not know if it has an elegant termination property such as the strong termination of dissolution, or if it has non-termination properties as does the connection graph procedure [15]. All currently known results have been published elsewhere [35]. The effectiveness of using this rule instead of dissolution in prime implicate/implicant and diagnosis computations is currently under investigation.
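The d-path counts quoted in the example (3·2^n, 6·2^n, and 4·2^n) can be checked with a short script. The sketch below is illustrative and rests on assumptions: the example graphs are read as (A1 ∧ B1) ∨ ... ∨ (An ∧ Bn) ∨ R, with R = (C1 ∨ L) ∧ (C2 ∨ (L̄ ∧ C3)) for the original formula, ((C1 ∨ L) ∧ C2) ∨ (C1 ∧ L̄ ∧ C3) for the dissolvent, and (C1 ∨ L) ∧ (C2 ∨ (L̄ ∧ C1 ∧ C3)) for the semi-resolvent. The encoding, the "~L" spelling of L̄, and the function names are hypothetical.

```python
from itertools import product

def lit(name):
    return ("lit", name)

def AND(*subformulas):
    return ("and", subformulas)

def OR(*subformulas):
    return ("or", subformulas)

def count_d_paths(f):
    """d-paths add up across a conjunction and multiply across a disjunction."""
    if f[0] == "lit":
        return 1
    counts = [count_d_paths(sub) for sub in f[1]]
    if f[0] == "and":
        return sum(counts)
    total = 1
    for c in counts:
        total *= c
    return total

def holds(f, v):
    if f[0] == "lit":
        name = f[1]
        return not v[name[1:]] if name.startswith("~") else v[name]
    results = [holds(sub, v) for sub in f[1]]
    return all(results) if f[0] == "and" else any(results)

def left(n):
    return [AND(lit(f"A{i}"), lit(f"B{i}")) for i in range(1, n + 1)]

def original(n):
    return OR(*left(n), AND(OR(lit("C1"), lit("L")),
                            OR(lit("C2"), AND(lit("~L"), lit("C3")))))

def dissolvent(n):
    return OR(*left(n), AND(OR(lit("C1"), lit("L")), lit("C2")),
              AND(lit("C1"), lit("~L"), lit("C3")))

def semi_resolvent(n):
    return OR(*left(n), AND(OR(lit("C1"), lit("L")),
                            OR(lit("C2"), AND(lit("~L"), lit("C1"), lit("C3")))))

def equivalent(f, g, atoms):
    return all(holds(f, dict(zip(atoms, vals))) == holds(g, dict(zip(atoms, vals)))
               for vals in product([False, True], repeat=len(atoms)))

n = 3
print(count_d_paths(original(n)), count_d_paths(dissolvent(n)),
      count_d_paths(semi_resolvent(n)))
```

For n = 3 the three counts are 24, 48, and 32, matching 3·2^n, 6·2^n, and 4·2^n; the `equivalent` helper can be used to confirm that all three formulas have the same truth table.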

7.2 Improved data structures

The trie seems the most effective data structure for implementing prime implicate/implicant computations. Although the trie has been shown to be very effective in these computations, it is not at all optimal in terms of memory requirements. A trie has at least as many nodes as the set of implicates it represents. Since the number of implicates can be extremely large, the trie is typically large. Structure sharing can be used to reduce the space requirements of a trie. This can be achieved by relaxing the single-parent requirement of a trie. We therefore have a DAG instead of a tree. We call this new data structure a d-trie (for DAG-trie). Figure 7.1 shows a trie representing the collection of implicates {{a, c}, {a, f}, {b, f, h}, {b, g}, {e, g}, {e, f, h}}. Here we assume that the elements of the domain are ordered and that a < b < c < ... < h. The d-trie for the above collection of sets is shown in Figure 7.2.

[Figure 7.1: A trie]

[Figure 7.2: A d-trie]

The size of tries and d-tries depends on the ordering of the domain. There are large sets which can be stored compactly (with logarithmic space requirements) in a d-trie, whereas the trie would have at least as many nodes as the cardinality of the set. For example, consider the following set Sn of implicates parameterized by n. We assume n is even. The set of atoms is {A1, ..., An}. Sn is the set of implicates {{x1, ..., xn} | xi = Ai or xi = Āi, and for j ≤ n/2, (xj = Aj) ⇔ (x_{j+n/2} = A_{j+n/2})}. The size of Sn is 2^{n/2}. For example, if n = 2, then Un = {A1, A2, Ā1, Ā2} and Sn would be {{A1, A2}, {Ā1, Ā2}}. Consider the ordering A1 < Ā1 < A_{n/2+1} < Ā_{n/2+1} < A2 < Ā2 < A_{n/2+2} < Ā_{n/2+2} < ... < An < Ān. The smallest d-trie for the above set is shown in Figure 7.3. The d-trie has (3/2)n nodes. However, under the ordering A1 < Ā1 < A2 < Ā2 < ... < A_{n/2} < Ā_{n/2} < An < Ān < A_{n−1} < Ā_{n−1} < ... < A_{n/2+1} < Ā_{n/2+1}, the smallest d-trie is shown in Figure 7.4. The d-trie has 3·2^{n/2} − 1 nodes.

[Figure 7.3: A small d-trie for Sn]

[Figure 7.4: A large d-trie for Sn]

Although some sets have small d-tries, there are some large sets for which any d-trie (under any ordering) will be large. We illustrate this with an example. Consider the following set of implicates, parameterized by n (assume that n is divisible by 16). Let {A1, ..., An} be the set of atoms and let the collection of implicates be {{x1, ..., xn} | xi = Ai or xi = Āi, and if the number of Aj's in {x1, ..., xn} is k (k > 0) then xk = Ak}. For any ordering of the set {A1, ..., An}, the d-trie for the above set will have Ω(2^{n/8}) nodes. For a proof of this see [48].

A trie storing a collection of implicates is also a d-trie (by definition). For any given ordering of the atom set, the size of the d-trie varies: with maximum structure sharing it has the smallest size; with minimum structure sharing it resembles a trie and has the largest size. Although d-tries in general have smaller space requirements than tries, it is not obvious that they can be used to obtain faster implementations of prime implicate/implicant algorithms. We believe that all common operations on a d-trie can be performed in time polynomial in the size of the d-trie. Partial results in this direction have been reported in [48]. Experimental evaluation of d-tries and their practical usefulness is currently being investigated.
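The space saving from structure sharing can be illustrated on the collection {{a, c}, {a, f}, {b, f, h}, {b, g}, {e, g}, {e, f, h}} used above. The sketch below is an illustration under stated assumptions, not the data structure of [48]: the trie is a nested dictionary, nodes are counted by their labels (the root and the nil markers are not counted), and the d-trie is obtained by merging nodes that carry the same label and the same subtree. It also relies on no set being a prefix of another, which holds for this collection. All function names are hypothetical.

```python
def build_trie(sets, order):
    """Nested-dict trie; each set is inserted following the given element order."""
    root = {}
    for s in sets:
        node = root
        for x in sorted(s, key=order.index):
            node = node.setdefault(x, {})
    return root

def trie_size(node):
    """Number of labelled nodes in the trie."""
    return sum(1 + trie_size(child) for child in node.values())

def dtrie_size(root):
    """Number of labelled nodes left after merging identical (label, subtree) pairs."""
    seen = set()

    def canon(label, node):
        key = (label, frozenset(canon(l, child) for l, child in node.items()))
        seen.add(key)
        return key

    for label, child in root.items():
        canon(label, child)
    return len(seen)

ORDER = list("abcdefgh")
IMPLICATES = [{"a", "c"}, {"a", "f"}, {"b", "f", "h"},
              {"b", "g"}, {"e", "g"}, {"e", "f", "h"}]

trie = build_trie(IMPLICATES, ORDER)
print(trie_size(trie), dtrie_size(trie))
```

With the ordering a < b < ... < h the trie has 11 labelled nodes and the shared structure has 8, because the subtrees below b and e coincide and can be stored once.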


Chapter 8 Summary

In this thesis we have shown that by using an NNF representation for propositional logic formulas, we can build reasoning systems which are more efficient on certain classes of formulas than the corresponding ones that require and maintain a clause form representation. We have demonstrated this on two related problems, namely computing prime implicants/implicates and computing minimal diagnoses.

In Chapter 3 we developed a method for computing prime implicates/implicants of NNF formulas using dissolution, an NNF-based inference rule, and the PI algorithm. PI itself is sufficient to compute prime implicants/implicates; however, we have shown that using dissolution in conjunction with PI often results in much better performance than when PI is used twice. The main reason seems to be that dissolution often produces concise NNF as output. In those cases where dissolution does not help, its use usually does not result in a significant penalty. Our empirical results are augmented by the discovery of the class of formulas described in Section 3.4, for which intermediate normalization is shown to result in an exponential penalty. Furthermore, our investigations into the use of efficient clause form transformations indicate that they will not be of much help in these computations.

In Chapter 4 we introduced anti-links and defined useful equivalence-preserving operations on them. These operations can be employed so as to strictly reduce the number of d-paths in an NNF formula. Anti-link operations remove subsumed paths without any direct checks for subsumption. This is significant for prime implicate computations, since such computations tend to be dominated by subsumption checks. In addition, we are able to improve performance greatly on the inherently exponential examples of [43]. We also gave experimental evidence demonstrating the superiority of our prime implicant/implicate generating system over the system of de Kleer [9] on many of the common benchmark problems, even without anti-link operations.

In Chapter 5 we developed a new algorithm for computing diagnoses from first principles. Our algorithm does not generate the set of minimal conflicts, which can be potentially large even if the number of minimal diagnoses is small. This algorithm also uses dissolution and PI. We have also used anti-links to speed up this computation process. The implementation of our proposed techniques generally performs as well as any of which we are aware, when finding all diagnoses automatically. Our method appears to be much faster when finding all single faults.

In Chapter 6 we provided a bridge between earlier work on prime implicant methods for MVLs based on truth table analysis and our formula-based approach using the logic of signed formulas. We have shown that the use of signed formulas allows us to compute prime implicates and implicants of various MVL formulas independently of the underlying logic. For the class of regular logics, the process of converting MVL formulas into signed formulas is linear, and we have developed an efficient method for computing prime implicants and implicates.


Bibliography

[1] C. M. Allen. The Allen-Givone implementation oriented algebra. In Computer Science and multiple-valued logic, pages 268–288. North Holland, 1984.
[2] Wolfgang Bibel. Automated theorem proving. Artificial intelligence. Vieweg, Braunschweig, Germany, 1987.
[3] Y. Breitbart, Harry B. Hunt III, and Daniel Rosenkrantz. On the size of binary decision diagrams representing boolean functions. Theoretical Computer Science, 145:45–59, 1995.
[4] Randal E. Bryant. Graph-based algorithms for boolean function manipulation. IEEE trans. on computers, c-35(8):677–691, August 1986.
[5] Luca Console, Gerhard Friedrich, and Daniele Theseider Dupré. A new algorithm for incremental prime implicate generation. In Proceedings, IJCAI-93, Chambéry, France, pages 1494–1499, August 1993.
[6] Olivier Coudert and Jean-Christophe Madre. Implicit and incremental computation of primes and essential implicant primes of boolean functions. In Proceedings of the 29th ACM/IEEE Design Automation Conference, pages 36–39, 1992.
[7] Johan de Kleer. An assumption based truth maintenance system. Artificial intelligence, 28:127–162, 1986.
[8] Johan de Kleer. Focusing on probable diagnosis. In Proceedings, AAAI-91, Anaheim, CA, pages 842–848, 1991.

[9] Johan de Kleer. An improved incremental algorithm for computing prime implicants. In Proceedings, AAAI-92, San Jose, CA, pages 780–785, 1992.
[10] Johan de Kleer, Alan K. Mackworth, and Raymond Reiter. Characterizing diagnoses and systems. Artificial Intelligence, 56:197–222, 1992.
[11] Johan de Kleer and Brian C. Williams. Diagnosing multiple faults. Artificial Intelligence, 32:97–130, 1987.
[12] Alvaro del Val. Tractable databases: How to make propositional unit resolution complete through compilation. In Proceedings of the fourth international conference on principles of knowledge representation and reasoning, pages 551–561, 1994.
[13] Patrick Doherty. Nm3 - a three-valued non-monotonic formalism (preliminary report). In Z. Ras, M. Zemankova, and M. Emrich, editors, Proceedings, 5th International Symposium on Methodologies for Intelligent systems, Knoxville, TN, pages 498–505. North-Holland, 1990.
[14] W. Dowling and Jean Gallier. Linear time algorithms for testing the satisfiability of propositional horn formulae. Journal of Logic Programming, 3:267–284, 1984.
[15] Norbert Eisinger. What you always wanted to know about clause graph resolution. In Proceedings of the Eighth International Conference on Automated Deduction, Oxford, England, (CADE-8), LNCS 230, pages 316–336. Springer Verlag, 1986.
[16] Melvin Fitting. First order logic and automated theorem proving. Texts and monographs in computer science. Springer Verlag, New York, 1990.
[17] Michael R. Garey and David S. Johnson. Computers and Intractability. W.H. Freeman Company, San Francisco, 1979.
[18] H. Geffner and J. Pearl. An improved constraint propagation algorithm for diagnosis. In Proceedings of the 10th international conference on artificial intelligence, Milan, Italy, pages 1105–1111. Morgan Kaufmann, 1987.

[19] Michael R. Genesereth. The use of design descriptions in automated diagnosis. Artificial Intelligence, 24:411–436, 1984.

[20] Matthew Ginsberg. Many valued logic. In Proceedings of AAAI-86, pages 243–247, 1986.

[21] Matthew Ginsberg. A circumscriptive theorem prover. Artificial Intelligence, 39:209–230, 1989.

[22] Roberto Grossi. On finding common subtrees. Theoretical Computer Science, 108(2):345–356, 1993.

[23] Reiner Hähnle. Uniform notation tableau rules for multiple-valued logics. In Proceedings, The International Symposium on Multiple-Valued Logic, Victoria, BC, pages 238–245, May 1991.

[24] Reiner Hähnle. Tableaux-Based Theorem Proving in Multiple-Valued Logics. PhD thesis, University of Karlsruhe, 1992.

[25] Reiner Hähnle and Werner Kernig. Verification of switch-level designs with many-valued logic. In Proceedings, 4th International Conference on Logic Programming and Automated Reasoning (LPAR), St. Petersburg, Russia, LNCS 698, pages 158–169. Springer, July 1993.

[26] W. R. Smith III. Minimization of multivalued functions. In Computer Science and Multiple-Valued Logic, pages 227–267. North-Holland, 1984.

[27] Peter Jackson. Computing prime implicants incrementally. In Proceedings, 11th International Conference on Automated Deduction (CADE), Saratoga Springs, NY, LNCS 607, pages 253–267, June 1992.

[28] Peter Jackson and John Pais. Computing prime implicants. In Proceedings, 10th International Conference on Automated Deduction (CADE), Kaiserslautern, Germany, LNCS 449, pages 543–557. Springer, July 1990.

[29] Henry Kautz and Bart Selman. An empirical evaluation of knowledge compilation by theory approximation. In Proceedings of the Twelfth National Conference on Artificial Intelligence, Seattle, WA, pages 155–161, 1994.

[30] Alex Kean and George Tsiknis. An incremental method for generating prime implicants/implicates. Journal of Symbolic Computation, 9:185–206, 1990.

[31] Alex Kean and George Tsiknis. Assumption based reasoning and clause management systems. Computational Intelligence, 8(1):1–24, November 1992.

[32] Donald E. Knuth. The Art of Computer Programming, volume 1, Fundamental Algorithms. Addison-Wesley, Reading, MA, 1990.

[33] Reinhold Letz, Johann Schumann, Stephan Bayerl, and Wolfgang Bibel. SETHEO: A high-performance theorem prover. Journal of Automated Reasoning, 8(2):183–212, 1992.

[34] I. Mozetic and C. Holzbaur. Controlling the complexity in model-based diagnosis. In Proceedings, The European Conference on Artificial Intelligence, pages 729–733. John Wiley & Sons, 1992.

[35] Neil V. Murray, Anavai Ramesh, and Erik Rosenthal. The semi-resolution inference rule and prime implicate computations. In Proceedings of the Fourth Golden West Conference on Intelligent Systems, pages 153–158. International Society for Computers and Their Applications (ISCA), 1995.

[36] Neil V. Murray and Erik Rosenthal. Reexamining intractability of analytic tableaux. In Proceedings, International Symposium on Symbolic and Algebraic Computation, Tokyo, Japan, pages 52–59, August 1990.

[37] Neil V. Murray and Erik Rosenthal. Inference with path resolution and semantic graphs. Journal of the ACM, 34(2):225–254, April 1987.

[38] Neil V. Murray and Erik Rosenthal. Path dissolution: A strongly complete rule of inference. In Proceedings, 6th National Conference on Artificial Intelligence, Seattle, WA, pages 161–166, July 1987.

[39] Neil V. Murray and Erik Rosenthal. Resolution and path dissolution in multiple-valued logics. In Proceedings, The International Symposium on Methodologies for Intelligent Systems, Charlotte, NC, volume 542 of LNAI, pages 570–579. Springer, October 1991.

[40] Neil V. Murray and Erik Rosenthal. Dissolution: Making paths vanish. Journal of the ACM, 40(3):504–535, July 1993.

[41] Neil V. Murray and Erik Rosenthal. Signed formulas: A liftable meta-logic for multiple-valued logics. In Proceedings, The International Symposium on Methodologies for Intelligent Systems, Trondheim, Norway, volume 698 of LNAI, pages 275–284, June 1993.

[42] R. J. Nelson. Simplest normal truth functions. Journal of Symbolic Logic, 20:105–108, 1955.

[43] Teow-Hin Ngair. A new algorithm for incremental prime implicate generation. In Proceedings, IJCAI-93, Chambéry, France, August 1993.

[44] David Plaisted and S. Greenbaum. A structure preserving clause form translation. Journal of Symbolic Computation, 2:293–304, 1986.

[45] Emil Leon Post. Introduction to a general theory of elementary propositions. In Jean van Heijenoort, editor, From Frege to Gödel. A source book in mathematical logic, pages 264–283. Harvard University Press, Cambridge, MA, 1971.

[46] Teodor C. Przymusinski. An algorithm to compute circumscription. Artificial Intelligence, 30:49–73, 1989.

[47] W. V. Quine. On cores and prime implicants of truth functions. American Mathematical Monthly, 66:755–760, 1959.

[48] Anavai Ramesh and Neil V. Murray. D-trie: A new data structure for a collection of minimal sets. TR 95-4, Dept. of Computer Science, SUNY at Albany, Albany, NY, September 1995.

[49] J. A. Reggia, D. S. Nau, and P. Y. Wang. Diagnostic expert systems based on a set covering model. International Journal of Man-Machine Studies, 32:57–96, 1987.

[50] Raymond Reiter. A theory of diagnosis from first principles. Artificial Intelligence, 32:57–95, 1987.

[51] Raymond Reiter and J. de Kleer. Foundations of assumption-based truth maintenance systems: preliminary report. In Proceedings, 6th National Conference on Artificial Intelligence, Seattle, WA, pages 183–188, July 1987.

[52] J. A. Robinson. A machine oriented logic based on the resolution principle. Journal of the ACM, 12:23–41, 1965.

[53] Ron Rymon. On kernel rules and prime implicants. In Proceedings of the Twelfth National Conference on Artificial Intelligence, Seattle, WA, pages 181–186, 1994.

[54] Ron Rymon. An SE tree based prime implicant generating algorithm. Annals of Mathematics and AI, 11, 1994.

[55] Uwe Schöning. Logic for computer scientists. Progress in computer science and applied logic. Birkhäuser, Berlin, 1989.

[56] Bart Selman and Henry Kautz. Knowledge compilation using Horn clauses. In Proceedings of the Ninth National Conference on Artificial Intelligence, Anaheim, CA, pages 904–909, 1991.

[57] James R. Slagle, Chin-Liang Chang, and Richard C. T. Lee. A new algorithm for generating prime implicants. IEEE Transactions on Computers, C-19(4):304–310, 1970.

[58] T. Strzemecki. Polynomial-time algorithms for generation of prime implicants. Journal of Complexity, 8:37–63, 1992.


[59] S. Y. H. Su and P. T. Cheung. Computer simplification of multi-valued functions. In Computer Science and Multiple-Valued Logic, pages 195–226. North-Holland, 1984.

[60] P. Tison. Generalizations of consensus theory and application to the minimization of Boolean functions. IEEE Transactions on Computers, C-16:446–456, 1967.
