Reduced Implicant Tries ∗

Technical Report SUNYA-CS-07-01 November, 2007

Neil V. Murray
Department of Computer Science
University at Albany
Albany, NY 12222
email: nvm@cs.albany.edu

Erik Rosenthal
Department of Mathematics
University of New Haven
North Haven, CT 06516

Abstract. The goal of knowledge compilation is to enable fast queries. Such queries are often in the form of conjunctive normal form (CNF) clauses, and the query amounts to the question of whether the clause is implied by the knowledge base. Here, we consider the dual type of query, which is a DNF clause; the query then amounts to the question of whether the knowledge base is implied by the clause. Prior approaches had the goal of small (i.e., polynomial in the size of the initial knowledge base) compiled knowledge bases. Typically, query-response time is linear, so that the efficiency of querying the compiled knowledge base depends on its size. In [37], a target for knowledge compilation called the ri-trie was introduced; it has the property that even when it is large, it admits fast queries in the form of CNF clauses. Specifically, a query can be processed in time linear in the size of the query regardless of the size of the compiled knowledge base. In this report, these techniques are extended to allow the dual type of query expressed as a DNF clause.



This research was supported in part by the National Science Foundation under grants IIS-0712849 and IIS-0712752.

1 Introduction

The last decade has seen a virtual explosion of applications of propositional logic. One is knowledge representation, and one approach to it is knowledge compilation. Knowledge bases can be represented as propositional theories, often as sets of clauses, and the propositional theory can then be compiled; i.e., preprocessed to a form that admits fast response to queries. While knowledge compilation is intractable, it is done once, in an off-line phase, with the goal of making frequent on-line queries efficient. Heretofore, that goal has not been achieved for arbitrary propositional theories.

A typical query of a propositional theory has the form, is a clause logically entailed by the theory? This question is equivalent to asking, is the conjunction of the theory and the negation of the clause unsatisfiable? Propositional logic is of course intractable (unless NP = P), so the primary goal of most research is to find relatively efficient deduction techniques. A number of languages — for example, Horn sets, ordered binary decision diagrams, sets of prime implicates/implicants, decomposable negation normal form, factored negation normal form, and pairwise-linked formulas — have been proposed as targets for knowledge compilation. (See, for example, [2, 3, 8, 10, 18, 20, 21, 29, 47, 51].)

Knowledge compilation was introduced by Kautz and Selman [24]. They were aware of one issue that is not discussed by all authors: The ability to answer queries in time polynomial (indeed, often linear) in the size of the compiled theory is not very fast if the compiled theory is exponential in the size of the underlying propositional theory. Most investigators who have considered this issue focused on minimizing the size of the compiled theory, possibly by restricting or approximating the original theory. Another approach is considered in [37]: admitting large compiled theories, stored off-line, on which queries can be answered in time linear in the size of the query. There, a data structure with this property, called a reduced implicate trie or, more simply, an ri-trie, was introduced.

In this report, the dual (and equally intractable) type of query is considered: Does a DNF clause logically entail the theory? The reduced implicant trie is introduced in Section 3.2. These tries can be thought of as compact implicant tries, which are introduced in Section 3.1. Note that the target languages studied by the authors in [20, 37] are related but nevertheless distinct from this work; they enable response times linear only in the size of the compiled theory, which (unfortunately) can be exponentially large. The Tric operator and RITc operator, which are the building blocks of reduced implicant tries, are introduced and the appropriate theorems are proved in Section 4. A direct implementation of the RITc operator would appear to have a significant inefficiency. However, the INT operator explained in Section 6 avoids this inefficiency.

The term off-line is used in two ways: first, for off-line memory, such as hard drives, as opposed to on-line storage, such as RAM, and secondly, for “batch preparation” of a knowledge base for on-line usage.

2 Preliminaries

For the sake of completeness, define an atom to be a propositional variable, a literal to be an atom or the negation of an atom, and a clause to be a disjunction of literals. Clauses are often referred to as sets of literals in the context of CNF formulas. For disjunctive normal form (DNF) formulas, conjunctions of literals are also referred to as clauses; if any confusion is possible, we will refer to them as DNF clauses. Consequences expressed as minimal clauses that are implied by a formula are its prime implicates; minimal conjunctions of literals that imply a formula are its prime implicants. Implicates are useful in certain approaches to non-monotonic reasoning [27, 44, 50], where all consequences of a formula — for example, the support set for a proposed common-sense conclusion — are required. The implicants are useful in situations where satisfying models are desired, as in error analysis during hardware verification. Many algorithms have been proposed to compute the prime implicates (or implicants) of a propositional boolean formula [5, 14, 22, 23, 26, 43, 46, 54, 55].

An implicate of a logical formula is a clause that is entailed by the formula; i.e., a clause that contains a prime implicate. Thus, if F is a formula and C is a clause, then C is an implicate of F if (and only if) C is satisfied by every interpretation that satisfies F. Still another way of looking at implicates is to note that asking whether a given clause is entailed by a formula is equivalent to asking whether the clause is an implicate of the formula. Throughout the paper, this question is what is meant by CNF query.

An implicant of a logical formula is a DNF clause that entails the formula; i.e., a clause that contains a prime implicant. Thus, if F is a formula and D is a DNF clause, then D is an implicant of F if (and only if) F is satisfied by every interpretation that satisfies D. Still another way of looking at implicants is to note that asking whether a given formula is entailed by a DNF clause is equivalent to asking whether the clause is an implicant of the formula. Throughout the paper, this question is what is meant by DNF query.
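To make the two dual queries concrete, the following brute-force sketch (not taken from this report, and exponential in the number of atoms) checks the definitions directly. Formulas are represented as DNF clause sets and literals as strings such as 'p' and '-p'; all names are illustrative.

from itertools import product

def literal_true(lit, assignment):
    # '-x' denotes the negation of atom 'x'
    return not assignment[lit[1:]] if lit.startswith('-') else assignment[lit]

def eval_dnf(clauses, assignment):
    # a DNF formula is true iff some conjunct has all of its literals true
    return any(all(literal_true(l, assignment) for l in c) for c in clauses)

def atoms(clauses):
    return {l.lstrip('-') for c in clauses for l in c}

def is_implicate(F, C):
    # CNF query: the disjunction C is entailed by the DNF formula F
    names = sorted(atoms(F) | atoms([C]))
    for bits in product([False, True], repeat=len(names)):
        a = dict(zip(names, bits))
        if eval_dnf(F, a) and not any(literal_true(l, a) for l in C):
            return False
    return True

def is_implicant(F, D):
    # DNF query: the conjunction D entails the DNF formula F
    names = sorted(atoms(F) | atoms([D]))
    for bits in product([False, True], repeat=len(names)):
        a = dict(zip(names, bits))
        if all(literal_true(l, a) for l in D) and not eval_dnf(F, a):
            return False
    return True

# Example, using the knowledge base of Section 3.2:
# F = [{'p', 'q', '-s'}, {'p', 'q', 'r'}, {'p', 'r', 's'}, {'p', 'q'}]
# is_implicant(F, {'p', 'q', 'r'})  ->  True   (it contains the implicant {p, q})
# is_implicate(F, {'p'})            ->  True   (every conjunct of F contains p)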

3 A Data Structure That Enables Fast DNF Query Processing

The goal of knowledge compilation is to enable fast queries. Prior approaches had the goal of a small (i.e., polynomial in the size of the initial knowledge base) compiled knowledge base. Typically, query-response time is linear, so that the efficiency of querying the compiled knowledge base depends on its size. The approach considered in this paper is to admit target languages that may be large as long as they enable fast queries. The idea is for the query to be processed in time linear in the size of the query. Thus, if the compiled knowledge base is exponentially larger than the initial knowledge base, the query must be processed in time logarithmic in the size of the compiled knowledge base. One data structure that admits such fast queries is called a reduced implicant trie.

3.1 Implicant Tries

The trie is a well-known data structure introduced by Morrison in 1968 [30]; it is a tree in which each branch represents the sequence of symbols labeling the nodes on that branch, in descending order. (Many variations have been proposed in which arcs rather than nodes are labeled, and the labels are sometimes strings rather than single symbols.) A prefix of such a sequence may be represented along the same branch by defining a special end symbol and assigning an extra child labeled by this symbol to the node corresponding to the last symbol of the prefix. For convenience, it is assumed here that the node itself is simply marked with the end symbol, and leaf nodes are also so marked.

One common application for tries is a dictionary. The advantage is that each word in the dictionary is present precisely as a (partial) branch in the trie. Checking a string for membership in the dictionary merely requires tracing a corresponding branch in the trie. This will either fail or be done in time linear in the size of the string.

Tries have also been used to represent logical formulas, including sets of prime implicates [50]. This works for prime implicants as well: The nodes along each branch represent the literals of a DNF clause, and the disjunction of all such clauses is a DNF equivalent of the formula represented by the trie. But observe that this DNF formula introduces significant redundancy. In fact, the trie can be interpreted directly as an NNF formula, recursively defined as follows: A trie consisting of a single node represents the constant labeling that node. Otherwise, the trie represents the conjunction of the label of the root with the disjunction of the formulas represented by the tries rooted at its children.

When clause sets are stored as tries, space advantages can be gained by ordering the literals and treating the clauses as ordered sets. An n-literal clause will be represented by one of the n! possible sequences. If the clause set is a set of implicants, then one possibility is to store only prime implicants — clauses that are not subsumed by others — because all subsumed clauses are also implicants and thus implicitly in the set. The space savings can be considerable, but there will in general be exponentially many prime implicants. Furthermore, to determine whether clause D is in the set, the trie must be examined for any subset of D; the literal ordering helps, but the cost is still proportional to the size of the trie.

Suppose instead that all implicants are stored; the resulting trie is called an implicant trie. To define it formally, let p1, p2, ..., pn be the variables that appear in the input knowledge base F, and let qi be the literal pi or ¬pi. Literals are ordered as follows: qi ≺ qj iff i < j. (This can be extended to a total order by defining ¬pi ≺ pi, 1 ≤ i ≤ n. But neither queries nor branches in the trie will contain such complementary pairs.) The implicant trie for F is a tree defined as follows: If F is a tautology (contradiction), the tree consists only of a root labeled 1 (0). Otherwise, it is a tree whose root is labeled 1 and has, for any implicant D = {qi1, qi2, . . . , qim}, a child labeled qi1, which is the root of a subtree containing a branch with labels corresponding to D − {qi1}. The clause D can then be checked for membership in time linear in the size of D, simply by traversing the corresponding branch. Note that the node on this branch labeled qim will be marked with the end symbol. Furthermore, given any node labeled by qj and marked with the end symbol, if j < n, it will have as children nodes labeled ¬qk and qk, j < k ≤ n, and these are all marked with the end symbol. This is an immediate consequence of the fact that a node marked with the end symbol represents an implicant which is a prefix (in particular, subset) of every clause obtainable by extending this implicant in all possible ways with the literals greater than qj in the ordering.
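The following sketch, not part of the report, illustrates the trie representation just described: implicants are inserted as ordered sequences of literals, and the last literal of each inserted clause is marked with the end symbol. The class and function names are assumptions made for illustration.

class TrieNode:
    def __init__(self, label):
        self.label = label        # literal labeling this node (1 at the root)
        self.end = False          # end symbol: the branch ending here is an implicant
        self.children = {}        # maps a literal to the child labeled by it

def insert(root, clause):
    # clause: a list of literals already sorted by the fixed variable ordering
    node = root
    for lit in clause:
        node = node.children.setdefault(lit, TrieNode(lit))
    node.end = True               # mark the last literal with the end symbol

# root = TrieNode(1); insert(root, ['p', 'q', '-s']) builds the branch p, q, ¬s.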

3.2 Reduced Implicant Tries

Recall that for any logical formulas F and α and subformula G of F, F[α/G] denotes the formula produced by substituting α for every occurrence of G in F. If α is a truth functional constant 0 or 1 (false or true), and if p is a negative literal, we will slightly abuse this notation by interpreting the substitution [0/p] to mean that 1 is substituted for the atom that p negates. The following simplification rules are useful (even if trivial).

SR1.   F −→ F[G/G ∨ 0]        F −→ F[G/G ∧ 1]
SR2.   F −→ F[0/G ∧ 0]        F −→ F[1/G ∨ 1]
SR3.   F −→ F[0/p ∧ ¬p]       F −→ F[1/p ∨ ¬p]

If C = {qi1, qi2, . . . , qim} is an implicant of F, it is easy to see that the node labeled qim will become a leaf if these rules are applied repeatedly to the subtree of the implicant trie of F rooted at qim. Moreover, the product of applying these rules to the entire implicant trie until no applications of them remain will be a trie in which no internal nodes are marked with the end symbol and all leaf nodes are, rendering that symbol merely a convenient indicator for leaves. The result of this process is called a reduced implicant trie.

Consider an example. Suppose that the knowledge base F contains the variables p, q, r, s, in that order, and suppose that F consists of the following DNF clauses: {p, q, ¬s}, {p, q, r}, {p, r, s}, and {p, q}. Initialize the reduced implicant trie as a single node labeled 1 and then build it one clause at a time. After the first is added to the tree, its two supersets must also be added. The resulting reduced implicant trie is on the left in the diagram below. Adding the second clause implies that the node labeled by r is also a leaf. Then all extensions of this branch entail {p, q, r} (and thus F), and the corresponding child is dropped, resulting in the reduced implicant trie in the center.

(Diagram omitted: the reduced implicant tries after adding the first clause (left), after the first two clauses (center), and after all four clauses (right).)

Adding the last two clauses produces the reduced implicant trie on the right.


The complete implicant trie in the example is shown below. It has eight branches, but there are eleven end markers representing its eleven implicants (nine in the subtree rooted at q, and one each at the other two occurrences of s).

(Diagram omitted: the complete implicant trie for the example.)

Observations (reduced implicant tries).

1. The NNF equivalent of the reduced implicant trie is p ∧ ((¬q ∧ r ∧ s) ∨ q ∨ (r ∧ s)).

2. In general, the length of a branch is at most n, the number of variables that appear in the original logical formula F.

3. In order for the determination of entailment of a given clause to be linear in the size of the clause, enough branches must be included in the tree so that the test for entailment can be accomplished by traversing a single branch.

4. When a DNF clause is added to the trie, the literal of highest index becomes the leaf, and that branch need never be extended; i.e., if a clause is being tested for entailment, and if a prefix of the clause is a branch in the trie, then the clause is entailed (see the sketch following these observations).

5. The reduced implicant trie will be stored off-line. Even if it is very large, each branch is small — no longer than the number of variables — and can be traversed very quickly, even with relatively slow off-line storage.

6. If the query requires sorting, the search time will be n log n, where n is the size of the query. But the sorting can be done in main memory, and the search time in the (off-line) reduced implicant trie is still linear in the size of the query.
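As a sketch of the query procedure suggested by observations 3, 4, and 6 (an illustration written for this presentation, not taken verbatim from the report), a DNF query is first sorted by the fixed variable ordering and then a single branch is traced; the query is entailed exactly when some prefix of it ends at an end-marked (leaf) node. The TrieNode class mirrors the sketch in Section 3.1; all names are illustrative.

class TrieNode:
    def __init__(self, label):
        self.label, self.end, self.children = label, False, {}

def dnf_query(root, query):
    # query: literals sorted by the fixed variable ordering; time is linear in len(query)
    node = root
    for lit in query:
        if node.end:                 # a proper prefix of the query is a branch
            return True
        node = node.children.get(lit)
        if node is None:             # the branch cannot be extended: not entailed
            return False
    return node.end                  # the whole query is a branch iff its last node is a leaf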

4 A Procedure For Computing Reduced Implicant Tries

In this section a procedure for computing a reduced implicant trie as a logical formula is presented. The Tric operator is described in Section 4.1; it can be thought of as a single step in the process that creates a reduced implicant trie. The RITc operator, described in Section 4.2, produces the reduced implicant trie with recursive applications of the Tric operator. It will be convenient to assume that any constant that arises in a logical formula is simplified away with repeated applications of rules SR1 and SR2 (unless the formula is constant).

4.1 The Tric Operator

The Tric operator restructures a formula by substituting truth constants for a variable. Lemmas 1 and 2 and the corollary that follows provide insight into the implicants of the components of Tric(F, p). Let Impc(F) denote the set of all implicants of F. The tric-expansion (the tri is as in three, not as in trie; the pun is probably intended) of any formula F with respect to any atom p is defined to be

Tric(F, p) = (¬p ∧ F[0/p]) ∨ (p ∧ F[1/p]) ∨ (F[0/p] ∧ F[1/p]).

Lemma 1. Suppose that the clause D is an implicant of the logical formula F, and that the variable p occurs in F but not in D. Then D ∈ Impc(F[0/p]) ∩ Impc(F[1/p]).

Proof 1. First note that ¬F → ¬D. Let I be an interpretation that falsifies F[0/p]; we must show that I falsifies D. Extend I to Ĩ by setting Ĩ(p) = 0. (It is possible that there are variables other than p that occur in F but not in F[0/p]. But such variables must "simplify away" when 0 is substituted for p, so I can be extended to an interpretation of F with any truth assignment to such variables.) Clearly, Ĩ falsifies F, so Ĩ falsifies D. But then, since p does not occur in D, I falsifies D. The proof for F[1/p] is identical, except that Ĩ(p) must be set to 1. □

Lemma 2. Let F and G be logical formulas. Then Impc(F) ∩ Impc(G) = Impc(F ∧ G).

Proof 2. If D is an implicant of both F and G, then any interpretation satisfying D satisfies both formulas and thus F ∧ G. Suppose now that D ∈ Impc(F ∧ G). We must show that D ∈ Impc(F) and that D ∈ Impc(G). To see that D ∈ Impc(F), let I be any satisfying interpretation of D. Then I satisfies (F ∧ G), hence F, and thus D ∈ Impc(F). The proof that D ∈ Impc(G) is entirely similar. □

Corollary. Let C be a clause not containing p or ¬p, and let F be any logical formula. Then C is an implicant of F iff C is an implicant of F[0/p] ∧ F[1/p]. □

4.2 The RITc Operator

The reduced implicant trie of a formula can be obtained by applying the Tric operator successively on the variables. Let F be a logical formula, and let the variables of F be V = {p1, p2, ..., pn}. Then the RITc operator is defined by

RITc(F, V) =

    F                                                        if V = ∅

    (¬pi ∧ RITc(F[0/pi], V − {pi})) ∨
    (pi ∧ RITc(F[1/pi], V − {pi})) ∨
    RITc((F[0/pi] ∧ F[1/pi]), V − {pi})                      if pi ∈ V

where pi is the variable of lowest index in V. Implicit in this definition is the use of simplification rules SR1-3.

Theorem 2 below essentially proves that the RITc operator produces reduced implicant tries; reconsidering the above example illustrates this fact: F = {p, q, ¬s} ∨ {p, q, r} ∨ {p, r, s} ∨ {p, q}, so that V = {p, q, r, s}. Let F[1/p] and F[0/p] be denoted by F1 and F0, respectively. Since p occurs in every clause, F1 amounts to deleting p from each clause, and F0 = 0. Thus,

RITc(F, V) = (¬p ∧ 0) ∨ (p ∧ RITc(F1, {q, r, s})) ∨ RITc((0 ∧ F1), {q, r, s}) = (p ∧ RITc(F1, {q, r, s})),

where F1 = {q, ¬s} ∨ {q, r} ∨ {r, s} ∨ {q}. Let F1[1/q] and F1[0/q] be denoted by F11 and F10, respectively. Observe that F11 = 1, and q ∧ 1 = q. Thus,

RITc(F1, {q, r, s}) = (¬q ∧ RITc(F10, {r, s})) ∨ q ∨ RITc((F10 ∧ 1), {r, s}),

where F10 = F1[0/q] = 0 ∨ 0 ∨ (r ∧ s) ∨ 0 = (r ∧ s). Observe now that F10[1/r] = s and F10[0/r] = 0. Thus,

RITc(F10, {r, s}) = (¬r ∧ 0) ∨ (r ∧ RITc(s, {s})) ∨ RITc((0 ∧ s), {s}).

Finally, since RITc(s, {s}) = s, substituting back produces

RITc(F, V) = p ∧ ((¬q ∧ r ∧ s) ∨ q ∨ (r ∧ s)),

which is exactly the formula obtained originally from the reduced implicant trie.

If a logical formula contains only one variable p, then it must be logically equivalent to one of the following four formulas: 1, 0, p, ¬p. The next lemma, which is trivial to prove from the definition of the RITc operator, says that in that case, RITc(F, {p}) is precisely the simplified logical equivalent.

Lemma 3. Suppose that the logical formula F contains only one variable p. Then RITc(F, {p}) is logically equivalent to F and is one of the formulas 1, 0, p, ¬p. □

For the remainder of the paper, assume the following notation with respect to a logical formula F: Let V = {p1, p2, ..., pn} be the set of variables of F, and let Vi = {pi+1, pi+2, ..., pn}. Thus, for example, V0 = V, and V1 = {p2, p3, ..., pn}. Let Ft = F[t/pi], t = 1, 0, where pi is the variable of lowest index in F.

Theorem 1. If F is any logical formula with variable set V, then RITc(F, V) is logically equivalent to F, and each branch of RITc(F, V) is an implicant of F.

Proof 3. We first prove logical equivalence. Proceed by induction on the number n of variables in F. The last lemma takes care of the base case n = 1, so assume the theorem holds for all formulas with at most n variables, and suppose that F has n + 1 variables. Then we must show that

F ≡ RITc(F, V) = (¬p1 ∧ RITc(F0, V1)) ∨ (p1 ∧ RITc(F1, V1)) ∨ RITc((F0 ∧ F1), V1).

By the induction hypothesis, F1 ≡ RITc(F1, V1), F0 ≡ RITc(F0, V1), and (F1 ∧ F0) ≡ RITc((F1 ∧ F0), V1). Let I be any interpretation that falsifies F, and suppose first that I(p1) = 1. Then I falsifies ¬p1, F1, and (F1 ∧ F0), so I falsifies each of the three disjuncts of RITc(F, V); i.e., I falsifies RITc(F, V). The case when I(p1) = 0 and the proof that any falsifying interpretation of RITc(F, V) falsifies F are similar. (Any interpretation satisfying F clearly satisfies either ¬p1 and the first disjunct or p1 and the second disjunct.)

Every branch is an implicant of F since, by the distributive laws, RITc(F, V) is logically equivalent to the disjunction of its branches. □

Lemma 4. Let C be an implicant of F containing the literal p. Then C − {p} ∈ Impc(F[1/p]).

Proof 4. Let I be an interpretation that falsifies F[1/p]. Extend I by defining I(p) = 1. Then I falsifies F, so I falsifies C. Since I assigns 1 to p, I must falsify a literal in C other than p; i.e., I falsifies C − {p}. □

The theorem below says, in essence, that reduced implicant tries have the desired property that determining whether a clause is an implicant can be done by traversing a single branch. If {q1, q2, ..., qk} is the clause, it will be an implicant iff for some i ≤ k, there is a branch labeled q1, q2, ..., qi. The clause {q1, q2, ..., qi} subsumes {q1, q2, ..., qk}, but it is not an arbitrary subsuming clause. To account for this relationship, a prefix of a clause {q1, q2, ..., qk} is a clause of the form {q1, q2, ..., qi}, where 0 ≤ i ≤ k. Implicit in this definition is a fixed ordering of the variables; also, if i = 0, then the prefix is the empty (DNF) clause.

Theorem 2. Let F be a logical formula with variable set V, and let C be an implicant of F. Then there is a unique prefix of C that is a branch of RITc(F, V).

Proof 5. Let V = {p1, p2, ..., pn}, and proceed by induction on n. Lemma 3 takes care of the base case n = 1. Assume now that the theorem holds for all formulas with at most n variables, and suppose that F has n + 1. Let C be an implicant of F, say C = {qi1, qi2, ..., qik}, where qij is either pij or ¬pij, and i1 < i2 < ... < ik. We must show that a prefix of C is a branch in RITc(F, V), which is the formula

RITc(F, V) = (¬p1 ∧ RITc(F0, V1)) ∨ (p1 ∧ RITc(F1, V1)) ∨ RITc((F0 ∧ F1), V1).

Observe that the induction hypothesis applies to the third disjunct. Thus, if i1 > 1, there is nothing to prove, so suppose that i1 = 1. Then q1 is either p1 or ¬p1. Consider the case q1 = p1; the proof when q1 = ¬p1 is entirely similar. By Lemma 4, C − {p1} is an implicant of F1, and by the induction hypothesis, there is a unique prefix B of C − {p1} that is a branch of RITc(F1, V1). But then A = {p1} ∪ B is a prefix of C that is a branch of RITc(F, V).

To complete the proof, we must show that A is the only such prefix of C. Suppose to the contrary that D is another prefix of C that is a branch of RITc(F, V). Then either D is a prefix of A or A is a prefix of D; say that D is a prefix of A. Let D = {p1} ∪ E. Then E is a prefix of B in RITc(F1, V1), which in turn means that E is a prefix of C − {p1}. But we know from the inductive hypothesis that C − {p1} has a unique prefix in RITc(F1, V1), so E = B, so D = A. If A is a prefix of D, then it is immediate that E is a prefix of C − {p1} in RITc(F1, V1), and, as before, E = B, and D = A. □

The corollaries below are immediate because of the uniqueness of prefixes of implicants in RITc(F, V).

Corollary 1. Every prime implicant of F is a branch in RITc(F, V). □

Corollary 2. Every subsuming implicant (including any prime implicant) of a branch in RITc(F, V) contains the literal labeling the leaf of that branch. □
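The recursion of the RITc operator can be sketched directly on formulas kept as DNF clause sets. The code below is an illustration written for this presentation, not the authors' implementation; clause sets are Python sets of frozensets of literal strings, the simplifications SR1-SR3 are applied inside substitute and conjoin, and the result is returned as a nested and/or structure rather than as a trie.

def substitute(clauses, var, value):
    # F[value/var] on a DNF clause set, simplified by SR1 and SR2
    sat = var if value else '-' + var          # the literal made true
    fal = '-' + var if value else var          # the literal made false
    out = set()
    for c in clauses:
        if fal in c:                           # conjunct contains a false literal: drop it
            continue
        c = frozenset(l for l in c if l != sat)
        if not c:                              # an empty conjunct: the formula is 1
            return True
        out.add(c)
    return out if out else False               # no conjuncts left: the formula is 0

def conjoin(F, G):
    # F AND G in DNF: pairwise unions of conjuncts, dropping contradictions (SR3)
    if F is True: return G
    if G is True: return F
    if F is False or G is False: return False
    out = set()
    for c in F:
        for d in G:
            u = c | d
            if not any(('-' + l) in u for l in u if not l.startswith('-')):
                out.add(u)
    return out if out else False

def ritc_formula(F, variables):
    # RITc(F, V); returns True, False, a literal, or a nested ('and', ...)/('or', ...) term
    if F is True or F is False or not variables:
        return F
    p, rest = variables[0], variables[1:]
    F0, F1 = substitute(F, p, False), substitute(F, p, True)
    b0, b1 = ritc_formula(F0, rest), ritc_formula(F1, rest)
    disjuncts = []
    if b0 is not False:
        disjuncts.append('-' + p if b0 is True else ('and', '-' + p, b0))
    if b1 is not False:
        disjuncts.append(p if b1 is True else ('and', p, b1))
    if F0 is not False and F1 is not False:
        b3 = ritc_formula(conjoin(F0, F1), rest)
        if b3 is True:                         # both restrictions are tautologies, hence so is F
            return True
        if b3 is not False:
            disjuncts.append(b3)
    if not disjuncts:
        return False
    return disjuncts[0] if len(disjuncts) == 1 else ('or',) + tuple(disjuncts)

# On the running example,
# F = {frozenset({'p','q','-s'}), frozenset({'p','q','r'}), frozenset({'p','r','s'}), frozenset({'p','q'})}
# ritc_formula(F, ['p','q','r','s']) yields a term equivalent to p ∧ ((¬q ∧ r ∧ s) ∨ q ∨ (r ∧ s)).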

5 An Algorithm for Computing Reduced Implicant Tries

In this section, an algorithm that produces reduced implicant tries is developed using pseudo-code. The algorithm relies heavily on Lemma 2, which states that Impc(F1 ∧ F0) = Impc(F1) ∩ Impc(F0); i.e., the branches produced by the third disjunct of the RITc operator are precisely the branches that occur in both of the first two (ignoring, of course, the root labels pi and ¬pi). The algorithm makes use of this lemma rather than directly implementing the RITc operator; in particular, the recursive call RITc((F[0/pi] ∧ F[1/pi]), V − {pi}) is avoided. This is significant because that call doubles the size of the formula along a single branch. No attempt was made to make the algorithm maximally efficient. For clarity, the algorithm is designed so that the first two disjuncts of the RITc operator are constructed in their entirety, and then the third disjunct is produced by parallel traversal of the first two.

The algorithm employs two functions: ritc and buildone. The nodes of the trie consist of five fields: lit, which is the name of the literal that occurs in the node; parent, which is a pointer to the parent of the node; and minus, plus, and one, which are pointers to the three children. The function ritc recursively builds the first two disjuncts of the RITc operator and then calls buildone, which recursively builds the third disjunct from the first two. The reader may note that the algorithm builds a ternary tree rather than an n-ary trie. The reason is that the construction of the subtree representing the third disjunct of the RITc operator sets the label of the root to 0. This is convenient for the abstract description of the algorithm; it is straightforward but tedious to write the code without employing the one nodes.

Observations.

1. Recall that each (sub-)trie represents the conjunction of the label of its root with the disjunction of the sub-tries rooted at its children.

2. If either of the first two sub-tries is 0, then that sub-trie is empty. The third is also empty since it is the intersection of the first two.

3. If any child of a node is 1, then the node reduces to a leaf. In practice, in the algorithm, this will only occur in the first two branches.

4. If both of the first two sub-tries are leaves, then they are deleted by SR8 and the root becomes a leaf.

5. No pseudocode is provided for the straightforward routines "makeleaf", "leaf", and "delete". The first two are called on a pointer to a trienode, the third is called on a trienode.


The Reduced Implicant Trie Algorithm.

declare(
    structure trienode(
        lit:    label,
        parent: ↑trienode,
        minus:  ↑trienode,
        plus:   ↑trienode,
        one:    ↑trienode);
    RImplicanttrie: trienode);

input(G);                         {The logical formula G has variables p1, p2, ..., pn}
RImplicanttrie ← ritc(G, 1, 0);

function ritc(G: wff, polarity, varindex: integer): ↑trienode;
    N ← new(trienode);
    if varindex = 0
        then N.lit ← 1                                {root of entire trie is 1}
        else if polarity = 0
            then N.lit ← −pvarindex
            else N.lit ← pvarindex;
    Gminus ← G[0/pvarindex];
    Gplus ← G[1/pvarindex];
    if (Gplus = 1 or Gminus = 1) then
        makeleaf(N); return(↑N);                      {Observation 3.}
    if Gplus = 0
        then N.plus ← nil; N.one ← nil                {Observation 2.}
        else N.plus ← ritc(Gplus, 1, varindex+1);
             if N.plus.lit = 0 then delete(N.plus↑); N.plus ← nil;
    if Gminus = 0
        then N.minus ← nil; N.one ← nil               {Observation 2.}
        else N.minus ← ritc(Gminus, 0, varindex+1);
             if N.minus.lit = 0 then delete(N.minus↑);
    if N.plus = nil then N.lit ← 0; makeleaf(N);
    if (leaf(N.plus) and leaf(N.minus)) then          {Observation 4.}
        delete(N.plus); delete(N.minus); makeleaf(N); return(↑N);
    if (N.plus ≠ nil and N.minus ≠ nil) then
        N.one ← buildone(N.plus, N.minus);
        N.one↑.parent ← ↑N; N.one↑.lit ← 0;
    return(↑N);
end ritc;


function buildone(N1, N2: ↑trienode): ↑trienode;
    None ← new(trienode);
    None.lit ← N1↑.lit;
    if leaf(N1) then
        None.(plus, minus, one) ← N2↑.(plus, minus, one); return(↑None);
    if leaf(N2) then
        None.(plus, minus, one) ← N1↑.(plus, minus, one); return(↑None);
    if (N1↑.plus = nil or N2↑.plus = nil)
        then None.plus ← nil
        else None.plus ← buildone(N1↑.plus, N2↑.plus);
    if (N1↑.minus = nil or N2↑.minus = nil)
        then None.minus ← nil
        else None.minus ← buildone(N1↑.minus, N2↑.minus);
    if (N1↑.one = nil or N2↑.one = nil)
        then None.one ← nil
        else None.one ← buildone(N1↑.one, N2↑.one);
    if leaf(↑None)
        then delete(↑None); return(nil)
        else begin
            if None.plus ≠ nil then None.plus↑.parent ← ↑None;
            if None.minus ≠ nil then None.minus↑.parent ← ↑None;
            if None.one ≠ nil then None.one↑.parent ← ↑None;
            return(↑None)
        end
end buildone.

6 Intersecting Reduced Implicant Tries

In Section 5, the algorithm that produces reduced implicant tries relies heavily on Lemma 2, which states that Impc(F0 ∧ F1) = Impc(F0) ∩ Impc(F1); i.e., the branches produced by the third disjunct of the RITc operator represent precisely the implicants that are represented in both of the first two (ignoring, of course, the root labels pi and ¬pi). Given any two formulas F and G, fix an ordering of the union of their variable sets, and let TF and TG be the corresponding reduced implicant tries. The intersection of TF and TG is defined to be the trie that represents the intersection of the implicant sets with respect to the given variable ordering. By Theorems 1 and 5, and Lemma 2, this is the reduced implicant trie for F ∧ G. This definition captures the computation expressed in pseudocode as the function buildone in Section 5.

As with RITc, the INT operator can be defined recursively. It is again assumed for convenience that the trie is represented as a ternary tree rather than as an n-ary trie. The root of the entire trie is 1 (for non-contradictions), and any node at level i has ordered children that are either empty or are tries whose roots are labeled ¬pi+1, pi+1, and 0. As a result, a trie T rooted at pi can be represented notationally as a 4-tuple ⟨pi, T−, T+, T0⟩. This trie represents the formula pi ∧ (T− ∨ T+ ∨ T0), and we write T − pi to denote the second conjunct (which is equivalent to T ∨ ¬pi). The predicate leaf returns true whenever all sub-tries are empty. Given two identically ordered tries TF and TG, INT(TF, TG) is defined as follows.

INT(TF, TG) =

    ∅            if TF = ∅ ∨ TG = ∅
    TF           if leaf(TG)
    TG           if leaf(TF)
    ∅            if leaf(TF ⊕ TG)
    TF ⊕ TG      otherwise

where TF ⊕ TG = ⟨ r, INT(TF−, TG−), INT(TF+, TG+), INT(TF0, TG0) ⟩

and r is the root label of both TF and TG.

First note that the INT operator clearly produces a labeled trie that has the structure of a ternary tree exactly like its arguments. Lemma 5 and Theorem 3 show that this trie is precisely the reduced implicant trie that is the intersection of its arguments. Observe also that in case four, the leaf test on TF ⊕ TG is required. When neither argument is a leaf (as ruled out by previous cases) and yet the intersections of all corresponding sub-tries are empty, TF ⊕ TG produces a leaf, but the two tries share no branches. Finally, note that when considering the implicants corresponding to a branch in a reduced implicant trie, the zero labels are ignored.

Lemma 5. Let TF and TG be reduced implicant tries having the same variable ordering. Let CF be a non-empty prefix of CG, where CF is a branch in TF and CG is a branch in TG. Then CG is a branch in INT(TF, TG).

Proof 6. By induction on n, the number of literals in CF. If n = 1, then CF = {pi} is a singleton; it is also a branch in TF, which in turn must be a root leaf. So case 3 in the definition of INT applies. The intersection will be TG, and CG is by definition a branch in TG and thus also in the intersection. Otherwise, assume true for 1 ≤ n ≤ k, and suppose n = k + 1. Let pi be the first literal in CF. Since both CF and CG correspond to branches of length greater than one, case 5 must apply. Clearly,

pi is the root label of TF and of TG. Each of CF − {pi} and CG − {pi} is non-empty, and the former is a prefix of the latter. As a result, CF − {pi} is a branch in TF−, TF+, or TF0; without loss of generality say TF+. Then CG − {pi} is a branch in TG+. Since CF − {pi} contains at most k literals, the induction hypothesis applies. Therefore, CG − {pi} is a branch in INT(TF+, TG+), which implies that CG is a branch in INT(TF, TG). □

Observe that all non-empty branches of the intersection trie are constructed via zero or more applications of case 5, followed by one application of either case 2 or case 3, in the definition of INT. When the computation terminates from case 2 (3), each such branch corresponds to the identical branch in TF (TG) and to a branch in TG (TF) that is a prefix of the branch in the intersection.

Theorem 3. Let TF and TG be the reduced implicant tries for F and G having the same variable ordering. Then INT(TF, TG) is the reduced implicant trie that is the intersection of TF and TG and, as a result, is the reduced implicant trie for F ∧ G with respect to the given variable ordering.

Proof 7. Let C be an implicant of F ∧ G. By Lemma 2, C is an implicant of both F and G. Then by Theorem 2, there is a unique prefix CF of C that is a branch in TF; similarly, there is a unique prefix CG of C that is a branch in TG. We must show that some unique prefix of C is a branch in the intersection. If C is the empty clause, then both TF and TG are singleton roots labeled 1, and so is the intersection, and there is nothing to prove. If either CF or CG is empty, then one of the tries is a singleton root, the intersection is the other trie, and the result is immediate. So assume C, CF, and CG are not empty. One or both of CF and CG is a prefix of the other; without loss of generality, assume CF is a prefix of CG. Therefore the branch corresponding to CF in TF is a prefix of the branch in TG corresponding to CG. By Lemma 5, CG corresponds to a branch in INT(TF, TG). No other prefix C′ of C can be a branch in the intersection because by the observation after Lemma 5, C′ would be a branch in one of TF or TG. But this would violate the unique prefix property for either CF or CG. □

Using Theorem 3, we can give the following definition for the RITc operator. It differs from the definition given in Section 4.2 in that it provides a formal basis for the computation of reduced implicant tries as ternary trees using intersection and structure sharing, exactly as embodied by the pseudocode in that section.

RITc(F, V) =

    F                                              if V = ∅

    (¬pi ∧ B1) ∨ (pi ∧ B2) ∨ (1 ∧ B3)              if pi ∈ V

where pi is the variable of lowest index in V, and

    B1 = RITc(F[0/pi], V − {pi})
    B2 = RITc(F[1/pi], V − {pi})
    B3 = INT(B1, B2)
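A minimal sketch of the INT operator on an explicit ternary-tree representation, written for illustration here rather than taken from the report. Each node stores its label and the three ordered children of the 4-tuple ⟨r, T−, T+, T0⟩; empty tries are represented by None, and the class and function names are assumptions.

class Node:
    def __init__(self, label, minus=None, plus=None, one=None):
        self.label = label
        self.minus, self.plus, self.one = minus, plus, one

def leaf(t):
    return t is not None and t.minus is None and t.plus is None and t.one is None

def intersect(tf, tg):
    # INT(TF, TG): the trie whose branches appear in both arguments
    if tf is None or tg is None:
        return None                              # case 1: one argument is empty
    if leaf(tg):
        return tf                                # case 2: TG's single branch is a prefix of every branch of TF
    if leaf(tf):
        return tg                                # case 3: dual of case 2
    combined = Node(tf.label,
                    intersect(tf.minus, tg.minus),
                    intersect(tf.plus, tg.plus),
                    intersect(tf.one, tg.one))
    # case 4: no corresponding sub-tries intersect, so the two tries share no branches
    return None if leaf(combined) else combined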

7 Reduced Implicant Tries as Dags

Definition. Let D1 and D2 be directed acyclic graphs (dags). Then D1 and D2 are said to be isomorphic if there is a bijection f such that if (A, B) is an edge in D1, then (f(A), f(B)) is an edge in D2. If the nodes of the dags are labeled, then the isomorphism must also preserve labels: Label(A) = Label(f(A)). For the remainder of this paper, we will assume all isomorphisms to be label-preserving.

Theorem 4. Let R be a reduced implicant trie, and let f be an isomorphism from R to R. Then f is the identity map on R.

Proof 8. We proceed by induction on the number of variables in R. The result is trivial if there is only one variable, so assume true for any reduced implicant trie with at most n variables, and suppose R has n + 1 variables. The root has at most three children, labeled p1, ¬p1, 1. Since each has a distinct label and f is label preserving, f must map each of these children to itself. Note that the edge preservation property of f ensures that f maps each subtrie to itself. The induction hypothesis thus applies to each subtrie, and the proof is complete. □

The theorem, while straightforward, is not immediate. In Figure 1, the dag has a label-preserving isomorphism that is not the identity because it swaps the nodes labeled "a".

(Figure omitted: a dag with root r, two children both labeled a, and a shared child labeled b.)

Figure 1: Non-Identity Label-Preserving Isomorphism

The induction of the last theorem can easily be adapted to prove

Theorem 5. Let F and G be logically equivalent formulas. Then, with respect to a fixed variable ordering, RITc(F) is isomorphic to RITc(G).

Note that F and G may have different variable sets. In that case, we assume that the fixed ordering in the theorem refers to the union of these variable sets. As a result, comparing the reduced implicant tries of two formulas amounts to a (not necessarily practical) test for logical equivalence. On the other hand, if the formulas are known to be equivalent, attention can be restricted to variables in the symmetric difference of their variable sets; all others are redundant.
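By Theorems 4 and 5, two formulas over the same fixed variable ordering are logically equivalent exactly when their reduced implicant tries are structurally identical, so the comparison reduces to a simple recursive check. The sketch below is only an illustration of that observation, not a practical decision procedure; it assumes the Node class from the INT sketch above.

def same_trie(t1, t2):
    # structural equality of ternary tries; with a fixed variable ordering this
    # tests logical equivalence of the formulas the tries were built from
    if t1 is None or t2 is None:
        return t1 is t2
    return (t1.label == t2.label
            and same_trie(t1.minus, t2.minus)
            and same_trie(t1.plus, t2.plus)
            and same_trie(t1.one, t2.one))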

8 Reduced Implicate/Implicant Tries

Recall that reduced implicate tries (or ri-tries) were introduced and further developed in [39] and in [40], along with the RIT and INT operators. Implicate tries are the duals of implicant tries; for either type of trie, the INT operator takes two trie arguments and produces the trie having exactly the branches that appear in both arguments. This operator is based entirely on the structure of its arguments and, as a result, is neutral with respect to implicate or implicant tries. For the remainder of this section, the reader is assumed to be familiar with these operators and with the developments in [39] and in [40]. The RITc and RIT operators are certainly different, but the recursive computations in both require replacing a variable with truth constants. The goal of this section is to take advantage of the common subcomputations in the design of a new operator, RIIT, that produces reduced implicate/implicant tries (rii-tries) that can determine both implicates and implicants.

8.1 Merging ri-tries and Reduced Implicant Tries

The goal for the design of the RIIT operator is to build a trie in which both the implicates and implicants of a formula are represented by the branches. So what is desired is some kind of combination of the tries built by RIT and RITc. Let V = {p1, . . . , pn} and consider TF = RIT(F, V) and TFc = RITc(F, V) and how they compare. Each is a ternary trie in which the ith variable appears at level i. Any node of either trie can be uniquely specified by a position in the obvious way, where a position is a sequence of integers taken from {1, 2, 3}. (The root is at the empty position.) Suppose position P exists in both tries; let N be the node at P in the ri-trie, and let Nc be the node at P in the reduced implicant trie. It is obvious from the definitions of RIT and RITc that if N has label q, then Nc has the complementary label ¬q. When convenient, the complement of a label l will be written l̄, denoting ¬q when l = q and q when l = ¬q.

Lemma 6. Let TF and TFc be, respectively, the ri-trie and the reduced implicant trie for F under a given variable ordering. Suppose node N in TF and node Nc in TFc are each at position P. Then neither N nor Nc is a leaf.

Proof 9. Suppose N is a leaf, and suppose the branch ending in N represents the clause C. Then C is an implicate of F. If position P corresponding to Nc exists in TFc, then some extension PQ of P is a branch in TFc (where Q could be empty). So PQ represents the DNF clause D = ¬C ∪ {q1, . . . , qm}. Clearly D is an implicant of F. This means that D = (¬C ∪ {q1, . . . , qm}) |= F |= C. Since D is a satisfiable DNF clause, all literals in D can be simultaneously made true, falsifying C, and this is a contradiction.

If Nc is a leaf, let D be the clause represented by the branch ending in Nc. Then D is an implicant of F. If position P corresponding to N exists in TF, then some extension PQ of P is a branch in TF (where Q could be empty). So PQ represents the CNF clause C = ¬D ∪ {q1, . . . , qk}. Clearly C is an implicate of F. This means that D |= F |= C = (¬D ∪ {q1, . . . , qk}). Since C is a falsifiable CNF clause, all literals in C can be simultaneously made false, satisfying D, and this is a contradiction. □

The proof of Lemma 6 relies on entailment between implicants, formulas, and implicates. Therefore it applies with only trivial changes to the following corollary.

Corollary 3. Let T1 and T2c be the ri-trie and the reduced implicant trie, respectively, for F1 and for F2 under a given variable ordering, where F2 |= F1. Suppose node N in T1 and node Nc in T2c are each at position P. Then neither N nor Nc is a leaf.

While the corollary is stated in rather general form, it will be most useful in the particular case where F1 = F ∨ G and F2 = F ∧ G. Whenever Lemma 6 applies to ri-trie T and to reduced implicant trie Tc, it can be concluded that the leaf node positions of T are disjoint from those of Tc. The two tries share some internal structure but no branches. Given a position P = (i1, i2, . . . , in) for leaf N in T, we may start at the root of Tc and trace the path corresponding to P. There must be a node N′ at some position P′ = (i1, i2, . . . , ic), c < n, such that ic+1 ∈ {1, 2}, the position (i1, i2, . . . , ic, ic+1) does not exist in Tc, and the position (i1, i2, . . . , ic, (3 − ic+1)) does exist in Tc. In other words, T and Tc have branches that consist of identical positions up through P′. But one node at P′ has only a first or only a second child, while the other has a different only child. Any path common to both tries must "split" at some point prior to encountering a leaf, otherwise the leaf position would be common to both tries, which is impossible. This split means that the two nodes at the lowest common position must be missing different first and second children. Note that such nodes cannot have a third child since the intersection of the first two must be empty.

As a result, it is straightforward to build a trie whose branches are precisely the branches in both T and Tc. In a simple recursive process, the tries can be traversed in parallel. A trie can be constructed that is isomorphic to the common parts of the two tries. When a node is encountered where the tries diverge, the corresponding node in the trie under construction can be given (essentially) the first child from one trie and the second child from the other. Note that the third child will be empty since it is empty in both tries being traversed.

Let TF = ⟨0, TF+, TF−, TF0⟩ be the ri-trie for F, and let TGc = ⟨1, cTG+, cTG−, cTG0⟩ be the reduced implicant trie for G under a fixed variable ordering, where G |= F. The merge of TF and TGc is defined to be the trie whose branches are exactly the branches of both TF and TGc. In order to type each branch according to its trie of origin, the leaf nodes will be marked as type-d or type-c. Branches leading to type-d (type-c) leaves are called type-d (type-c) branches. Since nodes in positions common to both tries have complementary labels, we will use a toggling symbol ∝ to indicate the label that a node at a given position would have if it were at that position in TF or in TGc. When considering implicates, ∝ is interpreted as the identity, but for implicants it is interpreted as complement. Note also that ri-tries and reduced implicant tries are interpreted as specific (and dual) logical structures using conjunction and disjunction. But both must be captured by an rii-trie, and so the symbols ⊙ and ⊗ will be used to denote, respectively, connectives along and between branches. Connective symbols do not appear explicitly in definitions employing the 4-tuple notation, but it is assumed that in the result of MERGE, the connective symbols are ⊙ and ⊗.

Given ri-trie T and reduced implicant trie Tc, we denote by d(T) the trie that results from marking the leaves of T as type-d and prepending the symbol ∝ to all its labels. We denote by c(Tc) the trie that results from marking the leaves of Tc as type-c, and complementing and then prepending ∝ to all its labels. The merge of TF and TGc may be computed as follows.

MERGE(TF, TGc) =

    ∅                                                                if TF = ∅ and TGc = ∅
    d(TF)                                                            if TGc = ∅
    c(TGc)                                                           if TF = ∅
    ⟨ ∝r, MERGE(TF+, cTG+), MERGE(TF−, cTG−), MERGE(TF0, cTG0) ⟩     otherwise

Note that in the recursive call to MERGE (fourth case), the root labels of the two arguments are complementary. The root of the constructed trie uses the label of the ri-trie but with ∝ prepended. This yields exactly the correct label when these nodes are viewed for implicates, and dually for implicants. For uniformity, ∝ is prepended to all labels in the base cases as well. For case two, the construction requires only that the leaves be marked as type-d and that ∝ is added to the labels, and this is exactly what d(TF) produces. For case three, TGc is a correctly labeled reduced implicant trie. When searching for implicants, ∝ means complement, so for this case the labels have to be complemented and then prepended with ∝; in essence, evaluating c(TGc) complements the correct labels twice. Along with Lemma 6, this proves

Lemma 7. If G |= F, the d-branches and c-branches of MERGE(TF, TGc) are disjoint and have exactly the positions of the branches of TF and of TGc. The sub-trie consisting of the type-d branches is identical to TF if ∝ and leaf marks are removed. The sub-trie consisting of the type-c branches is identical to TGc if labels are complemented, and ∝ and leaf marks are removed. □

As a result, the rii-trie for F is defined to be MERGE(TF, TFc). The goal, however, is to define the RIIT operator to compute this trie without first computing the ri-trie and the reduced implicant trie.
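The MERGE computation can be sketched as follows; this is an illustration for this presentation, not the authors' code. Nodes carry the ri-trie's labels (the ∝ toggle is left implicit: a d-search reads labels as stored, a c-search complements them), and leaves record their type. The RIINode class and the helper names are assumptions.

def negate(lit):
    # complement a literal; constant labels (the roots 0 and 1) are left unchanged here
    if not isinstance(lit, str):
        return lit
    return lit[1:] if lit.startswith('-') else '-' + lit

class RIINode:
    def __init__(self, label, minus=None, plus=None, one=None, leaf_type=None):
        self.label = label                       # stored using the ri-trie's labeling
        self.minus, self.plus, self.one = minus, plus, one
        self.leaf_type = leaf_type               # 'd' or 'c' for leaves, None for interior nodes

def mark(t, kind, complement=False):
    # d(T) (kind='d') or c(T) (kind='c', labels complemented): a typed copy of t
    if t is None:
        return None
    node = RIINode(negate(t.label) if complement else t.label,
                   mark(t.minus, kind, complement),
                   mark(t.plus, kind, complement),
                   mark(t.one, kind, complement))
    if node.minus is None and node.plus is None and node.one is None:
        node.leaf_type = kind
    return node

def merge(tf, tgc):
    # MERGE(TF, TGc): the branches of both tries, leaves typed by trie of origin;
    # by Lemma 6, leaves never occur at positions common to both arguments
    if tf is None and tgc is None:
        return None
    if tgc is None:
        return mark(tf, 'd')                     # case 2: d(TF)
    if tf is None:
        return mark(tgc, 'c', complement=True)   # case 3: c(TGc)
    return RIINode(tf.label,                     # case 4: keep the ri-trie's label
                   merge(tf.minus, tgc.minus),
                   merge(tf.plus, tgc.plus),
                   merge(tf.one, tgc.one))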

8.2 Intersecting rii-Tries

If an rii-trie is viewed simply as the MERGE of an ri-trie and a reduced implicant trie, then the notion of intersection need not be addressed explicitly. Intersections of implicates and implicants have already been computed as necessary in forming the tries to be merged. However, it is necessary to address intersection directly in order to compute rii-tries directly. Given any two formulas F and G, fix an ordering of the union V of their variable sets, let TF and TG be the corresponding ri-tries, and let TFc and TGc be the corresponding reduced implicant tries. We denote by TFii and TGii the rii-tries for F and for G. The intersection of TFii and TGii is defined to be MERGE(RIT(F ∨ G, V), RITc(F ∧ G, V)), which is MERGE(INT(TF, TG), INT(TFc, TGc)). This trie, while not in general the rii-trie of any formula, is the trie whose d-branches correspond precisely to the branches of RIT(F ∨ G, V), and whose c-branches correspond precisely to the branches of RITc(F ∧ G, V). Note that by Lemma 7 and the corollary to Lemma 6, this merge of the ri-trie and reduced implicant trie of, respectively, the different formulas (F ∨ G) and (F ∧ G) is well-defined. Our goal here is to define IINT, the rii-trie intersection operator, directly without the use of MERGE, RIT, or RITc.

The IINT operator is defined similarly to INT. Recall that the INT operator is a recursion that traverses its trie arguments. The base cases that end the recursion involve either the empty trie or a leaf node. This holds for the IINT operator as well: If one argument is empty, then so is the intersection. When one argument is a type-d leaf, then the intersection is all branches in the other argument that end in type-d leaves. Dually, when one argument is a type-c leaf, then the intersection is all branches in the other argument that end in type-c leaves.

Let TF and TG be the rii-tries for F and for G, respectively, under a fixed variable ordering. Let ⟨r, TF+, TF−, TF0⟩ and ⟨r, TG+, TG−, TG0⟩ be the 4-tuples denoting TF and TG, respectively. Let Tγ be the sub-trie of T whose branches end in γ-leaves, γ = d, c, and let TF ⊕ii TG be the four-tuple ⟨r, IINT(TF+, TG+), IINT(TF−, TG−), IINT(TF0, TG0)⟩. Then

IINT(TF, TG) =

    ∅              if TF = ∅ or TG = ∅
    TFγ            if γ-leaf(TG)
    TGγ            if γ-leaf(TF)
    ∅              if leaf(TF ⊕ii TG)
    TF ⊕ii TG      otherwise

Lemma 8. Let TF and TG be the rii-tries of F and of G with the same variable ordering. Let CF be a non-empty prefix of CG, where CF is a d-branch (c-branch) in TF and CG is a d-branch (c-branch) in TG. Then CG is a d-branch (c-branch) in IINT(TF, TG).

Proof 10. By induction on n, the number of literals in CF. Without loss of generality, assume the branches are type-d. If n = 1, then CF = {pi} is a singleton; it is also a branch in TF, which in turn must be a root type-d leaf. So case 3 in the definition of IINT applies, and γ = d. The intersection will be TGd, and CG is by definition a type-d branch in TG and thus also in TGd; it is therefore in the intersection. Otherwise, assume true for 1 ≤ n ≤ k, and suppose n = k + 1. Let pi be the first literal in CF. Since both CF and CG correspond to branches of length greater than one, case 5 must apply. Clearly, pi is the root label of TF and of TG. Each of CF − {pi} and CG − {pi} is non-empty, and the former is a prefix of the latter. As a result, CF − {pi} is a branch in TF+, TF−, or TF0; without loss of generality say TF+. Then CG − {pi} is a branch in TG+. Since CF − {pi} contains at most k literals, the induction hypothesis applies. Therefore, CG − {pi} is a branch in IINT(TF+, TG+), which implies that CG is a branch in IINT(TF, TG). The proof is completely dual for branches of type-c. □

Observe that all non-empty branches of the intersection trie are constructed via zero or more applications of case 5, followed by one application of either case 2 or case 3, in the definition of IINT. When the computation terminates from case 2 (3), each such branch corresponds to the identical branch in TF (TG) and to a branch in TG (TF) that is a prefix of the branch in the intersection.

Theorem 6. Let TF and TG be the rii-tries for F and G with the same variable ordering. Then IINT(TF, TG) is the trie that represents the intersection of TF and TG with respect to the given variable ordering.

Proof 11. Case 1: Let D be an implicant of F ∧ G. Let TFc and TGc be the reduced implicant tries of F and of G, respectively. By Lemma 2, D is an implicant of both F and G. Then by Theorem 2, there is a unique prefix DF of D that is a branch in TFc; similarly, there is a unique prefix DG of D that is a branch in TGc. We must show that some unique prefix of D is a c-branch in the intersection. If D is the empty clause, then both TFc and TGc are singleton roots labeled 1, and so is the intersection, and there is nothing to prove. If either DF or DG is empty, then one of the tries is a singleton root, the intersection is the other trie, and the result is immediate. So assume D, DF, and DG are not empty. One or both of DF and DG is a prefix of the other; without loss of generality, assume DF is a prefix of DG. Therefore the branch corresponding to DF in TFc is a prefix of the branch in TGc corresponding to DG. By Lemma 8, DG corresponds to a branch in IINT(TFii, TGii). No other prefix D′ of D can be a branch in the intersection because by the observation after Lemma 8, D′ would be a branch in one of TFc or TGc. But this would violate the unique prefix property for either TFc or TGc.

Case 2: Consider an implicate C of F ∨ G. The proof is entirely dual to that of Case 1. □

Theorem 6 guarantees that when applied to rii-tries, IINT produces precisely the trie whose d-branches represent (uniquely) the implicates (CNF clauses) represented in both sub-tries and whose c-branches represent (uniquely) the implicants (DNF clauses) represented in both sub-tries. They coexist peacefully in one trie because branches may end in one type or the other, but not both.
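The IINT recursion can be sketched over the same typed representation used in the MERGE sketch of Section 8.1 (an RIINode with minus, plus, one children and a leaf_type of 'd' or 'c'); restrict plays the role of Tγ. This is an illustrative reading of the definition above, not the authors' implementation, and the names are assumptions.

class RIINode:
    def __init__(self, label, minus=None, plus=None, one=None, leaf_type=None):
        self.label, self.leaf_type = label, leaf_type
        self.minus, self.plus, self.one = minus, plus, one

def restrict(t, kind):
    # T^gamma: the sub-trie of t consisting of the branches that end in gamma-leaves
    if t is None:
        return None
    if t.leaf_type is not None:
        return RIINode(t.label, leaf_type=kind) if t.leaf_type == kind else None
    node = RIINode(t.label, restrict(t.minus, kind), restrict(t.plus, kind), restrict(t.one, kind))
    if node.minus is None and node.plus is None and node.one is None:
        return None
    return node

def iint(tf, tg):
    # IINT(TF, TG) on typed rii-tries
    if tf is None or tg is None:
        return None                              # case 1: one argument is empty
    if tg.leaf_type is not None:
        return restrict(tf, tg.leaf_type)        # case 2: TF restricted to TG's leaf type
    if tf.leaf_type is not None:
        return restrict(tg, tf.leaf_type)        # case 3: dual of case 2
    node = RIINode(tf.label,
                   iint(tf.minus, tg.minus),
                   iint(tf.plus, tg.plus),
                   iint(tf.one, tg.one))
    if node.minus is None and node.plus is None and node.one is None:
        return None                              # case 4: the combined node would be a leaf
    return node                                  # case 5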

8.3 The RIIT Operator

If T is the rii-trie for a logical formula F, and if L is an ordered set of literals, then, since L can be interpreted as either a disjunctive or a conjunctive clause, one can ask, respectively, whether L is an implicate or an implicant of F. The rii-trie will have the property that in either case, the answer is yes if and only if a unique prefix of L is a branch in T. The notation d-clause and c-clause will be used to indicate, respectively, disjunctive and conjunctive clauses. Similarly, d-search and c-search will be used to indicate that the trie is being searched for, respectively, d-clauses or c-clauses. The choice of interpretation will determine the connectives along branches and those between branches, node labels, and the truth constants labeling interior nodes.

There is a straightforward duality of the logical connectives in these tries. For ri-tries, branches are disjunctions that are conjoined to each other; for reduced implicant tries, branches are conjunctions that are disjoined. In the rii-trie, the connectives must be interpreted; i.e., whether each connective is a disjunction or a conjunction depends upon whether implicates or implicants are being sought.

Consider next interior nodes labeled by truth constants. For non-contradictory non-tautological formulas, the root is 0 for the ri-trie, 1 for the reduced implicant trie. Assuming ternary structure, the third sub-trie of an ri-trie is rooted at 0; that of a reduced implicant trie is rooted at 1. In ri-tries, interior zeros are disjoined to the conjunction of their children; in reduced implicant tries, interior ones are conjoined to the disjunction of their children. In each case, the ternary structure is maintained by not applying simplification rules to such interior nodes. This convention works in exactly the same way in both types of trie because the simplification rules, if applied, would have exactly the same effect on interior constants. In particular, as with connectives in an rii-trie, the value of constants labeling interior nodes depends on the search (d or c), but their behavior with respect to the structure of the trie is independent of the search.

The RIIT operator is defined using ⊗ and ⊙ to represent, respectively, the connective between sub-tries and the connective along a branch. For example, with a d-search, ⊗ = ∧ and ⊙ = ∨. It will also be convenient to regard the label of an interior node as either unnegated or negated, depending on the search. The symbol ∝ will be used as a unary operator representing the identity — i.e., unnegated — for a d-search, and negation for a c-search. In the ternary structure, the third child of a node is always labeled with the constant ∝0, producing 0 for an implicate search and 1 otherwise. The definition of the RIIT operator requires the IINT operator and the simplification rules. The latter are more complicated than they are for ri-tries. They must allow for the dual connectives between and along branches. The definitions of the required simplification rules follow the definition of RIIT.

RIIT(F, V) =

    F                                                    if V = ∅

    (∝pi ⊙ B1) ⊗ (∝¬pi ⊙ B2) ⊗ (∝0 ⊙ B3)                 if pi ∈ V

where pi is the variable of lowest index in V, and

    B1 = RIIT(F[0/pi], V − {pi})
    B2 = RIIT(F[1/pi], V − {pi})
    B3 = IINT(B1, B2)

Notice that the truth constants substituted for variables in F are not preceded by the symbol ∝. The simplification rules SR1, SR2, and SR3 are repeated in the list below; the asterisks in SR*4 – SR*6 are used to indicate that those rules contain the interpretation-dependent operators ⊗ and ⊙.

SR1.   F −→ F[G/G ∨ 0]                     F −→ F[G/G ∧ 1]
SR2.   F −→ F[0/G ∧ 0]                     F −→ F[1/G ∨ 1]
SR3.   F −→ F[0/p ∧ ¬p]                    F −→ F[1/p ∨ ¬p]
SR*4.  F −→ F[∝q^d/(∝q ⊙ 0)]               F −→ F[∝q^c/(∝q ⊙ 1)]
SR*5.  F −→ F[0/(∝p^d) ⊗ (∝¬p^d)]          F −→ F[1/(∝p^c) ⊗ (∝¬p^c)]
SR*6.  F −→ F[G/G ⊗ (∝0 ⊙ ∝1)]

Rules SR1 – SR3 are unchanged but are now applicable only in simplifying F after substituting truth values for variables and never along or between the trie branches — the trie itself has no occurrences of ∨ or ∧. As a result, SR1 – SR3 cannot be extended to and ⊗. The result of such analogous operations is unknown unless the interpretation of the operator is known. The result is determined only 23

The result is determined only after the choice of implicate or implicant has been made: in one case the proposition (i.e., sub-trie) vanishes, and in the other case the constant vanishes. To allow for both cases, these simplifications cannot be performed on the rii-trie but instead must be accounted for during each search.

Observe that nodes labeled with 0 or 1 can occur only as leaves⁵ without siblings and can be simplified away with SR*4. A 0-leaf occurs exactly when the branch represents an implicate, and a 1-leaf occurs with an implicant. This distinction must be maintained when the leaves are simplified away, so SR*4 makes the parent into a type-d or type-c leaf by giving the label the appropriate superscript.

Constants also arise when the IINT operation produces the empty trie. In this case, the third subtrie, whose root is labeled ∝0, can simply be removed. To see why, note that IINT produces a trie whose branches represent both the intersection of implicate sets and the intersection of implicant sets. Consider the implicate interpretation. An empty intersection of implicates is an empty conjunction and hence is equal to the constant 1; ∝0 ⊙ ∝1 = 0 ∨ 1 = 1, and 1 ∧ G = G. Then SR*6 applies, and there is no third branch. Similarly, in the implicant interpretation, an empty intersection of implicants is an empty disjunction and hence is equal to 0; ∝0 ⊙ ∝1 = 1 ∧ 0 = 0, 0 ∨ G = G, and again SR*6 applies.

Rule SR*5 reduces complementary sibling type-d leaves to 0 and complementary type-c leaves to 1. (It is easy to verify that this is correct under both implicate and implicant interpretations.) In essence, SR*5 eliminates sections of the trie that are implicit in one interpretation and ignored in the other. Note that complementary sibling leaves of opposite type cannot be reduced: one, but not the other, is treated as if it were not there once the type of search has been specified.

Lemma 7 can be reconsidered in the context of the simplification rules. If attention is restricted to implicates (i.e., if ⊙, ⊗, and ∝ are specified as ∨, ∧, and the identity), then the c-branches can be removed from the rii-trie for F; the result is (essentially) the ri-trie for F. In essence, the second half of SR*4 is replaced by the second half of SR2 since branches are disjunctions: type-c leaves represent their label disjoined with 1 and are simplified away up to the first ancestor having another child. Similarly, if the d-branches are removed, the result is the reduced implicant trie for F.

Theorem 7 below states that the object built by the RIIT operator is exactly the rii-trie of its first argument.

Theorem 7. If F is any logical formula with variable set V, then RIIT(F, V) = MERGE(RIT(F, V), RIT^c(F, V)).

Proof 12. By induction on the number n of variables in V = {p1, . . . , pn}. If n = 1, then by Lemma 3 and its counterpart for ri-tries, there are four cases. In the two constant cases, the ri-trie and the reduced implicant trie are either both 0 or both 1. Either way, their MERGE is ∝0 or ∝1, respectively. But in the ∝0 case, RIIT(F, V) would first produce the trie ⟨∝0, ∝p1^d, ∝¬p1^d, ∅⟩, and this would reduce to ∝0 via the first half of SR*5. Similarly for the ∝1 case. If F = p1, then the ri-trie is ⟨0, p1, ∅, ∅⟩, the reduced implicant trie is ⟨1, ∅, ¬p1, ∅⟩, and their merge is ⟨∝0, ∝p1^d, ∝¬p1^c, ∅⟩, which is exactly what RIIT produces. Similarly if F = ¬p1.

⁵ Internal nodes labeled with constants are labeled ∝0.


So assume that the theorem holds for k or fewer variables, let n = k + 1, and consider MERGE(RIT(F, V), RIT^c(F, V)). From the definitions of RIT and of RIT^c, this is equivalent to the MERGE of

      (p1 ∨ RIT(F[0/p1], V − {p1}))
    ∧ (¬p1 ∨ RIT(F[1/p1], V − {p1}))
    ∧ (0 ∨ RIT(F[0/p1] ∨ F[1/p1], V − {p1}))

and

      (¬p1 ∧ RIT^c(F[0/p1], V − {p1}))
    ∨ (p1 ∧ RIT^c(F[1/p1], V − {p1}))
    ∨ (1 ∧ RIT^c(F[0/p1] ∧ F[1/p1], V − {p1}))

By the definition of MERGE, we have

      (∝p1 ⊙ MERGE(RIT(F[0/p1], V − {p1}), RIT^c(F[0/p1], V − {p1})))
    ⊗ (∝¬p1 ⊙ MERGE(RIT(F[1/p1], V − {p1}), RIT^c(F[1/p1], V − {p1})))
    ⊗ (∝0 ⊙ MERGE(RIT(F[0/p1] ∨ F[1/p1], V − {p1}), RIT^c(F[0/p1] ∧ F[1/p1], V − {p1})))

The induction hypothesis applies to the first two MERGE expressions; they are therefore equivalent to RIIT(F[0/p1], V − {p1}) and to RIIT(F[1/p1], V − {p1}), respectively. By the definition of intersection and Theorem 6, the third MERGE expression is equivalent to IINT(RIIT(F[0/p1], V − {p1}), RIIT(F[1/p1], V − {p1})). Substituting for these MERGE expressions produces

      (∝p1 ⊙ RIIT(F[0/p1], V − {p1}))
    ⊗ (∝¬p1 ⊙ RIIT(F[1/p1], V − {p1}))
    ⊗ (∝0 ⊙ IINT(RIIT(F[0/p1], V − {p1}), RIIT(F[1/p1], V − {p1})))

and this is exactly RIIT(F, V) after expanding on p1.

□

Searching an rii-trie for implicates and implicants is straightforward. To determine whether a given d-clause is an implicate, traverse the branch consisting of the literals in the clause. If the search fails or if a c-leaf is encountered, the clause is not an implicate. If the search ends in a d-leaf, then the path to the leaf must be a prefix of the clause, and the clause is an implicate. The search for implicants is the same, except that a yes answer requires termination at a c-leaf.

A simple example is the rii-trie for ((p ↔ q) ∨ r), displayed in Figure 2. The assignments of truth values to variables determined by the definition of RIIT are indicated below the corresponding nodes. Third sub-tries of interior nodes are not associated with particular truth assignments, because they are computed as the intersection (IINT) of the first two.

        


Figure 2: The rii-trie for ((p ↔ q) ∨ r)

In this example, the implicants (¬p ∧ r), (p ∧ ¬r), (¬q ∧ r), (q ∧ r), and (r) are all computed as the result of the IINT operator. This formula has only two relatively prime implicates: (p ∨ ¬q ∨ r) and (¬p ∨ q ∨ r). They appear as branches ending in type-d leaves in the rii-trie. The nine relatively prime implicants appear as the branches ending in type-c leaves. For an example of a search, consider determining whether the clause (q ∨ ¬r) is an implicate. The ∝ operator is interpreted as the identity, and the leftmost branch of the third child of the root would be traced. The leaf is ¬r^c, and so the answer is no.
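The traversal just described is simple enough to sketch directly. The Python fragment below is only illustrative: the dictionary encoding of the trie, the function name query, and the toy fragment at the end are invented for this sketch (the toy is shaped to mirror the two searches discussed above, not the full trie of Figure 2), and it assumes that following the matching literal child, or else the ∝0 child, at each level suffices, which is the prefix property the reduced trie is designed to provide.

from typing import Dict, List, Union

# Illustrative encoding (not the report's): an interior node maps a child
# label -- a literal such as 'q' or '~r', or '0' for the ∝0 child -- to its
# subtrie; a branch ends in the string 'd' (type-d leaf) or 'c' (type-c leaf).
Trie = Union[str, Dict[str, "Trie"]]

def query(trie: "Trie", clause: List[str], variables: List[str], want: str) -> bool:
    """want = 'd': is the d-clause an implicate?
       want = 'c': is the c-clause an implicant?"""
    literals = set(clause)
    node = trie
    for v in variables:                       # variables in trie (index) order
        if isinstance(node, str):             # already at a leaf: prefix found
            break
        for label in (v, '~' + v, '0'):       # matching literal, else ∝0 child
            if label != '0' and label not in literals:
                continue
            if label in node:
                node = node[label]
                break
        else:
            return False                      # no usable child: search fails
    return node == want                       # must end at a leaf of this type

# Toy usage, mirroring the searches discussed in the text:
toy = {'p': {'~q': {'r': 'd'}},
       '~p': {'q': {'r': 'd'}},
       '0': {'q': {'~r': 'c'}, 'r': 'c'}}
print(query(toy, ['p', '~q', 'r'], ['p', 'q', 'r'], 'd'))   # True: an implicate
print(query(toy, ['q', '~r'], ['p', 'q', 'r'], 'd'))        # False: ends at a c-leaf

The running time of query is linear in the number of variables on the traced branch, which is the behavior the report attributes to trie-based query processing.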
