Complexity Results of Subclasses of the Pure Implicational Calculus

Peter Heusch, Marc-André Lemburg, Ewald Speckenmeyer
Universität zu Köln, Institut für Informatik

Abstract

About 50 years ago Łukasiewicz, Tarski and others, see [6, 4], studied the implicational calculus, i.e. the set PIF (pure implicational formulas) of those propositional formulas that are constructed exclusively from Boolean variables and the propositional implication as the only connective. Obviously this class of formulas is not able to represent all Boolean functions. While every formula in PIF is satisfiable, the falsifiability problem remains NP-complete. In [3] it was shown that the falsifiability problem for PIF-formulas with every variable occurring at most twice, besides one variable $z$ occurring an arbitrary number of times, remains NP-complete; it is solvable in polynomial time $O(n^k)$ when the number of occurrences of $z$ is bounded by $k$. In this paper we present some further subclasses of PIF-formulas for which the falsifiability problem remains NP-complete, e.g. those PIF-formulas $F$ where every implicant on the path from the root of the tree that corresponds to $F$ to the rightmost variable contains at most 3 variables. If however subformulas of type $u \rightarrow (v \rightarrow w)$ are forbidden, then the falsifiability problem for this class is solvable in linear time. We present some further classes of PIF-formulas where falsifiability is solvable in polynomial time.
1 Introduction

The satisfiability problem for propositional Boolean formulas in conjunctive normal form (CNF) is the classical NP-complete problem, and its complexity has been analysed w.r.t. many different parameters. Nevertheless, there is still a considerable amount of research to be done, since there are only few results on the complexity of the satisfiability problem for non-clausal propositional formulas. This is remarkable, since Boolean formulas not in CNF have been studied for a long time, see [6, 4]. Łukasiewicz [4] proved about 50 years ago that all propositional tautologies in PIF can be derived by substitution and modus ponens from the single axiom $((p \rightarrow q) \rightarrow r) \rightarrow ((r \rightarrow p) \rightarrow (s \rightarrow p))$. The use of non-clausal Boolean formulas in logic programming showed that many applications are formulated much more easily and clearly by non-clausal formulas than by clausal formulas, hence it is not surprising that one of the few works on this subject was done by researchers from the logic programming community, see e.g. [7].

In this paper we present some special classes of non-clausal propositional Boolean formulas, where all formulas are constructed by using the implication as the only connective to connect Boolean variables. Formulas of this type are often used in PROLOG-type programs, where they are created from (existentially) quantified formulas by a unification process. The reason to look at this type of formulas was that the SAT-problem for Boolean formulas in CNF shows a form of threshold behaviour when its inputs are restricted to get more easily solvable formulas: either we have linear run time or the restricted problem still remains NP-complete. Two examples for this are k-SAT and Horn formulas. While k-SAT, where all clauses are restricted to length $k$ or less, is solvable in linear time for $k \le 2$, it becomes NP-complete for $k \ge 3$. A similar property holds for Horn, which is the class of all formulas in CNF where any clause contains at most one positive literal: Horn formulas are linear time testable for satisfiability, while for formulas in CNF where any clause may contain at most two positive literals, the satisfiability problem becomes NP-complete. One of the few exceptions to this rule are the extended Horn formulas of Gallo et al., see [2] for the definition.
The extended Horn formulas partition the set of all CNF-formulas into a set of subsets $p_1, p_2, \ldots$ such that formulas in $p_i$ can be tested for satisfiability in time $O(n^i)$. The disadvantage of this partitioning scheme is that the test whether some formula belongs to a certain class $p_i$ needs time $O(n^i)$, too; hence by the time we know how difficult the problem of satisfying some particular formula is, we could have solved the problem.

The remaining part of our paper is organized as follows: After introducing the notation we compare our class of formulas to van Gelder's AND-OR-formulas. We then present a basic algorithm that tests our implicational formulas for falsifiability and explain how this algorithm uses a substitution mechanism to compute its result. We also explain how this algorithm can be improved to have polynomial run time when applied to some commonly used classes of inputs (Horn clauses, 2-SAT). These improvements include the detection of unit-clauses and autark assignments of variables, as well as some optimizations where the result of an implication can be computed without looking at all of its parts.
2 The pure implicational calculus

2.1 Definition

In the sequel we will only treat formulas in pure implicational form (PIF), i.e. those formulas that are constructed from a set of variables by means of the propositional implication ($\rightarrow$) as the only connective. For a given implication $A \rightarrow B$ we call $A$ the implicant and $B$ the consequence of the implication; the implication evaluates to 0 if $A$ evaluates to 1 and $B$ evaluates to 0, otherwise it evaluates to 1. An assignment to the variables of a formula $F$ is said to satisfy (falsify) $F$ if $F$ evaluates to 1 (0) under the given assignment.

Since the implication is not associative, every expression has to be fully parenthesized. To ease readability we choose $A \rightarrow B \rightarrow C \rightarrow D$ as an abbreviation for $A \rightarrow (B \rightarrow (C \rightarrow D))$, i.e. we place parentheses as far to the right as possible. This ambiguity is also completely avoided by writing down a PIF-formula as a binary tree whose inner nodes are labeled with the operator $\rightarrow$ and whose leaves are labeled with the variables. There is an obvious correspondence between the nodes of the tree and the subformulas. This tree representation also allows for a straightforward definition of some functions associated with PIF-formulas.

For any node $V'$ in the tree representation of a PIF-formula $F$ represented by a node $V$ (and for its corresponding subformula $F'$) we define the left distance $D_l(V')$ of $V'$ as the number of left edges on the way from $V$ to $V'$. The rightmost variable $rightmost(F')$ of $F'$ is defined to be that variable in the subformula $F'$ which corresponds to the rightmost literal if $F'$ is written down the usual way as a Boolean expression. We will see later that this variable plays an important role when testing a PIF-formula for falsifiability.

For any subformula $F'$ of $F$ that is an implicant of its immediate predecessor (i.e. whose corresponding node is the left successor of its predecessor), we define the backbone of $F'$ as the set of all those nodes in the tree corresponding to $F'$ whose left distance equals $D_l(F')$; for $F$ itself, the backbone is the set of all nodes with left distance 0. The set $\mathcal{B}(F)$ by definition contains the backbones of $F$ and all its subformulas. We call a backbone a 0-backbone if all of its nodes have even left distance, otherwise we call it a 1-backbone. In the explanation of the falsifiability algorithm it will be shown that a 0-backbone remains a 0-backbone and a 1-backbone remains a 1-backbone.
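The tree representation and the functions defined on it can be made concrete with a small sketch. The class and function names below (`Var`, `Imp`, `left_distances`, `backbone_implicants`) are illustrative assumptions and do not come from the paper; the sketch only mirrors the definitions of evaluation, rightmost variable and left distance given above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Imp:
    left: "Formula"   # the implicant A of A -> B
    right: "Formula"  # the consequence B of A -> B


Formula = Union[Var, Imp]


def evaluate(f, assignment):
    """Evaluate f under a dict mapping variable names to booleans."""
    if isinstance(f, Var):
        return assignment[f.name]
    return (not evaluate(f.left, assignment)) or evaluate(f.right, assignment)


def rightmost(f):
    """Follow right edges only; the leaf reached is the rightmost variable."""
    while isinstance(f, Imp):
        f = f.right
    return f.name


def left_distances(f, dist=0):
    """Yield (variable, left distance) for every leaf, from left to right."""
    if isinstance(f, Var):
        yield f.name, dist
    else:
        yield from left_distances(f.left, dist + 1)  # a left edge adds 1
        yield from left_distances(f.right, dist)     # a right edge adds 0


def backbone_implicants(f):
    """Implicants hanging off the path from the root to the rightmost variable."""
    found = []
    while isinstance(f, Imp):
        found.append(f.left)
        f = f.right
    return found


a, b, c, d = Var("a"), Var("b"), Var("c"), Var("d")
F = Imp(Imp(a, b), Imp(c, d))  # (a -> b) -> (c -> d)
```

For this `F`, the leaves `b` and `c` have left distance 1, `a` has left distance 2 and `d` has left distance 0.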
2.2 Properties of the implication

Even though the propositional implication is neither associative nor commutative, it has some interesting properties that allow for certain straightforward optimizations when trying to falsify a Boolean formula in PIF. These optimizations can be carried out in constant time during the falsification process, provided that appropriate data structures are used and the necessary information is gathered in a preprocessing step. These data structures mainly have to guarantee that all occurrences of a given variable can be found within time proportional to the number of occurrences. Assuming that $t$ is a variable that evaluates to 1 and $f$, $f'$ are variables evaluating to 0, these optimizations include:
  $a \rightarrow t$ is replaced by $t$
  $t \rightarrow a$ is replaced by $a$
  $f \rightarrow a$ is replaced by $t$
  $(a \rightarrow f) \rightarrow f'$ is replaced by $a$
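The four replacement rules can be verified by a truth table over the only free variable $a$. The following is a minimal check; the helper names `imp` and `rules_hold` are assumptions made for this illustration.

```python
def imp(x, y):
    """Propositional implication on booleans."""
    return (not x) or y


def rules_hold():
    """Check the four simplification rules for both values of a."""
    t, f, f2 = True, False, False  # t evaluates to 1; f and f' evaluate to 0
    for a in (False, True):
        if imp(a, t) != t:               # a -> t         is replaced by t
            return False
        if imp(t, a) != a:               # t -> a         is replaced by a
            return False
        if imp(f, a) != t:               # f -> a         is replaced by t
            return False
        if imp(imp(a, f), f2) != a:      # (a -> f) -> f' is replaced by a
            return False
    return True
```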
We note that optimizations may be cascaded i. e. the application of one optimization may result in another optimization being applicable as follows:
Figure 1: Setting $z$ to 0 satisfies the whole subformula given by $F$, since the implication whose implicant is $z$ evaluates to 1, which in turn causes all of its predecessors up to $F$ to evaluate to 1, too.
2.3 PIF and other classes of Boolean formulas

In [7], Allen van Gelder introduced the class of succinct Boolean AND-OR-formulas, also known as negation normal form (NNF) formulas. These formulas are defined by the following properties:

1. The tree contains a truth-value node (empty AND or OR) if and only if it is the only node of the tree.
2. No AND has an AND for a child, no OR has an OR for a child.
3. Every operation node has at least two children (unless 1 applies).
4. No operation node has two immediate leaves as children that both represent the same variable.

For succinct Boolean AND-OR-formulas, van Gelder proved that a simple divide and conquer algorithm, augmented by a clever technique to determine the next variable used for dividing the problem into two subproblems, is able to solve the satisfiability problem in time $O((1.18 + \varepsilon)^L)$, where $L$ denotes the number of occurrences of variables in the formula. Since van Gelder's formulas may use AND, OR and NOT as Boolean connectives (the latter moved down to the leaves by application of De Morgan's rules, yielding NNF-type formulas), they can represent any Boolean function and are not limited to special Boolean functions like PIF-formulas, which can only represent Boolean functions with a positive prime implicant of length at most 1, see [1]. Nonetheless the two types are closely related: PIF-formulas can easily be transformed into equivalent NNF-formulas by using the tree representation and the following mapping:

  node type   left distance   from            to
  inner       even            $\rightarrow$   OR
  inner       odd             $\rightarrow$   AND
  leaf        even            variable $x$    literal $x$
  leaf        odd             variable $x$    literal $\neg x$

Table 1

When these transformation rules are applied to the formula $(a \rightarrow b) \rightarrow (c \rightarrow d)$, the leaves corresponding to $b$ and $c$ and the first $\rightarrow$ have odd left distance, all other nodes have even left distance, hence $(a$ AND $\neg b)$ OR $(\neg c$ OR $d)$ is the resulting formula in NNF.
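The mapping of Table 1 can be sketched as a short recursive function. The encoding below is an assumption made for illustration: a variable is a string, an implication $A \rightarrow B$ is the pair `(A, B)`, and NNF nodes are tagged tuples. The result is not flattened, so nested OR nodes may still have to be merged to obtain a succinct formula in van Gelder's sense.

```python
from itertools import product


def to_nnf(f, dist=0):
    """Translate a PIF tree into NNF using the parity of the left distance."""
    if isinstance(f, str):
        return f if dist % 2 == 0 else ("NOT", f)
    left, right = f
    op = "OR" if dist % 2 == 0 else "AND"
    return (op, to_nnf(left, dist + 1), to_nnf(right, dist))


def eval_pif(f, env):
    if isinstance(f, str):
        return env[f]
    left, right = f
    return (not eval_pif(left, env)) or eval_pif(right, env)


def eval_nnf(g, env):
    if isinstance(g, str):
        return env[g]
    if g[0] == "NOT":
        return not env[g[1]]
    if g[0] == "AND":
        return eval_nnf(g[1], env) and eval_nnf(g[2], env)
    return eval_nnf(g[1], env) or eval_nnf(g[2], env)


def equivalent(f, names):
    """Brute-force comparison of a PIF formula with its NNF translation."""
    g = to_nnf(f)
    return all(
        eval_pif(f, env) == eval_nnf(g, env)
        for bits in product([False, True], repeat=len(names))
        for env in [dict(zip(names, bits))]
    )


example = (("a", "b"), ("c", "d"))  # (a -> b) -> (c -> d)
```

Applied to `example`, `to_nnf` yields the unflattened form of $(a$ AND $\neg b)$ OR $(\neg c$ OR $d)$ from the text.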
3 The basic falsifiability algorithm

We are now ready to introduce a basic falsifiability tester for Boolean formulas in PIF. The algorithm uses backtracking to find a falsifying solution, i.e. to find an assignment of values from $\{0, 1\}$ to the Boolean variables such that all backbone implicants evaluate to 1 while the rightmost variable evaluates to 0. If the rightmost variable $rightmost(F)$ of a formula $F$ does not appear as a rightmost variable in any backbone implicant, this goal is easily obtained by assigning 0 to $rightmost(F)$ and 1 to any other variable. However, if $rightmost(F)$ appears as a rightmost variable in any of the backbone implicants, we would falsify this backbone implicant by the trivial assignment, thereby satisfying the whole formula. Hence for any backbone implicant $F'$ that has $rightmost(F)$ or some other variable with value 0 as its rightmost variable, we must try to evaluate one of the backbone implicants $F''$ of $F'$ to 0, too. If there is no such $F''$, the search terminates without success, otherwise the algorithm proceeds by cutting off $F'$ from the formula and replacing $rightmost(F)$ by $F''$. This is illustrated by the drawing given below:
Figure 2

The runtime of this procedure can be improved by applying some simple modifications as the last substep in every backtracking step. In this improvement we can check whether any of the variables that occurred in $F'$ but not in $F''$ now occur with odd (even) left distance only. In this case we may safely set such a variable to 1 (0) and perhaps optimize some further formulas. Another possible optimization consists of testing whether some backbone implicant of $F''$ has size 1; in this case its corresponding variable must be set to 1 to falsify the formula. The latter test must be carried out anyway: if there is some backbone implicant $F'''$ of $F''$ of size 1, we may terminate the backtracking if the corresponding variable has been assigned the value 0, since such a formula cannot be falsifiable.
Each of these tasks can be carried out in constant time for formulas from many important subclasses of PIF, e.g. if:

- the number of occurrences of every variable is bounded by a constant,
- the length of every backbone (except the backbone of the input formula) is bounded by a constant.
This involves some preprocessing to construct suitable data structures which allow testing as well as updating in constant time (in the special cases) or in time depending on n (in the general case). The falsifying algorithm can now be formulated as follows:
Algorithm PIF_solve($F$, $Z$)
  Input: Boolean formula $F$ in PIF,
         a set $Z$ of variables that must be set to 0 to falsify the formula
         (initially $Z = \{rightmost(F)\}$)
  Output: A falsifying assignment, if one exists
  begin
    if no backbone implicant of $F$ contains a variable from $Z$
    then output $Z$ and exit
    else begin
      let $\mathcal{C} = \{F' \mid F'$ is backbone implicant of $F$, $rightmost(F') \in Z\}$
      select one backbone implicant $F'$ from $\mathcal{C}$
      let $F^s := F \setminus \{F'\}$  (delete $F'$ from $F$)
      for every backbone implicant $F''$ of $F'$ do
        let $F^t := F^s$ with $rightmost(F^s)$ replaced by $F''$
        call PIF_solve($F^t$, $Z \cup \{rightmost(F'')\}$)
      done
    end
  end
This algorithm systematically tests all possible ways to falsify its input $F$. We note that for non-falsifiable formulas (i.e. tautologies) the algorithm produces no output at all. The termination of the algorithm follows from the fact that if some variable in $Z$ is a backbone implicant of $F$ and we select this implicant of size 1 as $F'$, then PIF_solve does not get called recursively.
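The backtracking scheme of PIF_solve can be sketched in a few lines. The sketch rests on several assumptions that are not part of the paper: a formula $F_1 \rightarrow \ldots \rightarrow F_m \rightarrow z$ is encoded as the pair `(implicants, z)`, the termination test is read as "no backbone implicant has its rightmost variable in $Z$" (matching the definition of $\mathcal{C}$), the first implicant of $\mathcal{C}$ is selected, and the function returns the set $Z$ (or `None`) instead of printing it. None of the constant-time optimizations described above are included.

```python
def var(name):
    """A single variable: no backbone implicants, rightmost variable `name`."""
    return ((), name)


def imp(*parts):
    """imp(A, B, C) builds A -> (B -> C) in the (implicants, rightmost) encoding."""
    *front, last = parts
    last_implicants, z = last
    return (tuple(front) + last_implicants, z)


def evaluate(f, zeros):
    """Value of f when exactly the variables in `zeros` are 0, all others 1."""
    implicants, z = f
    return not (all(evaluate(g, zeros) for g in implicants) and z in zeros)


def pif_solve(implicants, Z):
    """Return a set of variables to set to 0 (all others 1), or None."""
    for i, (subs, y) in enumerate(implicants):
        if y in Z:  # this implicant F' is falsified unless some F'' of it is 0
            rest = implicants[:i] + implicants[i + 1:]
            for sub_implicants, w in subs:  # try every backbone implicant F''
                found = pif_solve(rest + sub_implicants, Z | {w})
                if found is not None:
                    return found
            return None
    return Z  # no implicant ends in a variable from Z


def falsify(f):
    implicants, z = f
    return pif_solve(implicants, frozenset([z]))


a, b, p, q, r, s = (var(x) for x in "abpqrs")
peirce = imp(imp(imp(a, b), a), a)                                   # a tautology
axiom = imp(imp(imp(p, q), r), imp(imp(r, p), imp(s, p)))            # axiom of [4]
```

For $(a \rightarrow b) \rightarrow b$ the sketch returns $\{a, b\}$, i.e. both variables are set to 0, while Peirce's law and the single axiom quoted in the introduction yield `None`.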
4 Runtime of PIF_solve

Since the falsifiability problem for general Boolean formulas is NP-complete, there is probably no polynomial bound on the runtime of the basic falsifiability algorithm applied to an arbitrary formula in PIF. In order to get polynomial runtime, we must restrict the class of allowed inputs. One possible approach consists of restricting the number of occurrences of any variable (except for the rightmost) to at most 2, while the rightmost variable may occur at most $k$ times. We call this class $Imp(2, k)$. For these formulas the following result has been shown in [3]:
Theorem 1 Let $F \in Imp(2, k)$ contain $n$ different variables. Then the runtime of the basic falsifiability algorithm is bounded by $O(n^k)$.
The runtime follows from the fact that the number of formulas in $\mathcal{C}$ (as defined in PIF_solve) never exceeds $k$, hence there are at most $O(n^k)$ possible choices for $\mathcal{C}$. Since any backtracking step removes at least one formula from $\mathcal{C}$ (it can also insert some), there are at most $O(n^k)$ backtracking steps.

If there is no restriction on the allowed inputs, PIF_solve can have a runtime that is exponential in the number of variable occurrences. To get a runtime that is exponential only in the number of variables, we must maintain the set $Z$ of variables that must not be assigned the value 0 to falsify the formula. A variable $z'$ is added to $Z$ when the formula turned out to be a tautology with $z' = 0$. We then only backtrack if the rightmost variable is not contained in $Z$, to avoid branching twice at the same variable. With this modification, PIF_solve has a runtime of $O(2^n)$, where $n$ is the number of variables in the input formula.
5 Simple reductions

As we have seen in Section 2.3, formulas in PIF can be transformed into NNF-formulas in a very simple manner. This immediately leads to the question of how the simple yet powerful reduction tools that exist for NNF-formulas, like the pure literal rule and the unit clause rule, can be adapted for PIF-formulas. We will see that both have easily recognizable counterparts in PIF. Even the concept of autarky, see [5], can be restated for PIF-formulas, yielding a powerful tool for cutting off large parts of the search tree.
5.1 Pure literals

A pure literal $l$ in an NNF-formula $F$ is a literal which occurs only positively or only negatively in the formula. It is obvious that if we want to satisfy the formula, this literal has to be set to 1, and if we want to falsify $F$, the literal must be set to 0. When transforming a PIF-formula $G$ into an equivalent NNF-formula $F$, the sign of the literals only depends on the left distance of the original variables in $G$. A pure literal $l$ in $F$ is generated iff either all occurrences of the variable corresponding to $l$ in $G$ have odd left distance, or all occurrences of the variable corresponding to $l$ have even left distance.
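A minimal sketch of this test collects, for every variable, the parities of the left distances of its occurrences. The encoding (a variable is a string, $A \rightarrow B$ is the pair `(A, B)`) and the function names are assumptions for illustration. Following the remark in Section 3, a variable occurring with odd left distance only may be set to 1 when falsifying, and one occurring with even left distance only may be set to 0.

```python
from collections import defaultdict


def parities(f, dist=0, seen=None):
    """Map each variable to the set of left-distance parities of its occurrences."""
    if seen is None:
        seen = defaultdict(set)
    if isinstance(f, str):
        seen[f].add(dist % 2)
    else:
        left, right = f
        parities(left, dist + 1, seen)
        parities(right, dist, seen)
    return seen


def pure_variables(f):
    """Map each pure variable to the value it may take when falsifying f."""
    return {
        v: 1 if p == {1} else 0
        for v, p in parities(f).items()
        if len(p) == 1
    }


example = (("a", "b"), ("b", "a"))   # (a -> b) -> (b -> a)
peirce = ((("a", "b"), "a"), "a")    # ((a -> b) -> a) -> a
```

In `example` both variables are pure, and the suggested values $a = 0$, $b = 1$ indeed falsify the formula; in `peirce` only $b$ is pure.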
5.2 Unit clauses

A unit clause in an NNF-formula is a clause consisting of only one literal. If the clauses are connected by AND and we want to satisfy the formula, the literal obviously has to be set to 1. As the transformation shows, clauses in NNF correspond to backbone-implicants in PIF. Positive unit clauses are simply backbone-implicants consisting of a single variable. In the process of falsifying the PIF-formula this variable must be set to 1. On the other hand, negative unit clauses cannot be recognized immediately, since negation is not directly possible in PIF. A backbone-implicant $(a \rightarrow b)$ generates a negative unit clause iff falsifying the formula implies that $b$ has to be set to 0. In this case, $(a \rightarrow b)$ would be equivalent to $(\neg a)$ in the transformed formula. The latter is automatically implied if $b$ is the rightmost variable in the PIF-formula.
5.3 Autarky

We call a (partial) truth-assignment $A$ autark for a formula $F$ in conjunctive normal form iff every clause that contains a literal which evaluates to true or false under $A$ gets satisfied by $A$. In this case we define $F(A)$ to be the set of clauses not satisfied by $A$; it is clear that $F(A)$ is satisfiable iff $F$ is.

The same effect exists for PIF-formulas. When falsifying a PIF-formula, every backbone-implicant has to be simplified to 1, since this is the only way for a 0 to propagate from the rightmost variable of the formula up the backbone to its root, thereby falsifying the whole formula. A truth-assignment causing this for every backbone-implicant it touches produces a falsifiability-equivalent subformula, resulting in the same implication as mentioned above for CNF-formulas. Autarky will later be used to solve 2-SAT, expressed in PIF, in polynomial time.
6 An enhanced PIF-Solver

PIF_solve can be seen as a prototype algorithm. The following enhancements are basically natural extensions that include the already mentioned variant of unit-resolution and the autarky tests. The algorithm exploits these enhancements in order to handle well-known but non-trivial input classes (e.g. transformed 2-SAT formulas) in polynomial time. For $F$ in PIF we call $BI(F) = \{B \in \mathcal{B}(F) : D_l(B) = D_l(F) + 1\}$ the set of backbone-implicants of $F$. Note that $BI(F)$ is empty iff $F$ consists of a single variable $x$, yet even in this case there is a rightmost variable $rightmost(F) = x$.
Algorithm ePIF-Solve($F$):
  Input: $F \in PIF$ using variables $Var$
  Output: the falsifying assignment, if $F$ is falsifiable; no output if it is not
  Remarks: uses a global variable $F_{orig}$

  Procedure solve($F$, $S_0$, $S_1$):
    Input: formula $F \in PIF$ over variables $Var$, the set $S_0 \subseteq Lit(Var)$ of false variables and the set $S_1 \subseteq Lit(Var)$ of true variables
    Output: the falsifying assignment, if $F$ is falsifiable; no output if it is not

    1. repeat the following reduction-steps as long as they apply
       (a) $\forall F' \in BI(F)$ with $F' = y$: (positive unit-resolution) $F := F \setminus F'$, $S_1 := S_1 \cup \{y\}$
       (b) $S_0 \cap S_1 \neq \emptyset$: (contradiction) leave solve
       (c) $\forall B \in \mathcal{B}(F)$ with $rightmost(B) \in S_1$: (leaf-reduction) remove $B$ from $F$
       (d) $\forall B \in \mathcal{B}(F)$ with $BI(B) \cap S_0 \neq \emptyset$: (implicant-reduction) remove $B$ from $F$
       (e) if $F$ is empty, (formula solved) print the solution $S_1 \cup S_0$ and halt
    2. if $F = x$, let $S_0 = S_0 \cup \{x\}$, print the solution $S_1 \cup S_0$ and halt
    3. suppose $F = F_0 \rightarrow \ldots F_j \rightarrow \ldots F_m \rightarrow x$
    4. if the global variable $F_{orig}$ hasn't been set yet, save $F$ in the global variable $F_{orig}$
    5. let $S_0 = S_0 \cup \{x\}$
    6. if $S_0 \cap \bigcup_i Leaves(F_i) = \emptyset$, let $S_1 := S_1 \cup \bigcup_i Leaves(F_i)$, print the solution $S_1 \cup S_0$ and halt
    7. choose (deterministically) a shortest $F_j \in BI(F)$ with $rightmost(F_j) \in S_0$
    8. suppose $F_j = F_{j,0} \rightarrow \ldots F_{j,k} \rightarrow y$
    9. for all $i \in \{0, \ldots, k\}$:
       (a) let $F'_i := F_0 \rightarrow \ldots F_{j-1} \rightarrow F_{j+1} \rightarrow \ldots F_m \rightarrow F_{j,i}$
       (b) if $k = 0$, ('neg. unit-res.') set $F = F'_0$ and goto step 1 of solve
       (c) call solve($F'_i$, $S_0$, $S_1$) recursively
    10. if $\forall F_i \in BI(F) : F_i \in BI(F_{orig})$, (test for autarky) halt

  1. if $F$ is empty, halt
  2. call solve($F$, $\emptyset$, $\emptyset$)
6.1 Correctness

Since the enhancements to the original algorithm apply locally, we only focus on the modifications and argue for their correctness. The rest immediately follows from the correctness of the original algorithm PIF_solve. If $F$ is empty, the algorithm stops correctly without output. Now let $F = F_0 \rightarrow \ldots \rightarrow F_m \rightarrow x \in PIF$ and suppose the correctness for formulas of smaller size has already been shown.

In step 1, we have added the well-known unit-resolution and two cut-reductions, introduced by van Gelder for AND-OR-trees in [7] (Dominance-Lemma 6.2), which we have adapted to the PIF-formalism. Step 1(a) contains the positive part of the unit-resolution, while step 9(b) can be interpreted as the negative part. In 1(a) we simply use the fact that $1 \rightarrow F' = F'$, in 9(b) that $(F' \rightarrow 0) \rightarrow 0 = F'$. The latter is not an exact mapping of unit-resolution to the PIF-calculus but essentially works in the same manner: the falsifiability of the whole formula $F$ depends on the falsifiability of the subformula $F'$, much like a negative unit-clause in a CNF-formula induces a zero assignment for its variable to make the whole formula satisfiable.

In step 1(b) we make an early test for contradiction. Since all variables so far included in $S_0$ must be set to 0 and all in $S_1$ to 1 to guarantee the falsifiability of the formula, having a variable in both sets obviously results in a contradiction.

In step 4 we save a normalized state of the input formula, so that we can apply a test for autarky in step 10. By doing this, we can ignore some nasty special cases while still working with a falsifiability-equivalent version of the input formula. If the test in step 10 succeeds, none of the formulas in step 9 were falsifiable, and since $F$ in this case is a subset of $F_{orig}$, i.e. falsifiability-equivalent to $F_{orig}$, this proves the input formula to be non-falsifiable.
6.2 Simple formulas

We will show that ePIF-Solve can handle 2-SAT if the formulas are transformed to PIF in the following way: let $z$ be a new variable and construct a PIF-formula with rightmost variable $z$ and backbone-implicants derived from the clauses of the CNF-representation, i.e.:

  CNF-clause               PIF-backbone-implicant
  $\{a\}$                  $a$
  $\{\neg a\}$             $a \rightarrow z$
  $\{a, b\}$               $(a \rightarrow z) \rightarrow b$
  $\{\neg a, b\}$          $a \rightarrow b$
  $\{\neg a, \neg b\}$     $a \rightarrow b \rightarrow z$
It can easily be seen that the resulting PIF-formula is falsifiable iff the corresponding CNF-formula is satisfiable.
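The equivalence can be checked by brute force on small instances. The sketch below makes several assumptions that are not part of the paper: clauses are tuples of non-zero integers (a negative integer denotes a negated variable), variable $i$ is named `x<i>`, and a formula $F_1 \rightarrow \ldots \rightarrow F_m \rightarrow r$ is encoded as the pair `(implicants, r)`. It does not implement ePIF-Solve, only the transformation and an exhaustive comparison.

```python
from itertools import product

Z = "z"  # the new rightmost variable


def var(name):
    return ((), name)


def clause_to_implicant(clause):
    """Translate a clause with one or two literals into a backbone-implicant."""
    pos = [f"x{l}" for l in clause if l > 0]
    neg = [f"x{-l}" for l in clause if l < 0]
    if len(clause) == 1:
        return var(pos[0]) if pos else ((var(neg[0]),), Z)   # a   or   a -> z
    if len(pos) == 2:
        return ((((var(pos[0]),), Z),), pos[1])              # (a -> z) -> b
    if len(pos) == 1:
        return ((var(neg[0]),), pos[0])                      # a -> b
    return ((var(neg[0]), var(neg[1])), Z)                   # a -> b -> z


def cnf_to_pif(clauses):
    return (tuple(clause_to_implicant(c) for c in clauses), Z)


def variables(f):
    implicants, r = f
    return {r}.union(*(variables(g) for g in implicants))


def evaluate(f, env):
    implicants, r = f
    return not (all(evaluate(g, env) for g in implicants) and not env[r])


def falsifiable(f):
    names = sorted(variables(f))
    return any(
        not evaluate(f, dict(zip(names, bits)))
        for bits in product([False, True], repeat=len(names))
    )


def satisfiable(clauses):
    names = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(names)):
        env = dict(zip(names, bits))
        if all(any(env[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False
```

For example, the clause set $\{a, b\}, \{\neg a, b\}, \{\neg b\}$ is unsatisfiable, and its PIF translation is a tautology.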
Theorem 2 Let $F \in PIF$ be an $n$-variable 2-SAT-instance transformed by the above transformation into PIF-notation. Then ePIF-Solve solves $F$ in time $O(n^2)$.
Proof. Since the transformation works on a clause basis, we will call the backbone-implicants in $F$ clauses for the sake of simplicity. When ePIF-Solve is called, first all the positive and negative unit-clauses in $F$ are eliminated by steps 1 and 9(b). While doing so, all other clauses that contain variables set in this process are reduced and interpreted along the way. As a result, the formula $F$ will only contain clauses of the form $a \rightarrow b \rightarrow z$, $(a \rightarrow z) \rightarrow c$ and $a \rightarrow b$ (where $z$ is the rightmost variable in $F$ and $a$, $b$, $c$ are some other variables) by the time the algorithm gets to step 9(c). Furthermore all variables in $F$, except for $z$, are not yet set and the clauses in $F$ are all original clauses from the normalized input formula $F_{orig}$, meaning that the assignment made so far is autark with respect to $F_{orig}$. So after all the calls in 9(c) have been made, the test in step 10 will stop the algorithm. Since in every recursive call at least one new variable is set and eliminated from the formula, the algorithm stops after $O(n)$ calls, with every call using at most $O(n)$ time. $\Box$
7 Other classes of formulas

Even though the falsifiability problem for a number of classes of Boolean formulas in PIF proved to be solvable in polynomial time, there are (not surprisingly) many classes for which the falsifiability problem turns out to be NP-complete, and the border between the two is sometimes very thin. One example is the set of formulas in PIF where any backbone implicant contains at most 3 variables, i.e. all backbone implicants must have one of the following shapes:
These formulas will be called PIFBBF3. When transforming 2-SAT to PIF, we also get formulas from PIFBBF3; however, when transforming a formula in 2-SAT to a formula $F$ in PIF, the shape in the middle always contains $rightmost(F)$ as its rightmost variable. When this restriction is removed, the falsifiability problem for these formulas becomes NP-complete:
Theorem 3 Let $\mathcal{F}$ be the set of Boolean formulas in PIF such that no backbone implicant of any formula in $\mathcal{F}$ contains more than 3 variables. Then the falsifiability problem for $\mathcal{F}$ is NP-complete.
Proof: The proof is done by reduction of the NP-complete 3-SAT problem for Boolean formulas in CNF. Let $F$ be an instance of the 3-SAT problem. Then the set of clauses in $F$ can be partitioned into 4 subsets, depending on the number of negated literals (0-3). Any clause can then be transformed into one or more implicational formulas such that any assignment satisfying the original clause can be extended to satisfy all implicational formulas simultaneously. The transformation is as follows:

  $a \vee b \vee c$ is replaced by $a' \rightarrow (b' \rightarrow c)$, $(a' \rightarrow z) \rightarrow a$, $a \rightarrow (a' \rightarrow z)$, $(b' \rightarrow z) \rightarrow b$, $b \rightarrow (b' \rightarrow z)$
  $\neg a \vee b \vee c$ is replaced by $a \rightarrow (b' \rightarrow c)$, $(b' \rightarrow z) \rightarrow b$, $b \rightarrow (b' \rightarrow z)$
  $\neg a \vee \neg b \vee c$ is replaced by $a \rightarrow (b \rightarrow c)$
  $\neg a \vee \neg b \vee \neg c$ is replaced by $a \rightarrow (b \rightarrow c')$, $(c' \rightarrow z) \rightarrow c$, $c \rightarrow (c' \rightarrow z)$
The resulting formula can be transformed back to a CNF-formula that is a conjunction of 2-SAT clauses and definite Horn clauses of length 3. This proves the following:

Corollary 1 Let $M$ be the set of all CNF-formulas that are conjunctions of clauses of length 2 (2-SAT) and definite Horn clauses of length 3. Then SAT is NP-complete for $M$.
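The clause-by-clause translation used in the proof of Theorem 3 can be checked exhaustively on small instances. The sketch below is an illustration under assumptions of our own: clauses are triples of non-zero integers over distinct variables (negative means negated), variable $i$ is named `x<i>`, its primed copy `x<i>'`, and a formula $F_1 \rightarrow \ldots \rightarrow F_m \rightarrow r$ is encoded as the pair `(implicants, r)`.

```python
from itertools import product

Z = "z"


def var(name):
    return ((), name)


def definitions(x):
    """(x' -> z) -> x and x -> (x' -> z): with z = 0 they force x' = NOT x."""
    xp = x + "'"
    return [((((var(xp),), Z),), x), ((var(x), var(xp)), Z)]


def clause_to_implicants(clause):
    """Translate one 3-literal clause following the four rows of the table."""
    neg = [f"x{-l}" for l in clause if l < 0]
    pos = [f"x{l}" for l in clause if l > 0]
    out = []
    if pos:
        *primed, head = pos          # the last positive literal is the consequence
    else:
        *neg, last = neg             # three negated literals: use c' as consequence
        primed, head = [], last + "'"
        out += definitions(last)
    for p in primed:
        out += definitions(p)
    body = tuple(var(n) for n in neg) + tuple(var(p + "'") for p in primed)
    out.append((body, head))
    return out


def cnf3_to_pif(clauses):
    return (tuple(g for c in clauses for g in clause_to_implicants(c)), Z)


def leaves(f):
    implicants, _ = f
    return 1 + sum(leaves(g) for g in implicants)


def variables(f):
    implicants, r = f
    return {r}.union(*(variables(g) for g in implicants))


def evaluate(f, env):
    implicants, r = f
    return not (all(evaluate(g, env) for g in implicants) and not env[r])


def falsifiable(f):
    names = sorted(variables(f))
    return any(
        not evaluate(f, dict(zip(names, bits)))
        for bits in product([False, True], repeat=len(names))
    )


# All eight sign patterns over three variables form an unsatisfiable 3-SAT instance.
unsat = [(s1 * 1, s2 * 2, s3 * 3) for s1, s2, s3 in product((1, -1), repeat=3)]
sat = [(1, 2, 3), (-1, -2, -3)]
```

Every backbone implicant produced this way has at most 3 variable occurrences, in line with the restriction of Theorem 3.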
References

[1] J. Franco, J. Goldsmith, J. Schlipf, and E. Speckenmeyer. An algorithm for the class of pure implicational formulas. Working Paper, 1996.
[2] G. Gallo and M.G. Scutella. Polynomially Solvable Satisfiability Problems. Information Processing Letters, 29(5):221-227, 1988.
[3] P. Heusch. The Complexity of the Falsifiability Problem for Pure Implicational Formulas. In Proc. 20th Conf. Math. Foundations of Computer Science 1995 (MFCS '95), pages 221-227. Springer Verlag (LNCS 969), 1995.
[4] J. Łukasiewicz. The shortest axiom of the implicational calculus of propositions. Proc. Irish Acad., 52:25-33, 1948.
[5] B. Monien and E. Speckenmeyer. Solving Satisfiability in Less than $2^n$ Steps. Discrete Applied Math., 10(3):287-295, 1985.
[6] A. Tarski and J. Łukasiewicz. Untersuchungen über den Aussagenkalkül. Comptes Rendus Soc. Sci. Varsovie Classe III, 23:30-50, 1930.
[7] A. van Gelder. A Satisfiability Tester for Non-clausal Propositional Calculus. Information and Computation, 79(1):1-21, 1988.