The Complexity of the Boolean Formula Value Problem

Henning Schnoor

September 9, 2005

Abstract

We examine the complexity of the formula value problem for Boolean formulas, which is the following decision problem: Given a Boolean formula without variables, does it evaluate to true? We show that the complexity of this problem is determined by certain closure properties of the connectives allowed to build the formula, and achieve a complete classification: The formula value problem is either in LOGTIME, complete for one of the classes NLOGTIME, coNLOGTIME or NC^1, or equivalent to counting modulo 2 under very strict reductions.
1 Introduction
Boolean formulas have played an important role in complexity theory for a long time. Many standard complexity classes are related to Boolean formulas in some way, mostly by different flavours of satisfiability problems, and many of the key questions in complexity theory can be formulated in this context. Asking the famous "Is P = NP?" question is the same as asking "Is there a polynomial-time satisfiability test for Boolean formulas?" [Coo71]. Standard Boolean formulas are built using only the connectives ∧, ∨ and ¬. A more general definition allows arbitrary finitary Boolean functions as connectives. Emil Post [Pos41] examined the expressive power of these generalized formulas; his classification led to a number of interesting complexity results (see [BCRV03] for an overview). Many problems connected to Boolean formulas can be parameterized by restricting the set of connectives allowed to build the formula. In most cases it turns out that the complexity of the problem at hand only depends on the closure properties of the connectives, i.e. on the generated clone in the "Post sense". An example of a classification achieved in this way is the satisfiability problem, which is dichotomic: depending on the connectives, it is either solvable in polynomial time or NP-complete [Lew79]. Other problems studied in this way are the circuit value problem [RW00], which is related to the formula value problem discussed in this work, the equivalence problem for Boolean formulas and circuits, truth evaluation for quantified Boolean circuits, and optimization problems like maximum lexicographic satisfiability [Rei01]. The relative complexity of the clones has been studied in [BS05]. The formula value problem, which is the problem of determining the truth value of a variable-free Boolean formula, is perhaps the most basic problem in this context. It plays a crucial role in most algorithms for the aforementioned problems. Similar to the problems mentioned above, the formula value problem can be restricted to formulas allowing only certain sets of connectives. In
this work we show that the complexity of this problem only depends on the clone generated by the connectives, and achieve a complete classification. It is obvious that the formula value problem is always solvable in polynomial time. In [Bus87], Buss showed that the formula value problem can be solved in NC^1 when we only allow ∨, ∧ and ¬ as connectives and the formulas are given in infix notation. From Beaudry and McKenzie's work in [BM95], it follows directly that the problem remains in NC^1 if we restrict ourselves to binary connectives and the formulas are presented in infix notation. In this paper we generalize these results. Since we do not restrict ourselves to binary operations, we consider formulas in prefix notation, and show that for any set of connectives, the formula value problem still lies in NC^1. In its most general form, the problem is complete for this class. A contribution of this work is the generalization of a lemma from [Lew79]: for "most" sets B which can be used to build the important functions AND, OR and negation, we can find "short" formulas representing these connectives. This fact is very useful when considering the complexity of problems related to Post's lattice, since it allows certain constructions without having to worry about a combinatorial "explosion". Technically, the most interesting results in this paper are the aforementioned short formulas and the logtime reduction representing an alternating Turing machine as a formula in prefix notation.
2 Preliminaries

2.1 Propositional formulas
A standard way of representing Boolean functions is to write them as propositional formulas, like x ∨ (y ∧ z). Usually, only ∨, ∧, and negation are used in formulas, since every Boolean function can be represented using these three connectives. In this paper, we examine more general propositional formulas, using arbitrary functions from a set B as connectives, so-called B-formulas. Usually, B-formulas are defined as a special case of a Boolean circuit, where each gate has a fan-out of at most one. In this definition, it is possible to have "syntactically irrelevant" components, i.e. gates which are not connected to the output gate. As long as we consider only complexity classes above L, this does not add to the complexity of any problem we want to study, since we can determine in logarithmic space [Rei04] whether a given gate in a circuit or formula is connected to the output gate. For the formula value problem, we are interested in lower complexity classes in which this test cannot be performed. Therefore, we consider formulas in their more natural representation, prefix notation; in our general setting allowing operations with arbitrary arity, the usually studied infix notation cannot be applied. For a set B of functions on a finite domain D, we define B-formulas as usual: Every variable x is a B-formula. If g is a symbol for an n-ary function from B, and f1, ..., fn are B-formulas, then g f1 ... fn is a B-formula. A formula f(x1, ..., xn) is a formula in which the occurring variables form a subset of {x1, ..., xn}. For a formula f(x1, ..., xn) and formulas t1, ..., tn, let f[x1/t1, ..., xn/tn] be the formula obtained from f by simultaneously replacing every occurrence of xi with ti. We say a formula f(x1, ..., xn) represents the n-ary function g if f[x1/α1, ..., xn/αn] evaluates to g(α1, ..., αn) for all α1, ..., αn ∈ D. A variable-free B-formula is a B-formula in which all variables are replaced with constants from D. These constants are called input values for the formula. If D is the Boolean domain, then we say a variable-free B-formula is true if it evaluates to 1 in the usual sense.
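To make the prefix representation concrete, the following is a minimal sketch (ours, not part of the paper) of a recursive evaluator for variable-free B-formulas over the Boolean domain; the connective set B used here, with one-character symbols for AND, OR, negation and the identity, is a hypothetical example.

```python
# Minimal sketch (ours): evaluating a variable-free B-formula in prefix
# notation by recursive descent. The connective set B is a hypothetical
# example; each symbol maps to (arity, function).
B = {
    "&": (2, lambda x, y: x & y),   # binary AND
    "|": (2, lambda x, y: x | y),   # binary OR
    "~": (1, lambda x: 1 - x),      # negation
    "i": (1, lambda x: x),          # unary identity
}

def evaluate(formula, pos=0):
    """Evaluate the subformula starting at index pos; return (value, next_pos)."""
    symbol = formula[pos]
    if symbol in "01":                       # a constant input value
        return int(symbol), pos + 1
    arity, func = B[symbol]
    args, pos = [], pos + 1
    for _ in range(arity):                   # read the arity-many argument subformulas
        value, pos = evaluate(formula, pos)
        args.append(value)
    return func(*args), pos

# Example: "&|10~0" stands for AND(OR(1,0), NOT(0)) and evaluates to 1.
print(evaluate("&|10~0")[0])
```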
We now define the formula value problem for a set B of Boolean functions:

VAL^F(B) = {f | f is a variable-free B-formula and f is true}.

We will assume that our sets B contain only functions that have no irrelevant variables (for a function f(x1, ..., xn), the variable xi is irrelevant if there is some function g such that f(x1, ..., xn) = g(x1, ..., xi−1, xi+1, ..., xn)). If we allowed irrelevant variables, our algorithms would need to verify the relevance of a given input value, and the resources needed for this operation would dominate the complexity of the decision problem. We also assume that there is a symbol for the unary identity in B, and finally we assume the input for our algorithms to be a correct B-formula. Verifying this essentially requires recursively counting the number of arguments to a function, and counting is complete for the class TC^0; in most cases, the actual test lies in a lower complexity class. So, strictly speaking, we consider the formula value problem as a promise problem, with the syntactical correctness of the input as the promise.
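The syntactic-correctness test mentioned above amounts to checking that every connective receives exactly as many arguments as its arity demands. A single-pass sketch (ours, with a hypothetical connective set) using a pending-argument counter:

```python
# Sketch (ours): checking that a string is a syntactically correct
# variable-free B-formula in prefix notation. ARITY maps each connective
# symbol of a hypothetical set B to its arity; "0" and "1" are the constants.
ARITY = {"&": 2, "|": 2, "~": 1, "i": 1}

def well_formed(formula):
    pending = 1                     # one complete (sub)formula still has to be read
    for symbol in formula:
        if pending == 0:            # trailing symbols after a complete formula
            return False
        if symbol in "01":
            pending -= 1            # a constant closes one pending subformula
        elif symbol in ARITY:
            pending += ARITY[symbol] - 1   # fills one slot, opens arity-many new ones
        else:
            return False            # unknown symbol
    return pending == 0

print(well_formed("&|10~0"))   # True
print(well_formed("&|10~"))    # False: the negation is missing its argument
```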
2.2 Boolean functions and clones
A set B of Boolean functions is called a clone if it contains all identity functions and is closed under permutation, identification of variables, and arbitrary composition. It is easy to see that the set of clones forms a lattice, which has been completely classified by Emil Post [Pos41] for the Boolean case. The Boolean clones are also known as Post's lattice. For a set B of Boolean functions, let [B] be the smallest clone containing B. It is clear that [B] is the set of Boolean functions which can be represented by Boolean formulas using only connectives from B. We introduce a few properties of Boolean functions which define the relevant clones for this work. As we will see later, we are only interested in clones which contain both constant Boolean functions. The smallest of these clones is called I; this clone contains only the constants and the projection functions, i.e. functions of the form f(x1, ..., xn) = xi for some i ∈ {1, ..., n}. A Boolean function f is linear if it can be written as f(x1, ..., xn) = c ⊕ x_{i_1} ⊕ · · · ⊕ x_{i_k}, where c ∈ {0, 1} and i_1, ..., i_k ∈ {1, ..., n}. The set of all linear Boolean functions is a clone, which we will call L. The clone of all Boolean functions that can be written using only disjunction and constants is called V. It contains those Boolean functions f which can be written as f(x1, ..., xn) = c ∨ x_{i_1} ∨ · · · ∨ x_{i_k}, where c ∈ {0, 1} and i_1, ..., i_k ∈ {1, ..., n}. Similarly, the clone E contains the Boolean functions which can be written as conjunctions of variables and constants. The clone N consists of the projection functions, their negations, and both constants. We call f monotone if x1 ≤ y1, ..., xn ≤ yn implies f(x1, ..., xn) ≤ f(y1, ..., yn). The clone of all monotone functions is called M. Finally, the clone BF contains all Boolean functions. Figure 1 shows the inclusion structure of these clones. It follows from Post's work that this is a complete list of clones containing both constant functions.

[Figure 1: Clones with constants]
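The properties defining the clones M and L can be tested directly from a function's truth table. The following sketch (ours, not part of the paper) checks monotonicity and linearity for an n-ary Boolean function given as a Python callable:

```python
# Sketch (ours): testing the defining properties of the clones M and L
# for an n-ary Boolean function f given as a Python callable returning 0 or 1.
from itertools import product

def is_monotone(f, n):
    # f is monotone iff flipping any argument from 0 to 1 never decreases the value.
    for args in product((0, 1), repeat=n):
        for i in range(n):
            if args[i] == 0:
                larger = args[:i] + (1,) + args[i + 1:]
                if f(*args) > f(*larger):
                    return False
    return True

def is_linear(f, n):
    # f is linear iff f(x1,...,xn) = c XOR x_{i1} XOR ... XOR x_{ik}.
    c = f(*([0] * n))
    # the coefficient of x_i is f(e_i) XOR c, where e_i is the i-th unit vector
    coeff = [f(*[1 if j == i else 0 for j in range(n)]) ^ c for i in range(n)]
    for args in product((0, 1), repeat=n):
        value = c
        for a, w in zip(args, coeff):
            value ^= a & w
        if value != f(*args):
            return False
    return True

maj = lambda x, y, z: 1 if x + y + z >= 2 else 0   # majority: monotone, not linear
print(is_monotone(maj, 3), is_linear(maj, 3))      # True False
```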
2.3 Logtime reductions
A deterministic logtime Turing machine has access to the input via an index tape, on which it writes a number j; when it enters a query state, it receives the j-th bit of the input string at cost 1. The index tape is not erased after the query. The class LOGTIME contains all decision problems which can be solved by such a machine in logarithmic time. The class Σ_0^R contains the decision problems which can be solved in this way with the additional restriction that the machine reads only one bit of its input. We introduce reductions which are suitable for very low complexity classes. In our context, the appropriate notion is that of logtime-uniform projections, as introduced in [RV97], which were defined to formulate the "sharpest practical notion of reducibility". For our purposes, the interesting properties of this reduction are that it is transitive, that our relevant complexity classes are closed under it, and that it captures the idea of a "finite replacement reduction". For a formal definition and a detailed explanation of these concepts, see [RV97]. Note that all of the reductions appearing in the present work can also be computed by a Mealy automaton.
3 Results

3.1 Main Theorem
Before we state our main result, we note that we can "get constants for free" in our sets B: since there is no difference between allowing constants in our formulas and fixing given input values to these constants, we get the following trivial proposition:

Proposition 3.1 Let B be a finite set of Boolean functions. Then VAL^F(B ∪ {0, 1}) ≤^{dlt}_{proj} VAL^F(B).
This reduces the number of cases we need to consider from the infinitely many classes in Post's lattice to the seven shown in Figure 1. Now the following theorem completely classifies the complexity of the formula value problem:

Theorem 3.2 Let B be a finite set of Boolean functions such that B contains both constant functions.
- If [B] = I, then VAL^F(B) ∈ Σ_0^R.
- If [B] = V, then VAL^F(B) is complete for NLOGTIME under ≤^{dlt}_{proj} reductions.
- If [B] = E, then VAL^F(B) is complete for coNLOGTIME under ≤^{dlt}_{proj} reductions.
- If [B] ∈ {N, L}, then VAL^F(B) is equivalent to MOD2 under ≤^{dlt}_{proj} reductions.
- If [B] ∈ {M, BF}, then VAL^F(B) is complete for NC^1 under deterministic logtime reductions.

We will prove the theorem with the lemmas in this section.
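For reference, Theorem 3.2 can be read as a simple lookup table, indexed by the clone generated by B together with both constants. The sketch below (ours) only records the statement of the theorem; how the clone of a given B is determined (via Post's lattice) is outside its scope.

```python
# Sketch (ours): Theorem 3.2 as a lookup table. The argument is the clone
# [B ∪ {0,1}]; computing it from a concrete B is assumed to happen elsewhere.
COMPLEXITY_OF_VAL = {
    "I":  "in Sigma_0^R",
    "V":  "NLOGTIME-complete",
    "E":  "coNLOGTIME-complete",
    "N":  "equivalent to MOD2",
    "L":  "equivalent to MOD2",
    "M":  "NC^1-complete",
    "BF": "NC^1-complete",
}

def classify(clone):
    """Return the complexity of VAL^F(B) for a set B with [B ∪ {0,1}] = clone."""
    return COMPLEXITY_OF_VAL[clone]

print(classify("V"))   # NLOGTIME-complete
```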
3.2 Constructing short formulas
Even when we know that functions like ∨ or ∧ can be represented as B-formulas for some set B of Boolean functions, we need "short" formulas representing these connectives to prevent an "explosion" in some constructions when we prove hardness results. In general, this cannot be guaranteed (expressing x1 ⊕ x2 ⊕ · · · ⊕ xn with ∧, ∨, ¬ leads to exponential length). To avoid this problem, we use "short" formulas: for a function f(x1, ..., xn), we say a finite set B of functions efficiently implements f (via g) if there is a B-formula g(x1, ..., xn) such that g represents f and each of the variables x1, ..., xn occurs in g exactly once. Lewis showed [Lew79] that complete sets B of Boolean functions efficiently implement ∨, ∧, and ¬. We show that in the clones we are interested in, we can always get short formulas for our relevant functions:

Lemma 3.3 Let B be a finite set of Boolean functions such that 0, 1 ∈ B.
1. If [B] ∈ {V, M}, then B efficiently implements ∨.
2. If [B] ∈ {E, M}, then B efficiently implements ∧.
3. If [B] = L, then B efficiently implements ⊕.
4. If N ⊆ [B], then B efficiently implements ¬ via some formula f. If [B] ⊆ L, then f can be chosen in such a way that the variable x occurs in f as the last symbol.
5. If [B] = BF, then B efficiently implements ∨ and ∧.

Proof 1. Since ∨ ∈ V_2 ⊆ [B], there is a B-formula f(x, y) such that f represents x ∨ y and has a minimal number of occurrences of x and y. Let n be the number of occurrences of x, and m the number of occurrences of y in f. Without loss of generality, let m ≤ n, and assume n ≥ 2. Let f_#(x1, ..., xn, y1, ..., ym) be the formula obtained from f by numbering the variable occurrences, i.e. renaming the i-th occurrence of x to xi and accordingly for y. Since 1 ∈ B, we can construct a B-formula equivalent to f′(x, y) := f_#[x1/1, x2/x, ..., xn/x, y1/y, ..., ym/y]. The minimality of f implies that f′(x, y) does not represent x ∨ y. Since [B] ⊆ M, the function represented by f_# is monotone, and thus f′(x, y) ≥ x ∨ y holds. Since f′ and x ∨ y can only differ at the input (0, 0), it follows that f′(x/0, y/0) = 1, and therefore f′(x/α, y/β) = 1 for all Boolean values α and β. In particular,

    f′(x/0, y/0) = f_#[x1/1, x2/0, ..., xn/0, y1/0, ..., ym/0] = 1,

and since f_# is monotone, this implies

    f_#[x1/1, x2/α2, ..., xn/αn, y1/β1, ..., ym/βm] = 1        (1)

for all α2, ..., αn, β1, ..., βm ∈ {0, 1}. Since 0 ∈ B, we can construct the B-formula f″(x, y) := f_#[x1/x, x2/0, x3/x, ..., xn/x, y1/y, ..., ym/y].
Observe that the following holds:

    f″(x/1, y/0) = f_#[x1/1, x2/0, x3/1, ..., xn/1, y1/0, ..., ym/0] = 1   (by equation (1))
    f″(x/1, y/1) ≥ f″(x/1, y/0) = 1   (the function represented by f″ is monotone)
    f″(x/0, y/1) = f(x/0, y/1) = 1   (by choice of f)
    f″(x/0, y/0) = f(x/0, y/0) = 0   (by choice of f)
Thus, f″ represents the OR function, and f″ contains one variable occurrence less than f, which is a contradiction to the minimality of f. Therefore, f contains only two variable occurrences.

2. This follows with an analogous proof. It also follows from the duality implicit in Post's lattice.

3. Since [B] ⊇ L_0, there is a B-formula f(x, y) such that f represents x ⊕ y. Since [B] ⊆ L, we know that f_#(x1, ..., xn) represents a function of the form c ⊕ x_{i_1} ⊕ · · · ⊕ x_{i_k} for some c ∈ {0, 1} and i_1, ..., i_k ∈ {1, ..., n}. It is obvious that by replacing all but two of the variables with zeroes, we end up with a formula for ⊕ having exactly one occurrence of each variable.

4. The existence of the formula f follows with Lemma 1 from [Lew79]. Lewis only states the result for complete sets B, but the proof only makes use of the fact that negation and both constants can be expressed with B-formulas. Now observe that if [B] ⊆ L holds, then every B-formula represents a function which is symmetric in all of its relevant arguments (observe that all arguments of all functions in B are relevant). Therefore, the only variable x in f can be moved to the end of the formula by swapping arguments, which does not change the value of the represented function.

5. This is Lemma 2 from [Lew79]. □
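Whether a concrete candidate formula efficiently implements a function can be checked mechanically: verify that each variable occurs exactly once and that the formula represents the target function on all assignments. A brute-force sketch (ours, with a hypothetical connective set and the OR function as target):

```python
# Sketch (ours): checking that a candidate B-formula g(x, y), given in prefix
# notation, "efficiently implements" OR: it represents x OR y and contains
# each variable exactly once. B is a hypothetical set symbol -> (arity, function).
from itertools import product

B = {"&": (2, lambda a, b: a & b), "|": (2, lambda a, b: a | b),
     "~": (1, lambda a: 1 - a), "i": (1, lambda a: a)}

def evaluate(formula, assignment, pos=0):
    symbol = formula[pos]
    if symbol in assignment:                      # a variable
        return assignment[symbol], pos + 1
    if symbol in "01":                            # a constant
        return int(symbol), pos + 1
    arity, func = B[symbol]
    args, pos = [], pos + 1
    for _ in range(arity):
        value, pos = evaluate(formula, assignment, pos)
        args.append(value)
    return func(*args), pos

def efficiently_implements_or(g, variables=("x", "y")):
    if any(g.count(v) != 1 for v in variables):   # each variable exactly once
        return False
    for vals in product((0, 1), repeat=len(variables)):
        assignment = dict(zip(variables, vals))
        if evaluate(g, assignment)[0] != max(vals):   # max of 0/1 values is OR
            return False
    return True

print(efficiently_implements_or("|xy"))    # True
print(efficiently_implements_or("|x|yy"))  # False: y occurs twice
```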
3.3 General upper bound
In this section we show that the formula value problem is always solvable in NC^1. Beaudry and McKenzie showed that this holds for sets B containing only binary functions and formulas in infix notation. We show that our problem can be reduced to this form. We quote the following theorem (Proposition 3.2 from [BM95]):

Theorem 3.4 ([BM95]) The problem of evaluating a formula in infix notation, over a fixed algebra, is in NC^1.

Now the general upper bound follows with a reduction from the general to the binary case. For a function f, ar(f) is its arity. For a finite set B of functions, let ar(B) denote the maximal arity of a function in B.

Theorem 3.5
1. For any finite set B of functions over a finite domain D such that ar(B) ≥ 3, VAL^F(B) reduces to VAL^F(B′) for a finite set B′ of functions over a domain of size |D|^2 + |D|, where |B′| = |B| + 1 and ar(B′) = ar(B) − 1. The reduction can be computed in NC^1.
2. Let B be a finite set of Boolean functions. Then VAL^F(B) ∈ NC^1.
Proof 1. Let D′ := D ∪ (D × D), and define a new binary operation ∗ on D′: x1 ∗ x2 := (x1, x2) for x1, x2 ∈ D; ∗ is defined arbitrarily otherwise. For any operation f ∈ B such that ar(f) ≥ 3, define an (ar(f) − 1)-ary operation gf on D′ as follows: gf(x1 ∗ x2, x3, ..., xn) := f(x1, x2, ..., xn). For arguments not of this form, gf is defined arbitrarily. In a given formula, we replace f f1 f2 ... fk with gf ∗ f1 f2 f3 ... fk. Obviously, the resulting formula is equivalent to the original, and the construction can be done in NC^1.

2. Let k := ar(B). If k > 2, then k − 2 applications of part 1 reduce the problem to one involving only binary operators. As in the proof of Proposition 3.1 in [BM95], we can further reduce the problem and assume we only have one binary and no unary operators. To convert this to infix notation, note that a term f1 t1 t2 will be converted to (t1′ f1 t2′), where t1′ and t2′ are the infix representations of t1 and t2. For any term t, let ext(t) be the length of the infix representation of t. It holds that ext(t) = |t| + 2 · t_f, where t_f is the number of function symbols occurring in t, because of the added parentheses. The main operation required to determine the infix representation of a given formula is to determine the position of a function symbol f (the order of the constants is the same in the infix as in the prefix notation). Let f be the symbol for a binary connective appearing in ϕ, and let t1 and t2 be the terms representing the arguments passed to f in the formula ϕ. To determine the position of f in the infix representation ϕ′, the main operation required is to calculate the position of the last character of t2, which essentially requires counting to determine the arguments of each binary operation: we set a counter to 2, increment it by 1 whenever we find a binary operation symbol, and decrement it by 1 whenever we find a constant. Since counting is in TC^0 ⊆ NC^1, this can be done in NC^1. Now the position of f in the formula ϕ′ is its position in ϕ plus ext(t1) plus the number of parentheses introduced in the part of ϕ′ before the occurrence of t1, which can be determined similarly by counting operation symbols and checking whether their range extends beyond the position of f: if it does, add 1 for the opening parenthesis; if it does not, add 2 for the opening and closing parentheses. To determine this, the range of every operation symbol has to be checked, which can be done in the same way as for f outlined above. The resulting formula in infix notation can be evaluated in NC^1 due to Theorem 3.4. Since B, and therefore k, does not depend on the input formula, we only need a constant number of reductions for every input. Therefore, it follows from Theorem 3.4 that VAL^F(B) ∈ NC^1. □
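For intuition, the conversion from prefix to infix notation used in the proof can also be written as a plain recursive pass. The sketch below (ours) runs in linear time and only illustrates the target shape (t1′ f t2′) with its added parentheses; the proof needs the more careful position-counting argument to stay within NC^1.

```python
# Sketch (ours): converting a formula over a single binary connective "f" and
# the constants 0/1 from prefix to infix notation by straightforward recursion.
def prefix_to_infix(formula, pos=0):
    """Return (infix_string, next_position) for the term starting at pos."""
    symbol = formula[pos]
    if symbol in "01":                       # constants keep their relative order
        return symbol, pos + 1
    # symbol is the binary connective: read its two argument terms
    left, pos = prefix_to_infix(formula, pos + 1)
    right, pos = prefix_to_infix(formula, pos)
    return "(" + left + symbol + right + ")", pos

# "ff101" is f(f(1,0),1); its infix form is ((1f0)f1), of length |t| + 2*t_f = 5 + 4.
print(prefix_to_infix("ff101")[0])
```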
3.4 Classification
The "easiest" non-constant function is the identity. Formulas which only use the identity as a connective are very simple to evaluate:

Lemma 3.6 Let B be a finite set of Boolean functions such that [B] = I. Then VAL^F(B) ∈ Σ_0^R.
Proof Since B only contains the unary identity and constants, a variable-free B-formula is of the form id ... id c for a constant c. Thus the algorithm just has to look at the last character of the formula to decide its truth value. □

The problem is a bit more difficult if we consider OR-formulas. These are true whenever some argument is 1, but we need nondeterminism to find this occurrence.

Lemma 3.7 Let B be a finite set of Boolean functions such that [B] = V. Then VAL^F(B) is complete for NLOGTIME. If [B] = E, then VAL^F(B) is complete for coNLOGTIME.

Proof Membership in NLOGTIME is clear: the nondeterministic logarithmic-time algorithm just has to guess the position of one input value which is set to 1 to verify that the formula holds. Since the functions in B have no irrelevant variables and represent disjunctions, the formula evaluates to true in this case. For the hardness result, we reduce from {0, 1}*1{0, 1}*, which is complete for NLOGTIME. Let f be a B-formula such that f(x, y) is equivalent to x ∨ y and x and y occur in f exactly once, say f = f^B x f^M y f^E. Since functions from B are commutative, we can achieve, by swapping arguments, that f^E is empty. Let c1 c2 ... cn be some string from {0, 1}*. Now let ϕ := f^B c1 f^M f^B c2 f^M ... f^B c_{n−1} f^M c_n. It is obvious that this can be computed by a uniform logtime projection and that ϕ is true if and only if one of the ci's is set to 1. For the E case, the proof is analogous. □

The next case we consider is formulas representing linear functions. It is evident that evaluating these is essentially counting modulo 2, which is the problem MOD2. The following lemma states that these problems are equivalent even when considering uniform logtime projections.

Lemma 3.8 Let [B] ∈ {N, L}. Then VAL^F(B) is equivalent to MOD2 under ≤^{dlt}_{proj} reductions.

Proof Because of Lemma 3.3, we know that there is a B-formula f¬ such that f¬ represents ¬x and x occurs in f¬ exactly once, namely as the last symbol of the formula. Let (α1, ..., αn) ∈ {0, 1}^n be an instance of MOD2. We construct a B-formula f as follows:

    f = f1 f2 ... fn 0, where fi := id if αi = 0, and fi := f¬ if αi = 1.

This formula can be computed by a deterministic logtime projection, since the reduction is only a finite replacement table. Obviously, the formula evaluates to true if and only if an odd number of the αi's is 1. We now reduce VAL^F(B) to MOD2: observe that each function f(x1, ..., xn) ∈ [B] is associative and equivalent to c ⊕ x_{i_1} ⊕ · · · ⊕ x_{i_k} for some c ∈ {0, 1} and some i_1, ..., i_k ∈ {1, ..., n}. Thus, evaluation of a given B-formula corresponds to counting, modulo 2, the input values and constants c which are set to 1. This is an application of a MOD2-circuit. □
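The reduction from MOD2 in Lemma 3.8 is just a symbol-by-symbol replacement. A sketch (ours), using the hypothetical one-character stand-ins 'i' for the identity symbol and '~' for the negation formula f¬ (in general f¬ is a longer fixed string with x as its last symbol, which does not change the construction):

```python
# Sketch (ours): the finite replacement table behind the reduction from MOD2
# to VAL^F(B) in Lemma 3.8. "i" stands for the identity symbol and "~" for
# the B-formula representing negation; both are hypothetical stand-ins.
def mod2_to_formula(bits):
    # bit 0 -> identity, bit 1 -> negation; the innermost argument is the constant 0
    return "".join("~" if b else "i" for b in bits) + "0"

def evaluate(formula):
    value = int(formula[-1])              # innermost constant
    for symbol in reversed(formula[:-1]):
        if symbol == "~":                 # each negation flips the value once
            value = 1 - value
    return value

bits = [1, 0, 1, 1]                        # three ones, so odd parity
print(mod2_to_formula(bits))               # ~i~~0
print(evaluate(mod2_to_formula(bits)))     # 1, the parity of the input
```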
The following proof uses the idea of Theorem 9(a) in [Bus87] to express alternating Turing machines as Boolean formulas:

Lemma 3.9 Let B be a finite set of Boolean functions such that [B] ∈ {M, BF}. Then VAL^F(B) is complete for NC^1 under deterministic logtime reductions.

Proof VAL^F(B) ∈ NC^1 because of Theorem 3.5. With Lemma 3.3, we obtain formulas f∨(x, y) and f∧(x, y) for x ∨ y and x ∧ y with each of the variables occurring only once. By padding the formulas with symbols for the identity, we can achieve that these formulas are of the following form: f∨ = f∨^B x f∨^M f∨^N y f∨^E and f∧ = f∧^B x f∧^M f∧^N y f∧^E, where |f∨^B| = |f∨^M| = |f∨^N| = |f∨^E| = |f∧^B| = |f∧^M| = |f∧^N| = |f∧^E|. In the same way, we can get formulas f0 and f1 representing the constants, split up into these parts but without occurrences of x and y. We also assume that the formula parts have a length which is a power of 2, and consider them as having length 1 for the rest of the proof: since division by and multiplication with a power of 2 is easy, the position calculation in the formula we want to construct can still be performed by a logtime machine. Note that these formulas are not necessarily syntactically correct, since the symbol for the unary identity can appear as the last character of the formula. However, to avoid this problem we just have to ensure that in the complete formula we construct for the reduction, the last symbol is different from the identity.

Let L be a language in NC^1 = ALOGTIME, and let M be an alternating Turing machine accepting L in logarithmic time. Without loss of generality, assume that M branches binarily in every non-halting configuration. The execution tree of the machine M on a given input x directly corresponds to a Boolean formula using only AND (universal configurations), OR (existential configurations) and constants (halting configurations). The formula we want to construct is in prefix notation, which corresponds to a tree-walk on the configurations of M using depth-first search. More precisely: the root of this tree consists of nodes M1 and N1 representing the two middle parts of the outermost formula. It has four children: the leftmost is a node B1 representing the beginning of the formula, the second one, ϕ1, represents the first argument, the third child, ϕ2, is the second argument, and the right-most child, E1, represents the ending part of the formula. The order of the tree-walk is the same as the order in which these parts of the formula appear when written out: B1 ϕ1 M1 N1 ϕ2 E1. The formulas ϕ1 and ϕ2 represent subtrees constructed in the same manner.

[Figure 2: Example tree of height 3, with parts B1, M1, N1, E1 at the root, parts Bi, Mi, Ni, Ei in the subtrees, and the numbers 1 to 28 giving the order in which the parts appear in the formula.]

Figure 2 is an example of such a tree with height 3. Here B1 refers to the beginning part of the first (outermost) formula; this is f∨^B if the first configuration of the alternating machine is existential, and f∧^B if it is universal. M1 and N1 refer to the two middle parts and E1 to the end part. The numbers 1 to 28 indicate the order in which the parts appear in the formula. It is obvious that the formula constructed in this way is true if and only if x ∈ L. For a canonical logtime reduction, it would be necessary to compute, on input x and n, the n-th bit of the formula. The main computational problem to solve here is to determine how the node n is labeled in the tree outlined above. The algorithm presented in Figure 3 calculates, on
input h (height of the tree) and n, a search string to the node n as follows. Observe that the tree is "essentially binary", that is, every vertex has at most two subtrees which are not just single vertices. A search string for a node Bi, Mi, Ni or Ei consists of a word from {l, r}*, denoting a left/right path to the corresponding Mi node, plus an indicator B, M, N or E to determine which of these nodes is required. Given such a search string, a deterministic Turing machine can simulate the alternating machine recognizing the language L, make the nondeterministic choices according to the search string, and thus determine whether the reached configuration is existential, universal, accepting or rejecting. Based on the indicator, it is then easy to look up the appropriate bit of the corresponding formula part. A tree of the form outlined above with height h (the height is log |x|) has 2^{h+2} − 4 vertices. The node M1 has the number 2^{h+1} − 2, E1 has the number 2^{h+2} − 4. The first B-node of the left child has number 2; for the right subtree, this number is 2^{h+1}. Given these numbers, it is easy to see that the algorithm in Figure 3 correctly computes the search string for a given node: in the conditions adding an "r" or an "l" to the string, the new number n is the number of the required node when considering the right or left subtree as an independent tree (the numbers added to or subtracted from n correspond to the difference between the indexes of the B-nodes of the three subtrees involved).

Figure 3: Computing the search string for node n.

    Input: height h and search number n
    searchstring := ""
    loop
      if n = 1 then
        output searchstring + "B"
      else if n = 2^{h+1} - 2 then
        output searchstring + "M"
      else if n = 2^{h+1} - 1 then
        output searchstring + "N"
      else if n = 2^{h+2} - 4 then
        output searchstring + "E"
      else if the first bit of n is zero then
        // equivalent to n < 2^{h+1} - 2 once the cases above have been considered
        n := n - 1
        searchstring := searchstring + "l"
      else
        // the first bit of n is one, equivalent to n > 2^{h+1} - 1
        n := n - 2^{h+1} + 1
        searchstring := searchstring + "r"
      end if
      h := h - 1
    end loop

The algorithm does not run in logarithmic time, but we can work around that: since we have the identity in our language, we can insert identity symbols without altering the value of the resulting formula. This corresponds to "leaving out" some numbers in the enumeration, and thus we can reduce the search string calculation to a search string verification as follows: on input (n, s, x), we output the corresponding bit of the node described by s on input x if the string s describes the way to node n in the tree, and the identity symbol otherwise. Since there are O(|x|) nodes in the tree, the binary length of n is log |x|, and the same holds for s. The resulting formula contains all formulas representing the configurations in the correct order, plus identity symbols. As mentioned above, we must ensure that the final character of the generated formula is not the identity symbol. Therefore we must "switch" the very last character with the last "relevant" character, i.e. instead of writing the last symbol of E1, we write an identity symbol, and as the very last character, we write the last symbol of E1. This can easily be done, since we do not have to perform the verification in this case: whenever the search string leads to the last symbol of E1, we write an identity, unless the input is 1 ... 1, i.e. the very last position, where we write the last symbol of E1.
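For concreteness, the Figure 3 computation can also be rendered as a short runnable sketch (ours; the node numbering is as described above, with B1 = 1, M1 = 2^{h+1} − 2, N1 = 2^{h+1} − 1 and E1 = 2^{h+2} − 4):

```python
# Sketch (ours): the search-string computation of Figure 3. Nodes are numbered
# 1 .. 2^(h+2) - 4 in the order in which the formula parts are written out;
# the result is a left/right path plus an indicator B, M, N or E.
def search_string(h, n):
    path = ""
    while True:
        if n == 1:
            return path + "B"
        if n == 2 ** (h + 1) - 2:
            return path + "M"
        if n == 2 ** (h + 1) - 1:
            return path + "N"
        if n == 2 ** (h + 2) - 4:
            return path + "E"
        if n < 2 ** (h + 1) - 2:                  # descend into the left subtree
            n, path = n - 1, path + "l"
        else:                                     # descend into the right subtree
            n, path = n - 2 ** (h + 1) + 1, path + "r"
        h -= 1

# In the height-3 example tree of Figure 2, node 16 is the B-part of the right
# child of the root and node 7 is the M-part of the left child.
print(search_string(3, 16))   # rB
print(search_string(3, 7))    # lM
```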
The verification needs to be performed in time O(log |x|), which is the same as O(|n|) = O(log n). We proceed as follows: on tape 1, we keep a copy of the search string. On tape 2, we keep track of what we added to the number n; on tapes 3 and 4, we keep track of what we subtracted from n. On tape 5, we keep the number h, which is initially set to log |x|. For each symbol of the search string, we perform the additions and subtractions on n and h that the algorithm would have done when writing that symbol. This essentially requires adding the constant 1 to tapes 2 and 3, and setting the left-most bit of tape 4 to 1 (the statement n := n − 2^{h+1} + 1 is split up into an addition and a subtraction). After processing each symbol, tape 5 is decremented by 1. By a standard amortized running time analysis, this can be done in time O(log n). When the search string has been processed, we add to n the content of tape 2 and subtract the contents of tapes 3 and 4, which can again be done in time O(log n). Finally, the conditions determining whether to output M, N, E, or B can be verified in time O(|n|), since these are easy patterns in the binary representation. The length of the resulting formula is O(2^{|n|+|s|}) = O(2^{log n + log n}) = O(n^2), where n is the length of x, and thus polynomial in |x|. Since the formula parts have a length of more than one bit, we still need to divide by their length, but that can be done easily: we just ensure that the formula length is a power of 2. □
4 Conclusion and further research
We achieved a complete classification of the formula value problem, and showed that for sets B1 and B2 for which [B1] ⊆ [B2] holds, VAL^F(B1) can always be reduced to VAL^F(B2) under very strict reductions, under reasonable assumptions about the sets Bi. Unlike in the circuit value case, this was not immediate, since the local replacement of gates does not work for formulas. These independence results are achieved with Lemma 3.3, which will probably prove useful in related work as well. An interesting open question in this context is that of satisfiability. It is known from [Lew79] that the satisfiability problem for B-formulas is NP-complete if [B] ⊇ S1 and solvable in polynomial time otherwise. A more detailed classification of the polynomial-time cases is not known yet. It is immediate from our work that satisfiability for B-formulas is complete for NC^1 if [B] = M, and the current paper can probably serve as a starting point for the other cases as well. Other interesting directions for future research are the quantified case and the equivalence of formulas.
Acknowledgements

We thank Steffen Reith for an introduction to the topic and valuable discussions, and Michael Bauland, Pierre McKenzie, and Heribert Vollmer for helpful hints.
References

[BCRV03] E. Böhler, N. Creignou, S. Reith, and H. Vollmer. Playing with Boolean blocks, part I: Post's lattice with applications to complexity theory. SIGACT News, 34(4):38–52, 2003.
[BM95] M. Beaudry and P. McKenzie. Circuits, matrices, and nonassociative computation. J. Comput. Syst. Sci., 50(3):441–455, 1995.

[BS05] E. Böhler and H. Schnoor. The complexity of the descriptiveness of Boolean circuits over different sets of gates. Technical Report 357, Fachbereich Mathematik und Informatik, Universität Würzburg, 2005.

[Bus87] S. R. Buss. The Boolean formula value problem is in ALOGTIME. In Proceedings 19th Symposium on Theory of Computing, pages 123–131. ACM Press, 1987.

[Coo71] S. A. Cook. The complexity of theorem proving procedures. In Proceedings 3rd Symposium on Theory of Computing, pages 151–158. ACM Press, 1971.

[Lew79] H. R. Lewis. Satisfiability problems for propositional calculi. Mathematical Systems Theory, 13:45–53, 1979.

[Pos41] E. L. Post. The two-valued iterative systems of mathematical logic. Annals of Mathematical Studies, 5:1–122, 1941.

[Rei01] S. Reith. Generalized Satisfiability Problems. PhD thesis, Fachbereich Mathematik und Informatik, Universität Würzburg, 2001.

[Rei04] O. Reingold. Undirected st-connectivity in log-space. Technical Report TR04-094, ECCC Reports, 2004.

[RV97] K. Regan and H. Vollmer. Gap-languages and log-time complexity classes. Theoretical Computer Science, 188:101–116, 1997.

[RW00] S. Reith and K. W. Wagner. The complexity of problems defined by Boolean circuits. Technical Report 255, Institut für Informatik, Universität Würzburg, 2000. To appear in Proceedings International Conference Mathematical Foundation of Informatics, Hanoi, Oct. 25–28, 1999.