The number system hidden inside the Boolean satisfiability problem
Author: Keum-Bae Cho Affiliations Department of Electrical and Computer Engineering, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul, 151-742, South Korea *Author to whom correspondence should be addressed: E-mail (
[email protected]) Abstract The Boolean satisfiability (SAT) problem is the first known example of an NPcomplete problem, and thousands of NP-compete problems have been identified by reducing the SAT to the problems. Researchers have tried to find a definite mathematical expression that distinguishes among NL-complete, P-complete, and NPcomplete problems such as 2-SAT, Horn-SAT, and 3-SAT. In this paper, we introduce the natural number system hidden inside the SAT structure. We reduce a SAT instance to an integer-programming instance. Then, we focus on the distance from an integral point to the facets of the projected polytope. We newly define a dominant variable, decision chain, and chain coupler as a novel element of a Boolean formula. From the analysis of the SAT structure using the elements, we show that the coefficients of the normal vector of the facet can be expressed with the natural number system of which the exponent is exponential in the input size. Furthermore, we prove that an integral point, which is not contained in the solution region, can locate exponentially near the projected polytope by the number system. Finally, we show that the number system is not formed in 2-SAT, but partially formed in HornSAT according to the feasible value of a dominant variable, and always formed in kSAT (k>2) regardless of the feasible value of a dominant variable. Two questions, NL =? P and P =? NP, have been open problems for several decades. This study presents a definite supporting evidence for the conjecture that NL⊊ P ⊊ NP and a new solving direction for the P versus NP problem.
Introduction The class P (polynomial) is the set of decision problems that are solvable in polynomial time and the class NP (nondeterministic polynomial) is the set of decision problems for which a solution can be verified in polynomial time. More precisely, a polynomial time algorithm for a problem is a method that solves the problem correctly on every input and takes no more than c∙nk time on an input of size n (i.e., O(nk)) for some constants c>0 and k>0. NP is the class of languages that have a polynomial-time verification algorithm. Any deterministic Turing machine can be simulated by a non-deterministic Turing machine with no overhead. Thus, P is included in NP. Then, if a decision problem can be verified in polynomial time, is it possible to solve the problem in polynomial time? This question is the well-known P versus NP problem1, which is to clarify the relationship for the inclusion of the classes P and NP. Cook and Levin proposed the question about forty years ago as a problem
concerned with the fundamental limits of feasible computation2,3. The NP-complete problem is a problem that belongs to NP and every problem in NP is reducible to the problem in polynomial time. The obvious way to prove P = NP is to show that some NP-complete problem has a polynomial time algorithm. Researchers have found thousands of NP-complete problems since Karp's research4,5. However, although there are so many NP-complete problems, researchers have failed to find a polynomial time algorithm for any one of the problems. Hence, various proof techniques have been studied to distinguish between P and NP with the belief that P ≠ NP. However, all known proof techniques such as relativizing, natural, and algebrizing proofs are insufficient to prove that P ≠ NP. Baker, Gill, and Solovay showed that P = NP with respect to some oracles, while P ≠ NP for other oracles. Hence, the P versus NP question cannot be solved by any of the proof techniques that separate complexity classes relative to an oracle6. Every polynomial-time computable function can be expressed by a circuit with a polynomial number of gates7. From this relationship, exponential lower bounds have been proved for restricted circuit models such as monotone circuits8,9 and bounded depth circuits with unbounded fan-in gates10,11. However, Razborov and Rudich proved that if one-way functions exist, no natural proof method could distinguish between P and NP12. Although one-way functions have never been formally proven to exist, most researchers believe that a proof or disproof of the existence of a one-way function would be much harder than clarifying the relationship of the classes P and NP. An algebrizing proof was successfully used to prove several complexity theories such as IP = PSPACE13,14 and PCP theorem15. However, Aaronson and Wigderson showed that algebrizing technique is fundamentally unable to resolve the barrier problem of P versus NP. They pointed out that the reason for the incapability to solve this problem is the failure of opening the Boolean formula wide enough. Thus, it needs to probe the Boolean formula in some deeper way for further progress16. In this paper, we introduce a novel idea to analyze the Boolean formula more deeply. The Boolean or propositional satisfiability (SAT) problem is to determine whether there exists a feasible set to satisfy a given Boolean formula. SAT is the first known example of a NP-complete problem and thousands of NP-compete problems have been identified by reducing the SAT to the NP-complete problems. There are several special cases of satisfiability problem. The k-SAT determines the satisfiability of a Boolean formula in conjunctive normal form (CNF) where each clause is limited to at most k literals. Especially, 3-SAT is contained in the 21 NP-complete problems researched by Karp. Class NL (nondeterministic logarithm) consists of the decision problems that can be solved by a nondeterministic Turing machine with a read-only input tape and a separate read-write tape whose size is limited to be proportional to the logarithm of the input length. The NL-complete problem is defined as a decision problem where the problem belongs to NL and has the additional property that every other decision problem in NL can be reduced to the NL-complete problem. 2-SAT belongs to the NL-complete problem17. Similarly, the P-complete problem is defined as a decision problem included in P and every problem in P can be reduced to the Pcomplete problem by using an appropriate reduction. Horn-SAT belongs to the Pcomplete problem18, which consists of Horn clauses that contains at most one positive literal. XOR-SAT is another special case of the SAT where each clause contains exclusive OR operators rather than the OR operators. XOR-SAT belongs to P since an XOR-SAT formula can be solved in cubic time by Gaussian elimination. There are only six tractable (polynomial time decidable) cases in SAT: 2-SAT, Horn-SAT, dualHorn-SAT, XOR-SAT, instances satisfied by the all ‘0’ assignment and instances
satisfied by the all ‘1’ assignment. Otherwise, the SAT problem can be reduced by 3SAT. Therefore, SAT is polynomial time decidable or NP-complete because 3-SAT is NP-complete. This relationship was termed the Schaefer Dichotomy Theorem19. However, we do not know whether 3-SAT can be polynomial-time decidable up to now. As shown above, SAT contains NL-complete, P-complete, and NP-complete problems. Thus, SAT is a good research subject to search for some intrinsic property that appears only in NP-complete problems. There have been no ideas for a deep analysis of the SAT structure. We newly define a dominant variable, decision chain, and chain coupler based on the characteristics of the SAT. Through the analysis of SAT structure using the dominant variable, decision chain, and chain coupler, we derive the natural number system hidden inside k-SAT (k>2). In addition, we show that the number system is not formed in 2-SAT, but partially formed in Horn-SAT according to the feasible value of a dominant variable, and always formed in k-SAT (k>2) regardless of the feasible value of a dominant variable. Thus, this study gives us clear answer for the following research question: Is there any definite mathematical expression to explain why Horn-SAT has an intermediate property between 2-SAT and 3-SAT? 1. Definitions of a dominant variable and decision margin We need some sort of a measure in order to assign a suitable value to a variable. It is natural to expect this measure to be represented by a number indicating a definite physical meaning. Hence, we reduce a given SAT instance to an integer-programming instance to express the measure as a distance that represents geometric relationships. For example, let us suppose that a Boolean formula in CNF is given: X X1 X 2 X 4 X1 X 3 X 2 X 3 X 2 X 4
(1)
If we change the literals such as ‘X’ and ‘¬X’ to variables such as ‘x’ and ‘1-x’, and a logical operator ‘˅’ to an arithmetic operator ‘+’, then the constraints for each clause to be satisfiable are converted to inequalities: x1 1 x2 1 x4 1, x1 1 x3 1, x2 x3 1, 1 x2 x4 1
(2)
We can convert eq. (2) to the matrix-vector notation by reflecting the constraint that all variables have feasible regions between ‘0’ and ‘1’: 1 x1 x2 x4 1 0 x1 x3 1 1 x2 x3 2 0 x2 x4 1
1 1 1 0 1 x1 1 0 1 0 1 0 x 1 2 1 0 1 1 0 x3 2 0 0 1 0 1 x4 1
(3)
In general, every clause is converted to a pair of inequalities: p
n
p
n
p X i n Yi 1 xi 1 yi p n n 1 xi yi p ∨i 1 ∨i 1 i 1 i 1 i 1 i 1
(4)
In eq. (4), p and n indicate the number of positive variables and negative variables respectively. These bounded inequalities form a polytope. We can apply exact algorithms such as cutting-plane method and branch-and-bound method in order to acquire an integral point contained in the polytope, although they are not polynomial time algorithms 20,21. Let us suppose that we acquired a solution set for a given integer-programming instance. If there is a variable that takes only one value of ‘0’ or ‘1’, we term this
variable a dominant variable. If there is a dominant variable, the number of integral points that must locate outside the polytope, which represents the solution region, is exponentially increased as the number of variables in the SAT instance increases. For example, we must verify that 2n-1 integral points in n-dimensional space are not included in the polytope in order to confirm that a variable cannot be assigned with ‘0’ or ‘1’. We term these points as infeasible points. We can draw 2n-1 lines that pass on one of the infeasible points and another point, where the dominant variable takes a different value from the selected infeasible point and the other variables take the same values with the selected infeasible point. These lines are termed as decision lines. The intersection point between a decision line and a facet is termed a decision point. The decision point becomes a cut-off point to confirm that the infeasible point is not included in the polytope. The feasible region of a variable is determined by the union of the feasible regions on 2n-1 decision lines. We need to verify that all 2n-1 infeasible points are located outside every feasible region on 2n-1 decision lines. Therefore, if we want to realize a polynomial-time algorithm to solve a given SAT instance, we must reduce the number of decision lines by eliminating the related variables or projecting the polytope to the lower dimensional space. If we project the polytope to ddimensional space, the number of decision lines becomes 2d-1. The smallest distance from the infeasible points to the decision points in projected polytope is termed a decision margin. The decision margin is applied only to the dominant variables because infeasible points are generated only when a variable is a dominant variable. Figure 1 describes the concept of the decision margin and the relationship of the defined terms. SAT instance
Integer programming instance
Represented with CNF, for instance,
X X1 X1 X2 X2 Eq. (1)
Reduction
Decision margin
Represented with n-dimensional Polytope
bmin Ax bmax
Eq. (3)
Projection to d-dimensional space (d2). We need some novel ideas to prove this property without calculation of the exact equations of facets. We introduce the concept of decision chain based on the characteristic of SAT.
2. Concept of the decision chain Let us suppose that we synthesize a SAT instance containing a dominant variable. If we remove the dominant variable from all clauses of the SAT instance, the new generated CNF must be unsatisfiable. Otherwise, the removed variable can be assigned with both ‘0’ and ‘1’ as a feasible value. This contradicts the definition of a dominant variable. Hence, we first synthesize an unsatisfiable CNF and then insert a dominant variable in order to make a SAT instance that contains a dominant variable. We can think of various types of an unsatisfiable CNF. However, we focus on the characteristic of a clause. A clause consists of a disjunction of literals. Therefore, a clause is satisfied if any one or more variables take ‘1’ when the coefficient of the variable is positive and take ‘0’ when the coefficient of the variable is negative. We want to synthesize an unsatisfiable CNF that has a similar satisfiability property with a clause from the point of view of a dominant variable. We can make this CNF with the conjunctions of implication chains such as X→Y in the simplest manner. For example, CNF X1˄(¬X1˅X2)˄(¬X2˅X3)˄…˄(¬Xk-1˅Xk)˄¬Xk satisfies this condition. For this reason, we define decision chain as an unsatisfiable CNF that becomes satisfiable if any one or more clauses are satisfied by a dominant variable that will be inserted. Figure 2 shows decision chains expressed with the types of chain and matrix. CNF
X1 X1 X 2 X 2
X1 X1 C1
x1
C1
x1
C2
-x1
C2
-x1
Chain C3
X1 X1 X 2 X 2 X 3 X 3 C1
x1
x2
C2
-x1
-x2
C3
C1
x2
x3
-1
1
-1
x1
xn 1
C2
1
C3
1
1
-1
x3 -x3
xn
x1
xk
xm n 1
C1
C1
C(m) C(n-1)
Matrix
-x2
C4
(a)
x1
x2
D(n-1)
Cn 1
D(m)
Cm Cm 1 -D(n)
-1
C(3)
-1
-D(3)
(b)
Cn
Block-A
-1
C(n)
Cm n C(m+n)
C(n)
(c)
(d)
Fig. 2. The concept of the decision chain: (a) simplest three decision chains represented by chain type; (b) example of a matrix representation for a decision chain and dominant variable consisting of three clauses; (c) structure of a decision chain consisting of n clauses; and (d) new decision chain constructed by coupling of two decision chains
Figure 2(a) represents the three simplest decision chains expressed by chain type. In order to generate the desired size of a decision chain, we first make two clauses consisting of one variable that has different coefficients. This configuration becomes the minimum size of a decision chain. Second, we insert a dominant variable to the clauses in order to satisfy the constraints of the two clauses simultaneously. The dominant variable connected to a decision chain must have the same coefficients in all clauses contained in a decision chain. Otherwise, the decision chain is always satisfied because at least one clause is satisfied regardless of any value of the variable. Thus, all coefficients of the dominant variable inserted in the clauses must be the same value or ‘0’. Third, we make an unsatisfiable CNF by adding one clause containing the same dominant variable assigned with a different coefficient from the
coefficient of the dominant variable in previous step. We can extend the size of the decision chain by repeating the above process. The underlined value such as ‘1’ in a double circle was used to express that the coefficient of the variable is changeable to ‘0’. We replace the coefficients of a dominant variable and decision chain with a rectangular block for an easy graphical explanation as shown in Fig. 2(b), 2(c), and 2(d). Figure 2(b) shows one example of a decision chain and dominant variable consisting of three clauses. In the rectangular region of a dominant variable, at least one coefficient must be ‘1’ or ‘-1’ and all other coefficients must be the same value or ‘0’. The decision chain including k1 clauses is designated as C(k1), and the dominant variable connected to C(k1) is designated as D(k2) or –D(k2), where ‘–’ indicates that the feasible value of the dominant variable is ‘0’, and the value k2 means the number of ‘1’ or ‘-1’ in the rectangular block. Figure 2(c) shows the structure of the decision chain containing n clauses. Block-A indicates the coefficients of xi (1≤i≤n-1). In order to satisfy the unsatisfiable condition of the CNF, the last clause must not be satisfied by the values of xi (1≤i≤n-1) when the dominant variable takes ‘0’. If all coefficients are ‘0’, this condition is satisfied as the simplest case. Figure 2(d) shows a new decision chain constructed by the coupling of two decision chains connected by one dominant variable with different coefficients. Column and row exchanging does not affect the satisfiability. Thus, we can exchange columns and rows to make a decision chain in a matrix. 3. Structure of SAT based on the decision chain and a dominant variable Let us consider when two or more variables are connected to one decision chain. Figure 3(a) describes a decision chain connected by m+2 variables in 3-SAT. y1
x1 -x1
y2
yk
±D(n1)
±D(n2)
±D(nk)
y1
y2
yq
±D(n11)
±D(n12)
±D(n1q)
±D(n21)
±D(n22)
±D(n2q)
±D(np1)
±D(np2)
±D(npq)
C1
y3
x2
y1
xm 1
x1
y2
C(m)
-xm-2
C(m)
Cm
xm-1
ym
-xm-1
ym+1
ym+2
(b)
(a) x1
xm1 m2 y1
y2
C1
x1
xp C(m1)
D(n1)
C(m1)
D(n3) C(m2)
Cm1
Cm1 1 C(m2)
Cm1 m2
-D(n2)
D(n4) C(mp)
(c)
(d)
Fig. 3. The structures of SAT composed of the decision chain and dominant variable candidates: (a) dominant variable candidates connected to a decision chain in 3-SAT; (b) block representation of multiple connected dominant variable candidates; (c) two decision chains connected by variable y1; and (d) parallel connected dominant variable candidates in multiple decision chains
We can easily verify that decision chain C(m) can be connected by at most m(k-2)+2 numbers of variables in k-SAT (k>2). Figure 3(b) shows a decision chain and k numbers of dominant variable candidates. yi (1≤i≤k) are not dominant variables because if one of the k variables takes ‘1’ when the coefficient of the variable is positive and takes ‘0’ when the coefficient of the variable is negative, the other variables can take any value. On the contrary, variable y2 must take only ‘1’ to satisfy all constraints in Fig. 3(c). Therefore, y2 becomes a dominant variable. In this case, two decision chains were connected by variable y1. We term this variable chain coupler. In Fig. 3(c), if n1 and n2 are all ‘1’, the two decision chains are combined to one decision chain. Figure 3(d) shows another SAT instance existing inside a given SAT instance. At least one positive variables connected to a decision chain must be assigned ‘1’ or at least one negative variables must be assigned ‘0’ to satisfy all constraints of the clauses constructing the decision chain. This condition must be satisfied in all decision chains. Thus, every decision chain generates a new clause consisting of variables yi (1≤i≤q), and the new generated clauses are combined with the AND operation. This process makes a new CNF. All decision chains are satisfied if and only if the new CNF is satisfiable. That is, the SAT of {x1, …, xp, y1, …, yq} is satisfiable if and only if the SAT of {y1, …, yq} is satisfiable. Now, let us investigate the structure of a CNF in terms of the dominant variable candidates, decision chains, and chain couplers. One decision chain must have only one dominant variable by the definition of the dominant variable. None of the other variables except for a dominant variable connected to the same decision chain must affect the satisfiability of related clauses. Otherwise, the dominant variable can be assigned with both value of ‘0’ and ‘1’. This means that each of the other variables connected to the same decision chain with a dominant variable must be connected to another decision chain as a dominant variable because the variable must take only one value. Figure 4(a) shows four decision chains connected by chain couplers such as y1, y2, and y3. A SAT instance x1
x1
x1
x1
C(m1)
x1
C1
y1
y1
xk ±D(ak1)
±D(a11)
C(m1)
-y1
C(m2)
Ci
y2
±D(ak2) ±D(a12)
y3
Ci j
±D(ak3) ±D(a13)
-y3
C(m4)
±D(ak4)
Cm (a)
j m1 m 2 m 3 m 4
±D(a14)
y3
±D(n1) D(n2)
C(m2)
-y2
C(m3)
y2
C(m3)
C(m4)
±D(n3) D(n4) ±D(n5) D(n6)
(b)
Fig. 4. The structure of a CNF: (a) decision chains connected by a dominant variable and chain couplers; and (b) one sub-block of a SAT structure represented by multiple-connected dominant variable candidates, decision chains, and multiple-connected chain couplers
Variable x1 must be assigned only one value of ‘1’ to satisfy all constraints of the clauses contained in the decision chains. Thus, x1 becomes a dominant variable. Dominant variable candidates have the same coefficient in all decision chains, and chain couplers have different coefficients in two decision chains. A dominant variable can be connected one or more times to a decision chain. In addition, we can connect a string of dominant variable candidates to a decision chain instead of one dominant
variable. We can reconstruct a given CNF instance by dominant variable candidates, decision chains, and chain couplers as shown in Fig. 4(b). A SAT instance consists of multiple decision chains coupled by chain couplers. 4. Number system of a facet in a projected polytope We will show that the coefficient of a dominant variable in inequalities generated from the Fourier-Motzkin elimination method24 can be expressed by the natural number system of which the exponent is exponential in the input size. As mentioned above, the generated inequality corresponds to the facet of projected polytope, and the number of facets is exponential in the input size by the McMullen upper bound theorem22,23. We assume that all decision chains have the simplest configurations like Fig 2(a). This assumption makes it easy to calculate the coefficients of the inequality generated by linear operations. If we add all clauses contained in a decision chain, the variables consisting of the decision chain are all erased because these variables have different coefficients as a pair according to the above assumption. In Fig. 4(b), we can make a new inequality that remains only dominant variable candidates xi (1≤i≤k) using the Fourier-Motzkin elimination method. First, we multiply all clauses contained in C(m1) by n2 and all clauses contained in C(m2) by n1. We then add the clauses contained in the decision chains C(m1) and C(m2). Second, we multiply all clauses contained in C(m2) by n4 and all clauses contained in C(m3) by n1n3. The value n1n3 indicates the multiplier of chain coupler y2 generated during the previous elimination process. We then add the clauses contained in the decision chains C(m2) and C(m3). If we repeat this process, all variables consisting of the decision chains and chain couplers are erased, and the new generated inequality is expressed as: bmin a1 x1
ak xk bmax ai ai1n2 ai 2 n1 n4 ai 3n1n3 n6 ai 4n1n3n5 , 1 i 4
(5)
where bmin and bmax are integral numbers. Lemma 1. The normal vector of a facet in the projected polytope can be expressed by the natural number system of which the exponent is exponential in the input size in k-SAT (k >2). Proof. In eq. (5), if we assign ‘1’ to ni where i is an odd number, and assign b to ni where i is an even number, then the coefficients of xi (1≤i≤k) is expressed by the number system with basis b as: ai ai1b3 ai 2b2 ai 3b1 ai 4
(6)
The decision chain consisting of c clauses can be connected by less than or equal to c+2 variables in 3-SAT. Thus, if each decision chain consists of c variables, the relationship between b and the defined multipliers that indicate the number of ‘1’s or the number of ‘-1’s in 3-SAT becomes k
A1 1 c 2, A2 1 b c 2, A3 1 b c 2, A4 b c 2 Aj aij ,1 j 4
(7)
i 1
The number of decision chains determines the maximum exponent of the number system. The number of variables is calculated as the number of dominant variable candidates plus the number of variables in the decision chains and the number of chain couplers. The number of decision chains becomes the number of chain couplers plus one. Let the number of total variables be n. Assume that each decision chain
consists of c variables. Let the number of dominant variable candidates be d. Then, the maximum exponent e is calculated as: n ce e 1 d e
n d 1 c 1
(8)
We can extend eq. (6) as: ai
n d 1 c 1
j 1
n d 1 c 1 j
aij b
(9)
In eq. (9), symbol x means the largest integer that does not exceed x. This value is exponential on n because c and d are constants, and b can be assigned with any value satisfying eq. (7). ■ 5. Number system in tractable problems Horn-SAT is the one of the hardest problems among the tractable SATs in the sense that it is a P-complete problem18. We know that 2-SAT is NL-complete, HornSAT is P-complete, and 3-SAT is NP-complete. In addition, we predict the relations of NL, P, and NP as NL⊊ P ⊊ NP. Therefore, we expect Horn-SAT has an intermediate characteristic between 2-SAT and 3-SAT. Then, is there any definite mathematical expression to explain why Horn-SAT has an intermediate property between 2-SAT and 3-SAT? The following lemma gives us the answer to this question. Lemma 2. 2-SAT and XOR-SAT do not form the natural number system and Horn-SAT partially forms the number system according to the feasible value of a dominant variable. Proof. The number system is formed by the role of the chain coupler. Thus, if we cannot make multiple-connected chain couplers, the number system cannot be formed. The decision chain in 2-SAT is connected by at most two variables as shown in Fig. 5(a). Thus, all chain couplers can be connected to a decision chain only once as shown in Fig. 5(b) since at least one variable must be used for a dominant variable or another chain coupler. As a result, the number system is not formed because exponent b becomes ‘1’ or ‘-1’ in eq. (9). x1 y1
x1 -x1
y1
x2
1
C(m1)
x2 C(m2)
-xm-2
-xm-1
(a)
C(m3)
xm-1 y2
1
C(m4)
(b)
y2
y3 y1
x1
1
-x1
-1
y2
y1
x2
1 -1 1 -1
-xm-2
xm-1
y1
-xm-1
y1
y2
(c)
Fig. 5. The structure of 2-SAT and XOR-SAT
A clause that has a positive variable assigned with ‘1’ or a negative variable assigned with ‘0’ is always satisfied regardless of the values of other variables in normal SAT. Therefore, two or more connections of a dominant variable in a decision
chain do not affect the satisfiability. However, the concept of the multiple-connection is not valid in XOR-SAT. We can make a decision chain with XOR-SAT as with normal SAT as shown in Fig. 5(c). However, if we simultaneously add the same variable in two or more clauses contained in the same decision chain, the number of the multiple connections affects the satisfiability of the decision chain. As a result, the number system cannot be formed in XOR-SAT because we cannot make a multipleconnected chain coupler and multiple-connected dominant variable candidates. x1
+ +
-D(ak1) -D(a11)
C(m1)
+ -D(ak2)
+ -D(a12) +
-D(ak3)
+
-D(a13) -D(ak4)
+ +
-D(a14)
y 2 y3
y1
xk
x1 1
1
x2
-D(ak1) -D(a21)
-D(n1) C(m2)
C(m3)
-D(ak2) -D(a22)
1 -D(n2)
-D(ak3) -D(a23)
1 -D(n3)
C(m4)
C(m1)
y3
-D(n1) 1
C(m2)
C(m3)
-D(n2) 1 -D(n3)
-D(ak4) -D(a24)
(b)
(a)
y2
y1
xk
C(m4)
1
(c)
Fig. 6. The structure of Horn-SAT
We proved that 2-SAT does not form the natural number system and 3-SAT forms the number system. Then, If Horn-SAT has an intermediate characteristic between 2SAT and 3-SAT, how can the intermediate characteristic for the number system be expressed? Horn-SAT has at most one variable with a positive coefficient in all clauses. Due to this constraint, only one positive variable can be connected to a decision chain, as shown in Fig 6(a). This positive variable can be used as a dominant variable candidate or chain coupler. Figure 6(b) shows the case when the positive variable is used as a chain coupler. In this case, the number system is formed. However, Fig. 6(c) shows when the variable is used as a dominant variable candidate. In this case, the positive variable can be only once connected to all decision chains as a dominant variable. Therefore, the number system cannot be formed because we cannot make multiple-connected dominant variable candidates where the feasible value is ‘1’. Dual-Horn-SAT does not form the number system when the feasible value of a dominant variable is ‘0’ in a tautological sense. ■ 6. Size of the decision margin In order to calculate the size of the decision margin, we assume that the coefficients of the dominant variable candidates are all positive values without loss of generality in Fig. 4(b). If all variables except for x1 are ‘0’ in Fig. 4(b), then variable x1 becomes a dominant variable. Thus, x1 must take only one value of ‘1’. In addition, the feasible region of x1 must not include the integral point where x1=0. We can change eq. (5): bmin,1 x1 bmax,1
n d 1 c 1
b b bmin,1 min , bmax,1 max , a1 a1 a1
j 1
n d 1 c 1 j
a1 j b
(10)
The decision margin becomes bmin / a1 because the value of x1 in infeasible points is ‘0’. Now, let us calculate bmin. We assumed that decision chains are formed in the simplest type. In eq. (4), the minimum value of a feasible region is the same to the sum of
negative variables plus one and the maximum value is the same to the number of positive variables. Thus, we can calculate bmin by counting the negative variables. Figure 7 shows two examples of a chain coupler configuration to form a natural number system. bmin 1
-b+1
x1
D(ak1) D(a11) D(ak2) D(a12) D(ak3)
-b+1 D(a13) -b+1
y3
bmin 0
C(m1)
1 -D(b)
0
C(m2)
1 -D(b)
C(m3)
1
D(ak1) D(a11) D(ak2) D(a12) D(ak3)
C(m1)
y2
y3
-1 D(b)
C(m2)
C(m3)
-1 D(b) -1
D(ak4) D(a14)
y1
xk
D(a13)
-D(b)
C(m4)
x1
0 1
D(ak4) D(a14)
y2
y1
xk
C(m4)
D(b)
(b)
(a)
Fig. 7. The minimum value of a feasible region in two special cases of chain coupler configuration to form a natural number system
Figure 7(a) shows when the feasible value of a chain coupler is ‘1’. The minimum value of the inequality generated by addition of total clauses contained in C(m1), C(m2), C(m3), and C(m4) becomes 1, –b+1, –b+1, –b+1 respectively. Figure 7(b) shows when the feasible value of a chain coupler is ‘0’. The minimum value of the inequality generated by addition of total clauses contained in C(m1), C(m2), C(m3), and C(m4) becomes 0, 0, 0, and 1 respectively. We can easily verify that the minimum value of the final inequality generated by linear operations to eliminate chain couplers becomes ‘1’ in both cases. We can assign a1 any exponential value by Lemma 1. As a result, the decision margin of a projected polytope can be decreased to an exponentially small value in k-SAT (k>2). Now, we will prove the exponentially small decision margin without the limited constraint for the number of multipliers of the chain coupler by using the property of a dominant variable. Here, let us focus on the difference of the feasible regions of a variable in different decision lines. Lemma 3. If the natural number system of which the exponent is exponential is formed, decision margin decreases to an exponentially small value. Proof. Let us suppose that we eliminate a set of clauses extracted from a given CNF to make variable x1 take both ‘0’ and ‘1’. Then, some variable must play the role of a dominant variable among the dominant variable candidates. Let us suppose that this elimination affects the feasible values of only x1 and x2. Then, all variables are ‘0’ except for x1 and x2 in eq. (5). We can change eq. (5): bmin a1 x1 a2 x2 bmax ai
n d 1 c 1
j 1
n d 1 c 1 j
aij b
, i 1, 2
(11)
If variable x2 is assigned ‘1’, then x1 is not a dominant variable anymore and x1 can take both ‘0’ and ‘1’ as a feasible value. In this case, at least one point among the previous infeasible points must be included in the projected polytope. We can change the inequality (11) by adding the condition that x2=1:
bmin,2 x1 bmax,2 bmin,2
b a2 bmin a2 , bmax,2 max a1 a1
(12)
Now, let us suppose that we project the polytope to the x1-x2 plane. Figure 8(a) shows the projected polytope in 3-dimensional space. CNF x2 Block-A
x1 x 3
bmin,2
bmin a2 a1
x1 0
x1 1
bmax,2
bmax a2 a1
x2
Decision line (x2=1)
Block-B
x1
b bmax,1 max a1 a2 a1
Block-A
bmin a1
Decision line (x2=0)
DM
Block-A Block-B
bmin,1
(a)
(b)
Fig. 8. The feasible region of x1: (a) projected polytope in 3-dimensional space; and (b) feasible region of x1 on the projected polytope in the x1- x2 plane
Block-A indicates clauses corresponding to a structure of Fig. 4(b) and Block-B indicates the eliminated clauses. The rectangular parallelepiped cube is generated by constraints of Block-A, and the rectangular parallelepiped cube with the cut edge is generated by constraints of Block-A and Block-B. Figure 8(b) shows two feasible regions of x1 generated on the x1-x2 plane. The cutting line inside a parallelogram is generated from the inequalities of the eliminated clauses. We can use the characteristic of a parallelogram because the projection is an affine transformation. The feasible region of the decision line where x2=0 must not include the integral point where x1=0. In addition, the feasible region of the decision line where x2=1 must include the integral point where x1=0. Therefore, the line x1=0 must locate between bmin,1 and bmin,2. The difference of bmin,1 and bmin,2 is calculated to a2/a1 from eq. (10) and eq. (12). Therefore, the decision margin must be less than or equal to a2/a1. We can make a1 any number of exponential size and a2 any number of polynomial size by Lemma 1. As a result, the decision margin of a projected polytope can be decreased to an exponentially small value in k-SAT (k>2). ■ Theorem 1. The smallest distance from an integral point to the projected polytope does not decrease to an exponentially small value in 2-SAT, but partially decrease in Horn-SAT, and always decrease in k-SAT (k>2), according to the formation possibility of the natural number system. Proof. The proof of this theorem is completed by Lemma 1, 2, and 3. ■ 7. Algorithms for solving Horn-SAT based on the decision margin As we know, the problem of Horn-SAT is solvable in linear time. A polynomialtime algorithm for Horn-SAT is based on the rule of unit propagation18. Let us think about another algorithm for Horn-SAT considering the decision margin. If the feasible value of a dominant variable is ‘1’, infeasible points are located far enough from the polytope to separate with any estimation algorithm in polynomial time. However, if the feasible value of a variable is ‘0’, infeasible points can locate exponentially near the polytope. As a result, if a variable takes only ‘0’, the acquired
feasible region of the variable by an estimation algorithm in polynomial time can include both ‘0’ and ‘1’ because the decision margin is exponentially small. However, if the variable takes only ‘1’, we can acquire a feasible region including only ‘1’ using the same algorithm. We can make a Horn-SAT solver considering this condition as follows: Estimate each variable’s feasible region of a given Horn-SAT instance with some iterative estimation algorithm such as linear programming in polynomial time. Select variables of which the feasible region includes only ‘1’. Assign ‘1’ to the variables and assign ‘0’ to all the other variables. Verify satisfiability of the instance with assigned values. If the instance is satisfiable, print ‘accept’ and print the set of assigned values. If the instance is unsatisfiable, print ‘reject’. This algorithm has a polynomial number of calculation steps and does not require exponential accuracy for the applied estimation algorithm because the decision margin is not exponentially small when the feasible value of a dominant variable is ‘1’. Discussion We showed the natural number system hidden inside the SAT as a clear mathematical expression for the classification of 2-SAT, Horn-SAT, and k-SAT (k>2) for the first time. We verified that Horn-SAT has an intermediate property between 2SAT and k-SAT (k>2) regarding the formation possibility of the natural number system. It is known that NL⊆P and P⊆NP, but unknown if NL=P and P=NP. These questions have been open problems for several decades. As we know, 2SAT is NLcomplete, Horn-SAT is P-complete, and k-SAT (k>2) is NP-complete. Therefore, the formation possibility of the number system becomes a definite supporting evidence for the conjecture that NL⊊ P ⊊ NP. In addition, researchers have treated SAT as a set of polynomial Boolean functions. This is the first study verified that SAT must not be treated as a set of polynomial functions in the process of the linear operations. It was an open problem whether the coefficients of an inequality generated by the cutting plane proof system for a SAT instance are polynomially bounded. Researchers believed that cutting plane proofs with large coefficients are highly non-intuitive25. We showed the case when the coefficient of the inequality generated by Fourier-Motzkin elimination method becomes an exponentially large value. Therefore, against their expectations, the coefficients of an inequality generated by the cutting plane proof system are not polynomially bounded. Finally, we showed that an integral point, which is not included in the solution region, could be exponentially near the polytope by the number system. Here, the process of solving a given SAT instance leads to an important question: “Is it possible to separate an integral point from the polytope with a polynomial number of calculation steps when the point locates exponentially near the polytope?” If it is impossible to separate the point from the polytope, the formation possibility of the natural number system becomes an algorithmic barrier that distinguishes among NL, P, and NP. We have not known how to prove the P versus NP problem for a long time. This paper presents a new solving direction for the problem. More deep research based on the result of this study will give us a proof not only for the P versus NP problem but also for the NL versus P problem.
References 1. Stephen Cook, The P versus NP Problem. Clay Mathematics Institute. April. Retrieved 18 October (2006). 2. Stephen A. Cook, The complexity of theorem proving procedures. Proceedings of the Third Annual ACM Symposium on Theory of Computing. pp. 151–158 (1971). 3. Stephen Cook, The Importance of the P versus NP Question. Journal of the ACM, Vol. 50, No. 1, January (2003). 4. Richard M. Karp, Reducibility among combinatorial problems. Complexity of Computer Computations, R. E. Miller and J. W. Thatcher, eds., Plenum Press, New York, 85–103 (1972). 5. Michael R. Garey, David S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman and Co., New York. (1979). 6. T. Baker, J. Gill, R. Solovay, Relativizations of the P=?NP question. SIAM J. Comput., 4:431–442. (1975). 7. Michael Sipser, Introduction to the theory of computation. PWS Publishing Company, second edition. (1997). 8. A. A. Razborov, Lower bounds on the monotone complexity of some boolean functions. Soviet Math. Dokl. 31, 354–357 (1985). 9. N. Alon, R. B. Boppana, The monotone circuit complexity of boolean functions. Combinatorica 7, 1–22 (1987). 10. Johan Hastad, Almost optimal lower bounds for small depth circuits, in Randomness and Computation. Advances in Computing Research 5, JAI Press Inc., Greenwich, CT, 143–170 (1989). 11. Roman Smolensky, Algebraic methods in the theory of lower bounds for boolean circuit complexity. ACM Symposium on Theory of Computing (STOC) 19, ACM, New York, 77-82 (1987). 12. Alexander A. Razborov, Steven Rudich, Natural proofs. Proc. ACM STOC, 204213 (1994). 13. Carsten Lund, Lance Fortnow, Howard Karloff, Noam Nisan, Algebraic methods for interactive proof systems. J. ACM, 39:859–868 (1992). 14. Adi Shamir, IP=PSPACE. J. ACM, 39(4):869–877 (1992). 15. Ali Juma, Valentine Kabanets, Charles Rackoff, Amir Shpilka, The black-box query complexity of polynomial summation. ECCC TR07-125 (2007). 16. Scott Aaronson, Avi Wigderson, Algebrization: A New Barrier in Complexity Theory. Proc. ACM STOC, 731-740 (2008). 17. Papadimitriou, Christos H. , Computational Complexity, Addison-Wesley, pp. chapter 4.2, ISBN 0-201-53082-1., Thm. 16.3 (1994). 18. Stephen Cook, Logical foundations of proof complexity, Cambridge University Press. p. 224, ISBN 978-0-521-51729-4, Phuong Nguyen (2010). 19. Schaefer, Thomas J, The Complexity of Satisfiability Problems. STOC 1978. pp. 216–226 (1978).
20. G´erard Cornu´ejols, Valid Inequalities for Mixed Integer Linear Programs. Mathematical Programming, March, Volume 112, Issue 1, pp 3-44 (2008). 21. Alexander Schrijver, Theory of linear and integer programming, Wiley, Chichester, (1996) 22. David Monniaux, Quantifier elimination by lazy model enumeration. CAV 2010, Edimburgh, United Kingdom. (2010). 23. P. McMullen, G. C. Shephard, Convex Polytopes and the Upper Bound Conjecture. volume 3 of London Mathematical Society lecture note series. Cambridge University Press, ISBN 0-521-08017-7 (1971). 24. Dimitris Bertsimas, John N. Tsitsiklis, Introduction to Linear Optimization, Massachusetts Institute of Technology (Athena Scientific, Belmont, MA). (1997) 25. Bonet, M., Pitassi, T. and Raz, R., Lower bounds for Cutting Planes proofs with small coefficients. Proc. 27-th STOC, 575-584 (1995). Acknowledgments I especially thank Professor Beom-Hee Lee from the Department of Electrical and Computer Engineering at Seoul National University. In addition, I thank Professors Jae-Byung Park and Hae-Kyung Cho for their advice and guidance of this study.