A NN Algorithm for Boolean Satisfiability Problems

William M. Spears
AI Center - Code 5514
Naval Research Laboratory
Washington, D.C. 20375-5320
202-767-9006 (W) 202-767-3172 (Fax)
[email protected]

ABSTRACT

Satisfiability (SAT) refers to the task of finding a truth assignment that makes an arbitrary boolean expression true. This paper compares a neural network algorithm (NNSAT) with GSAT [4], a greedy algorithm for solving satisfiability problems. GSAT can solve problem instances that are difficult for traditional satisfiability algorithms. Results suggest that NNSAT scales better as the number of variables increases, solving at least as many hard SAT problems.

1. Introduction

Satisfiability (SAT) refers to the task of finding a truth assignment that makes an arbitrary boolean expression true. For example, the boolean expression a ∧ b is true if and only if the boolean variables a and b are true. Satisfiability is of interest to the logic, operations research, and computational complexity communities. Due to the emphasis of the logic community, traditional satisfiability algorithms tend to be sound and complete.¹ However, the authors of [4] point out that there exist classes of satisfiability problems that are extremely hard for these algorithms, and they have created a greedy algorithm (GSAT) that is sound, yet incomplete (i.e., there is no guarantee that GSAT will find a satisfying assignment if one exists). The advantage of GSAT is that it can often solve problems that are difficult for the traditional algorithms. Other recent work has also concentrated on incomplete algorithms for satisfiability ([1], [6], [8], [2]). However, comparisons between the algorithms have been difficult to perform, due to a lack of agreement on what constitutes a reasonable test set of problems. One nice feature of [4] is that a class of hard problems is very precisely defined. This paper compares GSAT with a novel neural network approach on that class of hard problems. The results indicate that the neural network approach (NNSAT) scales better as the number of variables increases, solving at least as many hard problems.

2. GSAT and NNSAT

GSAT assumes that the boolean expressions are in conjunctive normal form (CNF).² After generating a random truth assignment, it tries new assignments by flipping the truth assignment of the variable that leads to the largest increase in the number of true clauses. GSAT is greedy because it always tries to increase the number of true clauses. If it is unable to do this it will make a "sideways" move (i.e., change the truth assignment of a variable although the number of true clauses remains constant). GSAT can make a "backwards" move, but only if other moves are not available. Furthermore, it cannot make two backwards moves in a row, since a backwards move guarantees that it is possible to increase the number of true clauses in the next move. (A simplified sketch of this flip rule appears below.)

Since SAT is a constraint satisfaction problem, it can be modeled as a Hopfield neural network [3]. This paper describes NNSAT, a Hopfield network algorithm for solving SAT problems.
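For concreteness, the following Python sketch captures the flavor of GSAT's flip rule on a CNF formula. It is a simplified reading of the description above rather than GSAT itself: it always takes a best-scoring flip, so sideways and backwards moves fall out implicitly and the restriction on consecutive backwards moves is not enforced; the clause encoding (DIMACS-style signed integers) is our own choice.

import random

def gsat_sketch(clauses, n_vars, max_flips=10_000, max_tries=10):
    """Simplified GSAT: clauses are lists of non-zero ints, where v
    means variable v appears positively and -v means negated."""
    def num_true(assign):
        return sum(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses)

    for _ in range(max_tries):                      # outer restart loop
        assign = {v: random.random() < 0.5 for v in range(1, n_vars + 1)}
        for _ in range(max_flips):
            if num_true(assign) == len(clauses):
                return assign                       # all clauses true
            # Flip the variable whose flip makes the most clauses true.
            best_v, best_score = None, -1
            for v in assign:
                assign[v] = not assign[v]           # tentatively flip
                score = num_true(assign)
                assign[v] = not assign[v]           # undo
                if score > best_score:
                    best_v, best_score = v, score
            assign[best_v] = not assign[best_v]
    return None                                     # cutoff reached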

¹ Soundness means that any solution found must be correct. Completeness means that a solution must be found if one exists.
² CNF refers to boolean expressions that are a conjunction of clauses, each clause (conjunct) being a disjunction of negated or non-negated boolean variables. For example, (¬a ∨ b) ∧ (a ∨ ¬b) ∧ (¬a ∨ ¬b) is in CNF.

[Figure 1: the AND/OR graph of the expression, with an AND root, one OR node per conjunct (¬a ∨ b, a ∨ ¬b, and ¬a ∨ ¬b), and shared leaf nodes a and b; NOT constraints are drawn as dashed edges.]

Fig. 1: The neural network for (¬a ∨ b) ∧ (a ∨ ¬b) ∧ (¬a ∨ ¬b)

NNSAT is not restricted to CNF and can handle arbitrary boolean expressions. The neural network graph is based on the AND/OR parse tree of the boolean expression. It is not necessarily a tree, however, because each instance of the same boolean variable is represented by only one leaf node, which may have multiple parents. The resulting structure is a rooted, directed, acyclic graph. All edges in the graph are directed toward the root node (i.e., there are no symmetric edges). An edge has weight w_ij = −1 if there exists a NOT constraint from node i to node j; otherwise it has weight 1. Each node i can be true or false, which is represented using an activation a_i of 1 or −1, respectively. See Figure 1 for an example, where NOTs are represented by dashed edges. It is important to note that the "arrows" in Figure 1 are directed from children to parents.

Let Energy be the global energy state of the network, representing the degree to which the boolean constraints are satisfied. Energy is the combination of local energy contributions at each node: Energy = Σ_i Energy_i = Σ_i net_i·a_i. If all boolean constraints are satisfied, then each local energy contribution will be maximized, and the total energy of the system will be maximized. In this formulation net_i represents the net input to a node i from its immediate neighbors. If the net input is positive, then a_i must be 1 in order to have a positive energy contribution. If the net input is negative, then a_i must be −1. In NNSAT net_i is normalized to lie between −1 and 1. Because of this, Energy_i = 1 if and only if all constraints are satisfied at node i. If a node i violates all constraints, then Energy_i = −1. The energy of a solution is simply Energy = Σ_i Energy_i = n, where n is the number of nodes in the network. All solutions have this energy, and all non-solutions have lower energy.

The computation of net_i for SAT problems is reasonably complex. Informally, the net input net_i is computed using upstream and downstream net inputs to node i. Downstream net inputs flow from the children of a node, while upstream net inputs flow from the parents of a node. If a node is a leaf node, there is no reason to consider the downstream net input. If a node is an interior node, and there is no upstream net input, only the downstream net input is considered. If, however, there is upstream net input, it is averaged with the downstream net input (this will be expressed formally at the end of this section). Finally, the root node must always be true, since the goal is to satisfy the boolean expression.

The downstream (upstream) net inputs are based on downstream (upstream) constraints. Downstream constraints flow from the children of a node. An AND (OR) node is constrained to be true (false) if and only if all of its children are true (false). An AND (OR) node is constrained to be false (true) if and only if at least one child is false (true). Upstream constraints flow from the parents of a node. If the parent of node i is an AND and the parent is true, then node i is constrained to be true. However, if the parent is an AND and the parent is false, then node i is constrained to be false if and only if all siblings (of node i) are true. Other situations are possible, but they do not constrain the node. Finally, if the parent of node i is an OR and the parent is false, then node i is constrained to be false. However, if the parent is an OR and the parent is true, then node i is constrained to be true if and only if all siblings (of node i) are false. Again, other situations do not constrain the node.
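To make the graph representation described above concrete, the following Python sketch shows one possible node structure; the Node and connect names are our own, not the paper's. Edges run from children to parents, an edge weight of −1 encodes a NOT constraint, and a shared variable gives a leaf multiple parents. The example builds the graph of Figure 1.

from dataclasses import dataclass, field

@dataclass
class Node:
    """One node of the AND/OR graph; kind is 'AND', 'OR', or 'VAR'."""
    kind: str
    name: str = ""
    children: list = field(default_factory=list)  # (child Node, weight) pairs
    parents: list = field(default_factory=list)   # (parent Node, weight) pairs
    activation: int = 1                           # +1 = true, -1 = false

def connect(child, parent, weight=1):
    """Add an edge from child to parent; weight -1 means NOT."""
    child.parents.append((parent, weight))
    parent.children.append((child, weight))

# The graph of Fig. 1: (not a or b) and (a or not b) and (not a or not b).
a, b = Node('VAR', 'a'), Node('VAR', 'b')
c1, c2, c3 = Node('OR'), Node('OR'), Node('OR')
root = Node('AND')
connect(a, c1, -1); connect(b, c1, +1)   # not a or b
connect(a, c2, +1); connect(b, c2, -1)   # a or not b
connect(a, c3, -1); connect(b, c3, -1)   # not a or not b
for clause in (c1, c2, c3):
    connect(clause, root, +1)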
These rules can be illustrated with a simple example (Figure 1). Suppose we have the following CNF

boolean expression of two variables and three conjuncts: (¬a ∨ b) ∧ (a ∨ ¬b) ∧ (¬a ∨ ¬b). Each node is bound by boolean constraints. Since the root must be true, the conjuncts are constrained to be true. This information is used to probabilistically determine the truth values of each boolean variable. For example, suppose a is true. Then an upstream constraint from the first conjunct constrains b to be true, while an upstream constraint from the third conjunct constrains b to be false. Due to the conflicting constraints, it is not clear whether b should be true or false, so b is true with 50% probability. However, suppose a is false. Then the only constraint is an upstream constraint from the second conjunct, which constrains b to be false. Therefore b is false with high probability.

More formally, the net input net_i is a function of two contributions: Unet_i and Dnet_i. Unet_i represents the upstream net input to node i from all its parents. Since some nodes may have multiple parents, Unet_i will be computed using contributions from each parent. Dnet_i represents the downstream net input to node i from its children. Define AND_i (OR_i) as a predicate that is true if and only if node i is an AND (OR). It is now possible to define the downstream constraint D_i for node i as:

D_i = (AND_i ∧ ∀j(a_j·w_ji = 1)) ∨ (OR_i ∧ ∃j(a_j·w_ji = 1))

D_i is true if and only if node i is an AND node and has all children true, or it is an OR node and has at least one child true. To be precise, the truth of a child j is the product of the activation of the child and the weight from the child to node i (a_j·w_ji). If the weight w_ji is −1, a NOT constraint is implied, reversing the truth value of the child.³ Given this definition, the rules for Dnet_i are:

∀i[(Dnet_i = 1) ↔ D_i];  ∀i[(Dnet_i = −1) ↔ ¬D_i]

To see how this affects local energy, consider the case where D_i is true (Dnet_i = 1). This states that node i should be true. If we assume there are no upstream constraints, then net_i = Dnet_i = 1 and Energy_i = net_i·a_i = a_i. If i is true (a_i = 1) there is a positive energy contribution; if it is false (a_i = −1) there is a negative energy contribution. As an example, consider Figure 2. If a is true and b is false, a node i representing a ∧ ¬b (note that Dnet_i = 1) will contribute positive energy if and only if that node is true. If D_i is false (Dnet_i = −1) then Energy_i = −a_i and a similar situation exists.

[Figure 2: a single AND node labeled a ∧ ¬b with children a and b; the edge from a has weight 1 and the edge from b has weight −1 (a NOT constraint).]

Fig. 2: The neural network for a ∧ ¬b
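The downstream rule can be read directly off the definitions of D_i and Dnet_i, as in this small Python sketch (the function name is ours):

def downstream_net(kind, child_values):
    """Dnet_i for an interior node. kind is 'AND' or 'OR'; child_values
    holds the products a_j * w_ji for each child j (+1 if the child is
    effectively true, -1 if effectively false)."""
    if kind == 'AND':
        d_i = all(v == 1 for v in child_values)
    else:  # 'OR'
        d_i = any(v == 1 for v in child_values)
    return 1 if d_i else -1

# Fig. 2 with a true and b false: the products are 1*1 and (-1)*(-1),
# so both children are effectively true and Dnet_i = 1.
assert downstream_net('AND', [1 * 1, (-1) * (-1)]) == 1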

In a similar fashion, define the upstream constraint U_ij of a node i from a parent j:

U_ij = (AND_j ∧ ((a_j = 1) ∨ [(a_j = −1) ∧ ∀k≠i(a_k·w_kj = 1)])) ∨ (OR_j ∧ ((a_j = −1) ∨ [(a_j = 1) ∧ ∀k≠i(a_k·w_kj = −1)]))

The rule for Unet_ij is: ∀i∀j[(Unet_ij = a_j·w_ij) ↔ U_ij].⁴ A constraint can occur if the parent j is an AND or an OR. If it is an AND, and a_j = 1, then node i should have an activation a_i = a_j·w_ij = w_ij. To see how this affects local energy, assume for the sake of illustration that there is no downstream constraint and there are no other upstream constraints. Then net_i = Unet_ij and Energy_i = net_i·a_i = w_ij·a_i. Thus there is a positive energy contribution if and only if a_i and w_ij have the same sign. As an example, consider Figure 2 again. If a node j representing a ∧ ¬b is true, then there will be a positive energy contribution at the node i representing b if and only if b is false (because w_ij = −1) and a positive energy contribution at the node i representing a if and only if a is true (because for that edge w_ij = 1). Similar constraints occur when the parent j is an AND, a_j = −1 and all siblings are true (a_k·w_kj = 1), as well as when the parent j is an OR node (see the definition of U_ij above).

³ For example, if a is true and b is false, then both children of the node a ∧ ¬b are true.
⁴ See [6] for motivational details.
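The upstream rule translates similarly; this Python sketch (again with our own naming) returns None when parent j places no constraint on node i, matching the "other situations" that leave a node unconstrained:

def upstream_net(parent_kind, a_j, w_ij, sibling_values):
    """Unet_ij from one parent j. sibling_values holds a_k * w_kj for
    all siblings k != i; activations are +1 or -1."""
    if parent_kind == 'AND':
        # Parent true, or parent false with all siblings effectively true.
        constrained = (a_j == 1) or all(v == 1 for v in sibling_values)
    else:  # 'OR'
        # Parent false, or parent true with all siblings effectively false.
        constrained = (a_j == -1) or all(v == -1 for v in sibling_values)
    return a_j * w_ij if constrained else None

# Fig. 2 again: if the node for (a and not b) is true, its child b is
# constrained to Unet = a_j * w_ij = 1 * -1 = -1, i.e. false.
assert upstream_net('AND', 1, -1, [1]) == -1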

Input: A boolean expression, Cutoff, Max_temp, and Min_temp;
Output: A satisfying truth assignment for the boolean expression, if found;

restarts = 0; assignments = 0;
loop {
    restarts++; j = 0;
    Randomly initialize the activations a_i for each node i;
    loop {
        if the boolean expression is satisfied, terminate successfully;
        T = Max_temp * e^(-j * decay_rate);
        if (T < Min_temp) then exit loop;
        loop through all the nodes i randomly {
            Compute the net input net_i;
            probability(a_i = 1) = 1 / (1 + e^(-net_i / T));
        }
        j++; assignments++;
        if (assignments == Cutoff) terminate unsuccessfully;
    }
}

Fig. 3: The NNSAT algorithm

It is now possible to combine the separate upstream contributions Unet_ij from each parent j into the total upstream contribution Unet_i to node i. Define c_i as the number of upstream constraints on node i at some particular time (c_i = |{j : U_ij}|). Suppose that a node has no upstream constraints (c_i = 0). Then let Unet_i = a_i. For illustrative purposes, suppose there are also no downstream constraints. Then net_i = Unet_i and Energy_i = a_i·a_i = 1. If there are no constraints associated with node i, there is no reason to change the activation of that node. In essence, the network is rewarded at node i for having no upstream constraints. If there are upstream constraints, Unet_i is normalized to fall between −1 and 1 (to ensure that net_i is normalized between −1 and 1):

∀i[(c_i = 0) → (Unet_i = a_i)];  ∀i[(c_i ≠ 0) → (Unet_i = (Σ_j Unet_ij)/c_i)]

It is now possible to formally compute net_i. Define the predicates Leaf_i, Int_i, and Root_i to be true if and only if node i is a leaf, an interior node, or the root node, respectively. Then the net input net_i is:

∀i[Leaf_i → (net_i = Unet_i)];  ∀i[Root_i → (net_i = 1)]
∀i[(Int_i ∧ (c_i = 0)) → (net_i = Dnet_i)];  ∀i[(Int_i ∧ (c_i ≠ 0)) → (net_i = (Dnet_i + Unet_i)/2)]
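These cases translate line for line into a Python sketch (hypothetical function name; the Unet_i averaging from the rules above is folded in):

def net_input(is_root, is_leaf, unet_contribs, dnet, activation):
    """net_i for a node. unet_contribs lists the Unet_ij values from
    constraining parents (possibly empty); dnet is Dnet_i (unused at
    leaves); activation is a_i, used when there are no constraints."""
    if is_root:
        return 1.0                      # the root must be true
    c = len(unet_contribs)
    unet = activation if c == 0 else sum(unet_contribs) / c
    if is_leaf:
        return unet
    return dnet if c == 0 else (dnet + unet) / 2.0

# A leaf with conflicting parent constraints (like b in the earlier
# example) gets net_i = 0, so the logistic update gives P(a_i=1) = 1/2.
assert net_input(False, True, [1, -1], None, 1) == 0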

NNSAT is shown in Figure 3. The above discussion has explained how to compute net_i in the innermost loop. Both GSAT and NNSAT also have outer loops to control restarts: if the search stagnates, both algorithms are randomly reinitialized until their respective cutoffs are reached or until a satisfying assignment of the boolean variables is found. NNSAT uses simulated annealing to help it escape local optima, by allowing it to make multiple backwards moves in a row (unlike GSAT). A restart occurs when the temperature T reaches some minimum. The decay rate for the annealing schedule is reduced each time NNSAT is restarted and is inversely proportional to the number of nodes n in the network (decay_rate = 1/(restarts·n)). This allows NNSAT to spend more time between restarts when the problem is large or difficult. Assignments counts the number of boolean assignments tested during a run of the algorithm. An assignment consists of evaluating net_i and a_i (via the logistic function) for all nodes, in random order.
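The control flow of Figure 3 can be rendered as the following Python sketch. The satisfied and compute_net callables stand in for the constraint machinery of Section 2; they, and the node interface, are assumptions of this sketch rather than details from the paper.

import math, random

def nnsat_sketch(nodes, satisfied, compute_net, cutoff,
                 max_temp=0.15, min_temp=0.01):
    """nodes: objects with a mutable 'activation' in {+1, -1};
    satisfied(): tests the boolean expression; compute_net(node): net_i."""
    n = len(nodes)
    restarts = assignments = 0
    while True:
        restarts += 1
        decay_rate = 1.0 / (restarts * n)   # slower decay on each restart
        for node in nodes:
            node.activation = random.choice([1, -1])
        j = 0
        while True:
            if satisfied():
                return True                 # satisfying assignment found
            T = max_temp * math.exp(-j * decay_rate)
            if T < min_temp:
                break                       # restart from a new random state
            for node in random.sample(nodes, n):   # random node order
                net = compute_net(node)
                p_true = 1.0 / (1.0 + math.exp(-net / T))
                node.activation = 1 if random.random() < p_true else -1
            j += 1
            assignments += 1
            if assignments == cutoff:
                return False                # give up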

3. Experiments and Results

Particular classes of problems appear to be difficult for satisfiability algorithms. This paper concentrates on one such class, a fixed clause-length model referred to as Random L-SAT. The Random L-SAT problem generator creates random problems in CNF subject to three parameters: the number of variables V, the number of clauses C, and the number of literals per clause L.⁵ Each clause is generated by selecting L of the V variables uniformly randomly, negating each variable with probability 50%. Let R denote the ratio of the number of clauses to the number of variables (C/V). According to [4], hard problems are those where R is roughly 4.25, when L is 3. Although we could not obtain the specific problems used in their experiments, we generated random problems using their random problem generator (with L = 3 and R = 4.25).
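A Python sketch of such a generator follows. It implements the description above, not the original code of [4]; in particular, the choice to make the L variables within a clause distinct is our assumption.

import random

def random_lsat(V, L=3, R=4.25, seed=None):
    """Generate a Random L-SAT instance with C = round(R*V) clauses,
    each over L distinct variables, each literal negated with
    probability 1/2. Clauses are lists of signed ints (DIMACS-style)."""
    rng = random.Random(seed)
    clauses = []
    for _ in range(round(R * V)):
        variables = rng.sample(range(1, V + 1), L)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses

# A hard instance in the regime of [4]: V = 100 variables, 425 clauses.
problem = random_lsat(100)
assert len(problem) == 425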

  V      C     GSAT Assignments   NNSAT Assignments   # Solved    Cutoff
 100    425           21,000              3,700         18/30     100,000
 200    850          497,000             36,000         16/30     200,000
 300   1275        1,390,000             24,000         11/30     200,000
 400   1700        3,527,000             71,000          8/30     200,000
 500   2125        9,958,000            178,000          6/30     400,000

Table 1: GSAT and NNSAT on hard problems

Table 1 presents the comparison. For NNSAT, Min_temp = 0.01 and Max_temp = 0.15. "Assignments" indicates the average number of assignments tested before a satisfying assignment was found (for those problems actually solved). Since the problems are not all solvable, Table 1 also presents the number of problems solved (out of 30). "Cutoff" indicates the number of assignments tried before NNSAT decides a problem is unsolvable. The results for GSAT are taken from [4]. The percentage solved by GSAT is not reported; however, Selman (personal communication) states that GSAT solves roughly 50% of the 100 and 200 variable problems, and roughly 20% to 33% of the 500 variable problems. Also, the cutoffs for GSAT are not fully described. Although the lack of specific information makes a comparison difficult, NNSAT appears to solve roughly the same percentage of hard problems with far fewer assignments.

One problem with the above comparison is that each assignment in NNSAT is computationally more expensive than it is in GSAT. This occurs for two reasons. First, NNSAT can handle arbitrary boolean expressions, while GSAT is specifically designed for CNF expressions. The increase in generality comes at the price of greater overhead. Second, each assignment in GSAT is equivalent to flipping one boolean variable, while an assignment in NNSAT involves flipping potentially multiple variables (leaf nodes). To address the first issue, a specific version of NNSAT (called NNSAT-CNF) was written for CNF expressions, enormously increasing the efficiency of the algorithm. To address the second issue, focus was centered on flips rather than assignments.

As described by [7], a flip in GSAT and a flip in NNSAT-CNF have almost identical computational complexity, thus making this an ideal measure for comparison.⁶ Table 2 presents the results for NNSAT-CNF. Due to the increased efficiency of the algorithm it was possible to average the results over more problems than when using NNSAT (100 for each choice of V, instead of 30) and to greatly extend the cutoffs (which are still in terms of "assignments" to facilitate comparison with Table 1). Computational time is also reported (the results for GSAT are from [4]). The results for NNSAT-CNF are encouraging. In terms of "flips" (and time) NNSAT-CNF appears to scale better than GSAT, since it solves a greater percentage of the larger problems with less computational effort. The reason for this performance difference is not clear. However, [4] reports that GSAT performs worse when sideways steps are not allowed. A reasonable hypothesis is that GSAT would perform even better if it could occasionally take sequences of backward steps, as do NNSAT and NNSAT-CNF.⁷

⁵ A literal is a negated or non-negated boolean variable.
⁶ In [7] NNSAT-CNF is called SASAT, because with CNF the network is simple and the emphasis is on Simulated Annealing. None of the details of NNSAT as described in Section 2, however, are in that paper.
⁷ GSAT has recently been modified to make heuristically-driven backwards moves, which do indeed improve performance. These special backwards moves also improve the performance of NNSAT-CNF. See [7] for details.

  V      C     GSAT Flips   GSAT Time   NNSAT-CNF Flips   # Solved   NNSAT-CNF Time     Cutoff
 100    425       21,000      0.1 min          31,000       58/100        0.2 min       200,000
 200    850      497,000      2.8 min         396,000       44/100          3 min       400,000
 300   1275    1,390,000       12 min       1,924,000       48/100         13 min       800,000
 400   1700    3,527,000       34 min       2,269,000       45/100         15 min     1,000,000
 500   2125    9,958,000       96 min       4,438,000       41/100         30 min     1,600,000

Table 2: GSAT and NNSAT-CNF on hard problems

4. Conclusion

In this paper we consider the application of neural networks to boolean satisfiability problems and compare the resulting algorithm (NNSAT) with a greedy algorithm (GSAT) on a particular class of hard problems. With the given cutoffs, NNSAT appears to satisfy roughly as many hard SAT problems as GSAT. A specialized version of NNSAT for CNF expressions performs even better, solving at least as many hard SAT problems with less work. There are a number of potential items for future work. First, NNSAT uses a very simple annealing schedule, which is not optimized; recent work in adaptive annealing schedules holds the potential for improved performance. Second, it should be possible to add domain-dependent operators to NNSAT. For example, it may be possible to add stochastic operators based on the Davis-Putnam satisfiability algorithm. Third, NNSAT is intrinsically parallel and is a good match for parallel architectures. Andrew Sohn of the New Jersey Institute of Technology has ported NNSAT-CNF to an AP1000 multiprocessor [5]. Future work will concentrate on porting the general NNSAT algorithm to a parallel architecture.

Acknowledgements

I thank Diana Gordon, Ken De Jong, David Johnson, Bart Selman, Ian Gent, Toby Walsh, Antje Beringer, Andrew Sohn, Mitch Potter, and John Grefenstette for provocative and insightful comments.

References

[1] K. De Jong and W. Spears, "Using Genetic Algorithms to Solve NP-Complete Problems", International Conference on Genetic Algorithms, pp. 124-132, June 1989.
[2] J. Gu, "Efficient Local Search for Very Large-Scale Satisfiability Problems", SIGART Bulletin, 3(1), January 1992.
[3] J. Hopfield, "Neural Networks and Physical Systems with Emergent Collective Computational Abilities", Proc. Natl. Acad. Sci., 79, pp. 2554-2558, 1982.
[4] B. Selman, H. Levesque, and M. Mitchell, "A New Method for Solving Hard Satisfiability Problems", Proceedings of the 1992 AAAI Conference, pp. 440-446, 1992.
[5] A. Sohn, "Solving Hard Satisfiability Problems with Synchronous Simulated Annealing on the AP1000 Multiprocessor", Proceedings of the Seventh IEEE Symposium on Parallel and Distributed Processing, San Antonio, Texas, pp. 719-722, October 1995.
[6] W. Spears, "Using Neural Networks and Genetic Algorithms as Heuristics for NP-Complete Problems", Masters Thesis, Department of Computer Science, George Mason University, 1990.
[7] W. Spears, "Simulated Annealing for Hard Satisfiability Problems", to appear in the DIMACS Series on Discrete Mathematics and Theoretical Computer Science, 1996.
[8] R. Young and A. Reel, "A Hybrid Genetic Algorithm for a Logic Problem", Proceedings of the 9th European Conference on Artificial Intelligence, pp. 744-746, 1990.