The Backtracking Survey Propagation Algorithm for Solving Random K-SAT Problems

R. Marino (1,2), G. Parisi (3), and F. Ricci-Tersenghi (3)

(1) NORDITA, KTH Royal Institute of Technology and Stockholm University, Roslagstullsbacken 23, SE-10691 Stockholm, Sweden
(2) AlbaNova University Centre, KTH Royal Institute of Technology, Dept. of Computational Biology, SE-106 91 Stockholm, Sweden
(3) Dipartimento di Fisica, Sapienza Università di Roma, INFN, Sezione di Roma I, and Nanotech-CNR, P.le Aldo Moro 5, I-00185 Roma, Italy

October 13, 2015

Abstract

Satisfiability of random Boolean expressions built from many clauses with K variables per clause (random K-satisfiability) is a fundamental problem in combinatorial discrete optimization. Here we study random K-satisfiability for K = 3 and K = 4 with the Backtracking Survey Propagation algorithm. This algorithm is able to find, in a time linear in the problem size, solutions within a region never reached before, very close to the SAT-UNSAT threshold and even beyond the freezing threshold. For K = 3 the algorithmic threshold practically coincides with the SAT-UNSAT threshold. We also study the whitening process on all the solutions found by the Backtracking Survey Propagation algorithm: none contains frozen variables, and the whitening procedure is able to remove all variables, following a two-step process, in a time that diverges approaching the algorithmic threshold.

The K-satisfiability (K-SAT) problem is a combinatorial discrete optimization problem of N Boolean variables subject to M constraints. Each constraint, called a clause, takes the form of an OR of K literals (variables or their negations), and the problem is solvable when there exists one configuration of the variables, among the 2^N possible ones, that satisfies all constraints. The K-SAT problem for K ≥ 3 is a central problem in combinatorial optimization: it was the first problem to be shown NP-complete [1, 2, 3] and is still very much studied. Efforts from the theoretical computer science community have been especially devoted to the study of the random K-SAT ensemble [4, 5], where each formula is generated by randomly choosing M = αN clauses of K literals [6]; indeed, formulas from this ensemble become extremely hard to solve when the clause-to-variable ratio α grows [6], and yet the locally tree-like structure of the factor graph [7], representing the interaction network among variables, makes the random K-SAT ensemble a perfect candidate for an analytic solution. The study of random K-SAT problems and of the related solving algorithms is likely to shed light on the origin of computational complexity and to allow the development of improved algorithms.
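To make the ensemble definition concrete, here is a minimal Python sketch (ours, not part of the original paper; the signed-integer clause encoding is a common convention) of how a random K-SAT formula can be generated and a candidate assignment checked:

```python
import random

def random_ksat(N, alpha, K=3, seed=0):
    """Draw M = alpha*N clauses uniformly: each clause is K distinct
    variables, each negated with probability 1/2. A literal is encoded
    as +v (variable v) or -v (its negation), with v in 1..N."""
    rng = random.Random(seed)
    formula = []
    for _ in range(int(alpha * N)):
        variables = rng.sample(range(1, N + 1), K)
        formula.append([v if rng.random() < 0.5 else -v for v in variables])
    return formula

def is_sat(formula, assignment):
    """assignment[v] is True/False for v in 1..N; a clause (an OR of K
    literals) is satisfied if at least one literal evaluates to True."""
    return all(any(assignment[abs(l)] == (l > 0) for l in clause)
               for clause in formula)
```

For instance, a 3-SAT formula drawn at α = 4.2 is typically satisfiable, yet already hard for simple solvers.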

Both numerical [8] and analytical [9, 10] evidence suggests that a threshold phenomenon, i.e. a 0 − 1 law, takes place in random K-SAT ensembles: in the limit of very large formulas, N → ∞, for α < αs(K) a typical formula has a solution, while for α > αs(K) a typical formula is unsatisfiable. It has been very recently proved in Ref. [11] that for K large enough the SAT-UNSAT threshold αs(K) exists in the N → ∞ limit and coincides with the prediction from the cavity method [12]. A widely accepted conjecture is that the SAT-UNSAT threshold αs(K) exists for any value of K. The best prediction on the location of the SAT-UNSAT threshold αs(K) for any K comes from the cavity method [12, 13, 14, 15]: for example, αs(K = 3) = 4.2667 [14] and αs(K = 4) = 9.931 [15]. The cavity method also allows us to count clusters of solutions as a function of their internal entropy [16], and from this very detailed description of the space of solutions several phase transitions have been identified [17, 15]. Before reaching the SAT-UNSAT threshold αs, typical formulas undergo at αd a shattering of the solution space into a number of distinct clusters growing exponentially with the system size N, with very high barriers (both energetic and entropic) separating the clusters, and an even larger number of metastable states (local minima of the energy potential) that can trap algorithms looking for solutions. Later, at αc, a condensation transition takes place and the remaining solutions concentrate in a sub-extensive number of clusters, leading to effective long-range correlations among variables in solutions, which are hard to approximate by any algorithm with a finite horizon. In general αd ≤ αc ≤ αs holds.

One more feature of the solution space with a direct effect on algorithms that actually search for solutions is the presence of frozen variables. A cluster of solutions is said to have frozen variables if there is a finite fraction of variables that take the same value in all the solutions of that cluster. This concept is similar to the idea of the backbone of variables hard to set [18], but restricted to a given cluster of solutions. A widely believed conjecture is that finding solutions in clusters with frozen variables is extremely hard and practically impossible in a time growing linearly with the system size. At the clauses-to-variables ratio αr the so-called rigidity transition takes place, such that for α > αr with high probability a randomly chosen solution belongs to a frozen cluster. Eventually, at an even higher ratio αf, a freezing transition takes place, such that for α > αf all solutions belong to frozen clusters. In general it holds αd ≤ αr ≤ αf ≤ αs. For random K-SAT problems with K > 8 it has been rigorously proved [19, 20] that a value of αf strictly smaller than αs exists, while for small K it is again the cavity method that provides the best predictions for αr [15] and αf, and a statistical description of the freezing transition [21]. Actually, most of the analytical computations made up to now [17, 15] have computed the rigidity threshold αr rather than the freezing threshold αf, but in practice αr and αf are expected to be very close in random K-SAT for the small values of K we are going to use in this work.

Most of the above picture of the solution space has been proven rigorously in the large K limit [22, 23]. Moreover, given that random K-SAT problems for very large K become extremely hard and practically impossible to solve in the interesting region close to αs, the research on solving algorithms has mostly focused on finding solutions to random K-SAT problems with small values of K. We remind the reader that for K ≥ 3 the K-SAT problem is NP-complete in the worst case, and the statistical physics analysis predicts typical random K-SAT problems with K ≥ 4 and α close to αs to be among the hardest problems (the case K = 3 is somehow different from the statistical physics point of view [15], although still very hard to solve close to αs). Since we are interested in solving very large problems, we consider only algorithms whose running time scales almost linearly with the problem size. So, the effectiveness of an algorithm is given in terms of how close to the SAT-UNSAT threshold αs it can find solutions with high probability in the large N limit: we call this algorithmic threshold αa.

Algorithms searching for solutions to random K-SAT problems can be roughly classified into two main categories: (a) algorithms that search for a solution by performing a biased random walk in the space of configurations, and (b) algorithms that try to build a solution by assigning variables one by one, according to some estimated marginals. The former category includes WalkSat [24], focused Metropolis search (FMS) [25] and ASAT [26]; in the latter category we find Belief Propagation guided Decimation (BPD) [27] and Survey Inspired Decimation (SID) [28]. All these algorithms are very effective in finding solutions to random K-SAT problems. For example, for K = 4 we have αaBPD = 9.05, αaFMS ≈ 9.55 and αaSID ≈ 9.73, to be compared with the much lower algorithmic threshold αaGUC = 5.54 achieved by the best algorithm whose convergence to a solution can be proven rigorously [29]. Among the efficient algorithms above, only Belief Propagation guided Decimation can be solved analytically [27] to find the algorithmic threshold αaBPD; for all the other algorithms we are forced to run extensive numerical simulations in order to measure their performance and algorithmic thresholds. The algorithm achieving the best performance on several constraint satisfaction problems up to now seems to be Survey Inspired Decimation (SID), which has been successfully applied to the random K-SAT problem [12] and to the coloring problem [30]. The statistical properties of the SID algorithm have been studied in detail for the random 3-SAT problem in the limit of a very large number of variables in Refs. [31, 28]. Numerical experiments on random 3-SAT problems with a large number of variables, e.g. N = 3 × 10^5, show that in a time approximately linear in N the SID algorithm finds solutions up to αaSID ≈ 4.2525 [31], which is definitely smaller than, although very close to, αs(K = 3) = 4.2667. In other words, in the region αaSID < α < αs the problem is satisfiable for large N, but we are unable to find a solution using the SID algorithm.

To fill this gap we introduce a new algorithm for finding solutions to random K-SAT problems, the Backtracking Survey Propagation (BSP) algorithm. Like SID, this algorithm is based on the survey propagation equations derived within the cavity method [12, 31, 28], and it tries to build a satisfying assignment by setting variables one by one, according to the current best estimates of their marginal probabilities. However, unlike SID, the BSP algorithm allows variables previously set to be re-assigned, in order to improve the partial assignment already achieved. The idea of the backtracking [32] is that a choice made at the beginning of the decimation process, when most of the variables are unassigned, may later turn out to be suboptimal, and re-assigning a variable which is no longer consistent with the current best estimate of its marginal probability may lead to a better satisfying configuration. We do not expect the backtracking to be essential when correlations between variables are short-ranged, but approaching αs we know that correlations become long-ranged, and thus the assignment of a single variable may affect a huge number of other variables: this is the situation where we expect the backtracking to be really essential. The BSP algorithm is explained in detail in the Methods section, but it is worth stressing that programming this new algorithm is no more difficult than programming the standard SID algorithm. The only extra parameter is the ratio r between the number of backtracking moves (releasing one variable) and the number of decimation moves (fixing one variable). Obviously we must have r < 1 in order to eventually converge to a solution; for r = 0 we recover the SID algorithm, and the running time grows as 1/(1 − r) when r grows, but the algorithm complexity remains almost linear in the system size for any r < 1. We are going to show that this new kind of algorithm is able to find solutions in the whole SAT phase for K = 3, i.e. αaBSP ≈ αs, while for the more difficult K = 4 case the numerical evidence suggests that the BSP algorithm can go beyond the rigidity threshold, i.e. αr < αaBSP < αs, and very close to the SAT-UNSAT threshold.

[Figure 1 plot omitted: probability of a SAT assignment vs α in the range 9.5-9.9, curves for r = 0.9, 0.5, 0.0 at each of the two sizes.]

Figure 1: Probability that BSP finds a SAT assignment in random 4-SAT. The figure shows the probability that BSP finds a solution for a random 4-SAT problem as a function of α, computed over 100 instances for N = 5 × 10^3 (left curves) and over 50 instances for N = 5 × 10^4 (right curves). The probability grows by increasing both r and N, and becomes much sharper for larger N, suggesting a 0 − 1 law in the N → ∞ limit.

Results

In this section we show the results of our numerical simulations. In each plot having α on the abscissa, the right end of the plot coincides with the best current estimate of αs, in order to provide an immediate indication of how close to the SAT-UNSAT threshold the different algorithms can go.

Probability of finding a SAT assignment. We start by showing in Fig. 1 the probability that the BSP algorithm, run with different values of the r parameter (r = 0, 0.5 and 0.9) on random 4-SAT problems of two different sizes (N = 5 × 10^3 and N = 5 × 10^4), finds a SAT assignment. We notice that the probability of finding a solution increases both by increasing r and by increasing the system size N. Moreover, for larger N the behavior of the probability becomes sharper, as expected for a stochastic process having a 0 − 1 law in the N → ∞ limit. However, the extrapolation of the curves in Fig. 1 to the N → ∞ limit seems difficult, because of biased finite size effects.

Order parameter and algorithmic threshold. In order to compute the BSP algorithmic threshold in the N → ∞ limit we need to find an order parameter whose finite size effects are under control, or even better one which is size-independent. We identify this order parameter with the quantity Σres/Nres, where Σres and Nres are respectively the complexity (i.e. the log of the number of clusters) and the size of the residual problem, which has been simplified enough by decimating variables and is ready to be passed to WalkSat for the relatively easy assignment of the last Nres variables. The average, shown in Fig. 2, is performed only on runs that finally found a solution. More precisely, Σres is measured in the last step before WalkSat is called, since on calling WalkSat all the survey messages are null (see Methods for a more detailed description of the BSP algorithm) and so is the residual complexity. In general the residual complexity Σres is not always positive, and WalkSat is called anyway. However, all the simulations we ran provide solid evidence that if WalkSat is called with Σres < 0 then it is not able to find a solution, while if Σres > 0 WalkSat always returns a solution. The case Σres < 0 may happen either because the original problem was unsatisfiable or because the decimation procedure made some error and removed the few available solutions. Taking the average only on problems where a solution is eventually found, i.e. for which Σres ≥ 0, a null value of the mean residual complexity signals that the BSP algorithm is not able to find any solution, and provides a good estimate for the algorithmic threshold αaBSP [31].

As we see in the data shown in the upper panel of Fig. 2, the mean value of the intensive residual complexity Σres/Nres is practically size-independent, and linear fits provide very good estimates for the algorithmic threshold αaBSP for both random 3-SAT and 4-SAT (values reported in the caption of Fig. 2). For random 3-SAT the algorithmic threshold of BSP run with r = 0.9 practically coincides with the SAT-UNSAT threshold αs, thus providing strong evidence that BSP can find solutions in the entire SAT phase of random 3-SAT. For random 4-SAT the inequality αaBSP < αs seems to hold strictly, although the distance from the SAT-UNSAT threshold is really tiny and, more importantly, the algorithmic threshold for BSP run with r = 0.9 is already larger than the rigidity threshold αr = 9.88 [15]. This means that BSP is able to find solutions in a region with α > αr where the majority of solutions is in clusters with frozen variables and thus hard to find. Indeed, we will show below that BSP finds solutions in atypical clusters with no frozen variables, as has been observed also for other solution searching algorithms [34, 35]. For random 3-SAT problems the BSP algorithm seems to find solutions even beyond the freezing threshold, given that it finds solutions in the entire SAT phase. However, for random 3-SAT the estimate αf = 4.254 for the freezing threshold comes from exhaustive enumeration of small problems [33] and may have strong finite N corrections.

[Figure 2 plot omitted: Σ/N vs α; upper panel 4-SAT (α from 9 to 9.9, curves for r = 0.9, 0.5, 0.0 at two sizes), lower panel 3-SAT (α from 4.225 to 4.265, r = 0.9).]

Figure 2: Algorithmic threshold for BSP on random 3-SAT and 4-SAT problems. The upper panel shows Σres/Nres as a function of α for random 4-SAT problems with N = 5 × 10^3 (open symbols) and N = 5 × 10^4 (full symbols) and three values of the r parameter. The quantity Σres/Nres is practically size-independent and expected to vanish at the algorithmic threshold. Lines are linear fits to data with both values of N, and go to zero at αa = 9.827(3) for r = 0, αa = 9.884(3) for r = 0.5 and αa = 9.898(4) for r = 0.9. The lower panel shows Σres/Nres in random 3-SAT with N = 10^6 and r = 0.9. The linear fit (blue upper line) reaches zero at αa = 4.2678(4), while the black line is a linear fit forced to have its intercept at αs(K = 3), which is also acceptable with χ²/d.o.f. = 0.74.

Computational complexity. The running time of the BSP algorithm is O(N), since it is proportional to

η N / [(1 − r) PSAT] ,    (1)

where η is the mean number of iterations needed to solve the survey propagation equations, shown in Fig. 3, and PSAT is the probability of finding a solution, shown in Fig. 1. Eventually the computational complexity may become O(N log N) if the sorting of the single-variable marginal biases is done in a trivial way. Fig. 3 clearly shows that, both for random 3-SAT and random 4-SAT, the mean running time for reaching convergence of the survey propagation equations increases only slightly with increasing α, and is likely to have a finite value at the algorithmic threshold, where PSAT should go to zero in the N → ∞ limit. In the worst case we expect a dependence η = O(log N), which does not modify the statement that the BSP running time is at most O(N log N).
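As a worked illustration of Eq. (1) (our sketch, not code from the paper), the expected cost of a successful run can be estimated as follows; the only model parameters are the restart factor 1/PSAT and the backtracking overhead 1/(1 − r):

```python
def expected_bsp_cost(N, eta, r, p_sat):
    """Eq. (1): each of the ~N/(1-r) assignment/release moves costs eta
    survey propagation sweeps, and on average 1/p_sat restarts are needed
    before a run succeeds."""
    return eta * N / ((1.0 - r) * p_sat)

# e.g. N = 5e4, eta ~ 10 sweeps, r = 0.9, p_sat = 0.5:
print(expected_bsp_cost(5e4, 10, 0.9, 0.5))  # 1e7 sweep-units, still O(N)
```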

Whitening procedure. Given that the BSP algorithm is able to find solutions even beyond the freezing threshold, it is natural to check whether these solutions have frozen variables or not. We concentrate on solutions found for random 3-SAT problems with N = 10^6, since the large size of these problems makes the analysis very clean, and because we have a good number of solutions above the best estimate for the freezing threshold, αf = 4.254(9) [33].

[Figure 3 plot omitted: η vs α; upper panel 4-SAT r = 0.9 (α from 9.65 to 9.9), lower panel 3-SAT r = 0.9 (α from 4.225 to 4.265).]

Figure 3: BSP convergence time. The figure shows η, the mean number of survey propagation iterations needed to reach convergence, as a function of α for random 4-SAT problems with N = 5 × 10^4 and r = 0.9 (upper panel) and random 3-SAT problems with N = 10^6 and r = 0.9 (lower panel).

[Figure 4 plot omitted: mean fraction of non-? variables vs whitening iteration τ, and histograms of whitening times, for 3-SAT at α = 4.240, 4.254, 4.262.]

Figure 4: Whitening random 3-SAT solutions. The upper panel shows the mean fraction of non-? variables during the whitening procedure applied to all solutions found by the BSP algorithm. The line is the Survey Propagation prediction for the fraction of non-frozen variables at α = 4.262, and shows that solutions found by BSP are atypical. The lower panel shows the histograms of the whitening times, defined as the number of iterations required to reach a fraction 0.4 of non-? variables. Increasing α, both the mean value and the variance of the whitening time grow. Very similar (just shifted) histograms are obtained if the fraction of non-? variables defining the whitening time is set to 0.6, 0.2 or 0.

We check for frozen variables in a solution by running the whitening procedure starting from that solution. The whitening procedure iteratively assigns a ? value to variables that belong only to clauses which are either already satisfied by another variable or already contain a ? variable. The procedure is continued until all variables are ? or a fixed point is reached: non-? variables at the fixed point correspond to frozen variables in the starting solution.
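The whitening procedure is simple enough to be stated in a few lines. The following is our illustrative Python sketch (not the authors' code), reusing the signed-literal clause encoding from the first sketch:

```python
STAR = None  # the "joker" value ?

def whiten(formula, solution):
    """Whitening procedure. `solution` maps each variable to True/False
    and must satisfy `formula`. In each sweep a variable becomes ? if
    every clause it appears in either contains a ? variable or is
    satisfied by another variable. Returns the final assignment and the
    number of sweeps (the whitening time tau)."""
    a = dict(solution)
    tau = 0
    while True:
        to_star = []
        for v, val in a.items():
            if val is STAR:
                continue
            # v is releasable if every clause containing v is "white":
            # it holds a ? or is satisfied by a literal other than v's
            if all(any(a[abs(l)] is STAR or
                       (abs(l) != v and a[abs(l)] == (l > 0))
                       for l in clause)
                   for clause in formula if v in map(abs, clause)):
                to_star.append(v)
        if not to_star:
            return a, tau        # fixed point: non-? variables are frozen
        for v in to_star:        # parallel update of the whole sweep
            a[v] = STAR
        tau += 1
```

A solution with no frozen variables flows to the all-? configuration; otherwise the non-? variables at the fixed point are the frozen ones.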

We find that all solutions discovered by BSP are converted to the all-? configuration by the whitening procedure, thus showing that solutions found by BSP have no frozen variables. This is expected, since it is widely believed that finding solutions with a finite fraction of frozen variables is very hard and probably takes a time exponential in the problem size. So the BSP algorithm is actually finding solutions even beyond the freezing threshold αf by focusing on the sub-dominant clusters of unfrozen solutions.

The whitening procedure leads to a relaxation of the number of non-? variables, as a function of the number of iterations τ, that follows a kind of two-step process [21], with an evident plateau (see upper panel of Fig. 4) that becomes longer as α increases towards the algorithmic threshold. The time for leaving the plateau has large fluctuations, as shown in the lower panel of Fig. 4, but after having left the plateau the dynamics of the whitening procedure is almost deterministic. Indeed, plotting the mean fraction of non-? variables as a function of the time remaining to reach the all-? configuration, τf − τ, we see that fluctuations are strongly suppressed and the relaxation is practically deterministic (see Fig. 5).

[Figure 5 plot omitted: mean fraction of non-? variables vs τf − τ for 3-SAT at α = 4.240, 4.254, 4.262.]

Figure 5: Whitening random 3-SAT solutions. Mean fraction of non-? variables in the whitening procedure applied to all random 3-SAT solutions, plotted versus the time remaining to the end of the whitening procedure. τf is the time at which the whitening procedure arrives at the all-? configuration, and it fluctuates from problem to problem. Errors are much smaller than in the upper panel of Fig. 4 because the trajectory from the plateau to the end is practically deterministic. The major source of fluctuations is the time for leaving the plateau.

[Figure 6 plot omitted: τ(0), τ(0.4), τ(0.6) vs Σres for 3-SAT and 4-SAT, double-logarithmic scale.]

Figure 6: Critical exponent for the whitening time divergence. The whitening time τ(f), defined as the mean time needed to reach a fraction f of non-? variables in the whitening procedure, is plotted as a function of Σres for random 4-SAT problems with N = 5 × 10^4 and r = 0.9 (empty symbols) and for random 3-SAT problems with N = 10^6 and r = 0.9 (filled symbols) on a double-logarithmic scale. The lines are fits to Eq. (2) and predict ν = 0.269(5) for K = 4 and ν = 0.281(6) for K = 3.

Critical exponent for the whitening time divergence. Given the clear increase of the whitening time approaching the algorithmic threshold, we study whether such an increase follows any critical behavior, i.e. a power-law divergence as a function of αa − α or Σres, which are linearly related. In Fig. 6 we plot, on a double-logarithmic scale, the mean whitening time τ(f) as a function of the residual complexity Σres, for different choices of the fraction f of non-? variables defining the whitening time. Data points are interpolated via the following power law

τ(f) = C(f) + A(f) (αa − α)^(−ν) = C(f) + B(f) Σres^(−ν) ,    (2)

where the critical exponent ν is forced to be the same for all fractions f, while the other two fitting parameters are allowed to depend on f. Joint interpolations return the following best estimates for the critical exponent: ν = 0.269(5) for K = 4 and ν = 0.281(6) for K = 3. The two estimates turn out to be well compatible within errors, thus suggesting a sort of universality for the critical behavior approaching the algorithmic threshold αaBSP. Nonetheless, a word of caution is needed, since the solutions we are using as starting points for the whitening procedure are atypical solutions (otherwise they would likely contain frozen variables and would not flow to the all-? configuration under the whitening procedure). So, while finding universal critical properties in a dynamical process is definitely good news, how to relate them to the behavior of the same process on typical solutions is not obvious (and indeed for the whitening process starting from typical solutions one would expect the naive mean field exponent ν = 1/2, which is much larger than the one we are finding).
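A joint fit of Eq. (2), with one shared exponent ν and f-dependent amplitudes C(f), B(f), can be set up as in the following sketch (ours; the data below are synthetic placeholders, not the measurements of Fig. 6):

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic placeholder data: whitening times tau(f) measured at a
# common grid of residual complexities Sigma_res.
fractions = [0.0, 0.4, 0.6]
Sigma_res = np.linspace(50, 2000, 30)
rng = np.random.default_rng(0)
tau = {f: 5 + 10 * (1 + f) * Sigma_res ** -0.28
          + 0.01 * rng.standard_normal(30)
       for f in fractions}

def joint_model(Sigma, nu, *CB):
    """Eq. (2): tau(f) = C(f) + B(f) * Sigma^(-nu), with one shared nu;
    CB packs the pairs (C(f), B(f)), one pair per fraction f."""
    return np.concatenate([CB[2*i] + CB[2*i+1] * Sigma ** (-nu)
                           for i in range(len(fractions))])

y = np.concatenate([tau[f] for f in fractions])
p0 = [0.3] + [5.0, 10.0] * len(fractions)   # initial guess: nu, (C, B) per f
popt, pcov = curve_fit(joint_model, Sigma_res, y, p0=p0)
print("shared exponent nu =", popt[0])      # ~0.28 on this synthetic data
```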

Discussion

We have presented the Backtracking Survey Propagation (BSP) algorithm for finding solutions to random K-SAT problems and provided numerical evidence that it works much better than previously available algorithms. For K = 3, BSP can find solutions practically up to the SAT-UNSAT threshold αs, while for K = 4 a tiny gap to the SAT-UNSAT threshold still remains, but the algorithmic threshold αaBSP seems to be located beyond the rigidity threshold αr.

Beating the rigidity threshold, i.e. finding solutions in a region where the majority of solutions belongs to clusters with frozen variables, is hard, but not impossible, because smart algorithms can look for a solution in the sub-dominant unfrozen clusters [34, 35]. Indeed, we have shown that all solutions found by BSP have no frozen variables, even for α > αr. The algorithmic threshold αa is defined as the threshold unbeatable by any polynomial-time algorithm, and several conjectures exist trying to link αa to one of the phase transitions in the space of solutions. The numerical findings above confute the initial conjecture [17] that αa = αr. The most recent conjecture states that αa = αf, that is, the algorithmic threshold coincides with the freezing threshold, where the complexity of clusters with only unfrozen variables, Σunfrozen, goes to zero. In general it is expected that αr ≤ αa = αf ≤ αs. Our numerical results are consistent with these inequalities for K = 4 (where αr is known analytically) and thus promote the BSP algorithm to a close-to-optimal algorithm for random K-SAT, at least for small values of K and large problem sizes N. It would be very interesting to run BSP on random hypergraph bicoloring problems, where the threshold values are known [36] and a very recent computation provides an estimate for Σunfrozen [37].

Similarly to the SID algorithm, a smart implementation of the BSP algorithm has running times approximately linear in N, and we expect worst-case running times O(N log(N)^2), due to the sorting of the biases and to the growth of the diameter of the random factor graph. It is worth noticing that the BSP algorithm, like the SID algorithm, is easy to parallelize, since most of the operations are local and do not require any strong centralized control. Obviously the effectiveness of a parallel version would largely depend on the topology of the factor graph representing the specific problem: if the factor graph is an expander, then splitting the problem over several cores may require too much inter-core bandwidth, but in problems having a natural hierarchical structure the parallelization may lead to further performance improvements.

The backtracking introduced in the BSP algorithm helps a lot in correcting errors made during the partial assignment of variables, and this allows the BSP algorithm to reach solutions at large α values. Clearly, the price we pay is that too frequent backtracking makes the algorithm slower. A natural direction for improving this class of algorithms would be to use biased marginals focusing on solutions which are easy for the algorithm to reach. For example, in the region α > αr the measure is concentrated on solutions with frozen variables, but these cannot really be reached by the algorithm. The backtracking thus intervenes and corrects the partial assignment until a solution with unfrozen variables is found by pure chance. If the marginals could be computed from a new biased measure concentrated on the clusters with no frozen variables, this could make the algorithm go immediately in the right direction, and hopefully much less backtracking would be needed.

Methods

The Backtracking Survey Propagation (BSP) algorithm is based on the survey propagation equations and can be viewed as an extension of the Survey Inspired Decimation (SID) algorithm.

Survey Inspired Decimation. The SID algorithm has been proposed for finding solutions to random K-satisfiability problems in the region where solutions are shattered into an exponentially large number of different clusters [12, 13]. A complete description of the algorithm is given by Braunstein et al. [28]. The SID algorithm starts by solving iteratively the survey propagation equations: after convergence to the fixed point, each variable receives some messages (called surveys) that represent a simplified (projected) marginal, carrying information about the fraction of clusters where the variable i is forced to be positive (wi+), negative (wi−), or not forced at all. From these surveys a bias for each variable i can be computed [31]:

P(i) = 1 − min(wi+, wi−) ,    (3)

such that assigning a variable of large bias minimizes the probability of making a wrong choice. The SID algorithm then proceeds by assigning variables (decimation), starting from those with the largest bias. In order to keep the algorithm efficient, at each decimation step a small fraction f of variables is assigned to their most probable values (according to the surveys). After each decimation step, the survey propagation equations are solved again on the subproblem, which is obtained by removing satisfied clauses and by reducing those containing a false literal (unless a zero-length clause is generated, in which case the algorithm returns a failure). The main idea of the algorithm is that, by fixing the variables which are almost certain to their most probable value, one can reduce the size of the problem without reducing too much the number of solutions. The number of solutions, or better the number of clusters of solutions that we can count via the complexity Σ, should always remain positive during the decimation process in order to achieve the goal of finding a solution.
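In code, Eq. (3) and the fixing rule read as follows (our sketch; `surveys` is assumed to map each variable to the pair (wi+, wi−) produced by a survey propagation fixed point, whose computation is not shown):

```python
def bias(w_plus, w_minus):
    """Eq. (3): P(i) = 1 - min(wi+, wi-). The bias is close to 1 when
    almost all clusters force the variable in one direction."""
    return 1.0 - min(w_plus, w_minus)

def decimation_step(surveys, assigned, frac=0.01):
    """One SID decimation step: among the still-unassigned variables,
    fix the fraction `frac` with the largest bias, each to its most
    probable value, i.e. the sign with the larger forcing fraction."""
    free = [v for v in surveys if v not in assigned]
    free.sort(key=lambda v: bias(*surveys[v]), reverse=True)
    for v in free[:max(1, int(frac * len(free)))]:
        w_plus, w_minus = surveys[v]
        assigned[v] = w_plus >= w_minus
    return assigned
```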

At every fixed point of the survey propagation equations (both on the original problem and on all the subproblems generated during decimation) the complexity Σ can be computed from the surveys [28], and its evolution during the SID algorithm can be very informative [31]. Indeed, it is found that, if the complexity Σ becomes too small or negative, the SID algorithm is likely to fail, either because the iterative method for solving the survey propagation equations no longer converges to a fixed point, or because a contradiction is generated by assigning variables. In these cases the SID algorithm returns a failure. If the complexity Σ remains always positive, the SID algorithm is likely to reduce the problem so much that eventually the surveys become all null. This is a strong indication that the residual subproblem is easy and can be solved also by other standard and fast SAT solvers. Then WalkSat is run on the residual problem, and it always returns a solution if the residual complexity Σres, computed on the last non-trivial fixed point of the survey propagation equations, is positive. A careful analysis of the SID algorithm for random 3-SAT problems of size N = O(10^5) shows that the algorithmic threshold achievable by SID is αaSID = 4.2525 [31], which is close to, but definitely smaller than, the SAT-UNSAT threshold αs = 4.2667. The experimentally measured running time of the SID algorithm is O(N log N); it is not excluded that an extra factor log(N) could be present in the convergence time of the iterative solution of the survey propagation equations, but for sizes up to N = O(10^7) it has not been observed [28].

Backtracking Survey Propagation. Aiming to improve the SID algorithm so as to find solutions also in the region αaSID < α < αs, one has to reconsider the way variables are assigned. The fact that the SID algorithm assigns each variable only once is clearly a strong limitation, especially in a situation where correlations between variables become extremely strong and long-ranged. In difficult problems it can easily happen that one realizes a variable is taking the wrong value only after having assigned some of its neighbouring variables. However, the SID algorithm is not able to resolve this kind of frustrating situation. The BSP algorithm tries to solve such problematic situations by introducing a new backtracking step, where a variable already assigned can be released and eventually re-assigned in a future decimation step. It is not difficult to understand whether a variable is worth releasing. Indeed, the bias P(i) can be computed also for a variable i already assigned: if the bias P(i), which was large at the time variable i was assigned, gets strongly reduced by the effect of assigning other variables, then it is likely that releasing variable i may be beneficial in the search for a solution. So both the variables to be fixed in a decimation step and the variables to be released in a backtracking step are chosen according to the biases P(i): variables to be fixed have large biases and variables to be released have small biases. The BSP algorithm then proceeds by performing decimation steps and backtracking steps (always on a fraction f of variables, in order to keep the algorithm efficient), and by running the iterative method for solving the survey propagation equations after each step (either decimation or backtracking), so that it can update the surveys and the biases. The choice between a decimation and a backtracking step is taken according to a stochastic rule (unless there are no variables to release), where the parameter r ∈ [0, 1) represents the ratio of backtracking steps to decimation steps. Obviously, for r = 0 we recover the SID algorithm, since no backtracking step is ever done. Increasing r makes the algorithm slower by a factor 1/(1 − r), because several variables are re-assigned many times, but it can correct wrong choices made during the decimation and achieve better results than the SID algorithm. The BSP algorithm can stop for the same reasons the SID algorithm does: either the survey propagation equations cannot be solved iteratively, or the generated subproblem contains a contradiction. Both cases happen when the complexity Σ becomes too small or negative. On the contrary, if the complexity remains always positive, BSP eventually generates a subproblem where all surveys are null, and this subproblem is passed to WalkSat, which always finds a solution if the residual complexity Σres is positive.
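Putting the pieces together, the control flow of BSP can be summarized by the following skeleton (our reconstruction of the description above; `sp_solver`, `has_contradiction` and `walksat` are assumed callables, not implemented here, and `bias` is the function from the previous sketch):

```python
import random

def bsp(formula, sp_solver, has_contradiction, walksat,
        r=0.9, frac=0.01, seed=0):
    """High-level BSP control flow (a skeleton under stated assumptions):
      sp_solver(formula, assigned) -> (surveys, Sigma) on the subproblem
        with `assigned` variables fixed; surveys maps every variable to
        its projected marginal (w_plus, w_minus); (None, None) means the
        iteration did not converge.
      has_contradiction(formula, assigned) -> True if some clause has all
        literals false under the partial assignment.
      walksat(formula, assigned) -> a completion of `assigned`, or None."""
    rng = random.Random(seed)
    assigned = {}                                    # variable -> bool
    while True:
        surveys, Sigma = sp_solver(formula, assigned)
        if surveys is None or Sigma < 0:
            return None                              # SP failure
        if all(w == (0.0, 0.0) for w in surveys.values()):
            return walksat(formula, assigned)        # easy residual problem
        n = max(1, int(frac * len(surveys)))
        # Backtrack with probability r/(1+r): the ratio of backtracking to
        # decimation moves is then r, giving the 1/(1-r) slowdown.
        if assigned and rng.random() * (1.0 + r) < r:
            for v in sorted(assigned, key=lambda u: bias(*surveys[u]))[:n]:
                del assigned[v]          # release the smallest-bias variables
        else:
            free = [v for v in surveys if v not in assigned]
            for v in sorted(free, key=lambda u: bias(*surveys[u]))[-n:]:
                assigned[v] = surveys[v][0] >= surveys[v][1]  # most probable
        if has_contradiction(formula, assigned):
            return None                  # a zero-length clause was generated
```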

References

[1] S. A. Cook, The complexity of theorem proving procedures, Proc. 3rd Ann. ACM Symp. on Theory of Computing, Assoc. Comput. Mach., New York, p. 151 (1971).
[2] M. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-completeness, Freeman, San Francisco (1979).
[3] C. H. Papadimitriou, Computational Complexity, Addison-Wesley (1994).
[4] M. Mézard and A. Montanari, Information, Physics, and Computation, Oxford University Press (2009).
[5] C. Moore and S. Mertens, The Nature of Computation, Oxford University Press (2011).
[6] S. A. Cook and D. G. Mitchell, Finding Hard Instances of the Satisfiability Problem: A Survey, in: Satisfiability Problem: Theory and Applications, D. Du, J. Gu and P. Pardalos (Eds), DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Volume 35 (1997).
[7] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, Factor Graphs and the Sum-Product Algorithm, IEEE Trans. Inform. Theory 47, 498 (2002).
[8] S. Kirkpatrick and B. Selman, Critical behaviour in the satisfiability of random Boolean expressions, Science 264, 1297 (1994).
[9] O. Dubois, Y. Boufkhad and J. Mandler, Typical random 3-SAT formulae and the satisfiability threshold, in Proc. 11th ACM-SIAM Symp. on Discrete Algorithms, 124 (San Francisco, CA, 2000).
[10] O. Dubois, R. Monasson, B. Selman and R. Zecchina (Eds.), Phase Transitions in Combinatorial Problems, Theoret. Comp. Sci. 265 (2001).
[11] J. Ding, A. Sly, and N. Sun, Proof of the satisfiability conjecture for large k, arXiv:1411.0650 (2014).
[12] M. Mézard, G. Parisi and R. Zecchina, Analytic and algorithmic solution of random satisfiability problems, Science 297, 812 (2002).
[13] M. Mézard and R. Zecchina, The random K-satisfiability problem: from an analytic solution to an efficient algorithm, Phys. Rev. E 66, 056126 (2002).
[14] S. Mertens, M. Mézard and R. Zecchina, Threshold Values of Random K-SAT from the Cavity Method, Random Struct. Alg. 28, 340 (2006).
[15] A. Montanari, F. Ricci-Tersenghi and G. Semerjian, Clusters of solutions and replica symmetry breaking in random k-satisfiability, J. Stat. Mech. P04004 (2008).
[16] F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborová, Gibbs States and the Set of Solutions of Random Constraint Satisfaction Problems, PNAS 104, 10318 (2007).
[17] L. Zdeborová and F. Krzakala, Phase transitions in the coloring of random graphs, Phys. Rev. E 76, 031131 (2007).
[18] R. Monasson, R. Zecchina, S. Kirkpatrick, B. Selman and L. Troyansky, Determining computational complexity from characteristic phase transitions, Nature 400, 133 (1999).
[19] D. Achlioptas and F. Ricci-Tersenghi, On the Solution-Space Geometry of Random Constraint Satisfaction Problems, STOC '06: Proceedings of the 38th annual ACM symposium on Theory of Computing (Seattle, WA, USA), 130 (2006).
[20] D. Achlioptas and F. Ricci-Tersenghi, Random Formulas Have Frozen Variables, SIAM J. Comput. 39, 260 (2009).
[21] G. Semerjian, On the freezing of variables in random constraint satisfaction problems, J. Stat. Phys. 130, 251 (2008).
[22] D. Achlioptas and A. Coja-Oghlan, Algorithmic Barriers from Phase Transitions, in FOCS '08: Proceedings of the 49th annual IEEE symposium on Foundations of Computer Science, 793 (2008).
[23] D. Achlioptas, A. Coja-Oghlan, and F. Ricci-Tersenghi, On the Solution-Space Geometry of Random Constraint Satisfaction Problems, Random Struct. Alg. 38, 251 (2011).
[24] B. Selman, H. A. Kautz, and B. Cohen, Noise strategies for improving local search, in Proc. AAAI-94, 337 (1994).
[25] S. Seitz, M. Alava, and P. Orponen, Focused local search for random 3-satisfiability, J. Stat. Mech. P06006 (2005).
[26] J. Ardelius and E. Aurell, Behavior of heuristics on large and hard satisfiability problems, Phys. Rev. E 74, 037702 (2006).
[27] F. Ricci-Tersenghi and G. Semerjian, On the cavity method for decimated random constraint satisfaction problems and the analysis of belief propagation guided decimation algorithms, J. Stat. Mech. P09001 (2009).
[28] A. Braunstein, M. Mézard, and R. Zecchina, Survey propagation: An algorithm for satisfiability, Random Struct. Alg. 27, 201 (2005).
[29] A. Frieze and S. Suen, Analysis of two simple heuristics on a random instance of k-SAT, J. Algor. 20, 312 (1996).
[30] R. Mulet, A. Pagnani, M. Weigt, and R. Zecchina, Coloring Random Graphs, Phys. Rev. Lett. 89, 268701 (2002).
[31] G. Parisi, Some remarks on the survey decimation algorithm for K-satisfiability, arXiv:cs/0301015 (2003).
[32] G. Parisi, A backtracking survey propagation algorithm for K-satisfiability, arXiv:cond-mat/0308510 (2003).
[33] J. Ardelius and L. Zdeborová, Exhaustive enumeration unveils clustering and freezing in the random 3-satisfiability problem, Phys. Rev. E 78, 040101 (2008).
[34] L. Dall'Asta, A. Ramezanpour, and R. Zecchina, Entropy landscape and non-Gibbs solutions in constraint satisfaction problems, Phys. Rev. E 77, 031118 (2008).
[35] L. Zdeborová and M. Mézard, Constraint satisfaction problems with isolated solutions are hard, J. Stat. Mech. P12004 (2008).
[36] T. Castellani, V. Napolano, F. Ricci-Tersenghi, and R. Zecchina, Bicolouring random hypergraphs, J. Phys. A 36, 11037 (2003).
[37] G. Semerjian, private communication.

Acknowledgements

We thank K. Freese, R. Eichhorn and E. Aurell for useful discussions. This research is supported by the Swedish Science Council through grant 621-2012-2982.
