Sampling strategies and variable selection in weighted degree heuristics⋆

Diarmuid Grimes and Richard J. Wallace
Cork Constraint Computation Centre and Department of Computer Science, University College Cork, Cork, Ireland
email: d.grimes, [email protected]

⋆ This work was supported by Science Foundation Ireland under Grants 00/PI.1/C075 and 05/IN/1886.
Abstract. An important class of CSP heuristics works by sampling information during search in order to inform subsequent decisions. An example is the use of failures, in the form of constraint weights, to guide variable selection in the weighted degree procedure. The present research analyses the characteristics of the sampling process in this procedure and the manner in which the resulting information is used, in order to better understand this type of strategy and to discover further enhancements.
1 Introduction
Recent years have seen the emergence of new and more powerful methods for choosing variables and variable assignments during CSP search. In the past, most heuristics based their decisions on either the initial state or the current state of search. Newer heuristics instead use within-problem learning, basing their decisions on past states in order to adapt to a given problem. Among these are Boussemart et al.'s "weighted-degree" heuristic (wdeg) [1] and a variation on the weighted degree strategy that we call "random probing" [2, 3]. The significance of these methods is that they can locate specific sources of failure, including, at one extreme, insoluble subproblems. In this work, we study these techniques in order to characterise them more adequately, focusing on constraint weighting. (Although the following can be extended to problems involving n-ary constraints, for simplicity we limit our discussion to binary CSPs.) We examine two aspects of these methods:
• the nature and quality of the information gained during sampling
• the manner in which this information is used during subsequent search
2 Characterising Algorithms Based on Constraint Weighting
2.1 Description of heuristics based on constraint weights
In the weighted degree procedure, a constraint's weight is incremented during consistency propagation whenever the constraint causes a domain wipeout. The weighted degree of a variable is the sum of the weights of the constraints in the current search state associated with that variable. The weighted degree heuristic (wdeg) chooses the variable with the largest weighted degree. (The heuristic therefore acts identically to the forward-degree heuristic until at least one failure has occurred.) Domain information can also be incorporated by choosing the variable that minimises the ratio of domain size to weighted degree (dom/wdeg). Given the general effectiveness of the dom/wdeg strategy, this is the method used in most experiments reported in this paper.
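As a concrete illustration of this bookkeeping, the sketch below shows one way constraint weights and dom/wdeg selection could be implemented; the class and function names (Constraint, on_wipeout, select_dom_wdeg) are ours and do not come from any particular solver.

```python
# Illustrative sketch (not the authors' implementation) of constraint weights
# and dom/wdeg variable selection for a binary CSP.
from collections import defaultdict

class Constraint:
    def __init__(self, scope):
        self.scope = scope              # the two variables the constraint relates

weight = defaultdict(lambda: 1)         # initial weight 1, so wdeg = forward degree

def on_wipeout(constraint):
    """Called by the propagator when 'constraint' empties some domain."""
    weight[constraint] += 1

def wdeg(x, unassigned, constraints_on):
    """Weighted degree of x: sum of the weights of x's constraints that are
    still in the current problem, i.e. involve another uninstantiated variable."""
    return sum(weight[c] for c in constraints_on[x]
               if any(y != x and y in unassigned for y in c.scope))

def select_dom_wdeg(unassigned, domains, constraints_on):
    """Choose the unassigned variable minimising |dom(x)| / wdeg(x)."""
    return min(unassigned,
               key=lambda x: len(domains[x]) / max(wdeg(x, unassigned, constraints_on), 1))
```

With all weights initialised to 1, select_dom_wdeg coincides with min dom/fwddeg until the first wipeout occurs, matching the remark above.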
Random probing uses a more systematic method of sampling. It begins with a series of "random probes", in which variable selection is done randomly and weights are incremented in the usual fashion but are not used to guide search. Probing is run to a fixed cutoff C, for a fixed number of restarts R. On the last restart (the final run), the cutoff is removed and the wdeg or dom/wdeg heuristic uses the weights accumulated for each variable. On the final run one can either use these weights as the final weights for the constraints or continue to update them, thereby combining global weights with weights that are local to the part of the search space currently being explored.
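The random probing strategy can be summarised by a small driver of the following shape. The helpers bounded_search and complete_search, and the way heuristics are named, are placeholders of our own; only the overall two-stage structure reflects the procedure described above.

```python
def random_probing(problem, weight, restarts_R, cutoff_C,
                   bounded_search, complete_search, freeze_weights=True):
    """Sketch of random probing: R restarts with random variable selection,
    each stopped at cutoff C, accumulating constraint weights; one final
    complete run is then guided by (dom/)wdeg over those weights."""
    # Sampling phase: variables are chosen at random; wipeouts increment the
    # constraint weights, but the weights do not guide these runs.
    for _ in range(restarts_R):
        bounded_search(problem, weight, var_order="random", cutoff=cutoff_C)

    # Variable-selection phase: the final run uses the accumulated weights.
    # With freeze_weights=True the weights are not updated further ("frozen"),
    # so only the globally sampled information guides search; otherwise the
    # global weights are combined with locally accumulated ones.
    return complete_search(problem, weight, var_order="dom/wdeg",
                           update_weights=not freeze_weights)
```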
2.2 Characterisation of constraint weighting procedures
The weighted degree procedures can be conceived in terms of an overall strategy that combines two heuristic principles: the Fail-First Principle and an additional Contention Principle, which says that variables directly related to conflicts are more likely to cause failure if they are chosen in place of other variables. A somewhat more precise statement of this principle is:

Contention Principle. If a constraint is identified as a source of contention, then variables associated with that constraint are more likely to cause failure after instantiation.

This leads to the rule of thumb, choose the variable associated with the most contentious constraints, which is the basis for the weighted degree heuristic.

The validity of the Contention Principle depends on the undirected character of constraints. Suppose the domain of variable xk is wiped out after values have been assigned to variables x1, x2, ..., xj. In this case, if xk is assigned a value before these other variables, then at least one of their domains will be reduced. Moreover, the greater the number of partial assignments affecting xk, the greater the likelihood that assigning xk a value early in search will affect some other variable. (Note that a similar argument can be made even if the domain of xk is only reduced.)

There are two aspects, or phases, in the weighted degree procedure (in any of its forms): a sampling phase and a variable selection phase. In the sampling phase, we try to sample the likelihood of failure. This is done by tallying domain wipeouts. A "wipeout" is a complex event consisting of a set of domain deletions that together remove an entire domain. In addition, deletions are associated, via episodes of constraint propagation, with a set of assignments. A wipeout therefore has two aspects: the reduction itself and the 'context' of that reduction, which includes both a set of assignments and the constraint Cik associated with the final domain reduction. Constraint weights combine these events into event-classes involving all the assignment sets that can lead to a wipeout of Dk via constraint Cik (note that each of those assignment sets also leads to a wipeout of domain Di via the same constraint).

In the weighted degree algorithm, sampling is done in the context of systematic search. This means that the same basic event cannot be sampled more than once. At the same time, sampling is heavily biased in favour of assignment tuples related to the order of instantiation. Moreover, the bias changes during search, because partial assignments are increasingly determined by the results of the sampling itself. This creates a negative feedback effect: variables associated with high weights are selected earlier in search, after which their weights are less likely to increase, since their constraints are removed from the current problem. Random probing allows us to sample more systematically. In doing so, we may be able to uncover cases of "global contention", i.e. contention that holds across the entire search space. We also assume that variables associated with global contention are most likely to reduce overall search effort if they are instantiated at the top of the search tree.

In the variable selection phase, estimates of failure in the form of constraint weights are used to guide heuristic selection. As noted above, constraint weights give an estimate of the overall contention associated with a variable. However, because of sampling biases, as well as possible dependencies among failures, the relative values of these sums bear an unknown relation to the probabilities of the underlying compound events.
3 Experimental Methods and Reference Results

Table 1. Search Efficiency on Random and k-Colour Problems with Selected Heuristics

heuristic           random     colouring
lexical            253,607     4,547,058
max static deg        2000        22,460
Brelaz                3101           910
min dom/fwddeg        1621          1285
max fwddeg            2625       347,306
min dom/wdeg          1538          1029
max wdeg              2070        16,866

Notes. Random problems are …; k-colour problems are …, where 6 is the number of colours and 0.27 is the density. Measure is mean search nodes.
The bulk of the present analysis employs two sets of problems: 100 random binary problems and 100 random k-colouring problems. To facilitate data collection, problem parameters were chosen so that problems were fairly easy to solve, although in both cases they lie in the critical complexity region. The basic sets of problems had solutions. In most cases, results for more difficult problems of the same types corroborate the present ones. Unless otherwise noted, all results are for chronological backtracking using maintained arc consistency (MAC) with lexical value ordering. In experiments on search efficiency, search was for one solution. For all results using the random probing strategy, the values given are means over 10 experiments, each with a different random seed.

Since we are trying to assess the quality of search given different kinds and amounts of information, measures of effort are confined to the search itself. Results for random probing are therefore for the final run following the sampling phase. In the final run, weights were no longer incremented after failure ("frozen weights"). This allows us to assess the quality of the information provided by probing without contamination from further updating during the final run. Selected results for various heuristics are shown in Table 1. These include the foundation heuristics used by weighted degree and random probing, namely maximum forward degree and minimum domain/forward-degree. To indicate problem difficulty, results for lexical variable ordering are also included.
4 Search Efficiency with Weighted Degree Strategies
Elsewhere we have collected results for a variety of problems using weighted degree and random probing, including both soluble and insoluble problems and both random and structured problems [2, 3]. In those papers we showed that, even with the preprocessing work included, the random probing approach improves over dom/wdeg, in some cases by orders of magnitude. This work demonstrates the generality of the basic effects that we wish to study in more depth in this paper. Results for weighted degree using either max wdeg or min dom/wdeg as the heuristic are shown at the bottom of Table 1. As expected, when the basic heuristics are elaborated by incorporating the constraint weights, there is a consistent improvement in average search effort.

Table 2. Search Efficiency with Random Probing with Different Restarts and Cutoff (random)

                     Restarts
Cutoff      10      20      40      80     160
  25      1498    1369    1316    1263    1228
  50      1346    1287    1237    1174    1205
 100      1314    1228    1173    1223    1174
 200      1237    1245    1196    1211    1177

Notes. Random problems as in Table 1. Final run after probing; mean search nodes across ten experiments; dom/wdeg with frozen weights.
Probing results for random problems are shown in Table 2. The first thing to note is that, for random problems, every combination of restarts and node cutoff gives an improvement over dom/wdeg; in the best cases the improvement is about 25%. At the same time, some combinations of restarts and cutoffs give better overall performance than others. This means that different probing regimes yield information of varying quality, so this is a significant aspect of probing. (Partial results have also been collected for random binary problems with the same parameters that had no solutions; improvements similar to those reported above were found.)

Corresponding data for colouring problems are shown in Table 3. Somewhat different parameter values were tested, with the expectation that good estimates of failure would require more extensive search than with random problems, because with inequality constraints propagation cannot occur until a domain is reduced to a single value. In this case, search after random probing is consistently inferior to the interleaving strategy used by weighted degree, although these results are closer to those for the strong heuristics listed in Table 1 than to those for the weak ones.

Table 3. Search Efficiency with Random Probing with Different Restarts and Cutoff (colouring)

                   Restarts
Cutoff      10      40     100     200
  50      6129    4526    3118    3710
 100      5988    4123    3996      —
 500      6463    4698    3837      —

Notes. k-colouring problems as in Table 1. Otherwise as in Table 2.
Results obtained to date suggest that interleaving after probing does not improve search to a significant degree and may sometimes impair it. Thus, for the random problems, using 40 restarts and a 50-node cutoff gave a mean of 1285 search nodes (versus 1237 in Table 2). Comparable results were found for the colouring problems using 40 or 100 restarts with a 50-node cutoff. For much more difficult problems of this type, freezing weights sometimes yields marked improvements in performance [3].

Another issue is whether probing with a node cutoff can mask a varying number of failures and, therefore, differences in quality of sampling. In fact, there is a high degree of consistency in the relation between failures and nodes. For the random problems, a failure-count cutoff of 50 was associated with a node count of about 70, with a total range of 60-87 across 100 problems and 4000 runs. Similar results were found for the colouring problems. This means the basic results would not be altered if failures were used instead of nodes. Nonetheless, in some subsequent experiments we use failure cutoffs in order to control this factor directly.
5 Empirical Analyses of Sampling and Variable Selection
Weighted-degree heuristics estimate the likelihood that failure will be induced by tallying actual failures. Previous work has not addressed the actual method of sampling or considered alternative sampling strategies. The key issues involve the quality of sampling. In the first place, we can consider the specific type of event sampled: because we are looking for indicators that predict the likelihood of inducing failure, it may be possible to sample events related to failure other than domain wipeouts. In the second place, we can sample under different methods of search, which differ in the degree of consistency established or in the manner in which conflict is detected.
5.1 Sampling based on different specific events related to failure
Here we look at sampling contention based either on elaborations of failure counts or on domain reductions. We consider the following alternatives (one possible implementation of these update rules is sketched after Table 4):
• wipeout tallies in which the relevant constraint weight is increased by the size of the domain reduction leading to the wipeout
• tallies of all deletions, i.e. whenever a domain is reduced in size during constraint propagation, the weight of the constraint involved is incremented by 1
• tallies of all deletions in which constraint weights are increased by the size of the domain reduction
• tallies of all deletions except those leading to a wipeout
The last-mentioned 'strategy' was included to evaluate the hypothesis that sampling is related to contention rather than to failure in particular.

Table 4. Search Efficiency with Different Sampling Strategies (mean search nodes per problem)

                                  wipeouts   wipe by #del   alldel   alldel by #del   dels/nowipe
dom/wdeg                              1538           1592     1523             1496          1530
random probe-dom/wdeg (40R50C)        1265           1261     1426             1461          1499

Notes. Random problems as in Table 1. "R" and "C" are the number of restarts and the node cutoff on each run before the final one.
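The sketch below shows one possible implementation of these weight-update rules; the strategy names, and the assumption that the propagator reports for each constraint-driven reduction how many values were deleted and whether a wipeout occurred, are ours.

```python
# Illustrative weight-update rules for the sampling variants compared in
# Table 4. 'weight' maps constraints to weights; the propagator is assumed to
# call the returned function whenever constraint c deletes values.

def make_update_rule(strategy, weight):
    def on_reduction(c, n_deleted, wiped_out):
        if strategy == "wipeouts":              # standard wdeg: +1 per wipeout
            if wiped_out:
                weight[c] += 1
        elif strategy == "wipe_by_ndel":        # wipeouts, weighted by size of the final reduction
            if wiped_out:
                weight[c] += n_deleted
        elif strategy == "all_deletions":       # +1 for every domain reduction
            weight[c] += 1
        elif strategy == "all_del_by_ndel":     # every reduction, weighted by its size
            weight[c] += n_deleted
        elif strategy == "del_no_wipe":         # reductions only, ignoring wipeouts
            if not wiped_out:
                weight[c] += 1
    return on_reduction
```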
The results show that any of these events can indicate contention (Table 4). For the dom/wdeg heuristic, sampling deletions and sampling failures give comparable results. With random probing, direct sampling of failure is reliably better than sampling deletions. This is further evidence that the effectiveness of search after probing is affected by the quality of sampling, since events directly associated with failure (i.e. with greater degrees of contention) are better predictors of fail-firstness than conflicts that do not necessarily imply failure. Nonetheless, search is very efficient in all cases.
5.2 Sampling based on different search procedures
We compared three methods of information gathering that weight constraints either when they cause domain wipeouts in systematic search or when they cause conflict in local search. The first uses the breakout method for preprocessing weight generation (similar to [4]); these weights are then used during complete search with dom/wdeg. The second uses forward checking (FC) with random probing for information gathering, and the third is random probing using MAC, as in earlier sections. All three methods were followed by complete search with MAC, using frozen weights. For both FC and MAC random probing there were 100 restarts with a failure cutoff of 30, while breakout was run to a total weight cutoff of 3000. Thus, each method generated the same amount of information.
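The breakout-based weight generation used for comparison can be sketched roughly as follows. This is our simplification of the general idea in [4] (min-conflicts local search in which the weights of constraints violated at a local minimum are incremented), not the authors' code; it assumes each constraint object provides a satisfied(assignment) test, and it counts weight increments against a total budget.

```python
import random

def breakout_weights(variables, domains, constraints, weight,
                     weight_budget=3000, seed=0):
    """Rough sketch of breakout-style preprocessing: local search over complete
    assignments; when no improving move is found, increment the weights of all
    violated constraints, until the budget of weight increments is spent."""
    rng = random.Random(seed)
    assign = {x: rng.choice(sorted(domains[x])) for x in variables}

    def violated():
        return [c for c in constraints if not c.satisfied(assign)]

    def cost(x, v):
        trial = dict(assign)
        trial[x] = v
        return sum(weight[c] for c in constraints
                   if x in c.scope and not c.satisfied(trial))

    added = 0
    while added < weight_budget and violated():
        # Min-conflicts step on a variable taken from some violated constraint.
        x = rng.choice([y for c in violated() for y in c.scope])
        best = min(domains[x], key=lambda v: cost(x, v))
        if cost(x, best) < cost(x, assign[x]):
            assign[x] = best
        else:
            # No improving value for this variable: "break out" by increasing
            # the weights of the currently violated constraints.
            for c in violated():
                weight[c] += 1
                added += 1
    return weight
```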
Table 5. Search Efficiency with Information Gathered under Different Search Procedures

              probe-dom/wdeg   probe-wdeg
Breakout                1863         3890
FC-probes               1492         3198
MAC-probes              1198         1595

Notes. Random problems as in Table 1. Mean search nodes for the final run across ten experiments.
As seen in Table 5, the weights learnt by MAC were superior to those learnt by either FC or breakout for these problems. The magnitude of the improvement when using MAC is even clearer if the domain-size factor is removed from the heuristic (the probe-wdeg column). (Note that if either breakout or FC produced weights of similar quality to MAC, then, because of the speedup they offer over MAC, they would be more efficient for preprocessing.)
5.3 Analysis of variable selection
To further analyse the quality of search with sampling, measures of promise and fail-firstness were collected. These are measures of the quality of search under two different conditions: (i) when search is on a solution path, i.e. the present partial assignment can be extended to a solution, and (ii) when a mistake has been made and search is in an insoluble subtree [5]. Quality is assessed by measuring the degree of adherence to an optimal policy under each condition. It was expected that sampling would have its major effect on fail-firstness.

Table 6. Adherence-to-Policy Assessments and Selected Descriptive Measures for Orderings Based on Sampling

                          policy measures        descriptive measures
heuristic                 promise    badtree     |dom|   fwd-deg   faildepth
dom/fwddeg                .00041     437         3.3     8.2       7.7
dom/wdeg                  .00042     366         3.3     8.2       7.4
random probe (40R 50C)    .00046     282         3.4     8.7       7.1

Notes. Random problems as in Table 1. Means of 100 problem-means. "badtree" is mean mistake-tree size. "|dom|" and "fwd-deg" are for variables chosen during search.
For fail-first assessments, search was for one solution; this gives somewhat less reliable results than an all-solutions search, but it is much less time-consuming. (Promise calculations necessarily involve an all-solutions search.) To cope with the fact that variable selection under interleaved sampling or random probing is not well-defined or easily replicated, the following strategy was used. After a run, the variable ordering of the solution was saved in the order obtained, and quality assessment was based on this ordering, which was therefore fixed throughout search. Although search is less efficient than it would be if the ordering were dynamic, this allows a better assessment of the information gained by the end of sampling, both for interleaved and for sample-first methods. For comparison, the same method was used with orderings produced by the min dom/fwddeg heuristic.
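A minimal sketch of the fixed-ordering device just described (helper names ours): the variable order recorded from the saved run is replayed as a static ordering.

```python
def static_order_heuristic(saved_order):
    """Return a variable-selection function that replays 'saved_order', the
    order in which variables were instantiated in the recorded run."""
    position = {x: i for i, x in enumerate(saved_order)}

    def select(unassigned, _domains=None):
        # Always pick the earliest remaining variable in the saved order, so
        # the ordering stays fixed regardless of what happens during search.
        return min(unassigned, key=position.__getitem__)

    return select
```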
The results of this analysis show that the major difference is in the fail-firstness of these orderings, reflected in the magnitudes of the mistake-tree sizes (Table 6); measures of promise are highly similar. With colouring problems, in contrast, the mean mistake-tree size was appreciably greater for random probing-dom/wdeg than for dom/wdeg: 10,100 versus 318. Clearly, some features of these problems interfered with adequate sampling.
6 Conclusions
We developed a rationale for the weighted degree approach in terms of two principles of performance, one of which is the well-known Fail-First Principle. We were also able to show why another, the Contention Principle, is necessary in any complete explanation of the weighted degree approach. To demonstrate the actual operation of the Fail-First Principle, we showed that fail-firstness is indeed enhanced by strategies based on sampling and that differences in overall efficiency of search are related to this property. We were also able to make some predictions on the basis of the Contention Principle regarding novel sources of contention, and these were verified experimentally.

In these procedures, sampling produces estimates of the probability of failure. This information is used in turn to predict differences in fail-firstness in order to inform variable selection. We have given a detailed account of the nature of this sampling and of the use of this information in search. In the course of this analysis, we identified two potential shortcomings of the original weighted degree method: a feedback relation between sampling and variable selection that can hamper the discrimination of degrees of contention, and the inability to use estimates of failure for selections early in search. Random probing was designed to address these difficulties. However, the present work shows that there are marked differences in the effectiveness of different sampling strategies with different problems and with different search strategies. The basis for these differences remains to be clarified. Now that we have a more adequate analytical framework, we should be able to pursue these and other issues in a more informed manner.
References
1. Boussemart, F., Hemery, F., Lecoutre, C., Sais, L.: Boosting systematic search by weighting constraints. In: Proc. Sixteenth European Conference on Artificial Intelligence (ECAI'04). (2004) 146-150
2. Grimes, D., Wallace, R.J.: Learning from failure in constraint satisfaction search. In Ruml, W., Hutter, F., eds.: Learning for Search: Papers from the 2006 AAAI Workshop. Tech. Rep. WS-06-11. (2006) 24-31
3. Grimes, D., Wallace, R.J.: Learning to identify global bottlenecks in constraint satisfaction search. In: Proc. Twentieth International FLAIRS Conference. (2007)
4. Eisenberg, C., Faltings, B.: Using the breakout algorithm to identify hard and unsolvable subproblems. In Rossi, F., ed.: Principles and Practice of Constraint Programming (CP'03). LNCS No. 2833. (2003) 822-826
5. Beck, J.C., Prosser, P., Wallace, R.J.: Trying again to fail-first. In: Recent Advances in Constraints. Papers from the 2004 ERCIM/CoLogNet Workshop (CSCLP 2004). LNAI No. 3419. (2005) 41-55