Research Proposal: Random Problem Spaces for the Evaluation of Heuristic Algorithms

Robert Schrag
Department of Computer Sciences
Taylor Hall 2.124, Mail Code 60500
University of Texas at Austin
Austin, TX 78712
Internet: [email protected]
World-wide Web: http://www.cs.utexas.edu/users/schrag/

October, 1994

This paper represents proposed research by me that was never actually executed beyond the results reported below, although related work is reported in my dissertation. The approach is unique and still promising as a significant advance to the state of the art. See for example the opening survey article about evaluating heuristic algorithms in the flagship issue (Volume 1, Number 1) of the Journal of Heuristics (Kluwer Academic, 1995). -- Robert C. Schrag, Ph.D., December 4, 1996. (Certain references to work unpublished at the original writing have been updated to their now-published forms.)

Introduction

This document presents a suggested methodology for extrapolating trends in random problem spaces in order to rate the quality of solutions from heuristic algorithms for combinatorial problems, even when problem sizes are so large that algorithms to find actual optimal solutions are infeasible. Before beginning, I give a description of the constraint satisfaction problem and of its special case, propositional satisfiability, which I will use in an illustrative example.

Constraint Satisfaction Problems

Problems involving constraints are inherent in many of the applications that DARPA and its customers face, including planning and scheduling for air campaigns or transportation, qualitative reasoning about physical systems for diagnosis or design, and machine vision for robotic mechanisms. A discrete (or finite-domain) constraint satisfaction problem (CSP) [Mackworth, 1992] is described by a set of variables X = {x_1, x_2, ..., x_n}, a corresponding set of domains D = {d_1, d_2, ..., d_n}, and a set of constraints P. For each variable x_i, the domain d_i is a finite set of possible values which may be assigned to it. An assignment is a pairing between a variable and a value in its domain such that the variable is assumed to take on that value; i.e., x_i = v ∈ d_i. The constraints P are truth-valued predicates defined over any subset of the variables X; they restrict the possible sets of assignments, in that every predicate must be satisfied in a solution to the CSP. Let {σ_1, σ_2, ..., σ_{2^n}} be the power set of X, and let {Π_1, Π_2, ..., Π_{2^n}} be sets of predicates such that each predicate of each set Π_i has arity |σ_i| and mentions exactly the variables σ_i. Then we may express a CSP as a first-order predicate logic formula of the form

(∃x_1 ∈ d_1)(∃x_2 ∈ d_2) ⋯ (∃x_n ∈ d_n) ⋀_{i ∈ {1, ..., 2^n}} ⋀_{π ∈ Π_i} π.

A solution of the CSP is any model (complete set of assignments) satisfying this formula. For a given CSP we may be interested in finding a single solution, some fixed number of solutions, or all solutions.

Some procedure must be specified in order to determine a constraint's truth value with respect to a particular set of assignments. Following convention, we will assume that a constraint is represented by the set of tuples for which it is true; in this case look-up suffices as an evaluation procedure. Representing constraints this way, we can merge the tuples of different constraints for the same subset of variables σ_i into a single constraint by taking their intersection, which we also may refer to as Π_i.[1]

Propositional satisfiability (SAT) may be viewed as a degenerate kind of CSP in which each variable has the domain {true, false}. All of the techniques which have been developed for general CSPs can be applied to SAT, but the converse is not true. (Resolution, for example, can be generalized to work for CSPs, but the resulting technique is weak.) Nonetheless, SAT is an important CSP case, because it occurs as a fundamental part of many AI problems. One popular approach to solving CSP problems is to translate them to SAT. CSP, as a generalization of SAT, is NP-complete.
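To make the tuple-set representation concrete, here is a minimal Python sketch of a finite-domain CSP together with a brute-force enumeration of its models. The variables, domains, and constraints are invented for illustration, and the exhaustive search is only meant to mirror the existential formula above, not to be an efficient solver.

```python
from itertools import product

# Variables and finite domains (illustrative values, not from the paper).
variables = ["x1", "x2", "x3"]
domains = {"x1": {0, 1, 2}, "x2": {0, 1}, "x3": {0, 1, 2}}

# Following the convention above, each constraint is the set of tuples
# for which it is true, keyed by the subset of variables it mentions.
constraints = {
    ("x1", "x2"): {(0, 1), (1, 0), (2, 1)},
    ("x2", "x3"): {(0, 2), (1, 0), (1, 1)},
}

def satisfies(assignment, constraints):
    """Look-up evaluation: project the assignment onto each constraint's
    variables and check membership in its set of allowed tuples."""
    return all(tuple(assignment[v] for v in scope) in allowed
               for scope, allowed in constraints.items())

def solutions(variables, domains, constraints):
    """Enumerate all models of the CSP's existential formula by brute force."""
    for values in product(*(sorted(domains[v]) for v in variables)):
        assignment = dict(zip(variables, values))
        if satisfies(assignment, constraints):
            yield assignment

for model in solutions(variables, domains, constraints):
    print(model)    # each complete assignment satisfying every constraint
```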

Random Problem Spaces

In the past few years, researchers have been investigating CSPs from the point of view of random problem spaces, in which instances of a problem are generated by keeping the number of variables and their domains fixed and allowing the number of constraints to vary in a controlled manner (e.g., [Cheeseman et al., 1991, Mitchell et al., 1992, Williams and Hogg, 1992]). Experimentally, it is found that instances in these spaces fall into three main regions: a region of underconstrained instances, where almost all instances have solutions which on average are easy to find; a region of overconstrained instances, in which almost no instances have solutions, which on average is easy to determine; and a narrow, intervening region of "phase transition", during which the probability that an instance has a solution changes abruptly from near 1 to near 0 and on average the existence of a solution is difficult to determine. This empirical investigation is valuable because CSPs as yet appear to have complexity which prohibits the application of the analytical techniques successful in characterizing similar properties in simpler random graphs [Bollobas, 1985], except within weak bounds (e.g., [Chao and Franco, 1990, Franco and Paull, 1983]).

Random problem spaces are important because they provide a convenient means of comparing the performance of different algorithms. The instances are artificial, but they are controllable, easy to generate, and span a wide range of difficulty. I will argue below that random problem spaces also can be exploited in a new way to rate the quality of solutions from heuristic algorithms.

[1] For constraints over numeric (e.g., integer) domains this representation may be impractical. Many constraint satisfaction techniques designed strictly for variables over nominal domains can be extended to apply to mixed nominal/numeric variables, however. (For more perspective on numeric CSPs, please refer to [Schrag, 1992].)

Logical Definitions

Here I formalize some important definitions. Throughout this section, we deal with propositional logic.

- A variable, which is represented by a symbol (e.g., p), ranges over the values "true" and "false."
- A literal is a variable or its negation (e.g., p, ¬p).
- A clause is a disjunction of literals (e.g., (p ∨ ¬q ∨ r)). When convenient, we treat a clause as the set of its literals.
- A clause which contains both a literal and its negation (e.g., (p ∨ ¬p ∨ q)) is trivial, or tautological.
- A conjunctive normal formula (CNF) F is a conjunction of clauses (e.g., (p ∨ ¬q ∨ r) ∧ (p ∨ s)). In this paper, all input formulas will be CNF with non-trivial clauses. When convenient, we treat F as the set of its clauses.
- The resolution operation combines two clauses A and B mentioning complementary literals l and l′, respectively (e.g., A = (p ∨ q), B = (p ∨ ¬q)), to produce a single clause which mentions all literals in either clause except the complementary ones ([A − l] ∪ [B − l′] = (p)). In addition, we require that A ∪ B include no pair of complementary literals besides {l, l′}; this excludes trivial clauses as inputs and outputs for resolution.
- The set R of resolvents of F is the set of clauses resulting from the closure of the resolution operation over F, including the input clauses in F.
- An implicate of F is any clause C such that F ⊨ C, where ⊨ is the propositional entailment operator. E.g., all the clauses comprising R are implicates of F.
- Given a set S of clauses, the subsumption operation removes from S all clauses C such that there exists a clause C′ ∈ S such that C′ ⊂ C. I.e., subsumption removes from S all non-minimal clauses.
- A prime implicate of F is an implicate C of F such that C is subsumed by (contains) no other implicate C′ of F. I.e., the set of clauses P resulting from subsumption on R comprises all of the prime implicates of F.
- A prime cover of F is any subset of P which mentions all of the literals mentioned in F. The size of a prime cover is the number of literal occurrences (counting duplicates) in it.
- A minimum prime cover of F is any prime cover of minimum size.
- An implicant of a propositional formula F is any conjunctive term T such that T ⊨ F.
- A prime implicant of F is an implicant T of F such that T is subsumed by (contains) no other implicant T′ of F.

Let us consider some simple examples. Suppose we are given F = (a ∨ b ∨ c) ∧ (a ∨ b ∨ ¬c). The set of resolvents is R = (a ∨ b ∨ c) ∧ (a ∨ b ∨ ¬c) ∧ (a ∨ b), and the set of prime implicates is P = (a ∨ b), because (a ∨ b) subsumes both (a ∨ b ∨ c) and (a ∨ b ∨ ¬c). Intuitively, we may think of resolution as a generator of clauses and subsumption as a filter. Suppose we are given the unsatisfiable formula F = (r) ∧ (¬r). Then the set of resolvents is R = (r) ∧ (¬r) ∧ (), and the set of prime implicates is P = (). Obviously, this same set P = () will result for any unsatisfiable formula.[2]

[2] I should add some examples for minimum prime cover and prime implicants.
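To ground these definitions, here is a minimal Python sketch under the clauses-as-literal-sets view: a clause is a frozenset of signed integers (p = 1, ¬p = -1), resolvents are computed to closure, subsumption yields the prime implicates, and a minimum prime cover is found by exhaustive search. All function names are mine, and everything here is exponential, so it is only suitable for tiny formulas.

```python
from itertools import combinations

def resolve(a, b):
    """Resolvent of clauses a and b, or None if resolution is blocked."""
    pivots = {l for l in a if -l in b}
    if len(pivots) != 1:        # no clash, or the resolvent would be trivial
        return None
    l = pivots.pop()
    return (a - {l}) | (b - {-l})

def resolvents(formula):
    """The set R: closure of resolution over F, including F's own clauses."""
    r = set(formula)
    while True:
        new = {c for a, b in combinations(r, 2)
               for c in [resolve(a, b)] if c is not None and c not in r}
        if not new:
            return r
        r |= new

def prime_implicates(formula):
    """The set P: subsumption applied to R keeps only the minimal clauses."""
    r = resolvents(formula)
    return {c for c in r if not any(d < c for d in r)}   # d < c: proper subset

def min_prime_cover_size(formula):
    """Size (literal occurrences) of a minimum prime cover, or None if none exists."""
    p = list(prime_implicates(formula))
    mentioned = set().union(*formula)        # every literal mentioned in F
    sizes = [sum(len(c) for c in cover)
             for r in range(1, len(p) + 1)
             for cover in combinations(p, r)
             if mentioned <= set().union(*cover)]
    return min(sizes) if sizes else None

# The paper's example: F = (a ∨ b ∨ c) ∧ (a ∨ b ∨ ¬c), with a=1, b=2, c=3.
f = {frozenset({1, 2, 3}), frozenset({1, 2, -3})}
print(prime_implicates(f))       # {frozenset({1, 2})}, i.e. P = (a ∨ b)

# A formula whose prime implicates admit a cover: F = (a ∨ b) ∧ (¬a ∨ c).
g = {frozenset({1, 2}), frozenset({-1, 3})}
print(min_prime_cover_size(g))   # 4, from the cover {(a ∨ b), (¬a ∨ c)}
```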

Random 3SAT

The Random 3SAT, fixed clause-length space of satisfiability problems has been the subject of much theoretical and experimental investigation (e.g., [Chvatal and Szemeredi, 1988, Mitchell et al., 1992]). Until now, the main property of interest has been, as the name implies, satisfiability. However, we may inquire about any property of a propositional expression which we desire; in [Schrag and Crawford, 1996] I focus on two of these: the numbers of resolvents and of prime implicates. Other properties which I have experimented with include the number of prime implicants and the size of the minimum equivalent prime cover.

Instances in Random kSAT are parameterized by two values (given fixed k): n, the number of variables which is available to the space; and m, the number of clauses which are to be generated. All input clauses are of size k; these are generated by selecting k distinct variables randomly from among the n available variables and negating each with probability 1/2. The resulting instance in general may include duplicate clauses, so that fewer than m unique clauses can occur; it also may mention fewer than n different variables, particularly if m is small compared to n.

I define the problem size of a problem space to be the number of variables it uses. In order to compare properties profiled for problem spaces of different problem sizes, the ratio of constraints to variables, or constraint ratio, is used to scale one of the axes of comparison to a common base. (We will see examples of this in forthcoming graphs.) In Random 3SAT, the constraint ratio m/n is expressed in units of clauses/variable, or "C/V" for short.

Let the constraint ratio be an independent variable, and let some other measurable property of instances be the dependent variable. In experiments with problem spaces, we generate a large number of instances at each constraint ratio data point and plot an average value for a property of interest in a 2-dimensional problem space graph. I define a distinctive feature of a property in a 2-d problem space graph as any feature exhibited consistently across problem sizes. Clearly, a function which describes the constraint ratio location of any distinctive feature must be constant in the limit; otherwise such features would vanish from the graph and not really be characteristic.
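As a concrete companion to this description, here is a short Python sketch of the Random kSAT generator. The names are mine, and the representation (a clause as a tuple of signed variable indices) is just one convenient choice.

```python
import random

def random_ksat(n, m, k=3, rng=random):
    """m clauses over n variables; each clause picks k distinct variables
    and negates each with probability 1/2 (duplicate clauses allowed)."""
    instance = []
    for _ in range(m):
        chosen = rng.sample(range(1, n + 1), k)   # k distinct variables
        instance.append(tuple(v if rng.random() < 0.5 else -v for v in chosen))
    return instance

# E.g., a 10-variable instance near the 50%-satisfiable point (about 4.24 C/V):
print(random_ksat(n=10, m=42))
```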

In each of the experiments to follow, unless otherwise stated, I generate 1000 separate 3SAT instances at each constraint ratio data point, with the data points spaced just one clause apart. In cases where I report only on satisfiable instances, there are fewer than 1000 trials per data point at higher constraint ratios; this is unavoidable when constraint ratios are sufficiently large that satisfiable instances are rare, because it would be very difficult to generate very many of them. I interpolate in cases where the target feature is not a peak.

Scaling Results

Crawford and Auton [1993] show that the number of clauses required for 50% of instances to be satisfiable in Random 3SAT is a linear function of the number of variables: m = 4.24n + 6.21. Thus, the 50%-satisfiable point is a distinctive feature of the %-satisfiable property. Figure 1 depicts the experimentally observed transition in satisfiability in Random 3SAT for n = 4, 6, 10. Note that the phase transition sharpens as n increases.

[Figure 1 (plot omitted): Part unsatisfiable for n = 4, 6, 10. Horizontal axis: constraint ratio (C/V), 0 to 10; vertical axis: fraction of unsatisfiable instances, 0 to 1; one curve per n.]

Figure 2 shows values of m for the 50%-satisfiable points for the curves in Figure 1.

[Figure 2 (plot omitted): Number of clauses at the 50%-satisfiable point for different problem sizes, with least squares fit; interpolated from satisfiability data. Horizontal axis: n; vertical axis: m.]

The reason for this apparently exact linearity is an open analytical question. In [Schrag and Crawford, 1996], I show that the same kind of linearity is exhibited by distinctive features for the numbers of resolvents and of prime implicates, but not by distinctive features for the number of prime implicants. Subsequently, I have experimented with the minimum prime cover size property.

I focus on the size of the minimum prime cover (MPC) of a conjunctive normal formula as an example of why scaling results such as those reported in [Crawford and Auton, 1993] and [Schrag and Crawford, 1996] might be not only interesting but also useful. By way of background, MPC is an exact minimization technique for 2-level logic, useful in VLSI and other logic design. The size of the minimum prime cover can be taken as a measure of a logic function's complexity in 2-level implementation. MPC is discussed in many introductory textbooks on logic design. Figure 3 shows a 2-d problem space graph for this property.


[Figure 3 (plot omitted): Size of the minimum prime cover for n ∈ [3, 7]. Horizontal axis: constraint ratio (C/V), 0 to 8; vertical axis: average complexity (MPC size); one curve per n.]

MPC involves enumeration and minimization in addition to decision, making it much harder than SAT, so small problems really are all we can sample thoroughly enough to generate high-quality curves that will be statistically significant. You might imagine that computing MPC is on average most difficult for instances with the largest prime cover; we choose to examine the peak in MPC average size as a distinctive feature. The constraint ratio location of this distinctive feature, if we can predict it, can serve as a convenient source of hard instances in Random 3SAT for the purposes of benchmarking MPC logic minimization algorithms. Figure 4 shows the locations of these complexity peaks, in terms of number of clauses, with a linear least squares function fit: m = 2.30n - 2.90. (The peak points may look a little jumpy, but they actually are as close to the line as they can be, given that values for m and n both are constrained to be integral.) We can use this function to predict the location of the "difficult peak" for larger problem sizes as well.
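The fit and the extrapolation it licenses are easy to reproduce; in the Python sketch below, the integer peak locations are placeholders chosen to be consistent with the published fit rather than the measured data, so treat it as illustrating the mechanics only.

```python
import numpy as np

# Integer peak locations (n, m): placeholders consistent with m = 2.30n - 2.90,
# not the actual measured profiling data.
n = np.array([3, 4, 5, 6, 7])
m_peak = np.array([4, 6, 9, 11, 13])

slope, intercept = np.polyfit(n, m_peak, 1)    # linear least squares fit
print(f"m = {slope:.2f}n {intercept:+.2f}")    # m = 2.30n -2.90 for these points

def hard_instance_clauses(n_large):
    """Predicted clause count at the difficult peak for a larger problem size."""
    return int(round(slope * n_large + intercept))

print(hard_instance_clauses(100))   # ~227: where to generate hard MPC benchmarks
```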


[Figure 4 (plot omitted): Number of clauses at the peak in MPC average size for n ∈ [3, 7], with least squares fit. Horizontal axis: n; vertical axis: m.]

We don't have to stop there, though. Looking at the curves in Figure 3 again, I note that there also is useful information in the peak average values for complexity. Figure 5 shows these values on linear/logarithmic axes.


[Figure 5 (plot omitted): Average size of the minimum prime cover at the peak in MPC average size for n ∈ [3, 7]. Horizontal axis: constraint ratio (C/V); vertical axis (logarithmic): average complexity.]

We can try to do a function fitting for this; the present function appears to be quite nearly exponential. The fitted function can then be used to predict the values expected on average for larger problems as well, and those predictions can be used to rate the quality of solutions produced by non-systematic or heuristic algorithms which are feasible for these larger problems.


Benchmarking Methodology

What I have done so far in this part is to outline, by illustration, a methodology for generating instances for benchmarking combinatorial problems (in this case, minimum prime cover) and for rating the quality of solutions from non-systematic algorithms for these problems. This methodology consists of the following steps.

1. Start with a problem space, like Random 3SAT. In principle, you can custom-design your problem space for a given problem.
2. Profile the problem space for your property of interest using an exact algorithm and feasible problem sizes.
3. Identify distinctive features of this property which will serve for benchmarking purposes. A distinctive feature corresponding to the hardest, or most complex, problems is a reasonable choice, if it can be identified.
4. Determine how the constraint ratio location of the distinctive feature scales with problem size; do a function fit.
5. Generate test instances based on this functional relationship and use them to compare the performance of systematic algorithms.
6. Determine also how the average value for the property of interest at the location of the distinctive feature scales with problem size; do a function fit also for this.
7. Predict exact values for the property for (large) test instances based on this functional relationship and use them to rate the quality of solutions from non-systematic algorithms.

The chief contribution of this methodology is in predicting average optimal values of a property for a space of instances, even though the expected values cannot be determined analytically.
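To make the arithmetic of steps 4 through 7 concrete, here is a compressed, hypothetical Python sketch. The profiling table is placeholder data standing in for exact-algorithm measurements at small n, the exponential form in step 6 is the one suggested by Figure 5, and all names are mine.

```python
import numpy as np

# Step 2 output, as placeholders: (n, clauses at the distinctive feature,
# average property value at that feature) from exact runs at small sizes.
profile = [(3, 4, 4.8), (4, 6, 6.9), (5, 9, 10.0), (6, 11, 14.5), (7, 13, 21.0)]
ns, locs, vals = (np.array(col, dtype=float) for col in zip(*profile))

loc_fit = np.polyfit(ns, locs, 1)              # step 4: feature location vs. n
val_fit = np.polyfit(ns, np.log(vals), 1)      # step 6: log scale (exponential law)

n_big = 50                                     # far beyond exact feasibility
m_big = int(round(np.polyval(loc_fit, n_big)))         # step 5: generate tests here
expected_opt = float(np.exp(np.polyval(val_fit, n_big)))  # step 7: predicted optimum

def rate(heuristic_avg):
    """Solution quality vs. the predicted average optimum
    (for a minimization property; 1.0 means optimal on average)."""
    return expected_opt / heuristic_avg

print(m_big, expected_opt, rate(1.15 * expected_opt))
```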

Defending the Methodology

Are problem spaces reliable? Are distinctive features valid? Do results for small problem sizes scale in a well-behaved way? In general, these questions will have to be answered individually for different (classes of) problem spaces. In the case of Random 3SAT, there is a wealth of information available about satisfiability for problem sizes up to 400 variables, from profiling using exact, systematic algorithms. This is far larger than will be feasible for any exact MPC algorithm. There also is circumstantial evidence for problem sizes up to 2000 variables, based on non-systematic algorithms. Prime implicate generation is a generalization of satisfiability, so we also can have some confidence about the stability of distinctive features regarding prime implicates or the size of the MPC over these problem sizes.

Caution has been recommended regarding the computation of statistical means where NP-hard problems are involved, particularly considering that run times for some algorithms can exhibit huge variance (e.g., [Hogg and Williams, 1994, Gent and Walsh, 1994]). However, I believe that algorithm-independent properties of instances will not exhibit such wide variability. Figure 6 shows the prime implicate (PI) aggregate peak for n = 10. The center line plotted represents the mean over all trials; the surrounding lines represent upper and lower limits of "approximate" 95% confidence intervals, obtained using a normal approximation to the actual, unknown distribution. This approximation should be quite accurate because of the large satisfiable sample sizes used [Hogg and Tanis, 1993]. Also, histograms of the space appear to be quite nearly normal. It appears that variance for this CNF property is well-behaved, at least for problems of this size. I suspect that it will be so for other properties which do not depend on the execution details of particular algorithms.

[Figure 6 (plot omitted): Number of prime implicates for satisfiable instances at n = 10, with mean and approximate 95% confidence limits. Horizontal axis: constraint ratio (C/V), 0 to 8; vertical axis: average count.]

Toby Walsh, who with Ian Gent [Gent and Walsh, 1996] discovered extreme variance in satisfiability difficulty (run time) in Random 3SAT in the vicinity of 2.0 C/V for n = 100, told me that in the same region the number of solutions, another algorithm-independent property, had very well-behaved variance.
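For concreteness, confidence limits of the kind plotted in Figure 6 can be computed per data point as a normal-approximation interval around the sample mean. The sketch below uses synthetic counts, since the measured ones are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_ci(samples, z=1.96):
    """Sample mean with normal-approximation 95% confidence limits:
    mean ± z * s / sqrt(N), with s the sample standard deviation."""
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean()
    half = z * samples.std(ddof=1) / np.sqrt(len(samples))
    return mean - half, mean, mean + half

# Placeholder stand-in for one data point's measurements (e.g., numbers of
# prime implicates over the satisfiable instances at one constraint ratio).
counts = rng.poisson(40, size=1000)
print(mean_ci(counts))   # (lower limit, mean, upper limit)
```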

Properties and Algorithms

I have proposed a new methodology to rate the solution quality of heuristic algorithms. Another way to approach this goal would be simply to profile optimal and heuristic algorithms together on random problem spaces with small problem sizes, measure the average differences in solution quality, and then do a function fit to predict the differences at larger problem sizes. This might not work, though, for a few reasons.

- Some heuristics may perform at or near the optimum level for small problem sizes, so that the projected difference might be deceptively small.
- Heuristic algorithms may break down dramatically at larger problem sizes, and without some steps to predict optimal quality at these sizes the breakdown might not be detected.
- Heuristic algorithms might introduce enough arbitrary computation that the optimal value of the target property becomes obscured and the measured values exhibit the same kind of extreme variance we have seen in the run times for some optimal algorithms.

Some heuristics will be well-behaved. Greedy algorithms for some problems, such as the bin-packing problem, are provably within a logarithmic factor of optimal solution quality; arguably these need no experimentation at all, but evaluation of the quality of their results in practice still might be valuable. Algorithms which approximate a first-class property which is expensive to compute by using another first-class property which is less expensive to compute (e.g., approximating the number of cliques of size ≥ 4 in a graph by the number of cliques of size ≥ 3) also might be expected to have solution quality scaling which is well-behaved; a toy illustration of this proxy idea appears below.
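The sketch below counts cliques by brute force; the graph and the choice of size-3 counts as a cheap stand-in for size-4 counts are mine, purely for illustration.

```python
from itertools import combinations

def clique_count(adj, k):
    """Brute-force count of k-cliques in a graph given as {node: set(neighbors)}."""
    return sum(all(v in adj[u] for u, v in combinations(group, 2))
               for group in combinations(adj, k))

# A small symmetric adjacency structure, invented for illustration.
adj = {0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}, 4: {0}}
print(clique_count(adj, 3), clique_count(adj, 4))   # 4 triangles, 1 four-clique
```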

Plan

The methodology needs to be validated by testing it on real problems. I have experimented with exact algorithms for MPC, but I have done nothing yet to rate the quality of heuristic algorithms for the same problem. To obtain results of the greatest significance, I propose to make use of the fastest exact MPC algorithm known, the Espresso-Signature algorithm in Espresso-Exact [Rudell, 1989], for which a public-domain implementation is available. This will have two major benefits.

- It will allow me to profile Random 3SAT for MPC size at larger problem sizes, thus leading to a better set of values for function fitting.
- It will show me the largest problem size for which exact solution is feasible (on average) for difficult instances in Random 3SAT.

Then, I must identify the most important heuristic algorithms used in practice for large MPC problems. After that I can apply my methodology and evaluate their quality for this part of the space. One might want to hedge one's bets by profiling a few other parts of the space (after identifying suitable distinctive features), or by re-applying the whole methodology to a quite different RID for the same problem.

Also, I have corresponded with Alan Garvey at the University of Massachusetts at Amherst about applying this methodology to random problem spaces for their Design-to-Time Scheduling problem, which, unlike MPC, is a significant AI problem. Alan is interested in quality ratings of different heuristic algorithms, and he already has created some random problem spaces for this problem. If this application doesn't work out, then I can examine another significant AI problem.

References

[Bollobas, 1985] Bollobas, Bela 1985. Random Graphs. Academic Press.

[Chao and Franco, 1990] Chao, M. and Franco, J. 1990. Probabilistic analysis of a generalization of the unit-clause literal selection heuristics of the k-satisfiability problem. Information Sciences 51:289-314.

[Cheeseman et al., 1991] Cheeseman, Peter; Kanefsky, Bob; and Taylor, William 1991. Where the really hard problems are. In Proceedings of the Twelfth International Joint Conference on Artificial Intelligence (IJCAI-91). 331-337.

[Chvatal and Szemeredi, 1988] Chvátal, Vašek and Szemerédi, Endre 1988. Many hard examples for resolution. Journal of the Association for Computing Machinery 35(4):759-768.

[Crawford and Auton, 1993] Crawford, James and Auton, Larry 1993. Experimental results on the crossover point in satisfiability problems. In Proceedings of the Eleventh National Conference on Artificial Intelligence (AAAI-93). 21-27.

[Franco and Paull, 1983] Franco, J. and Paull, M. 1983. Probabilistic analysis of the Davis-Putnam procedure for solving the satisfiability problem. Discrete Applied Mathematics 5:77-87.

[Gent and Walsh, 1994] Gent, Ian and Walsh, Toby 1994. Easy problems are sometimes hard. Artificial Intelligence 70:335-345.

[Gent and Walsh, 1996] Gent, Ian and Walsh, Toby 1996. The satisfiability constraint gap. Artificial Intelligence 81:59-80. Special volume: Frontiers in problem solving: Phase transitions and complexity, edited by Tad Hogg, Bernardo Huberman, and Colin Williams.

[Hogg and Tanis, 1993] Hogg, Robert and Tanis, Elliot 1993. Probability and Statistical Inference. Macmillan, 4th edition.

[Hogg and Williams, 1994] Hogg, Tad and Williams, Colin 1994. The hardest constraint problems: A double phase transition. Artificial Intelligence 69:359-377.

[Mackworth, 1992] Mackworth, Alan 1992. Constraint satisfaction. In Shapiro, Stuart, editor, Encyclopedia of Artificial Intelligence. Wiley. 285-293.

[Mitchell et al., 1992] Mitchell, David; Selman, Bart; and Levesque, Hector 1992. Hard and easy distributions of SAT problems. In Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI-92). 459-465.

[Rudell, 1989] Rudell, Richard 1989. Logic Synthesis for VLSI Design. Ph.D. Dissertation, University of California at Berkeley.

[Schrag and Crawford, 1996] Schrag, Robert and Crawford, James 1996. Implicates and prime implicates in Random 3SAT. Artificial Intelligence 81:199-222. Special volume: Frontiers in problem solving: Phase transitions and complexity, edited by Tad Hogg, Bernardo Huberman, and Colin Williams.

[Schrag, 1992] Schrag, Robert 1992. The Quantity Lattice for engineering design. In Working Notes of the Design from Physical Principles Symposium. AAAI Press. 68-72.

[Williams and Hogg, 1992] Williams, Colin and Hogg, Tad 1992. Using deep structure to locate hard problems. In Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI-92). 472-477.
