Constraint Relaxation Techniques & Knowledge Base Reuse
A Ph.D. Thesis by Tomas Eric Nordlander
BSc. Electrical Engineering (University of Trollhättan/Uddevalla)
MSc. Management Science (De Montfort University)
Department of Computing Science, University of Aberdeen, Scotland, U.K.
2004
Abstract
Effective reuse of Knowledge Bases (KBs) often entails the expensive task of identifying plausible KB-combinations. This research helps the MUSKRAT-Advisor [133] decide whether existing KBs can be reused to solve new problems. It does so by developing an aid, based on constraint satisfaction techniques, that identifies incompatible KB-combinations in the scheduling domain. Incompatible combinations can be discarded, leaving fewer combinations for the MUSKRAT-Advisor to examine in detail. I have used a constraint solver as the Problem Solver (PS) and have represented the existing scheduling KBs as Constraint Satisfaction Problems (CSPs), which can be combined to create a composite CSP. If the composite CSP is found to be inconsistent, the KB-combination will not fulfil the PS requirements and can be discarded. Proving a CSP inconsistent can be a lengthy process, so I propose a constraint relaxation approach to identify inconsistent KB-combinations more quickly. My approach relaxes the CSP by removing constraints; if the relaxed version is unsolvable, then the original CSP has no solution either. However, it is not certain that relaxing a CSP will produce an easier problem: previous research has shown that random binary CSPs are hardest to solve around the solution transition phase [25, 96]. Moreover, part of my research has shown that when constraints are removed randomly from an inconsistent CSP (binary or non-binary), the relaxed CSP can be up to 10 times harder to solve. To investigate these issues, I created a Prolog test suite designed to generate test-beds of CSPs, to help identify useful relaxation strategies, and to analyse their results. The identified relaxation strategies are based on different constraint graph properties and are in most cases easy to implement.
Empirical tests show that removing constraints of low tightness and removing constraints of high arity are efficient strategies. Both are simple to implement and can create relaxed CSPs that verify that the original CSP is inconsistent, using less than half the search effort required for the original CSP.
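The relaxation argument in the abstract can be illustrated with a minimal sketch. It is written in Python purely for illustration (the thesis's own test suite is in Prolog), and every name in it (`solve`, `tightness`, `relax_low_tightness`, `proves_inconsistent`) is hypothetical rather than taken from the thesis. Tightness is assumed here to be the fraction of value tuples a constraint forbids, which may differ in detail from the definition used later in the thesis, and the solver is a plain backtracking search rather than the constraint solver used in the experiments.

```python
def solve(domains, constraints):
    """Plain backtracking search; returns one solution dict or None."""
    variables = list(domains)

    def consistent(assignment):
        # Check only constraints whose whole scope is already assigned.
        for scope, allowed in constraints:
            if all(v in assignment for v in scope):
                if tuple(assignment[v] for v in scope) not in allowed:
                    return False
        return True

    def backtrack(i, assignment):
        if i == len(variables):
            return dict(assignment)
        var = variables[i]
        for value in domains[var]:
            assignment[var] = value
            if consistent(assignment):
                result = backtrack(i + 1, assignment)
                if result is not None:
                    return result
            del assignment[var]
        return None

    return backtrack(0, {})


def tightness(domains, constraint):
    """Assumed definition: fraction of tuples the constraint forbids."""
    scope, allowed = constraint
    total = 1
    for v in scope:
        total *= len(domains[v])
    return 1 - len(allowed) / total


def relax_low_tightness(domains, constraints, k):
    """Drop the k constraints with the lowest tightness."""
    ranked = sorted(constraints, key=lambda c: tightness(domains, c))
    return ranked[k:]


def proves_inconsistent(domains, constraints, k):
    """True if the relaxed CSP is unsolvable, so the original is too.
    False is inconclusive: the original may or may not be solvable."""
    relaxed = relax_low_tightness(domains, constraints, k)
    return solve(domains, relaxed) is None


# Toy composite CSP: three mutually different variables over two values
# (unsolvable), plus one loose constraint involving a fourth variable.
domains = {"x": [0, 1], "y": [0, 1], "z": [0, 1], "w": [0, 1, 2]}
neq = {(0, 1), (1, 0)}
loose = {(a, b) for a in [0, 1] for b in [0, 1, 2]} - {(0, 0)}
constraints = [
    (("x", "y"), neq),
    (("y", "z"), neq),
    (("x", "z"), neq),
    (("x", "w"), loose),
]

print(proves_inconsistent(domains, constraints, 1))
print(proves_inconsistent(domains, constraints, 2))
```

Removing only the loosest constraint leaves the unsolvable core intact, so the relaxed problem already shows that the original is inconsistent. Removing two constraints breaks the core, and the relaxed problem becomes solvable; this says nothing about the original, which is why the choice of relaxation strategy matters.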
Note
Parts of the work that appear in this thesis have been published:
- T. Nordlander, K. Brown, and D. Sleeman, (2003) 'Identifying Inconsistent CSPs by Relaxation', University of Aberdeen, Technical Report: TR0304
- T. Nordlander, K. Brown, and D. Sleeman, (2003) 'Constraint Relaxation Techniques to Aid the Reuse of Knowledge Bases and Problem Solvers' in The Twenty-third SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence, Cambridge: Springer Verlag, pp. 323-336. (Republished in *)
- T. Nordlander, K. Brown, and D. Sleeman, (2003) 'Identifying inconsistent CSPs by Relaxation' in Ninth International Conference on Principles and Practice of Constraint Programming (CP03), Kinsale, Ireland: Springer Verlag, pp. 987
- T. Nordlander, K. Brown, and D. Sleeman, (2004) 'Constraint Relaxation Techniques to Aid the Reuse of Knowledge Bases and Problem Solvers' * in Advanced Knowledge Technologies Consortium (AKT) - Selected Papers, pp. 312-324
Declaration
I declare that this thesis has been composed by myself and describes my own work. It has not been accepted in any previous application for a degree. All verbatim extracts have been distinguished by quotation marks, and all sources of knowledge have been specifically acknowledged.
Tomas Eric Nordlander October 25, 2004
Department of Computing Science, University of Aberdeen
Acknowledgement
This work is supported under the EPSRC’s grant number GR/N15764 and the Advanced Knowledge Technologies Interdisciplinary Research Collaboration, which comprises the Universities of Aberdeen, Edinburgh, Sheffield, Southampton and the Open University. I would like to thank my two supervisors, Prof. Derek Sleeman¹ and Dr. Ken Brown², for giving me the chance to do a PhD. I deeply appreciate their help and support over these three years. I would also like to thank Mats Carlsson at SICStus for helpful discussions about resumption, constraint checks, and search effort. In addition, I would like to express my gratitude to Paula Rice for her help in proofreading my thesis.
Finally, I thank my fiancée Luisa Teresa for her long distance love and support.
Tomas Eric Nordlander October 25, 2004
¹ Department of Computing Science, University of Aberdeen, Scotland, U.K.
² Cork Constraint Computation Centre, Department of Computer Science, University College Cork, Ireland
Table of Contents
TITLE
ABSTRACT
NOTE
DECLARATION
ACKNOWLEDGEMENT
TABLE OF CONTENTS
1 INTRODUCTION
  1.1 Background: Reusing Knowledge & Constraint Satisfaction
  1.2 Thesis Motivation
  1.3 Thesis Layout
2 LITERATURE REVIEW
  2.1 Knowledge Based System
    2.1.1 Knowledge
    2.1.2 Knowledge Base
    2.1.3 Knowledge Based System
    2.1.4 Knowledge Based Scheduling System
    2.1.5 Reuse of KBS Components
    2.1.6 Reusing KBs
  2.2 Constraint Programming
    2.2.1 Constraint Satisfaction
      2.2.1.1 CSP Definition
      2.2.1.2 Search Cost
      2.2.1.3 Constraint Arity
    2.2.2 Constraint Graph & Constraint Hyper-graph
    2.2.3 Search Methods
    2.2.4 Consistency Algorithms
      2.2.4.1 Node-Consistency
      2.2.4.2 Arc-Consistency
      2.2.4.3 Path-Consistency
      2.2.4.4 Obtaining n- and k-Consistency
      2.2.4.5 Search Algorithms that use Consistency Techniques
      2.2.4.6 Approaches that Reduce the Time Complexity
    2.2.5 Search Heuristics
    2.2.6 Random Generated CSPs
    2.2.7 Phase Transition Behaviour & Hardness Peak
    2.2.8 Formulating the CSP
    2.2.9 Examples of CSPs
      2.2.9.1 Cryptarithmetic Puzzles
      2.2.9.2 Graph Colouring
      2.2.9.3 Scheduling
3 KNOWLEDGE BASE REUSE THROUGH CONSTRAINT RELAXATION: A PROPOSAL
  3.1 Reusing KBs
  3.2 MUSKRAT framework
  3.3 The Meta-PS Approach
    3.3.1 The Shortcomings of the Meta-PS Approach
  3.4 My Relaxation Approach
    3.4.1 Scheduling Example in the Mobile Phone Manufacturing Domain
    3.4.2 My Reuse Investigation Process
    3.4.3 The Meta-PS & My Constraint Relaxation Approach Compared
    3.4.4 Possible Criticisms of the Relaxation Approach
  3.5 Knowledge Base Survey
    3.5.1 Source & Classification Process
    3.5.2 Limitations
    3.5.3 Findings
4 RELAXATION STRATEGIES AND THE EXPERIMENTAL TESTBED
  4.1 The Generating Step
    4.1.1 Constraint Tightness
    4.1.2 Constraint Arity
    4.1.3 Constraint Types
      4.1.3.1 relation(?X,+MapList,?Y)
      4.1.3.2 all_different(+Variables)
      4.1.3.3 serialized(+Start,+Duration)
      4.1.3.4 cumulative(+Start,+Duration,+Resources,?Limit)
      4.1.3.5 Non-binary Constraints in a Real-World Context
    4.1.4 Extended Notation for my Problem Classes
    4.1.5 The Process of Generating CSPs
      4.1.5.1 Skeleton CSP
      4.1.5.2 Adding Binary & Non-binary Constraints
    4.1.6 Examples of How My Generated CSPs are Formulated
    4.1.7 Criticisms of Conventional CSP Generating Programs
  4.2 The Relaxing Step
    4.2.1 Random Removal
    4.2.2 Greedy Search
    4.2.3 Greedy Ordering
    4.2.4 Node Degree
    4.2.5 Tightness Removal
    4.2.6 Arity Removal
    4.2.7 Isolate Node
    4.2.8 Implementing the Strategies on Real-world CSP
  4.3 The Solving Step
    4.3.1 The Recorded Statistics
5 EMPIRICAL STUDIES
  5.1 Solution Space Transformations
    5.1.1 Conclusions when Relaxing a CSP
    5.1.2 Conclusions when Tightening a CSP
    5.1.3 Valuable MUSKRAT Transformation Conclusions
  5.2 Phase Transition & Hardness Curve Revisited
  5.3 Experimental Map
  5.4 General Behaviour of the Problem Classes
  5.5 Relaxing Conventional CSPs
    5.5.1 Conclusion
  5.6 Relaxing Binary CSPs with Different Tightness Distributions
    5.6.1 CSPs with Uniform Distributed Tightness
      5.6.1.1 Conclusion
    5.6.2 CSPs with Normally Distributed Tightness
      5.6.2.1 Conclusion
    5.6.3 CSPs with Exponentially Distributed Tightness
      5.6.3.1 Conclusion
  5.7 Relaxing Non-binary CSP with Uniform Distributed Tightness
    5.7.1 Conclusion
  5.8 Summary
6 CONCLUSIONS & FUTURE WORK
  6.1 My Contributions
  6.2 Future Research Directions
    6.2.1 Reusing Search
    6.2.2 Creating New Relaxation Strategies by Combining Existing Strategies
    6.2.3 Possible New Strategies
    6.2.4 Extending the CSP-Suite’s Generating Module
    6.2.5 Randomising the Constraint Type Order for Non-Binary Test-beds
    6.2.6 KB-CSP Transformation
    6.2.7 Applying My Relaxation Algorithms on Over-Constrained Problems
    6.2.8 Applying My Algorithms in Search Heuristics
BIBLIOGRAPHY
APPENDIX: A
  Appendix A1: Taxonomy of Problem Solving Methods
APPENDICES: B
  Appendix B1: Prototype Components of the Knowledge Based Scheduling System
  Appendix B2: The ACM Computing Classification System
  Appendix B3: Knowledge Representation Type Distribution for Individual Proceedings
  Appendix B4: ACM Category Distribution
  Appendix B5: ACM & Knowledge Representation Type Distribution with References
APPENDICES: C
  Appendix C1: Part of the Non-binary Constraint Table
  Appendix C2: Probability Equations of Generating Flawed Variables
  Appendix C3: Example of Recorded Properties
APPENDICES: D
  Appendix D1: Problem Class Map for my Experiments
  Appendix D2: Solution Transition Phase for Relaxation Strategies applied on Problem Classes with Uniform Distributed Tightness
  Appendix D3: Extended Uniform Investigation
  Appendix D4: Solution Transition Phase for Relaxation Strategies applied on Problem Classes with Normal Distributed Tightness
  Appendix D5: Extended Normal Distribution Investigation
  Appendix D6: Solution Transition Phase for Relaxation Strategies applied on Problem Classes with Exponential Distributed Tightness
  Appendix D7: Solution Transition Phase for Relaxation Strategies of Non-binary Problem Classes with Uniform Distributed Tightness
  Appendix D8: Extended Non-Binary Investigation of High Arity
  Appendix D9: Investigation of High Node Degree Performance on Non-binary Problem Classes with Uniform Distributed Tightness
  Appendix D10: Investigation of High Arity Performance on Non-binary Problem Classes with Uniform Distributed Tightness
  Appendix D11: A Slice from the Recorded Properties from the Experiment in Appendix D8
APPENDIX: E
  Appendix E1: KB-Survey Bibliography
Table of Figures
Figure 2-1. Constraint Graph & Constraint Hyper-Graph
Figure 2-2. Reduction in Domain Size with Node-Consistency
Figure 2-3. Reduction in Domain Size with Arc-Consistency
Figure 2-4. Example of Path-Consistency, partly based on [118]
Figure 2-5. Solution Transition Phase for 30
Figure 2-6. Hardness Peak for 30
Figure 2-7. Density & Solution Transition Graph for 30
Figure 2-8. The Different States of Australia; a Map Colouring Problem
Figure 3-1. The MUSKRAT-Advisor [133]
Figure 3-2. Building Process of the Meta-PS
Figure 3-3. Combining KBs with Selected PSs along with the Problem Specifications
Figure 3-4. One of 160 Plausible KB-Combinations
Figure 3-5. An Overview of My Process of Analysing Standardised KB-Combinations
Figure 3-6. Distribution of 149 Papers that Indicate their type of Knowledge Representation
Figure 3-7. ACM Distribution and Knowledge Representation Classification of the Papers that Indicate How they Represent their Knowledge
Figure 4-1. The CSP-Suite
Figure 4-2. Tightness Example
Figure 4-3. Example of Uniform, Normal, & Exponential Tightness Distribution
Figure 4-4. Creation of the Skeleton CSP
Figure 4-5. Creating Random CSPs from Scratch
Figure 4-6. Node Connectivity Contrast between Random CSP and Timetabling Data
Figure 4-7. The Node Connectivity of my CSP-Suite
Figure 4-8. Random Removal Strategy can Create Harder Relaxed CSPs
Figure 4-9. Relaxation Results for Greedy Search
Figure 4-10. Node Connectivity Preference for Greedy Search
Figure 4-11. Tightness & Arity Correlation for Greedy Search’s Selections
Figure 4-12. High Node Degree Strategy
Figure 4-13. Tightness Removal Strategy
Figure 4-14. Arity Removal Strategy
Figure 4-15. Isolate Node Strategy
Figure 4-16. Correlation Test for Resumption
Figure 5-1. Possible Solution Space Transformation when Manipulating a CSP
Figure 5-2. Relaxed Solution Space Transformation
Figure 5-3. Tightened Solution Space Transformation
Figure 5-4. Solution Phase Transition Graph for 20
Figure 5-5. Hardness Curve for 20
Figure 5-6. Tightness Curve and Average Search Effort
Figure 5-7. Solution Transition Phase for 30.
Figure 5-8. Hardness Peak for 50 & 50
Figure 5-9. Experimental Schema
Figure 5-10. Hardness Curves for Different Problem Classes Groups
Figure 5-11. Solution Transition Phase for Different Problem Classes Groups
Figure 5-12. Resumption Profit for all Relaxation Strategies on Conventional CSPs
Figure 5-13. Closer Examination of the Best Performing Relaxation Strategies
Figure 5-14. Search effort for Random Removal vs. Best Strategies
Figure 5-15. Solution Transition Phase when Removing up to 133 Constraints
Figure 5-16. Relaxation Strategies when Removing up to 60 Constraints
Figure 5-17. Resumption Profit for the Most Beneficial Strategies
Figure 5-18. Search Effort for Random Removal vs. Low Tightness
Figure 5-19. Solution Transition phase for Random Removal vs. Low Tightness
Figure 5-20. Relaxation Strategies when Removing up to 60 Constraints
Figure 5-21. Resumption Profit for the most Promising Strategies
Figure 5-22. Search Effort for Random Removal vs. Low Tightness
Figure 5-23. Solution Transition Phase for Random Removal vs. Low Tightness
Figure 5-24. Relaxation Strategies when Removing up to 60 Constraints
Figure 5-25. Resumption Profit for the Most Promising Strategies
Figure 5-26. Search Effort for Random Removal vs. Low Tightness
Figure 5-27. Solution Transition Phase for Random Removal vs. Low Tightness
Figure 5-28. Relaxation Strategies when Removing up to 60 Constraints
Figure 5-29. The Most Promising Relaxation Strategies’ Resumption Profit on Non-binary CSPs
Figure 5-30. Search Effort for Random Removal vs. Low Tightness
Figure 5-31. Transition Phase for Random Removal vs. Low Tightness
Figure 5-32. Low Tightness Problem Movement on the 3D Hardness Graph
Figure 5-33. Low Tightness Problem Movement on the 3D Solution Transition Graph
Figure B2-1. The ACM Computing Classification System [2]
Figure B3-1. Knowledge Representation Type Distribution for Individual Proceedings
Figure B4-1. ACM Category Distribution
Figure B5-1. ACM & Knowledge Representation Type Distribution with References
Figure D1-1. Problem Class Map for my Experiments
Figure D2-1. Solution Transition Phase for Uniform Problem Classes
Figure D3-1. Extended Experiment on Uniform Problem Classes close to the Hardness Peak
Figure D4-1. Solution Transition Phase for Problem Classes with Normal Distributed Tightness
Figure D5-1. Hardness Curves for the Extended Normal Investigation
Figure D6-1. Solution Phase Transition for Problem Classes with Exponential Distributed Tightness
Figure D7-1. Solution Transition Phase for Non-binary Problem Class with Uniform Distributed Tightness
Figure D8-1. Extended Investigation of Non-binary Problem Class’s Hardness Curves
Figure D9-1. Extended High Node Degree Investigation for Non-binary Problem Class with Uniform Distributed Tightness
Figure D10-1. High Arity Investigations for Non-binary Problem Class with Uniform Distributed Tightness
Table of Equations
Equation 2-1. Density Calculation
Equation 2-2. Constrainedness
Equation 2-3. Constrainedness using Number of Constraint instead of Density
Equations 3-1. The Cost Equations for the Meta-PS
Equations 3-2. The Cost Equations for My Relaxation Approach
Equation 4-1. Tightness Calculations
Equation 4-2. Calculating Minimum Amount of Skeleton Constraints
Equation 4-3. Calculate Allowed Tuples
Equation 4-4. Tightness Calculations
Equation C2-1. Probability Calculation of Finding a Flawed Variable
Equation C2-2. Probability of Finding a Flawed Variable for my CSPs
Table of Codes
Code 2-1. Slice of a KB that Contains Declarative and Procedural Knowledge
Code 2-2. Cryptarithmetic Puzzle 'Send+more=money', written in SICStus Prolog
Code 2-3. Graph Colouring Problem for the States of Australia, written in SICStus Prolog
Code 2-4. Simple Scheduling Program, written in SICStus Prolog [114]
Code 4-1. Slices from Factory, Supplier, and Background KBs along with the PS
Code 4-2. Example of how CSPs are Formulated by my CSP Generator
Code 4-3. The Solving Module
Code B1-1. Constraint Satisfier
Code B1-2. KB Factory Canada
Code B1-3. KB Shipping America
Code B1-4. KB Supplier China
Code C1-1. Example of a Non-binary Constraint Table; Arity 3 & Tightness Range 45-30
Table of Tables
Table 2-1. Constraints with Different Arity
Table 2-2. Time- and Space-Complexity for Arc-consistency Algorithms [11, 12, 15, 16, 26, 67, 118]
Table 2-3. Time- and Space-Complexity for Path-Consistency Algorithm [66, 125]
Table A1-1. Taxonomy of Problem Solving Methods [116]
Table C3-1. Example of Recorded Information when NOT Using the Relaxation Module
Table C3-2. Example of Recorded Information when Using the Relaxation Module
Table D11-1. A slice of Recorded Information for Low Node Degree on 30
List of Acronyms & Abbreviations
AC        Arc-Consistency
ACM       Association for Computing Machinery
AI        Artificial Intelligence
AKT       Advanced Knowledge Technologies
BC        Back Checking
BJ        Back Jumping
BM        Back Marking
BT        Standard Back Tracking Algorithm
CBR       Case Based Reasoning
CLP       Constraint Logic Programming
CP        Constraint Programming
CSP       Constraint Satisfaction Problem
FC        Forward Checking
FL        Full Look Ahead
GT        Generate and Test
KB        Knowledge Base
KBS       Knowledge Based System
MAC       Maintaining Arc Consistency
MLT       The Machine Learning Toolbox
MUSKRAT   Multistrategy Knowledge Refinement and Acquisition Toolbox
NC        Node-Consistency
OKBC      Open Knowledge Base Connectivity
OR        Operational Research
PC        Path-Consistency
PLA       Partial Look Ahead
PS        Problem Solver
PSM       Problem Solving Method
Chapter 1 ‘There is nothing more difficult to take in hand, more perilous to conduct, or more uncertain in its success, than to take the lead in the introduction of a new order of things.’ Niccolò Machiavelli
1 Introduction
Knowledge Engineering is often a time-consuming and expensive process, particularly if it involves acquiring new knowledge and constructing new problem solving systems from scratch [24, 33, 34, 108]. Knowledge Reuse addresses this issue by building new systems partly from existing components. The difficulty then becomes one of identifying which components can be reused to address the new task, which is still a demanding problem. One approach to tackling it is the MUSKRAT (Multistrategy Knowledge Refinement and Acquisition Toolbox) framework, which aims to combine problem solving, knowledge acquisition, and knowledge-base refinement in a single computational framework [136]. Given a set of Knowledge Bases (KBs) and Problem Solvers (PSs), the MUSKRAT-Advisor [133] investigates whether combinations of the available KBs will fulfil the requirements of the selected PS for a given problem. White proposed an aid, a Meta-PS [51], that conducts a weaker plausibility test, trying to identify and remove incompatible combinations. The work described in this thesis develops White's proposal by considering how to generate the plausibility test and how to ensure that it does assist in knowledge reuse. This thesis will show how constraint satisfaction techniques can assist in the reuse of standardised KBs³. I propose to represent existing standardised KBs as Constraint Satisfaction Problems (CSPs), which can be combined to produce a composite CSP; the problem solver is then the constraint solver. If the composite
³ The KBs are written in the same language, use the same ontologies, etc.
CHAPTER 1: Introduction
CSP is unsolvable, then that combination of KBs cannot be reused to solve the given problem. Identifying plausible combinations thus requires examining a series of CSPs and rejecting the unsolvable ones. Proving a CSP unsolvable may be a lengthy process. The method I propose to speed up this inconsistency detection is to relax the CSP: if the relaxed version is demonstrated to be inconsistent, then the original CSP has no solution either. Note that if the relaxed CSP has a solution, nothing can be concluded about the original CSP, so the corresponding combination remains plausible and must be retained for further investigation; such a result therefore brings little benefit. However, relaxing a CSP to produce an easier problem is not that simple: previous phase transition research on random binary CSPs has shown that, for problems of the same size and tightness, inconsistent CSPs with fewer constraints are harder to identify as inconsistent than those with more constraints [25, 96]. Furthermore, part of my research has shown that when constraints are removed randomly from an inconsistent CSP, the new relaxed CSP can be several times harder to solve. To test the relaxation approach, I investigate different relaxation strategies on a variety of CSPs based on scheduling problems. The work described in this thesis will show that, contrary to what might be expected from the phase transition results, focused relaxation strategies do produce simpler problems, and thus relaxation is an effective method for plausibility testing. This research contributes to the challenging problem of Knowledge Reuse by developing an aid based on Constraint Programming that enables more rapid identification of incompatible KBs, leaving fewer combinations on which the MUSKRAT-framework must conduct a thorough investigation. The rest of the chapter is structured as follows. Section 1.1 gives a short overview of reusing components of Knowledge Based Systems (KBSs) as well as a brief introduction to constraint satisfaction.
In addition, that section presents a conceptual discussion of how the process of reusing KBs may be enhanced, together with previous attempts at using constraint techniques to aid the reuse process. Section 1.2 discusses the motivation for the work, and section 1.3 gives the layout of the thesis.
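The relaxation test described above is sound because removing constraints can only enlarge the set of solutions. The following is a minimal sketch of that argument, written in Python rather than the SICStus Prolog used in the thesis; the variables, domains and the three toy constraints are invented for illustration and do not come from the thesis's KBs.

```python
from itertools import product

def solutions(domains, constraints):
    """Enumerate every assignment that satisfies all constraints (brute force)."""
    names = list(domains)
    for values in product(*(domains[n] for n in names)):
        assignment = dict(zip(names, values))
        if all(check(assignment) for check in constraints):
            yield assignment

def is_consistent(domains, constraints):
    """A CSP is consistent if it has at least one solution."""
    return next(solutions(domains, constraints), None) is not None

# Hypothetical composite CSP assembled from toy "KB" constraints.
domains = {"start": range(0, 5), "duration": range(3, 6), "deadline": range(4, 7)}
fits = lambda a: a["start"] + a["duration"] <= a["deadline"]   # job must fit
late_parts = lambda a: a["start"] >= 4                         # parts arrive late
early_ship = lambda a: a["deadline"] <= 5                      # ship leaves early
composite = [fits, late_parts, early_ship]

# Two relaxations, each obtained by removing one constraint.
relaxed_a = [fits, late_parts]   # still inconsistent: the composite can be discarded
relaxed_b = [fits, early_ship]   # has a solution: nothing can be concluded

print(is_consistent(domains, composite))
print(is_consistent(domains, relaxed_a))
print(is_consistent(domains, relaxed_b))
```

Only the first relaxation is useful for plausibility testing: an inconsistent relaxed problem proves the composite inconsistent, whereas a solvable relaxation leaves the question open.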
1.1 Background: Reusing Knowledge & Constraint Satisfaction
One of the main goals of the KBS community has been to reuse KBS components in order to ease the bottlenecks in knowledge engineering and management [3]. Dieng et al. [34] have highlighted the importance in industry of being able to store and reuse corporate knowledge. By reusing components and concepts, Sauer and Bruns [109] reduced the development time for their knowledge based scheduling systems from years to months. Recent work has focused on the reuse of a variety of KBS components, including PSs/PSMs, ontologies, and KBs [99, 108]. Schreiber [111] points out that part of the domain knowledge needed for a new system often already exists elsewhere, where it is used to solve other tasks, and could be reused. It is therefore sound to investigate the reuse potential of existing KBs when developing a new KBS. Several processes have been suggested to help with the hard task of reusing KBs. Recently, systems have provided an option to write KBs in a standardised format such as OKBC, which facilitates the necessary merging/mapping of KBs and thereby eases KB reuse. The large number of tools available, and their complexity, can make it difficult for the knowledge engineer to select the most suitable ones when investigating the reuse potential of existing components. One approach that tries to tackle this dilemma is the MUSKRAT framework [51]. The main component of the framework is its Advisor, which investigates whether existing KBs will fulfil the requirements of a selected PS for a given problem. White [134] proposed a weaker test of plausibility for the Advisor, the Meta-PS: a constraint-based approximation whose aim was to identify combinations that can be demonstrated to be impossible. However, the Meta-PS was created informally and therefore cannot guarantee that all the discarded combinations are impossible. Constraint Satisfaction techniques attempt to find solutions to constrained combinatorial decision problems.
A CSP consists of a set of variables, a set of possible values (a domain) for each variable, and a set of constraints, each of which restricts the values that the variables it connects can take simultaneously. A solution to a CSP is an assignment of a value to every variable such that all constraints are satisfied simultaneously. There has been extensive research on randomly generated binary CSPs, and the concept of relaxing CSPs has received considerable attention [18, 43, 110]; in contrast to my aim, however, this other research has focused on changing the CSP in order to introduce solutions.
1.2 Thesis Motivation
This research contributes to the MUSKRAT-Advisor’s challenging reuse investigation by developing an aid based on Constraint Programming that enables quick identification of incompatible KBs, leaving fewer KB-combinations for a thorough investigation. Earlier, the Meta-PS approach suggested a tractable plausibility test to identify inconsistent KB-combinations. The major shortcoming of the Meta-PS was that it could not guarantee that no successful combinations were falsely discarded. My research focuses on using constraint satisfaction techniques to assist in the reuse investigation by quickly identifying incompatible KB-combinations while guaranteeing that no successful KB-combinations are falsely discarded.
Problem: To find a combination of existing standardised KBs, created for previous tasks, that can be reused to solve a new task.
Hypothesis: If the standardised KBs can be represented as CSPs, then my relaxation techniques eliminate impossible combinations faster than executing the combinations.
To test this hypothesis I have created the CSP-Suite, a new Prolog test suite that assists in creating and evaluating relaxation strategies on a wide range of problems. The CSP-Suite produces a large number of CSPs based on real-world scheduling problems, then relaxes and solves them. To help characterise real-world problems in terms of their entities and relationships, I have investigated the characteristics and ontologies [101, 122] of scheduling problems as well as the properties of their constraint graphs (e.g. [131]). This investigation gave me a better understanding of the building blocks of real-world scheduling problems, which was vital in my effort to close the gap between the properties of real-world scheduling problems and those of the problems my CSP-Suite generates. For example, real-world standardised KBs are likely to contain non-binary constraints, and in order to minimise the KB-to-CSP transformation work required, my relaxation method should apply directly to those constraints. Therefore, different types of non-binary constraints frequently used in real-world scheduling problems are implemented in the CSP-Suite. In addition, the CSP-Suite can generate not only conventional CSPs with fixed tightness but also binary CSPs with different statistical tightness distributions. My relaxation strategies, which are based on different constraint graph properties, are in most cases easy to implement. The aim of each relaxation strategy is to remove constraints from a CSP to create a new, relaxed CSP that should be easier to solve. It is not certain that relaxing a CSP will produce an easier problem. In fact, phase transition research (e.g. [25, 96]) seems to indicate the opposite when the original CSP is inconsistent: as constraints become looser, or the connectivity of the problems becomes sparser, the time required to demonstrate inconsistency for random problems increases. Moreover, part of my research has shown that when randomly chosen constraints are removed from an inconsistent CSP (binary or non-binary), the new relaxed CSP can be up to 10 times harder to solve (see section 4.2.1). By carefully selecting the constraints to remove, however, some of the strategies produce relaxed CSPs that are approximately 50% easier to demonstrate inconsistent, without introducing a solution. Note that only an inconsistent relaxed CSP allows the original CSP, and the corresponding KB-combination, to be discarded. The previous sections described how the relaxation approach contributes to the MUSKRAT-Advisor's investigation of whether existing KB-combinations can be reused to solve a new task. A survey was undertaken which confirms that KBs are still commonly used among AI researchers. Nevertheless, I argue that, provided the knowledge can be harvested by the constraint solver, my relaxation approach is independent of the knowledge source.
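To make the idea of a relaxation strategy based on a constraint graph property more concrete, the sketch below shows one hypothetical strategy in Python (not the thesis's SICStus Prolog). It assumes that a constraint is described only by its scope (the variables it connects) and that the strategy removes the constraints attached to the least connected variables. The function names, the scoring rule and the example scopes are all assumptions made for illustration; the strategies actually evaluated in the thesis are those described in section 4.2.

```python
from collections import Counter

def node_degrees(scopes):
    """For each variable, count the constraints whose scope mentions it."""
    degrees = Counter()
    for scope in scopes:
        degrees.update(set(scope))
    return degrees

def relax_low_degree(scopes, n_remove):
    """Return the indices of the constraints kept after removing the
    n_remove constraints that touch the least connected variables."""
    degrees = node_degrees(scopes)

    def score(i):
        # Assumed scoring rule: the lowest degree among the constraint's variables.
        return min(degrees[v] for v in scopes[i])

    order = sorted(range(len(scopes)), key=lambda i: (score(i), i))
    removed = set(order[:n_remove])
    return [i for i in range(len(scopes)) if i not in removed]

# Scopes of a small hypothetical CSP; the fourth constraint is non-binary.
scopes = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C", "D"), ("D", "E")]
kept = relax_low_degree(scopes, n_remove=1)
print(kept)
```

Because the strategy works on scopes rather than on pairs of variables, it applies to non-binary constraints without first transforming them into binary ones.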
1.3 Thesis Layout
The thesis is best read in the order in which the chapters are presented. I created a prototype of a knowledge based scheduling system in the mobile-phone manufacturing domain; it is written in SICStus Prolog and is available for download [88]. I use this prototype throughout the thesis to exemplify my reasoning, to highlight the benefit of reusing KBs from different parts of the production chain (factories, suppliers, shipping companies, etc.), and to illustrate the complexity of production scheduling in a comprehensible way, using a vocabulary that can be understood by a person who is not an expert in the domain. In addition, the prototype exemplifies an area where my relaxation strategies may be successfully applied, and it helped to focus my research on creating relaxation strategies that can be applied to real-world scheduling problems and on making sure that my CSP-Suite generates problems with scheduling properties. The thesis is structured as follows.
Chapter 2 presents background information on KBSs and the attempts to reuse some of their components. The chapter highlights common problems and the limitations of reuse. In addition, the chapter gives general information about Constraint Programming (CP), with a focus on CSPs. Thereafter, it discusses the phase transition and the hardness peak of conventional problem classes, and explains why knowledge about their behaviour is important to my research.
Chapter 3 gives background information on knowledge based scheduling systems and explains why I chose to exemplify the relaxation concept for these types of problems. Moreover, the chapter highlights the benefits of reusing knowledge in scheduling systems and the difficulties in reusing KBs. The chapter describes the MUSKRAT-framework and its Advisor’s approach to reuse, as well as a previously suggested approach to assist in this process. In this chapter, I present how my relaxation approach contributes to the reuse investigation process. In addition, I compare my approach with previous attempts and show how it overcomes their major shortcomings. Furthermore, the chapter shows how my relaxation approach is exemplified in the mobile phone manufacturing domain. Finally, the chapter ends with a presentation of the results of the KB-Survey undertaken to determine whether KBs are commonly used in the AI community.
Chapter 4 describes the CSP-Suite, which is written in SICStus Prolog and designed to generate test-beds of CSPs and to help identify and test relaxation strategies whose goal is to produce relaxed CSPs that are easier to demonstrate inconsistent. In parallel with the description of the CSP-Suite, the chapter discusses the steps by which CSPs are created, relaxed and solved. There is a detailed discussion of the difficulties, and of the compromises needed, in creating CSPs with real-world properties.
In Chapter 5, a theoretical investigation of possible ways of manipulating the solution space of a CSP is presented. The findings can be helpful for the MUSKRAT-framework. In addition, the chapter revisits the phase transition and the hardness curve and describes why understanding the behaviour of the phase transition and the hardness of the problem classes is important for my relaxation approach. Moreover, the experimental set-up is described, together with its limitations regarding the chosen problem classes. Thereafter, the extensive empirical results of my relaxation strategies on a wide range of problem classes are presented. Chapter 6 summarises the results and contributions of my thesis and presents possible future research directions.
Chapter 2 ‘Many learned persons have read themselves stupid.’ Arthur Schopenhauer
2 Literature Review
This chapter presents background information on Knowledge Based Systems (KBSs) and Constraint Programming (CP), the two main Artificial Intelligence (AI) paradigms combined in my research. The chapter is divided into two sections. The first discusses KBSs, knowledge based scheduling systems, and the attempts to reuse some of these systems’ components. The second gives general information about CP, with a focus on Constraint Satisfaction Problems (CSPs): it presents a formal definition of a CSP, discusses search and consistency algorithms, describes the behaviour of CSPs, and ends with a presentation of classic examples of CSPs.
2.1 Knowledge Based System
This section is organised as follows: firstly, I define knowledge and Knowledge Bases (KBs); secondly, I give a formal definition of a KBS and of a Problem Solving Method (PSM); lastly, I discuss knowledge based scheduling systems and present previous attempts at reusing KBS components. In addition, common problems and limitations of reuse are highlighted.
2.1.1 Knowledge
There exist numerous definitions of knowledge; some are thousands of years old and date back to philosophers such as Plato. Epistemology is the theory of knowledge: the study of knowledge, its sources, varieties, and limits. There are two extreme positions regarding knowledge, namely empiricism and apriorism. While empiricism states that knowledge can only be derived from experience, apriorism holds that knowledge is innate. My concept of knowledge does not align
CHAPTER 2: Literature Review
completely with either of these views, although it is closer to empiricism than to apriorism. People working in different fields have different views of knowledge: the definition of knowledge is context sensitive, and therefore no generic definition exists. This view is widely supported [27, 29, 111]. Because data and information are closely connected to knowledge, let me begin by defining them. Knowledge is the whole body of data and information needed to take action or create new information [111]. If the ‘context sensitive’ definition described above is applied to the data that my CSP-Suite produces (see Appendix C3), a person with no experience of constraints would see the data only as data. However, someone working in the constraint domain, analysing the data along with the headers, would see the data as information that gives them the knowledge needed for possible actions, for example creating new constraint relaxation strategies (section 4.2). For knowledge definitions in a business context see [34, 35, 79].
2.1.2 Knowledge Base
The knowledge in a KB can be divided into declarative and procedural knowledge. The declarative knowledge is stored as a set of facts about objects, events, etc., which can easily be modified, added to, or deleted. The procedural knowledge is stored as a set of procedures which themselves determine when they should be executed; it shows how to find the relevant facts and how to make inferences, for example how to calculate the total weight of the phone depending on the choice of components. Code 2-1 shows a slice from a KB, written in Prolog, from my prototype scheduling system [88] in the mobile-phone manufacturing domain. The declarative knowledge gives us facts such as the weight of the phone modules, the time needed to unpack and configure them, the number of employees needed for unpacking, and the cost of the individual components of a mobile phone.
This separation of different knowledge areas eases the process of sharing, modifying, or reusing parts of the knowledge. For example, if the company needs to update this KB by adding, removing, or modifying batteries, doing so is rather straightforward.
:- use_module(library(clpfd)).   % provides the #= constraint used in the rule below

% Declarative knowledge
%%% FACTS Material %%%
% (TYPE,W_Battery,Unpack_Batt_Dur,Unpack_Batt_Res,Batt_cost)
battery(1,30,12,2,35).
battery(2,25,12,2,45).
battery(3,25,12,2,55).
battery(4,18,12,2,65).
battery(5,16,12,2,67).
battery(6,15,12,2,85).
% (TYPE,W_base,Unpack_Base_Dur,Unpack_Base_Res,Base_cost)
base(1,30,10,2,35).
base(2,25,10,2,45).
base(3,25,10,2,55).
base(4,18,10,2,65).
% (TYPE,W_antenna,Unpack_Antenna_Dur,Unpack_Antenna_Res,Antenna_cost)
anntenna(1,30,2,2,35).
anntenna(2,25,2,2,45).
anntenna(3,25,2,2,55).
% (TYPE,QUALITY,W_Screen,Unpack_Screen_Dur,Unpack_Screen_Res,Screen_cost)
screen(1,2,6,5,8,2,45).
screen(2,2,5,5,8,2,46).
screen(3,2,3,5,8,2,57).
screen(4,56,3,5,8,2,65).

% Procedural knowledge
%%% RULE Weight %%%
% The total weight is the sum of the weights of the selected components.
calculate_weight(Total_weight,Battery_Type,Base_Type,Antenna_Type,Screen_Type) :-
    battery(Battery_Type,W_Battery,_,_,_),
    base(Base_Type,W_base,_,_,_),
    anntenna(Antenna_Type,W_antenna,_,_,_),
    screen(Screen_Type,_,W_Screen,_,_,_,_),
    Total_weight #= W_Battery+W_base+W_antenna+W_Screen.

Code 2-1. Slice of a KB that Contains Declarative and Procedural Knowledge
2.1.3 Knowledge Based System
Before the term KBS came into use, these applications were called ‘expert systems’; people in industry and academia have stopped using that term because experts themselves disapproved of the word ‘expert’, since it implied that the systems were as good as they were. The term KBS is slightly broader, since it does not imply that the knowledge can only come from human experts. There are more recent terms, such as intelligent systems and knowledge systems, which encompass both ‘expert system’ and KBS. Throughout the thesis I will simply use the term KBS and let it include all the terms mentioned above. There are many different ways to divide a KBS into components; a common way is to divide it into a KB and a domain-independent Problem Solver (PS). The PS is the solving technique the KBS uses on a problem. The abstract model of a KBS's problem solving behaviour is known as the Problem Solving Method (PSM) [76] (also known as the inference system [93]); a KBS is thus an application that, through its PSM, makes decisions based on declarative and procedural knowledge in a certain domain. Many different types of PSM have been suggested for different problems, for instance prediction, diagnosis, and design (for more examples see Table A1-1, Appendix A1).
KBSs are developed using KB shells, AI languages, or conventional languages. Shell software (e.g. KEE, GoldWorks-II, Nexpert Object, Leonardo, Xi Plus, Flex) [36] has been specifically designed to enable quick development of KBSs, but AI languages (e.g. LISP and Prolog) are commonly used as well. It is also possible to use more conventional languages, for example Fortran, C++, or Java. KBSs have been developed for a variety of reasons, including archiving rare skills, preserving the knowledge of retiring personnel, supporting decision making, and aggregating all of the available knowledge in a specific domain from several experts and/or machines. A criticism made by knowledge representation theorists is that a KB is not a truthful representation of the actual knowledge [30, 138]. Although this may be the case, the knowledge representation is good enough for KBSs to be accepted, implemented, and used in industry [80]. MYCIN [112], a medical diagnosis tool, was one of the first systems and was already in use in 1974. Given information concerning the symptoms and test results of a patient, MYCIN attempted to identify the cause of the patient's problem and suggest treatments [4]. According to McCarthy [74], MYCIN did better than medical students or practising doctors, provided its limitations were observed. KBS is one of the AI approaches that companies have found easiest to embrace, and there exist numerous additional examples of successful KBSs in business [33, 40, 77, 80]. The most frequent appear in production, marketing, and customer service [78]. Some of the more successful systems are scheduling KBSs such as PROTOS, PSY, and MEDICUS, which schedule work in chemistry, the metal industry, and heart surgery [109].
2.1.4 Knowledge Based Scheduling Systems
In the last few years, interest in using artificial intelligence technologies to aid in scheduling problems has increased [46, 97, 108, 109].
The first knowledge based scheduling system, ISIS [41], appeared over twenty years ago and was followed by systems such as XCON [6], OPIS [120], SONIA [28], YAMS [94], S2 [37], DAS [20], and REDS [53]. Knowledge based production scheduling is divided into predictive and reactive scheduling. Predictive scheduling is the scheduling done before production starts (offline). Reactive scheduling is the process of revising the predictive schedule during production (in real time) when the original schedule cannot be executed due to changes in manufacturing conditions, such as machine breakdowns, delivery problems, or the sudden absence of personnel. Most research has focused only on the predictive part [109], even though the reactive part is important for acceptance in industry. Throughout the thesis I will use the term Scheduling KBS for a Knowledge Based Scheduling System. For more information about the history of Scheduling KBSs, and for an overview, I recommend [92, 97, 121].
2.1.5 Reuse of KBS Components
This research is supported by the Advanced Knowledge Technologies (AKT) Interdisciplinary Research Collaboration, which focuses on six challenges to ease substantial bottlenecks in the capture and management of knowledge. The reuse of knowledge is one of those challenges [3]. Most KBSs are developed from scratch, and the required knowledge acquisition is time consuming and therefore expensive [24, 33, 34, 108]. If new systems could be built by reusing parts of existing KBs and PSs, development might be quicker and money could be saved. Reusing components of KBSs has for this reason been considered one of the main goals of the KBS community [24]. Several different KBS components have been suggested for reuse [23, 93, 109, 111]. At an early stage, researchers in the Knowledge Engineering sub-area identified a range of PSMs which, they argued, covered the whole range of problem solving, from methods for Classification and Diagnosis through to Planning (so-called Synthesis tasks) [56]. An early but powerful example of the reuse of a PSM was the use of the EMYCIN (Empty/Essential-MYCIN) [10, 112] shell with a variety of domain-specific knowledge bases, for example in the diagnosis of infectious diseases and the analysis of building structures. Current work on reuse has resulted in systems where a number of components have been reused, including PSs/PSMs [10, 62, 112], ontologies [23, 62, 128] and KBs [3, 93, 99, 108, 111].
The use of cases in Case Based Reasoning is a related activity [65]. Industry is currently using Scheduling KBSs [97, 108, 109, 111, 126], and in some cases is even reusing standardised knowledge components successfully. The development of the Scheduling KBSs PROTOS, PSY, and MEDICUS is an excellent example of how reuse can reduce development time. PROTOS, the first project, took three years to develop, and several of its system concepts were reused when producing PSY, which took only one year to develop. By reusing experience and concepts from both previous projects, the development time for MEDICUS was, according to Sauer and Bruns [109], just three months. The large number of knowledge acquisition tools and techniques, and their complexity, can make it complicated for users to select the most suitable ones for KBS development or for investigating reuse potential. One approach to dealing with this problem is to create toolboxes and advisory systems [64, 135, 136]. One example is the MUSKRAT (Multistrategy Knowledge Refinement and Acquisition Toolbox) framework [51, 136], which aims to unify problem solving, knowledge acquisition and machine learning in a single computational framework. The framework contains an advisory system called ‘The MUSKRAT Advisor’ [133], which investigates whether a combination of existing KBs, together with a specific PS, can be reused to solve a task. A more detailed description of the MUSKRAT-framework is given in section 3.2.
2.1.6 Reusing KBs
In the development of a new Knowledge Based System (KBS), part of the required domain knowledge often exists elsewhere [111]. Therefore, it makes sense to investigate the reuse potential of existing KBs when developing a KBS. Engineers are currently exchanging information across distributed design teams and corporate boundaries at an earlier stage, and are reusing information to a greater extent. In industry, the term distributed Corporate Memory is used for the support of sharing and reusing knowledge between groups, teams, suppliers, etc. Dieng et al. [34] highlight the importance of being able to store and reuse corporate knowledge in industry. For example, a mobile phone manufacturing company that plans to create a new production scheduling KBS would be well advised to check first whether the whole or part of the existing KBs (taken from its own company, suppliers, and delivery companies) could be reused.
Then, when the manufacturer tries to answer a new question such as ‘Is it possible to produce and deliver a mobile phone type with certain properties to a specific market?’, they can do so by reusing KBs from suppliers’ factories, delivery companies etc.
There are several processes to assist in reusing non-standardised KBs⁴, such as searching, translation, comprehension, comparing, slicing [129], reformulation, and merging [24]. These are all very hard tasks, and ontologies [93] play an important part in facilitating the sharing and merging of KBs. Ontologies are ‘content theories about the sorts of objects, properties of objects, and relations between objects that are possible in a specified domain of knowledge’ [23]. Ontologies make two main contributions to the KBS community. Firstly, the creation of an ontology clarifies the knowledge structure before KBs are built, by articulating the concepts and relationships inherent in the knowledge. Secondly, an ontology enables the knowledge representation to be shared, so that people in a domain use common structures and vocabulary for creating KBs. When searching through, comparing, and merging KBs that were developed using different ontologies, some type of matching is needed to check for essential knowledge and to detect redundant knowledge. KBS ontology research not only conceptualises domain knowledge but also describes mapping possibilities between KBS components [122]. These contributions facilitate the search for structural and lexical similarities between the knowledge requirements of the PS and the existing knowledge in KBs, which increases the chances of mapping, merging, and ultimately reusing KBs [23]. Merging, sharing and reusing ontologies is currently a popular research area [54]. Some interesting research on characterising KBs is also being undertaken [117, 129], which could be used to describe or summarise KBs. Still, no fully automatic knowledge identification and tagging system is known to exist. For this reason, using standardised KBs, which facilitate the merging/mapping of knowledge, is vital in the development of new KBSs and should be done whenever possible.
Recently, systems such as PROTÉGÉ have provided an option to write KBs in a standardised format such as the OKBC (Open Knowledge Base Connectivity) protocol [49], which facilitates investigations of KB reuse. My research makes the assumption that the KBs in the mobile phone manufacturing problem are written in the same language and use a common ontology, the context and structure of which are understood by the user. This might not be true for all real-world problems, but since current work is concerned with developing standardised KBs, I believe this assumption is appropriate.
⁴ The KBs could be written in different languages, use different ontologies, etc.
2.2 Constraint Programming
Constraint programming (CP) has been applied successfully to many real-world problems that can easily be modelled in terms of constraints, such as scheduling, planning, configuration, layout, resource allocation, and decision support [100, 130]. Other areas where CP is used are concurrent computing, database systems, graphical interfaces, hardware verification, operations research, and combinatorial optimisation [8, 45, 58, 107, 130]. In the eighties constraint logic programming (CLP) appeared, the first general-purpose computational framework based on combining constraints and logic programming [58].
2.2.1 Constraint Satisfaction
Constraint Satisfaction techniques attempt to find solutions to constraint satisfaction problems (CSPs) [7, 127]. There are a number of efficient toolkits and languages available that are specially designed to handle these problems, for instance ILOG and SICStus [59, 113].
2.2.1.1 CSP Definition
The definition of a Constraint Satisfaction Problem (CSP) is: A set of variables X={X1,..., Xn}, For each variable Xi, a finite set Di of possible values (its domain), and A set of constraints C ⊆ Dj1 × Dj2 × …× Djt, restricting the values that subsets of the variables can take simultaneously.
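The definition above can be written down almost literally as data. The following Python sketch is only an illustration and is not part of the Prolog test suite used in this thesis; the toy variables, domains, constraints and the helper names `relation` and `is_solution` are assumptions made for the example. Each constraint is stored as a scope together with the set of allowed value tuples, and a complete assignment is accepted only if every constraint holds.

```python
from itertools import product

# Hypothetical toy CSP: three variables, each with the domain {1, 2, 3}.
variables = ["X1", "X2", "X3"]
domains = {"X1": {1, 2, 3}, "X2": {1, 2, 3}, "X3": {1, 2, 3}}

def relation(scope, predicate):
    """Enumerate the allowed tuples of a constraint, i.e. a subset of
    the Cartesian product of the domains of the variables in scope."""
    value_lists = (sorted(domains[v]) for v in scope)
    return {t for t in product(*value_lists) if predicate(*t)}

# Each constraint is a (scope, allowed tuples) pair.
constraints = [
    (("X1",), relation(("X1",), lambda a: a != 1)),                # arity 1
    (("X1", "X2"), relation(("X1", "X2"), lambda a, b: a != b)),   # arity 2
    (("X1", "X2", "X3"),
     relation(("X1", "X2", "X3"), lambda a, b, c: a + b == c)),    # arity 3
]

def is_solution(assignment):
    """True if every variable has a value from its domain and all
    constraints are satisfied by the assignment."""
    if set(assignment) != set(variables):
        return False
    if any(assignment[v] not in domains[v] for v in variables):
        return False
    return all(tuple(assignment[v] for v in scope) in allowed
               for scope, allowed in constraints)
```

Storing the relation explicitly is only practical for small domains; a constraint solver normally keeps the predicate and propagates it instead.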
A solution to a CSP is an assignment of a value from its domain to every variable such that all constraints are satisfied. The main CSP solution technique interleaves consistency enforcement [42], in which infeasible values are removed from the problem through reasoning about the constraints, with various forms of backtracking search. The same approach also serves to identify unsolvable problems. Formulating a problem as a CSP tends to be less complicated than using traditional Operational Research (OR) techniques (e.g. [119]), although fine-tuning the search sometimes requires remodelling the CSP in a more complicated fashion than the less expressive [97] OR techniques. In a CSP, variables and domains correspond directly to the problem entities and the values they can take. In some cases constraint satisfaction techniques may give a solution faster than OR techniques such as integer linear programming [8, 57, 118, 119].

2.2.1.2 Search Cost
Solving a CSP may be intended to achieve one of the following goals:
- demonstrate there is no solution;
- find any solution;
- find all solutions;
- find an optimal, or at least a good, solution given some objective evaluation function.

According to Freuder and Wallace [44], a standard measure of effort for a CSP algorithm is the number of constraint checks. Other properties such as time, backtracking, and resumption are also commonly used to measure the cost of the search, which depends on the following CSP properties:
- The structure of the problem: how the constraints interact to rule out assignments.
- The individual constraints: some constraints are cheap to test/propagate, while others are expensive. Some constraints even push the problem into areas where there are no efficient solving methods [26].
- The number of solutions in a best-solution search or in an all-solution search.
2.2.1.3 Constraint Arity
The arity of a constraint is the number of variables that it constrains. A ‘unary constraint’ constrains a single variable, while a constraint that constrains two variables is called a ‘binary constraint’. In general, a constraint that constrains the values of n variables is called an ‘n-ary constraint’. Table 2-1 below gives some examples of constraints and their arity⁵.
5 I used the SICStus not-equal sign ‘#\=’ from its constraint library over finite domains.
Constraint                  Arity   Name
X1#\=0                      1       Unary constraint
X1#\=X2                     2       Binary constraint
all_different(X1,X2,X3)     3       Non-binary constraint with an arity of 3
…                           …       …
X1#\=X2#\=…Xn               n       n-ary constraint

Table 2-1. Constraints with Different Arity
One of the reasons that researchers have mainly worked with binary constraints over the last 10 years is that every constraint of arity greater than 2 can be reformulated⁶ and represented with binary constraints [5, 7]. For instance, the arity 3 constraint ‘all_different(X1,X2,X3)’ can be reformulated as the following three binary constraints: ‘X1#\=X2, X2#\=X3, X3#\=X1’. For more information about this binary representation of a non-binary constraint, see [5]. I have reservations about the practice of using solely binary constraints [5, 31, 32, 61, 102]; in section 4.1.2 I present the shortcomings of this practice and argue for introducing a mixture of different constraint arities.

2.2.2 Constraint Graph & Constraint Hyper-graph

Binary CSPs are normally represented graphically as Constraint Graphs (left graph in Figure 2-1). The nodes of the graph represent the variables, and each constraint is represented by an edge joining the two nodes it constrains. Non-binary CSPs are normally represented as Constraint Hyper-graphs, where the nodes again represent the variables and each constraint is drawn as a closed curve around the variables that are involved in it (right graph in Figure 2-1).
6 Even though in practice this transformation is not likely to be worth doing.
Figure 2-1. Constraint Graph & Constraint Hyper-Graph.
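As an illustration of the two representations and of the binary reformulation of ‘all_different(X1,X2,X3)’, the following Python sketch (not the SICStus code used in this thesis; all names and the domain {1,2,3} are assumptions chosen for the example) stores the non-binary constraint as one hyper-edge and its binary reformulation as three ordinary edges, and checks by exhaustive enumeration that both accept exactly the same assignments. Note that this pairwise decomposition is specific to all_different; other non-binary constraints generally need different encodings.

```python
from itertools import combinations, product

def all_different(*values):
    """Non-binary constraint: every value is distinct."""
    return len(set(values)) == len(values)

def pairwise_not_equal(x1, x2, x3):
    """Binary reformulation: X1 #\\= X2, X2 #\\= X3, X3 #\\= X1."""
    return x1 != x2 and x2 != x3 and x3 != x1

# Constraint hyper-graph: one hyper-edge containing all three variables.
hyper_edge = frozenset({"X1", "X2", "X3"})

# Constraint graph of the reformulation: one edge per pair of variables.
edges = {frozenset(pair) for pair in combinations(sorted(hyper_edge), 2)}

# Exhaustive check over an assumed domain {1, 2, 3} for each variable.
domain = [1, 2, 3]
equivalent = all(all_different(*t) == pairwise_not_equal(*t)
                 for t in product(domain, repeat=3))
```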
2.2.3 Search Methods

The majority of search algorithms systematically assign possible values to the variables. Although algorithms of this type are guaranteed to find a solution if one exists, they have the drawback of sometimes requiring a lot of time for the search. The effectiveness of an algorithm is normally judged by its time complexity, that is, how long it takes to find the solution. Note that search is also commonly referred to as labelling.

One of the earlier systematic search algorithms is Generate and Test (GT), which starts by randomly generating a value for each variable and checking whether the resulting set is consistent with the existing constraints. The instantiation and checking procedure iterates until a solution is found or until all possible instantiations have been tried. The advantage of this algorithm is that it is easy to implement; however, it has two major drawbacks. Firstly, its run-time complexity is exponential, O(max(|Di|)^n), where n is the number of variables and |Di| is the size of the domain of variable Xi. Secondly, the algorithm is rather inefficient because it does not memorise previous inconsistent variable instantiations and will continue to instantiate the same inconsistent values to the variables.

The Standard Backtracking algorithm (BT) is a more commonly used systematic search algorithm and can be seen as a modified GT algorithm that surmounts the last shortcoming of GT. After the first domain value is instantiated to one of the variables, the standard BT algorithm continues to instantiate another variable and checks for consistency against the first instantiation (the partial solution). If consistent, it extends the partial solution with the new instantiation and continues by instantiating the next variable, checking for consistency against the previous partial solution. This process iterates until either a complete consistent assignment (a solution) is found or all alternatives have been tried without finding one, in which case no solution exists. If an inconsistency is detected during this process, BT ignores all further instantiations containing that partial solution, backtracks to the last successful variable instantiation and re-assigns it with a new domain value. This means that the algorithm avoids some of the inconsistent search space that GT would examine.

If BT can find a solution without any backtracking, its run-time complexity becomes linear. This is seldom the case, as most non-trivial problems require backtracking; the worst-case time complexity then becomes exponential, O(dnm), and the space complexity linear, O(dn). In order to reduce the amount of backtracking it is possible to implement search heuristics (see section 2.2.5) that consider the ordering of variables and values in the instantiation process.

Standard BT has three drawbacks that affect its run-time complexity. Firstly, thrashing, which is the failure of BT to detect the actual variable that makes the partial instantiation inconsistent. For example, X1 may be instantiated with value Da, and the search then continues instantiating values for variables X2, X3, …, Xn without realising that it is impossible to find any consistent assignment for these variables as long as Da is instantiated to X1. Secondly, redundant work: even when the reason for an inconsistency is correctly detected, it is forgotten, so an identical inconsistency that occurs later in the search has to be detected again.
Lastly, late inconsistency detection: the algorithm only detects an inconsistency after all variables in the partial assignment have been instantiated.

Intelligent BT algorithms have been developed to overcome the drawbacks of standard BT, for example Back-jumping (BJ), Backmarking (BM) and Back-checking (BC). These are all Intelligent BT ‘Look Back algorithms’ that, by checking consistency among the assigned variables, can overcome the first two limitations of standard BT. When backtracking takes place, these algorithms can identify the source of the inconsistency and backtrack to the place where the inconsistent variables were assigned. Although these algorithms normally perform better than standard BT, they still suffer from the drawback of only detecting an inconsistency after the assignment has been made. Algorithms that manage to overcome the third weakness of standard BT apply consistency techniques (see section 2.2.4) during search, to remove inconsistent values from the domains before the instantiation is done. Several of these so-called Intelligent BT ‘Look Ahead algorithms’ have been proposed; see section 2.2.4.5.

Search algorithms such as standard BT are guaranteed to find any existing solution, and their run-time complexity is linear if no backtracking is needed. This is hardly ever the case, however: most non-trivial problems need backtracking, and the run-time complexity becomes exponential. Intelligent BT algorithms address the three inadequacies of standard BT that affect its run-time complexity, but they are sometimes so costly to apply that standard BT is preferred. For further information on search algorithms see [46, 67, 105, 118].

2.2.4 Consistency Algorithms

Consistency techniques were first introduced for picture recognition programs [132] and later successfully applied to different hard search problems [46]. Consistency techniques try to detect and remove inconsistent values from the domains of the variables, but can seldom discard all inconsistent domain values for a problem. Because consistency algorithms do not remove any values that would take part in a solution, they can be considered to transform the original CSP into an equivalent one. Note that although consistency algorithms are often called discrete relaxation algorithms, they are completely different from the relaxation algorithms I have introduced (see section 4.2). The effectiveness of consistency algorithms is normally judged by how long it takes to find the solution (time complexity) as well as how much memory is needed to perform the search (space complexity).
Because these algorithms cannot demonstrate consistency (they are incomplete), they are more frequently used either interleaved with the search or before the search, as a preparation phase that removes redundant domain values which might otherwise be detected several times and thus slow down the search. Researchers have long worked under the assumption that consistency checks before the search are always valuable. It should be noted, however, that empirical results [106] have shown that consistency checking before the search can interfere with the interleaved checking inside a search algorithm, making the search with pre-processing consistency checks more costly.
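Cost comparisons of this kind are usually made by counting constraint checks (section 2.2.1.2). As a minimal sketch, the Python code below counts the checks needed by Generate and Test and by standard BT (section 2.2.3) to prove a small CSP inconsistent. It is an illustration only and not the Prolog test suite of this thesis: the problem (four variables with domain {1,2,3} that must all differ), the function names and the counter are assumptions, and GT enumerates the assignments in a fixed order rather than generating them randomly.

```python
from itertools import combinations, product

# Hypothetical inconsistent CSP: four variables, three values, and a binary
# not-equal constraint between every pair of variables.
variables = ["X1", "X2", "X3", "X4"]
domains = {v: [1, 2, 3] for v in variables}
constraints = list(combinations(variables, 2))

def check(assignment, a, b, counter):
    """One constraint check of a != b; every call is counted."""
    counter["checks"] += 1
    return assignment[a] != assignment[b]

def generate_and_test(counter):
    """Test complete assignments one by one until one satisfies everything."""
    for values in product(*(domains[v] for v in variables)):
        assignment = dict(zip(variables, values))
        if all(check(assignment, a, b, counter) for a, b in constraints):
            return assignment
    return None

def backtrack(assignment, counter):
    """Standard BT: extend a consistent partial assignment variable by variable."""
    if len(assignment) == len(variables):
        return dict(assignment)
    var = variables[len(assignment)]
    for value in domains[var]:
        assignment[var] = value
        # Check only constraints between var and already assigned variables.
        consistent = all(check(assignment, a, b, counter)
                         for a, b in constraints
                         if var in (a, b) and a in assignment and b in assignment)
        if consistent:
            result = backtrack(assignment, counter)
            if result is not None:
                return result
        del assignment[var]  # undo and try the next value
    return None

gt_counter = {"checks": 0}
bt_counter = {"checks": 0}
gt_result = generate_and_test(gt_counter)
bt_result = backtrack({}, bt_counter)
```

Both searches return None, since no solution exists, but BT rejects an inconsistent partial assignment once, whereas GT re-tests every complete assignment that extends it and therefore needs more checks on this instance.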
2.2.4.1 Node-Consistency
Node-consistency algorithms check that each variable (node) connected to a unary constraint is consistent. They locate the variables that are constrained by unary constraints. When such a variable is found, the algorithm checks each of the domain values of the variable against the unary constraint and removes those that violate the constraint. A variable is Node Consistent (NC) if all its domain values satisfy the unary constraint, and a CSP is considered NC if all the variables connected to a unary constraint are NC. Figure 2-2 shows a CSP example where X1 is the only variable with a unary constraint. After the algorithm locates X1, it examines the domain values of X1 to see whether any of them violate the unary constraint. In the example the domain of X1 is reduced from {1,2,3,4,5,6,7,8,9,10} to {1,2}, because the values {3,4,5,6,7,8,9,10} violate the unary constraint, which states that X1 can only take a value less than 3.
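The domain reduction in this example can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `node_consistency`, the representation of unary constraints as (variable, predicate) pairs and the second variable X2 are assumptions made for the example.

```python
def node_consistency(domains, unary_constraints):
    """Return a copy of the domains in which every value that violates
    a unary constraint on its variable has been removed."""
    reduced = {var: set(values) for var, values in domains.items()}
    for var, predicate in unary_constraints:
        reduced[var] = {value for value in reduced[var] if predicate(value)}
    return reduced

# X1 carries the unary constraint X1 < 3; X2 (hypothetical) is unconstrained.
domains = {"X1": set(range(1, 11)), "X2": set(range(1, 11))}
unary_constraints = [("X1", lambda value: value < 3)]

reduced = node_consistency(domains, unary_constraints)
```

Here the domain of X1 shrinks to {1, 2}, while the domain of X2 is left untouched because no unary constraint refers to it.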
X1