Developing Knowledge-Based Systems through Ontology Mapping and Ontology Guided Knowledge Acquisition
David Corsar, BSc (Hons) Computing Science (Artificial Intelligence) (University of Aberdeen)
A thesis presented for the degree of Doctor of Philosophy at the University of Aberdeen.
Department of Computing Science
2009
Declaration

No portion of the work contained in this document has been submitted in support of an application for a degree or qualification of this or any other university or other institution of learning. All verbatim extracts have been distinguished by quotation marks, and all sources of information have been specifically acknowledged.

Signed:

Date: 2009
Abstract

This thesis focuses on reusing domain ontologies and generic problem solvers (PSs) in the development of new Knowledge Based Systems (KBSs). A two-stage methodology for achieving this has been developed: in the first stage, knowledge is mapped from a domain ontology to the requirements of a generic PS (expressed in a PS ontology); in the second stage, this mapped knowledge and the domain specific reasoning requirements of the generic PS are used to “drive” the acquisition of additional (domain specific) procedural knowledge required by the PS. This acquired knowledge can then be used to generate an executable KBS.

Developing this methodology involved a detailed review of the earlier reuse literature, in order to understand the strengths and weaknesses of earlier approaches. Generic PSs for propose-and-revise design and diagnosis were also developed, based on two existing KBSs which performed these tasks in the elevator domain. To gain insights into the KBS development process, the generic PSs were used to manually build two new executable KBSs. A tool, MAKTab, was then developed to support the methodology by semi-automatically performing the actions undertaken during the manual building of the two KBSs. MAKTab has been used to successfully recreate the two elevator systems, and to fully develop diagnosis and design KBSs in the computer hardware domain.

The findings described in this thesis support the belief that a domain ontology developed for one type of PS will, in general, be unable to fully meet the procedural requirements of another type of PS; this knowledge must therefore be acquired. This work also shows that a single, general knowledge acquisition technique can be applied with different types of generic PSs to acquire the necessary procedural knowledge.
These findings are significant because they show that the shortcomings of previous approaches have been identified and addressed by the proposed methodology, which, together with MAKTab, moves the Knowledge Engineering community closer to fulfilling the goal of creating KBSs by configuring reusable components.
Acknowledgements

This work was supported under the Advanced Knowledge Technologies (AKT) Interdisciplinary Research Collaboration (IRC), which is sponsored by the UK Engineering and Physical Sciences Research Council under grant number GR/N15764/01. I would like to thank my supervisor, Professor Derek Sleeman, for giving me the chance to do a PhD and for the help, support, and guidance he has provided throughout. I would also like to acknowledge Trevor Runcie for various useful discussions on research issues, and members of the Protégé team, particularly Mark Musen, Samson Tu, and Martin O’Connor, for useful discussions regarding my work. I would further like to acknowledge Mark Musen at the Stanford Center for Biomedical Informatics Research, who made available their version of the Sisyphus-II VT solution, and Nial Chapman, who along with Derek Sleeman developed the original elevator diagnosis KBS. I am also extremely grateful to the Protégé team and to Henrik Eriksson (Linköping University), who developed JessTab: without their respective tools, my work would have been significantly more challenging. I would also like to thank those who participated in the evaluation experiments for the time and effort they gave to evaluating my tool. Finally, I would like to thank my parents, Stanley and Gwen Corsar, for their continual support during my studies at the University of Aberdeen, and Laura Moss for her support throughout my PhD.
Note

Parts of the work that appear in this thesis have been published:

• D. Corsar and D. Sleeman. Reusing JessTab Rules in Protégé. In M. Bramer, F. Coenen, and T. Allen, editors, Research and Development in Intelligent Systems XXII: Proceedings of AI-2005, the Twenty-fifth SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence (Cambridge, UK), pages 7-20. Springer, 2005. Received the “Best Technical Paper AI-2005” award.

• D. Corsar and D. Sleeman. Reusing JessTab Rules in Protégé. Knowledge-Based Systems, 19(5), pages 291-297. Elsevier B.V.

• D. Corsar and D. Sleeman. KBS Development through Ontology Mapping and Ontology Driven Acquisition. In D. Sleeman and K. Barker, editors, K-CAP ’07: Proceedings of the 4th International Conference on Knowledge Capture (Whistler, BC, Canada), pages 23-30. ACM, New York, NY, USA, 2007.

• David Corsar, Derek Sleeman, and Anne McKenzie. Extending Jess to Handle Uncertainty. In M. Bramer, F. Coenen, and M. Petridis, editors, Research and Development in Intelligent Systems XXIV: Proceedings of AI-2007, the Twenty-seventh SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence (Cambridge, UK), pages 81-93. Springer, 2007.

• David Corsar and Derek Sleeman. KBS Development on the (Semantic) Web. In Symbiotic Relationships between Semantic Web and Knowledge Engineering: Papers from the AAAI Spring Symposium, Technical Report SS-08-07, pages 35-44. AAAI Press, Menlo Park, California, 2008.

• D. Corsar and D. Sleeman. Developing Knowledge-Based Systems using the Semantic Web. In E. Gelenbe, S. Abramsky, and V. Sassone, editors, Visions of Computer Science: BCS International Academic Conference (London, UK), pages 29-40. BCS, 2008.
Acronyms, Abbreviations, and Notation

Acronyms

PS      Problem Solver (PS-RS + PS-ONT).
PS-RS   Rule Set which implements a PS.
PS-ONT  Ontology used by a PS.
PSM     Problem Solving Method.
ONT     Domain Ontology.
KB      Knowledge Base.
KBS     Knowledge-Based System (PS + ONT).
Abbreviations

pnr       Propose-and-revise.
diag      Diagnosis.
elevator  Elevator domain (lift in British English).
computer  Computer hardware domain.
Notation

PS(pnr, -)
    Domain independent pnr PS, which is composed of PS-ONT(pnr, -) and PS-RS(pnr, -).
PS(pnr, [elevator])
    pnr PS developed in the context of the elevator domain, composed of the components PS-ONT(pnr, -), PS-RS(pnr, -), and PS-RS(pnr, [elevator]).
PS-RS(pnr, -)
    Rule Set which implements the generic pnr algorithm.
PS-RS(pnr, [domain])
    Rule Set which implements the domain specific pnr rules for the domain domain, e.g. PS-RS(pnr, [elevator]) is the set of elevator specific pnr rules.
PS-ONT(pnr, -)
    PS-ONT which defines the concepts used by PS-RS(pnr, -) and PS-RS(pnr, [domain]).
PS-ONT(pnr, [elevator’])
    The PS-ONT(pnr, -) partially configured for the elevator domain, produced after the mapping stage.
PS-ONT(pnr, [elevator])
    PS-ONT which defines the concepts used by PS-RS(pnr, -) and PS-RS(pnr, [elevator]), instantiated with relevant elevator knowledge (components and/or rules).
ONT(elevator, -)
    Elevator domain ontology.
ONT(elevator, [pnr])
    Elevator domain ontology used by PS(pnr, -).
ONT(elevator’, [pnr, diag])
    Elevator domain ontology used by PS(pnr, -) and extended with knowledge for PS(diag, -).
KBS(pnr, elevator)
    A KBS using the pnr PS for the elevator domain. KBS(pnr, elevator) is composed of two linked components: ONT(elevator, [pnr]) and PS(pnr, [elevator]).
Domain instantiated PS
    A PS which has been configured/instantiated with domain specific information.
Typeface

In this thesis, standard text is written in Times, such as this. Fixed width text is used when referring to the names of concepts in an ontology; to CLIPS, Jess, or JessTab constructs and example code; and to Java class, variable, and method names. Other special typefaces, such as those in figure 2.2, are explained where they are used.
Contents

1 Introduction 23
  1.1 Background 23
  1.2 Thesis Motivation 25
  1.3 Thesis Layout 26

2 Related Work 28
  2.1 Overview 28
  2.2 KBSs from Reusable Components 28
    2.2.1 PSM Librarian 29
    2.2.2 KADS-I, KADS-II, and CommonKADS 31
    2.2.3 IBROW3 Project 32
    2.2.4 MUSKRAT 33
  2.3 Ontology Mapping 34
  2.4 Knowledge Acquisition 35
    2.4.1 Protégé for Knowledge Acquisition 35
    2.4.2 SALT 36
    2.4.3 EXPECT 36
  2.5 Elevator Design (VT) 38
  2.6 Technologies 38
    2.6.1 The OWL Web Ontology Language 38
    2.6.2 The Semantic Web Rule Language 40
    2.6.3 Overview of the Protégé Environment 41
    2.6.4 Jess 43
    2.6.5 JessTab 47
  2.7 Summary 49

3 Reusing JessTab Rules in Protégé 51
  3.1 Overview 51
  3.2 Introduction 51
    3.2.1 Ontology-Tied Rules 52
  3.3 Example Scenarios 52
    3.3.1 Simulating Stock Management and Water Treatment 53
    3.3.2 Route Planning 53
  3.4 The JessTab Rule Reuse Process 54
    3.4.1 Phase 1 - Rule Abstraction 54
    3.4.2 Phase 2 - Rule to Ontology Mapping 55
  3.5 Results 58
  3.6 Discussion 59
  3.7 Summary 60

4 Building Knowledge-Based Systems from Reusable Components 61
  4.1 Overview 61
  4.2 Introduction 62
  4.3 Reusable Knowledge Components 65
    4.3.1 Problem Solvers 65
    4.3.2 Generic Problem Solvers 65
  4.4 Proposed Methodology for Building KBSs Using Reusable Components 68
  4.5 Creating Reusable Components 69
    4.5.1 Creating a Generic Propose-and-Revise Problem Solver 69
    4.5.2 Creating a Generic Diagnostic Problem Solver 79
    4.5.3 Extracted Domain Ontologies 85
  4.6 Manual Reuse Experiments Investigating the Proposed Methodology 86
    4.6.1 Manual Experiment 1: Elevator Diagnostic KBS 87
    4.6.2 Manual Experiment 2: Elevator Configuration KBS 93
    4.6.3 Summary of Lessons Learnt from the Manual Reuse Experiments 97
  4.7 Supporting the Acquisition of Domain Specific (Reasoning) Knowledge 99
    4.7.1 … Driven Rule Acquisition 99
  4.8 Summary 101

5 Supporting the Proposed KBS Development Methodology 106
  5.1 Overview 106
  5.2 Supporting Reuse-Based KBS Development (MAKTab) 107
  5.3 MAKTab's Ontology Mapping Tool 107
    5.3.1 Mapping Power/Complexity 108
    5.3.2 Mapping Scope 112
    5.3.3 Mapping Dynamicity 112
    5.3.4 Mapping Cardinality 119
    5.3.5 Automatic Suggestion of Mappings 119
    5.3.6 Interface Design Features 121
  5.4 MAKTab's Knowledge Acquisition Tool 128
    5.4.1 Focused Rule Acquisition 128
    5.4.2 Addition of New Domain Concepts 137
    5.4.3 Interface Design Issues 138
    5.4.4 Converting SWRL Rules into Executable Form 146
  5.5 Generating the Executable KBS 150
  5.6 Summary 150

6 Acquiring Problem Solvers from the Web 152
  6.1 Overview 152
  6.2 Acquiring Problem Solvers from the Web 152
    6.2.1 Extracting an Ontology from CLIPS, Jess and JessTab Programs 154
  6.3 Design and Implementation of PS2R 156
    6.3.1 The Architecture of PS2R 156
    6.3.2 Searching the Web for PSs 156
    6.3.3 A Repository of Problem Solvers 164
  6.4 Summary 168

7 Evaluation 170
  7.1 Overview 170
  7.2 Evaluation of MAKTab 171
    7.2.1 Aim of Evaluation 171
    7.2.2 Potential Comparative Studies 171
    7.2.3 Evaluation Techniques 173
    7.2.4 Results 177
    7.2.5 Summary 196
  7.3 Evaluation of PS2R 197
    7.3.1 Aim of Evaluation 197
    7.3.2 Experimental Design 197
    7.3.3 Materials 198
    7.3.4 Results 198
    7.3.5 Conclusion 200
    7.3.6 Summary 201

8 Conclusions and Future Work 202
  8.1 Overview 202
  8.2 Conclusions 202
    8.2.1 Hypothesis 1 203
    8.2.2 Hypothesis 2 203
    8.2.3 Hypothesis 3 204
    8.2.4 Hypothesis 4 205
    8.2.5 Hypothesis 5 206
  8.3 Future Work 206
    8.3.1 Expanding the Range of Generic PSs 206
    8.3.2 Generic SWRL to JessTab Rule Generator 207
    8.3.3 Tracking New Domain Concepts 207
    8.3.4 Ensuring Rule Completeness 208
    8.3.5 Alternative Executable Formats 208
    8.3.6 Developing KBSs on the (Semantic) Web 208
  8.4 Summary 211

Bibliography 213

A Example Application of Mapping Algorithms 221
  A.1 Introduction 221
  A.2 The Mapping Components 221
    A.2.1 The Source Ontology 221
    A.2.2 The Target Ontology 221
    A.2.3 Mappings 221
  A.3 Walk-Through 222

B Example Application of the KA Algorithms 248
  B.1 Example PS Ontology 248
  B.2 Walk-Through 248

C MAKTab User Manual 266
  C.1 Introduction 266
    C.1.1 Brief Introduction to KBS 266
    C.1.2 Brief Introduction to Ontologies 267
    C.1.3 Brief Introduction to KBS in Protégé 268
    C.1.4 Brief Introduction to MAKTab KBS Development Methodology 268
  C.2 Installing MAKTab 269
    C.2.1 Prerequisites 269
    C.2.2 Installing MAKTab 270
  C.3 Using MAKTab 270
    C.3.1 Enabling MAKTab 270
  C.4 Using the Mapping Tool 271
    C.4.1 Loading the Source Ontology 272
    C.4.2 Loading the Target Ontology 272
    C.4.3 Defining A Mapping 273
    C.4.4 Executing the Mappings 277
    C.4.5 Automatic Mapping Suggestions 279
    C.4.6 Viewing/Editing Mappings 279
  C.5 Using the KA Tool 279
    C.5.1 Select Ontology 279
    C.5.2 The KA Interface 280
    C.5.3 Selecting a Concept for KA/Starting KA 280
    C.5.4 Adding New/Editing Domain Concepts 281
    C.5.5 The Rule Definition Interface 281
    C.5.6 Adding an Antecedent to the Rule 282
    C.5.7 Removing an Antecedent 283
    C.5.8 Adding a Consequent to the Rule 283
    C.5.9 Removing a Consequent 283
    C.5.10 Creating a Similar Rule 283
    C.5.11 Acquiring the Next Rule 284
    C.5.12 Viewing Existing Rules 284
    C.5.13 Editing an Existing Rule 284
    C.5.14 Creating a New Rule 286
    C.5.15 Deleting an Existing Rule 286
    C.5.16 Editing an Atom 286
    C.5.17 Creating Expressions 287
    C.5.18 Adding Arguments to Expressions and Lists in General 288
    C.5.19 Generating the Executable KBS 288
    C.5.20 Useful Functions 288

D A Short Introduction to Knowledge-Based Systems 290
  D.1 Introduction 290
    D.1.1 Brief Introduction to KBS 290
    D.1.2 Brief Introduction to Ontologies 291
    D.1.3 Brief Introduction to KBS in Protégé 292
    D.1.4 Brief Introduction to MAKTab KBS Development Methodology 292

E MAKTab KBS Development Introduction - Computer Configuration 294
  E.1 Introduction 294
  E.2 Methodology 294
  E.3 And Finally... 295

F MAKTab KBS Development Introduction - Computer Diagnosis 296
  F.1 Introduction 296
  F.2 Methodology 296
  F.3 And Finally... 297

G How to Adapt the Generic Propose-and-Revise Problem Solver for a Domain 298
  G.1 The General Algorithm 298
  G.2 Requirements 301
  G.3 How to Configure the Generic Propose and Revise PS for a Domain 301
    G.3.1 Step 1 – Mapping 302
    G.3.2 Step 2 – Rule Knowledge Acquisition 302
    G.3.3 Propose and Revise Rules 303
    G.3.4 Atoms 305

H How to Adapt the Generic Diagnostic Problem Solver for a Domain 307
  H.1 How to Adapt the Generic Diagnostic Problem Solver for a Domain 307
    H.1.1 The General Diagnostic Algorithm 307
    H.1.2 Requirements 308
    H.1.3 Configuring the Generic Diagnostic Problem Solver 309
    H.1.4 Diagnostic Rules 310
    H.1.5 Atoms 311

I MAKTab Tutorial for Building a Propose-and-Revise Based KBS 312
  I.1 Introduction 312
  I.2 The Domain: Shelving 312
  I.3 Building the Shelf Design KBS 312
    I.3.1 Loading the Propose and Revise Ontology 313
    I.3.2 Select MAKTab 313
  I.4 Mapping 313
    I.4.1 Load the Shelf Ontology as the Source Ontology 313
    I.4.2 Import the Propose and Revise Ontology as the Target Ontology 313
    I.4.3 Define a Copy Class Mapping 313
    I.4.4 Execute the Mappings 314
  I.5 Rule Knowledge Acquisition 314
    I.5.1 Select the KA Tab 314
    I.5.2 Select KA Ontology 315
    I.5.3 Create a new SystemVariable for the Shelf Supported Load 315
    I.5.4 Start KA for Shelf Supported Load 315
    I.5.5 Create a new SystemVariable for the Required Load 319
    I.5.6 Start KA for Maximum Load 319
    I.5.7 Define a Constraint Rule for Maximum Load 321
    I.5.8 Select the Initial Material to Use 324
    I.5.9 Define an Output SystemComponent Rule for Material 324
    I.5.10 Generate the Problem Solver 324
  I.6 Extensions 325

J MAKTab Tutorial for Building a Diagnosis Based KBS 326
  J.1 Introduction 326
  J.2 Introduction to the Domain: Car Diagnosis 326
  J.3 Building the Car Diagnosis KBS 326
    J.3.1 Loading the Generic Diagnosis Problem Solver 326
    J.3.2 Select MAKTab 326
  J.4 Mapping 327
    J.4.1 Load the Engine Parts Ontology as the Source Ontology 327
    J.4.2 Import the Diagnosis Ontology as the Target Ontology 327
    J.4.3 Define Mappings 327
  J.5 Defining Engine Part Diagnostic Rules 328
    J.5.1 Select the KA Tab 328
    J.5.2 Select KA Ontology 328
    J.5.3 Start KA for Engine 328
    J.5.4 Define Engine Will Not Start Problem 328
    J.5.5 Define a Fix Rule 329
    J.5.6 Generate the Problem Solver 330

K An Introduction to Building a Desktop Computer 331
  K.1 Introduction 331
  K.2 The Basics 331
  K.3 The Task 331
  K.4 Selecting Components 331
    K.4.1 The Motherboard 332
    K.4.2 The Case 332
    K.4.3 The Processor (a.k.a. CPU) 333
    K.4.4 Computer Memory (a.k.a. RAM) 333
    K.4.5 Power Supply Unit (PSU) 333
    K.4.6 Hard Disk Drive 334
    K.4.7 Optical Drives 334
  K.5 System Variables 334
    K.5.1 Desired minimum memory 334
    K.5.2 Number of memory modules 334
    K.5.3 The total memory 334
    K.5.4 Desired hard drive capacity 334
    K.5.5 Total required power 335
  K.6 Constraints and Fixes 335
    K.6.1 Motherboard 335
    K.6.2 Processor 336
    K.6.3 Memory 336
    K.6.4 Case 336
    K.6.5 Hard Drive 336
    K.6.6 Optical Drive 337
    K.6.7 Power Supply Unit (PSU) 337
  K.7 Initial Selections and Values 337
  K.8 Output Selections 339

L An Introduction to Computer Hardware Fault Diagnosis 340
  L.1 Introduction 340
  L.2 Power Problems 340
    L.2.1 Problem PSU-1 340
    L.2.2 Problem PSU-2 340
  L.3 Video Problems 340
    L.3.1 Problem VP-1 341
    L.3.2 Problem VP-2 341
    L.3.3 Problem VP-3 341
  L.4 Motherboard, CPU, and Memory Problems 341
    L.4.1 Problem MCM-1 341
    L.4.2 Problem MCM-2 341
    L.4.3 Problem MCM-3 341
  L.5 Hard Drive Problems 341
    L.5.1 Problem HD-1 342
    L.5.2 Problem HD-2 342
  L.6 Optical Drive Problems 342
    L.6.1 Problem OD-1 342
    L.6.2 Problem OD-2 342
  L.7 Modem Problems 342
    L.7.1 Problem M-1 342
  L.8 Sound Problems 343
    L.8.1 Problem S-1 343
    L.8.2 Problem S-2 343
    L.8.3 Problem S-3 343

M Questionnaires 344
  M.1 Pre-Experiment Questions 344
  M.2 Questions Regarding the Experiment you have just performed 346
    M.2.1 General Questions 348
List of Tables

2.1 Description of the main concepts in the SWRL ontology. . . . . . . . . . . 42
3.1 Results for mapping the classes of the Document ontology. . . . . . . . . . . 58
3.2 Results for mapping the slots of the Document ontology. . . . . . . . . . . 59
5.1 Descriptions of the four direct creation mappings currently provided by MAKTab. . . . . . . . . . . 109
5.2 Descriptions of the transformation mappings currently provided by MAKTab. . . . . . . . . . . 110
5.3 An example of the table T1 used by the mapping algorithm. . . . . . . . . . . 114
5.4 An example of the table T2 used by the mapping algorithm. . . . . . . . . . . 114
6.1 Summary of the mappings from a CLIPS, Jess, or JessTab slot definition to an OWL ontology property. . . . . . . . . . . 160
7.1 Summary of the types of rules defined in KBS(pnr, elevator). . . . . . . . . . . 179
7.2 Summary of the types of rules defined in KBS(diag, elevator). . . . . . . . . . . 186
7.3 Summary of each subject’s familiarity with the relevant concepts for the KBS(pnr, computer) experiment. . . . . . . . . . . 188
7.4 Summary of the KBSs developed by the subjects in the computer hardware configuration experiment. . . . . . . . . . . 189
7.5 Summary of the responses to the questionnaire given by the subjects who took part in the elevator configuration experiment. . . . . . . . . . . 191
7.6 Summary of each subject’s familiarity with the relevant concepts for the KBS(diag, computer) experiment. . . . . . . . . . . 192
7.7 Summary of the KBSs built by the subjects in the computer hardware diagnosis experiment. . . . . . . . . . . 192
7.8 Summary of the responses to the questionnaire given by the subjects who took part in the elevator diagnosis experiment. . . . . . . . . . . 194
7.9 Summary of the interface evaluation checklist results. . . . . . . . . . . 195
7.10 Continued summary of the interface evaluation checklist results. . . . . . . . . . . 196
7.11 Summary of the retrieval rates of PSs from the PS2R evaluation. . . . . . . . . . . 199
A.1 Walk-through of the mapping algorithm for applying the translation mappings (1). . . . . . . . . . . 226
A.2 Walk-through of the mapping algorithm for applying the translation mappings (2). . . . . . . . . . . 227
A.3 Walk-through of the mapping algorithm for applying the translation mappings (3). . . . . . . . . . . 228
A.4 Walk-through of the mapping algorithm for applying the translation mappings (4). . . . . . . . . . . 229
A.5 Walk-through of the mapping algorithm for applying the translation mappings (5). . . . . . . . . . . 230
A.6 Walk-through of the mapping algorithm for applying the translation mappings (6). . . . . . . . . . . 231
A.7 Walk-through of the mapping algorithm for applying the translation mappings (7). . . . . . . . . . . 232
A.8 Walk-through of the mapping algorithm for applying the translation mappings (8). . . . . . . . . . . 233
A.9 Walk-through of the mapping algorithm for applying the translation mappings (9). . . . . . . . . . . 234
A.10 Walk-through of the mapping algorithm for applying the translation mappings (10). . . . . . . . . . . 235
A.11 Walk-through of the mapping algorithm for applying the translation mappings (11). . . . . . . . . . . 236
A.12 Walk-through of the mapping algorithm for applying the translation mappings (12). . . . . . . . . . . 237
A.13 Applying mapping TM1 to SI1-1. . . . . . . . . . . 238
A.14 Applying mapping TM2 to SI1-1. . . . . . . . . . . 239
A.15 Applying mapping TM3 to SI1-1. . . . . . . . . . . 240
A.16 Applying mapping TM1 to SI1-2. . . . . . . . . . . 241
A.17 Applying mapping TM2 to SI1-2. . . . . . . . . . . 241
A.18 Applying mapping TM3 to SI1-2. . . . . . . . . . . 242
A.19 Applying mapping TM1 to SI1-3. . . . . . . . . . . 243
A.20 Applying mapping TM2 to SI1-3. . . . . . . . . . . 243
A.21 Applying mapping TM3 to SI1-3. . . . . . . . . . . 244
A.22 Applying mapping TM4 to SC2-1. . . . . . . . . . . 245
A.23 Applying mapping TM4 to SC2-2. . . . . . . . . . . 245
A.24 Applying mapping TM6 to SI3-1. . . . . . . . . . . 246
A.25 Applying mapping TM6 to SI3-2. . . . . . . . . . . 247
G.1 Sample materials for constructing a shelf. . . . . . . . . . . 299
H.1 Example cause rules in the car domain. . . . . . . . . . . 308
H.2 Example repair rules in the car domain. . . . . . . . . . . 309
List of Figures

2.1 VT System Components [123, figure 1]. . . . . . . . . . . 39
2.2 Excerpts from the abstract syntax for OWL. . . . . . . . . . . 40
2.3 Example class, property, and individual definitions using the OWL abstract syntax. . . . . . . . . . . 41
2.4 A visualisation of the SWRL ontology. . . . . . . . . . . 41
2.5 Asserting facts in Jess. . . . . . . . . . . 45
2.6 Example deffacts for defining a series of book facts and author facts. . . . . . . . . . . 45
2.7 An example deftemplate for a book and author. . . . . . . . . . . 46
2.8 An example Jess rule. . . . . . . . . . . 46
2.9 Example Jess output from running the previous rule. . . . . . . . . . . 47
2.10 Example mapped deftemplate and facts from an ontology. . . . . . . . . . . 49
2.11 An example JessTab rule. . . . . . . . . . . 50
3.1 Illustration of the JessTab rule reuse process. . . . . . . . . . . 55
3.2 Sample mappings illustrating the need for the enhanced mapping algorithm. . . . . . . . . . . 57
4.1 Using multiple tools to support the process of building a new KBS. . . . . . . . . . . 63
4.2 The generic PS ontology. . . . . . . . . . . 66
4.3 Outline methodology and components for reusing the ONT(elevator, [diag]) from KBS(diag, elevator) with PS(pnr, -) to produce KBS(pnr, elevator). . . . . . . . . . . 69
4.4 Visualisation of the PS-ONT(pnr, -)’s initial component selection rule. . . . . . . . . . . 74
4.5 Visualisation of the PS-ONT(pnr, -)’s initial value rule. . . . . . . . . . . 75
4.6 Visualisation of the PS-ONT(pnr, -)’s output component selection rules. . . . . . . . . . . 76
4.7 Visualisation of the PS-ONT(pnr, -)’s output value selection rules. . . . . . . . . . . 76
4.8 Visualisation of the PS-ONT(pnr, -)’s assignment rule. . . . . . . . . . . 77
4.9 Visualisation of the PS-ONT(pnr, -)’s constraint rule. . . . . . . . . . . 78
4.10 Visualisation of the PS-ONT(pnr, -)’s fix rule. . . . . . . . . . . 78
4.11 Propose-and-Revise rule relationships. . . . . . . . . . . 79
4.12 Illustration of two possible KBS(diag, elevator)s. . . . . . . . . . . 80
4.13 Visualisation of the PS-ONT(diag, -)’s cause rule. . . . . . . . . . . 82
4.14 Visualisation of the PS-ONT(diag, -)’s repair rule. . . . . . . . . . . 82
4.15 An example graph of malfunctioning elevator behaviours, their causes and repairs. . . . . . . . . . . 83
4.16 Diagnostic rule relationships. . . . . . . . . . . 83
4.17 Example elevator propose-and-revise rules. . . . . . . . . . . 102
4.18 Rule KA graphs for PS(pnr, -). . . . . . . . . . . 102
4.19 Example protocol for acquiring propose-and-revise rules. . . . . . . . . . . 103
4.20 Example protocol for acquiring diagnostic rules. . . . . . . . . . . 104
5.1 Mapping types currently supported by MAKTab. . . . . . . . . . . 108
5.2 Class diagram of MappingFactory, including implementing factories for OWL and Frames mappings. . . . . . . . . . . 110
5.3 Sequence diagram showing how mapping types are loaded. . . . . . . . . . . 111
5.4 Example mapping configuration file for mapping types. . . . . . . . . . . 111
5.5 Applying mappings. . . . . . . . . . . 113
5.6 Class diagram showing classes relevant to the automatic mapping suggestion feature. . . . . . . . . . . 120
5.7 Example mapping configuration file for MappingSuggesters. . . . . . . . . . . 121
5.8 Sequence diagram for the automatic suggestion feature. . . . . . . . . . . 121
5.9 The mapping tab interface. . . . . . . . . . . 122
5.10 Classes involved in loading an OWL ontology. . . . . . . . . . . 123
5.11 Sequence diagram for loading an OWL source ontology. . . . . . . . . . . 123
5.12 Sequence diagram showing how the system responds to the user selecting a class in the source ontology display. . . . . . . . . . . 124
5.13 Screenshots of the mapping display area when defining a class level mapping. . . . . . . . . . . 125
5.14 Class diagram of the mapping interfaces. . . . . . . . . . . 126
5.15 Sequence diagram for building the PropertyRenamingMappingGui GUI. . . . . . . . . . . 126
5.16 Sequence diagram showing how the defined mappings are applied. . . . . . . . . . . 127
5.17 Example rule graph for the assignment, constraint, and fix rules defined in figure 4.17. . . . . . . . . . . 135
5.18 Class diagram of the classes involved when rule KA is started for a particular concept. . . . . . . . . . . 136
5.19 Sequence diagram showing the interactions among components for starting the KA. . . . . . . . . . . 137
5.20 A screenshot of the KA tool’s interface. . . . . . . . . . . 139
5.21 Screenshots of the KA tool’s domain concept display. . . . . . . . . . . 141
5.22 Screenshots of the KA tool’s display of existing rules. . . . . . . . . . . 142
5.23 Layout of the rule definition GUI component. . . . . . . . . . . 143
5.24 Example screenshots of the rule definition GUI component. . . . . . . . . . . 144
5.25 Screenshots showing how property values of an individual are highlighted in the rule editing display. . . . . . . . . . . 145
5.26 Sample screenshot of the rule type selection dialog. . . . . . . . . . . 146
5.27 Sample screenshot of the atom type selection dialog. . . . . . . . . . . 147
5.28 Sequence diagram for converting the rules defined by the user with MAKTab into an executable format. . . . . . . . . . . 147
6.1 Searching the Web for problem solvers. . . . . . . . . . . 154
6.2 Illustration of PS2R’s three-tier architecture. . . . . . . . . . . 157
6.3 UML class diagram of classes relevant to PS Web search. . . . . . . . . . . 158
6.4 UML sequence diagram for PS2R’s Web PS search using Google. . . . . . . . . . . 158
6.5 Example deftemplates for elevator motor and components. . . . . . . . . . . 160
6.6 Abstract OWL syntax representation of extracted classes. . . . . . . . . . . 161
6.7 Inferred relationships. . . . . . . . . . . 164
6.8 An example input and output for the preprocessing algorithm, algorithm 8. . . . . . . . . . . 165
6.9 An example execution of the taxonomy inference algorithm, algorithm 9. . . . . . . . . . . 166
6.10 EER diagram showing the final, normalised design of the PS2R repository’s database. . . . . . . . . . . 167
6.11 Sequence diagram of searching the repository. . . . . . . . . . . 168
6.12 Generating a tag cloud of PS tags. . . . . . . . . . . 168
6.13 A sample tag cloud from the PS2R. . . . . . . . . . . 168
7.1 Time taken by subjects to define the constraint and fix rules for KBS(pnr, computer). . . . . . . . . . . 190
7.2 Search terms used during the PS2R evaluation. . . . . . . . . . . 198
8.1 Building KBSs on the Semantic Web. . . . . . . . . . . 209
A.1 Relevant fragments of the source ontology. . . . . . . . . . . 222
A.2 Relevant fragments of the target ontology before the mapping process. . . . . . . . . . . 223
A.3 The mappings that will be applied between the source and target ontologies. . . . . . . . . . . 223
A.4 Additions to the target ontology after the direct creation mappings have been applied. . . . . . . . . . . 224
A.5 The target ontology after all the mappings have been applied. . . . . . . . . . . 238
B.1 Fragments of a PS ontology relevant to the KA process. . . . . . . . . . . 249
B.2 Example RuleKaNodes. . . . . . . . . . . 255
B.3 The RuleGraphNodes created during the example KA session. . . . . . . . . . . 265
C.1 High level view of a KBS for elevator design. . . . . . . . . . . 267
C.2 Illustration of how JessTab links the Protégé ontology and its individuals with Jess. . . . . . . . . . . 268
C.3 High level overview of the MAKTab KBS development process. . . . . . . . . . . 271
C.4 A screenshot of the mapping tool in MAKTab. . . . . . . . . . . 272
C.5 A screenshot of a defined Copy a Class Mapping. . . . . . . . . . . 274
C.6 A screenshot of a defined Class to Individual Mapping. . . . . . . . . . . 275
C.7 A screenshot of a defined Property Renaming Mapping. . . . . . . . . . . 276
C.8 A screenshot of a defined Property Concatenation Mapping. . . . . . . . . . . 277
C.9 A screenshot of a defined Property to Individual Mapping. . . . . . . . . . . 278
C.10 A screenshot of a defined Copy a Property Mapping. . . . . . . . . . . 278
C.11 Selecting the target ontology as the ontology to use during KA. . . . . . . . . . . 279
C.12 The interface of the KA tool in MAKTab. . . . . . . . . . . 280
C.13 The rule definition area of the KA tool. . . . . . . . . . . 282
C.14 Dialog for selecting the type of atom to add as an antecedent or consequent. . . . . . . . . . . 283
C.15 Dialog for selecting the type of the next rule to be acquired. . . . . . . . . . . 285
C.16 The atom display interface. . . . . . . . . . . 286
C.17 The menu of available options for editing the value of a property of an atom. . . . . . . . . . . 287
C.18 Adding an argument to a list. . . . . . . . . . . 288
D.1 High level view of a KBS for elevator design. . . . . . . . . . . 291
D.2 Illustration of how JessTab links the Protégé ontology and its individuals with Jess. . . . . . . . . . . 292
G.1 Flow diagram illustration of the propose-and-revise algorithm. . . . . . . . . . . 299
G.2 Illustration of how the propose-and-revise algorithm solves the shelf design problem. . . . . . . . . . . 300
G.3 Illustration of the path the KA tool follows when acquiring rules for propose-and-revise. . . . . . . . . . . 302
H.1 Example diagnosis graph for the car domain. . . . . . . . . . . 308
H.2 The diagnostic rule KA flow. . . . . . . . . . . 310
K.1 Illustration of the restrictions involved in building a desktop computer. . . . . . . . . . . 338
List of Algorithms

1 Basic KA control algorithm. . . . . . . . . . . 100
2 Mapping application. . . . . . . . . . . 116
3 Applying a transformation mapping. . . . . . . . . . . 118
4 Building the Rule KA Graphs. . . . . . . . . . . 130
5 The KA Control Algorithm. . . . . . . . . . . 132
6 The determineNextOptions() method used by the KA control algorithm. . . . . . . . . . . 133
7 Determining a taxonomical relation between two deftemplates. . . . . . . . . . . 155
8 Preprocessing steps for the taxonomy inference algorithm. . . . . . . . . . . 162
9 Taxonomy inference algorithm. . . . . . . . . . . 162
Chapter 1
Introduction

This thesis explores the development of Knowledge Based Systems (KBSs) through the configuration of reusable components, namely generic Problem Solvers (PSs) and domain ontologies. A KBS is an artificial intelligence (AI) system which imitates human problem solving by applying some form of reasoning to domain knowledge stored in a knowledge base. Typically, KBSs are developed to “solve problems that are difficult enough to require significant human expertise for their solutions” [41] and are usually applied to nondeterministic tasks. There are numerous examples of successful KBSs in business [63; 70], with KBSs appearing frequently in marketing, banking, finance, and forecasting [70].

Knowledge engineering is, however, often a time-consuming and expensive process, particularly when it involves acquiring new knowledge and constructing new systems from scratch [14; 21; 22]. It has been a long-standing goal of the knowledge engineering community to rapidly build new systems from existing components: the ability to select suitable domain knowledge and domain independent inference components from repositories, and to rapidly configure them to work together, should make the KBS development process more efficient. However, despite considerable research aimed at achieving this goal, it is, for many reasons, still to be fully realised. While various projects have strived to meet it, each produced its own approach to the problem and developed different, often incompatible, technologies to support that approach; moreover, none of them produced fully executable implementations. This thesis presents a two-stage methodology for achieving KBS development through reuse, supported by MAKTab, a tool which is capable of producing executable KBSs.
1.1 Background
Modern-day KBSs are typically developed using an expert system shell (the terms “expert system” and KBS are often used synonymously). The earliest such shell was EMYCIN [115], which was based on the classic MYCIN expert system [11]. At an abstract level, a KBS developed in such a shell is split into at least three main components: a knowledge base (KB), an inference engine, and a suitable user interface (where the user can be a human, an agent, another system, and so on) [41]. By removing the need to invest time developing a reasoning component for their systems, shells freed knowledge engineers to focus on acquiring and encoding the knowledge their systems would use to solve their tasks. A KBS uses two types of knowledge: relevant domain knowledge, and procedural (reasoning) knowledge, which effectively tells the system how to use the domain knowledge for some specific purpose. For example, one would expect an elevator configuration
KBS to contain (domain) knowledge of elevator components and (inference) knowledge of how they can be combined into a working system.

Different shells allow the knowledge engineer to use different methods for implementing the KB. Production rules were one of the most popular of the supported approaches. Briefly, this approach involves applying a set of IF-THEN rules to a series of facts in order to perform some type of reasoning task.

At an early stage, researchers in knowledge engineering identified a range of task types that could be used to classify the majority of expert systems. A range of problem solving methods (PSMs) were subsequently identified, which were believed to be capable of meeting the reasoning requirements of the majority of KBSs [10]. These ranged from configuration and diagnosis through to planning. This perspective initiated a new, reuse-based approach to both KBS implementation and development. In theory, it was possible to encode the domain and reasoning knowledge separately within the system, enabling reuse of the various components. Further, the KBS development process was revised to one in which KBSs were built by selecting and configuring various components from repositories: typically, libraries of domain knowledge, libraries of problem solvers (PSs), and so on. When a new KBS was required, the developer would select the appropriate components from these libraries and, preferably with very little effort, configure them to work together to solve the problem. Ideally, an automated agent or broker would perform much of the process for the developer, reducing development times and costs.

Since the 1990s, ontologies have often played an important role in reuse-based KBS development approaches. An ontology provides an “explicit specification of a conceptualisation” [47] and is well suited to modelling reusable KBS components, such as domain knowledge and PS requirements.
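The production-rule approach described above can be sketched as a minimal forward-chaining loop. This is an illustrative sketch in Python, not the Jess implementation used in this thesis; all fact and rule names are invented for the example:

```python
# Minimal forward-chaining production system: rules are (condition, action)
# pairs applied repeatedly to a set of facts until no rule adds a new fact.

def run(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for condition, action in rules:
            for fact in list(facts):
                if condition(fact):
                    new_fact = action(fact)
                    if new_fact not in facts:
                        facts.add(new_fact)
                        changed = True
    return facts

# One hypothetical rule: IF a component has been selected
# THEN assign it a default cost category.
rules = [
    (lambda f: f[0] == "selected",
     lambda f: ("has-cost-category", f[1], "standard")),
]

result = run({("selected", "motor")}, rules)
# result now also contains ("has-cost-category", "motor", "standard")
```

Real shells such as Jess add pattern matching over structured facts (via the Rete algorithm) and conflict resolution, but the fixed-point iteration over a working memory of facts is the same basic idea.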
There have been various proposed formalisms for ontologies: the current W3C recommendation is the Web Ontology Language (OWL) [116], which is based on description logics and is discussed in section 2.6.1. With ontologies providing both the domain knowledge and the specification of a PS’s requirements (in terms of its inputs and outputs), the KBS development process can be viewed as essentially a mapping task: if mappings can be defined such that the domain ontology meets the PS’s requirements, then the PS should be able to access and use the domain knowledge it contains.

As noted above, however, the goal of (automatic) KBS creation from reusable components is, for many reasons, still to be fully realised. The primary reasons why previous approaches were not completely successful were: (1) the lack of a standard formalism for domain knowledge; (2) the lack of a standard for representing rules; and (3) the lack of tools which allow standardised rule sets to work with data expressed against standardised representational schemas. Technologies and standards have progressed, however: OWL now provides a standard ontology language; SWRL (section 2.6.2) and RIF [1] propose formalisms which can be used for defining rule-based PSs expressed against OWL ontologies; and tools such as Protégé (section 2.6.3) provide a mature, extendable framework for both creating and using ontologies.
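To give a flavour of these formalisms, a domain class, property, and individual might be written in the OWL abstract syntax, with a simple rule over them in the human-readable SWRL notation. This fragment is illustrative only (the names are invented, not taken from the thesis’s own ontologies):

```
Class(Elevator partial Component)
ObjectProperty(hasPart domain(Elevator) range(Component))
Individual(elevator1 type(Elevator) value(hasPart motor1))

Elevator(?e) ∧ hasPart(?e, ?p) → partOf(?p, ?e)
```

The ontology supplies the vocabulary (classes, properties, individuals); the rule supplies a piece of reasoning knowledge expressed against that vocabulary.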
In fact, Protégé provides a good environment for building a KBS by combining reusable components, as it includes extensive import facilities for different ontology languages along with several reasoning plug-ins (called tabs) which allow various types of reasoning to be performed against an (instantiated) ontology. One of the most mature reasoning tabs is JessTab (section 2.6.5), which allows Jess production rules (section 2.6.4) to be executed against a knowledge base (an instantiated ontology). By using these standards and technologies, it has been possible to focus on developing tools for rapidly re-configuring a domain ontology, typically developed for one type of PS, for use with another type of PS; additionally, it has not been necessary to implement the corresponding reasoning engines.

The methodology for developing KBSs through reuse consists of two phases. After selecting a generic PS and a domain ontology (possibly initially part of an existing KBS from which the ontology has been decoupled), relevant domain knowledge is mapped from the domain ontology into the form required by the generic PS. A focused knowledge acquisition (KA) process then uses the requirements of the generic PS and the mapped domain knowledge to guide a domain expert through the acquisition of the domain specific problem solving knowledge that will enable the PS to work on a subset of tasks in that domain. After completing the KA process, if the new rules contain new domain concepts then these can be used to enhance the corresponding domain ontology; a KBS can then be formed from the selected PS, the acquired domain specific rules, and the appropriate domain ontology. The mapping stage is a common technique for enabling reuse of existing domain knowledge from a domain ontology.
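The two phases above can be caricatured in a few lines of Python. This is a deliberately simplified sketch, not MAKTab’s actual implementation: stage one applies (here, purely renaming) mappings from domain concepts to the PS ontology’s terms, and stage two computes which of the PS’s rule types are still missing, which is what drives the focused KA dialogue. All concept and rule-type names are hypothetical:

```python
def map_domain(domain_concepts, mappings):
    """Stage 1: translate domain concepts into the PS ontology's terms.

    Real mappings also transform structure (e.g. class-to-individual);
    here we model only simple renamings."""
    return {mappings[c]: v for c, v in domain_concepts.items() if c in mappings}

def ka_agenda(ps_rule_types, acquired_rules):
    """Stage 2: the PS rule types for which no rules yet exist.

    These gaps are presented to the domain expert during focused KA."""
    return [t for t in ps_rule_types if t not in acquired_rules]

# Hypothetical domain ontology fragment and mappings onto a generic PS ontology.
domain = {"Lift": ["motor", "cable"], "LiftPart": ["weight"]}
mappings = {"Lift": "Component", "LiftPart": "Parameter"}

ps_knowledge = map_domain(domain, mappings)
agenda = ka_agenda(["constraint", "fix", "assignment"], {"assignment"})
# ps_knowledge == {"Component": ["motor", "cable"], "Parameter": ["weight"]}
# agenda == ["constraint", "fix"]  -- these gaps drive the KA stage
```

The point of the sketch is the division of labour: mapping satisfies the PS’s declarative input requirements, while the remaining agenda identifies the procedural knowledge that must be acquired rather than mapped.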
However, the focused KA is unique to this approach and is, as shown in this work, critical to successful reuse, as a domain ontology developed for one type of PS will not normally contain all the information required by another PS. For example, one would not expect a rule set or domain ontology designed for a configuration KBS to contain all the information required by a diagnostic KBS; this knowledge must therefore be acquired when building the new (diagnostic) KBS. In particular, it is likely that the procedural knowledge describing how to perform the diagnosis will have to be acquired, as it is extremely unlikely to be present in the original configuration system.
1.2 Thesis Motivation
As stated above, previous approaches suffered from a lack of standards and based their designs on assumptions that proved incorrect. These observations formed the motivation for this thesis: to revisit the problem of reuse-based KBS development and determine whether a new approach could bring its goal closer. This research contributes to the reuse of domain ontologies and generic PSs in the development of KBSs. Given a Knowledge Based System (KBS) capable of solving propose-and-revise (pnr) tasks in the elevator domain, if one wishes to develop a diagnostic KBS in the same domain, I hypothesise that:

• The necessary diagnostic rule set will not be contained in the existing pnr KB, and hence it will be necessary to acquire this additional rule set (hypothesis 1).
• A generic diagnostic problem solver (PS), together with an appropriate domain ontology, can be used to drive the Knowledge Acquisition (KA) process (hypothesis 2).

• An ontology developed for use with a pnr PS for a particular domain will need to be extended when it is used as the basis for acquiring a diagnostic rule set for the same domain (and vice versa) (hypothesis 3).

These can, of course, be generalised to apply to further pairs of Problem Solvers. Further, I hypothesise that it should be possible to find rule sets which provide implementations of well known Problem Solvers online, and that it will be possible to semi-automatically reconfigure these rule sets for use in other domains, using a technique such as that described in the second hypothesis above (hypothesis 4). Additionally, I hypothesise that it is possible to provide semi-automated support to help knowledge engineers reuse knowledge sources (hypothesis 5).
1.3 Thesis Layout
Chapter 2 - Literature Review presents related work on developing KBSs through reusable components: in particular, the PSM Librarian, CommonKADS, IBROW3, and MUSKRAT approaches to reuse-based KBS development are discussed. My KBS development methodology uses ontology mapping and knowledge acquisition, and work in both of these fields is also discussed. Finally, short introductions are provided to the various technologies discussed in this thesis: OWL, SWRL, Protégé, Jess, and JessTab.

Chapter 3 - Reusing JessTab Rules in Protégé presents a technique and tool developed to support the reuse, with alternative ontologies, of a set of JessTab rules developed for one ontology.

Chapter 4 - Building Knowledge-Based Systems from Reusable Components discusses how my KBS development methodology could be used to support the complete KBS development process for a user who does not initially have any reusable components. The notion of generic PSs is also introduced, and examples for propose-and-revise based configuration design and diagnosis are discussed in detail, along with how they can be used by my KBS development methodology to build new KBSs.

Chapter 5 - Supporting the Proposed KBS Development Methodology discusses MAKTab, the tool developed to support the reuse-based KBS development methodology, providing details of the tool’s mapping and knowledge acquisition sub-tools, along with the generation of executable KBSs.

Chapter 6 - Acquiring Problem Solvers from the Web introduces PS2R, a Web based PS search engine and repository, providing details of the tool’s Web search technique, its repository, and the extraction of ontologies from the discovered PSs.

Chapter 7 - Evaluation discusses experiments performed to evaluate MAKTab and the PS search engine.

Chapter 8 - Conclusions and Future Work summarises the conclusions of this thesis and suggests various possible directions for future work.
Appendix A - Example Application of Mapping Algorithms provides an example application of the mapping algorithms described in section 5.3.
Appendix B - Example Application of the KA Algorithms provides an example walkthrough of the KA algorithms described in section 5.4.

Appendix C - MAKTab User Manual is the user manual for MAKTab, which was given to participants of the MAKTab evaluation experiments discussed in section 7.2.

Appendix D - A Short Introduction To Knowledge-Based Systems provides an introduction to KBSs, which was given to participants of the MAKTab evaluation experiments discussed in section 7.2.

Appendix E - MAKTab KBS Development Introduction - Computer Configuration provides an introduction to the MAKTab computer configuration experiment (section 7.2).

Appendix F - MAKTab KBS Development Introduction - Computer Diagnosis provides an introduction to the MAKTab computer diagnosis experiment (section 7.2).

Appendix G - How to Adapt the Generic Propose-and-Revise Problem Solver for a Domain describes how to use MAKTab to configure the generic propose-and-revise problem solver to work in a domain.

Appendix H - How to Adapt the Generic Diagnostic Problem Solver for a Domain describes how to use MAKTab to configure the generic diagnostic problem solver to work in a domain.

Appendix I - MAKTab Tutorial for Building a Propose-and-Revise Based KBS provides a tutorial for building a propose-and-revise based KBS with MAKTab.

Appendix J - MAKTab Tutorial for Building a Diagnosis Based KBS provides a tutorial for building a diagnosis based KBS with MAKTab.

Appendix K - An Introduction to Building a Desktop Computer describes how to select appropriate computer hardware components for building a new system.

Appendix L - An Introduction to Computer Hardware Fault Diagnosis describes how to diagnose a selection of common computer hardware problems.

Appendix M - Questionnaires provides sample questionnaires that the participants of the MAKTab evaluation experiments (section 7.2) were asked to complete.
Chapter 2
Related Work
2.1 Overview
This chapter summarises research related to this work, along with brief presentations of the technologies used to develop several tools in my research work. Four projects which focused on building KBSs from reusable components, namely PSM Librarian (section 2.2.1), CommonKADS (section 2.2.2), IBROW3 (section 2.2.3), and MUSKRAT (section 2.2.4), are described. Each project is described in terms of its approach and its strengths and weaknesses. The approach to reuse-based KBS development described in this thesis makes use of ontologies for describing the different components, and so it has been necessary to define mappings between different ontologies. Ontology mapping is an active research area; section 2.3 describes various approaches and techniques for automatically determining mappings between two ontologies. As ontology mapping is unlikely to meet all the (domain) knowledge requirements of a KBS, a knowledge acquisition technique will be used to acquire the required additional knowledge. Three approaches to knowledge acquisition are discussed in section 2.4: Protégé (section 2.4.1), which provides a generic approach to the creation of ontology individuals; SALT (section 2.4.2), which provides KA for propose-and-revise; and EXPECT (section 2.4.3), which attempts to combine the flexibility of general approaches to KA with the focus of method-specific approaches, such as that of SALT. As the task of elevator ("lift" in British English) configuration is used in several examples throughout this thesis, section 2.5 provides a brief introduction to the task to familiarise the reader with many of the elevator concepts that are discussed.
Finally, section 2.6 provides a brief introduction to the technologies which have been used to implement the approach developed in this thesis, namely: the OWL Web Ontology Language (section 2.6.1), used for defining ontologies; the Semantic Web Rule Language (section 2.6.2), used for defining rule-based problem solvers; the Protégé environment (section 2.6.3), for working with ontologies; the Jess rule engine (section 2.6.4); and JessTab (section 2.6.5), which allows Jess rules to work with instantiated ontologies in Protégé.
2.2 KBSs from Reusable Components
Various projects have focused on the creation of KBSs from libraries of reusable components, such as libraries of domain ontologies and problem solving methods. This section discusses four
of the major projects that have focused on this, namely PSM Librarian, CommonKADS, IBROW3, and MUSKRAT.
2.2.1 PSM Librarian
PSM Librarian (Problem Solving Method Librarian) has been developed by various researchers at the Stanford Center for Biomedical Informatics Research since 1995 [39]. The latest revision of the methodology, described in [19], is highly ontology-dependent, using ontologies to describe domains, methods, PSM repositories, and mappings [19]; this approach involves selecting domain and method ontologies from the library and then creating mappings between the two. The domain ontology is a PSM-independent description of a particular domain, either created by a person or retrieved from a relevant library. The method ontology provides a signature for the PSM, describing the roles and requirements the domain knowledge must fulfil. This description must consist of, at minimum, a specification of the PSM's inputs that the domain knowledge must provide. Again, ideally the method ontology is taken from a PSM library, which is described, accessed, and queried through the PSM ontology. The Unified Problem-Solving Method Development Language (UPML) (meta-)ontology is used to describe the PSMs (see [34] and section 2.2.3); UPML was developed by the IBROW3 project for describing PSMs, and has only been added to this project in its latest revision. The mapping ontology [87; 51; 39] provides a mediating layer in the architecture, providing a bridge between the domain and method ontologies. The authors have defined what they claim to be a "formal, abstract definition for all mapping relations" [87]. The user is required to instantiate this ontology, defining mappings between the concepts in the selected domain ontology and the inputs/outputs of the method ontology. This is one of the major problems with the PSM Librarian: defining mappings is typically a huge task, and very little support is currently provided.
The user is expected to determine the corresponding concepts and how concepts in the domain ontology need to be transformed using the various types of mappings the mapping ontology provides. While defining mappings is alone a time-consuming task (the elevator to propose-and-revise example contains over 120 mappings [18]), no support is provided for dealing with any mismatch between the domain and method ontologies. The authors' suggestions are that, if the domain knowledge is lacking, "mapping relations can specify constant assignments to method concepts", or that the domain ontology should be extended [19]. The first suggestion is very unhelpful, providing no guidance as to where these constants should come from or what they are. The second is obvious. However, this is all the support that is provided. Further, extending the domain knowledge extends the entire task: if the extension is made in keeping with the PSM-independent philosophy of the PSM Librarian, the developer will then have to map the new knowledge to the problem solver, adding extra work. On the other hand, with further guidance from the PSM Librarian, the additional knowledge could have been acquired in the correct form for the problem solver. The PSM Librarian approach has been implemented as the Protégé plug-in PSMTab, and some examples are provided relating to the propose-and-revise experiments described in [39]. Examining these examples reveals another problem: the mappings make continual reference to (global) variables, functions, and language constructs from the actual PSM implementation. This weakens the claim that a developer can reuse a PSM from some library
with knowledge of only the details of the PSM provided in the method ontology (typically inputs and outputs); these additional requirements demand that the developer be very familiar with the PSM implementation. Mappings require knowledge ranging from the language the PSM was implemented in (such as the language's syntax and any methods/functions it provides) down to fine details such as global variables and methods in the code. Requiring this level of familiarity with a PSM written by another developer adds yet another burden to a (potentially) already considerable workload, as it requires considerably more effort than just mapping concepts, which is the claim made for this approach. This leads one back to the classic question: is it quicker to reuse someone else's components or to build the KBS completely yourself? Another important shortcoming of PSMTab is that, despite many claims of successful application of this methodology with the propose-and-revise problem solver in [39], [51], [85], and [103], the tool is not fully functional, and is unable to produce an (executable) KBS from the two components, which, of course, is the central objective of the reuse community. In [113], Tu and Musen reflect on PSMs, and in particular on their experience of using the Episodic Skeletal Plan Refinement (ESPR) PSM, with respect to the belief that PSMs represent their functionalities and knowledge requirements so well that they can be reused for similar tasks in different domains. Tu and Musen highlight that there are very few documented "plug-and-play" stories of PSM reuse, and provide an empirical perspective on PSM reuse and on the different possible implementations of the same PSM. Briefly, they discuss the Sisyphus-II VT experiment [97], in which multiple research groups were given the same task (elevator configuration) and method (propose-and-revise), yet each group developed its own variant of the method to solve the task.
Issues arising from Tu and Musen's work on applying the ESPR method to the task of (medical) therapy planning in different application settings using the T-HELPER system are also discussed. They conclude that each application of the PSM "changes the decomposition and control structure of the method substantially, even when the overall functional requirements and structure of domain knowledge remain essentially the same" [113], and that every time they applied the PSM in a different setting, they had to re-implement it specifically for that setting. Also discussed is their reformulation of ESPR for the EON framework, a server-based system which facilitates the invocation of problem solving components as and when necessary. In EON the sub-processes of a PSM are represented as CORBA objects, which, the authors claim, makes developing applications easier, as developers can select which sub-processes to use and when to use them, instead of having to redefine the entire PSM. The work reported in [113] illustrates the critical role that the task and domain play in a PSM's configuration. It is highly unlikely that one specific configuration will suffice for all applications; therefore, to facilitate PSM reuse, it is necessary not only to support mapping domain knowledge to a PSM, but also to support the configuration/selection of the method's sub-processes for that domain knowledge or application. Such selection is generally not supported by PSM reuse approaches.
2.2.2 KADS-I, KADS-II, and CommonKADS
One of the major European projects of the 1980s and 1990s was the KADS project. The Knowledge Acquisition and Documentation System (KADS) is now in its third revision, CommonKADS. While KADS-I and KADS-II were primarily focused on modelling expertise, developing a library of models of 'generic' problem solving tasks, and associated KA activities, CommonKADS [98] has a slightly wider focus. CommonKADS expands on KADS by providing a complete, structured KBS development methodology, similar to those of software engineering: that is, a methodology that considers project management, organisational analysis, knowledge engineering, and software engineering [98]. The development methodology of CommonKADS defines a process for developing a product model. The product model is essentially a description of how the organisation for which the new KBS is being developed should operate once the KBS has been put into practice. The product model is produced by first analysing the initial state of the organisation to provide an analysis model. The analysis model then undergoes a design process to create the product model, which, when implemented, should result in the organisation reaching its target state. The product model is itself composed of six model templates from the CommonKADS model set. Each model provides an alternative view of a problem solving context.
Briefly, these models are: the organisation model, which describes the environment (typically the relevant parts of the organisation) in which the KBS will be deployed; the task model, which describes the relevant tasks and activities important to the organisation's successful functioning; the agent model, which describes the agents (human or computer) which carry out the tasks; the communication model, which describes the communications, or transactions, between different agents; the expertise model, which describes an agent's knowledge with reference to a particular task, in terms of domain, task, and inference knowledge and their linkages; and the design model, which describes the system that will implement the expertise and communication models in terms of computational mechanisms, representational constructs, and software modules [99]. In the above descriptions, "relevant" means relevant to the KBS under development. To aid the developer, the CommonKADS expertise modelling library contains various problem solving methods such as assessment, diagnosis, and design [10]. It is this library that forms the basis for reuse in the CommonKADS project. In the latest book on CommonKADS [98], the library contains a series of abstract patterns for many of the PSMs. Each pattern provides an inference structure (an abstract specification of the algorithm) and a corresponding domain schema for a PSM. These abstract patterns form the expertise modelling library of the CommonKADS model set. In theory, libraries of all the various models used in CommonKADS would be available to the developer, although only the expertise modelling library is described in detail, and it is presumably the only library that has been implemented. However, even this library is incomplete: no assistance is provided to aid the instantiation of the domain schemas, nor, more significantly, the configuration of an inference structure and its corresponding domain structure to form an operational KBS.
The CommonKADS methodology does provide a structured development cycle for KBSs; however, its multiple-model approach has attracted much criticism. One of the main disadvantages of the approach is the overhead it brings to a project: it can take considerable time to construct
each model in sufficient detail, and considerable documentation is required to track progress and decisions [56]. The overheads were so considerable that Pragmatic KADS was developed as a streamlined version of CommonKADS, although its development did not go far. Menzies [69] offers further criticism of the high-level modelling approach of CommonKADS, arguing that it provides less insight into the KBS construction process, and is more error-prone, than alternative lower-level approaches, such as the symbol level. Further, Corbridge et al. [15] found that subjects acquired more knowledge from dialogues with experts than when using the KADS models. These points suggest that using CommonKADS adds unnecessary project overheads, is harder for the developer to use than other approaches, and is not necessarily an effective knowledge acquisition method. Further problems with the CommonKADS expertise modelling library are discussed in [33] and [114]: despite stating that the library has hundreds of PSMs, [33] also states that "none of these have been implemented". [114] goes further, stating that "access and practical use of the library are difficult", that the library is incomplete, with some parts remaining empty, and that it contains "models that we no longer believe in". The same paper also suggests that the models in the library, which are described at three levels, should have included a fourth "operationalization via code" level, since the library provides no executable code.
2.2.3 IBROW3 Project
The main objective of the IBROW3 project [50] was the development of "an intelligent brokering service that enables third-party knowledge-component reuse through the World Wide Web" [6]. These third-party components were domain ontologies and PSMs stored in libraries; the PSMs were described using the Unified Problem-Solving Method Development Language (UPML). The user would specify a problem (task) against a task ontology and send it to the brokering service, which would then select an appropriate domain ontology and PSM using a matching process; it would then configure and adapt the selected PSM by reasoning about the task and relating the method (PSM) and domain ontology. The result of this process would be a distributed reasoning service which could solve the user's task. Although an outline architecture was developed and the numerous requirements for its successful production were identified, there were only three discernible outputs from the project, of which only one is still used today: UPML. The others were two PSM libraries [73; 74] and a prototype brokering service, which are no longer available. UPML [34; 31; 32; 83] was developed as a standard for describing the architectural framework that the IBROW3 project believed would help realise its goal. The UPML architecture consists of six components: tasks, which define the problem to be solved; PSMs, which provide some type of reasoning; domain models, which provide relevant domain knowledge; ontologies, which are mainly used to provide a common vocabulary for use by the tasks, PSMs, and domain models; bridges, which provide a bi-directional link between two components (there are three types of bridge: task-domain, PSM-domain, and PSM-task); and refiners, which define how a component can be refined, for example, from a generic PSM to a specific PSM.
All components (except the bridges) should be defined independently of each other: so PSMs and tasks were defined independently, and domain knowledge did not reflect the requirements of any specific task or PSM. It was hoped that this independence would provide greater potential for reuse of the components.
The intention was that every component would be described in UPML; so library providers would describe their components (typically PSMs and domain knowledge) using UPML, and the client would use UPML to define tasks. The intelligent brokering service would then relate the task definition and PSM descriptions to select the correct PSM for the task and, via a series of bridges and refiners, configure and adapt the domain knowledge and PSM to produce a KBS which could solve the task. The success of the IBROW3 project was largely dependent on the successful creation of the various libraries (which required widespread adoption of UPML), of an intelligent broker which could use these libraries and map between the components, and of a user-friendly method for creating tasks. However, very few components were completed before the project finished. Motta et al. describe two PSM libraries: an existing parametric design library was reformulated to use UPML [73], and the other concentrated on classification problem solvers [74]. Neither library is available today. A prototype brokering service was implemented, but was only developed to the stage of demonstrating a single working example. A UPML (meta-)ontology [34; 83] has been defined, with an associated editor as a Protégé project [34; 83]. Users are expected to instantiate the UPML (meta-)ontology with details of their particular PSM, task, domain knowledge, etc. However, there is no special editor supporting these steps, and very little documentation to guide the user; so unless the user is very familiar with UPML and its syntax, using the ontology to describe a component is very challenging.
2.2.4 MUSKRAT
MUSKRAT (the Multistrategy Knowledge Refinement and Acquisition Toolbox) provides a framework in which problem solvers, knowledge acquisition systems, and knowledge refinement/transformation tools coexist. Essentially, the user provides MUSKRAT with a task to be solved, a suitable PS, and a set of knowledge sources (KSs) (knowledge bases, databases, ontologies, etc.), each of which is associated with a formalised specification. The MUSKRAT-Advisor will then give one of the following pieces of advice [119]: • The PS can be applied with the KSs already available. • The PS needs KSs not currently available, and these KSs must be acquired. • The PS needs KSs not currently available, but which could be provided by transforming the existing KSs. Note that each of the above decisions requires a KS to meet the corresponding formal specification; this problem is essentially the same as proving the equivalence of two programs, which is known to be undecidable. However, as in much of AI, progress can often be made by using domain- or task-specific heuristics. The approach of developing approximate reasoners was explored in some detail by White [118]. Nordlander [78] developed a system which applied the following two refinements to this task: • Considering only KSs which are shown to be relevant to the task, i.e. which contain appropriate domain knowledge.
• Introducing the idea that each of the formal KSs fulfils a particular role, and then classifying the (remaining) KSs (either by a human or by the system) according to which roles they are able to fulfil. The system developed by Nordlander aims to quickly identify implausible KS-PS combinations, which can then be discarded, leaving a smaller number for the MUSKRAT-Advisor to evaluate. This approach represents PSs as Constraint Satisfaction Problems (CSPs), and applies a series of CSP-relaxation techniques to identify plausible PS-KS combinations. Two strong, limiting assumptions of this approach were: 1. That the PS is a CSP. Runcie [96] investigated the extraction of knowledge components from an existing KBS, which is not necessarily a CSP, and the subsequent reuse of these components as a CSP. 2. That the KSs are all expressed against a single ontology. The present thesis investigates the use of multiple ontologies with a particular PS.
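The role-based filtering idea can be sketched as follows. This is my own minimal reconstruction of the idea for illustration, not Nordlander's implementation; the role names and knowledge sources are entirely hypothetical.

```python
# Sketch of role-based filtering of KS-PS combinations: the PS declares
# the knowledge roles it needs, each KS is classified by the roles it can
# fulfil, and any combination that cannot cover every required role is
# discarded before the more expensive MUSKRAT-Advisor evaluation.
from itertools import combinations


def plausible_combinations(ps_roles, ks_roles, max_size=2):
    """Yield sets of KS names that together cover every role the PS needs."""
    names = list(ks_roles)
    for size in range(1, max_size + 1):
        for combo in combinations(names, size):
            covered = set().union(*(ks_roles[n] for n in combo))
            if ps_roles <= covered:
                yield set(combo)


# Hypothetical PS requirements and knowledge sources:
ps_needs = {"components", "constraints"}
sources = {"elevator-ontology": {"components"},
           "design-rules-kb": {"constraints"},
           "sales-database": {"pricing"}}
print(list(plausible_combinations(ps_needs, sources)))
```

Only the combination covering both required roles survives; the sales database, whose only role is irrelevant to the task, is filtered out without any deeper evaluation.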
2.3 Ontology Mapping
The first three projects described above all model domains, PSMs, and tasks independently of each other, and depend on the developer linking them via a series of mappings. As noted in the description of the PSM Librarian, this can leave the developer responsible for defining hundreds of mappings. This dissertation investigates whether ontology mapping, alignment, and merging techniques can be used to automate parts of this process, i.e. to suggest corresponding concepts in descriptions of domain knowledge and generic PSs. Ontology mapping, alignment, and merging are all active research fields (see http://www.ontologymatching.org for details on ontology mapping research). In a relatively recent survey of ontology mapping research [55] (see [91] and [24] for other relevant surveys), Kalfoglou and Schorlemmer discussed 35 projects focused on ontology mapping, which they broadly categorised as providing frameworks, methods and tools, translators, mediators, techniques, reports of large-scale ontology mapping projects, theoretical frameworks, surveys, or examples. The methods, tools, and translators are of most interest here, as they present existing work on (semi-)automatic ontology mapping and translation between ontological forms. Many of the tools, such as FCA-Merge [105], SMART [79], PROMPT [81] and PROMPTDIFF [80], Chimaera [66], and ONION [72], make use of various linguistic matching techniques and of ontological structure when suggesting mappings between two different ontologies. Typical linguistic techniques are string equivalence, substring matching, and synonym matching using online thesauri such as WordNet [30]. While these lexically based techniques provide good results in same-domain applications, the accuracy of their suggestions understandably drops when dealing with ontologies modelling different domains.
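As a concrete illustration, the three lexical techniques just mentioned can be sketched as follows. This is my own minimal sketch, not code from any of the surveyed tools; the synonym table stands in for a thesaurus such as WordNet, and its entries (like the confidence scores) are hypothetical.

```python
# Sketch of lexical concept matching: string equivalence, substring
# matching, and synonym matching between two ontologies' concept names.
# The synonym table is a tiny stand-in for a thesaurus such as WordNet.
SYNONYMS = {"elevator": {"lift"}, "car": {"cab"}}  # hypothetical entries


def normalise(name):
    """Case-fold and strip common separators before comparison."""
    return name.lower().replace("-", "").replace("_", "")


def lexical_match(domain_term, ps_term):
    """Return a rough confidence that two concept names correspond."""
    a, b = normalise(domain_term), normalise(ps_term)
    if a == b:
        return 1.0  # string equivalence
    if b in SYNONYMS.get(a, set()) or a in SYNONYMS.get(b, set()):
        return 0.8  # synonym match
    if a in b or b in a:
        return 0.7  # substring match
    return 0.0


def suggest_mappings(domain_terms, ps_terms, threshold=0.6):
    """Suggest candidate concept mappings between two ontologies."""
    return [(d, p, s) for d in domain_terms for p in ps_terms
            if (s := lexical_match(d, p)) >= threshold]


print(suggest_mappings(["Elevator", "Motor"], ["lift", "motor-unit"]))
```

The sketch also shows why accuracy drops across domains: without a shared vocabulary or synonym entries, unrelated concept names simply score zero.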
Although this project requires the definition of mappings between different kinds of ontologies (domain and PS ontologies), these techniques will serve as a starting point for aiding the user in mapping domain knowledge to a PS's requirements. However, as this work involves mapping from a domain ontology to a problem solver, more complex transformations than property matching may be required. OntoMorph [12] is one tool which facilitates such transformations, using a special rule language to describe sentence-level transformations based on pattern matching. Sentence-level refers to the "sentences" (lines) of code which describe the knowledge base, for example Jess deftemplates and deffacts, which are rewritten as "sentences" (lines of code) of another language, such as KIF [38] statements. However, in order to write the transformation rules, the user is required to understand the syntax of the language the KB is written in (for example, Jess) and of the target language (for example, KIF), along with the syntax of the transformation rule language itself. Although this is a very powerful technique, the requirements it places on the user make it unsuitable for general users. However, the ability to transform a domain ontology's classes and/or properties (especially the property values of the individuals) in a similar manner may be required if the schema used by the domain ontology differs significantly from the schema used by the PS for describing domain concepts.
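The flavour of such sentence-level rewriting can be sketched with ordinary regular expressions. This is a simplification for illustration only, not OntoMorph's actual rule language; the deftemplate and the KIF-style output are illustrative, and a real translation would need to handle far more of Jess's syntax.

```python
# Sketch of sentence-level transformation in the style of OntoMorph:
# each "sentence" of a Jess knowledge base is pattern-matched and
# rewritten as a KIF-like statement. Unrecognised sentences pass through.
import re

DEFTEMPLATE = re.compile(r"\(deftemplate\s+(\S+)((?:\s*\(slot\s+\S+\))*)\s*\)")
SLOT = re.compile(r"\(slot\s+(\S+)\)")


def jess_to_kif(sentence):
    """Rewrite a Jess deftemplate as a KIF-style relation declaration."""
    m = DEFTEMPLATE.fullmatch(sentence.strip())
    if not m:
        return sentence  # pass unrecognised sentences through unchanged
    name, slots = m.group(1), SLOT.findall(m.group(2))
    args = " ".join("?" + s for s in slots)
    return f"(defrelation {name} ({args}))"


print(jess_to_kif("(deftemplate elevator (slot speed) (slot capacity))"))
```

Even this toy version makes the usability point above concrete: writing the pattern requires fluency in the source syntax, the target syntax, and the rule formalism all at once.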
2.4 Knowledge Acquisition
While ontology mapping may provide some domain knowledge for the problem solver, it is unlikely to provide all the knowledge the problem solver requires; this implies that some knowledge will need to be acquired. Previous work on knowledge acquisition has operated at various levels of abstraction: CommonKADS and Protégé [104] provide generic approaches, accommodating many domains, while other tools such as MORE [54], MOLE [28], and SALT [64] each deal with a specific type of problem solver. The discussion in this section focuses on Protégé, as the tools developed in this work are Protégé extensions, and on SALT, which deals with knowledge acquisition (KA) for propose-and-revise based systems. I also discuss the EXPECT system, which attempts to combine the best of both worlds: the flexibility of generic approaches with the strong KA guidance of problem-solver-specific approaches for extending intelligent systems.
2.4.1 Protégé for Knowledge Acquisition
The Protégé environment is an open source ontology editor and knowledge base framework [104]. Two of its components, or tabs as Protégé refers to them, are the Classes and Instances tabs. The Classes tab allows the user to create and edit an ontology: defining classes, properties, and so on. The Instances tab allows the user to create a knowledge base by creating instances of the defined classes. The user selects the class he wishes to create an instance of, and the Instances tab (also referred to as the Knowledge Acquisition Tab) displays a GUI tailored to creating instances of that particular class. Different widgets are displayed for entering a property value, based on the data type of, and constraints on, the particular property. This approach is very successful for generic KA. It provides a simple and effective means of creating instances of different classes, while restricting user input for each property value ensures that only acceptable values are entered. However, this approach can run into difficulties when creating knowledge bases for problem solvers. No support is provided to help the user visualise the relations between the various concepts and how they relate to a PS. Without this support the user has difficulty deciding with which class to start creating the KB (most developers simply pick a class at random and go from there). Without knowing how the classes are used by the problem
solver, the user can find himself instantiating many classes which may never be used by his chosen problem solver. This results in wasted effort for the user, and a longer development time.
2.4.2 SALT
SALT [64] is one of a family of tools which use what is often referred to as a "role-limiting" approach to knowledge acquisition. Role-limiting tools focus on building KBSs which use one particular type of PSM; as a result, they contain detailed knowledge of that PSM, which they use to provide the user with specialised support when building a KB for that type of PSM. SALT is a knowledge acquisition tool for generating KBs that conform to a propose-and-revise PSM; other role-limiting tools include MOLE [28], which focused on building KBs for a variant of heuristic classification, and MORE [54], which focused on a similar task. The KA technique of SALT is simple: the user is asked to enter either a procedure, a constraint, or a fix, each of which represents a construct used by propose-and-revise. Each has its own associated schema which the user is then required to complete. The procedure schema is used to define how a value for a design extension is determined; the constraint schema identifies a constraint (and supplies a procedure for determining its value); and the fix schema provides fixes for specific constraints. SALT was developed with an understanding of how these three different types of knowledge interact, and internally reflects this by building a graph of entities connected by contributes-to, constrains, and suggests-revision-of relations. It uses this graph, and its understanding of the various knowledge roles, to acquire the relevant domain knowledge, find where the acquired knowledge is lacking, and generate an expert system that uses the propose-and-revise method with the user-created KB. This strong understanding of the three types of knowledge in propose-and-revise systems makes SALT a very effective and powerful tool for creating propose-and-revise systems. It is this type of understanding of problem solver related knowledge that will be necessary in my reuse methodology and tools to enable a directed KA process.
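The kind of knowledge-role bookkeeping described above can be sketched as follows. This is a sketch of the idea only, assuming a deliberately simplified representation of SALT's three schemas; the elevator parameters, constraints, and fixes are hypothetical, not taken from SALT's knowledge base.

```python
# Hypothetical fragments of the three propose-and-revise knowledge roles
# SALT acquires: procedures propose parameter values, constraints limit
# them, and fixes revise violations.
procedures = {"car-speed": "lookup table", "motor-torque": "formula"}
constraints = {"max-car-speed": "car-speed",        # constraint -> parameter
               "max-cable-load": "cable-load"}
fixes = {"max-car-speed": ["upgrade motor"]}        # constraint -> fixes


def knowledge_gaps(procedures, constraints, fixes):
    """Mimic using a graph of knowledge roles to find missing knowledge."""
    gaps = []
    for constraint, parameter in constraints.items():
        if parameter not in procedures:
            gaps.append(f"no procedure proposes a value for '{parameter}'")
        if not fixes.get(constraint):
            gaps.append(f"no fix revises violations of '{constraint}'")
    return gaps


print(knowledge_gaps(procedures, constraints, fixes))
```

Here the check flags that nothing proposes a value for the constrained cable-load parameter and that the cable-load constraint has no fix, which is exactly the kind of gap a role-limiting tool can point the user at during KA.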
SALT is limited to propose-and-revise; however, I intend to produce a methodology and tools which are not limited to one particular PSM, but which still provide KA akin to SALT's for various PSs.
2.4.3 EXPECT
The EXPECT [9; 42] system, developed by the Interactive Knowledge Capture group (http://www.isi.edu/ikcap/) at the Information Sciences Institute, University of Southern California, is a tool designed to help users extend and customise a KBS for their needs. The KA part of the tool does this by combining the flexibility of generic tools, such as Protégé, with the directed KA process of role-limiting tools such as SALT. EXPECT uses two main types of knowledge: factual (domain) knowledge and problem solving knowledge (methods), both represented explicitly in the description logic representation language LOOM. EXPECT was designed with a reflexive architecture which allows its basic KA tool to provide some support for the user in fixing errors in the knowledge base or problem solving knowledge. This support includes alerting the user when the problem solver refers to a class or attribute which is not currently in the knowledge base, and adding the missing class/attribute; when the problem solver uses instances of a class which at that time has no instances; when an
attribute of an instance requires a value but no value has been assigned; and when new methods added to the knowledge base contain sub-goals which are, at that point, not achieved by the existing methods in the problem solver [42]. Since its initial version, described in [42], the KA tool of EXPECT has had several extensions. The first, and main, extension was the addition of the EXPECT Transaction Manager (ETM), which uses a series of Knowledge Acquisition Scripts (KA Scripts) to guide the user through changes to an EXPECT KBS [108]. A KA Script provides "a prototypical sequence of changes together with the conditions that make it relevant given previous changes to a knowledge base" [109]. The developed scripts focus on procedures which help the user ensure that when one element of the knowledge base or problem solver is updated, all elements that relate to it are also updated if necessary. For example, if one changed a method for calculating the round trip time of a ship, which uses methods for finding the ship's speed and journey distance, to calculate the round trip time of an aircraft, one would need to provide methods for finding the aircraft's speed and journey distance. The KA Script would guide the user in creating a new method for calculating the round trip time of the aircraft, based on that of the ship, and then in creating two new methods for finding the aircraft's speed and journey distance, also based on those of the ship. The KA Script knows that the new methods need to be created by using an interdependency model [43], which captures how individual pieces of knowledge work together in the problem solver. Interdependency models are another extension of EXPECT, which help ensure the KBS is left in a consistent state when changes are performed.
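The ship-to-aircraft example can be sketched as follows. The method and goal names are hypothetical, and the representation is a deliberate simplification rather than EXPECT's LOOM encoding: each method achieves a goal and posts sub-goals, and after a change any sub-goal that no method achieves is flagged, which is what lets a KA Script guide the user to supply the missing methods.

```python
# Each method is recorded with the sub-goals it posts (hypothetical names).
methods = {
    "round-trip-time-of-ship": ["speed-of-ship", "journey-distance"],
    "speed-of-ship": [],
    "journey-distance": [],
}


def unachieved_subgoals(methods):
    """Sub-goals posted by some method but achieved by none."""
    posted = {goal for subgoals in methods.values() for goal in subgoals}
    return sorted(posted - methods.keys())


# Copying the ship method for aircraft, as in the example above, posts a
# sub-goal that no existing method achieves:
methods["round-trip-time-of-aircraft"] = ["speed-of-aircraft", "journey-distance"]
print(unachieved_subgoals(methods))
```

The flagged sub-goal (finding the aircraft's speed) is what the KA Script would next prompt the user to supply, while the shared journey-distance method needs no change.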
Another important extension is an English-based method editor which helps the user add new methods to, and edit existing methods within, the problem solver using a series of constrained English sentences instead of writing them in EXPECT’s native LOOM format [92]. Each sentence is an English paraphrase of a method of the problem solving knowledge. The sentences are based on a range of method templates which are combined with contextual domain information such as method, class, and attribute names. The user can customise a sentence (and the associated method) by selecting and altering the relevant parts. This is an important extension as it helps users who are not necessarily programmers to alter the EXPECT KBS without having to learn LOOM or too much detail about the problem solver. With these extensions, EXPECT now provides a good system for interactively extending a KBS through the addition of factual and problem solving knowledge. The KA Scripts ensure all relevant information is updated when updates take place, and the English-based method editor makes it easier for any user to extend and customise a KBS. However, the approach is centred on extending an existing KBS. The user extends PSs by adding new methods which make them fit the user’s requirements and tie them to their domain. This means that the user needs to understand what goal the existing methods of the PS achieve and how they achieve it, along with each method’s role in the overall problem solving process, in order to decide which existing methods they should modify and which should form the basis of new methods. The user is also required to use the tool to provide all the domain knowledge that the PS will use. This could mean that the user invests significant effort if there is a large domain knowledge base which must be imported for the problem solver.
Additionally, while the system provides a good KA tool, it could end up acquiring large amounts of both factual and problem solving knowledge which is only ever used by that one KBS. There is no mention of how this acquired knowledge could be reused in other applications;
however, this is not the focus of that project’s work. There are some similarities between EXPECT and what I require: analysing existing methods/rules, and acquiring new, relevant domain and procedural knowledge; the idea of KA Scripts to ensure relevant knowledge is kept up to date may also be useful for extending generic problem solvers and domain ontologies with new knowledge.
2.5 Elevator Design (VT)
The elevator design task is a configuration task which involves determining a selection of components that meets performance requirements while also satisfying safety and building constraints. The first KBS which performed this task was VT (Vertical Transportation) by Marcus et al., which was developed for the Westinghouse Elevator Company in Randolph, New Jersey. Later, the Sisyphus-II KA initiative adopted this task as a challenge for knowledge acquisition tools; a report on the experiment is given in [97]. Designing an elevator is a very complex process: the documentation from the Sisyphus-II challenge [123] describes at least 14 different types of required components, six different categories of building dimensions, and five different categories of loads and moments that must be taken into account, and lists at least a further 50 constraints (and associated fixes) that must be satisfied by the final design. Figure 2.1 gives an outline of the various components that make up an elevator. Briefly, the counterweight, which weighs about the same as the car when it is 40% full, counterbalances the car, as they are both connected by the hoistcable. This means that when both are at rest (i.e., when the elevator has stopped at a floor), less power is required to move the complete assembly: once an initial motion is created by the motor turning the sheave (which the hoistcable goes over), the car and counterweight will continue to travel up or down until they are stopped. Various safety mechanisms, such as the car and counterweight buffers and compensation cable, are also required.
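The constraint-and-fix character of this kind of configuration task can be illustrated with a small propose-and-revise-style loop: propose a design, check the constraints, and apply the fix associated with a violated constraint until the design is consistent. The following Python sketch is purely illustrative; the parameters, constraint, and fix are hypothetical stand-ins, not actual VT domain knowledge.

```python
# Illustrative propose-and-revise loop for a configuration task. The constraint
# loosely echoes the counterweight description above (counterweight roughly
# balances the car at 40% load); the numbers and fix step are invented.

def propose_and_revise(design, constraints, fixes, max_iterations=50):
    """Repeatedly check constraints and apply the fix for the first violation."""
    for _ in range(max_iterations):
        violated = [name for name, check in constraints.items() if not check(design)]
        if not violated:
            return design                     # all constraints satisfied
        fixes[violated[0]](design)            # apply the fix for the violation
    raise RuntimeError("no consistent design found")

design = {"car_weight": 1000.0, "capacity": 1500.0, "counterweight": 800.0}
constraints = {
    "counterweight-balances-car":
        lambda d: d["counterweight"] >= d["car_weight"] + 0.4 * d["capacity"],
}
fixes = {
    "counterweight-balances-car":
        lambda d: d.__setitem__("counterweight", d["counterweight"] + 100.0),
}
result = propose_and_revise(design, constraints, fixes)
print(result["counterweight"])  # 1600.0
```

In a realistic system, as in VT, each constraint would have several candidate fixes ordered by preference, and applying one fix could violate other constraints, so the loop may revisit constraints many times.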
2.6 Technologies
2.6.1 The OWL Web Ontology Language
The OWL Web Ontology Language [116] is the W3C recommended formalism for representing ontologies. OWL builds on XML, XML Schema, RDF, and RDF Schema (RDFS), adding “more vocabulary for describing properties and classes: among others, relations between classes (e.g. disjointness), cardinality (e.g. “exactly one”), equality, richer typing of properties, characteristics of properties (e.g. symmetry), and enumerated classes” [67]. OWL provides an XML syntax for saving and distributing ontologies; the W3C have also provided an abstract syntax which “has a frame-like style, where a collection of information about a class or property is given in one large syntactic construct” [88], and which is more human readable than the XML equivalent. [88, section 2] provides a detailed discussion of the OWL abstract syntax, describing it through an Extended BNF; figure 2.2 provides excerpts from the Extended BNF which are used in this thesis: “terminals are quoted; non-terminals are bold and not quoted. Alternatives are either separated by vertical bars (|) or are given in different productions. Components that can occur at most once are enclosed in square brackets ([...]); components that can occur any number of times (including zero) are enclosed in braces ({...}). Whitespace is ignored in the
Figure 2.1: VT System Components [123, figure 1].
productions” [88, section 2]. Figure 2.3 provides example definitions for three classes, associated properties, and individuals.

Individuals
individual ::= 'Individual(' [ individualID ] { annotation } { 'type(' type ')' } { value } ')'
value ::= 'value(' individualvaluedPropertyID individualID ')'
        | 'value(' individualvaluedPropertyID individual ')'
        | 'value(' datavaluedPropertyID dataLiteral ')'
individualID ::= URIreference
type ::= description
individualvaluedPropertyID ::= URIreference
datavaluedPropertyID ::= URIreference
dataLiteral ::= typedLiteral | plainLiteral
typedLiteral ::= lexicalForm^^URIreference
plainLiteral ::= lexicalForm | lexicalForm@languageTag
lexicalForm ::= as in RDF, a unicode string in normal form C
languageTag ::= as in RDF, an XML language tag

OWL DL Class Axioms
axiom ::= 'Class(' classID [ 'Deprecated' ] modality { annotation } { description } ')'
modality ::= 'complete' | 'partial'

OWL DL Descriptions
description ::= classID
        | restriction
        | 'unionOf(' { description } ')'
        | 'intersectionOf(' { description } ')'
        | 'complementOf(' description ')'
        | 'oneOf(' { individualID } ')'
classID ::= URIreference

OWL DL Property Axioms
axiom ::= 'DatatypeProperty(' datavaluedPropertyID [ 'Deprecated' ] { annotation }
            { 'super(' datavaluedPropertyID ')' } [ 'Functional' ]
            { 'domain(' description ')' } { 'range(' dataRange ')' } ')'
        | 'ObjectProperty(' individualvaluedPropertyID [ 'Deprecated' ] { annotation }
            { 'super(' individualvaluedPropertyID ')' }
            [ 'inverseOf(' individualvaluedPropertyID ')' ] [ 'Symmetric' ]
            [ 'Functional' | 'InverseFunctional' | 'Functional' 'InverseFunctional' | 'Transitive' ]
            { 'domain(' description ')' } { 'range(' description ')' } ')'

Figure 2.2: Excerpts from the abstract syntax for OWL described in [88, section 2].
2.6.2 The Semantic Web Rule Language
The Semantic Web Rule Language (SWRL) is a W3C member submission which extends OWL with Horn-like rules that can be expressed against the classes, properties, and individuals in an OWL knowledge base [49]. SWRL rules have the form: IF the antecedents are satisfied THEN the consequents are asserted; a rule with no antecedents is treated as trivially true, and a rule with no consequents is treated as trivially false. The SWRL proposal also provides
an OWL ontology which can be used for defining rules; the main concepts of this ontology are described in table 2.1, and a visualisation is provided in figure 2.4. In figure 2.4, squares represent classes, and a dashed arrow represents a property whose domain is the arrow’s source and whose range is the arrow’s destination (for example, both the swrl:body and swrl:head properties have swrl:Imp as their domain and are restricted to individuals of the swrl:AtomList class). The SWRL formalism provides a way to capture rules by creating individuals of the relevant classes, particularly the swrl:Imp, swrl:Atom, and swrl:AtomList classes; however, an inference engine is still required to run them. One suitable rule engine is Jess [36]; in fact, various projects have developed tools which transform SWRL rules into executable Jess rules, for example SWRLJessTab [44] and SWRLJessTab [82] (these are two separate projects with the same name).

Class(o:Author complete)
Class(o:Item complete)
Class(o:Book partial o:Item)
DatatypeProperty(o:name Functional domain(o:Author) range(xsd:string))
DatatypeProperty(o:title Functional domain(o:Item) range(xsd:string))
DatatypeProperty(o:isbn Functional domain(o:Book) range(xsd:string))
DatatypeProperty(o:rrp Functional domain(o:Book) range(xsd:float))
ObjectProperty(o:authored domain(o:Author) range(o:Book) inverseOf(o:authors))
ObjectProperty(o:authors domain(o:Book) range(o:Author) inverseOf(o:authored))
Individual(o:fh type(o:Author) value(o:name "Friedman-Hill"))
Individual(o:jin type(o:Book) value(o:title "Jess In Action") value(o:isbn 1930110898)
  value(o:rrp 49.94) value(o:authors o:fh))

Figure 2.3: Example class, property, and individual definitions using the OWL abstract syntax.
Figure 2.4: A visualisation of the SWRL ontology.
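The key point of the SWRL ontology is that a rule is itself just data: an swrl:Imp individual linking a body AtomList (the antecedents) to a head AtomList (the consequents). The following Python sketch mirrors that layout; the class names follow the SWRL ontology described above, but the atom contents are an invented example, not taken from the submission.

```python
# A sketch of a SWRL-style rule captured as data: an Imp links a body atom list
# to a head atom list. An inference engine would still be needed to run it.

class Atom:
    """One antecedent or consequent: a predicate applied to arguments."""
    def __init__(self, predicate, *arguments):
        self.predicate, self.arguments = predicate, arguments
    def __repr__(self):
        return f"{self.predicate}({', '.join(self.arguments)})"

class Imp:
    """One rule: IF all body atoms hold THEN assert the head atoms."""
    def __init__(self, body, head):
        self.body, self.head = body, head   # both are lists of Atoms

# "If an author ?a authored a book ?b, then ?b has author ?a."
rule = Imp(
    body=[Atom("Author", "?a"), Atom("Book", "?b"), Atom("authored", "?a", "?b")],
    head=[Atom("authors", "?b", "?a")],
)
print(rule.head[0])  # authors(?b, ?a)
```

Representing rules as individuals in this way is what allows SWRL rules to be stored, exchanged, and edited with ordinary ontology tools before being handed to a rule engine such as Jess.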
2.6.3 Overview of the Protégé Environment
Protégé is an ontology and knowledge base creation tool which has been developed by the Stanford Center for Biomedical Informatics Research for over 20 years; the evolution of the first five
generations of Protégé is discussed in [40]. Briefly, the initial version, Protégé-I, which was developed by Mark Musen in the late 1980s, was a generalisation of the Opal KA tool for the Oncocin system. Protégé-I supported knowledge engineers in developing structured data sets, and then generated KA tools for acquiring the associated data. These KA tools were designed to be used by domain experts to specify a KB, which was in turn designed to be used by the episodic skeletal-plan refinement method provided by the Oncocin inference engine. When Protégé-II was built in the early 1990s, the developers generalised the tool to be capable of developing KBs for any PSM which could be run with the CLIPS expert system shell [76]; as part of this, a graphical ontology editor, Maître, was introduced, which allowed knowledge engineers to build domain ontologies. Domain ontologies then formed the basis of Mediator, a KA tool which domain experts could use to build KBs; the layout of the KA interface could be edited by the knowledge engineer using the DASH tool. The next version, Protégé/Win, which was developed in the mid-1990s, integrated the three tools of Protégé-II into a single tool (composed of three sub-tools), and extended the functionality of the previous version by providing an ontology import mechanism. Protégé-2000, which was developed from the late 1990s, introduced two major changes. Firstly, the underlying knowledge model moved from being based on CLIPS to being based on OKBC; secondly, a plug-in based architecture was introduced (made possible by the move to Java), which produced a much more integrated system that could be extended by developers in the various ways discussed below.

Concept Name     Description
swrl:Imp         An individual of this class represents a rule in SWRL, with links to the antecedents and consequents.
swrl:body        A rule body (list of antecedents), which is an individual of type swrl:AtomList.
swrl:head        A rule head (list of consequents), which is an individual of type swrl:AtomList.
swrl:AtomList    Individuals of this class build the antecedent and consequent lists of a rule.
swrl:Atom        An atom is a rule antecedent or consequent; SWRL provides seven different types of atom.
swrl:Variable    Represents a variable in a rule.
swrl:Builtin     Individuals of this class provide a series of functions for use in rules.

Table 2.1: Description of the main concepts in the SWRL ontology.
The most recent version developed at Stanford, namely Protégé 3, was released around 2004 and provided enhanced support for OWL ontologies; development of the Protégé 3 generation is ongoing, with Protégé 3.3.1 being the most recent release. Development has also started on Protégé 4 [48], which is developed mainly at the University of Manchester and is focused specifically on dealing with OWL ontologies. This work uses the Protégé 3 generation, and henceforth any references to Protégé are to this generation. From the user’s perspective, there are two principal versions of Protégé: Protégé Frames [40], which deals with ontologies expressed in an OKBC-compliant Frames formalism, and Protégé OWL [57], which deals with ontologies expressed in OWL. Both versions provide a series of tabs which facilitate the standard ontology editing operations, such as creating new ontologies and adding, editing, and deleting classes, properties, and individuals. There are also
numerous other tabs available for many different tasks, such as search and navigation, ontology visualisation, inference and reasoning, validation, import, and export. Protégé is also capable of directly exporting ontologies into many different formats, such as RDF(S) [117], N-TRIPLE [46], and N3 [7]. From the developer’s perspective, Protégé is an extendible ontology management environment which provides many powerful, reusable components that can be configured and extended to perform some task. Protégé is designed around an extendible architecture which allows developers to extend its functionality in various ways: tab plug-ins allow extra functionality to be added through the inclusion of another tab in the Protégé user interface (which by default is composed of tabs for creating ontology classes, properties, and individuals); slot widgets allow new interface components to be added for supporting property value entry; back-ends allow different formalisms/storage mechanisms to be used; project plug-ins allow manipulation of a project (ontology) and its user interface; and create-project and export plug-ins allow importing and exporting of ontologies expressed in formalisms not already supported. An example of the benefit of this architecture is Protégé itself: despite appearances, there is really only one version of Protégé; the two apparent versions (Protégé Frames and Protégé OWL) are provided by various plug-ins (particularly back-ends, tabs, and slot widgets) which supply two different environments (in terms of graphical interface and underlying data structures) for editing Frame- and OWL-based ontologies; this is necessary due to the differences between the two formalisms.
2.6.4 Jess
Jess is a “Java-based rule engine and scripting language developed at Sandia National Laboratories in Livermore, California in the late 1990s” [36], and is still maintained today. Jess was inspired by the CLIPS expert system shell [76]; there are, however, many differences between Jess and CLIPS, not least the implementation language: whereas CLIPS is written in C, Jess is written entirely in Java, and as such can easily be incorporated into Java programs [36]; it also supports Java scripting, meaning it can access and use all available Java APIs. Jess is a declarative programming language, and supports the JSR-00094 Java Rule Engine API standard (http://jcp.org/aboutjava/communityprocess/review/jsr094). Jess can be split into three components: a working memory, a rule base, and an inference engine. The working memory stores a series of facts depicting the data that the rules reason with; the rules themselves are stored in the rule base. The inference engine is then responsible for correctly applying the rules based on the facts in working memory. The inference engine can be further broken down into a pattern matcher, an agenda, and an execution engine. The pattern matcher is responsible for determining which rules should be activated: it compares the antecedents of all the rules with the facts in working memory to generate an unordered conflict set of rule activations. A conflict resolution strategy is then applied to the conflict set, which orders the activated rules (Jess provides two conflict resolution strategies, depth-first (the default) and breadth-first; CLIPS, in contrast, provides seven). The first item on the sorted agenda is then executed by the execution engine, and the entire process is repeated until no more rules can be fired [36]. Jess uses several CLIPS constructs for describing data and rules, namely: the fact construct for representing data/facts; the deftemplate construct for representing the structure/types of
facts; the defrule construct for representing rules; the defquery construct for querying the facts in working memory; and the deffunction construct for defining functions. The main CLIPS construct not supported by Jess is COOL, the CLIPS Object-Oriented Language [2]. Essentially, COOL allows the user to represent data using object classes in a similar manner to object oriented programming. Each class specifies the structure of a particular data class through a series of slots, which can be thought of as the class’s fields/properties; instances of the classes can be created to represent the actual data, and it is possible to define multiple subclass/superclass inheritance relationships between classes. COOL classes use message-handlers to pass messages between instances, in a similar way to methods in Java classes.
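The match-resolve-act cycle described above can be sketched in a few lines of Python. This is only a naive illustration of the control flow: real Jess uses the Rete algorithm to match incrementally, whereas this sketch re-matches everything on every cycle, and the facts, rule, and names below are invented.

```python
# Minimal sketch of a recognize-act cycle: build the conflict set, pick an
# activation (here simply the first one), fire it, repeat until quiescence.

def run(rules, facts):
    fired = []
    while True:
        # pattern matcher: every (rule, fact) pair whose condition holds
        conflict_set = [(rule, fact) for rule in rules
                        for fact in facts
                        if rule["condition"](fact) and (rule["name"], fact) not in fired]
        if not conflict_set:
            return facts                      # no more rules can fire
        rule, fact = conflict_set[0]          # trivial conflict resolution
        fired.append((rule["name"], fact))    # refraction: don't re-fire
        rule["action"](fact, facts)           # execution engine

facts = [("Book", "Jess In Action", 49.94)]
rules = [{
    "name": "flag-expensive",
    "condition": lambda f: f[0] == "Book" and f[2] > 40.00,
    "action": lambda f, wm: wm.append(("Expensive", f[1])),
}]
result = run(rules, facts)
print(result[-1])  # ('Expensive', 'Jess In Action')
```

The interesting design point is the conflict resolution step: with several activations pending, a depth-first strategy (Jess's default) favours the most recently enabled activations, while breadth-first processes them in arrival order; the choice can change which results are derived first.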
Representing Facts in Jess

Facts effectively constitute the data/knowledge base which Jess uses to determine which rules should be activated. In Jess there are two types of facts: unordered facts and ordered facts. As the name suggests, unordered facts do not have to be defined with respect to a specific order; their structure is defined by an associated deftemplate. An unordered fact is required to specify the deftemplate that it is being expressed against, and values for that deftemplate’s slots, which can be expressed in any order. For example, the definitions of UF1 and UF2 in figure 2.5 both assert a fact representing the same book into Jess’s working memory (in fact, Jess will recognise that UF2 is identical to UF1, and only one fact representing this book will be asserted). With unordered facts it is generally possible for other developers to read and understand fact definitions and references (for example, in rules), as both the slot name and value are specified. Ordered facts, on the other hand, do not use a deftemplate and rely on the consistent ordering of values across all fact definitions for a particular type of fact. The meaning of the values is not explicitly stated in ordered facts, and so it is generally harder for other developers to read and understand fact definitions and references. For example, the code for asserting two example ordered facts into Jess’s working memory is shown in figure 2.5, as OF1 and OF2. Although these facts appear to represent the same book, there is no way to ensure this, as there is no meaning attached to the various pieces of data: 1930110898, for example, could be the ISBN, but it could equally be the author’s phone number, or a book store’s ID number for that book; as such, Jess will assert a fact into working memory for each of these two definitions. For the remainder of this thesis, unless otherwise stated, when discussing facts it is unordered facts that are being referred to.
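The unordered/ordered distinction is essentially that between keyword and positional records. The following Python analogue (dicts versus tuples; not Jess syntax) makes the point: named slots carry their meaning with them and are order-independent, while positional values are interchangeable only if every writer agrees on the ordering.

```python
# Python analogues of unordered and ordered facts for the book example above.

# "Unordered": named slots, so slot order is irrelevant.
uf1 = {"title": "Jess In Action", "isbn": "1930110898", "rrp": 49.94}
uf2 = {"rrp": 49.94, "isbn": "1930110898", "title": "Jess In Action"}
print(uf1 == uf2)  # True: same fact, regardless of slot order

# "Ordered": meaning is carried purely by position.
of1 = ("Jess In Action", "1930110898", 49.94)
of2 = ("1930110898", "Jess In Action", 49.94)
print(of1 == of2)  # False: swapping positions changes the "fact"
```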
It is possible to assert a series of unordered facts into working memory at once using the deffacts construct. Facts defined in a deffacts construct are asserted into Jess’s working memory before the inference engine is executed. Two example deffacts are provided in figure 2.6: these define a series of facts representing books and authors; note that any deffacts construct can only define facts for one particular deftemplate.
Jess deftemplates

The deftemplate construct is used to define the structure of facts, both to improve the organisation of the data the system uses and to improve the readability of the code. The deftemplate for a fact type must be provided before an unordered fact of that type can be asserted into working memory. Example deftemplates for books and authors are shown in figure 2.7.
UF1: (assert (Book (title "Jess In Action") (isbn 1930110898) (rrp 49.94) (authors "Friedman-Hill")))
UF2: (assert (Book (isbn 1930110898) (title "Jess In Action") (authors "Friedman-Hill") (rrp 49.94)))
OF1: (assert (Book "Jess In Action" 1930110898 49.94 "Friedman-Hill"))
OF2: (assert (Book 1930110898 "Jess In Action" "Friedman-Hill" 49.94))

Figure 2.5: Example code for asserting unordered (UF1 and UF2) and ordered (OF1 and OF2) facts representing the Jess In Action book into Jess’s working memory.
(deffacts books "example books"
  (Book (title "Jess In Action") (isbn 1930110898) (rrp 31.27) (authors "Friedman-Hill"))
  (Book (title "Artificial Intelligence") (isbn 0130803022) (rrp 46.99) (authors "Russell" "Norvig"))
  (Book (title "Design Patterns") (isbn 0201633612) (rrp 41.99) (authors "Gamma" "Helm" "Johnson" "Vlissides"))
)

(deffacts authors "example authors"
  (Author (name "Friedman-Hill") (authored "Jess In Action"))
  (Author (name "Russell") (authored "Artificial Intelligence"))
  (Author (name "Norvig") (authored "Artificial Intelligence"))
)

Figure 2.6: Example deffacts for defining a series of book facts and author facts.
After the name of the deftemplate is defined (for example, Book), a series of slots are defined which describe the properties of the object modelled by the deftemplate. Each slot definition consists of the slot name, which immediately follows the slot keyword: for example, title is the name of the first defined slot. Slots defined with the slot keyword can only have one value, so, for example, any book can only have one title (cf. the books deffacts in figure 2.6); slots defined with the multislot keyword, such as the authors slot in the Book deftemplate, can have multiple values (for example, the Design Patterns book in figure 2.6 has four authors: Gamma, Helm, Johnson, and Vlissides). Slot definitions can also specify a default value for the slot: for example, the rrp slot in the Book deftemplate has a default value of 0.00, so if a book is asserted into working memory without an rrp value, it is set to 0.00 by default. It is also possible to specify the type of a slot; however, unlike CLIPS, Jess does not enforce type restrictions on the facts that are asserted.
(deftemplate Book
  (slot title)
  (slot isbn)
  (slot rrp (default 0.00))
  (multislot authors)
)

(deftemplate Author
  (slot name)
  (slot authored)
)

Figure 2.7: Example deftemplates for a book and an author.
Jess defrules

The defrule construct is used to define rules in Jess, in a similar style to CLIPS. As mentioned previously, Jess rules are of the form IF antecedents THEN consequents, where the antecedents are a set of conditions that must be met by the facts in working memory, and the consequents are a set of actions that should be taken if the antecedents are satisfied. Jess’s rule engine is forward-chaining by default; however, it can simulate a backward-chaining rule engine relatively effectively when programmed correctly [36]. This discussion focuses on forward-chaining rules, which is Jess’s “natural” mode. At run time, Jess’s pattern matcher matches the antecedents of every rule with the facts from working memory that satisfy the conditions expressed in those antecedents; the pattern matcher therefore requires the rules to refer to the fact type and slot names. An example rule is shown in figure 2.8, which prints out the details of all the books with an rrp over 40.00; figure 2.9 shows the output produced when this rule is run with the deffacts defined in figure 2.6.

(defrule displayBooksOver40 "Displays books over 40.00"
  (Book (title ?title) (authors $?authors) (isbn ?isbn) (rrp ?rrp))
  (test (> ?rrp 40.00))
  =>
  (printout t ?title " by " ?authors " (isbn " ?isbn ") has an rrp over 40.00 (rrp " ?rrp ")" crlf)
)

Figure 2.8: An example Jess defrule to display all the books with an rrp greater than 40.00.

There are several points to note about this rule:
• Every rule definition starts with the defrule keyword, the rule’s name, and an optional comment.
• Facts are referenced in a similar manner to how their deftemplate is defined. The ordering of slots is not important.
Artificial Intelligence by Russell Norvig (isbn 0130803022) has an rrp over 40.00 (rrp 46.99)
Design Patterns by Gamma Helm Johnson Vlissides (isbn 0201633612) has an rrp over 40.00 (rrp 41.99)

Figure 2.9: Example output from running the displayBooksOver40 rule with the books deffacts.

• Variables are declared by prefixing the name with a “?”; these treat the value of the variable as a single item (for example, the book title variable ?title). Variables for storing multislot values (such as the authors slot) are declared with a name prefixed with “$?”, which allows further pattern matching on the separate values of the variable: for example, to display only books with a particular author, it is necessary to test whether that author is included in the list of authors.
• The test function is used to perform some sort of evaluation, such as the greater-than test in the displayBooksOver40 rule.
• The “=>” keyword acts as the separator between the list of antecedents and the list of consequents.
• The printout function prints text to a given stream; by default the t stream is the terminal. The crlf keyword ends the line so further text is displayed on the next line.
Other Jess Features

Jess also provides the deffunction and defquery constructs. The deffunction construct allows the developer to define functions (similar to methods in Java) which can be used by rules to perform some activity as required: for example, to query the user for data, or to perform some complex evaluation on fact values. A deffunction can receive a list of parameters when it is called, and returns some value as the result of its execution. The defquery construct allows the developer to query the facts in working memory. A defquery is similar to a defrule; the main difference is that a defquery does not contain a list of consequents: when executed, it returns the facts in working memory that the pattern matcher matches with the list of “antecedents” defined in the defquery. The developer can also define user functions, which are Java programs that Jess automatically incorporates into its list of functions; these can be used by a rule-based system in the same manner as other functions (such as the test function in figure 2.8). As a user function is essentially a Java program, this provides a very powerful mechanism for incorporating new capabilities into Jess programs.
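Conceptually, a defquery is a rule with antecedents but no consequents: it just returns the matching facts. A minimal Python sketch of that idea (facts as dicts, conditions as predicates; the data and names are illustrative, not Jess's implementation):

```python
# Sketch of a defquery-style lookup: return every working-memory fact that
# satisfies all of the supplied conditions.

def query(working_memory, conditions):
    """Return all facts satisfying every condition (a list of predicates)."""
    return [fact for fact in working_memory
            if all(cond(fact) for cond in conditions)]

working_memory = [
    {"type": "Book", "title": "Jess In Action", "rrp": 49.94},
    {"type": "Book", "title": "Design Patterns", "rrp": 41.99},
    {"type": "Author", "name": "Friedman-Hill"},
]
# Analogous to a defquery matching (Book (rrp ?rrp)) with (test (> ?rrp 45.00)).
books_over_45 = query(working_memory,
                      [lambda f: f["type"] == "Book", lambda f: f["rrp"] > 45.00])
print(books_over_45[0]["title"])  # Jess In Action
```

Note the ordering of the conditions: the type check runs first and short-circuits, so the rrp test is never applied to facts (such as the author) that lack an rrp slot.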
2.6.5 JessTab
JessTab [25; 26] is a tab plug-in to the Protégé environment, developed by Henrik Eriksson at the University of Linköping, Sweden, which enables Jess rules to be applied to an ontology within the Protégé environment. Effectively, JessTab sits between the two technologies (Jess and Protégé), allowing Jess to treat the individuals of an ontology as (Jess) facts which it can reason with, and
updating the ontology when Jess’s rules update the facts in its working memory. To achieve this, JessTab extends Jess with a series of user functions which allow it to access and perform various operations on the classes, properties, and individuals of an ontology.
JessTab Functions

JessTab provides several functions which allow Jess to access and manipulate an ontology in Protégé; these are listed in [27]. For example, there are functions for: mapping a class (to Jess’s working memory); creating, deleting, and editing classes and individuals, and getting and setting the property values of an individual; querying (Frames-based) restrictions on properties, such as the type, cardinality, and allowed values; finding information about a class, such as determining whether one class is a superclass or subclass of another, and getting the superclasses and subclasses of a class; finding information about an individual, such as its name and address (in memory); finding a particular individual; looping through all the individuals of a class; loading and saving an ontology; and getting the Java instance which stores the ontology.
Mapping an Ontology to Jess

JessTab works by mapping the classes and individuals in an ontology to deftemplates and facts in Jess’s working memory. JessTab adds a “master” deftemplate, called object, to Jess, which it uses to represent the classes, properties, and individuals of an ontology. The object deftemplate contains a slot called is-a, which it uses to store the type (class) of a fact (that represents an individual), and a multislot for each of the properties in the ontology. Consider the example object deftemplate in figure 2.10, which was created by mapping the classes described in figure 2.3: the single deftemplate contains multislots for all the properties defined in the ontology; note that because authors is an object property, its values are other individuals in the ontology, which JessTab describes with the name of the class in the Protégé API that represents individuals. Figure 2.10 also shows the six example facts derived from the individuals associated with the Book and Author classes. Consider the first example, for the “Jess In Action” book: the is-a slot has as its value the type (class name) of the mapped individual, in this case Book; datatype properties, such as title, isbn, and rrp, are mapped to multislots with the datatype value; object properties are also mapped to multislots, with the value set to (the program instance of a class from the Protégé API which represents) the resource (class, individual, or property) in the ontology.
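The mapping just described can be sketched in a few lines: each individual becomes one "object" fact whose is-a slot holds its class, with a multislot per ontology property. The sketch below uses plain Python dicts and the book example assumed from figure 2.3; it is an illustration of the mapping idea, not JessTab's actual implementation.

```python
# Sketch of a JessTab-style ontology-to-fact mapping: every individual becomes
# an "object" fact; all property slots are lists, since JessTab maps every
# ontology property to a multislot.

def map_to_facts(individuals):
    facts = []
    for name, (cls, properties) in individuals.items():
        fact = {"is-a": cls}                 # the individual's class
        for prop, values in properties.items():
            fact[prop] = list(values)        # every slot is a multislot
        facts.append(fact)
    return facts

# Toy ontology: an author and a book, echoing the o:fh / o:jin individuals.
individuals = {
    "o:fh": ("Author", {"name": ["Friedman-Hill"]}),
    "o:jin": ("Book", {"title": ["Jess In Action"], "isbn": ["1930110898"],
                       "rrp": [49.94], "authors": ["o:fh"]}),
}
facts = map_to_facts(individuals)
print(facts[1]["is-a"], facts[1]["title"])  # Book ['Jess In Action']
```

The reverse direction, writing rule-driven changes to the facts back into the ontology, is what makes JessTab more than an exporter: the fact store and the ontology are kept in step in both directions.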
JessTab Rules

JessTab rules are ordinary Jess rules written to use the object deftemplate when referring to facts, along with the extra JessTab functions. An example JessTab rule is provided in figure 2.11, which adds any book an author has authored to that author’s authored property; Jess ensures that this rule runs for all authors and all books. There are several points to note about this rule:
• The first line of code finds a book fact in Jess’s working memory, storing a reference to it in the variable ?b and a reference to one of its authors in the ?author variable.
Example Mapped Object deftemplate

(deftemplate object
  (slot is-a)
  (multislot name)
  (multislot title)
  (multislot isbn)
  (multislot rrp)
  (multislot authors)
  (multislot authored))

Example Mapped Facts

(object (is-a Book) (title "Jess In Action") (isbn 1930110898) (rrp 31.27) (authors <SimpleInstance>))
(object (is-a Book) (title "Artificial Intelligence") (isbn 0130803022) (rrp 46.99) (authors <SimpleInstance> <SimpleInstance>))
(object (is-a Book) (title "Design Patterns") (isbn 0201633612) (rrp 41.99) (authors <SimpleInstance>))
(object (is-a Author) (name "Friedman-Hill"))
(object (is-a Author) (name "Russell"))
(object (is-a Author) (name "Norvig"))
Figure 2.10: Example mapped deftemplate and facts from an ontology.

• The second line of code finds an author in Jess’s working memory, storing a reference to it in the variable ?a and a reference to the value of its authored property in the $?authored variable.
• The third line of code uses the instance-address function to test that the address of the Protégé individual represented by ?a is the same as that of ?author; this ensures that ?author and ?a refer to the same individual.
• The fourth line of code again uses the instance-address function to check that the book individual represented by ?b is not already in the list of books stored in $?authored.
• The final line of code sets the value of the authored property of the author individual (referenced by ?a) to a new list consisting of the existing values and the individual referenced by ?b.
2.7 Summary
This chapter has summarised research related to this work, along with brief presentations of the technologies used to develop several tools in this research. Four projects which focused on building KBSs from reusable components (PSM Librarian, CommonKADS, IBROW3, and MUSKRAT) are described in section 2.2; each is described in terms of its approach, strengths, and weaknesses. In summary, the main weakness of the four approaches is that they all failed to produce satisfactory support tools for the development and subsequent execution of a new KBS.
    (defrule assignBooks
      "Adds any books to an author’s authored property"
      ;; find a book fact with an author, and keep a reference to it
      ?b <- (object (is-a Book) (authors $? ?author $?))
      ;; find an author fact, and keep a reference to its authored property
      ?a <- (object (is-a Author) (authored $?authored))
      ;; check that ?author and ?a refer to the same individual
      (test (eq (instance-address ?a) ?author))
      ;; check that the book is not already in the author's authored list
      (test (not (member$ (instance-address ?b) $?authored)))
      =>
      ;; append the book to the author's authored property
      (slot-set ?a authored (create$ $?authored (instance-address ?b))))

Figure 2.11: A JessTab rule which adds books to their author’s authored property.

3. The constituents of a string are the words that make up that string. The symbols ‘-’, ‘_’ and (in mixed-case strings) an upper-case letter are used to denote the start of a new word. For example, the string date-of-birth has the constituents date, of and birth.

Constituent Mappings > WordNet Mappings, where >
means “select before”. Currently PJMappingTab supports mapping each extracted class to a single class in the new ontology. In a few cases this may result in some extracted classes not having a mapping. To minimise this, after the mapping algorithm has produced the list of mappings, a global optimisation algorithm is applied. This algorithm analyses the suggested mappings to determine whether choosing a sub-optimal mapping for one class would result in more classes being mapped overall; if so, this alternative mapping configuration is suggested to the user. Suppose we have three classes in the abstracted rules, RC1, RC2 and RC3, and suppose the new ontology has three classes, OC1, OC2 and OC3. Then suppose the mappings between the abstracted-rule classes and the ontology classes are suggested as shown in figure 3.2, where each suggested mapping is represented by an arrow between two classes (represented as circles), and the number beside each arrow is the score associated with the mapping. For example, the arrow between classes RC1 and OC1 represents a mapping with a score of 1.0 between RC1 and OC1; similarly, the arrow between classes RC1 and OC2 represents a mapping with a score of 0.75.
Figure 3.2: Sample mappings illustrating the need for the enhanced mapping algorithm.

If we select the highest-scoring local mappings we will have: RC1 ↦ OC1 (score 1.0); RC3 ↦ OC3 (score 1.0); and no mapping for RC2 or OC2. To ensure the abstracted rules will work correctly with the new ontology, it is necessary to ensure a mapping is defined for every class in the abstracted rules, so this mapping set is unacceptable, as one of the abstracted-rule classes (RC2) is unmapped. However, our enhanced algorithm analyses the mappings and suggests the following mappings to the user: RC1 ↦ OC2 (score 0.75); RC2 ↦ OC1 (score 0.8); RC3 ↦ OC3 (score 1.0). In this mapping set all the abstracted-rule classes are mapped, and under a simple additive evaluation the total mapping score is also higher: 2.0 in the first case and 2.55 in the second. (It is appreciated that a more sophisticated mapping function is likely to be required for more complex examples.)
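The enhanced algorithm’s selection step can be sketched as a small exhaustive search over one-to-one assignments: prefer the assignment that maps the most abstracted-rule classes, breaking ties by total score. The following Python sketch is purely illustrative (the scoring data is taken from the example above; PJMappingTab’s actual implementation may differ):

```python
from itertools import permutations

def best_assignment(rule_classes, onto_classes, score):
    """Choose a one-to-one assignment maximising (classes mapped, total score).
    `score` maps (rule_class, onto_class) pairs to a similarity; pairs with
    no suggested mapping are simply absent."""
    best = (-1, -1.0, {})
    for perm in permutations(onto_classes, len(rule_classes)):
        pairs = {rc: oc for rc, oc in zip(rule_classes, perm)
                 if (rc, oc) in score}
        mapped = len(pairs)
        total = sum(score[(rc, oc)] for rc, oc in pairs.items())
        if (mapped, total) > best[:2]:
            best = (mapped, total, pairs)
    return best

# Suggested mappings from figure 3.2 (scores between RC and OC classes).
scores = {("RC1", "OC1"): 1.0, ("RC1", "OC2"): 0.75,
          ("RC2", "OC1"): 0.8,  ("RC3", "OC3"): 1.0}

mapped, total, assignment = best_assignment(
    ["RC1", "RC2", "RC3"], ["OC1", "OC2", "OC3"], scores)
# assignment == {"RC1": "OC2", "RC2": "OC1", "RC3": "OC3"}, total ≈ 2.55
```

For the example scores this prefers the three-way assignment (total score 2.55) over the two highest-scoring local mappings (total score 2.0, with RC2 unmapped).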
3.5 Results
To evaluate the performance of our tool, a series of experiments were conducted, designed to test first the rule abstraction process and second the mapping capabilities. Each test involved the development of an initial JessTab rule set, which was based around an initial ontology. This ontology was either created by me or downloaded from the Stanford KSL OKBC server (accessed via the Protégé OKBC Tab plugin [101] at www-ksl-svc.stanford.edu:5915). Once each set of rules had been developed to a satisfactory level, further ontologies with which to test the mapping functionality were built. Tables 3.1 and 3.2 provide a selection of the results from tests with the Document ontology from the Stanford KSL OKBC server’s Ontolingua section. This Document ontology contained 20 classes and five slots; Ontology A contained 37 classes and 14 slots; Ontology B contained 25 classes and nine slots. Table 3.1 gives the results for the mappings of selected class names and table 3.2 gives the results for the slot mappings. The columns of table 3.1 detail the concept name (in the Document ontology), the abstract name it was given after Phase 1, and details of the mappings produced in tests with two separate ontologies: the concept it was mapped to and the mapping operator selected. The columns of table 3.2 are identical, with an additional column stating the slot type.
Name | Abstract Name | Ontology A: Name (Mapping Operator) | Ontology B: Name (Mapping Operator)
Document | XX Document | Document (Direct) | Docment (Direct)
Book | XX Book | Book (Direct) | Volume (WordNet)
Thesis | XX Thesis | Thesis (Direct) | Dissertation (WordNet)
Masters-Thesis | XX Masters-Thesis | ThesisMasters (Constituent) | Masters-Dissertation (WordNet)
Doctoral-Thesis | XX Doctoral-Thesis | DoctoralThesis (Constituent) | Dissertation-Doctoral (WordNet)
Miscellaneous-Publication | XX Miscellaneous-Publication | AssortedPublishing (WordNet) | Miscellaneous-Publishings (WordNet)
Artwork | XX Artwork | Art (WordNet) | Artistry (Manual)
Technical-Manual | XX Technical-Manual | TechnicalMnual (Direct) | Manual-Technical (Constituent)
Computer-Program | XX Computer-Program | ComuterProgram (Direct) | Computer-Programme (WordNet)
Cartographic-Map | XX Cartographic-Map | Cartographical-Mapping (WordNet) | Cartographical-Correspondence (WordNet)
Table 3.1: Results for mapping the classes of the Document ontology.

These results offer a good illustration of the type of support the user can expect from PJMappingTab. The “Document” concept in table 3.1 demonstrates the direct mapping nicely: in Ontology A the name is identical, while in Ontology B the minor spelling error in “Docment” is picked up by our tool and has no adverse effect on the suggested mapping. Both tables exhibit examples of constituent mappings, while the WordNet mappings also feature prominently, both with one-word
4. The spelling mistakes in table 3.1 are intentionally left, as they illustrate the minor-variant checking feature of direct mappings.
Name | Data Type | Abstract Name | Ontology A: Name (Mapping Operator) | Ontology B: Name (Mapping Operator)
Has-Author | String | XX Has-Author | HasWriter (WordNet) | Author (Direct)
Has-Editor | String | XX Has-Editor | HasEditor (Direct) | Editor-Of (Manual)
Title-Of | String | XX Title-Of | Title (Constituent) | TitleOf (Direct)
Publication-Date-Of | String | XX Publication-Date-Of | Date-Of (Constituent) | IssueDate (WordNet)
Publisher-Of | String | XX Publisher-Of | Publication-House (Manual) | PublisherOf (Direct)

Table 3.2: Results for mapping the slots of the Document ontology.

name examples (“Book” to “Volume”) and multi-word concept names (“Cartographic-Map” and “Cartographical-Correspondence”).
3.6 Discussion
Overall the outcome of these experiments was very satisfactory. However, they do suggest that some modifications to PJMappingTab, particularly to the mapping phase, could be beneficial.

One significant change would be to allow the user to specify how deeply the tool searches WordNet for synonyms. This is best described by means of an example. Take the “Artwork” concept in table 3.1, row 7. In Ontology B the user was required to manually map this concept to “Artistry”. When suggesting a mapping, the tool looked up synonyms of artwork in WordNet, which returned “artwork”, “art”, “graphics”, and “nontextual matter”. Had the tool searched one level deeper (i.e. looked up synonyms of art, graphics, and so on), it would have found that “artistry” is a synonym of “art” and suggested a mapping. However, performing deeper searches could significantly increase the time taken by the automatic mapper, and so the user should be able to decide how deep to search.

A second significant change involves relaxing the property/slot data type matching constraint. Currently, slot mappings are only suggested between properties/slots of the same type (to avoid run-time errors). However, many property/slot types are compatible with each other; for example, an Integer can also be considered a Float (but not, of course, the reverse). Relaxing this constraint would increase the number of slots considered during the automatic mapping while still ensuring no run-time data incompatibility errors.

One relatively minor modification which should further improve the accuracy of the automatic mapping, particularly when searching for constituent mappings, would be the removal of surplus words from concept names. Words such as “has”, “is”, and “of” often feature in concept names to help the user understand their purpose. They carry relatively little meaning for the mapping algorithm, yet can dramatically affect the similarity rating of two concept names.
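The deeper WordNet search suggested above amounts to a depth-bounded breadth-first search over synonym links. The sketch below is illustrative only: a toy synonym table stands in for WordNet, and the function names are invented.

```python
from collections import deque

def synonyms_within(term, synonym_fn, max_depth):
    """Breadth-first search of the synonym graph, up to max_depth levels.
    `synonym_fn` stands in for a WordNet synonym lookup."""
    seen = {term}
    frontier = deque([(term, 0)])
    found = set()
    while frontier:
        word, depth = frontier.popleft()
        if depth == max_depth:
            continue
        for syn in synonym_fn(word):
            if syn not in seen:
                seen.add(syn)
                found.add(syn)
                frontier.append((syn, depth + 1))
    return found

# Toy stand-in for WordNet's synonym lists (illustrative only).
TOY_WORDNET = {
    "artwork": ["art", "graphics", "nontextual matter"],
    "art": ["artwork", "artistry", "fine art"],
}

lookup = lambda w: TOY_WORDNET.get(w, [])
level1 = synonyms_within("artwork", lookup, 1)   # direct synonyms only
level2 = synonyms_within("artwork", lookup, 2)   # "artistry" now reachable
```

With the toy table, “artistry” is only found when a second level of synonyms is searched, mirroring the “Artwork”/“Artistry” example; the depth bound caps the cost of the search.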
Consider the two concepts “Has-Author” and “Writer-Of”, both used to indicate the author of a document. Currently PJMappingTab would assign them a similarity rating of 0.50 (which,
depending on the threshold level, may mean a mapping is not suggested), but by removing “Has” and “Of” we increase this to 1.0, guaranteeing a mapping is suggested.
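The effect of removing surplus words can be sketched with a simple constituent-based similarity measure; the stop-word list, toy synonym table, and scoring function below are illustrative assumptions, not PJMappingTab’s actual code.

```python
STOP_WORDS = {"has", "is", "of"}
SYNONYMS = {frozenset({"author", "writer"})}   # toy WordNet stand-in

def constituents(name, drop_stop_words=False):
    """Split a concept name on '-' into lower-case constituent words.
    (Full constituent extraction would also split on '_' and case changes.)"""
    words = [w.lower() for w in name.split("-")]
    if drop_stop_words:
        words = [w for w in words if w not in STOP_WORDS]
    return words

def match(a, b):
    return a == b or frozenset({a, b}) in SYNONYMS

def similarity(name1, name2, drop_stop_words=False):
    """Fraction of the longer name's constituents matched in the other."""
    c1 = constituents(name1, drop_stop_words)
    c2 = constituents(name2, drop_stop_words)
    matched = sum(1 for a in c1 if any(match(a, b) for b in c2))
    return matched / max(len(c1), len(c2))

before = similarity("Has-Author", "Writer-Of")        # 0.5
after = similarity("Has-Author", "Writer-Of", True)   # 1.0
```

With the toy synonyms, “Has-Author” versus “Writer-Of” scores 0.5 (one of two constituents matched), rising to 1.0 once “has” and “of” are dropped.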
3.7 Summary
The experiments performed have provided favourable results, although there are several enhancements which could still be made. By automating part of the reuse process, PJMappingTab can be of considerable use to a developer wishing to reuse a set of JessTab rules with further ontologies. In fact, we claim to have gone some way towards implementing the vision, developed by John Park and Mark Musen [86], of creating ad hoc knowledge-based systems from existing problem solving methods (here, JessTab rule sets) and knowledge bases (here, Protégé OKBC Frames/OWL ontologies).
Chapter 4
Building Knowledge-Based Systems from Reusable Components

4.1 Overview
This chapter starts with a description of an overall KBS development approach, in which reusable knowledge components (domain ontologies and PSs) are first acquired from the Web, then configured to work together, and the resulting system is subsequently executed. This approach proposes using existing ontology search engines to support the acquisition of domain ontologies, and an initial system which supports Web-based searching for generic PSs (this system is discussed further in chapter 6).

Further details of the components used by the KBS development methodology developed in this thesis, particularly generic PSs, are then provided in section 4.3. A generic PS provides a KBS with the ability to reason with domain knowledge provided by a domain ontology, and consists of an ontology and a generic rule set. The PS ontology describes the structure of the PS’s required domain knowledge and the types of rules that the PS uses to reason over that domain knowledge, with the generic rule set providing an implementation of the generic parts of the PS’s reasoning process.

The proposed methodology for configuring generic PSs with domain ontologies is discussed in section 4.4. This methodology involves mapping suitable domain knowledge from the domain ontology to the PS, and then using that domain knowledge, along with the reasoning requirements of the generic PS, to acquire the rules that will enable the generic PS to reason with the domain knowledge.

Section 4.5 discusses two generic PSs that have been built: a propose-and-revise based PS and a diagnostic PS. Both generic PSs are based on the reasoning components of two existing KBSs; two domain ontologies were also extracted from these existing KBSs and are discussed in section 4.5.3. Two manual reuse experiments were performed in which KBSs were built by configuring the extracted components, to gain insights into the proposed methodology; these are discussed in section 4.6.
These experiments provided many insights into the configuration process and suggested many requirements for MAKTab, a tool which supports the user with this configuration task. Briefly, these included the identification of the variety of mapping types that were required, of possible issues with applying those mappings, and of techniques for guiding the user through the process of creating the rules required by a KBS, as well as insights into how executable KBSs can be produced. Finally, section 4.7 provides an outline of the derived KA technique that MAKTab uses to support the rule creation process; examples of the process’s use with the two generic PSs are also provided.
4.2 Introduction
The success of reuse-based approaches to knowledge based system (KBS) development relies on repositories populated with relevant components (both domain knowledge and problem solvers), and on usable tools which support the configuration of those components into a working system. As discussed in section 2.2, various projects developed their own, somewhat similar approaches to the task of reuse-based KBS development. Although each approach developed different, often incompatible, technologies, they generally agree that the reusable components should be descriptions of domain knowledge and of different types of reasoning (referred to as Problem Solving Methods (PSMs)). Various repositories of reusable components were developed: for example, the Ontolingua [29] ontology repository and the PSM repositories described in [10; 73; 74]. However, few are still available today, and, in the case of PSM repositories, it is questionable how helpful they would be when building executable KBSs, as they often relied on specific interpreters to handle execution, which are no longer available.

As the Semantic Web [8] movement continues, the importance of ontologies is growing and, correspondingly, technologies such as ontology search engines (for example, Swoogle [23]) and revised libraries/repositories (for example, OntoSearch2 [84]) are becoming available. Given the vast number of ontologies these search engines are processing (Swoogle is currently searching over 10,000 ontologies1), it is reasonable to believe that these search engines could make available on the Web (and ultimately, on the Semantic Web) a vast repository of domain ontologies which could be exploited by KBS developers. If we assume that ontology search engines can provide developers with appropriate domain ontologies, the remaining issues for reuse-based KBS development are acquiring problem solvers (PSs)/PSMs and configuring the components into executable systems.
This work has focused on both of these issues, with the main focus being the configuration of executable KBSs from reusable components. Two methodologies for building executable KBSs have been developed: the PJMappingTab methodology (discussed in chapter 3), which supports the reuse of relatively simple PSs with multiple ontologies; and the MAKTab (MApping and Knowledge acquisition Tab) methodology (discussed mainly in sections 4.4 and 4.7; the supporting tool, MAKTab, is discussed in chapter 5), which supports the development of a new KBS through the configuration of a generic PS (section 4.3.2) with domain ontologies. A technique and supporting tool, PS2R (the Problem Solver Search engine and Repository) (chapter 6), has also been developed for Web-based searching (and storing) of executable PSs.

When developing a new KBS from scratch, once the knowledge engineer has become familiar with the domain (for example, through interviews with domain experts or reading relevant texts), reusing existing components to build the new KBS can involve up to four laborious tasks: search (for components), evaluation and selection (of components), configuration (of selected components), and execution (of the final system). Previously, the IBROW3 project [50] attempted to perform all of these processes automatically, a task which is still not possible today. However, existing technologies can be used to greatly support the developer with each of these tasks. Figure 4.1 provides an overview of how a knowledge engineer could use the techniques and tools developed in this work, in conjunction with external tools, to support their activities when

1. Statistics obtained in January 2008.
building a new KBS. The process, along with the areas investigated in this work and the assumed capabilities of the user, is discussed below.
Figure 4.1: Using multiple tools to support the process of building a new KBS (a * indicates the step is optional).
1a If necessary, ontology search engines, repositories and/or directories are accessed to find potentially suitable domain ontologies. Ontology search is outside the scope of this work, but has received significant attention elsewhere (for example, [23; 84]). Current ontology search engines provide a range of query options (for example, keyword search and SPARQL queries [90]) and present results in various forms (for example, OntoSearch2 [84] provides the URI of each result, whereas Swoogle [23] provides links to the relevant ontology). To perform this step, the user must be capable of using the search engine’s query options, and may require enough understanding of OWL to be able to locate and download the relevant ontology.

1b If necessary, PS2R (the Problem Solver Search engine and Repository) can be used to search for relevant Problem Solvers (PSs), either from the Web or from the repository. PS2R is discussed further in section 6.3, and requires the user to be familiar with appropriate PS terms.

2 The evaluation and selection step involves the user analysing the suitability of the domain ontologies and PSs from steps 1a and 1b with respect to their abilities to solve the user’s task, with the most appropriate combination being selected. To support this process, the user
may make use of ontology viewing/editing tools to analyse the domain and PS ontologies; further, techniques such as those provided by the MUSKRAT-Advisor system [119] can be used to automatically select compatible domain ontology/PS combinations. The evaluation step requires that the user is familiar with ontologies and ontology viewing/editing tools; knowledge of reasoning techniques and Jess may also be helpful for the evaluation of PSs. Supporting/automating the evaluation and selection process is beyond the scope of this work.

Once the user has selected their domain ontology and PS, it is necessary to determine which tool should be used to aid the configuration of the new KBS. This is a relatively trivial step: for a PS to be configurable with MAKTab, the ontology associated with the PS must import the generic PS ontology discussed in section 4.3.2. Determining whether this is the case can easily be done by examining the declared imports of the PS’s ontology. If the PS imports the generic PS ontology, then it can potentially be used with MAKTab (section 5.2); if not, then configuration may be possible using PJMappingTab (chapter 3). Currently this evaluation must be performed manually, but it could easily be automated. This step assumes the user can determine the ontologies imported by the PS’s ontology.

3a If the user selects PJMappingTab, then it is used to configure the PS and domain ontology. PJMappingTab is discussed in chapter 3, and assumes the user is able to confirm and/or define mappings between the PS and the domain ontology. Once the user completes the configuration process, PJMappingTab generates the corresponding code for the new KBS, which is returned to the user.

3b Alternatively, if the user selects MAKTab, then it is used to configure the PS and domain ontology.
Using MAKTab assumes the user is able to confirm and/or define mappings between the domain ontology and the PS, and to provide any extra domain knowledge (descriptions of domain concepts and/or domain rules) that the PS requires. Once the user completes the configuration process, MAKTab generates the corresponding code for the new KBS, which is returned to the user.

4 After configuring the PS and the domain ontology, a KBS is generated from the domain ontology, the PS, and the code generated in step 3a or 3b, which the user can then execute using JessTab. This step assumes the user is able to use JessTab; as JessTab provides a suitable environment for executing KBSs, the task of building a KBS execution environment was not investigated in this work.

The main requirements placed on the user in this process relate to the use of ontologies (typically expressed in OWL) and of PSs written in JessTab. To further support the user in understanding the content of an ontology, natural language ontology description techniques (for example, [68]) could be used to provide a description of the knowledge contained in an ontology; this should make the ontology’s contents easier to understand. The PS selection process is supported by the ontology extraction feature and, for those PSs which are expressed using the generic PS ontology, by the various descriptions provided by the documentation/description properties of the generic PS ontology; the tag cloud of PS terms should also aid the initial search. Subsidiary assumptions
of the user’s capabilities relating to the use and installation of the relevant tools are addressed by the relevant documentation.
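The import check used when choosing between MAKTab and PJMappingTab (examining the declared owl:imports of the PS’s ontology) could be automated along the following lines. This is a minimal sketch for RDF/XML serialisations using only the Python standard library; the ontology URIs shown are placeholders, not the actual URIs.

```python
import xml.etree.ElementTree as ET
from io import StringIO

OWL_IMPORTS = "{http://www.w3.org/2002/07/owl#}imports"
RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"

def declared_imports(rdf_xml):
    """Return the URIs listed in owl:imports statements of an RDF/XML file."""
    tree = ET.parse(rdf_xml)
    return {el.get(RDF_RESOURCE) for el in tree.iter(OWL_IMPORTS)}

# Placeholder URI standing in for the generic PS ontology.
GENERIC_PS_URI = "http://example.org/generic-ps.owl"

sample = StringIO("""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Ontology rdf:about="http://example.org/pnr-ps.owl">
    <owl:imports rdf:resource="http://example.org/generic-ps.owl"/>
  </owl:Ontology>
</rdf:RDF>""")

uses_maktab = GENERIC_PS_URI in declared_imports(sample)
# True: the PS ontology imports the generic PS ontology, so MAKTab applies
```

If the generic PS ontology URI is absent from the declared imports, the tool would fall back to suggesting PJMappingTab.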
4.3 Reusable Knowledge Components
A KBS applies intelligent reasoning in a domain to solve problems that would otherwise require considerable human time, effort and expertise. To achieve this, a KBS typically requires significant domain knowledge coupled with an intelligent reasoning module. In this work, domain knowledge is provided by (instantiated) domain ontologies, with the reasoning being provided by a configured PS. As well as providing domain knowledge, ontologies have been used by the KBS community to describe various components: specifically, PSMs [19; 83], mappings between domain ontologies and PSMs [87], the tasks a KBS must achieve [83], and component repositories [83]. This section describes how ontologies are used in this work both as a source of a KBS’s domain knowledge and to provide the descriptions of generic PSs.
4.3.1 Problem Solvers
This work uses PSs to provide a KBS’s intelligent reasoning process. Typically, a PS provides a particular type of reasoning, for example configuration, diagnosis, assignment, or scheduling, which is combined with a domain ontology to solve a particular problem. Two types of PSs are referenced in this work: generic PSs and domain-instantiated PSs. The main difference is that a generic PS describes the structure of its required domain knowledge and of the rules it uses to reason with that domain knowledge, while a domain-instantiated PS is a generic PS that has been provided with the required domain and reasoning knowledge. Both play important roles in KBS development: the generic PS is used to define the types of knowledge a KBS will use, and the domain-instantiated PS is used to generate the (executable) reasoning component of the new KBS.
4.3.2 Generic Problem Solvers
A generic PS provides a domain-independent description of a particular type of reasoning. At a high level, there are various similarities between generic PSs and PSMs. After studying various projects, Tu and Musen [113] describe a PSM as “having (1) a knowledge-level characterization of the method’s functional competence and assumptions, (2) well-defined knowledge roles, (3) modular decomposition into subcomponents, and (4) a task control structure or explicit control knowledge”. A comparison of PSMs and generic PSs using these descriptors finds:

1. For PSMs, the knowledge-level characterizations are designed to aid the selection of the correct type of PSM; similarly, as discussed below, generic PSs use meta-information to provide the user with a description of the PS to aid selection.

2. PSM knowledge roles indicate the type of knowledge that should be defined; generic PSs have two distinct knowledge roles, domain and procedural, which define the types of domain and reasoning knowledge required for a specific PS.

3. The modular decomposition and task control structure of PSMs are designed to encourage reuse of a PSM’s subcomponents; the generic PS’s separation of domain and reasoning knowledge allows the domain knowledge to be independently reused with another PS,
Figure 4.2: The generic PS ontology.

reusing the reasoning knowledge with another PS is also possible, although this requires that the referenced domain concepts are also reused.

4. A further distinction between PSMs and generic PSs is that generic PSs include mechanisms to produce an executable system once they have been configured for a given set of domain knowledge, an aspect not generally included with PSMs (which, when execution was considered, usually left interpretation to external software). This is an important distinction as, as argued by Runcie [96], there is little value in investing resources in configuring a PSM which cannot subsequently be executed.
The Generic PS Ontology

Every generic PS is composed of two components: an implementation of the domain-independent PS code (for example, PS-RS(pnr, -)2) and an ontology which describes the PS and its requirements (for example, PS-ONT(pnr, -)). To be useful, a generic PS must be supplied with domain knowledge which the final KBS can use in the reasoning process. A simple PS ontology has been developed which can be used by developers to describe generic PSs; a visualisation of the main classes of this ontology (and their relationships) is shown in figure 4.2 (to improve the visualisation, properties with a string value are not included in the diagram). The purpose of this ontology is to describe the types of domain knowledge that PSs require and the domain rules that relate a generic PS to a particular domain.

The basic PS ontology includes classes for describing the domain knowledge components, the required rules, and meta-information about the PS and its required rules. The basic domain knowledge descriptions provided by this ontology consist of a PSConcept class, which has two subclasses: SystemComponent, for representing the different components from the domain that will be used by the KBS; and SystemVariable, which describes variables or parameters that will also be used by the KBS. These classes can be extended and configured by developers to describe the domain knowledge requirements of their PSs as required.

2. Please see page 6 for a description of this notation.
The PS ontology uses the SWRL formalism [49] to describe the types of domain rules a particular PS requires, by extending the relevant SWRL classes, particularly the swrl:Imp (rule), swrl:Atom (an antecedent or consequent), and swrl:AtomList (the list of antecedents or consequents) classes. Briefly, for every type of domain rule the PS requires, a subclass of swrl:Imp is created whose valid types of antecedents and consequents are restricted to those relevant to the intended purpose of the rule. This is achieved by placing appropriate constraints on the swrl:body (the rule antecedents) and swrl:head (the rule consequents) properties, restricting the swrl:AtomLists that can be used (for storing the lists of antecedents and consequents) to particular subclasses of swrl:AtomList. These subclasses of swrl:AtomList can, in turn, restrict the atoms they may contain to particular subclasses of swrl:Atom. The developer can define their own types (subclasses) of swrl:Atom which constrain the knowledge that can be expressed using that atom type, by placing constraints on the values that the properties associated with the atom type can take.

In this way the developer restricts both the values that can be selected as property values for the atoms (i.e. the type of knowledge each atom can express) and the types of antecedents and consequents that can be added to a particular rule type (by creating swrl:AtomLists that can only contain certain types of atoms, and by constraining the rule’s valid types of antecedents and consequents). This allows the developer to ensure that all rules of a given type contain only knowledge appropriate to that rule’s purpose. Examples of rule type definitions for propose-and-revise and diagnosis are provided in sections 4.5.1 and 4.5.2 respectively.

The final purpose of the PS ontology is to provide meta-information on the PS and its rule descriptions.
The meta-information is provided by individuals of the ProblemSolver and MetaClasses classes and their subclasses. An individual of the ProblemSolver class provides various pieces of information about the PS to both the user and the tools that use it. This includes a textual description of the PS and of how users can configure it for their domain (important if the PS is to be used by non-knowledge engineers); a list of the types of rules that should be acquired from the user first; and references to the generic PS code and to the transformation (rule generator) component that is responsible for generating the domain specific parts of the executable KBS.

Individuals of the MetaClasses class, and its subclasses RuleMetaClass, AtomMetaClass, and PropertyDisplayMetaClass, provide meta-information about the PS’s domain rules. Each individual of RuleMetaClass provides information about one particular type of rule; specifically, this includes a human-friendly name, human-friendly alternatives to the “IF” and “THEN” parts of the rule descriptions, the types of domain concepts the rule is applicable to (for example, SystemComponents and/or SystemVariables), and the other types of rules related to it. Each individual of AtomMetaClass provides a human-friendly name and description for a particular type of atom used by the PS’s rules. Finally, each individual of PropertyDisplayMetaClass provides a human-friendly name/label for a property when it is referred to in the domain of a particular class. All this meta-information can be used by tools to improve the support provided to users when they are defining rules for their domain.
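The effect of the rule-type restrictions described above can be illustrated with a simple model: a rule type fixes which atom types may appear in its antecedent and consequent lists, and a rule is well-formed only if every atom respects them. This plain-Python sketch models the idea only; it is not the OWL/SWRL encoding, and the type names are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Atom:
    atom_type: str            # e.g. "ConstraintViolationAtom", "FixAtom"

@dataclass
class RuleType:
    name: str
    allowed_body: set         # atom types permitted as antecedents
    allowed_head: set         # atom types permitted as consequents

    def validates(self, body, head):
        """A rule is well-formed only if every atom respects the constraints."""
        return (all(a.atom_type in self.allowed_body for a in body) and
                all(a.atom_type in self.allowed_head for a in head))

# A hypothetical "fix rule" type: constraint violations in the body,
# fix actions in the head (mirroring the swrl:Imp subclassing above).
fix_rule = RuleType("FixRule", {"ConstraintViolationAtom"}, {"FixAtom"})

ok = fix_rule.validates([Atom("ConstraintViolationAtom")], [Atom("FixAtom")])
bad = fix_rule.validates([Atom("FixAtom")], [Atom("FixAtom")])
# ok is True; bad is False (a fix action cannot appear as an antecedent)
```

A KA tool can use exactly this kind of check to offer the user only the atom types that are valid for the rule currently being defined.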
PS Implementation

Along with the above ontology, a generic PS also consists of an implementation of any generic parts of the final system. For example, the generic diagnostic rule set, PS-RS(diag, -), contains
code for dealing with user interaction and determining possible explanations for reported symptoms. The final KBS also contains an implementation of the rules defined by the user, i.e. the domain specific rules.
4.4 Proposed Methodology for Building KBSs Using Reusable Components
The completion of the PS and ontology extraction steps described in section 4.5 resulted in two sets of independent components that could be used by the KBS development methodology developed in this work and supported by MAKTab. This methodology involves the creation of a new KBS by configuring a generic PS with the necessary knowledge about a domain to enable the PS to work in that domain. Configuration is achieved in two stages: mapping and knowledge acquisition. Having selected a generic PS and a suitable domain ontology, relevant domain knowledge is mapped from the domain ontology to the generic PS, typically providing the PS with knowledge about concepts in the domain. A Knowledge Acquisition (KA) process then uses the requirements of the generic PS and the mapped domain concepts to guide the acquisition of the additional domain specific PS knowledge that a user thinks is necessary for the KBS to work in the domain.

This focused KA stage is unique to this approach and is, as shown in this work, critical to successful KBS development through reuse, as a domain ontology developed for one type of PS will not normally contain all the information required by another. For example, one would not expect a rule set (or domain ontology) designed for a configuration KBS to contain all the information required by a diagnostic KBS. The methodology is such that the user should be able to extract the domain ontology from an existing KBS, rapidly configure a further generic PS to work with it, and subsequently produce a new KBS.

Figure 4.3 illustrates the situation in which a diagnostic elevator ontology, ONT(elevator, [diag]) (extracted from the composite KBS), and a generic propose-and-revise PS, PS(pnr, -), are configured to work together, producing a new configuration KBS in the elevator domain, KBS(pnr, elevator). The sequence of steps for achieving this with MAKTab is: 1.
Split KBS(diag, elevator) into ONT(elevator, [diag]) and PS(diag, [elevator]) (this step is relatively easy in Prot´eg´e/JessTab implementations). 2. Map relevant domain knowledge (that is, domain knowledge that will be used by the final KBS) from ONT(elevator, [diag]) to PS-ONT(pnr, -)3 , to produce an initial PS-ONT(pnr, [elevator’])). 3. Use PS-ONT(pnr, [elevator’]) with the KA process to acquire propose-and-revise rules for the elevator domain, to produce an extended PS-ONT(pnr, [elevator]). 4. Generate PS-RS(pnr, [elevator]) from PS-ONT(pnr, [elevator])). 5. If any new domain concepts are introduced in step 3, add these to ONT(elevator, [diag]) to create ONT(elevator’, [diag, pnr]). 3
extracted from PS(pnr, -)
4.5. Creating Reusable Components
69
6. Combine PS(pnr, [elevator]) (which is composed of PS-RS(pnr, [elevator]) and PS-RS(pnr, -)) with ONT(elevator', [diag, pnr]) to create KBS(pnr, elevator).
Figure 4.3: Outline methodology and components for reusing the ONT(elevator, [diag]) from KBS(diag, elevator) with PS(pnr, -) to produce KBS(pnr, elevator).
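The six-step pipeline above can be sketched as a sequence of data transformations over the component notation (ONT(domain, [tasks]), PS(task, -), KBS(task, domain)). The following Python sketch is purely illustrative: the class and function names are assumptions made for this example, not MAKTab's actual implementation.

```python
# Illustrative sketch of the reuse methodology; all names are
# hypothetical and the structures are heavily simplified.
from dataclasses import dataclass, field

@dataclass
class Ontology:          # ONT(domain, [tasks])
    domain: str
    tasks: list

@dataclass
class ProblemSolver:     # PS(task, -), composed of PS-ONT and PS-RS
    task: str
    ps_ont: dict = field(default_factory=dict)  # mapped concepts + acquired rules
    ps_rs: str = "generic code"                 # domain-independent part

def split_kbs(task, domain):
    """Step 1: split an existing KBS into its ontology and PS."""
    return Ontology(domain, [task]), ProblemSolver(task)

def map_knowledge(ont, ps):
    """Step 2: map relevant domain concepts into the PS ontology."""
    ps.ps_ont["concepts"] = f"concepts from {ont.domain}"

def acquire_rules(ps, domain):
    """Step 3: KA guided by the PS's requirements and mapped concepts."""
    ps.ps_ont["rules"] = f"{ps.task} rules for {domain}"

def build_kbs(ps, ont):
    """Steps 4-6: generate the rule set and combine with the ontology."""
    ont.tasks.append(ps.task)
    return f"KBS({ps.task}, {ont.domain})"

ont, _ = split_kbs("diag", "elevator")   # ONT(elevator, [diag])
pnr = ProblemSolver("pnr")               # PS(pnr, -)
map_knowledge(ont, pnr)
acquire_rules(pnr, "elevator")
print(build_kbs(pnr, ont))               # KBS(pnr, elevator)
```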
4.5
Creating Reusable Components
This section provides details of how two existing KBSs dealing with elevator configuration and elevator diagnosis formed the basis of generic PSs for propose-and-revise configuration (section 4.5.1) and diagnosis (section 4.5.2) respectively. The two elevator domain ontologies used by the KBSs were also extracted; these are discussed in section 4.5.3.
4.5.1
Creating a Generic Propose-and-Revise Problem Solver
Propose-and-revise is a well-known problem solving technique for design/configuration. Three of the best known uses of propose-and-revise are SALT [64], its application in the VT domain [65], and the subsequent Sisyphus-II VT challenge [97]; the CommonKADS methodology [98, chap. 6] also describes a propose-and-revise PSM, which it then uses as the basis for two further PSMs (scheduling and planning). It was therefore desirable to develop a generic propose-and-revise PS for use in this work. The propose-and-revise method uses knowledge of components, their properties, the values these properties can have, constraints on these values, and fixes for violated constraints. These are used
to produce, if one exists, an acceptable combination of components. Briefly, its algorithm, as described in [39; 65], is as follows:
1. Attempt to produce a proposed design; if no design proposal is produced, then exit with failure.
2. Verify the proposed design; if it is acceptable, then exit with success.
3. If constraints are violated, systematically attempt to repair all the constraint violations with the sets of fixes provided.
To perform this successfully, the algorithm requires three types of domain specific knowledge/rules, which are used in its execution:
1. Configuration rules, which specify how a list of subcomponents can be combined to form a complete system.
2. Constraints, which specify restrictions between the various components of the configuration.
3. Sets of fixes, which should be applied to remedy particular violated constraints.
The Department of Computing Science at Aberdeen was provided with the Stanford Medical Informatics' (SMI) solution to the Sisyphus-II VT elevator configuration challenge (described in [95]), to support an earlier project. Briefly, the system, which is split over 11 files of CLIPS Object Oriented Language (COOL) [2] code, consists of several defclasses describing relevant concepts, associated definstances which describe the relevant instances, and several pieces of program code. A series of propose-and-revise related defclasses are defined (such as constraints, fix, stateVariable, upgrade-info, assign-constraint, and fix-constraint), along with a series of 79 defclasses describing various elevator components, parameters, and constraints. The propose-and-revise related classes have 942 associated instances describing the elevator propose-and-revise system (essentially, these are the result of executing mappings between the elevator domain ontology and the propose-and-revise method ontology, as described in [95]), with a further 674 instances describing the actual elevator components, parameters, and constraints.
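The three-step control loop described above can be sketched as follows. This is a minimal Python sketch, assuming simplified representations (a design as a dictionary, constraints as predicates, fixes as callables); it is not the COOL implementation.

```python
# Minimal sketch of the propose-and-revise control loop; the rule
# representations here are illustrative assumptions.
def propose_and_revise(propose, constraints, fixes, max_revisions=100):
    """propose() -> design dict or None; constraints: name -> test(design);
    fixes: constraint name -> list of repair functions, applied in order."""
    design = propose()                       # step 1: propose a design
    if design is None:
        return None                          # no proposal: exit with failure
    for _ in range(max_revisions):
        violated = [name for name, test in constraints.items()
                    if not test(design)]     # step 2: verify the proposal
        if not violated:
            return design                    # acceptable design: success
        for name in violated:                # step 3: apply the fixes
            for fix in fixes.get(name, []):
                fix(design)
    return None                              # could not repair: failure

# Toy elevator-style example: cab weight must not exceed motor capacity;
# the fix upgrades the motor capacity in steps of 100.
design0 = {"cab-weight": 600, "motor-capacity": 500}
constraints = {"weight-ok": lambda d: d["cab-weight"] <= d["motor-capacity"]}
fixes = {"weight-ok": [lambda d: d.__setitem__("motor-capacity",
                                               d["motor-capacity"] + 100)]}
result = propose_and_revise(lambda: dict(design0), constraints, fixes)
print(result)  # {'cab-weight': 600, 'motor-capacity': 600}
```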
When executed, the program first converts the propose-and-revise instances into corresponding rules: assign rules (which are effectively configuration rules), constraint rules, and fix rules. These rules are then combined with code which specifies some initial parameter values and component selections, and an implementation of the generic parts of the propose-and-revise algorithm. This produces an executable system which attempts to solve the elevator design problem. The generic propose-and-revise code is responsible for testing design proposals, determining, selecting, and applying relevant fixes, and concluding whether an acceptable design has been reached or cannot be reached.
Extracting the Generic Propose-and-Revise PS As it was desirable to have a generic propose-and-revise PS for this research project, it was decided to rework the above system to provide this PS, PS(pnr, -). This
would also determine if it was possible to reuse an existing KBS as the basis of a (new) generic PS. To facilitate working with knowledge represented in ontologies, it was decided to first convert the COOL based system into a Protégé/JessTab-based system. This intermediate system was then tested with a range of standard tests to ensure no errors were introduced during the conversion process. The PS(pnr, -) would then be created by extracting any generic parts of the reasoning knowledge (specifically code), and modelling the domain-dependent parts of the system using the generic PS ontology. It was proposed that the intermediary Protégé/JessTab elevator design system would follow the standard architecture for a Protégé/JessTab system and be composed of two components:
1. An (OWL) ontology which contains representations of the defclasses and definstances used by the CLIPS system.
2. A JessTab version of the CLIPS system's program code (consisting of the defrules and deffunctions).
This would provide a domain ontology of propose-and-revise related classes and elevator components which could be reused later, and an implementation of the propose-and-revise algorithm which would form the basis of the PS-RS(pnr, -). Analysis of the CLIPS system identified two potential obstacles for the development of the intermediary system:
1. An earlier version of Protégé had been used to generate the schema and instance files of the CLIPS system (probably Protégé/Win); due to the change in data model in Protégé-2000, modern versions of Protégé were unable to load these files. To ensure the new Protégé/JessTab system was an accurate reworking of the original, it was desirable to use the same data, to test whether the two systems produced identical/similar results.
2. The CLIPS system is written in COOL, and makes extensive use of the defmessage-handler COOL construct.
At the time this task was performed, JessTab did not support defmessage-handlers, meaning the code from the CLIPS system had to be altered extensively to work in the Protégé/JessTab environment. The following actions were taken to resolve these issues:
1. As mentioned previously, the CLIPS system implements the elevator configuration system described in [95]. As part of the PSM Librarian downloads [18], the researchers at Stanford have updated the ontologies from the propose-and-revise PSM experiments described in [95] to work with recent versions of Protégé, and have made them available. After comparing the relevant ontologies from the download (the elevator ontology and the (propose-and-revise) method ontology instantiated for the elevator domain), it was determined that, although there was a mismatch between file names, the content of the files (associated with the ontologies) was the same; this allowed the downloaded ontologies to be used in the new Protégé/JessTab system. Although these ontologies provided the correct data, JessTab required that they be available in a single ontology, so the PROMPT plug-in [81] was used to merge them.
2. To create a JessTab version of the code from the CLIPS system, all the defmessage-handlers were converted to deffunctions, and all calls to the defmessage-handlers were updated to call the corresponding function. Changes were also made where appropriate to reflect differences in the built-in functions between JessTab and CLIPS, and the difference in how code is loaded from files into JessTab.
3. The various ontologies included in the PSM Librarian download were expressed in the Protégé Frames formalism; one of the goals of this task was to build PS-ONT(pnr, -) expressed in OWL (to enable use with ontologies from search engines), and so the ontology used by the JessTab program was converted to OWL using Protégé's export facility. This resulted in a slight loss of information, relating to default values for some of the elevator component class descriptions, but this did not affect the Protégé/JessTab KBS that was developed.
Completing the above steps resulted in the CLIPS elevator configuration system being successfully converted into a Protégé/JessTab system. To assess the revised system, both it and the original (CLIPS) system were executed with identical initial component selections and parameter values. The systems produce a series of parameter/value pairings, detailing the various components and parameters of the final design proposal; these pairings were exported to separate lines in two text files, which were then sorted alphabetically. The contents of the two files were then compared using TortoiseMerge [61], which automatically examines two text files line by line and highlights any differences. This comparison found that both systems produced identical elevator configurations, which provided an initial confirmation that the Protégé/JessTab system was a satisfactory reworking of the original.
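The regression check described above (export the parameter/value pairings, sort them, and compare line by line) amounts to a few lines of code. The sketch below assumes a simple "parameter value" line format for illustration.

```python
# Sketch of the comparison procedure: both systems' exported
# parameter/value pairings are sorted and compared line by line.
def configs_match(pairs_a, pairs_b):
    """pairs_*: lists of 'parameter value' strings, one per pairing."""
    return sorted(pairs_a) == sorted(pairs_b)

clips_out = ["motor-model-name motor_10HP", "car-buffer-quantity 8"]
jesstab_out = ["car-buffer-quantity 8", "motor-model-name motor_10HP"]
print(configs_match(clips_out, jesstab_out))  # True: identical configurations
```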
Further testing of the Protégé/JessTab system was performed using alternative initial component selections, for which, as expected, the system failed to produce a valid final design for the same reasons (violated constraints that could not be successfully fixed) as the original CLIPS system. In theory, many valid elevator configurations could be produced by the Protégé/JessTab and CLIPS systems, which would make such a comparison very difficult; however, as noted by Runcie [96], due to an error in the encoding of the Sisyphus-II VT constraints in the original CLIPS system (which was not corrected in the downloaded PSMTab elevator ontology), only one elevator configuration is produced by these systems for the given domain knowledge. As a result of these tests, it was concluded that the Protégé/JessTab system was a satisfactory reworking of the original.
The Generic Propose-and-Revise PS Ontology: PS-ONT(pnr, -) The purpose of the generic propose-and-revise PS ontology, PS-ONT(pnr, -), is to provide a description of the types of knowledge that allow the PS to work within a subset of a domain. As such, PS-ONT(pnr, -) was required to describe the domain specific rules used by the intermediate elevator design system, along with (at a general level) the types of domain concepts that were used. The code from the intermediate system consists of two main components: the domain-independent code (which provides the propose-and-revise algorithm, and was extracted to produce PS-RS(pnr, -)) and the domain-dependent code (which is generated from the 942 elevator propose-and-revise related instances) that was to be modeled in PS-ONT(pnr, -). The main challenge in creating PS-ONT(pnr, -) was creating descriptions of the domain-dependent rules in sufficient
detail so that they would meet the requirements of a KBS, while being constrained enough to provide useful support when acquiring the rules from the user during development of a new KBS. An initial version of PS-ONT(pnr, -) was developed and used in the manual experiment discussed in section 4.6.1, after which PS-ONT(pnr, -) was updated to reflect lessons learnt from the experiment. The remainder of this section discusses the domain (concept) knowledge required by the PS, and the types of domain-dependent propose-and-revise rules used by the PS, including how they have been modeled in PS-ONT(pnr, -). Domain Knowledge Required by PS(pnr, -) The PS-ONT(pnr, -) places few requirements on the form of the domain knowledge that it requires: it does not extend the PSConcept classes provided by the generic PS ontology. This is because the SystemComponent class can be used to provide details of the different domain components that are used in the KBS, either by creating associated individuals or, preferably, by extending it with appropriate domain classes from a domain ontology; the SystemVariable class can also be used to provide details (in terms of name and textual description) of any variables/parameters that are used throughout the design process. Initial and Output Values The first component of the domain-specific code used by the intermediate system is a function which asserts the initial selection of domain components and initial parameter values, along with a list of the components and parameters that should be displayed upon the KBS's successful completion. Typical inputs for the elevator domain specify which motor to start the design with and how many car buffers should initially be used; example outputs are the motor and number of car buffers used in the final design. To acquire this information using the PS-ONT(pnr, -), it was necessary to represent this function as a series of four rule types: two for defining initial values and two for selecting desired outputs.
The initial value rules are: InitialComponentSelectionRule, visualised in figure 4.4, which allows the user to select a particular individual of a SystemComponent (for example, a particular motor) to be used at the start of the design process; and the InitialSystemVariablesValue rule, visualised in figure 4.5, which allows the user to specify the value (either as a datatype value or another individual from the ontology) of either a property of a component or a parameter/variable (for example, setting the car-buffer-quantity to be 0).

Figure 4.4: Visualisation of the PS-ONT(pnr, -) rule classes for specifying initial selection of components.

There are two similar rules for specifying output values: OutputCalculatedComponentRule and OutputCalculatedSystemVariablesRule. The OutputCalculatedComponentRule, visualised in figure 4.6, allows the user to define which components should be displayed when the PS successfully completes the design process (for example, the selected motor and door model). The OutputCalculatedSystemVariablesRule, visualised in figure 4.7, allows the user to define which variable/parameter values should be displayed (for example, the total weight of the cab).

Assignment Rules The first set of rules generated by the intermediate elevator configuration system are referred to as “assign-constraints”, although the term constraint is misleading: they are essentially assignment rules. The purpose of these rules is to assign a value, either directly or by performing some calculation, to a variable or a property of a component. The majority of these rules are generated from the domain model and simply update the property values of the components that the PS selects as it runs. The propose-and-revise PS maintains a list of the design states (configurations) that it proposes as it executes; for each proposed design state the PS maintains a series of feature/value pairings in Jess's working memory, where a feature is a component, a component's property, or a variable. For example, in the elevator domain, a design state will be associated with the following pairings if the motor 10HP has been selected (described using [ ] format, so the pairing [motor-model-name motor 10HP] means that the motor-model-name variable has the value motor 10HP): [motor-model-name motor 10HP], [motor-horsepower 10], [motor-max-current 150], [motor-weight 374.0]. Many of the assignment rules are concerned with maintaining the link between a proposed design state and the feature/value pairings. For example, if the next proposed design state uses the motor 20HP motor, the pairings will be updated to: [motor-model-name motor 20HP], [motor-horsepower 20], [motor-max-current 260], [motor-weight 539.0]. Similarly, if the component selection does not change between proposed design states, the assignment rules ensure that the newly proposed design has the same feature/value pairings as the previous design state. A rule of this type is created for every property value of each individual of every elevator component in the elevator ontology. The other assign-constraint rules specify the value of a variable or a property of a component as either a predetermined literal (string or numeric) value, an individual from the domain model, or the result of performing a calculation. For example, the total cab weight is calculated as “the sum of the car cab weight, the platform weight, the sling weight, the safety beam weight, the car fixture weight, the car supplement weight and the total weight of miscellaneous car components” [123, pg. 25].
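The feature/value pairing mechanism described above can be sketched as follows. The representation of a design state as a dictionary, and the motor data, are assumptions made for illustration only.

```python
# Sketch of how generated assignment rules keep a design state's
# feature/value pairings in step with the selected component.
MOTORS = {  # property values per motor individual, as in the domain model
    "motor_10HP": {"motor-horsepower": 10, "motor-max-current": 150,
                   "motor-weight": 374.0},
    "motor_20HP": {"motor-horsepower": 20, "motor-max-current": 260,
                   "motor-weight": 539.0},
}

def propose_state(previous, motor):
    """Copy the previous state's pairings forward, then apply the
    assignment rules for the newly selected motor's properties."""
    state = dict(previous)            # carry unchanged pairings forward
    state["motor-model-name"] = motor
    state.update(MOTORS[motor])       # one generated assignment per property
    return state

s1 = propose_state({}, "motor_10HP")
s2 = propose_state(s1, "motor_20HP")
print(s2["motor-weight"])  # 539.0
```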
Figure 4.8 provides a visualisation of the assignment rule in PS-ONT(pnr, -). As it is possible to automatically generate the first set of assignment rules from the domain model (those maintaining component property values for a design state), it is not necessary for the user to define them manually. The PS-ONT(pnr, -) SystemVariableValueAssignmentRule allows the user to define rules of the form “IF conditions ARE PRESENT THEN assignments”, where conditions can be a list of component selections, component property value tests and/or variable value tests; similarly, the assignments can be a list of component selections, component property value assignments, or variable value assignments. An example assignment rule is provided in figure 4.17.

Figure 4.5: Visualisation of the PS-ONT(pnr, -) rule classes for specifying initial values of component properties and variables.

Constraint Rules The next set of rules generated by the intermediate elevator configuration system are referred to as “fix constraints”, that is, constraints which have a set of fixes associated with them. These rules specify a constraint on the value of a variable or a property of a component. A constraint essentially consists of an expression which tests (during the propose-and-revise cycle) whether the value of a variable and/or a property of a component in the current design state violates a constraint related to it. For example, in the elevator domain, if the total cab weight is greater than the current motor's maximum supported weight, then there is a violated constraint. As the execution of the propose-and-revise algorithm progresses, constraint rules are activated if the constraint expression(s) (in the antecedent) is satisfied by the feature/value pairings associated with the current design state. If a constraint rule is activated, it asserts a suitable violation statement into Jess's working memory, for the generic propose-and-revise code to try to fix in the next proposed design state. This type of rule is modeled as the SystemVariableConstraintRule in PS-ONT(pnr, -); a visualisation is provided in figure 4.9. Essentially this rule type contains a list of constraint expressions as its antecedents, and a list of constraint violations (each consisting of a violated constraint name) as its consequents. An example of a constraint rule is provided in figure 4.17.
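A constraint rule of this shape (expression antecedents, violation-name consequents) might be sketched as below. The dictionary-based rule representation is an illustrative assumption, not the PS-ONT(pnr, -) encoding.

```python
# Sketch of SystemVariableConstraintRule evaluation: antecedent
# expressions are tested against the current state's pairings, and
# when they all hold the named violations are asserted for fixing.
def check_constraints(state, constraint_rules):
    violations = []
    for rule in constraint_rules:
        if all(expr(state) for expr in rule["antecedents"]):
            violations.extend(rule["consequents"])  # violated constraint names
    return violations

rules = [{
    "antecedents": [lambda s: s["cab-weight"] > s["motor-supported-weight"]],
    "consequents": ["cab-weight-exceeds-motor-capacity"],
}]
state = {"cab-weight": 700, "motor-supported-weight": 650}
print(check_constraints(state, rules))  # ['cab-weight-exceeds-motor-capacity']
```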
Fix Rules Along with constraint rules, the intermediate elevator configuration system also generates a set of fix rules, which specify an action that should be taken to fix a violated constraint. Fix rules are modeled in PS-ONT(pnr, -)’s FixRule class; a visualisation is provided in figure 4.10. Generally, a fix action is either the selection of a different component (based on some criteria) or a value assignment (for example, change the value of a variable). There are six different types of
fixes that can be applied (these were derived from the fixes used by the intermediate system, but can be extended if necessary):
1. increase, which specifies that the (numeric) value of the specified variable or property of a component should be increased by a set delta.
2. decrease, which similarly specifies that the value should be decreased by a set delta.
3. assign, which assigns a particular value, either a predetermined literal or individual, or the result of a calculation, to a variable or component property, in a similar manner to the assignment rules.
4. upgrade, which specifies an upgrade for a component (for example, to upgrade from the motor 40HP to the motor 60HP); this is essentially a specialisation of the assign fix type.
5. assign variable or component property value, which assigns the value of a system variable or a property of a selected component.
6. property value based fix, which allows the user to specify that the fix (the new value that will be assigned or the alternative component that should be selected) should be determined at run-time based on the values of other components in the design.
As an example of the final fix type, if the total cab weight is greater than the motor supported weight, then a new motor can be selected with a maximum supported weight greater than that of the motor selected at that point; without this more general type of fix it could be necessary to specify separate upgrade rules for every type of motor: i.e. one rule for upgrading from motor 40HP to motor 60HP, one for upgrading from motor 60HP to motor 80HP, one for upgrading from motor 80HP to motor 100HP, and so on. This is how both the original CLIPS system and the intermediate Protégé/JessTab system work. Every fix also has a desirability, that is, a numeric value (between 1 and 10) which indicates how desirable a particular fix is, where lower numbers indicate higher desirability. The meanings associated with desirability values are application-dependent; for the elevator application, they reflect how complex the fix is to apply: “1 indicates it is no problem, 2 that it increases maintenance requirements, 3 makes installation inconvenient” and so on, through to “9 which changes the building dimensions and 10 which changes major contract specifications” [123, pg. 31].

Figure 4.6: Visualisation of the PS-ONT(pnr, -) rule classes for specifying which components should be displayed when the design process completes successfully.

Figure 4.7: Visualisation of the PS-ONT(pnr, -) rule classes for specifying which variable values should be displayed when the design process completes successfully.

Figure 4.8: Visualisation of the PS-ONT(pnr, -) classes for the assignment rule, SystemVariableValueAssignmentRule.

Dependency Rules The final set of rules generated by the intermediate system are dependency rules. The purpose of these rules is to ensure that when a fix is applied and a value is changed (typically through the selection of an alternative component), all values that depend on the altered value are recalculated. These rules are automatically generated based on the assignment expressions used in assignment and fix rules (for example, when the platform weight changes this affects the total cab weight, and so it must be recalculated). The dependencies between variables/parameters in the rules generated by the intermediate system are explicitly stated in the instances of the downloaded elevator ontology; however, as it is possible to determine them automatically based on the assignment and fix rules, it is not necessary to model dependency rules in PS-ONT(pnr, -).
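The desirability ordering described above suggests a simple selection strategy: try fixes from most desirable (1) to least desirable (10). The sketch below assumes fixes are represented as (desirability, description) pairs; this is illustrative only.

```python
# Sketch: order the fixes for a violated constraint by desirability
# (1 = most desirable, 10 = least); the representation is assumed.
def order_fixes(fixes_for_constraint):
    """fixes_for_constraint: list of (desirability, fix_description)."""
    return [fix for _, fix in sorted(fixes_for_constraint)]

fixes = [(3, "upgrade motor"), (1, "increase counterweight"),
         (9, "change building dimensions")]
print(order_fixes(fixes))
# ['increase counterweight', 'upgrade motor', 'change building dimensions']
```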
Figure 4.9: Visualisation of the PS-ONT(pnr, -) classes for the constraint rule, SystemVariableConstraintRule.
Figure 4.10: Visualisation of the PS-ONT(pnr, -) classes for the fix rule, FixRule.
Rule Relationships in PS(pnr, -) The relationships of the rules defined in PS-ONT(pnr, -) are shown in figure 4.11. There are five rules with which acquisition can start: any of the input and output rules, or the assignment rule. As the input and output rules do not directly interact with any other rules, they can be acquired independently, in any order. The assignment rule, on the other hand, is related to the constraint rule, as the assignment rule defines how to assign a value to a variable or a component's property, and a constraint rule can restrict the valid values of that variable or component's property. The constraint rule, in turn, is related to the fix rule, since fix rules define what to do if the constraint is violated. So, the acquisition of an assignment rule should be followed by the acquisition of a (set of) constraint rule(s), which in turn should be followed by the acquisition of a (set of) fix rule(s).
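The acquisition ordering implied by these relationships can be sketched as a simple traversal; the mapping used here is an assumption for illustration, not MAKTab's internal representation.

```python
# Sketch of the KA ordering: input/output rules are independent,
# while each assignment rule leads to constraint rules, then fixes.
FOLLOWS = {"assignment": "constraint", "constraint": "fix"}

def acquisition_sequence(start):
    """Return the chain of rule types acquired after starting with `start`."""
    seq, current = [start], start
    while current in FOLLOWS:
        current = FOLLOWS[current]
        seq.append(current)
    return seq

print(acquisition_sequence("assignment"))
# ['assignment', 'constraint', 'fix']
print(acquisition_sequence("initial-component-selection"))
# ['initial-component-selection']  (independent: no follow-on rules)
```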
The Generic Propose-and-Revise Code: PS-RS(pnr, -) The PS-RS(pnr, -) provides the domain-independent propose-and-revise functionality used by the Protégé/JessTab elevator configuration system. This mainly consists of the propose-and-revise algorithm (which deals with checking design states, proposing new designs, and selecting and applying fixes), and supplementary functions for dealing with list processing.
Figure 4.11: Illustration of the rule relationships in the generic propose-and-revise problem solver, PS(pnr, -).
4.5.2
Creating a Generic Diagnostic Problem Solver
Diagnosis is another well-known PSM, which suggests possible malfunctions of a system based on a set of reported observed symptoms combined with knowledge of the artefact/domain. Diagnosis can be considered similar to other PSMs such as classification and assessment, which typically also use knowledge of the state of an artefact to produce very similar outputs: in traditional classification it is typical to assign the artefact to a class; in assessment it is typical to assign the artefact to a given category [98]. It was therefore desirable to have a generic diagnostic PS for this work: if it could be used successfully, it would provide an indication that similar PSMs could also be used. As part of previous work, performed by Derek Sleeman and Nial Chapman at the Department of Computing Science at the University of Aberdeen, a diagnostic KBS for elevator malfunctions was created (written in CLIPS), based on an interview with an elevator engineer. Ideally this system would have been reworked to produce a generic diagnostic PS, in a similar way to PS(pnr, -); however, analysis of the system determined that, due to the way the system was implemented, this was not possible, as discussed below. Briefly, the original system groups faults into six categories (lift stopping, noise, doors, speed/quality of ride, buttons, and smell), each of which is associated with a set of related symptoms (the lift stopping category is associated with three faults, noise has two, doors has four, ride has four, buttons has two, and smell has three); each symptom is then associated with a question for determining the symptom's presence or absence. At run-time the user is asked to select the category of the problem they are reporting, and the system then asks all of the questions related to that category; if only one of the symptoms is present, then a diagnosis is made, otherwise no diagnosis is made.
The system does not use any generic diagnostic code (such as generic functions for matching observed symptoms to known faults), and uses a very simple lift deftemplate as its domain model. The lift deftemplate consists of 15 slots, one for every symptom encoded in the system, which are used to indicate whether or not that symptom was observed by the user; the system's domain knowledge consisted of one fact of type lift. Figure 4.12a illustrates how this system works and where it stores the elevator diagnostic information (related to faults in category a; similar rules exist for category b and c faults). Knowledge of the different types (categories) of faults that occur is stored in the first rule that is fired (furthest left in figure 4.12a). Once the user selects a group, the user is then asked all the questions related to that group (so they are asked about faults a1, a2, a3). Each of these question rules
(shown as the middle set of rules in figure 4.12a) contain information regarding the symptoms of a particular fault. Once all the relevant questions have been asked, if possible, a diagnosis is made by one of the rules shown on the right of figure 4.12a, which contain the diagnostic knowledge (no knowledge of repairs is stored).

Figure 4.12: Illustration of the original KBS(diag, elevator) and the desired, more generic, KBS(diag, elevator).

It should be clear from the above description that this system was very specific to the set of faults that it had originally been developed for, and so did not provide any implemented components that could be reused directly. However, it was possible to abstract the reasoning process that it used to produce the following general diagnostic process:
1. Ask the user to provide a value/description of the state that a component of the artefact is in; for example, the current state of the doors.
2. If this allows a diagnosis to be determined, then it is suggested to the user.
3. Otherwise, ask the user to provide a value/state description for another component of the artefact.
4. Repeat steps 1 to 3 until either an acceptable diagnosis has been made or no diagnosis can be made.
Figure 4.12b illustrates how a KBS using this more general approach to diagnosis may work. The knowledge regarding faults and repairs is stored as a set of facts (the various circles on the left hand side of figure 4.12b). This allows the rules (the various squares on the right hand side of figure 4.12b) to be written in a generic way (i.e. using the structure of the fault and repair facts), without reference to any domain specific faults or repairs (which are stored in the facts). This process is similar to the processes described in the literature on diagnosis, where the task is generally split into two stages: hypothesis/candidate generation and hypothesis/candidate
selection [5; 10]. During candidate generation a set of potential causes for the observed faults is determined, based on the reported observations; the candidate selection process then attempts to determine the most likely explanation for the faults, which is then displayed to the user. The generic diagnostic PS, PS(diag, -), that was developed adopts a similar two-stage process, and is loosely based on the diagnosis PSM described in [98, chap. 6], which uses “a simple causal model in which symptoms and potential faults are placed in a causal network, and in which internal system states act as intermediate nodes” [98, pg. 138] to represent and store system malfunctions, their causes, and potential repairs. Briefly, the PS(diag, -) uses two types of rules to define the domain specific diagnostic knowledge used by a KBS. These rules link faults associated with a component to their causes and/or possible repairs, and are used by the generic PS code to determine and select causes and repairs for a set of observed symptoms reported by the user. The remainder of this section discusses PS-ONT(diag, -), the required domain knowledge and the rules it defines, their relationships, and finally the generic diagnosis control code, PS-RS(diag, -).
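The two-stage process (candidate generation, then candidate selection) can be sketched as follows, treating cause rules as data. The rule representation and the selection criterion (prefer the candidate explaining the most observations) are assumptions made for illustration, not the actual PS-RS(diag, -) code.

```python
# Sketch of the two-stage diagnostic process: generate candidate
# causes whose symptoms match the reported observations, then select
# the candidate that explains the most observations.
def generate_candidates(observations, cause_rules):
    """cause_rules: list of {'symptoms': set, 'cause': str} dicts."""
    return [r for r in cause_rules if r["symptoms"] <= observations]

def select_candidate(candidates):
    if not candidates:
        return None  # no diagnosis can be made
    return max(candidates, key=lambda r: len(r["symptoms"]))["cause"]

rules = [
    {"symptoms": {"doors-fail-to-open"}, "cause": "door motor broken"},
    {"symptoms": {"doors-fail-to-open", "low-voltage"},
     "cause": "door motor under-powered"},
]
observed = {"doors-fail-to-open", "low-voltage"}
print(select_candidate(generate_candidates(observed, rules)))
# door motor under-powered
```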
The Generic Diagnostic Problem Solver Ontology: PS-ONT(diag, -) The generic diagnostic PS ontology, PS-ONT(diag, -), specifies the structure of two diagnostic rules (CauseRule and RepairRule) which can be used to provide the domain specific diagnostic reasoning knowledge required by PS(diag, -), along with meta-information about the PS and a description of the required domain knowledge. Domain Knowledge Required by PS(diag, -) The PS-ONT(diag, -) places few requirements on the form of the domain knowledge that it requires; it does not extend the PSConcept classes provided by the generic PS ontology (section 4.3.2). The PS-ONT(diag, -) uses the SystemComponent class for describing components relevant to the domain that diagnosis will be performed in; the only restriction is that every individual must have a name. There are several options for representing domain knowledge: individuals of SystemComponent can be used to represent either a class of components (for example, all motors) or different types of each component (for example, the different motors that are available); subclasses of SystemComponent can be added to provide a better representation of a class of components, which may have relevant individuals associated with them; alternatively, the definition of the SystemComponent class can be extended with new properties to better describe the components. The SystemVariable class can also be used for defining variables/parameters relevant to the domain. Cause Rules The purpose of a cause rule is to specify that a malfunctioning behaviour being exhibited by one or several part(s) of the artefact can be caused by other part(s) of the artefact exhibiting certain (typically, but not necessarily, malfunctioning) behaviour(s).
For example, a rule from the original elevator diagnosis KBS specifies that the doors failing to open can be caused by the door motor being broken; another rule states that it can also be caused by the door motor receiving less voltage than it requires to function. This type of rule is modeled as the CauseRule class in PS-ONT(diag, -); a visualisation of it is provided in figure 4.13. A cause rule uses a list of observations or states related to components as its antecedents and a corresponding set of component states as its consequents. This enables the expression of rules such as those mentioned above concerning the doors failing to open. Each atom of such a rule refers to the component that it is describing and provides a description of the associated observation/state.
4.5. Creating Reusable Components
82
The component can be a SystemComponent (for example, the doors), a property of a SystemComponent (for example, the required voltage of the motor), or a variable/parameter (a SystemVariable individual). Depending on the atom type, the value can be: a literal value, used to provide a natural language description of the state (such as “off” or “on”) or a numeric value (such as 10, representing the motor’s required voltage); another system component; or an equation specifying a more complex conditional value, for example the motor’s supplied voltage being less than 10 (volts).
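The atom and cause rule structure just described can be sketched as follows (the Atom and field names are a hypothetical Python rendering, not the OWL property names in PS-ONT(diag, -); the example rule is the door/motor rule from the original elevator KBS):

```python
from dataclasses import dataclass
from typing import Union

# An atom names the component (or component property, or variable) it
# describes and carries a value: a literal ("off", 10), another component,
# or a condition such as "< 10" for the supplied voltage.
@dataclass
class Atom:
    component: str                    # e.g. "doors", "motor.required-voltage"
    value: Union[str, int, float]     # e.g. "fail to open", 10, "< 10"

# A cause rule: observed component states (antecedents) are caused by
# other component states (consequents).
@dataclass
class CauseRule:
    antecedents: list
    consequents: list

doors_rule = CauseRule(
    antecedents=[Atom("doors", "fail to open")],
    consequents=[Atom("door motor", "broken")])
```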
Figure 4.13: Visualisation of the PS-ONT(diag, -) classes related to the cause rule, CauseRule. Repair Rules The purpose of a repair rule is to provide a (set of) suggested action(s) that should be taken to repair a malfunctioning system. As with cause rules, a repair rule uses a list of component states as its antecedents, and a list of natural language repair instructions as its consequents. This type of rule is modeled in PS-ONT(diag, -) as the RepairRule class; a visualisation of it is provided in figure 4.14.
Figure 4.14: Visualisation of the PS-ONT(diag, -) classes related to the repair rule, RepairRule.
These two rule types effectively build a graph of (malfunctioning) states/behaviours relating to an artefact, the possible causes of the malfunctioning behaviours, and, where appropriate, a list of possible repairs. An example of this, taken from the original elevator diagnosis system, is provided in figure 4.15. In this figure the initial malfunctioning behaviours are shown on the left; they are linked (by solid arrows) to states of other components that could be causing the malfunction, which in turn are linked (by dotted arrows) to suggested repairs. For example, “Elevator rough ride at top speed” can be caused by “Rail alignment being poor”, which can be repaired by “Re-aligning the rails”; or it can be caused by “Motor control requires adjustment”, which can be repaired by “Adjusting the motor control”.
Figure 4.15: An example graph of malfunctioning elevator behaviours, their causes and repairs.
Rule Relationships in PS(diag, -) The relationships between the two rule types in PS(diag, -) are shown in figure 4.16. Initially, either a cause rule or a repair rule can be acquired, depending on the situation: if the malfunction being described is not caused by another component, then a repair rule should be defined first; otherwise a cause rule should be defined. The cause rule type is related to itself and to the repair rule type (allowing sequences of cause rules such as “malfunction X can be caused by malfunction Y, which can be caused by malfunction Z, which can be repaired by action A”). A repair rule indicates a suggested action which, when performed, should resolve the malfunction; it is therefore not related to any other rule type.
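These relationships amount to a small piece of adjacency meta-information, of the kind PS-ONT(diag, -) records for the KA stage. A minimal sketch (the dictionary encoding and function name are illustrative, not the ontology's representation):

```python
# The rule relationships of figure 4.16 as adjacency meta-information:
# a cause rule may be followed by another cause rule or a repair rule;
# a repair rule ends the chain.
RULE_RELATIONSHIPS = {
    "CauseRule":  ["CauseRule", "RepairRule"],
    "RepairRule": [],
}

def next_rule_types(rule_type):
    """Rule types that may be acquired after a rule of the given type."""
    return RULE_RELATIONSHIPS[rule_type]
```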
Figure 4.16: Illustration of the rule relationships in the generic diagnosis problem solver, PS(diag, -).
The Generic Diagnosis Code: PS-RS(diag, -) The current PS(diag, -) provides a JessTab implementation of the generic diagnostic code, PS-RS(diag, -), which performs the generic functionalities of the PS. The rules defined by the user are converted into a series of state and repair nodes. Every state node represents a cause rule, and so knows the component states associated with it, the other malfunctions that can cause it, and any related repairs; initially all the component states are unsatisfied (i.e. they have not been observed by the user/entered into the system). As observations are entered by the user, the candidate generation rules of the generic code determine if any of the unsatisfied component states (in the state nodes) are now satisfied; if so, the relevant state nodes are updated to reflect this. If a state node becomes fully satisfied, then the associated repairs are displayed to the user; alternatively, any associated state nodes (that could be causing the satisfied state) are investigated. This algorithm is a significant enhancement of the diagnostic algorithm described in [98, chap. 6] (which formed the initial conceptual design for PS-RS(diag, -)): it directly references the domain knowledge requirements (described in PS-ONT(diag, -)), describes how the algorithm interacts with and uses the domain knowledge, considers the provision of repair advice, and provides sufficient detail that it should be directly implementable by a developer.
1. Ask the user to enter at least one initial observation, and the component/variable that the observation(s) relate(s) to.
2. If no state node has been fully satisfied by the initial set of observations (i.e. not all of any node’s component states have yet been satisfied by the reported observations):
2.1 Sort the state nodes using some metric (such as increasing number of unsatisfied component states).
2.2 Ask the user to provide a description of the state of a component/variable associated with the first state node in the sorted list.
2.3 Repeat until a state node is fully satisfied or all nodes are known not to be satisfied by the current observations (i.e. the nodes’ component states are not satisfied by the reported observations).
3. If all nodes are known not to be satisfied by the current observations:
3.1 Exit with failure (as a diagnosis cannot be determined).
4. Let s be the first satisfied state node.
5. If there are repairs associated with s, then display them. If the repairs do not solve the problem, then diagnosis continues.
6. If s can be caused by other state nodes:
6.1 If any of those state nodes are fully satisfied, then set one of them to be s and return to step 5.
6.2 Else, sort the list of nodes that s can be caused by, and ask the user to provide a description of the state of a component/variable associated with the first state node in the list.
6.3 Repeat step 6.2 until a node is satisfied or all of the nodes in the list are known not to be satisfied by the current observations.
7. If any state node that can cause s is now fully satisfied:
7.1 Set s to be (one of) the satisfied node(s), and return to step 5.
8. Else, none of the known causes of s is causing it, so attempt to determine whether there are any other malfunctions and attempt to diagnose them:
8.1 As observations have been entered, the other state nodes in the graph have been automatically updated; if any other state nodes have now become satisfied, set s to be one of them, and return to step 5.
8.2 Else, if any of them are not yet known to be satisfied or unsatisfied, sort the list of remaining nodes, and ask the user to provide a description of the state of a component/variable associated with the first state node in the list.
8.3 If a node has been satisfied, then set it to be s and return to step 5.
8.4 Else, if all nodes are known not to be satisfied by the current observations, then exit with failure as a diagnosis cannot be determined.
8.5 Else, return to step 8.2.
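The core of the algorithm (steps 2–7) can be sketched non-interactively: observations arrive as a set of (component, state) pairs instead of user prompts, and the sorting metric is the number of unsatisfied component states. The StateNode fields and the diagnose function are illustrative stand-ins for the JessTab constructs, not the PS-RS(diag, -) implementation itself:

```python
from dataclasses import dataclass, field

@dataclass
class StateNode:
    name: str
    states: set                                  # component states that must be observed
    causes: list = field(default_factory=list)   # names of causing StateNodes
    repairs: list = field(default_factory=list)  # repair advice strings

def diagnose(nodes, observations):
    """Return (diagnosis chain, repairs), or (None, None) on failure."""
    by_name = {n.name: n for n in nodes}
    satisfied = lambda n: n.states <= observations     # all states observed?
    # Steps 2-4: find an initially satisfied node, sorting by unsatisfied count.
    candidates = sorted(nodes, key=lambda n: len(n.states - observations))
    s = next((n for n in candidates if satisfied(n)), None)
    if s is None:
        return None, None                              # step 3.1: failure
    chain = [s.name]
    # Steps 5-7: follow satisfied causes down the causal network.
    while True:
        deeper = next((by_name[c] for c in s.causes
                       if satisfied(by_name[c])), None)
        if deeper is None:
            return chain, s.repairs
        s = deeper
        chain.append(s.name)

nodes = [
    StateNode("doors fail to open", {("doors", "stay closed")},
              causes=["door motor broken"]),
    StateNode("door motor broken", {("door motor", "no movement")},
              repairs=["replace the door motor"]),
]
chain, repairs = diagnose(
    nodes, {("doors", "stay closed"), ("door motor", "no movement")})
```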
4.5.3
Extracted Domain Ontologies
As well as using the propose-and-revise and diagnostic KBSs to create generic PSs, it was also possible to extract the domain models (ontologies) used by the KBSs: ONT(elevator, [pnr]) and ONT(elevator, [diag]). These extracted ontologies were used as the domain ontologies for various experiments investigating the process of developing a new KBS through the configuration of the generic PSs, described above, together with the domain ontologies. This section describes the two extracted domain ontologies: ONT(elevator, [pnr]) and ONT(elevator, [diag]).
The Extracted Elevator Configuration Ontology - ONT(elevator, [pnr]) The development of PS(pnr, -) (section 4.5.1) involved reformulating the CLIPS system to work in the Protégé/JessTab environment; as part of this process relevant ontologies provided by Stanford were used, and, as discussed below, an OWL version of the elevator domain ontology was produced for the Protégé/JessTab elevator configuration KBS (the original elevator domain ontology provided by Stanford was encoded in Protégé Frames and was converted to OWL using Protégé’s OWL export mechanism). This domain ontology provides a very detailed representation of elevators, which was required for the Sisyphus-II VT challenge, and includes concepts such as elevator components, constraints on configuration assignments, range constraints on values, fixes for violated constraints, and parameters (variables), all of which were used by the propose-and-revise algorithm. The ontology also contains details about the components (sub-systems) and models (parts) of an elevator. There are 14 different component classes, all of which have one individual associated with them: door-system, cable-system, car-system, counterweight-system, deflectorsheave-system, hoistway-system, opening-system, machinebeam-system, machine-system, platform-system, safety-system, sling-system, motor-system, and buffer-system. Every component class has properties for the component name, associated assign and range constraints, fixes, relevant parameters, and the relevant models. There are 17 model (component part) classes (the number of individuals of each class is given in brackets after the class name): carbuffers (2), counterweightbuffers (0), carguiderails (5), carguideshoes (1), compensationcables (7), counterweightguiderails (4), deflectorsheaves (2), doors (8), hoistcables (8), machines (4), machinegrooves (2), motors (6), platforms (3), safetybeams (3), slings (5), machinebeams (10), and counterweightbetweenguiderails (3). Finally, an elvis-parameter class provides details of parameters related to elevator design. Each model class has properties for defining the model name, model specific details (such as height, weight, diameter, etc.), and upgrade model name (i.e. the name of the model that this model should be replaced with should an upgrade be required). In all there are 79 classes, with 143 defined properties and 756 individuals describing the necessary knowledge for the elevator propose-and-revise KBS (not all of which are described above). This ontology provides an extensive description of the domain knowledge required by the elevator configuration system built for the Sisyphus-II VT challenge; it was clear, however, that general elevator knowledge was provided by the parameter, component and model classes (after removing propose-and-revise specific properties such as links to relevant constraints in the component class and upgrades in the model class).
The Extracted Elevator Diagnosis Ontology - ONT(elevator, [diag]) As discussed previously (section 4.5.2), the CLIPS based elevator diagnosis KBS used a very simple elevator model. The domain model described by the lift deftemplate contained 15 slots relating to elevator faults: stop-fall, stop-floor-level, stop-button-pressed, noise-top-floor, noise-open-close, doors-enter-exit, door-hit, doors-too-long, ride-bumpy-top, ride-rough-speed, ride-too-slot, buttons-lights, smell-oil, smell-bad, and smell-stuffy. The naming of these slots clearly reflects the six categories into which the KBS grouped the problems (elevator stopping, noise, doors, speed/quality of ride, buttons, and smell) and a symptom of that category (for example, the stop-fall slot represents an elevator stopping problem where the elevator stops and falls abruptly). The value of each slot was constrained to “yes” or “no”, representing the presence or absence of the problem in the elevator under examination. The associated lift deffacts construct asserts one fact of type lift. This was easily modeled in an ontology: the lift deftemplate mapped directly to a lift class in the ontology, with the slots becoming datatype properties with the lift class in their domain, and the one (lift) fact mapping to an individual of the lift class in the ontology.
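The deftemplate-to-ontology translation just described can be sketched as follows (only a few of the 15 slots are shown for brevity; the dictionary output structure and function name are illustrative, not the actual Protégé/OWL encoding):

```python
# Each CLIPS slot becomes a yes/no datatype property with the lift class
# as its domain; the single asserted lift fact becomes one individual.
LIFT_SLOTS = ["stop-fall", "stop-floor-level", "doors-too-long", "smell-oil"]

def deftemplate_to_ontology(class_name, slots):
    return {
        "class": class_name,
        "datatype_properties": [
            {"name": s, "domain": class_name, "range": ["yes", "no"]}
            for s in slots],
        "individuals": [class_name + "-1"],   # the one asserted lift fact
    }

ont = deftemplate_to_ontology("lift", LIFT_SLOTS)
```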
4.6
Manual Reuse Experiments Investigating the Proposed Methodology
After creating the two generic PSs and extracting the two elevator domain ontologies, two manual reuse configuration experiments were performed, in which the extracted elevator ontology from one KBS was configured to work with the generic PS generated from the other KBS. These experiments were performed when only the outline for the methodology had been developed, which
was: first use mapping to gain some initial knowledge of the domain, and then extend this with the domain specific reasoning knowledge (rules) required by the PS to work in the domain. Further, the rules should be acquired for one domain concept at a time, i.e. all the rules relating to one domain concept should be acquired, followed by the rules relating to another domain concept, and so on for all domain concepts. The order in which rules are acquired (for any given domain concept) should be structured to provide an apparently “natural” flow to the process. These experiments served two purposes: first, to determine whether executable KBSs could be developed from these components using this methodology; and second, to gain insights into how the individual processes of the methodology were performed and how they potentially interacted. Specifically, it was hoped insights would be gained into what types of mappings would be required and how they should be applied, how the KA process would work, and how executable rules would be generated from those defined by the user (and stored in the PS ontology) and used by the final KBS, as well as evaluating the suitability of the rule definitions in the PS ontologies and their defined relationships. During both experiments, extensive notes were taken detailing the actions performed and any insights gained. These insights were then used to further enhance the methodology, to determine which processes MAKTab could perform for the user, and, where necessary, to alter the generic PSs.
4.6.1
Manual Experiment 1: Elevator Diagnostic KBS
The purpose of this experiment was to manually perform the steps required to build a diagnostic KBS by configuring PS(diag, -) to work with ONT(elevator, [pnr]). To reiterate, PS(diag, -) was based on an elevator diagnostic KBS; however, all links to the elevator domain were removed so that it was domain independent. The initial domain knowledge came from ONT(elevator, [pnr]) (which was completely independent of the original diagnostic KBS), with the diagnostic knowledge coming from the transcript of the interview with the elevator engineer (on which the original KBS had been based) and an extracted version of additional diagnostic knowledge found in the original elevator diagnostic KBS. The experiment first involved mapping relevant domain knowledge from ONT(elevator, [pnr]) to PS-ONT(diag, -). The resulting PS ontology, PS-ONT(diag, [elevator’]), was then extended with elevator diagnostic rules taken from these two diagnostic knowledge sources, to produce PS-ONT(diag, [elevator]), containing both elevator concepts and diagnostic elevator rules. The knowledge in this ontology, PS-ONT(diag, [elevator]), was then converted into the executable JessTab rule set, PS-RS(diag, [elevator]), which was combined with the generic diagnostic rule set PS-RS(diag, -) (from PS(diag, -)) and PS-ONT(diag, [elevator]) to produce KBS(diag, [elevator]). The original elevator ontology, ONT(elevator, [pnr]), was then extended with the new elevator concepts that had been added to PS-ONT(diag, [elevator]) during the rule creation stage. Although this experiment and experiment 2 (section 4.6.2) used specific components (two elevator domain ontologies and the generic propose-and-revise and diagnostic PSs), it is possible to generalise many of the lessons learnt during these experiments.
These include:
• A range of mapping types that were necessary to map data from a source (domain) ontology to a target (PS) ontology for the purpose of providing domain knowledge for a KBS.
• Techniques for applying the defined mappings to ensure that the individuals added to the target ontology accurately reflect the corresponding individuals in the source ontology.
• A general technique for ordering the rule creation process, based on the meta-information provided by the PS ontology regarding relationships between different rule types and a depth-first graph traversal algorithm.
Further discussion of the general lessons learnt from these experiments is provided in section 4.6.3.
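The last of these lessons, ordering rule acquisition by depth-first graph traversal, can be sketched over a toy malfunction/cause/repair graph (the node names are illustrative, drawn from the elevator door example; the traversal is a standard iterative DFS, not MAKTab's implementation):

```python
# Edges point from a malfunction to its possible causes, and from a
# cause to its repair; DFS yields the "natural" acquisition order:
# malfunction, first cause, its repair, then back for the next cause.
GRAPH = {
    "doors close too quickly": ["door open time needs adjustment",
                                "stuck call button"],
    "door open time needs adjustment": ["adjust door open time"],
    "stuck call button": ["replace call button"],
    "adjust door open time": [],
    "replace call button": [],
}

def dfs_order(graph, start):
    order, stack, seen = [], [start], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        stack.extend(reversed(graph[node]))   # keep left-to-right order
    return order

order = dfs_order(GRAPH, "doors close too quickly")
```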
Stage 1 - Mapping The purpose of the knowledge mapping stage is to reuse as much domain knowledge as possible, reducing the time and effort required for the user to configure the generic PS to work in the desired domain. By reusing domain knowledge from a pre-existing domain ontology, the user does not need to invest time developing the initial domain model which the final KBS will use. The processes performed in the experiment during the mapping stage are described below. Analysis Before any mapping can be performed, one must determine the corresponding concepts in the two ontologies. This need arises from the separation of domain and problem solving knowledge, which results in two different knowledge descriptions that must be reconciled before mapping can be performed. As would be expected, each description (ontology) is composed of its own vocabulary, determined by its designer. Therein lies the main challenge faced in the mapping stage: corresponding classes cannot be determined simply by matching classes with similar names. Where two classes share the same or a similar name, this only indicates an overlap of the two vocabularies used to define them; it is by no means guaranteed that they describe the same real world concept. At this stage, examining the two class descriptions can help gain an understanding of the purpose of the classes. While the analysis here was conducted manually, the steps performed were recorded, as any insights found might influence the algorithm for the ontology mapping sub-tool of MAKTab, which was subsequently developed. Automatically determining corresponding classes in two ontologies is an active research area, which is discussed in section 2.3. The analysis process first involved studying the description of each class in ONT(elevator, [pnr]) to gain an understanding of its purpose.
This understanding was gained by noting the class name, the properties associated with that class and, in particular, the information those properties provided. This gave an understanding of the purpose of each class and the type of information that its individuals would be expected to provide. The classes fell into three groups: component information, model information (describing each component’s sub-components), and parameters. Following this, PS(diag, -)’s ontology, PS-ONT(diag, -) (section 4.5.2), was inspected for classes which required the type of information provided by ONT(elevator, [pnr]). Immediately one set of mappings became apparent: PS(diag, -) requires information about components, which
various classes in ONT(elevator, [pnr]) provide, although apparently at a much more detailed level than would be required by the KBS being developed. Another mapping appeared to be available between the parameters from ONT(elevator, [pnr]) and the SystemVariable class in PS-ONT(diag, -), as the terms “parameter” and “variable” can be considered synonymous. Mapping In total 31 classes from ONT(elevator, [pnr]) were considered suitable for mapping to PS-ONT(diag, -). A class in ONT(elevator, [pnr]) was considered suitable if it provided (domain) knowledge which could be used by the final KBS(diag, [elevator]). These classes all provided some form of component information for an elevator: either describing a sub-system of the elevator, or the parts which make up that sub-system. The target class of these mappings was the SystemComponent class of PS-ONT(diag, -). These 31 classes were all subclasses of two classes in ONT(elevator, [pnr]): the elvis-components and elvis-models classes. The elvis-components class describes the overall sub-systems of an elevator, for example, the door, cable, car, and sling systems; the elvis-models class describes the components (parts) of the elevator, for example, the hoist cables, compensation cables, car guide rails, car guide shoes, and motors, that make up an elvis-component. The following 14 subclasses of elvis-components were mapped to the diagnostic component class: door-system, cable-system, car-system, counterweight-system, deflectorsheave-system, hoistway-system, opening-system, machinebeam-system, machine-system, platform-system, safety-system, sling-system, motor-system, and buffer-system. Each elvis-components subclass had at least a COMPONENT.NAME property associated with it, and all had exactly one associated individual.
The following 17 subclasses of elvis-models were also mapped to the diagnostic component class: carbuffers, counterweightbuffers, carguiderails, carguideshoes, compensationcables, counterweightguiderails, deflectorsheaves, doors, hoistcables, machines, machinegrooves, motors, platforms, safetybeams, slings, machinebeams, and counterweightbetweenguiderails. Each elvis-models subclass was associated with at least a model-name property, and all except counterweightbuffers had at least one associated individual. When performing the actual mapping itself, an interesting issue arose at the conceptual level. PS(diag, -) is concerned with the name of a component and possibly its properties, along with its associated malfunctions, while the component descriptions in ONT(elevator, [pnr]) provide far more detail about the components. The elvis-components subclasses and associated individuals list the various elvis-models which make up each sub-system, while the elvis-models subclasses describe each model (or part); for example, the motors class provides the model name, horsepower, maximum current, upgrade name, and weight, and its associated individuals describe real world versions of these models. Determining how this knowledge should be mapped depends on the type of malfunctions that will be included in the final KBS. Four options were considered:
1. Map each individual of the relevant classes from ONT(elevator, [pnr]) into an individual of SystemComponent, mainly based on the assumption that the COMPONENT.NAME
or model-name property from ONT(elevator, [pnr]) and the name property from PS-ONT(diag, -) serve the same purpose. The result of this type of mapping would be multiple SystemComponent individuals, each with the name of a component or model from ONT(elevator, [pnr]); this would, however, lose the counterweightbuffers component, as it has no individuals. This would make it possible to define cause and repair rules for every individual of the sub-systems and parts contained in ONT(elevator, [pnr]); it would not, however, be possible to describe faults relating to a type of sub-system or part.
2. Create one individual of SystemComponent for each relevant class in ONT(elevator, [pnr]), based on the class name. Compared to the first option, this has the benefit of ensuring all relevant components feature in the PS and reduces the number of rules required to describe a cause/repair that is common to all instances of a particular part; however, it makes the (potentially large) assumption that, for example, all motors have the same malfunctions and repairs; it may also lose information about the properties of the components.
3. Copy the relevant classes from ONT(elevator, [pnr]) to new subclasses of the SystemComponent class in PS-ONT(diag, -), with the option of copying the individuals associated with each class. This would have the advantage of maintaining all the elevator information provided by ONT(elevator, [pnr]), but would require a more complex algorithm for applying the mappings to ensure individuals from ONT(elevator, [pnr]) are copied correctly (for example, ensuring the values of all object properties are correctly copied).
4. Finally, some combination of the first two options, where individuals of the SystemComponent class are created for all relevant ONT(elevator, [pnr]) classes (which in principle could result in no mappings) and further individuals are created for the different types of components, enabling the definition of rules for both component classes and the different available components; information relating to the properties of components would, however, likely be lost.
As the type of diagnostic knowledge that would be added during the KA stage related to classes of components generically, it was decided to use the second option described above. Knowledge providing detailed diagnostic information for the different makes of components was unavailable: only information relating to, for example, the faults associated with a door, not with specific types or brands of door, was available. However, in reality, an elevator engineer would probably wish to define rules at both of these levels, as they would be aware of the different faults associated with different types of doors, motors and so on. Once the appropriate mappings had been determined, 31 individuals of the SystemComponent class from PS(diag, -) were created: one for each class (from ONT(elevator, [pnr])) that had been determined to be relevant.
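The second option, as chosen, can be sketched as follows (the class names are from ONT(elevator, [pnr]), shortened to a handful for brevity; the mapping function and dictionary rendering of individuals are illustrative, not MAKTab's mapping machinery):

```python
# Option 2: one SystemComponent individual per relevant domain class,
# named after the class rather than its individuals.
ELVIS_COMPONENT_CLASSES = ["door-system", "cable-system", "motor-system"]
ELVIS_MODEL_CLASSES = ["doors", "motors", "counterweightbuffers"]

def map_classes_to_individuals(classes):
    """Create one SystemComponent individual (here: a dict) per class."""
    return [{"type": "SystemComponent", "name": c} for c in classes]

individuals = map_classes_to_individuals(
    ELVIS_COMPONENT_CLASSES + ELVIS_MODEL_CLASSES)
# counterweightbuffers is kept even though it has no individuals in the
# source ontology -- the advantage of option 2 over option 1.
```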
Stage 2 - Focused Rule Knowledge Acquisition The mapping stage provides the PS with some information; it is unlikely, however, to provide all the knowledge the PS requires to fully function. The rule KA stage is therefore designed to take the mapped knowledge and the reasoning requirements of the PS, along with the vocabulary provided by the domain ontology, and use them as an intelligent guide for the acquisition of additional knowledge
to produce the final KBS. At this stage in the experiment, the elevator diagnostic knowledge contained in the transcript of the original interview with an elevator engineer, along with extracted versions of the rules used in the original KBS, was used to define diagnostic rules for the elevator concepts acquired from the mapping stage. Although the transcript had formed the basis of the original elevator diagnosis system, the KBS being developed in this experiment was substantially different from the original in terms of the overall KBS design, the structure of its rules, the domain model being used, and its content (the original system did not include all the diagnoses specified in the transcript). It is important to make this clear to avoid this task being considered just another ontology mapping task, due to the use of the same source of elevator diagnostic knowledge. Analysis of the domain knowledge sources indicated that they grouped diagnoses into the six categories mentioned previously (buttons, doors, noise, ride, smell, and stop problems). Buttons and doors were the main references to components, while ride, smell, and stop problems all related to problems with the elevator as a whole, although they sometimes included references to other components in their explanations. Adding this information to PS(diag, -) required the creation of two new SystemComponent individuals: one for buttons and one for the elevator. As this experiment was being performed manually, it was decided to define cause rules for one malfunction from each of the six categories listed above, due to the number of rules that would be required to define all the malfunctions and their causes described in the domain knowledge sources (approximately 90 cause rules and 90 repair rules).
Unfortunately the knowledge sources did not contain any repair information, so the repairs that were defined were what seemed to be common-sense suggestions based on the diagnosis that had been provided.
Generating the Executable KBS Having completed the mapping and KA stages, it was necessary to generate an executable KBS. It was envisioned that the new KBS would consist of the extended PS ontology, PS-ONT(diag, [elevator]), the generic diagnostic rule set, PS-RS(diag, -), and the domain specific rule set, PS-RS(diag, [elevator]), generated from the rules defined in the KA stage. The primary task was determining a systematic method by which the relevant individuals of PS-ONT(diag, [elevator]) would be transformed into the various constructs required by PS-RS(diag, -). Briefly, the method developed involved generating symptom facts for all the malfunctioning behaviours/states described by the rules, then generating repair facts for the defined repair advice, and finally state nodes describing each malfunction (in terms of its expected symptoms) and linking it to potential causes and/or repairs. After manually generating PS-RS(diag, [elevator]), it was combined with PS-RS(diag, -) to create a fully executable KBS. This system was then executed several times with different observations being entered, to test the diagnosis and repair advice generated by the system. Updating the original elevator ontology, ONT(elevator, [pnr]), with the relevant new information introduced during KA was also considered. Primarily, two new individuals representing the buttons and the elevator system had been added to the domain knowledge, although various new pieces of information regarding component malfunctions, their causes and repairs had also been added. While it would have been possible to add the diagnostic information to ONT(elevator, [pnr]), it was decided to add only the new general domain knowledge (i.e. the representation of the
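The rule-to-rule-set transformation just outlined can be sketched as follows (the dictionary shapes of the rules and facts are illustrative stand-ins for the PS-ONT(diag, [elevator]) individuals and the JessTab constructs of PS-RS(diag, [elevator])):

```python
# Generate symptom facts for every state the rules mention, repair facts
# for every piece of repair advice, and a state node per cause rule.
def generate_rule_set(cause_rules, repair_rules):
    symptoms, repairs, state_nodes = set(), set(), []
    for rule in cause_rules:
        symptoms.update(rule["antecedents"])
        symptoms.update(rule["consequents"])
        state_nodes.append({"states": rule["antecedents"],
                            "causes": rule["consequents"]})
    for rule in repair_rules:
        symptoms.update(rule["antecedents"])
        repairs.update(rule["advice"])
    return symptoms, repairs, state_nodes

symptoms, repairs, state_nodes = generate_rule_set(
    [{"antecedents": ["doors fail to open"],
      "consequents": ["door motor broken"]}],
    [{"antecedents": ["door motor broken"],
      "advice": ["replace the door motor"]}])
```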
button component and elevator system), as the diagnostic information was application specific and not necessarily of use in future KBSs. Ultimately, however, the task of updating ONT(elevator, [pnr]) becomes one of mapping from PS-ONT(diag, [elevator]) to ONT(elevator, [pnr]), and so adding all the new information should be possible with a wide enough range of mappings. To map the buttons and elevator concepts it was decided to create a new subclass of elvis-models for buttons, as this is where ONT(elevator, [pnr]) stores component information, and a new subclass of the root class of ONT(elevator, [pnr]) for the lift, as there was no appropriate superclass in ONT(elevator, [pnr]).
Lessons Learnt Mapping Considerable work has been done in the ontology mapping field, and many different approaches to the problem of automatically deriving mappings between two ontologies have been developed. While a significant insight into this problem was not expected from this experiment, appropriate mapping types for this application were discovered. When defining the mappings in this experiment, four options were considered. While ONT(elevator, [pnr]) provided a very detailed description of various elevator parts, the degree to which this detail was required by PS(diag, -) would really be determined by the user, who may be happy to assume that, for example, all motors have the same faults, but equally may not. It was therefore determined that the mapping tool must allow the user to do both. An alternative approach would be to allow the user to extend PS(diag, -) by copying classes to subclasses of SystemComponent, which can then be used, along with their individuals, during the KA stage. In this example that would have involved adding subclasses of SystemComponent for the subclasses of elvis-models and elvis-components. Rule Knowledge Acquisition The KA stage was very much an exploratory work in determining an effective, intuitive method for acquiring new information based on the rule relationships determined previously. Although the technique developed in this experiment was specific to the particular rule relationships of PS(diag, -), it was understood that it would be necessary to develop an algorithm sufficiently general to allow it to be applied to other PSs in the future. The technique developed in this experiment used the meta-information provided by PS-ONT(diag, -) regarding the rules with which KA should be started and the rule relationships to determine the order in which rules are defined for every component.
As PS-ONT(diag, -) specifies that a cause rule can be related to a cause or repair rule, this provided various insights into how a generic process could use the meta-information for acquiring rules: • Before the experiment, it was thought that the rule relationships essentially defined a graph, and that the order in which rules were defined could be determined using a technique similar to graph traversal. This experiment supported that viewpoint, and showed that a depth-first traversal of the graph provided a logical ordering in this case, as it started with the definition of a malfunction and a possible cause, then a possible cause of the cause, and so on until a repair was defined. A breadth-first traversal, in contrast, would have started with defining a malfunction, then all its possible causes, then going through each possible cause and defining all the possible causes/repairs for them, and so on.
4.6. Manual Reuse Experiments Investigating the Proposed Methodology
93
• As expected, the correct specification of rule relationships (discussed in section 4.5.2) was key to the successful application of a focused KA technique based on them. Along with determining the requirements of the algorithm, this experiment also examined the correctness of the rule relationships specified in PS(diag, -), which were felt to have provided a logical ordering to the process of rule definition during this experiment when used with a depth-first traversal. • It was also determined that it is important to return to previous rules during KA to define alternative causes for a malfunction, having completed the definition of the rules related to one particular cause. For example, the “doors closing too quickly” malfunction can be caused by the “door open time requiring adjustment”, which can be repaired by performing the relevant adjustment; the door’s problem can also be caused by a “stuck call button”, which is repaired by replacing it, and so it is necessary to return to the “doors closing too quickly” problem after defining the first cause and repair. Generating the Executable KBS This process provided insights into the type of transformations that would have to be applied to the PS-ONT(diag, [elevator]) to produce an executable KBS. It also provided a successful evaluation of the generic rule set PS-RS(diag, -) and its interactions with the domain specific rule set PS-RS(diag, [elevator]), as tests showed the system operated correctly and reliably, producing the correct diagnoses on all the examples used.
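The contrast between depth-first and breadth-first orderings described above can be illustrated with a minimal Python sketch. This is not MAKTab code; the cause/repair graph below is a hypothetical rendering of the “doors closing too quickly” example, with each malfunction or cause mapping to its related causes or repairs.

```python
# Sketch: DFS vs BFS orderings over a hypothetical cause/repair graph.
from collections import deque

graph = {
    "doors closing too quickly": ["door open time needs adjustment", "stuck call button"],
    "door open time needs adjustment": ["adjust door open time"],   # repair
    "stuck call button": ["replace call button"],                   # repair
    "adjust door open time": [],
    "replace call button": [],
}

def depth_first(start):
    order, stack = [], [start]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(reversed(graph[node]))  # preserve left-to-right child order
    return order

def breadth_first(start):
    order, queue = [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(graph[node])
    return order
```

Depth-first visits a cause and then its repair before moving to the next alternative cause, matching the ordering the experiment found natural; breadth-first would instead enumerate all causes of the malfunction before any repair.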
Conclusion This experiment provided a good insight into the types of activities that are required to build a KBS from two separate components. Various lessons were learnt from both the mapping and the KA stages, which influenced both the methodology and the design of MAKTab (which attempts to perform these processes automatically for the user). More importantly, however, this experiment showed that, in this instance at least, it was possible to create a KBS by a process of ontology mapping and knowledge acquisition.
4.6.2 Manual Experiment 2: Elevator Configuration KBS
After completing the first experiment, the complementary experiment was performed: namely, creating a new KBS(pnr, elevator) from ONT(elevator, [diag]) (the elevator ontology used in the original diagnostic KBS, section 4.5.3) and PS(pnr, -) (section 4.5.1). The purpose of this experiment was to gain insights into the types of mappings required by the generic PS(pnr, -), and to test the suitability of the configuration PS (particularly the PS-ONT(pnr, -)), ensuring that it was possible to define the correct rules using it, as well as its suitability for supporting the KA process. Additionally this experiment gained further insights into the requirements of the KA part of MAKTab, and into the generation of an executable KBS from the PS(pnr, [elevator]). To reiterate, the generic configuration problem solver, PS(pnr, -), was based on an elevator configuration KBS; however, all links to the elevator domain had been removed so that it was not configured for any particular domain. The decision to configure PS(pnr, -) for the elevator domain was made due to the availability of the Stanford version of the Sisyphus-II VT specification document [123], which was used to provide the domain specific PS knowledge for this experiment. The Sisyphus-II VT specification document [123] provides a highly detailed specification of a propose-and-revise based elevator configuration system, in terms of elevator components, their
configurations, constraints on those configurations, and fixes for violated constraints. Encoding this information in a Protégé/JessTab KBS would be a very time-consuming process resulting in a very large system, similar to the original CLIPS system discussed in section 4.5.1. As this was a manual experiment, it was decided for practical reasons that it was necessary to restrict the size of the KBS that was developed. It was decided to focus on the part of the KBS relevant to motor selection, as it was felt that the range of rules relevant to motor selection provided a suitable balance between a practical experiment and the complexity of the required KBS. Specifically, the motor selection is dependent on five other components (the motor generator, the machine, and three different types of cables) as well as various variables (such as the required horsepower, the maximum torque it will experience, the peak required current, and the car speed and capacity); further, [123, chap. 7] lists three constraints and five fixes which apply directly to the motor, which is the third highest in the list of constraints (only the car component, with 19 constraints and 14 fixes, and the counterweight, with 15 constraints and 16 fixes, have more).
Stage 1 - Mapping In contrast to the first experiment (section 4.6.1), the simplicity of ONT(elevator, [diag]) (discussed in section 4.5.3) meant that there was very little mapping that could be performed. The shortcomings of this ontology, with regard to the new KBS being developed were: • The ontology does not provide descriptions of elevator components; it contains only one class, with properties briefly describing various types of faults. • As a consequence, the ontology contains no descriptions of the actual components which will be used during the design of a new elevator; these descriptions would typically have been provided through individuals of relevant classes in the ontology. Analysis Although this ontology clearly does not provide enough knowledge to build the configuration KBS, it was extracted from an existing KBS, and therefore it was important to determine mappings which could enable the available information to be reused. Analysis of the (basic) knowledge provided by the ontology with respect to the type of domain knowledge that was required by PS(pnr, -), determined that the following types of mappings might be useful to maximise the reusability of ONT(elevator, [diag]): • Copy a class mapping, which would allow the copying of the lift class from ONT(elevator, [diag]) to the PS ontology, which could then be used for describing the elevator being configured. • Property to class mapping, which would allow the creation of a new class in the PS ontology with the same name as a property from ONT(elevator, [diag]). This would, for example, allow the creation of a buttons component class in PS-ONT(pnr, -) based on the buttons-lights property from ONT(elevator, [diag]), and a doors class based on one of the door fault properties (e.g. doors-hit). Obviously it would be necessary to alter the name of the new class, either manually or through some (regular expression based) parsing of the property name performed at mapping time. 
While it was unlikely a lift, doors, or buttons class would be required by the rules related to the motor, it was decided that both types of mappings identified were appropriate to this
task, and so they were performed manually. This resulted in the creation of three new subclasses of the PS-ONT(pnr, -)’s SystemComponent class.
Stage 2 - Focused Rule Knowledge Acquisition The mapping process failed to provide sufficient knowledge for the PS(pnr, -), both in terms of knowledge of domain concepts and the propose-and-revise reasoning knowledge required for the selection of the motor component. This meant that performing KA for the required knowledge provided further insights into the process as, unlike in the first experiment, knowledge of domain concepts had to be acquired along with the reasoning knowledge (a task similar to building a KBS without a relevant domain ontology). The rules and associated domain knowledge defined during this process came from [123, section 5.14], which describes the motor selection, and the constraints C-33, C-34 and C-42 and associated fixes defined in [123, chap. 7]. The first requirement at this stage was the creation of a new Motor SystemComponent subclass in PS-ONT(pnr, -); six individuals of this class were also created, as specified in [123, table 10]. This allowed rule KA to be performed for the motor component. According to the meta-information provided in PS-ONT(pnr, -), the first type of rule to be defined for the motor was an InitialComponentSelectionRule, which was used to specify that the design process should start with the 10hp motor model; the next rule type was the OutputCalculatedComponentRule, which specified that the motor should be displayed; the next rule type was the SystemVariableValueAssignmentRule, which was not required for the motor; KA then progressed to defining the constraint and fix rules. The first defined constraint rule (for C-33) states that the motor must be able to provide the maximum (or peak) amount of current that will be required. If this constraint is violated, the associated fix states that an alternative motor, which can provide more current, should be selected.
The peak current required by the motor is dependent on both the motor and the maximum amount of torque that it will experience; [123, section 5.12] describes how the maximum current is calculated for the various motors. Defining this constraint rule required the creation of a new datatype property called maxCurrent, with the Motor class as its domain and range xsd:String; updating the Motor individuals with maxCurrent values; and the creation of new SystemVariables for storing motor-peak-current-required and motor-torque-maximum. A related fix rule was defined to select an alternative motor with a higher maxCurrent value. The second constraint, C-34, states that the selected motor must be compatible with the selected machine, and defines four compatibility rules (constraints) and four fixes. The definition of these rules required the creation of a Machine class and four associated individuals (as described in [123, table 7]). The final constraint, C-42, specifies that the required motor horsepower must be at most 40 (which is the horsepower of the most powerful motor); there is no fix if this constraint is violated.
Defining this rule required the creation of a new SystemVariable for motor-horsepower-required, as this value is calculated based on various other values. Having defined the three constraints and their related fixes, it was then necessary to define various initial value, output selection and assignment rules for the new Machine component class and the variables (motor-peak-current-required, motor-torque-maximum, and motor-horsepower-required) that had been created. Often this in turn led to altering the
existing class descriptions or the creation of new classes and/or variables, which relevant rules were then created for; this was repeated recursively until all the knowledge relevant to the motor selection had been acquired.
Generating the Executable KBS Having completed the mapping and KA stages and defined the relevant propose-and-revise knowledge for the selection of the motor component, it was necessary to complete the KBS development by generating the executable KBS. The executable KBS that was created consisted of the extended PS ontology, PS-ONT(pnr, [elevator]), along with the generic propose-and-revise code, PS-RS(pnr, -), and PS-RS(pnr, [elevator]), a JessTab version of the domain specific rules that had been defined during the KA stage. Again, the primary task at this stage was determining a systematic method by which the relevant individuals of PS-ONT(pnr, [elevator]) would be transformed into the various deffunctions and defrules required by PS-RS(pnr, -). Briefly, the method developed for creating the PS-RS(pnr, [elevator]) involved creating one deffunction from all the initial selection and output selection rules; sets of assignment and dependency defrules based on the SystemComponent individuals from PS-ONT(pnr, [elevator]) and the defined assignment rules; a series of constraint defrules based on the defined ConstraintRules; and finally, a series of fix defrules based on the defined FixRules. Again, updating the original domain ontology, ONT(elevator, [diag]), with the new knowledge acquired during the KA stage was also considered. In addition to the various rules, five new classes of components (Motor, Machine, Hoist-Cables, Compensation-Cables, and Control-Cables) with associated individuals, along with several variables, had been added to the domain knowledge classes of PS-ONT(pnr, -). To maintain the modelling style of ONT(elevator, [diag]), it would have been necessary to incorporate the new knowledge by adding properties (such as hasMotor, hasMachine, and motorPeakCurrentRequired) to represent the new components and variables; this would, however, have resulted in the loss of both the class schemas and the associated individuals.
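The transformation of rule individuals into defrules can be sketched as follows. This is a hypothetical Python sketch, not MAKTab's actual generator: the rule is modelled as a plain dictionary and the slot names are invented for illustration, but the output follows the Jess/CLIPS defrule syntax that the generated rule sets would use.

```python
# Hypothetical sketch: render a ConstraintRule-style individual (here a dict)
# as a Jess-style defrule string. Field names are illustrative assumptions.
def constraint_rule_to_defrule(rule):
    name = rule["name"]
    var, op, bound = rule["variable"], rule["operator"], rule["bound"]
    violation = rule["violation"]
    return (
        f"(defrule {name}\n"
        f"  (system-variable (name {var}) (value ?v&:({op} ?v {bound})))\n"
        f"  =>\n"
        f"  (assert (constraint-violation (name \"{violation}\"))))"
    )

# Invented example loosely modelled on constraint C-33 (peak current).
c33 = {
    "name": "constraint-C-33",
    "variable": "motor-peak-current-required",
    "operator": ">",
    "bound": "(max-current ?motor)",
    "violation": "insufficient peak current",
}
print(constraint_rule_to_defrule(c33))
```

A real generator would read the slots from the PS-ONT(pnr, [elevator]) individuals rather than a dictionary, but the template-filling structure would be the same.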
To maximise the amount of new knowledge added to ONT(elevator, [diag]) it was decided to copy the new classes (and associated properties and individuals) from PS-ONT(pnr, [elevator]) to ONT(elevator, [diag]), adding them as siblings of the lift class; again, it was decided not to add the reasoning specific knowledge defined in the rules to the domain ontology.
Lessons Learnt Mapping The mapping stage in this experiment highlighted the need for a wide variety of mapping types in order to maximise the domain knowledge that can be gained from mapping. This experiment identified two types of mappings, the “copy a class” mapping (which was also identified in the first experiment) and a “property to class” mapping, which enables reuse of the knowledge (loosely) described in the properties of a domain ontology. Applying the mappings in this experiment was relatively simple, although it was noted that tests should be performed to ensure new concepts created in the PS ontology do not have the same name as a concept already in the ontology. Rule Knowledge Acquisition The rule KA process performed in this experiment was again exploratory work into determining a general KA technique; in contrast to the first experiment, both domain and reasoning knowledge had to be acquired. It was found that the user must be able to
extend the initial domain knowledge (acquired from mapping) using the standard ontology editing operations (adding new classes, properties, and individuals, and editing existing concepts); further, it would be desirable to perform these operations (especially creating individuals) when defining a rule (i.e. without having to switch to a different part of the tool). The final insight relating to acquiring domain knowledge was that when a new concept is created and then used by a rule, it is desirable to automatically remember these new concepts and ensure that the user defines relevant rules for them before the final KBS is generated. If this type of “check” were not performed, the generated KBS might fail to execute successfully because values had not been instantiated, which could be difficult for a novice user to detect. This experiment also confirmed the three main insights from the first experiment, relating to the acquisition of reasoning knowledge. Briefly, these were: that a graph-based traversal could be used based on the meta-information provided in PS-ONT(pnr, -) to order the rule acquisition process, that it was important to be able to return to an existing rule and define extra related rules, and that the rule descriptions provided in PS-ONT(pnr, -) could be used to express the required rules. Generating the Executable KBS This process provided insights into the type of transformations that need to be applied to the PS-ONT(pnr, [elevator]) to produce a JessTab implementation of the PS-RS(pnr, elevator). Further, it provided a successful evaluation of the generic code PS-RS(pnr, -) and its interactions with the domain specific rules and PS ontology, as tests showed the system selected the correct motor for a given set of component selections and variable values.
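The “check” on newly created concepts described above can be sketched with a small bookkeeping class. This is an illustrative Python sketch, not MAKTab's implementation; the class and method names are invented.

```python
# Sketch: track concepts created/used during KA and block KBS generation
# until every pending concept has at least one rule defined for it.
class KaTracker:
    def __init__(self):
        self.pending = set()   # concepts still needing rules
        self.rules = {}        # concept -> list of rule names

    def concept_used(self, concept):
        """Record that a (possibly new) concept was referenced by a rule."""
        if concept not in self.rules:
            self.pending.add(concept)

    def rule_defined(self, concept, rule_name):
        """Record a rule defined for a concept; it is no longer pending."""
        self.rules.setdefault(concept, []).append(rule_name)
        self.pending.discard(concept)

    def ready_to_generate(self):
        """Generation is allowed only when no concept is left without rules."""
        return not self.pending

tracker = KaTracker()
tracker.concept_used("Machine")            # new class referenced in a rule
assert not tracker.ready_to_generate()     # generation blocked
tracker.rule_defined("Machine", "InitialComponentSelectionRule_machine")
assert tracker.ready_to_generate()
```

Without such a check, a generated KBS could fail at run time because, for example, no initial value rule exists for a variable a constraint depends on.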
Conclusion This experiment provided further insights into the creation of KBSs from two separate components using the defined methodology. New types of mappings were determined, which, although not obvious, were required to maximise the reuse of existing domain knowledge. The KA stage highlighted the importance of being able to create new concepts (in the PS ontology) easily while defining rules. Further, the KA stage found that it is desirable to keep track of these new concepts that the user defines, and ensure that KA is performed for these (as appropriate) before the KA process is completed.
4.6.3 Summary of Lessons Learnt from the Manual Reuse Experiments
Both experiments described above provided several important insights into this style of KBS development. These insights related both to how a KBS could be built using these components and to the processes that were involved; they were used to further develop the methodology and to determine the initial requirements for the supporting tool, MAKTab.
Mapping In addition to the “standard” types of mappings (those typically supported by ontology mapping tools), such as defining corresponding properties and classes between two ontologies, the mapping activities performed during these experiments found new types of mappings, which were required to maximise the degree of domain knowledge reuse. These ranged from simply copying a class (and its associated details) to creating new individuals and classes in the target ontology to represent classes and properties in the source ontology. Moreover, these experiments highlighted the fact that it is impossible to predetermine all the types of mappings that will be required by future
applications, identifying another requirement for a mapping tool: that it can easily be extended with new mapping types in the future. The other main requirement for the mapping tool determined during these experiments relates to maintaining the consistency of the knowledge being mapped during the application of the mappings. To illustrate this, consider the process of copying a class, sc1, from the source ontology to the target ontology, where sc1 is in the domain of object property sp, which has range sc2. Now consider an individual, si1, of type sc1, which has another individual, si2 (of type sc2), as the value of sp. When si1 is copied to the target ontology (resulting in a new individual ti1), it may not be correct to set si2 as the value of tp for ti1 (assuming sp maps to tp), as si2 may itself already have been mapped, or be about to be mapped, to the target ontology. In this example, ti2 (the mapped version of si2) should be set as the value of tp for ti1, to ensure ti1 is an accurate copy of si1. It is, of course, possible to show similar examples for other types of mappings. An important requirement of the mapping support tool is therefore to maintain knowledge consistency wherever possible when applying mappings, to ensure that the knowledge gained from mapping, and, ultimately, the knowledge used by the final KBS, is an accurate representation of the original domain knowledge. This issue is further discussed in section 5.3.3.
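The si1/si2 example above can be sketched in Python. This is a minimal illustration, not the mapping tool's code: individuals are modelled as dictionaries, and a correspondence table records each source individual's copy so that property values point at mapped counterparts rather than back into the source ontology.

```python
# Sketch: keep copied individuals consistent by consulting a source->target
# correspondence table when filling in property values.
def copy_individual(src, correspondence, prop_map):
    target = {"name": "t_" + src["name"], "type": src["type"], "props": {}}
    correspondence[src["name"]] = target
    for sp, value in src["props"].items():
        tp = prop_map.get(sp, sp)  # sp maps to tp, if a property mapping exists
        # If the value has itself been mapped, point at the mapped copy.
        mapped = correspondence.get(value["name"]) if isinstance(value, dict) else None
        target["props"][tp] = mapped if mapped is not None else value
    return target

correspondence, prop_map = {}, {"sp": "tp"}
si2 = {"name": "si2", "type": "sc2", "props": {}}
ti2 = copy_individual(si2, correspondence, prop_map)   # si2 is mapped first
si1 = {"name": "si1", "type": "sc1", "props": {"sp": si2}}
ti1 = copy_individual(si1, correspondence, prop_map)
assert ti1["props"]["tp"] is ti2   # ti1 refers to ti2, not to the source si2
```

A full implementation would also need to handle the case where si2 is mapped after si1 (e.g. by patching dangling references in a second pass), which is part of what section 5.3.3 addresses.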
Rule Knowledge Acquisition The KA activities performed during these experiments provided various insights into the KA process, regarding the types of support a tool could provide and the suitability of the generic PSs. The experiments found that the meta-information provided by a PS ontology regarding rule relationships could be used effectively to “drive” the acquisition of rules from the user; further, it was found that a depth-first graph traversal technique could be used to give an apparently natural order in which rules are defined for the two generic PSs. These experiments also showed that extra meta-information providing human-friendly labels for the various rules and related concepts could ease the process of defining rules for the user. Requirements for a supporting tool derived from these experiments included: the ability to easily extend the knowledge of domain concepts gained from mapping, ideally from the same interface; storing a reference to any domain concepts created during KA or used in a rule, and ensuring KA is performed for them (if necessary) before a KBS is generated; and that it may be necessary to define rules with respect to either a concept/component class, or an individual of a concept class. Finally, both experiments evaluated the suitability of the generic PSs for use in the KBS development process. This mainly consisted of evaluating the specified meta-information, rule definitions, and generic code. The meta-information was felt to be beneficial to the KA process, particularly when “driving” the rule KA; although it was noted that the freedom to return to a previously defined rule and define extra related rules was necessary. Slight changes to the original rule definitions were required to enable the expression of the rules used by the two KBSs; the final rule definitions (which were described in sections 4.5.1 and 4.5.2) enabled the expression of the required reasoning knowledge.
It was also felt that, to offer maximum support to the user, any supporting tool should be able to use the rule type descriptions along with the (domain) concept that the user is performing KA for, to automatically create many, if not all, of the individuals that are required when creating a new rule.
Generating the Executable KBS Manually converting the rules defined during the KA stages (which were stored in the PS ontologies) into executable JessTab KBSs helped us gain an understanding of how this process could be performed automatically, as well as how the final KBS could be compiled from the various components. By performing various tests with the generated KBSs, it was also possible to determine that executable KBSs, which provided correct results, could be developed using this methodology, and that the various code components were functioning correctly. It was also interesting to consider how any new knowledge defined during the KA stage could be incorporated into the original domain ontology. Essentially this process involves mapping relevant new knowledge from the PS ontology to the domain ontology. Ideally this would be achieved automatically, possibly using the mappings defined during the mapping stage; however, it was determined that although it may be possible to do this in some cases, in general it would not be. For example, if the user were to copy a class from the domain ontology to the PS ontology, and, during KA, create new individuals of the class in the PS ontology, then it would be possible to determine the reverse mapping that would allow the new individuals of the PS ontology to be added to the domain ontology; if however, the user creates a new class in the PS ontology during the KA process, it would not be possible to determine a corresponding reverse mapping. In cases where it is not possible to achieve this automatically, the mapping tool should be able to support the process.
4.7 Supporting the Acquisition of Domain Specific (Reasoning) Knowledge
The focused KA technique used in this methodology for acquiring domain specific problem solving knowledge is, as shown in this work, critical to the successful development of new KBSs. The KA technique uses the domain knowledge and reasoning requirements of a generic PS, along with the vocabulary provided by a domain ontology, to guide the acquisition of the additional rules that the PS requires to function in the chosen domain. The KA process has been designed with the assumption that it interacts with a human user who is capable of providing the required information. This section provides an outline of how the KA process is performed at a general level and discusses it with respect to the two generic PSs, that is, PS(pnr, -) and PS(diag, -).
4.7.1 Driven Rule Acquisition
The manual reuse experiments found that the rule-related meta-information provided in the PS ontology (section 4.3.2) could be used to effectively “drive” the acquisition of the domain specific rules required by a PS. If the PS ontology did not provide the meta-information, it may still be possible to determine which rules to start the KA process with, and the rule relationships. This could be achieved by examining the restrictions on every rule’s swrl:body and swrl:head properties: assuming consequents derive new facts, any pair of rules in which one rule uses the same type of fact (atom type) in its consequent as the other does in its antecedent could be assumed to be related. Any rules for which relationships are not determined can be viewed as initial rules. From the user’s perspective, the KA process is based on the concepts that have been gained from the mapping stage (which can be added to by the user at any stage during KA, if required).
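The fallback just described can be sketched as follows. This is an illustrative Python sketch, not an SWRL processor: each rule is reduced to the sets of atom types appearing in its body and head, and relationships are inferred by matching one rule's head atom types against another's body atom types; rules with no incoming relationship are treated as initial rules.

```python
# Sketch: infer rule relationships from antecedent/consequent atom types.
def infer_relationships(rules):
    """rules: name -> {"body": set of atom types, "head": set of atom types}."""
    related = {name: [] for name in rules}
    has_incoming = set()
    for a, ra in rules.items():
        for b, rb in rules.items():
            # a derives facts of a type that b consumes, so a "leads to" b.
            if a != b and ra["head"] & rb["body"]:
                related[a].append(b)
                has_incoming.add(b)
    # Rules nothing leads to are candidate starting points for KA.
    initial = [name for name in rules if name not in has_incoming]
    return related, initial

# Simplified rendering of the diagnosis PS's two rule types.
rules = {
    "CauseRule":  {"body": {"Malfunction"}, "head": {"Cause"}},
    "RepairRule": {"body": {"Cause"},       "head": {"Repair"}},
}
related, initial = infer_relationships(rules)
```

On this input the sketch recovers the expected structure: CauseRule leads to RepairRule, and CauseRule (whose Malfunction atoms nothing derives) is the initial rule type. A real implementation would also allow self-relationships (cause rules chaining to further cause rules), which this simple a != b check omits.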
After completing the mapping stage, the user is first asked to select a domain concept (either a class or individual), and is then “walked through” the process of defining the rules that the PS will use to reason about that concept. The KA process is shown in algorithm 1; a more detailed description of how this is implemented in MAKTab is provided in section 5.4.1. The algorithm takes as input the user’s selected domain concept and the list of rule types that KA should start with. The user is then asked to define a rule for each of the initial rule types. Each initial rule type is treated as the starting node of a depth-first graph traversal, with the rule types representing the graph nodes, and arcs between the nodes describing the rule relationships. After an initial rule has been defined by the user, the graph of related rule types is used to guide the user through the process of defining the rules related to the selected concept, with the graph traversal path being used to determine the order in which the rules are acquired from the user. The user has “completed” knowledge acquisition of rules related to a concept once they have defined at least one rule of every type related to that concept. The user is free to interrupt the traversal at any stage to, for example, define alternative related rules to those previously defined. For example, using PS(pnr, -) this could involve defining a second fix rule for a constraint violation, and the KA process should continue as expected after the extra rules have been defined.

Algorithm 1 Basic KA control algorithm.
  Get the concept, c, that KA is being performed for
  Get the list, ir, of initial rule types to acquire
  for every rule type r in ir do
    performKaFor(c, r)

  performKaFor(Concept c, Rule type r):
    Ask the user to define a rule of type r for c
    Get the list, rr, of related(/next) rule types for r, as defined by the PS ontology
    for every rule type rt in rr do
      performKaFor(c, rt)
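Algorithm 1 can be rendered directly as a small Python sketch. This is not MAKTab's code: ask_user stands in for the interactive rule-definition step, and a visited set is added (beyond the pseudocode above) to guard against cycles in the rule-relationship graph, such as cause rules that relate to further cause rules.

```python
# Sketch of Algorithm 1: depth-first, meta-information-driven rule KA.
def perform_ka(concept, rule_type, related_rules, ask_user, visited=None):
    visited = set() if visited is None else visited
    if rule_type in visited:          # cycle guard (not in the pseudocode)
        return
    visited.add(rule_type)
    ask_user(concept, rule_type)      # "Ask the user to define a rule of type r for c"
    for rt in related_rules.get(rule_type, []):
        perform_ka(concept, rt, related_rules, ask_user, visited)

def run_ka(concept, initial_rule_types, related_rules, ask_user):
    for r in initial_rule_types:
        perform_ka(concept, r, related_rules, ask_user)

# Example using the diagnosis PS's rule relationships:
order = []
run_ka("elevator-ride-rough",
       ["CauseRule"],
       {"CauseRule": ["RepairRule"]},
       lambda c, r: order.append(r))
# order == ["CauseRule", "RepairRule"]
```

The recursion makes the depth-first character explicit: each rule's related rule types are fully explored before the next initial rule type is considered, matching the ordering found effective in the manual experiments.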
Acquiring Rules for Propose and Revise Figure 4.11 illustrates the rule relationships for PS(pnr, -); this section provides an example of the KA process that the user goes through when defining a sub-set of the rules used in motor selection in the elevator domain, specifically, rules relating to the required-motor-horsepower variable.
Briefly, the motor must be capable of providing a sufficient amount of horsepower, which is determined by a function of the car capacity and speed, the motor system efficiency, and various constants [123, pg. 29]. The rules for this task, relating to the required-motor-horsepower variable, are provided in figure 4.17 (in the figure, fixed width text refers to domain concepts in PS-ONT(pnr, [elevator’]) and italic fixed width text provides the name of the rule). The three (relatively simple) rule KA graphs generated from the meta-information provided by PS-ONT(pnr, -) for the required-motor-horsepower variable are illustrated in figure 4.18. Figure 4.19 provides a sample protocol of how these rule KA graphs could be used to acquire the rules in figure 4.17 from the user (in the protocol, [text in square brackets] is displayed to the user, fixed width text refers to domain concepts in PS-ONT(pnr, [elevator’]), italic normal text refers to a variable without an assigned value, italic fixed width text provides the name of the rule being defined, and normal text not in square brackets between USER: and SYS: refers to the user’s action). Initially the user is requested to define an initial value rule (InitialSystemVariablesValueInd 1), which sets required-motor-horsepower to 0; the user then defines that the required-motor-horsepower value should be displayed when the design process completes (OutputCalculatedSystemVariablesRuleInd 1). A value assignment rule is then defined (SystemVariableValueAssignmentRuleInd 1), which tells the system how to calculate the required-motor-horsepower value once the other relevant variables have been given a value. A constraint rule related to required-motor-horsepower is then defined (SystemVariableConstraintRuleInd 1), namely that if it is greater than the horsepower of the motor (chosen in the proposed design state), then a constraint violation called “need more horsepower” is asserted. Finally, the two associated fix rules are defined: FixRuleInd 1, which states that if there is a constraint violation called “need more horsepower” then an alternative motor with a larger horsepower should be selected; or, alternatively, car-speed should be reduced by 50 (FixRuleInd 2).
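The assign/constraint/fix cycle just described can be sketched as a small propose-and-revise loop. This is an illustrative Python sketch, not the JessTab implementation: the motor table and the preference for upgrading the motor before reducing car-speed are invented simplifications, while the horsepower formula is the one given in the assignment rule of figure 4.17.

```python
# Sketch: propose-and-revise loop for the required-motor-horsepower rules.
motors = [{"model": "10hp", "horsepower": 10},   # invented, simplified table
          {"model": "20hp", "horsepower": 20},
          {"model": "40hp", "horsepower": 40}]

def required_horsepower(car_capacity, car_speed, motor_system_efficiency):
    # Assignment rule from figure 4.17.
    return (car_capacity * car_speed * 0.6) / 33000 * motor_system_efficiency

def select_motor(car_capacity, car_speed, efficiency):
    motor = motors[0]                       # initial selection rule: 10hp motor
    while True:
        needed = required_horsepower(car_capacity, car_speed, efficiency)
        if motor["horsepower"] >= needed:   # constraint satisfied: done
            return motor, car_speed
        # Constraint violated: "need more horsepower".
        bigger = [m for m in motors if m["horsepower"] > motor["horsepower"]]
        if bigger:                          # fix 1: pick a larger motor
            motor = bigger[0]
        else:                               # fix 2: reduce car-speed by 50
            car_speed -= 50

motor, speed = select_motor(3000, 400, 0.5)
# The 10hp motor is insufficient (~10.9hp needed), so the fix upgrades to 20hp.
```

The loop mirrors the propose-and-revise control regime: propose a design extension, check constraints, and apply fixes until no violation remains.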
Acquiring Rules for Diagnosis Figure 4.16 illustrates the relationship between the two rule types in PS(diag, -), and the rule KA graph that is generated from them; this section provides an example of how they can be used to guide the acquisition of the diagnostic rules relating to the elevator ride being rough when travelling at top speed (as shown in figure 4.15). Figure 4.20 provides a sample protocol illustrating how these rules could be acquired, in a similar style to figure 4.19. Initially the user starts by defining a cause rule stating that a rough elevator ride at top speed can be caused by poor rail alignment (CauseRuleInd 1). A repair rule for poor rail alignment is then defined (RepairRuleInd 1), stating that it can be repaired by realigning the rail. The user then wishes to define another cause rule for the fault (CauseRuleInd 2), which states that it can be caused by the motor control being incorrect. After defining this rule, a related repair rule (adjust the motor control) is defined (RepairRuleInd 2).
4.8 Summary
This chapter provides a description of an overall KBS development approach, in which reusable knowledge components (domain ontologies and PSs) are first acquired from the Web, then configured to work together, and the resulting system is subsequently executed. This approach proposes using existing ontology search engines to support the acquisition of domain ontologies, and an initial system which supports Web based searching for generic PSs. The concepts used by the KBS development methodology developed in this thesis are also introduced, and initial investigations into how these reusable components can be configured into a new KBS are also described. Reusable components in this approach take the form of domain ontologies and generic PSs. Domain ontologies are used to provide the KBS with knowledge about a domain, with generic PSs being used to provide the KBS with the ability to reason with that domain knowledge. A generic PS consists of an ontology and a generic rule set. The PS ontology
6 [Text in square brackets] is displayed to the user; fixed-width text refers to domain concepts in PS-ONT(diag, [elevator']); italic text refers to a variable without an assigned value; italic fixed-width text provides the name of the rule being defined; and normal text (not in square brackets) between USER: and SYS: refers to the user's action.
InitialSystemVariables Value Rule
  InitialSystemVariablesValueInd 1: Set the required-motor-horsepower to be 0 initially

Output Calculated SystemVariables Rule
  OutputCalculatedSystemVariablesRuleInd 1: Display the final required-motor-horsepower value

SystemVariable Value Assignment Rules
  SystemVariableValueAssignmentRuleInd 1: IF required-motor-horsepower does not have a value AND car-capacity has a value AND car-speed has a value AND motor-system-efficiency has a value THEN required-motor-horsepower = (car-capacity * car-speed * 0.6)/33000 * motor-system-efficiency

SystemVariable Constraint Rules
  SystemVariableConstraintRuleInd 1: IF required-motor-horsepower > horsepower of motor THEN assert violation "need more horsepower"

Fix Rules
  FixRuleInd 1: IF violation "need more horsepower" THEN replace motor with another motor with horsepower > required-motor-horsepower
  FixRuleInd 2: IF violation "need more horsepower" THEN reduce car-speed by 50

Figure 4.17: Example assign, constraint, and fix rules for required motor horsepower, which are required by KBS(pnr, elevator).
Figure 4.18: Rule KA graphs for PS(pnr, -).
InitialSystemVariablesValueInd 1
SYS: [Please provide a list of initial values for the variables: System variable required-motor-horsepower value Define value. There is no need to add any consequents to this rule]
USER: Changes [Define value] to [0] and presses the ["Acquire Next Rule"] button

OutputCalculatedSystemVariablesRuleInd 1
SYS: [Please provide a list of system variables that should be displayed upon completing the design process: System variable required-motor-horsepower. There is no need to add any consequents to this rule]
USER: Makes no changes and presses the ["Acquire Next Rule"] button

SystemVariableValueAssignmentRuleInd 1
SYS: [If the following concepts have values: System variable required-motor-horsepower Then assign the following concepts the values: System variable required-motor-horsepower Value Define value]
USER: Deletes the antecedent and defines the following antecedents: [Concept car property capacity has value greater than 0; Concept car property speed has value greater than 0; System variable motor-system-efficiency has value greater than 0]. Then changes [Define value] for the consequent to [(car-capacity * car-speed * 0.6)/33000 * motor-system-efficiency] and presses the ["Acquire Next Rule"] button

SystemVariableConstraintRuleInd 1
SYS: [If the following evaluate to true equation Then assert violation violation name]
USER: Changes [equation] to [required-motor-horsepower greater than value of concept motor property horsepower], changes [violation name] to ["need more horsepower"], and presses the ["Acquire Next Rule"] button

FixRuleInd 1
SYS: [If the following violations are present Violation name "need more horsepower" Then apply fix Replace concept with one that satisfies equation equation (fix desirability set value)]
USER: Changes [concept] to [motor], changes [equation] to [Value of Concept motor property horsepower greater than required-motor-horsepower], changes [set value] to [1], and presses the ["Acquire Similar Rule"] button

FixRuleInd 2
SYS: [If the following violations are present Violation name "need more horsepower" Then apply fix fix]
USER: Selects the type of [fix] to be a decrease fix
SYS: Changes [fix] to [Reduce concept value by delta (fix desirability set value)]
USER: Changes [Reduce concept value] to [Concept car property speed], changes [delta] to [50], and changes [set value] to [2]
Figure 4.19: An example protocol illustrating how the rules defined in figure 4.17 might be acquired from the user.
CauseRuleInd 1
SYS: [Starting KA for elevator. Creating a new cause rule: The following component states(/malfunctions) Component elevator has state state1 Can be caused by Component elevator has state state2]
USER: Makes the following changes: changes [state1] to ["rough ride at top speed"]; changes the consequent [elevator] to [rail]; changes [state2] to ["poor alignment"]; presses the ["Acquire Next Rule"] button
SYS: [Do you wish to define a cause or repair rule for "elevator rough ride at top speed can be caused by rail poor alignment"]
USER: Selects [repair rule]

RepairRuleInd 1
SYS: [The following component states Component rail has state "poor alignment" Can be repaired by Performing action]
USER: Changes [action] to ["Realign the rail"] and presses the ["Acquire Next Rule"] button
SYS: [Do you wish to: Define another rule related to "elevator rough ride at top speed can be caused by rail poor alignment"; Define a rule similar to "elevator rough ride at top speed can be caused by rail poor alignment"; Finish KA for elevator]
USER: Selects [Define a similar rule]

CauseRuleInd 2
SYS: [The following component states(/malfunctions) Component elevator has state "rough ride at top speed" Can be caused by Component elevator has state state2]
USER: Makes the following changes: changes the consequent component to [motor]; changes [state2] to ["control incorrect"]; presses the ["Acquire Next Rule"] button
SYS: [Do you wish to define a cause or repair rule for "elevator rough ride at top speed can be caused by motor control incorrect"]
USER: Selects [repair rule]

RepairRuleInd 2
SYS: [The following component states Component motor has state "control incorrect" Can be repaired by Performing action]
USER: Changes [action] to ["Adjusting the motor control"] and clicks on ["Acquire Next Rule"]
SYS: [Do you wish to: Define another rule related to "elevator rough ride at top speed can be caused by rail poor alignment"; Define a rule similar to "elevator rough ride at top speed can be caused by rail poor alignment"; Define another rule related to "elevator rough ride at top speed can be caused by motor control incorrect"; Define a rule similar to "elevator rough ride at top speed can be caused by motor control incorrect"; Finish KA for elevator]
USER: Selects [Finish KA for elevator]
Figure 4.20: An example protocol illustrating how the rules related to a rough elevator ride could be acquired.
describes the structure of the PS's required domain knowledge and the types of rules the PS uses to reason over that domain knowledge, with the generic rule set providing an implementation of the generic parts of the PS's reasoning process. Currently, two generic PSs have been built: a propose-and-revise based PS, PS(pnr, -), and a diagnostic PS, PS(diag, -). Both generic PSs are based on the reasoning components of two existing KBSs; two domain ontologies, ONT(elevator, [diag]) and ONT(elevator, [pnr]), were also extracted from these KBSs. Two manual reuse experiments were performed in which KBSs were built by configuring PS(diag, -) with ONT(elevator, [pnr]) and configuring PS(pnr, -) with ONT(elevator, [diag]). These experiments provided many insights into the configuration process and suggested many requirements for MAKTab, a tool which supports the user with this configuration task. Briefly, these included the identification of the variety of mapping types required, of possible issues with applying those mappings, and of techniques for guiding the user through the process of creating the rules required by a KBS, as well as insights into how executable KBSs can be produced. An outline of the derived KA technique that MAKTab uses to support the rule creation process is provided in section 4.7, along with examples of its use with the two generic PSs.
Chapter 5
Supporting the Proposed KBS Development Methodology
5.1 Overview
This chapter discusses the design of MAKTab (the MApping and Knowledge acquisition Tab), a Protégé tab plug-in designed to support users in building and executing a KBS by configuring a domain ontology and generic PS to work together, following the methodology described in section 4.4. To support this process, MAKTab consists of two tools: an ontology mapping tool and a knowledge acquisition tool (the designs of the tools are discussed in detail in sections 5.3 and 5.4 respectively). The ontology mapping tool supports the user with mapping knowledge from a domain ontology to a PS ontology, and includes features such as an extendible range of mapping types and automatic mapping suggestion. The scope and cardinality of mappings supported by the mapping tool are also discussed, along with issues regarding the mapping process and how the mapping tool attempts to maintain the consistency of mapped knowledge. Finally, the design features of the mapping tool's interface are discussed, both in terms of what it looks like and how it is built at run-time. The Knowledge Acquisition (KA) tool is responsible for supporting the user with defining the rules that will enable the KBS to use the mapped domain knowledge. The guided KA process used by the KA tool is based on a series of KA graphs, which capture the relationships between the different types of PS rules, combined with a depth first traversal of these graphs to give the process a "natural" flow. Rules are defined for one domain concept at a time, which allows the KA tool to offer more support by suggesting values for the antecedents and consequents of the rules. Along with the ability to create new rules related to a particular concept using the guided KA feature, the KA tool supports the user with adding new domain concepts, editing existing domain concepts, creating new rules individually (i.e. without using the guided KA feature), and editing and deleting existing rules.
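The depth-first traversal of a rule KA graph can be sketched as follows. This is a minimal illustration, not MAKTab's actual code: the graph structure and rule-type names (a cause rule leading to repair rules and further cause rules, as in the diagnosis examples of chapter 4) are simplifying assumptions, and a depth bound stands in for the user choosing to finish KA.

```java
import java.util.*;

// Sketch of guided KA: rule types form a graph (e.g. a cause rule links to
// repair rules and to further cause rules), and acquisition visits them
// depth-first so that related rules are acquired together. The graph and
// the depth bound are illustrative assumptions.
public class KaGraphTraversal {
    // adjacency: rule type -> rule types that may be acquired next
    static final Map<String, List<String>> KA_GRAPH = Map.of(
        "CauseRule", List.of("RepairRule", "CauseRule"),
        "RepairRule", List.of());

    // Depth-first order in which the tool would prompt for rule types;
    // maxDepth bounds the CauseRule self-loop so traversal terminates.
    public static List<String> acquisitionOrder(String start, int maxDepth) {
        List<String> order = new ArrayList<>();
        visit(start, maxDepth, order);
        return order;
    }

    private static void visit(String type, int depth, List<String> order) {
        if (depth < 0) return;
        order.add(type);
        for (String next : KA_GRAPH.getOrDefault(type, List.of()))
            visit(next, depth - 1, order);
    }
}
```

With a depth bound of 1, starting from a cause rule, the prompt order is cause rule, then its repair rule, then a related cause rule, matching the flow of the protocol in figure 4.20.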
Section 5.5 discusses how the tool can also generate an executable version of the rules defined by the user. The rule generation feature is also extendible: currently, MAKTab produces JessTab rules; however, it could be configured to generate rules in any suitable formalism. These generated rules can then be combined with the generic rules provided by the generic PS and the (now enhanced) PS ontology to produce a new KBS, which can (currently) be executed using JessTab.
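An extendible rule-generation design of this kind might be structured along the following lines. The interface, class names, and the Jess-style output string are illustrative assumptions for this sketch, not MAKTab's actual API; real JessTab rules would also reference Protégé ontology frames.

```java
// Sketch of an extendible rule generator: each target formalism implements
// one interface, so supporting a new formalism only requires a new
// implementation. Names and output format are illustrative.
public interface RuleGenerator {
    String generate(String ruleName, String antecedent, String consequent);
}

// Hypothetical generator emitting a Jess-like defrule string.
class JessStyleGenerator implements RuleGenerator {
    public String generate(String ruleName, String antecedent, String consequent) {
        return "(defrule " + ruleName + " " + antecedent + " => " + consequent + ")";
    }
}
```

Swapping the generator implementation would then retarget the same acquired rules to a different execution engine without changing the KA tool itself.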
5.2 Supporting Reuse-Based KBS Development (MAKTab)
The reuse-based KBS development methodology described in this work is supported by MAKTab (the MApping and Knowledge acquisition Tab), which automatically performs as many as possible of the operations required during the manual reuse experiments (section 4.6), while supporting the user when he/she needs to make decisions. MAKTab consists of two sub-tools: an ontology mapping tool and a rule KA tool, which support the respective stages of KBS development. MAKTab was developed as a tab plug-in for the Protégé environment [104]; this allows MAKTab to take advantage of Protégé's various facilities, such as its ability to read ontologies in a variety of formalisms (including Frames and OWL), its API for handling ontologies, its interface widgets for handling user interaction with an ontology, and the wide variety of other available plug-ins, particularly JessTab. MAKTab's mapping tool supports the reuse of existing domain knowledge from ontologies by enabling the user to define mappings between two ontologies, typically a suitable domain ontology and a generic PS's ontology, and, when instructed, applying those mappings. The KA tool then uses the enhanced PS ontology to guide the user through the process of defining the domain specific reasoning knowledge (rules) that will allow the PS to work with the mapped domain knowledge. When the user is satisfied that the necessary rules have been defined, the KA tool converts them to an executable format, which, combined with the PS's ontology and generic rule set, provides the user with an executable KBS. The user is free to return at any time to either of the tools to extend the knowledge available to the KBS. MAKTab was designed to work with the Protégé 3 system, which supports ontologies defined in both the original Frames format and the newer OWL format, through the use of two separate APIs.
Although both APIs comply with the same interfaces (defined by the Protégé Frames API), due to the differing natures of Frames and OWL, it is recommended that, to ensure a plug-in works correctly, it employ the API matching the formalism of the ontology it is using. MAKTab was primarily designed for use with OWL ontologies; however, to accommodate Frames ontologies, it was designed around a Model-View-Controller architecture [93] with a series of Abstract Factory classes [37, chap. 3] which enable the selection of the appropriate Protégé API at run-time, based on the ontologies being used.
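The run-time API selection described above can be sketched with an Abstract Factory. The class names and the boolean test below are illustrative stand-ins; the real factories would wrap the Protégé OWL and Frames APIs and inspect the loaded ontology to choose between them.

```java
// Sketch of run-time API selection: a factory method inspects the loaded
// ontology and returns an implementation bound to the matching Protege API
// (OWL or Frames). Class names and the isOwl flag are illustrative.
public abstract class OntologyApiFactory {
    public abstract String apiName();

    // In MAKTab the choice would be derived from the ontology itself;
    // a flag stands in for that check here.
    public static OntologyApiFactory forOntology(boolean isOwl) {
        return isOwl ? new OwlApiFactory() : new FramesApiFactory();
    }
}

class OwlApiFactory extends OntologyApiFactory {
    public String apiName() { return "Protege-OWL API"; }
}

class FramesApiFactory extends OntologyApiFactory {
    public String apiName() { return "Protege Frames API"; }
}
```

Client code (the mapping and KA tools) depends only on the abstract factory, so neither tool needs to know which formalism the ontology uses.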
5.3 MAKTab's Ontology Mapping Tool
The ontology mapping tool of MAKTab provides the user with the facility to reuse existing domain ontologies (which can be loaded into Prot´eg´e) during the development of a new KBS. This is achieved by mapping the knowledge contained in a domain ontology to the generic PS’s ontology (for example, PS-ONT(pnr, -) in the case of PS(pnr, -), see sections 4.6.1 and 4.6.2). It is expected that the main knowledge acquired from the mapping stage relates to domain entities, which are represented by the PSConcept class (and its subclasses) in the PS ontology. The mapped knowledge is then used in the development of domain specific rules in the KA stage. The main challenges for the user in the mapping stage are determining which concepts in their ontology map to concepts in the PS ontology, and how these mappings are defined. Consequently, to provide maximum support to the user in performing this task, the mapping tool has been designed to be extendible so that it can incorporate new mapping features in the future as needed. This section
discusses the mapping tool with respect to the four criteria defined by Park et al. [87] as part of their work on PSM Librarian for describing ontology mapping tools: mapping power/complexity (section 5.3.1), mapping scope (section 5.3.2), mapping dynamicity (section 5.3.3), and mapping cardinality (section 5.3.4). The automatic mapping suggestion feature (section 5.3.5), and interface related design issues (section 5.3.6) are also discussed.
5.3.1 Mapping Power/Complexity
This refers to the expressive power and complexity of the mappings supported by the tool. As the number and type of transformations (mappings) supported is the limiting factor in this type of knowledge reuse, the mapping tool supports an extendible range of mapping types. The mapping types currently provided by MAKTab, which are based on those required during the manual reuse experiments, are broadly categorized into either transformation mappings or direct creation mappings, as shown in figure 5.1 (getter and setter methods for class variables have been excluded to improve layout). This classification is based on how they are applied: the transformation mappings typically perform some kind of transformation of the property values of an individual in the source ontology to create a new individual in the target ontology (and so creating a new individual for every relevant individual in the source ontology). As such there are often a series of transformation mappings associated with a particular class in the source ontology. The direct creation mappings on the other hand, typically create a single new entity (either a new class, property or individual) in the target ontology, directly based on some concept in the source ontology. Table 5.1 provides a description of the direct creation mappings currently provided by MAKTab, and table 5.2 provides a description of the transformation mappings. Both tables describe the mappings in terms of the name of the mapping, which concepts (from the source ontology) they are applicable to, a description of the purpose of the mapping, and an example of its use. An alternative classification method could be based on which concepts the mapping can be applied to, that is, either a class or a property.
Figure 5.1: Mapping types currently supported by MAKTab.
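The two mapping categories can be sketched as two interfaces, distinguished by how they are applied. This is an illustrative sketch only: the interface names, signatures, and the string-keyed representation of individuals are assumptions, not MAKTab's actual classes.

```java
import java.util.*;

// Sketch of the two mapping categories: a direct creation mapping creates a
// single new target entity from a source concept, while a transformation
// mapping is applied per source individual, each contributing a property
// value to a new target individual. All names are illustrative.
interface DirectCreationMapping {
    // returns the name of the single new entity created in the target ontology
    String createTargetEntity(String sourceConcept);
}

interface TransformationMapping {
    // Applied to one source individual's property values; yields the mapped
    // (target property, value) pair for the target individual being built.
    Map.Entry<String, String> apply(Map<String, String> sourceValues);
}

// Hypothetical transformation mapping in the style of the Property
// Renaming Mapping of table 5.2.
class RenamePropertyMapping implements TransformationMapping {
    final String sourceProp, targetProp;
    RenamePropertyMapping(String s, String t) { sourceProp = s; targetProp = t; }
    public Map.Entry<String, String> apply(Map<String, String> sourceValues) {
        return Map.entry(targetProp, sourceValues.get(sourceProp));
    }
}
```

Because all transformation mappings for a source class share this shape, they can be applied as a set to each of that class's individuals, which is what makes the per-individual application loop of section 5.3.3 possible.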
Name: Class To Individual Mapping
Applicable to: Classes
Description: Creates an individual of a class in the target ontology, based on a class in the source ontology.
Example: Create a SystemComponent individual called Doors in PS-ONT(diag, -), based on the Doors class in ONT(elevator, [pnr]), to represent faults associated with all types of doors.

Name: Copy Single Class Mapping
Applicable to: Classes
Description: Copies a class (including properties with this class in their domain and, optionally, any individuals associated with it) from the source ontology to the target ontology.
Example: Copying the Motors class and associated individuals from ONT(elevator, [pnr]) to PS-ONT(diag, -) to enable description of faults for the different types of motors represented by the class's individuals.

Name: Copy Single Property Mapping
Applicable to: Properties
Description: Copies a property from the source ontology to the target ontology; where possible the domain and range of the property are preserved. The new property in the target ontology typically also has a class from the target ontology included in its domain.
Example: Copying the doors-enter-exit property from ONT(elevator, [diag]) to PS-ONT(pnr, -) for use when defining rules related to the doors.

Name: Property To Individual Mapping
Applicable to: Properties
Description: Creates a new individual in the target ontology, typically of the PSConcept class or a subclass thereof, based on (the name of) a property in the source ontology.
Example: Creating an individual of the SystemComponent class in PS-ONT(diag, -) called doors-specs, based on the door-specs property in ONT(elevator, [pnr]), to be used for defining faults associated with doors.
Table 5.1: Descriptions of the four direct creation mappings currently provided by MAKTab.

These mappings were based on the mappings performed manually during the initial manual reuse experiments described in section 4.6, and provide an initial set of mapping types which meet the requirements of the current applications. It is, of course, possible that other types of mapping will be required by other applications, and so the mapping tool has been designed to incorporate new types of mapping as and when they are required. The design of this part of the tool is based on the Abstract Factory pattern, with subclasses (which provide the implementations) loading the available mapping types dynamically at run-time from an external configuration file. Figure 5.2 provides the class diagram of the MappingFactory, which specifies two methods for listing the available types of mappings, based on whether they can be applied to classes or properties. This classification differs from that used in tables 5.1 and 5.2 to enable the display of relevant mapping types based on the concept in the source ontology that the user is defining a mapping for. Figure 5.3 provides the sequence diagram for loading the available mapping types, which is performed when MAKTab is initialised for an ontology. The MappingController's initialise method is called by MAKTab; the MappingController first determines the appropriate MappingFactory based on the ontology, and then calls its loadMappingsFrom method, passing it the name of the configuration file. The MappingFactory then reads each of the mapping class names specified in the configuration file, loads each class using the Java Reflection API, and creates a new instance of it. The new instance is then added to the relevant list
Name: Property Renaming Mapping
Applicable to: Properties
Description: Defines a mapping between a property in the source ontology and a property in the target ontology, which may or may not have the same name.
Example: A mapping between the model-name property from ONT(elevator, [pnr]), which is in the domain of all the components, and the name property of PS-ONT(diag, -), which is in the domain of PSConcept.

Name: Simple Property Concatenation Mapping
Applicable to: Datatype Properties
Description: Defines a mapping between multiple properties with a common class (in the source ontology) in their domain; for every individual of that class, the values of those properties are concatenated and mapped to a single property in the target ontology.
Example: Concatenating the values of the model-name, horsepower and max.current properties from the domain of the motors class in ONT(elevator, [pnr]) to the name property in the domain of SystemComponent in PS-ONT(diag, -), to give several (possibly) uniquely named individuals representing different types of motor.
Table 5.2: Descriptions of the transformation mappings currently provided by MAKTab.
of mappings (either classMappings or propertyMappings); once all the mapping types have been loaded, they can be retrieved from the MappingFactory by calling the relevant method. The configuration file is an XML document which includes a list of the Java class names of the mapping types; an example listing is provided in figure 5.4.
Figure 5.2: Class diagram of MappingFactory, including implementing factories for OWL and Frames mappings.

New mapping types can be added to MAKTab by adding the class names of the mappings to the configuration file, and ensuring the relevant class is available to the Java Virtual Machine (JVM) when Protégé is run.
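The reflection-based loading step might look roughly as follows. This is a sketch under stated assumptions: an in-memory list of class names stands in for the XML configuration file, and the loader signature is hypothetical, not MAKTab's actual loadMappingsFrom method.

```java
import java.util.*;

// Sketch of dynamic mapping-type loading: the factory reads Java class
// names (here from a list standing in for the XML configuration file) and
// instantiates each via the Java Reflection API. A failed load is reported
// rather than aborting, so one bad entry does not disable the others.
public class MappingLoader {
    public static List<Object> loadMappings(List<String> classNames) {
        List<Object> mappings = new ArrayList<>();
        for (String name : classNames) {
            try {
                // Locate the class and invoke its no-argument constructor.
                Class<?> cls = Class.forName(name);
                mappings.add(cls.getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException e) {
                System.err.println("Could not load mapping type: " + name);
            }
        }
        return mappings;
    }
}
```

Because only class names appear in the configuration file, a new mapping type is added by shipping its class on the JVM classpath and listing its name, with no change to MAKTab itself.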
Figure 5.3: Sequence diagram showing how the relevant types of mappings are dynamically loaded from a configuration file when required by the GUI.
Figure 5.4: Example segment from the mapping configuration file which defines the available types of mappings.
5.3.2 Mapping Scope
The scope of a mapping defines the range of classes it can be applied to. In order to reduce the number of mappings the user is required to define, MAKTab allows the user to specify the scope of the mapping to be the class the mapping is defined for, or a specified (recursive) depth of that class’s subclasses. This means that the user does not have to create multiple similar mappings for every subclass. Potentially this saves a significant amount of time and effort.
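The subclass-scope mechanism just described can be sketched as follows; the adjacency-map representation of the class hierarchy and the method name are illustrative stand-ins for the Protégé API, not MAKTab's actual code.

```java
import java.util.*;

// Sketch of mapping scope: a mapping defined for a class can also cover its
// subclasses down to a user-specified recursive depth, so one mapping
// replaces many near-identical ones. The hierarchy representation is an
// illustrative stand-in for the Protege class API.
public class MappingScope {
    // classes covered by a mapping on `cls` with subclass depth `depth`
    public static Set<String> inScope(Map<String, List<String>> subclasses,
                                      String cls, int depth) {
        Set<String> scope = new LinkedHashSet<>();
        scope.add(cls);
        if (depth > 0)
            for (String sub : subclasses.getOrDefault(cls, List.of()))
                scope.addAll(inScope(subclasses, sub, depth - 1));
        return scope;
    }
}
```

For a hypothetical hierarchy in which Component has subclasses Motor and Door, and Motor has subclass ACMotor, a depth of 1 covers Component, Motor and Door, while a depth of 2 also covers ACMotor.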
5.3.3 Mapping Dynamicity
Dynamicity refers to when and how the mappings are invoked. Park et al. [87] suggest two options: either applying them before the final system (KBS) is executed, or applying them dynamically at run-time as they are required by the system. As MAKTab uses the mapped knowledge during the KA stage, to simplify the design the mappings are invoked when the user is satisfied with the mappings they have defined and instructs the tool to apply them. Typically, the first time the user configures a domain ontology to work with a generic PS, the mappings must be applied before the KA process can be initialised, as otherwise the PS ontology will not contain domain concepts for the KA tool to acquire rules for. The only time this is not the case is when the user chooses to add domain knowledge to the PS ontology themselves (see section 5.4.2). The user is free to return to the mapping tool at any time to define and apply new mappings while building their new KBS. The manual reuse experiments identified two important issues relating to the application of mappings, namely the order in which the two types of mappings are applied, and maintaining the consistency of the ontology's individuals (the KB) once they have been mapped. The order of application is important as it is a factor in maintaining the consistency of the KB during mapping. Maintaining consistency is particularly important when applying property level transformation mappings, where all the mappings associated with a class in the source ontology are applied to that class's individuals, creating new individuals of a class in the target ontology. When dealing with datatype properties, it is easy to maintain consistency, as the new property values of the individuals being mapped can be created as new resources in the target ontology.
When dealing with object properties, however, this is not necessarily the case, as the value(s) of the properties can be other resources from the source (or indeed, any) ontology. Therefore, it is desirable that, where possible, the corresponding values in the mapped individuals are set to refer to the equivalent individuals in the target ontology. Figure 5.5 illustrates this. In this example, the source ontology (identified with the namespace so) is mapped to the target ontology (namespace to) through two mappings: Mapping 1 (so:name in the domain of so:motor maps to to:name in the domain of to:PSConcept) and Mapping 2 (copy the so:motor-system class as a subclass of to:PSConcept). The source ontology contains two individuals: so:m1 (a so:motor) and so:ms1 (a so:motor-system), where so:ms1 has so:m1 as the value of its so:hasMotor property. When applying mapping 1, a new individual of to:PSConcept, called to:m1, is created based on so:m1, with the same value for the to:name property. When mapping 2 is executed, a new class to:motor-system is created as a subclass of to:PSConcept based on so:motor-system, and the corresponding property (to:hasMotor, assuming the algorithm creates the new property in the target ontology's namespace (to)) with the same name as the original property (hasMotor) and individual
5.3. MAKTab’s Ontology Mapping Tool
113
(to:ms1) are created. In the incomplete case, when creating to:ms1 the system is unable to assign a value to the to:hasMotor property, as (in so:ms1) it referred to a resource in the source ontology, which is unknown in the target ontology. However, in the complete additions (bottom of figure 5.5), the system records that so:m1 was mapped to to:m1 when applying mapping 1; it is therefore able to determine that so:m1 corresponds to to:m1, and so sets the value of to:hasMotor for to:ms1 to to:m1.

Source Ontology
  Class(so:motor complete)
  Class(so:motor-system complete)
  ObjectProperty(so:hasMotor domain(so:motor-system) range(so:motor))
  DatatypeProperty(so:name domain(so:motor-system so:motor) range(xsd:string))
  Individual(so:m1 type(so:motor) value(so:name "motor 10hp"))
  Individual(so:ms1 type(so:motor-system) value(so:name "motor-system-1") value(so:hasMotor so:m1))

Target Ontology
  Class(to:PSConcept complete)
  DatatypeProperty(to:name domain(to:PSConcept) range(xsd:string))

Mappings
  Mapping 1: so:name in the domain of so:motor maps to to:name in the domain of to:PSConcept
  Mapping 2: copy the so:motor-system class as a subclass of to:PSConcept

Acceptable Additions to the Target Ontology
  From Mapping 1: Individual(to:m1 type(to:PSConcept) value(to:name "motor 10hp"))
  From Mapping 2: Class(to:motor-system partial to:PSConcept)
                  ObjectProperty(to:hasMotor domain(to:motor-system) range(to:PSConcept))

Incomplete Additions to the Target Ontology
  From Mapping 2: Individual(to:ms1 type(to:motor-system) value(to:name "motor-system-1"))

Complete Additions to the Target Ontology
  From Mapping 2: Individual(to:ms1 type(to:motor-system) value(to:name "motor-system-1") value(to:hasMotor to:m1))

Figure 5.5: Applying mappings: if mappings on object properties are incompletely applied, the consistency of the mapped knowledge base is compromised, while the complete additions to the mapped individual produce a consistent mapped knowledge base.
To maintain the consistency of the individuals created in the target ontology during mapping, the mapping application algorithm uses two lookup tables to store mapping-related information as it proceeds. The first table, T1, is used to store the names of mapped concepts: both the name of the concept in the source ontology and the name of the corresponding target concept; table 5.3 provides an example T1. The second table, T2, is used to store the incomplete details of a mapping
related to an object property: it stores the name of an individual in the target ontology that was created from an individual in the source ontology, the name of the property (in the target ontology) whose value it was unable to set, and the name of the individual in the source ontology that was the value of the mapped property of that source individual.

Source Concept            Target Concept
so:motor10hp              to:motor10hp
so:motor20hp              to:motor20hp
so:doors-individual251    to:individual065
so:doors-individual252    to:individual066

Table 5.3: An example of the table T1 used by the mapping algorithm.

Source Individual    Target Individual    Property
so:m1                to:ms1               to:hasMotor
so:m2                to:ms2               to:hasMotor

Table 5.4: An example of table T2 used by the mapping algorithm.

For example, consider the first row of table 5.4: this states that if so:m1 is mapped to the target ontology, the new individual created for it should be set as the value of to:hasMotor for to:ms1. Tables T1 and T2 are global variables in the MappingController class; they are initialised in the constructor and so can be used whenever a MappingController instance exists, that is, from when MAKTab is added as a tab to Protégé until MAKTab is removed, another ontology is opened, or Protégé is closed. In addition to these two lookup tables, the mapping application algorithm makes use of the Builder pattern [37, chap. 3] when applying the transformation mappings. Essentially, the Builder pattern allows the creation of a new individual in the target ontology to be performed gradually, with its property values being set as they become known, i.e. after each transformation mapping has been applied; the new individual is only added to the target ontology after all the relevant mappings have been successfully applied. This has the advantage that the transformation mappings can be applied (to an individual in the source ontology) in any order, and, if an error occurs during the execution of a mapping, it is possible to pinpoint the exact mapping(s) that caused the error; the new individual is not added to the target ontology until the error is resolved. The overall flow of the mapping application process is shown in algorithm 2.
The algorithm first applies the direct creation mappings, which is a relatively straightforward process, as they simply create new concepts in the target ontology. The transformation mappings are then applied; when applying each set of transformation mappings to an individual (from the source ontology), the algorithm stores the result of each mapping and, if no errors occur, adds the new individual to the target ontology and updates T1. The suggested algorithm for applying each transformation mapping is provided in algorithm 3. When dealing with an object property, it uses T1 to determine whether the property value in the source individual has already been mapped to the target ontology; if so, it sets the mapped version of that value as the value of the property for the (new) target individual; otherwise, the mapping is currently incomplete, and a suitable row is added to T2. A walk-through example of algorithms 2 and 3 is provided in appendix A. There are a range of potential problems that can occur when applying mappings if the GUI
has not appropriately restricted the user when the mappings were defined. For example, a mapping may have been defined between two properties which, based on their range restrictions, are potentially incompatible; for example, a datatype property with the range string should not be mapped to a datatype property with the range integer. Another possible problem is when a property from the source ontology is mapped to a property in the target ontology with an incompatible cardinality restriction; for example, mapping from a property with a maximum cardinality of four to a property with a maximum cardinality of two. Application of the latter (to an individual in the source ontology with three or four values for that property) would result in an inconsistent target knowledge base, as either the new (target) individual is inconsistent with its associated class definition, or not all the values from the source individual are mapped to the new (target) individual. To handle this, each mapping can check for such potential problems before it is executed; if it detects a potential problem, it can report an error or a warning. An error indicates the mapping could not be (or has not been) applied, whereas a warning indicates that it has been applied, but may have introduced an inconsistency as a result. All errors and warnings are collected by the mapping algorithm and displayed to the user once the remaining mappings have been applied. Algorithm 2 consists of three main for loops (lines 1, 3, and 29), and so, as long as these loops always terminate, algorithm 2 will always terminate. The first for loop simply iterates through all the classes in the source ontology that have an associated direct creation mapping and applies that mapping; so long as no mapping enters an infinite recursive loop or some other condition that prevents its application from terminating, this for loop will always terminate.
At present the direct creation mappings are simple mappings that create a new class, property, or individual in the target ontology; they involve no recursion and so will always terminate. The second for loop (line 3) iterates through all of the classes in the source ontology with associated translation mappings and applies those mappings to the individuals of each class. The present translation mappings are responsible for mapping the property values of an individual. If a datatype property is being mapped, the mapping simply creates a new resource in the target ontology appropriate to the value being mapped. If an object property is being mapped, the mapping uses T1 to determine the mapped version (in the target ontology) of the source individual's property value and, if a mapped version exists, sets it as the property value of the new individual being created. If a mapped version does not exist, an appropriate row is added to T2 to record this. At no point do the mappings perform any action which might result in a non-terminating loop, such as re-mapping, when mapping the value of an object property, a value that has previously been mapped. This means that the process of applying the translation mappings will always terminate. The final for loop (line 29) simply iterates through the rows in T2, attempting to complete mappings by setting property values of individuals in the target ontology based on the values in T1, and so will always terminate after the last row in T2 has been processed. The correct output for algorithm 2 is a target ontology extended with new concepts which correspond to concepts in the original ontology as defined by the mappings. As mentioned above, the main part of algorithm 2 is split into three sections: applying direct creation mappings, applying translation mappings, and attempting to complete any incomplete mappings. The first
Algorithm 2 Mapping application.
Input:
    ed, a new list for storing errors
    w, a new list for storing warnings
    so, the source ontology
    to, the target ontology
    mappings, the list of mappings
    T1, table of previously mapped concepts
    T2, table of incomplete mappings
Output:
    ed, with any error messages added
    w, with any warnings added
    T1, extended with new mapped concepts
    T2, possibly altered due to new mappings
    The target ontology, to, will be extended with the results of the mappings
Begin
 1: for all concepts in so that a direct creation mapping, m, has been defined for do
 2:     Apply m
 3: for every class, sc, in so that a set of translation mappings, tm, has been defined for do
 4:     Group the mappings in tm by target class to give a list of target classes ltc
 5:     for all target classes, tc, in ltc do
 6:         for all individuals, si, in so with type sc do
 7:             Create new Boolean variable, error ⇐ false
 8:             Create a new builder instance bi
 9:             bi's newInstanceName ⇐ the name of a new individual, tiname, from the target ontology
10:             for all transformation mappings, m, from sc to tc do
11:                 results ⇐ apply m to si
12:                 if there is an error, e, while executing m then
13:                     Add e to ed
14:                     error ⇐ true
15:                 if there is a warning, wr, while executing m then
16:                     results ⇐ wr's result
17:                     Add wr's msg to w
18:                 if error is false then
19:                     Add m's target property/results pairing to bi
20:                 else
21:                     break
22:             if error is true then
23:                 Move on to next si
24:             ti ⇐ create a new individual of type tc with name tiname in to, based on the property/values in bi
25:             Add si/ti pairing to T1
26:             for all occurrences, o, of si in T2's Source Individual column do
27:                 Add ti to o's Target Individual's Property value
28:                 Remove o from T2
    {Try to complete any remaining mappings in T2}
29: for all rows, r, in T2 do
30:     t2si ⇐ r's Source Individual value
31:     if T1's Source Concept column contains t2si then
32:         t1tc ⇐ the corresponding Target Concept value from T1
33:         Add t1tc to r's Target Individual's value for property r's Property
34: if length of ed > 0 then
35:     Display the errors in ed to the user
36: if length of w > 0 then
37:     Display the warnings in w to the user
End
section is a relatively trivial task to complete and should cause no errors. The second and third sections work together to ensure the algorithm produces the correct output for the given inputs, as described above. When the translation mappings are applied, error handling is built into the algorithm: if applying any mapping to an individual is not possible or would result in an error, the mapping returns an error, which is caught at line 12, and that individual is not mapped to the target ontology. This prevents incorrect output in the form of individuals in the target ontology which are not a correct mapping of their source individual. The mapped version of a source individual is only added to the target ontology once all the relevant translation mappings have been applied without an error occurring. This ensures all individuals added to the target ontology are the best version of the source individual that could be created at that point; "best version of a source individual" means that the new (target) individual may lack values for some object properties because the corresponding values in the source ontology have not yet been mapped. Once all the translation mappings have been applied to an individual, lines 26 to 28 find any individuals in the target ontology that were left incomplete because they required the mapped version of the individual (that was just mapped) as a property value, and update those incomplete individuals appropriately. The final section of algorithm 2 (lines 29 to 33) is executed after all the mappings have been applied, and attempts to complete the definitions of individuals in the target ontology in a similar way.
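The interplay between the translation mappings, T1, and T2 can be made concrete with a much-simplified Python sketch of algorithm 2. This is not MAKTab's implementation (which is a Java/Protégé plugin): individuals are plain dicts, mappings are (source property, target property, is-object-property) triples, and the error/warning handling of lines 12-23 is omitted.

```python
# Simplified sketch of algorithm 2. T1 maps source individuals to their target
# counterparts; T2 records (value, owner, property) rows that could not yet be
# completed because the value's mapped version did not exist at the time.

def apply_mappings(individuals, mappings_by_class, T1, T2):
    target = {}
    for (sc, tc), maps in mappings_by_class.items():
        for name, si in individuals.items():
            if si["type"] != sc:
                continue
            tiname = "mapped_" + name            # stand-in for a fresh name
            bi = {"type": tc}                    # the "builder instance"
            for (sp, tp, is_object) in maps:
                for v in si.get(sp, []):
                    if not is_object:
                        bi.setdefault(tp, []).append(v)      # datatype: copy
                    elif v in T1:
                        bi.setdefault(tp, []).append(T1[v])  # already mapped
                    else:
                        T2.append((v, tiname, tp))           # complete later
            target[tiname] = bi
            T1[name] = tiname
            # lines 26-28: complete rows that were waiting for this individual
            for row in [r for r in T2 if r[0] == name]:
                _, owner, prop = row
                target[owner].setdefault(prop, []).append(tiname)
                T2.remove(row)
    # lines 29-33: final pass over any remaining incomplete rows
    for row in list(T2):
        v, owner, prop = row
        if v in T1:
            target[owner].setdefault(prop, []).append(T1[v])
            T2.remove(row)
    return target
```

Note how two individuals that refer to each other (as in the elevator domain, where components are interconnected) are both mapped completely: whichever is mapped first leaves a row in T2, and that row is completed when the other is mapped.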
These steps ensure that no incorrect individuals are created in the target ontology, and that the individuals that are added are as accurate a mapped version of the corresponding source individuals as possible for the given inputs; together they ensure the algorithm terminates with the correct output. The time taken by algorithm 2 to execute is the time taken to apply the direct creation mappings, plus the time taken to apply the translation mappings, plus the time taken to complete any incomplete mappings. The time taken to apply the direct creation mappings is proportional to the number of direct creation mappings that have been defined. The time taken to apply the translation mappings is, summed over every source class that mappings have been defined for, the number of individuals of that class multiplied by the number of mappings associated with the class, plus the time taken to iterate through T2. The time taken to complete the incomplete mappings is proportional to the number of rows in T2 at that point, which could range from zero to a theoretical maximum of the total number of object property values of all the individuals in the source ontology. The main purpose of algorithm 3 is relatively simple: to create a list of concepts that should be set as the value of a property of an individual. Algorithm 3 is broken into three main stages: error checking, mapping a datatype property, and mapping an object property. The first if statement (line 1) checks for any potential errors, causing the algorithm to terminate if executing the mapping would cause an error. The second if statement (line 5) checks whether both properties are datatype properties; if they are, the new concepts can simply be created as new resources in the target ontology, and the algorithm terminates by returning the new values.
Finally, if both properties are object properties, the algorithm checks whether every value being mapped, v, has already been mapped; if so, the appropriate value is retrieved from T1, otherwise the new value cannot yet be assigned and T2 is updated as appropriate. Algorithm 3 then checks whether any warnings were generated during its execution; if so, a warning is thrown (causing the algorithm to terminate) which includes a suitable message and the list of mapped concepts. If no warnings were generated,
Algorithm 3 Applying a transformation mapping.
Input:
    si, the individual in the source ontology that the mapping is being applied to
    sp, the property in the domain of si that the mapping is being applied to
    to, the target ontology of the mapping
    tiname, the name of the individual that will be created in to after all relevant mappings have been applied to si
    tp, the target property of the mapping being applied
    T1, table of previously mapped concepts
    T2, table of incomplete mappings
Output:
    An error message if an error occurs
    Otherwise, T2 is possibly altered due to new mappings, and a list of concepts to be set as the value of tp in the domain of the tiname individual is returned
Begin
 1: if the mapping cannot be applied for some reason (such as incompatible ranges of sp and tp) then
 2:     Throw an error with an appropriate message
 3:     Exit
 4: values ⇐ a new list
 5: if sp and tp are both datatype properties then
 6:     for all values, v, of si's sp do
 7:         nv ⇐ create a new resource in to for v
 8:         Add nv to values
 9:     return values
    {sp and tp must both be object properties, otherwise an error would have been thrown previously}
10: for all values, v, of si's sp do
11:     if T1 contains v in the Source Individual column then
12:         Add the corresponding Target Individual value to values
13:     else
14:         Add v, tiname, tp to T2
15: if a potential error has occurred when applying the mapping then
16:     Throw a warning with msg = a suitable message, and result = values
17:     Exit
18: return values
End
then the algorithm terminates by returning the list of concepts. Algorithm 3 includes various pieces of error handling code to ensure that the correct output is produced. The first if statement (line 1) stops the algorithm from executing if it would cause an error; any warnings produced while executing the mapping (with object properties) are also recorded and returned to the user. The body of the second if statement (line 5) simply creates new resource(s) in the target ontology as appropriate for the value(s) being mapped; the new resources are then returned. Finally, the body of the for statement (line 10) ensures that, for any non-datatype values mapped from the source ontology, the correct corresponding value in the target ontology is returned or, if the corresponding value is not in the target ontology, T2 is updated appropriately in the hope that algorithm 2 will complete the mapping later. When applying the mapping would produce an error, algorithm 3 terminates immediately; otherwise, the time it takes is proportional to the number of values of sp for si.
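Algorithm 3's three stages can be sketched in a few lines of illustrative Python (not MAKTab's code; the function signature and the compatibility flags are assumptions made so the example is self-contained):

```python
# Sketch of algorithm 3: produce the list of values for one target property.
# T1 maps source individuals to target individuals; T2 collects
# (value, target individual name, target property) rows to complete later.

class MappingError(Exception):
    pass

def apply_transformation(si, sp, tiname, tp, both_datatype, compatible, T1, T2):
    if not compatible:                       # line 1: pre-check failed
        raise MappingError(f"cannot map {sp} to {tp}")
    values = []
    if both_datatype:                        # lines 5-9: datatype property
        for v in si.get(sp, []):
            values.append(str(v))            # stand-in for "new resource in to"
        return values
    for v in si.get(sp, []):                 # lines 10-14: object property
        if v in T1:
            values.append(T1[v])             # mapped version already exists
        else:
            T2.append((v, tiname, tp))       # incomplete: record for later
    return values
```

The warning path (lines 15-17) is omitted here; in the full algorithm the partial `values` list travels with the warning so the caller can still use it.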
5.3.4 Mapping Cardinality
The cardinality of an ontology mapping tool specifies the nature of the mappings it supports. MAKTab supports N:1 mappings, allowing multiple domain classes to be mapped to a single PS class. This is necessary to enable, for example, multiple component domain classes (such as Door, Motor, etc.) in ONT(elevator, [pnr]) to be mapped to the single PS(diag, -) SystemComponent class. N:N mappings, however, are supported indirectly: MAKTab stores the mappings while the user is defining them; after they have been executed, the user has the option of deleting the defined mappings, to prevent the earlier mappings being applied again if the user creates further mappings later in the session. If the user opts to clear the mapping store, they are free to define new mappings for classes they have previously mapped; if not, they can still alter the existing mappings, and in doing so effectively define new mappings. These features are provided because, in my opinion, it is necessary to allow the user to define new mappings at any stage during the development of a new KBS: it is unrealistic to assume any user, especially a novice, will know exactly which concepts from the domain ontology will be used in the KA or by the KBS, and so they must be able to repeatedly perform mappings if they so choose. Note that the second and subsequent times a concept from the source ontology is mapped, its corresponding row in table T1 is updated with the name of the new target concept.
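The two behaviours above, N:1 class mappings and T1 rows being overwritten on re-mapping, amount to simple table semantics, sketched here with plain dicts (illustrative only; the class and concept names echo the elevator example but the API is invented):

```python
# N:1 mappings: several source classes may share one target class.
class_mappings = {}          # source class -> target class
# T1: source concept -> its most recently mapped target concept.
T1 = {}

def define_class_mapping(source_class, target_class):
    class_mappings[source_class] = target_class

def record_mapping(source_concept, target_concept):
    T1[source_concept] = target_concept   # a second mapping overwrites the row

define_class_mapping("Door", "SystemComponent")
define_class_mapping("Motor", "SystemComponent")   # N:1 -- same target class
record_mapping("door1", "SystemComponent_1")
record_mapping("door1", "SystemComponent_7")       # re-mapped: row updated
```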
5.3.5 Automatic Suggestion of Mappings
MAKTab aims to reduce the number of mappings the user is required to provide. Allowing mappings to be inherited (by subclasses of the class a mapping is defined for) can help, as can automatically suggesting property renaming mappings to the user. MAKTab also suggests mappings by attempting to match class and property names in the source ontology with those in the target (PS) ontology. This algorithm is based on that of PJMappingTab (section 3.4.2); essentially these suggestions are produced by three types of equivalence tests:
• Firstly, finding identical names, and pairings whose similarity value, as determined by the string similarity metrics library Simmetrics [13], is above a user-set threshold.
• Secondly, matching names which share a (user-set) percentage of common constituents.
• Finally, WordNet [30] is used to suggest appropriate synonyms.
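The first two tests can be illustrated with a rough Python analogue. MAKTab uses the Simmetrics library; here `difflib`'s ratio stands in for a string-similarity metric, names are split on camelCase for the common-constituent test, and both thresholds are user-set values, so this is a sketch of the idea rather than the actual algorithm:

```python
import difflib
import re

def constituents(name):
    """Split a camelCase identifier into its lowercase word parts."""
    return {w.lower() for w in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", name)}

def suggest(source_names, target_names, sim_threshold=0.8, common_threshold=0.5):
    suggestions = []
    for s in source_names:
        for t in target_names:
            # test 1: identical or highly similar names
            if difflib.SequenceMatcher(None, s.lower(), t.lower()).ratio() >= sim_threshold:
                suggestions.append((s, t, "similar name"))
                continue
            # test 2: sufficient proportion of shared constituents
            cs, ct = constituents(s), constituents(t)
            if cs and ct and len(cs & ct) / min(len(cs), len(ct)) >= common_threshold:
                suggestions.append((s, t, "common constituents"))
    return suggestions
```

For example, `Door`/`door` falls to the first test, while `SystemComponent`/`ComponentPart` is caught by the shared constituent `component`. The WordNet synonym test is omitted as it depends on an external lexical database.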
As ontology mapping/matching is an active research field, the automatic suggestion component of the mapping tool has been designed to be extensible, in a similar way to the available types of mappings. The current set of available mapping suggestion techniques is dynamically loaded at run-time, allowing the user to select the technique they would like to use; this should allow new, presently unknown techniques to be easily incorporated in the future. Mapping suggestions are created by a mapping suggestion module, which is invoked by the user. The class diagram (figure 5.6) shows the classes relevant to the suggestion feature. As part of the MappingController's initialise method, the MappingSuggesterFactory is used to load the appropriate MappingSuggesters from the configuration file (the same configuration file discussed in section 5.3.1), in a similar manner to the MappingFactory. The MappingController sets its mappingSuggester variable to the first MappingSuggester in the list returned by the MappingSuggesterFactory's listMappingSuggesters method. This list can also be used, for example, to build a GUI component which allows the selection of an alternative suggestion module. Figure 5.7 provides an example listing of the MappingSuggesters in the configuration file; new mapping suggestion techniques can be added by updating the configuration file appropriately.
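The factory-plus-configuration pattern described here can be sketched as follows. This is a simplified analogue, not MAKTab's Java implementation: a registry dict stands in for reflective class loading, and the suggester class is invented for the example.

```python
# Sketch of loading suggestion modules from a configuration listing, and of
# the controller defaulting to the first suggester in the returned list.

class NameMatchSuggester:
    def suggest(self, source, target):
        # trivial suggester: pair up names that are identical ignoring case
        return [(s, t) for s in source for t in target if s.lower() == t.lower()]

REGISTRY = {"NameMatchSuggester": NameMatchSuggester}

def load_suggesters(config_lines):
    """config_lines: suggester class names, one per line, as in the config file."""
    return [REGISTRY[name]() for name in config_lines if name in REGISTRY]

suggesters = load_suggesters(["NameMatchSuggester"])
default = suggesters[0]          # the controller takes the first in the list
```

Adding a new technique then only requires registering the class and adding its name to the configuration file, with no change to the controller.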
Figure 5.6: Class diagram showing classes relevant to the automatic mapping suggestion feature.

Figure 5.8 shows the sequence diagram for suggesting mappings. When the automatic suggestion function is invoked, the MappingController uses the suggestion module to determine mappings between the two ontologies: the MappingSuggester returns a list of mappings, which the MappingController adds to the MappingStore. The MappingController also returns a message to MAKTab's GUI, providing details of the mappings that were suggested. When the user selects a concept in the source ontology that the suggestion module has already created a mapping for, that mapping is displayed. The MappingStore provides a store
for defined mappings, and is discussed further in section 5.3.6.

Figure 5.7: Example segment from the mapping configuration file which defines the available types of MappingSuggesters.
Figure 5.8: Sequence diagram for the automatic suggestion feature.
5.3.6 Interface Design Features
The design of the mapping tool is based on a Model-View-Controller architecture. The various mappings provide the tool's data model; the main functions of the controller have been discussed above. The view component is provided by the mapping tool's GUI, which offers the user the following functions: loading/selecting the source and target ontologies (both domain and PS); defining a mapping between a concept in the source ontology and a concept in the target ontology; viewing and editing a defined mapping; invoking the mapping suggestion function; and executing the defined mappings. This section discusses how the tool's GUI is built to accommodate the two ontology formalisms, the loading of ontologies, how mappings are defined by the user, and how mappings are stored, viewed and edited.
Building the MappingTab

The main interface class (MappingTab) builds the mapping tool's GUI by selecting the correct components for the ontology that is being displayed. This is achieved through the use of Abstract Factories (or kits, as they are often referred to in interface design). When MAKTab is initialised for an ontology, the MappingController selects the appropriate kit based on
Figure 5.9: The mapping tab interface.
the type of that ontology and creates a new instance of MappingTab, passing it the selected MappingGuiKit. The MappingTab then calls the various methods of the MappingGuiKit to obtain the relevant components and add them to the GUI. These components are two ontology displays (for the source and target ontologies), a mapping definition area, and a button panel. This passes responsibility for displaying the ontologies and mappings, handling user interactions with the display components, and (if necessary) interacting with the ontologies, to the classes returned by the MappingGuiKit. This allows a specific MappingGuiKit to reuse any relevant classes from the appropriate Protégé API. A screenshot of the mapping tab is provided in figure 5.9. The button panel along the bottom of the interface provides the user with access to various functions of the tool, such as invoking the mapping suggestion feature, storing a mapping, and executing the mappings. The mapping definition area is discussed below. The ontology displays, provided by subclasses of the OntologyDisplay class, present a tree view of the ontology's taxonomy using a standard Protégé widget for this task (the number after each class name indicates the number of individuals associated with the class). When the user selects a class, the two tabs below the tree view are updated to display a list of the properties and individuals associated with that class. Double clicking on an individual in the list brings up a dialog box displaying the details of the selected individual. The "Import Current" and "Import External" buttons above the tree view allow the user to select the source/target ontology (either by importing the one currently loaded in Protégé or by loading one from a file); the "Save Ontology" button allows the user to save the ontology they have loaded.
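The kit arrangement can be sketched in Python (the real tool is Java/Swing; the method names here mirror the text's vocabulary but are illustrative). The point of the pattern is that MappingTab never names a concrete widget class, so the same tab code serves both ontology formalisms:

```python
# Abstract Factory ("kit") sketch: one kit per ontology formalism.

class OwlMappingGuiKit:
    def ontology_display(self, role):
        return f"OwlOntologyDisplay({role})"
    def mapping_definition_area(self):
        return "OwlMappingDisplayArea"

class FrameMappingGuiKit:
    def ontology_display(self, role):
        return f"FrameOntologyDisplay({role})"
    def mapping_definition_area(self):
        return "FrameMappingDisplayArea"

class MappingTab:
    def __init__(self, kit):
        # the tab only calls the kit; it never instantiates widgets directly
        self.components = [kit.ontology_display("source"),
                           kit.ontology_display("target"),
                           kit.mapping_definition_area(),
                           "ButtonPanel"]

tab = MappingTab(OwlMappingGuiKit())
```

Supporting a further formalism would then mean writing one new kit class, leaving MappingTab untouched.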
Figure 5.10: Classes involved in loading an OWL ontology.
Figure 5.11: Sequence diagram for loading an OWL source ontology.
Loading Ontologies

The classes involved in loading ontologies are shown in figure 5.10; figure 5.11 shows the sequence diagram for loading the source (OWL) ontology from a file external to Protégé; loading Frame-based ontologies and the target ontology is similar. First, the OwlOntologyDisplay class uses the FileSelector (a utility class which displays a file selection dialog box to the user, returning the user's selected file or null if no file is selected) to prompt the user to select a file; this file is then passed to the OntologyLoader through the loadOwlSourceOntology method. The OntologyLoader class is another utility class, capable of loading OWL or Frame-based ontologies. The loaded ontology is then passed to the OntologyHolder and, finally, back to the OwlOntologyDisplay for display. The OntologyHolder class is a singleton [37, chap. 3] which provides all classes in MAKTab with an easy-to-access store for the source and target ontologies.
Displaying and Editing a Mapping

When a user selects a class or property in the source ontology display, the mapping display is updated to show the mapping that has been defined for that concept or, if no mapping has been defined, the interface for defining a new mapping. The sequence diagram for this operation, with respect to a class being selected, is provided in figure 5.12; the sequence for the selection of a property is similar. When the user selects a class in the ontology, an event listener in the OwlOntologyDisplay is notified, which notifies the MappingController, passing it the selected class; the MappingController then queries the MappingStore to find whether any mappings have been defined for the selected class. The MappingController also obtains the available mapping types for the class. If a mapping was returned from the store, the MappingController passes it to the MappingTab along with the available mapping types for display; the user is then free to edit the mapping. If no mapping was returned from the store, the MappingController passes only the available mapping types to the MappingTab, which displays them to the user, who can then select one and define a mapping.
Figure 5.12: Sequence diagram showing how the system responds to the user selecting a class in the source ontology display.

The MappingDisplayArea class is the GUI component responsible for allowing the user to view the available mapping types and to define/edit a mapping. Figure 5.13a shows the display when the user has selected a property that has no mapping associated with it: the display consists of a combo box listing the available types of mappings. Once the user selects a type and clicks on the "Set Type" button, a new mapping of that type is created and its associated GUI component is added to the area marked "Mapping display" in figure 5.13a, as shown in figure 5.13b. Along with providing a model of a type of mapping, each Mapping instance also knows which view is associated with it. Figure 5.14 shows the classes that are used to display the different types of mappings: they are structured in a hierarchy similar to that of the Mapping classes, which allows them to reuse GUI elements common to all (such as the display of the source and target class names, and the mapping depth). Every MappingGui class provides a display for the parts of the mapping that it knows about, and an area for subclasses to add details of the parts that only they know about, referred to as the subclass display area. When the setMapping method is called on a MappingGui class, the GUI is rebuilt to display the new mapping. For example, consider the case when the setMapping method is called on the
PropertyRenamingMappingGui; figure 5.15 shows the sequence diagram of the subsequent events. The PropertyRenamingMappingGui calls its superclass's refresh method, which in turn calls its superclass's refresh method, and so on until the MappingGui's refresh method is called. The MappingGui then updates its display of the source class and mapping depth, and removes any components that have been added to its subclass display area. The abstract method getSubclassDisplay is then called, with the result of that method call being added to the subclass display area. The TransformationMappingGui responds to that method call by building a display component for the mapping's source property and target class and property. Before the TransformationMappingGui returns the component, it calls the abstract method getTransformationSubclassDisplay and adds the result to its own subclass display area. Finally, the PropertyRenamingMappingGui responds to that method call by returning an empty panel (as it has nothing to add to the display), the chain of method calls return their components, and the GUI is updated.

Figure 5.13: Screenshots of the mapping display area when defining a class level mapping.
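The refresh chain is essentially the Template Method pattern; it can be sketched as follows (illustrative Python, not the actual Swing code; the strings stand in for GUI components):

```python
# Each level of the hierarchy renders the parts of a mapping it knows about
# and delegates the rest to its subclass via an overridable hook method.

class MappingGui:
    def refresh(self):
        parts = ["source class", "mapping depth"]    # elements common to all
        parts.append(self.get_subclass_display())    # filled in by subclasses
        return parts
    def get_subclass_display(self):
        return ""                                    # default: nothing extra

class TransformationMappingGui(MappingGui):
    def get_subclass_display(self):
        own = "source property + target class/property"
        extra = self.get_transformation_subclass_display()
        return own + (" | " + extra if extra else "")
    def get_transformation_subclass_display(self):
        return ""

class PropertyRenamingMappingGui(TransformationMappingGui):
    def get_transformation_subclass_display(self):
        return ""       # nothing to add: the "empty panel" of the text
```

A call to `PropertyRenamingMappingGui().refresh()` walks up to the root, then the hook calls walk back down, exactly as in figure 5.15.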
Saving a Mapping

Once the user has edited or defined a mapping they are required to save it, which adds the mapping to the MappingStore. This is not performed automatically, to allow the user to review their changes before they are saved; if the user believes they have made an error in defining the mapping, they can either correct it or reload the mapping from the store (by selecting another concept and then re-selecting the concept whose mapping contains the error: as the mapping was not saved before another mapping was displayed, any changes the user made to the first mapping will be lost). When the user initiates the saving of a mapping (by clicking on the relevant button on the GUI), the MappingTab passes the mapping (that it is displaying) to the MappingController. The MappingController in turn passes it to the
Figure 5.14: Class diagram of the mapping interfaces.
Figure 5.15: Sequence diagram for building the PropertyRenamingMappingGui GUI.
Figure 5.16: Sequence diagram showing how the defined mappings are applied.
MappingStore, which then stores the mapping for future retrieval, overwriting any existing mapping for that concept. The MappingStore class uses two tables to provide a store for mappings defined (and saved) using the mapping tool. The first table stores the class level mappings, with the source class names used as the table's keys and the corresponding mappings as the values. The second table stores the property level mappings: to provide efficient access to a property level mapping, this table uses the name of the source class that a property mapping was defined for as the key, and a further table as the corresponding value; this inner table uses the name of the mapping's source property as the key and the mapping as the value. This means property level mappings can be accessed and stored relatively quickly, requiring only two lookups, or one lookup and two entry operations, respectively. When queried for a mapping, the MappingStore finds the relevant mapping in its tables, creates a clone of it, and returns the clone. The benefit of doing this is that it allows the user to alter the mapping displayed by the GUI without affecting the stored mapping, ensuring mappings are only saved when the user is satisfied with them and instructs the tool to save them.
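The two-table layout and the clone-on-read behaviour can be sketched as follows (illustrative Python; in this sketch a mapping is just a dict):

```python
import copy

class MappingStore:
    def __init__(self):
        self.class_mappings = {}     # source class -> mapping
        self.property_mappings = {}  # source class -> {source property -> mapping}

    def save_property_mapping(self, source_class, source_property, mapping):
        # one lookup plus two entry operations; overwrites any existing mapping
        inner = self.property_mappings.setdefault(source_class, {})
        inner[source_property] = mapping

    def get_property_mapping(self, source_class, source_property):
        # two lookups; the clone keeps GUI edits away from the stored mapping
        inner = self.property_mappings.get(source_class, {})
        mapping = inner.get(source_property)
        return copy.deepcopy(mapping)
```

Because reads return a copy, editing the displayed mapping never touches the store until the user explicitly saves it again.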
Applying the Mappings

Once the user is satisfied that they have defined all the necessary mappings, they can instruct the tool to apply them by clicking on the "Execute Mappings" button. Figure 5.16 shows the sequence diagram for applying the mappings. The MappingTab calls the MappingController's applyMappings method. The MappingController then retrieves the source and target ontologies from the OntologyHolder, and applies the mappings as defined by algorithms 2 and 3. As each mapping is applied and new concepts are added to the target ontology, the target ontology display automatically updates to show the new additions. Once all the mappings have been applied, the MappingController passes any error and warning messages that were generated to the MappingTab for display, and finally passes an appropriate summary message for display to the user.
5.4 MAKTab's Knowledge Acquisition Tool
Having completed the mapping stage, a focused knowledge acquisition process is used to extend the knowledge available to the PS. This process uses the requirements of the PS, specified by its ontology, along with the knowledge about the domain gained from the mapping stage, to guide the acquisition from a user of the additional rules the PS requires to work with the existing (and, if necessary, new) domain concepts. The Knowledge Acquisition (KA) (sub-)tool of MAKTab is responsible for interacting with the user to guide them through the definition of these domain specific problem solving rules. Currently the KA process interacts with a human user who is assumed to be capable of providing the required information. As with the mapping tool, the KA tool is loosely based on the Model-View-Controller architecture, with various patterns, particularly the Abstract Factory, used to facilitate the use of Frame- or OWL-based ontologies. The model component is provided by the PS ontology, programmatic access to which is provided by the relevant Protégé API, together with additional classes specific to the KA tool, for example those representing the rule KA sequences. The view component is provided by the KA tool's GUI, which supports the user in browsing and selecting the mapped domain concepts, creating new rules, performing KA for a domain concept, browsing existing rules, editing an existing rule, and generating an executable version of the PS. The main responsibilities of the controller are managing the rule acquisition process and generating the executable PS. This section discusses the design of MAKTab's rule KA tool, describing how the main operations are implemented.
As discussed in section 4.7, the KA tool uses meta-information to guide the user when defining a sequence of rules relevant to a particular domain concept, and also to suggest and prompt the user to provide values for the various rule atoms (antecedents and consequents); this is discussed in more detail in section 5.4.1. The KA tool also supports the creation of extra domain knowledge (section 5.4.2), along with the generation of an executable version of the user's PS (section 5.4.4); section 5.4.3 discusses issues relating to the design of the KA tool's GUI.
5.4.1 Focused Rule Acquisition
Each generic PS ontology uses the SWRL formalism to describe the types of rules it uses to work in a domain, as discussed in section 4.3.2. From the user's perspective, the KA tool supports the creation of rules by asking for the rules applicable to a particular concept in an intuitive order, and by suggesting possible antecedents, consequents, and values for their properties (which reduces the user's workload and hence the time it takes to create a new rule). Further, MAKTab can generate an executable version of the rules at the end of the process, which means the user does not necessarily require knowledge of appropriate programming languages. From the system's perspective, along with providing these facilities, it offers further support by storing and maintaining the rules (that is, the individuals that define a rule) correctly in the ontology. The rule KA process is performed in the context of a PSConcept, which was acquired through mapping or created directly by the user. When instructed to start the rule definition process for a particular PSConcept, the KA tool uses two pieces of meta-information provided by the PS's ontology to drive the acquisition of the rules for that concept. Firstly, an individual of the ProblemSolver class provides a list of the rule types that KA should start with; essentially
each of these rule types is the first in a sequence of related rule types that will be acquired. Secondly, meta-information about each rule type is provided by individuals of the RuleMetaClass; this includes the types of PSConcepts that the rule type is suitable for and a list of its related rule types. This meta-information is used to build the list of rule KA graphs (section 4.7.1) which drive the KA process. As discussed in section 4.7.1, each node in a rule KA graph represents a particular rule type; directed arcs between the nodes represent the relations between the different rule types. Briefly, consider two rule types, rt1 and rt2, where the RuleMetaClass individual associated with rt1 specifies that rt2 is a related rule type of rt1; this is represented in a rule KA graph by two nodes, node1 and node2, joined by an arc from node1 to node2. The nodes representing the initial rule types for each graph (as specified by the ProblemSolver individual) are stored as the first rule types to be acquired for that particular graph. When the KA process is invoked for a particular domain concept, selectedConcept, algorithm 4 is used by the KA tool to build the rule KA graphs. A walk-through example of algorithm 4 and the other KA algorithms is provided in appendix B. The algorithm used by the buildGraphFor method is based on a recursive depth-first traversal which performs two tests before visiting the next node/rule type: 1) is the rule type applicable to the concept that KA is being performed for (as defined by the PS ontology), and 2) does a node already exist for that rule type. The first test ensures the KA process remains focused on the selected concept and that the user is not asked to define irrelevant rules; the second test prevents infinite recursion.
Algorithm 4 creates a series of KaGraphNodes, one for each rule type that is applicable to the concept KA is being performed for (selectedConcept), and returns a list of the KaGraphNodes for those rule types specified as the initial rule types by the PS ontology. A list of all the KaGraphNodes that have been created is maintained (in the nodes variable), and whenever a KaGraphNode is required for a particular rule type, algorithm 4 first checks if a KaGraphNode has already been created and stored in nodes: if one has, it is retrieved from nodes; otherwise a new KaGraphNode is created, added to nodes, and the graph of related rule types is built for the new node. This means that only one KaGraphNode instance is created for each rule type, and so the recursive process of building the graph of related rule types is only performed when it has not previously been performed. This prevents algorithm 4 from entering an infinite loop, and so the algorithm always terminates after the final initial rule type has been processed. The purpose of algorithm 4 is to return a list of KaGraphNode instances, where each KaGraphNode is associated with a rule type applicable to selectedConcept and has a list of next (or related) KaGraphNodes (as the value of its nextRuleTypes property) representing the appropriate rule types as defined by the PS ontology. These conditions hold for every KaGraphNode instance returned by algorithm 4 because a KaGraphNode is only added to the returned list, kaGraphs, (line 10) inside an if statement (line 2) which checks that the rule type associated with the KaGraphNode is applicable to selectedConcept.
The buildGraphFor method performs a similar check (line 2) on every next rule type of the rule type that the graph is being built for; this ensures KaGraphNodes are only added to n’s nextRuleTypes property (line 5 or 9) if the rule type is applicable to selectedConcept.
Algorithm 4 Building the Rule KA Graphs.
Input
  ir, the list of initial rule types to acquire
  selectedConcept, the concept that KA is being performed for
  nodes, the (initially empty) list of all nodes that have been visited/created, where every node has properties ruleType (the rule type associated with the node) and nextRuleTypes (the list of rule types to be acquired after ruleType)
  kaGraphs, the (empty) list of rule KA graphs which the user will be guided through
Output
  kaGraphs, the (populated) list of rule KA graphs which the user will be guided through
Begin
 1: for every type of rule, r, in ir do
 2:   if r is applicable to selectedConcept as defined by the PS ontology then
 3:     if nodes contains a node for r then
 4:       Get that node, node
 5:     else
 6:       Create a new node, node
 7:       node.ruleType ⇐ r
 8:       Add node to nodes
 9:       buildGraphFor(node, selectedConcept)
10:     Add node to kaGraphs
End

buildGraphFor(KaGraphNode n, Object selectedConcept)
Input
  n, the KaGraphNode that the graph should be built for
  selectedConcept, the concept that KA is being performed for
  nodes, the list of all nodes that have been visited/created
Output
  n, with the value of its nextRuleTypes property set as appropriate
  nodes, possibly extended with new nodes
Begin
 1: for every next rule type, nr, for n.ruleType do
 2:   if nr is applicable to selectedConcept as defined by the PS ontology then
 3:     if nodes contains a node for nr then
 4:       Get that node, nn
 5:       Add nn to n.nextRuleTypes
 6:     else
 7:       Create a new node, nn
 8:       nn.ruleType ⇐ nr
 9:       Add nn to n.nextRuleTypes
10:       Add nn to nodes
11:       buildGraphFor(nn, selectedConcept)
End
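To make the memoised, terminating graph construction concrete, algorithm 4 can be sketched in Python. This is a minimal illustration, not MAKTab’s actual implementation: the class and function names are invented here, and rule-type applicability and relatedness are supplied as plain data rather than being read from a PS ontology.

```python
class KaGraphNode:
    """One node per rule type in a rule KA graph."""
    def __init__(self, rule_type):
        self.rule_type = rule_type
        self.next_rule_types = []  # related KaGraphNodes

def build_ka_graphs(initial_rule_types, applicable, related, nodes):
    """Algorithm 4: build one KA graph per applicable initial rule type.
    applicable: set of rule types applicable to the selected concept.
    related: dict mapping each rule type to its related (next) rule types.
    nodes: dict caching the single KaGraphNode created per rule type."""
    ka_graphs = []
    for r in initial_rule_types:
        if r in applicable:              # test 1: stay focused on the concept
            if r in nodes:               # test 2: reuse -> no infinite recursion
                node = nodes[r]
            else:
                node = KaGraphNode(r)
                nodes[r] = node
                build_graph_for(node, applicable, related, nodes)
            ka_graphs.append(node)
    return ka_graphs

def build_graph_for(n, applicable, related, nodes):
    """Recursive depth-first construction of n's related-rule-type graph."""
    for nr in related.get(n.rule_type, []):
        if nr in applicable:
            if nr in nodes:
                n.next_rule_types.append(nodes[nr])
            else:
                nn = KaGraphNode(nr)
                nodes[nr] = nn
                n.next_rule_types.append(nn)
                build_graph_for(nn, applicable, related, nodes)

# Illustrative chain of related rule types: assignment -> constraint -> fix
related = {"assignment": ["constraint"], "constraint": ["fix"], "fix": []}
graphs = build_ka_graphs(["assignment"],
                         {"assignment", "constraint", "fix"}, related, {})
```

Because a node is cached before its related-rule-type graph is built, a rule type related to itself (such as the cause rule of PS(diag, -)) simply produces a node with an arc back to itself, rather than unbounded recursion.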
The time taken by algorithm 4 is the time required to create a KaGraphNode for every rule type applicable to selectedConcept plus the time taken to build the graph for each of these rule types. As the graphs are only built once for each rule type, building a graph for a particular rule type involves creating or retrieving the KaGraphNode for every next rule type and adding them to the nextRuleTypes property, and, if a new KaGraphNode was created, building the graph for that new node. Therefore the time taken to build a particular KaGraphNode is proportional to the number of next rule types for that KaGraphNode, and the time taken to build all the KaGraphNodes is the sum of this over all rule types applicable to selectedConcept and referenced (directly or indirectly) as a related (next) rule type of one of the initial rule types. Once the list of rule KA graphs has been created, the user is sequentially prompted to define rules relevant to each graph. The control for this is shown in algorithm 5 (a walk-through example of algorithm 5 and the other KA algorithms is provided in appendix B). Initially the user is prompted to define a rule of the type that is stored as the first initial rule type in the list of rule KA graphs. Once the user has defined a rule of that type, they are asked to define a rule for one of the related types, and so on until rules have been defined for all the types in that graph. (If, at any point, there is more than one related rule type, the user is asked to select which type of rule they wish to create.) When a rule has been defined for every node in a path through the graph, the user is then presented with a range of options as determined by determineNextOptions(). The purpose of algorithm 5 is to select the type of rule the user will define next, create a new individual of that rule type, and display the new individual to the user, or to finish the KA process if appropriate.
When the user is starting the KA process for a new concept, the algorithm simply creates the appropriate KA graphs, displays a new instance of the first rule type, and terminates. When the user is in the process of defining rules, the “acquiring the next rule” part of the algorithm is executed. This part of algorithm 5 always terminates, as it simply determines whether there are any next rule types for the rule just defined and, if there are, selects the type of rule that will be defined (either automatically or by asking the user), creates a new rule of that type, and displays the rule to the user. If there are no next rule types for the current rule, then the determineNextOptions method (defined by algorithm 6) is used to present the user with a range of options, including finishing the KA process. When the user is starting KA for a new concept, algorithm 5 will terminate after displaying a new rule individual of the first rule type specified as an initial rule type applicable to selectedConcept, which is the correct output for this stage. When the user is in the process of defining a series of rules, algorithm 5 determines the type of the next rule to be defined (either automatically or by asking the user), creates a new rule individual of this type, and displays it to the user, which is the correct output for this stage. Before determining the type of the next rule, algorithm 5 first checks if the user has selected another concept, and if they have, it asks if KA should continue for the previous concept (selectedConcept) or start for the newly selected concept; if the user wishes to perform KA for the newly selected concept, the control flow reverts to the processes for starting KA for a new concept, which is the correct output for this condition. As these correctly handle all the different input states that can be passed to algorithm 5 (i.e.
selectedConcept is either a new concept or the same concept that KA is being performed for), algorithm 5 always terminates with the correct output.
Algorithm 5 The KA Control Algorithm.

When starting KA for a new concept
Input
  selectedConcept, the selected domain concept that KA should be performed for
Output
  currentRule, a new rule which is displayed to the user on the KA tool’s interface
Begin
  kaGraphs ⇐ build the list of KA graphs using algorithm 4
  first ⇐ the first rule graph node in the kaGraphs list
  rule ⇐ a new rule of type first.ruleType
  currentRule ⇐ a new rule graph node with currentRule.rule set to rule and currentRule.typeNode set to first
  Display currentRule.rule to the user
End

Acquiring the next rule
When the user requests to define the next rule, determine the appropriate type, create a new rule, and display it to the user:
Input
  selectedConcept, the selected domain concept that KA should be performed for
  currentRule, the rule that the user has just defined
Output
  A new rule displayed to the user on the KA tool’s interface
Begin
  if user has selected another concept then
    Ask if KA should be restarted for that new concept, or continue for selectedConcept
    if KA should be restarted then
      selectedConcept ⇐ the newly selected concept
      Start KA for selectedConcept as defined above
      return
  Create variable tempTypeNode of type KA graph node, set it to null
  if length of currentRule.typeNode.nextRuleTypes is 0 then
    tempTypeNode ⇐ determineNextOptions(currentRule)
  else if length of currentRule.typeNode.nextRuleTypes is 1 then
    tempTypeNode ⇐ the first KA graph node in currentRule.typeNode.nextRuleTypes
  else {length of currentRule.typeNode.nextRuleTypes is > 1}
    tempTypeNode ⇐ ask the user to select the type of the next rule from currentRule.typeNode.nextRuleTypes
  if tempTypeNode is not null then
    nextRule ⇐ a new rule of type tempTypeNode.ruleType
    nrn ⇐ a new rule graph node
    nrn.rule ⇐ nextRule
    nrn.typeNode ⇐ tempTypeNode
    nrn.previousRule ⇐ currentRule
    Add nrn to currentRule.nextRules
    currentRule ⇐ nrn
    Display currentRule.rule to the user
End
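The core branching of the “acquiring the next rule” stage (zero, one, or several next rule types) can be sketched as follows. This is an illustrative Python sketch under assumed names; the callbacks stand in for the user prompt and for algorithm 6, which are not reproduced here.

```python
from dataclasses import dataclass, field

@dataclass
class TypeNode:
    """Minimal stand-in for a KA graph node (one per rule type)."""
    rule_type: str
    next_rule_types: list = field(default_factory=list)

def select_next_rule_type(current_type_node, ask_user, determine_next_options):
    """Decide what kind of rule the user defines next.
    ask_user: callback invoked when several next rule types are possible.
    determine_next_options: callback standing in for algorithm 6 (backtracking)."""
    nexts = current_type_node.next_rule_types
    if len(nexts) == 0:
        return determine_next_options(current_type_node)  # end of path: backtrack
    if len(nexts) == 1:
        return nexts[0]                                   # only one choice: take it
    return ask_user(nexts)                                # several choices: ask user
```

For example, after a constraint rule whose only related type is the fix rule, the fix type is selected automatically, with no user interaction.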
When KA is being performed for a new concept, the main processing is done by algorithm 4 when it builds the KA graphs, and so the execution time is essentially that of algorithm 4. When KA is continuing for selectedConcept, the time taken by the “acquiring the next rule” code depends on the number of next rule types for the rule that was just defined: if there is at least one next rule type, the time taken is constant for all inputs and current rules; when there are no next rule types, the main cost is that of algorithm 6 when it determines the options for the next action. The determineNextOptions() method (outlined in algorithm 6) provides the user with a list of next actions to choose from. These include going back and creating additional related rules for rules previously defined, going back and creating rules similar to those already defined, defining another set of rules for the current rule KA graph (for the selected concept), moving on to the next rule KA graph (if one exists), or finishing KA for the selected concept (if there are no remaining rule KA graphs). The “create similar rule” option allows the user to define another rule with the same set of antecedents as an existing rule, but with an alternative set of consequents. This continues until all the KA graphs have been processed.

Algorithm 6 The determineNextOptions() method used by the KA control algorithm. This method is called when there are no nodes in currentRule.nextRuleTypes, and so must backtrack (if possible) to determine further options.
Input
  currentRule, the rule the user has just defined
Output
  The KaGraphNode for the type of rule that should be defined next, or null if KA should finish
Begin
 1: options ⇐ a new list
 2: Add option to create a rule similar to currentRule.rule to options
 3: temprn ⇐ currentRule.previousRule
 4: while temprn not equal null do
 5:   Add option to create a rule similar to temprn.rule to options
 6:   for every KA graph node, kaNode, in temprn.typeNode.nextRuleTypes do
 7:     if temprn.nextRules contains a rule of type kaNode.ruleType {i.e., a rule of type kaNode.ruleType has already been defined} then
 8:       Add option to define another rule of type kaNode.ruleType related to currentRule.rule to options
 9:     else
10:       Add option to define a rule of type kaNode.ruleType related to currentRule.rule (marked as required) to options
11:   temprn ⇐ temprn.previousRule
12: Add option to define another set of rules related to selectedConcept for this graph to options
13: if currently performing KA for the last rule KA graph in the list then
14:   Add option to finish performing KA for selectedConcept to options
15: else
16:   Add option to perform KA for the initial node of the next rule KA graph in kaGraphs to options
17: Ask the user to select an option from options and return the KaGraphNode relevant to the selected option, or null if the user selects to finish performing KA
End

Algorithm 6 is invoked when the user has defined a rule of a type that does not have any next
rule types associated with it; it is responsible for generating a list of next action options, based on the rules that have previously been defined, for the user to select from. As the user defines rules, the “next” and “previous” relationships between them are stored, so the user is effectively creating a rule tree, with the first rule defined for a given KA graph forming the root node of the tree. A separate branch is added to this tree for each “next” rule created for the first rule; this process then repeats recursively with each “next” rule that is defined. The only part of algorithm 6 which could potentially prevent termination is the while loop (line 4), which moves from the rule the user has just defined (a leaf node of the rule tree) back up to the tree’s root node. The assignment of temprn to the “previous” rule (line 11) moves the temprn pointer one step up the tree. This continues until the root node is reached, at which point temprn is set to null (as the root rule does not have a “previous” rule); the while loop’s test then fails, the loop exits, and the algorithm proceeds to termination. The purpose of this algorithm is to produce a relevant list of next action options, present this list to the user, and return the user’s selection. Relevant actions are judged to be actions related to the rules the user has previously defined for the KA graph they are defining rules for: for example, defining a rule similar to one previously defined, defining a new rule related to one previously defined, moving on to the next KA graph, or finishing KA for the selected concept, as appropriate.
Recursive use of the previousRule property, starting with the RuleGraphNode instance passed to algorithm 6, ensures that only options relevant to previously defined rules are added to the list presented to the user. Line 5 provides the option of adding a rule similar to one previously defined, and the for loop (line 6) provides options for defining a new rule of a type related to the rule that temprn points to at that time; the if statement at line 7 ensures the appropriate option (creating a required new rule or creating another rule of a particular type) is added to the list. Once all the relevant options have been added, the while loop exits and two further options are added (one for performing KA using the same KA graph and one for either moving to the next KA graph or finishing KA) before the list is presented to the user. The user is then prompted to select an option; the appropriate KaGraphNode is returned if the user selects to create a new rule, or null is returned if the user selects to finish KA. The time taken for algorithm 6 to execute is determined by the number of RuleGraphNodes between the rule passed to the algorithm and the root node of the rule tree that has been created. Every rule is stored in the PS’s ontology. The rule KA graphs are used to guide the rule creation; the KA tool’s controller also maintains graphs of the rules that have been defined. Each node in these rule graphs represents a rule, and stores a reference to the rule in the ontology, a reference to the KA graph node that represents the rule’s type, a reference to the related rule defined previously, and references to those defined immediately after (an example is provided below). This rule graph is typically used during KA to determine the direct predecessors of a rule: the direct predecessors are all the previous rules of the current rule, eventually linking back to the first rule defined in the KA graph.
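The backtracking walk up the rule tree can be sketched in Python. This is a minimal illustration under assumed names (TypeNode, RuleNode, and the option tuples are invented here); it returns the option list rather than prompting the user, and the real tool would then return the user’s selection.

```python
from dataclasses import dataclass, field

@dataclass
class TypeNode:
    """Stand-in for a KA graph node (one per rule type)."""
    rule_type: str
    next_rule_types: list = field(default_factory=list)

@dataclass
class RuleNode:
    """Stand-in for a rule graph node: one per defined rule."""
    type_node: TypeNode
    previous_rule: "RuleNode" = None
    next_rules: list = field(default_factory=list)

def determine_next_options(current_rule, last_graph=True):
    """Sketch of algorithm 6: walk back up the rule tree via previous_rule,
    collecting next-action options for the user."""
    options = [("similar", current_rule)]
    temprn = current_rule.previous_rule
    while temprn is not None:                      # climb towards the root rule
        options.append(("similar", temprn))
        for ka_node in temprn.type_node.next_rule_types:
            already = any(r.type_node is ka_node for r in temprn.next_rules)
            if already:                            # a rule of this type exists
                options.append(("another", ka_node.rule_type))
            else:                                  # still required for this path
                options.append(("required", ka_node.rule_type))
        temprn = temprn.previous_rule              # one step up the tree
    options.append(("new-set-for-graph", None))
    options.append(("finish", None) if last_graph else ("next-graph", None))
    return options
```

Running this on an assignment → constraint → fix chain yields options to create rules similar to each of the three defined rules, to define another fix or constraint rule, to define another rule set for the graph, and to finish, mirroring the choices described in the text.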
For example, consider the rule graph created by the KA tool when defining the assignment, constraint, and fix rules for the required-motor-horsepower from figure 4.17, which were
Figure 5.17: Example rule graph for the assignment, constraint, and fix rules defined in figure 4.17.
used to demonstrate the KA process in section 4.7.1; figure 5.17 provides a visualisation of the rule graph produced when defining these rules. The first node to be created is “Rule Node 1”, which stores references to the assignment rule (SystemVariableValueAssignmentRuleInd 1) and the KA graph node that defines the rule’s type (SystemVariableValueAssignmentRule). When the user creates the next rule, “Rule Node 2” is created to store the details associated with SystemVariableConstraintRuleInd 1; similarly, when the user creates the next rule, “Rule Node 3” is created to store the details associated with the first fix rule (FixRuleInd 1). The user then opts to acquire a similar rule, i.e. to create another rule associated with “Rule Node 2”; “Rule Node 4” is therefore created, along with the appropriate edges between the two nodes. If, after defining FixRuleInd 2, the user wishes to acquire the next rule, they are presented with the options of defining a rule similar to FixRuleInd 2, defining another fix rule for SystemVariableConstraintRuleInd 1, defining another constraint rule to be associated with SystemVariableValueAssignmentRuleInd 1, using KA Graph 3 to define another set of rules for required-motor-horsepower, or finishing the KA process for required-motor-horsepower. Figure 5.18 outlines the classes involved in the rule KA process; figure 5.19 provides a sequence diagram showing how these classes interact when the user has selected a concept and instructs the tool to start the rule KA process. First the KaController builds the list of rule KA graphs for the selected concept. This involves using the KaOntologyProcessor to get the list of initial rule types and related rule types when appropriate. The KaController then selects the first node from the first rule KA graph, and instructs the KaOntologyProcessor to create a new rule individual of that type.
The KaOntologyProcessor creates the relevant individuals in the PS ontology, returning the swrl:Imp individual representing the rule, which the KaController passes to the GUI to be displayed. The KaController also stores the concept that KA is being performed for, the list of rule KA graphs, the rule KA graph currently being used, and the rule node that stores the rule currently being defined. The
Figure 5.18: Class diagram of the classes involved when rule KA is started for a particular concept.
KaOntologyProcessor interface declares the ontology-related operations required by the KA tool; the KaController selects (at initialisation time) an appropriate class implementing KaOntologyProcessor for the type of PS ontology being used, allowing Frames and OWL ontologies to be accessed seamlessly by the other KA-related classes. The interactions among the classes are similar when the user wishes to define the next rule. The KaController receives the request from the GUI and determines the type of the next rule, either automatically or by asking the user. A new rule of that type is then created in the ontology (using the KaOntologyProcessor) and is displayed by the GUI. When the user creates a new rule, the KA tool automatically creates the relevant individuals for the rule in the PS’s ontology (individuals are required for the rule, its lists of antecedents and consequents, and each antecedent and consequent). The KA tool also attempts to add an antecedent and consequent automatically, based on the domain concept KA is being performed for and the new rule’s previous rule (when one exists). When the rule is the first for a rule KA graph, the KA tool looks at the permitted antecedent types to determine if the selected domain concept can be set as the value of a property of one of the permitted types. If it can, then a new individual of that atom type is created with the appropriate property value, and is added as the first antecedent of that rule. A consequent is also added to the new rule using a similar process. When a new atom is added to the rule, the tool attempts to set the value of that atom’s properties in a similar manner. If the atom type being added is in the domain of an object property whose value can be set to the selected domain concept then it is set accordingly; if any of the remaining object
properties have their range restricted to one particular class, a new individual of that class is created and assigned as that property’s value; the same technique is then applied to any object property associated with that new individual. This process of automatically setting property values, where possible, is performed because it further reduces the user’s workload, leaving the user to complete the rule by defining the remaining values. When the new rule has a related previous rule, if the restrictions on the antecedents of the new rule allow the consequents of the previous rule to be set as the antecedent value, then the tool creates a copy of the consequents which it sets as the antecedents of the new rule. For example, in PS(pnr, -), the constraint rule has a series of violations as its consequents; the next rule in the KA process can be a fix rule, which uses a series of violations as its antecedents. As shown in figure 4.19, during the KA process, the violations (consequents) of a constraint rule defined by the user are also set as the antecedents of a (related) fix rule. A similar relationship is found in PS(diag, -), in which the cause rule, which is related to itself, uses a series of system malfunctions for both antecedents and consequents; so when the user specifies a rule such as “state A can be caused by state B” and wishes to define a cause rule for B, the tool automatically inserts state B as the antecedent of the new rule.

Figure 5.19: Sequence diagram showing the interactions among components for starting the KA.
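The consequent-to-antecedent copying described above can be sketched as follows. This is an illustrative Python sketch, not MAKTab’s actual code: rules and atoms are modelled here as plain dictionaries rather than ontology individuals, and the function name is invented.

```python
import copy

def seed_antecedents(previous_rule, new_rule, permitted_antecedent_types):
    """If the previous rule's consequent atoms are of a type the new rule is
    permitted to use as antecedents, copy them across (as with the violations
    flowing from a constraint rule's consequents to a fix rule's antecedents)."""
    for atom in previous_rule["consequents"]:
        if atom["type"] in permitted_antecedent_types:
            # deep-copy so that editing the new rule leaves the old one intact
            new_rule["antecedents"].append(copy.deepcopy(atom))
    return new_rule

# Hypothetical constraint rule with one violation consequent
constraint = {"consequents": [{"type": "Violation", "name": "hp-too-low"}]}
fix = {"antecedents": [], "consequents": []}
seed_antecedents(constraint, fix, {"Violation"})
```

The deep copy matters: the text states the tool creates a copy of the consequents, so later edits to the fix rule’s antecedents must not mutate the constraint rule’s consequents.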
5.4.2 Addition of New Domain Concepts
During KA, it may be necessary to supplement the mapped domain knowledge with details of additional domain concepts. If these concepts are present in the domain ontology used during the mapping stage, it is possible for the user to go back to the mapping tool and perform the relevant mapping(s). Alternatively, if the concepts are present in another domain ontology, the user can load that (new) ontology into the mapping tool and perform the relevant mapping(s). These newly mapped domain concepts can then be used with the KA tool. Further, if no available ontology defines the required concepts, the user is free to add them directly to the relevant part of the PS
ontology. This can be accomplished using either the relevant tabs of Protégé (for example, the Classes, Properties, and Individuals tabs) or buttons in the KA tool which provide access to operations such as creating and editing classes, properties (including their domain and range values), or individuals using the standard Protégé widgets. To further encourage reuse, any new domain concepts added to the PS ontology by the user should be added to the domain ontology (using the mapping tool), extending its representation of the domain to produce a more complete model. Ideally this would be achieved automatically; however, as discussed in section 4.6.3, with the possible exception of new individuals this is currently not possible.
5.4.3 Interface Design Issues
The KA tool’s interface supports the user’s interactions with the tool throughout the rule creation process. It is important, therefore, that the interface is relatively simple and free from the technical jargon used by the PS ontology (against which the rules are being defined), yet customisable by PS developers to allow the tool to provide maximum support. The following functional requirements were determined for the interface: display, selection, and editing of the domain concepts in the PS ontology; creation of new domain concepts in the PS ontology; invoking the guided KA process for a domain concept; defining a new rule; display, selection, editing, and deletion of existing rules in the PS ontology; and generation of an executable KBS. To meet these requirements, the KA tool’s interface is split into four areas for domain concepts, existing rules, editing a rule, and general functionalities. A screenshot of the KA interface is provided in figure 5.20. Briefly, the top left segment (labelled “1” in figure 5.20) deals with the domain concepts, allowing the user to browse, select, and edit existing concepts and create new ones; the top right segment (labelled “2”) deals with the existing rules, allowing the user to browse, select, edit, and delete an existing rule; the centre segment (labelled “3”) displays the rule that is currently being defined/edited; and the buttons in the bottom segment (labelled “4”) provide access to generic functions such as generating the executable KBS. As shown in figure 5.19, when the user selects a domain concept and clicks on the “Start KA for” button (in the segment labelled “1”), the KaTab invokes the KaController’s startKaFor method.
This method uses algorithm 5 which, in turn, builds the set of applicable KA graphs for the selected domain concept using algorithm 4 and creates a new rule individual of the first rule type returned by algorithm 4; this rule is then displayed to the user. Once the user is satisfied with the rule they have defined and clicks on the “Acquire Next Rule” button (in the segment labelled “3” in figure 5.20), the KaController uses the “acquiring the next rule” part of algorithm 5, which, if necessary, uses algorithm 6 to determine the type of the next rule that will be defined. The interface of the KA tab is built at run-time using the same technique as the mapping tab: an Abstract Factory (“kit”) class has been defined, and the appropriate subclass of it is responsible for returning the GUI components that display and access the concepts from the PS ontology, namely the domain concepts, existing rules, and the current rule (the generic function panel is built by the tab itself, as its GUI components do not directly use the PS ontology).
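The Abstract Factory arrangement can be sketched as follows. This is an illustrative Python sketch (the real classes are Protégé plugin components in Java, and the names here are invented): the tab asks a formalism-specific factory for its panels, so the rest of the tab never depends on whether a Frames or OWL ontology is in use.

```python
from abc import ABC, abstractmethod

class KaWidgetFactory(ABC):
    """Abstract Factory ('kit'): one concrete subclass per ontology formalism."""
    @abstractmethod
    def domain_concept_panel(self): ...
    @abstractmethod
    def existing_rules_panel(self): ...
    @abstractmethod
    def current_rule_panel(self): ...

class OwlKaWidgetFactory(KaWidgetFactory):
    def domain_concept_panel(self): return "OWL domain concept panel"
    def existing_rules_panel(self): return "OWL existing rules panel"
    def current_rule_panel(self): return "OWL current rule panel"

class FramesKaWidgetFactory(KaWidgetFactory):
    def domain_concept_panel(self): return "Frames domain concept panel"
    def existing_rules_panel(self): return "Frames existing rules panel"
    def current_rule_panel(self): return "Frames current rule panel"

def build_ka_tab(ontology_kind):
    """Select the factory once, at initialisation time; the tab itself is
    then independent of the ontology formalism behind its panels."""
    factory = (OwlKaWidgetFactory() if ontology_kind == "owl"
               else FramesKaWidgetFactory())
    return [factory.domain_concept_panel(),
            factory.existing_rules_panel(),
            factory.current_rule_panel()]
```

The generic function panel is deliberately absent from the factory, matching the text: it does not touch the PS ontology, so the tab builds it directly.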
Figure 5.20: A screenshot of the KA tool’s interface.
Displaying Domain Concepts

The KA process is based around acquiring rules for each of the relevant domain concepts, so the user must be able to view and select the domain classes and associated individuals and, if necessary, edit them or create new ones. Figure 5.21 shows a screenshot of the KA tool’s domain concept display. The area at the top left displays the domain concepts: the combobox lists all the domain classes (the PSConcept class and its subclasses) (figure 5.21a); when the user selects one of these classes, the individuals of that class are displayed (figure 5.21b). The list of individuals uses the browser text value associated with the class to describe each individual. The browser text pattern is a feature of the Protégé Individuals tab which allows the user (or developer) to specify which property values should be used to describe an individual of a particular class, along with any text that should appear between each property value; the browser text value is the result of applying that pattern to a given individual. This provides a more user-friendly description of the individual than, for example, the individual’s name. The text field at the top right provides a filter: as the user types in the text box, those individuals whose browser text value does not contain the entered text are removed from the list of individuals (shown in figure 5.21c). This allows users to rapidly find specific individuals in the list. The buttons below the individual list allow the user to create a new individual of the selected class, edit the definition of the selected class, edit a selected individual, create a new domain
concept class, and invoke the KA process for the selected concept. The selected concept is either the selected individual or, if the user ticks the tick box above the buttons, the selected class.
Displaying Existing Rules

It is also important for the user to be able to return to a rule defined previously, either to edit it, delete it, or define further related rules. Figure 5.22 shows the current existing rule display component. The display is similar to the domain concept display: the combobox at the top lists the different types of rules provided by a PS (the subclasses of the swrl:Imp class) (figure 5.22a). As the name of a rule type can often appear cryptic to someone unfamiliar with the PS, one kind of rule meta-information that the PS developer can provide is an alternative label, or title, for a type of rule, which is used when displaying the different rule types. For example, the purpose of the OutputCalculatedSystemVariablesRule rule type from PS(pnr, -) (section 4.5.1) is not necessarily clear from the name alone; using the meta-information to provide an alternative label means that the list of rule types displays “Select SystemVariable(s) for display”, which is considerably more understandable to the novice user. When a rule type is selected, the individuals (defined rules) of that type are displayed (figure 5.22b). This list of individuals also uses the browser slot feature to allow the PS developer to provide a better description of the rule. For example, consider a fix rule from PS(pnr, -): without the browser slot feature, this rule would likely be displayed as “ontology-name Fix Rule 004”; by using the browser slot feature, it can be displayed as “If violation X is present then apply fix Y”, where X is the name of a violated constraint and Y provides the details of a fix (again, X and Y are individuals which could be described with browser slot values).
To help the user quickly and easily find a rule they have previously defined, the text field at the top right filters the list of displayed rules (figure 5.22c): for example, typing a component name into the filter text field results in only rules which reference that component being displayed. Once the user has selected a rule, they can edit or delete it using the appropriate button.
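The browser text and filtering behaviour can be sketched together. This is an illustrative Python sketch, not Protégé’s implementation: individuals are modelled as dictionaries, and the pattern uses `{property}` placeholders as a stand-in for Protégé’s browser slot pattern syntax.

```python
def browser_text(pattern, individual):
    """Apply a browser text pattern to an individual: substitute each
    {property} placeholder with that individual's property value."""
    return pattern.format(**individual)

def filter_individuals(individuals, pattern, query):
    """Keep only individuals whose browser text contains the filter text
    (case-insensitive), as in the KA tool's filter text field."""
    return [i for i in individuals
            if query.lower() in browser_text(pattern, i).lower()]

# Hypothetical fix-rule pattern and individuals
pattern = "If violation {violation} is present then apply fix {fix}"
rules = [
    {"violation": "hp-too-low", "fix": "increase motor size"},
    {"violation": "cable-too-weak", "fix": "use stronger cable"},
]
```

Typing “cable” into the filter would then leave only the second rule in the list, matching the behaviour described for the rule display.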
Defining a Rule

Relevant widgets from the Protégé API are used to display the property values for the rule’s atoms. These widgets allow the user to change the property values as appropriate. When the user changes any of these values, the widgets automatically update the relevant individuals in the ontology, ensuring that the model of the rule (the individuals in the ontology) is consistent with the displayed rule. To display a rule, the KA tool’s GUI is passed the individual from the PS ontology (created by the KaOntologyProcessor) that represents the rule; displaying the rule then becomes a task of displaying the associated concepts from the PS ontology (when the user edits the rule, they are simply creating a series of new individuals in the PS ontology). This allows the GUI component for defining/editing a rule to be generated based on the class definitions in the PS ontology. The current GUI components for defining a rule split the screen into five rows, as illustrated in figure 5.23. Rows one, two, and four display text provided by the meta-information associated with the rule type: row one displays a description of the rule type; row two displays a description of the antecedents, a more friendly alternative to the “IF” part of the “IF X THEN
5.4. MAKTab’s Knowledge Acquisition Tool
Figure 5.21: Screenshots of the KA tool’s domain concept display.
Figure 5.22: Screenshots of the KA tool’s display of existing rules.
Figure 5.23: Layout of rule definition GUI component.
Y” rule; row four displays a similar description of the consequents, this time an alternative to the “THEN”. Row three displays the rule’s antecedents, and row five displays the rule’s consequents. The lists of antecedents and consequents are displayed using the same technique: the first atom is added at the top of the relevant row in the GUI, subsequent atoms are added below the previous, with the text “and” between every sequential pair of atoms to indicate that all the antecedents must be satisfied (which will cause all the consequents to be satisfied). The display for every atom is derived from the browser text pattern (of the atom type) and the associated meta-information; buttons to remove the atom and change the type of the atom are also provided. The browser text pattern associated with the atom is used to determine which properties (and any associated text) of the atom are displayed. This allows developers to customise the display for different PSs by setting the browser slot pattern for their relevant classes. Individuals of the PropertyDisplayMetaClass are then used to determine a label for every property of the atom that is displayed; this allows a more user friendly description of the property to be displayed to the user. The labels for the relevant properties are added to the display above the property value. The rule shown in figure 5.24a illustrates how this meta-information is used to provide a clearer interface. Figure 5.24b shows the same rule displayed using the class and property names of the concepts in the ontology; I maintain that this format is less intuitive than the technique used, as it displays ontology related jargon to the user. The arrows to the right of the property labels (in both figures 5.24a and b) provide the user with access to the relevant options for setting the property value (a popup showing these options is also displayed when the user right clicks on the property label).
Further, when the value of a datatype property is being displayed, double clicking on the value displays the relevant component for changing it. Note that, as this is essentially an individual visualisation technique, the same technique is used to display the values of object properties. To help the user identify how properties are related, a border appears when the user moves the mouse over a property, highlighting the value of that particular property. All the properties of a particular individual have the same border colour, indicating that they belong to the same individual. The border colour of the outermost property is dark, and the colour gets progressively lighter as the property value being bordered “gets deeper” into the definition of the atom, until almost white is reached, at which point the colour gets progressively darker until the original colour is reached and the cycle continues. This process is illustrated in figure 5.25 a through to g.
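The dark-to-light-and-back border colour cycle described above can be modelled as a triangle wave over the nesting depth. This is a sketch of the idea only; the number of steps per half-cycle and the blending towards white are assumptions, not MAKTab’s actual values.

```java
import java.awt.Color;

// Hypothetical sketch of the nesting-depth border colours: brightness
// rises from a dark base towards near-white as the property nesting
// "gets deeper", then falls back again, cycling with depth.
public class BorderColours {
    static final int STEPS = 5; // depths per half-cycle (an assumption)

    // Returns a brightness factor in [0,1] following a triangle wave.
    public static double brightness(int depth) {
        int phase = depth % (2 * STEPS);
        int towardWhite = (phase <= STEPS) ? phase : 2 * STEPS - phase;
        return towardWhite / (double) STEPS;
    }

    // Blend the base hue towards white by the computed brightness factor.
    public static Color borderFor(Color base, int depth) {
        double b = brightness(depth);
        int r = (int) (base.getRed() + (255 - base.getRed()) * b);
        int g = (int) (base.getGreen() + (255 - base.getGreen()) * b);
        int bl = (int) (base.getBlue() + (255 - base.getBlue()) * b);
        return new Color(r, g, bl);
    }

    public static void main(String[] args) {
        for (int d = 0; d <= 10; d++) {
            System.out.println("depth " + d + " -> brightness " + brightness(d));
        }
    }
}
```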
Figure 5.24: Example screenshots of the rule definition GUI component.
Selecting the Rule Type According to algorithm 5, when the user wishes to define the next rule in a sequence, if there is more than one type of possible rule, then the user is asked which type of rule they would like to define. To support the user with this selection, the KA tool uses the rule meta-information provided by the PS’s ontology to provide a description of each rule’s purpose, antecedents, and consequents. An example of the rule type selection dialog box is shown in figure 5.26, where the user is configuring PS(diag, -). A description of the rule which the user has just defined is provided at the top of the dialog, with the list of rule types displayed on the bottom left and details about the selected rule type displayed on the bottom right. The list of rule types displays the human friendly rule name taken from the RuleMetaClass individual associated with the rule. When a rule type is selected, the fields on the right are updated to display the rule, antecedent, and consequent descriptions (which are provided by the associated RuleMetaClass individual).
Figure 5.25: Screenshots showing how property values of an individual are highlighted in the rule editing display.
Figure 5.26: Sample screenshot of the rule type selection dialog.
Selecting an Atom Type When the user is defining a rule, it may be necessary to change the type of an atom (antecedent or consequent) that has been added to the rule (if, for example, when defining a CauseRule for PS(diag, -), MAKTab automatically adds a consequent of type SystemVariableOrPropertyIndividualAtom when the user wishes to add an antecedent of type SystemVariableOrPropertyDatatypeAtom). As it can be hard to determine the purpose of an atom type from the class name, MAKTab uses the meta-information provided by the AtomMetaClass individuals to support the user in selecting the correct atom type. An example of the atom type selection dialog box is shown in figure 5.27, where the user is adding an antecedent to a SystemVariableValueAssignmentRule. The list of atom types that can be added is displayed on the left of the dialog in the “Suitable Types” list; each item in this list is the display name for an atom type, taken from the associated AtomMetaClass individual. The actual atom type (class) name of the selected atom type is displayed in the “Name” field, and a description of the atom type’s purpose is displayed in the “Description” field.
5.4.4 Converting SWRL Rules into Executable Form
Once the user has defined the domain specific problem solving rules (which are stored in the PS ontology), MAKTab can convert them into an executable form. To achieve this, MAKTab uses a rule generator, a Java class provided by the PS developer (and named in the ProblemSolver individual of the PS ontology) that is responsible for producing an implementation (currently in JessTab code) of the rules defined by the user. This generated code (for example, PS-RS(pnr, [elevator])) is combined with any generic PS code (for example, PS-RS(pnr, -)) (also provided by the PS developer) to provide the reasoning component of the new KBS.
Figure 5.27: Sample screenshot of the atom type selection dialog.
Figure 5.28 shows the sequence of class interactions for rule generation. When the user presses the relevant button on the GUI, the KaController is instructed to generate the rules. The KaController first uses the KaOntologyProcessor to get the (Java) class name of the rule generator from the PS ontology, and passes the name to the RuleGeneratorFactory. The RuleGeneratorFactory then uses the Java Reflection API to load the rule generator class and create a new instance of it, which is returned to the KaController. If the rule generator cannot be loaded, a help message is displayed to the user; otherwise, the rule generator instance is passed the PS ontology and the associated KaOntologyProcessor, which it uses to generate the domain specific PS code. This code is returned to the KaController, which passes it to the GUI for display to the user. The user is then able to copy and paste the code into the relevant tool (for example, JessTab) to run the KBS.
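The reflection-based loading step can be sketched as below. This is a minimal reconstruction of the idea, not MAKTab’s actual RuleGeneratorFactory: the RuleGenerator interface, the DummyGenerator class, and the null-on-failure convention are all illustrative assumptions.

```java
// A minimal sketch of loading a rule generator class by the name stored
// in the PS ontology, using the Java Reflection API.
public class RuleGeneratorFactory {
    public interface RuleGenerator {
        String generate(); // in MAKTab this would be passed the PS ontology
    }

    // Illustrative stand-in for a developer-supplied generator class.
    public static class DummyGenerator implements RuleGenerator {
        public String generate() {
            return "(defrule example => (printout t \"hello\"))";
        }
    }

    // Load and instantiate the named generator; return null on failure so
    // the caller can show a help message to the user.
    public static RuleGenerator load(String className) {
        try {
            Class<?> c = Class.forName(className);
            return (RuleGenerator) c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        RuleGenerator g = load("RuleGeneratorFactory$DummyGenerator");
        System.out.println(g == null ? "load failed" : g.generate());
        System.out.println(load("no.such.Generator") == null);
    }
}
```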
Figure 5.28: Sequence diagram for converting the rules defined by the user with MAKTab into an executable format.
As the rule generator is named in the PS ontology, the developer of a generic PS is free to provide their own rule generator. There are various reasons why this approach was chosen, rather than having MAKTab convert the (user defined) rules in a standard way for all generic PSs:
• It is reasonable to expect a high degree of coupling between the two pieces of PS code (the generic and the domain specific), with each influencing the other’s execution in the final system. Allowing a PS developer to provide their own transformation mechanism may ease the process of developing a new generic PS, particularly for PSs which require more than a simple transformation of the SWRL rules, as it allows for more flexibility in the
type of constructs that can be generated from the user defined rules. For example, the rule generator for PS(diag, -) produces a series of facts that would not be generated by a generic transformation mechanism. Similarly, the assignment, constraint, and fix rules produced by the rule generator for PS(pnr, -) are enhanced with a field for associating the assignment, constraint violation, or fix with the design proposal being produced by the system at that time. As the rules that are generated for these two PSs are unique to each particular PS and substantially different from the SWRL representation, it is extremely unlikely that they would be produced by a generic rule generation mechanism. In certain cases it may be possible to provide the functionality of these unique items in a more complex generic rule set (for example, PS-RS(diag, -)); however, this may not always be possible, and so it is important that the developer has the ability to provide their own rule generator if necessary.
• Allowing custom rule generators makes it easier to reformulate existing PSs/KBSs into generic PSs which can be used by MAKTab. This was demonstrated with PS(pnr, -) (section 4.5.1), where PS-RS(pnr, -) already existed, as did an example PS-RS(pnr, [elevator]). Building the generic PS(pnr, -) therefore involved modelling the rules in PS-RS(pnr, [elevator]) and creating a suitable rule generator.
• Using a custom rule generator allows the PS developer to select which programming language/engine should be used for executing KBSs built using their generic PS. This is important as it allows developers to exploit benefits provided by other engines (which do not have to be linked to the Protégé environment) with respect to particular types of PS. For example, Runcie [96] demonstrated several advantages of using a CSP engine for propose-and-revise based KBSs compared to using CLIPS, including reduced execution time and reduced code size.
When using a non-Protégé environment, however, it may be the case that the environment does not support accessing and using domain knowledge stored in an ontology; it is then necessary to include all such knowledge in the generated code. When a developer is creating a new generic PS for use with MAKTab, they have three main options available regarding the development of the rule generator component:
1. Build a new rule generator from scratch. This option was chosen for the two existing rule generators. Although this is likely to involve the largest amount of work for the PS developer, it may also be the easiest option. Before a developer can create a new generic PS, they require a thorough understanding of the reasoning processes that will be used by that type of PS. As such, they will likely have an idea of how the generic rule set should work and how it should interact with the domain specific rule set. This understanding of the PS’s reasoning processes should make it easier to write the generic rule set and corresponding rule generator.
2. Extend and/or adapt an existing rule generator. As described below, rule generators have currently been created for PS(pnr, -) and PS(diag, -). [98, chap. 6] highlights the high degree of similarity between the propose-and-revise algorithm and the algorithms of the assignment, planning, and scheduling PSMs; similarly, [98, chap. 6] also describes
the diagnostic algorithm as similar to the algorithms of the classification, assessment, and monitoring PSMs; the existing rule generators should therefore serve as good starting points for new rule generators for other types of generic PSs in the same sub-class.
3. Reuse an existing rule generator without alteration. This is likely to be the most challenging approach for the developer of a new generic PS, because the generic rule set (for the new PS) will be required to use domain specific rules which provide the reasoning for one type of PS to perform the reasoning for a different type of PS, which will likely use different reasoning knowledge and process(es). Performing the new type of reasoning with domain specific rules generated for another type of reasoning is therefore likely to be very challenging.
Generating Propose and Revise Rules The PS(pnr, -) defines seven types of rules (section 4.5.1), which are converted in various ways by the rule generator into PS-RS(pnr, [domain]). The generated rules are similar to the domain dependent part of the original elevator configuration KBS (section 4.5.1). The simplest transformation is for the atoms in the initial set up rules and the output rules: these are all converted into a single function which sets the initial selection of components and the values of component properties and system variables, as well as listing the names of components, component properties, and system variables that should be displayed as output. The remaining rules are then generated: the SystemVariableValueAssignmentRules are translated into corresponding assignment rules; the SystemVariableConstraintRules are translated into corresponding constraint rules; and the FixRules are converted into corresponding fix rules. Finally, two sets of rules are derived from the PS ontology and the SystemVariableValueAssignmentRules. A series of component property value assignment rules are generated based on the SystemComponents of the PS ontology; these ensure that when a particular component is selected by the propose-and-revise algorithm, the values of all of that component’s properties (stored in Jess’s working memory) are updated to the values of the newly selected component. This can result in the creation of numerous rules and, when using a large domain model, can easily result in thousands of lines of code being automatically generated. As this code is automatically generated based on individuals in the ontology, maintenance of the code should be performed by updating the ontology using Protégé and regenerating the code, and so is a relatively simple task, particularly when compared to searching through thousands of lines of code.
Finally, sets of dependency rules are generated from each equation/calculation used by the SystemVariableValueAssignmentRules to ensure that when a component is changed, all other values that depend on that component’s values are updated/recalculated to reflect the new selection. The requirements that PS-RS(pnr, -) places on PS-RS(pnr, [domain]) illustrate the importance of allowing different rule generators for different PSs. It is reasonable to expect that a generic rule translator would not, for example, have provided rules in the correct format to work with PS-RS(pnr, -), as PS-RS(pnr, -) uses a specific fact structure for relating each proposed design state and the associated component property values, which must be used by the domain specific rules. Although it may be possible to write PS-RS(pnr, -) in such a way that it can use generically generated rules, this would have required a substantially more complex PS-RS(pnr, -) than the current version; moreover, it would not have been possible to reuse components from the original elevator configuration KBS.
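The component property value assignment rules described above can be sketched as simple string generation over ontology individuals. This is an illustrative reconstruction only: the fact structure and rule layout below are assumptions, not the actual output of the PS(pnr, -) rule generator.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch (not MAKTab's actual generator): for each
// SystemComponent individual, emit a Jess-style rule that copies the
// selected component's property values into working memory.
public class PnrRuleSketch {
    public static String assignmentRule(String component,
                                        Map<String, String> properties) {
        StringBuilder sb = new StringBuilder();
        sb.append("(defrule select-").append(component).append("\n");
        sb.append("  (selected-component (name ").append(component).append("))\n");
        sb.append("  =>\n");
        // One assertion per property of the newly selected component.
        for (Map.Entry<String, String> p : properties.entrySet()) {
            sb.append("  (assert (property (name ").append(p.getKey())
              .append(") (value ").append(p.getValue()).append(")))\n");
        }
        sb.append(")");
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("motor-torque", "450");
        props.put("motor-weight", "120");
        System.out.println(assignmentRule("Motor-20HP", props));
    }
}
```

Generating one such rule per component individual shows how a large domain model can easily expand into thousands of lines of code.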
Generating Diagnostic Rules The PS(diag, -) defines two types of rules: the CauseRule, which is used to define that a system state can cause another system state, and the RepairRule, which is used to specify that a faulty system state can be repaired by performing a series of actions. These rules can be thought of as forming a cause and repair graph, in which nodes are either system states or repairs, and arcs between nodes indicate either a conjunction of states causing another state or references to repairs. As discussed in section 4.5.2, PS-RS(diag, -) makes use of this graph to query the user about the state of an artefact and to infer a diagnosis and suggested repairs based on the observations provided. The PS(diag, -)’s rule generator is responsible for translating the domain specific PS rules provided by the user into the cause/repair graph that PS-RS(diag, -) uses. The rule generator does this by building its own representation of the graph, based on the rules, and translating it into a series of deffacts representing the cause nodes, the associated symptoms (system states), and repairs, all of which are used by PS-RS(diag, -) to perform its reasoning.
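The graph-to-deffacts translation can be sketched as below. The node structure and the deffacts layout are illustrative assumptions, not the actual fact structure used by PS-RS(diag, -).

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the cause/repair graph translation: each cause
// node links symptoms (system states) and repairs, and the graph is
// flattened into deffacts-style text for the diagnostic rule set.
public class DiagGraphSketch {
    public static class CauseNode {
        final String cause;
        final List<String> symptoms;
        final List<String> repairs;
        public CauseNode(String cause, List<String> symptoms, List<String> repairs) {
            this.cause = cause;
            this.symptoms = symptoms;
            this.repairs = repairs;
        }
    }

    public static String toDeffacts(List<CauseNode> graph) {
        StringBuilder sb = new StringBuilder("(deffacts cause-repair-graph\n");
        for (CauseNode n : graph) {
            sb.append("  (cause (name ").append(n.cause)
              .append(") (symptoms ").append(String.join(" ", n.symptoms))
              .append(") (repairs ").append(String.join(" ", n.repairs))
              .append("))\n");
        }
        return sb.append(")").toString();
    }

    public static void main(String[] args) {
        List<CauseNode> g = new ArrayList<>();
        g.add(new CauseNode("blown-fuse",
                List.of("no-power", "burning-smell"),
                List.of("replace-fuse")));
        System.out.println(toDeffacts(g));
    }
}
```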
5.5 Generating the Executable KBS
Generating the executable domain specific rules is effectively the final step in the development of a new, executable KBS. MAKTab displays the generated code to the user; when passed to the appropriate tool (currently JessTab), this code provides the reasoning component of the new KBS, which (in the case of JessTab) automatically uses the enhanced PS ontology as its source of domain knowledge. In the case of both PS(pnr, -) and PS(diag, -), the code returned to the user loads the generic PS code (PS-RS(pnr, -) and PS-RS(diag, -) respectively) from a file and enters it into JessTab; the domain specific PS code (PS-RS(pnr, [domain]) and PS-RS(diag, [domain]) respectively) is then passed to JessTab, and the KBS is executed. In the final KBS of figure 4.3, KBS(pnr, elevator), the original domain ontology (ONT(elevator, [diag])) has been extended with any additional domain knowledge that was created during the KBS development; this can be achieved through use of the mapping tool, with the enhanced PS ontology as the source ontology and the original domain ontology as the target ontology. This creates a domain ontology that contains the domain knowledge required by both the diagnostic and propose-and-revise PSs, ONT(elevator’, [diag, pnr]).
5.6 Summary
This chapter discusses the design of MAKTab, a Protégé tab plug-in designed to support users in building and executing a KBS by configuring a domain ontology and generic PS to work together. To support this process, MAKTab consists of two tools: an ontology mapping tool and a knowledge acquisition tool (the designs of the tools are discussed in detail in sections 5.3 and 5.4 respectively); it has been designed to work with any ontology that can be loaded into Protégé. The ontology mapping tool supports the user in mapping knowledge from a domain ontology to a PS ontology. To offer maximum support, the mapping tool features an extendible range of mapping types, and is able to automatically suggest mappings to the user (the algorithm for suggesting
mappings is also easily extendible with new techniques). The scope and cardinality of the mappings supported by the mapping tool are also discussed, along with issues regarding the mapping process and how the mapping tool attempts to maintain the consistency of mapped knowledge. Finally, the design features of the mapping tool’s interface are discussed, both in terms of what it looks like and how it is built at run-time. The Knowledge Acquisition (KA) tool is responsible for supporting the user in defining the rules that will enable the KBS to use the mapped domain knowledge. The guided KA process used by the KA tool is based on a series of KA graphs which capture the relationships between the different types of PS rules, combined with a depth first traversal of these graphs to give the process a “natural” flow. Rules are defined for one domain concept at a time, which allows the KA tool to offer more support by suggesting values for the antecedents and consequents of the rules. Along with the ability to create new rules related to a particular concept using the guided KA feature, the KA tool supports the user in adding new domain concepts, editing existing domain concepts, creating new rules individually (i.e. without using the guided KA feature), and editing and deleting existing rules. The tool can also generate an executable version of those rules for the user. The rule generation feature is also extendible: currently, MAKTab produces JessTab rules; however, it could be configured to generate rules in any suitable formalism. These generated rules can then be combined with the generic rules provided by the generic PS, and the (now enhanced) PS ontology, to produce a new KBS which can (currently) be executed using JessTab.
Chapter 6
Acquiring Problem Solvers from the Web
6.1 Overview
This chapter discusses PS2 R (the Problem Solver Search engine and Repository), an initial system developed to support searching for generic PSs using Web based search and an associated repository. Section 6.2 introduces the proposed approach to searching the Web for PSs/programs written in JessTab, Jess, or CLIPS. Briefly, search terms provided by the user are passed to available standard Web search engines, along with the requirement to only return files which have the “.clp” (standard JessTab, Jess, or CLIPS) or “.jess” (non-standard Jess) extension. These results are then parsed to ensure that they are syntactically correct JessTab, Jess, or CLIPS PSs/programs, with all files that fail this test being discarded. The syntactically correct programs are then passed to Cjjtoe, a tool which first builds a list of the concepts used by each program, and then attempts to infer subclass/superclass relationships between those concepts, which are then represented in an OWL ontology. PS2 R then uses these results to maintain a repository of PSs, which can be browsed through use of a tag cloud or searched using standard keyword based searching. Section 6.3 discusses the design and implementation of the tool, including the tool’s architecture (section 6.3.1); how general Web search engines are used to search for PSs, how their search results are parsed, and the ontology extraction mechanism which is then applied (section 6.3.2); and the PS repository, in terms of the database schema, how PSs are added to the repository, how it can be searched by the user, and the generation of a folksonomy of PS terms (section 6.3.3).
6.2 Acquiring Problem Solvers from the Web
Reuse-based KBS development methodologies depend on the availability of generic PSs and domain ontologies. As previously noted, there are a variety of ontology search engines which can be used to acquire a domain ontology; however, there is a lack of similar tools for acquiring PSs. A PS search engine and repository, PS2 R (the Problem Solver Search engine and Repository), was therefore developed to support the task of finding JessTab, CLIPS and Jess PSs/programs on the Web. We focused on these languages as JessTab PSs can be used by the other tools developed during this work; CLIPS and Jess PSs/programs are also included as it is generally not possible to automatically determine which of the three languages a given program is written in, since they use the same standard file extension, “.clp” (the tool also searches for “.jess” files in case developers choose to use this for Jess programs). The main advantage of PS2 R over conventional Web search engines for this task is that PS2 R only returns syntactically correct programs, which should be executable without modification; this means the user does not have to examine (a potentially large number of) programs and test each for syntactic correctness. PS2 R is also
able to provide additional information about each returned program, in the form of an ontology of the program’s concepts, to help the user gain an understanding of the program. The repository of PS2 R automatically indexes and stores programs returned by Web searches. The process of searching for PSs using PS2 R (outlined in figure 6.1) is:
1. The user defines a set of search terms/keywords they would expect their desired PS to use; these are submitted to the search engine.
2. The search engine then queries a general Web search engine¹ for JessTab, CLIPS and Jess files (i.e. any file with the extension “.clp” or “.jess”) which contain the user’s keywords. Google performs its search and returns the results to the search engine in the form of a list of URLs.²
3. For every search result (URL):
3.1 The PS search engine downloads the file located at that URL.
3.2 This file is then parsed to ensure that it is a syntactically correct JessTab, CLIPS or Jess program: if it is not, then it is discarded and the next URL (search result) is processed.
3.3 The (syntactically correct) program is then passed to the ontology extraction module (discussed in section 6.2.1), which attempts to build an ontology of the concepts used by the program, producing an OWL-Lite XML encoding of the ontology.
3.4 The program is then added to the repository, which stores: the original URL, the file name, the file’s content, the OWL-Lite XML encoding of the extracted ontology, the original search terms, and additional keywords extracted from the ontology (class and property names).
4. The results of the search are then presented to the user; these are the URL of the original file and the list of associated terms (with matched search terms highlighted), along with links to the cached file content and the extracted ontology. The user then browses the results, downloading any programs and/or associated ontologies that look appropriate to evaluate/reuse.
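Steps 2 and 3 above can be condensed into a small pipeline sketch: keep only results with a recognised extension, then discard anything that fails a syntax check. The syntax check below (balanced parentheses) is a deliberate stand-in; PS2 R’s real parser checks full JessTab/CLIPS/Jess syntax.

```java
import java.util.ArrayList;
import java.util.List;

// Condensed, illustrative sketch of the PS2 R search pipeline.
public class PsSearchPipeline {
    // Step 2: only ".clp" and ".jess" files are of interest.
    public static boolean hasRuleExtension(String url) {
        return url.endsWith(".clp") || url.endsWith(".jess");
    }

    // Step 3.2 stand-in: real PS2 R runs a full parser; here we only
    // check that parentheses are balanced and the file is non-empty.
    public static boolean looksSyntacticallyValid(String program) {
        int depth = 0;
        for (char c : program.toCharArray()) {
            if (c == '(') depth++;
            else if (c == ')') depth--;
            if (depth < 0) return false;
        }
        return depth == 0 && !program.isBlank();
    }

    public static List<String> filterUrls(List<String> results) {
        List<String> kept = new ArrayList<>();
        for (String url : results)
            if (hasRuleExtension(url)) kept.add(url);
        return kept;
    }

    public static void main(String[] args) {
        List<String> hits = List.of("http://a/x.clp", "http://b/y.html",
                                    "http://c/z.jess");
        System.out.println(filterUrls(hits));
        System.out.println(looksSyntacticallyValid("(defrule r => )"));
    }
}
```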
The user can also search the repository using a similar keyword-based search, or browse it using a tag cloud [121] based visualisation of PS keywords generated from the repository’s contents. This technique was chosen as tag clouds are a popular, easy to use visualisation technique associated with the display of folksonomies [112], in which a series of tags, or keywords, associated with, for example, a particular Web site, are displayed to the user. To improve the visualisation, tags are often displayed at different sizes to reflect, for example, the importance of the tag in that domain: in PS2 R, keywords associated with a higher number of PSs are displayed in a larger font than those associated with a lower number of PSs. A tag cloud was chosen to display the keywords extracted from/used by PSs in the repository as it provides the user with a quick overview of all the extracted keywords (and, in theory, an idea of the type of PSs contained in the repository) and an indication of how frequently a keyword occurs (illustrating the more and less popular types of PS).
¹ Currently Google, via the Google Web Service Search API.
² This search is similar to OntoSearch’s [111] approach to searching for ontologies.
Figure 6.1: General flow of searching the Web for PSs.
The tag cloud is selectable by the user; when a tag is selected, all the PSs associated with that keyword are displayed. To further improve the search engine’s performance (and hence the support provided to the user), every program is analysed twice by PS2 R before being added to the repository and returned to the user. Firstly, every result returned by the general Web search is parsed to ensure it is a syntactically correct JessTab, CLIPS, or Jess program; if it is not, it is discarded. Secondly, the extracted ontological representation of the knowledge declared in the program can be used by the user, with suitable (ontology) visualisation tools, to help the process of understanding the reasoning constructs and domain knowledge used by the program, which may help with determining the program’s purpose. The rationale for PS2 R is based on three assumptions: 1) that (implementations of) useful JessTab, Jess, and CLIPS programs/PSs will be publicly available and indexed by general Web search engines; 2) that these implementations will use terms representative of the type of reasoning they use and/or the domain they (currently) work in; and 3) that the user will be familiar with the vocabulary associated with their desired type of PS; for example, a person searching for a propose-and-revise algorithm will be aware of terms such as constraint, fix, etc. Firstly, based on the popularity of JessTab, CLIPS, and Jess, it is reasonable to assume that some users will make their programs available on-line. Secondly, it is good programming practice to use terms representative of the function/domain of a program, and so it is reasonable to assume that comments, variables, functions, rules, templates and slots defined in JessTab, CLIPS, and Jess programs will refer to the type of reasoning the PS provides and/or the domain it works in.
Thirdly, it is reasonable to assume a user can define a set of terms related to a domain they are familiar with; they may also be able to provide terms relevant to the type of PS they are looking for; providing a list of popular PS terms, along with those used in previous PSM work (for example, from [98, chap. 6]), should help with this task.
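The tag-cloud sizing rule described earlier (keywords associated with more PSs are shown in a larger font) can be sketched as a linear interpolation between two point sizes. The bounds and the linear scaling are assumptions for illustration, not PS2 R’s actual values.

```java
import java.util.Map;

// Hypothetical sketch of tag-cloud font sizing: scale linearly between
// assumed minimum and maximum point sizes according to how many PSs a
// keyword is associated with.
public class TagCloudSizing {
    static final int MIN_PT = 10, MAX_PT = 28; // display bounds (assumed)

    public static int fontSize(int count, int minCount, int maxCount) {
        if (maxCount == minCount) return (MIN_PT + MAX_PT) / 2;
        double t = (count - minCount) / (double) (maxCount - minCount);
        return (int) Math.round(MIN_PT + t * (MAX_PT - MIN_PT));
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = Map.of("constraint", 12, "fix", 7, "valve", 1);
        int min = 1, max = 12; // frequency range across the repository
        counts.forEach((tag, c) ->
            System.out.println(tag + " -> " + fontSize(c, min, max) + "pt"));
    }
}
```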
6.2.1 Extracting an Ontology from CLIPS, Jess and JessTab Programs
When using unordered facts, both CLIPS and Jess require the use of deftemplates to specify the structure of the data that the rules use. As discussed in section 2.6.4 (and [36, chap. 6]), the deftemplate construct has a required structure, which provides various pieces of information
about the deftemplate and any associated facts. This required structure can be used to infer possible taxonomic relations between different deftemplates. A possible relation between two deftemplates can be determined using the process outlined in algorithm 7. Essentially, if the slots of a deftemplate, template1, are a subset of the slots of another deftemplate, template2, then, based on the assumption that template2 inherits all of template1’s slots and extends the description with additional slots, the algorithm infers that template1 is a superclass of template2; if the two deftemplates contain the same slots, then the algorithm infers they are siblings; otherwise no relation is determined between them. If the program (being processed) uses defclasses, taxonomic relationships are typically already defined, and the task then involves determining possible superclasses for any deftemplates, based on the slot inheritance assumption. Having inferred taxonomic relationships between the deftemplates, it is then a trivial step to generate an OWL-Lite ontology; it is also possible to convert any deffacts (and definstances) into individuals of the relevant classes in the ontology (the OWL-Lite formalism was chosen as OWL is the W3C standard for encoding ontologies, and the OWL-Lite version supports the encoding of the ontologies produced by the ontology extraction technique). Applying algorithm 7 to every pair of deftemplates would be costly, as the number of comparisons grows quadratically with the number of deftemplates, and so two preprocessing steps are performed before the algorithm is applied to make the task tractable; these preprocessing steps are discussed in section 6.3.2.
Algorithm 7 Determining a taxonomic relation between two deftemplates.
Input: p1, the set of slots of template1; p2, the set of slots of template2
Output: a taxonomic relationship between template1 and template2, if one exists
Begin
if p1 ⊂ p2 then template1 is a superclass of template2
else if p1 ⊃ p2 then template1 is a subclass of template2
else if p1 equals p2 then template1 is a sibling of template2
else there is no relationship between template1 and template2
End
The extracted ontology provides a representation of the deftemplates and defclasses used in a JessTab, Jess, or CLIPS program, including an inferred taxonomy, with associated slots represented as properties with the relevant domain and range values. The success of the algorithm (in terms of producing a useful ontology) is therefore dependent on the program that the ontology extraction process is being applied to: if the program does not use any deftemplates and/or defclasses, then the extracted ontology will not contain any classes or properties. If the program does use deftemplates and/or defclasses, then, if the PS is truly generic, the extracted ontology will consist solely of the reasoning related concepts of the PS; otherwise the ontology will likely also contain domain concepts. The main purpose of the extracted ontology is to help the user determine the concepts used by the PS, which can be beneficial when
attempting to understand the PS and evaluate its suitability to the user’s task.
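The comparison performed by algorithm 7 can be sketched in Python (the function name and the set-based slot representation are illustrative, not taken from the Cjjtoe implementation):

```python
def taxonomic_relation(p1, p2):
    """Infer a taxonomic relation between two deftemplates from their
    slot sets p1 and p2 (algorithm 7).  Returns the relation of
    template1 to template2, or None if there is none."""
    p1, p2 = set(p1), set(p2)
    if p1 < p2:          # template2 inherits and extends template1's slots
        return "superclass"
    elif p1 > p2:        # template1 inherits and extends template2's slots
        return "subclass"
    elif p1 == p2:
        return "sibling"
    return None

# The motors deftemplate extends elvis-models' single slot:
print(taxonomic_relation({"model-name"},
                         {"model-name", "horsepower", "max.current", "weight"}))
# → superclass
```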
6.3 Design and Implementation of PS2 R
This section discusses the design of both the search engine and the repository components of PS2 R. Section 6.3.1 describes the architecture of PS2 R; section 6.3.2 covers the Web search and subsequent processing of results, describing how general Web search engines are used to search for PSs, how their search results are parsed, and the ontology extraction mechanism. Finally, section 6.3.3 describes the PS repository, in terms of the database schema, how PSs are added to the repository, how it can be searched by the user, and the generation of a folksonomy of PS terms.
6.3.1 The Architecture of PS2 R
As the author is familiar with the Java programming language, it was decided to implement PS2 R as a Java 2 Enterprise Edition (J2EE) [107] Web application (the standard way of building a Web-based application in Java). As with all J2EE applications [106, chap. 1], PS2 R uses a three-tier architecture featuring a client layer, an application layer, and a data layer, as illustrated in figure 6.2. J2EE applications are considered three-tier as they are distributed over three locations: the client machines, the J2EE server, and the database server. Clients (users) access PS2 R through the client layer using a Web browser. The main functionality of PS2 R is provided by the application layer, which is hosted on a J2EE Web server providing a Java Servlet container (currently Apache Tomcat v6 [35], chosen for convenience as the author is familiar with it; any other suitable J2EE Web server could have been used). The application layer uses a Web-based Model-View-Controller pattern [52] to provide a clean separation between the processing logic (provided by the controller), the presentation logic (provided by the view), and the data modelling and database access (provided by the model). The controller, which is implemented as a series of Java Servlets (the standard J2EE method for handling and processing HTTP requests and responses on a J2EE server [106, chap. 4]), is responsible for searching, processing search results, and maintaining the repository. The view, which is implemented as a series of Java Server Pages (JSPs) (the standard J2EE method for building dynamic Web pages [106, chap. 5]), is responsible for converting the results of an operation (typically a search) into a Web page for display to the user. Along with modelling the data, the model classes, which are implemented as standard Java classes, also handle interaction with the services provided by the data layer.
The data layer uses a relational SQL database (currently MySQL [75], chosen for convenience as the author is familiar with it; any other suitable SQL database server could have been used) to provide the PS repository. The PS repository was created to facilitate storing PSs together with extra derived information about them, such as ontologies describing the concepts they refer to and keywords used by the PSs.
6.3.2 Searching the Web for PSs
The main functionality of PS2 R is the search for PSs, which is supported by two types of keyword-based search: Web and repository (discussed in section 6.3.3). The Web search function searches the Web for (syntactically correct) JessTab, CLIPS, and Jess PSs/programs using the Web Service APIs provided by general Web search engines, such as Google [45], Yahoo! [122], and MSN [71]. The keywords the user provides to PS2 R are converted into an appropriate search query, which is then passed to the search engine; the results are then processed before being added to the repository.

Figure 6.2: Illustration of PS2 R's three-tier architecture.

Figure 6.3 shows the UML class diagram of the classes relevant to the Web search function. There are two main classes involved: the SearchController, which controls the search, and WebSearcher, which defines the interface for every class that searches the Web. There are presently three Web search classes, one for Google (GoogleWebSearcher), one for Yahoo! (YahooWebSearcher), and one for MSN (MsnWebSearcher), although due to limitations of the search abilities provided by the Yahoo! and MSN search Web Services, only Google is currently used (neither the Yahoo! nor the MSN search allows results to be restricted to only ".clp" or ".jess" files). The PsSearchResult class is used to represent the search results. The SearchController stores a list of the available WebSearchers which it uses when searching for PSs; this allows new search engines to be added when relevant APIs become available, and ones which are no longer available to be removed. Figure 6.4 shows the UML interaction diagram for the search. When instructed to initiate a Web search, the SearchController receives the user's search terms (as an array of strings) and the type of file to search for (either ".clp" or ".jess"). The SearchController then iterates through its list of WebSearchers, calling each one's search method (passing as arguments the array of keywords and the file type). This passes responsibility for using a specific engine to each WebSearcher, enabling differences in the APIs provided by different Web search engines to be easily accommodated. In figure 6.4, the GoogleWebSearcher creates an appropriate GoogleSearch, connects to the Google Search Web Service, and executes the query.
For every result it receives, it creates a PsSearchResult instance, which it stores in a list. When all the (Web search) results have been processed, the connection to the Google Search Web Service is closed, and the list of PsSearchResults is returned. After invoking each WebSearcher, the SearchController updates its own list of search results, before moving on to the next WebSearcher. To avoid duplication, the SearchController ensures that each stored result is from a unique URI: if two results share a URI, they are compared to ensure that all the relevant information (URL, name, file type, code) is retained for that search result.

Figure 6.3: UML class diagram of classes relevant to the PS Web search.
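The de-duplication step can be sketched as follows; the dictionary-based PsSearchResult stand-in and the merge rule (retain any field the kept result is missing) are illustrative assumptions, as the thesis does not give the merging code:

```python
def merge_results(results):
    """Combine search results so that each stored result has a unique URI
    (a sketch of the SearchController's de-duplication)."""
    by_uri = {}
    for r in results:                     # r is a dict: url, name, filetype, code
        kept = by_uri.get(r["url"])
        if kept is None:
            by_uri[r["url"]] = dict(r)
        else:
            # retain any relevant field the kept result is missing
            for field in ("name", "filetype", "code"):
                if not kept.get(field):
                    kept[field] = r.get(field)
    return list(by_uri.values())

hits = [{"url": "http://example.org/ps.clp", "name": "ps.clp",
         "filetype": ".clp", "code": None},
        {"url": "http://example.org/ps.clp", "name": "ps.clp",
         "filetype": ".clp", "code": "(defrule ...)"}]
print(len(merge_results(hits)))   # → 1
```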
Figure 6.4: UML sequence diagram for PS2 R's Web PS search using Google.

Any PS search results which are not syntactically correct programs are removed from the list of results, to ensure that only runnable programs/PSs are retained by the system. Syntactic parsing is performed by the CjjtSyntaxValidator, which uses an extended version of Jess's Rete class to test whether each program is a syntactically correct JessTab, CLIPS, or Jess program. The search results are then passed to the ontology extraction tool, Cjjtoe (CLIPS, Jess, JessTab Ontology Extraction), which attempts to extract an ontology of the program/PS's concepts. Before the search results are returned to the user, they are passed to the RepositoryManager, which adds them to the repository.
Extracting Ontologies from JessTab, Jess, and CLIPS Programs

To support the user in evaluating a JessTab, Jess, or CLIPS program/PS, the Cjjtoe tool extracts the program's concepts and represents them as an ontology. This ontology can then be used with suitable tools to visualise and browse the program's concepts more easily than by reviewing the actual program code. Every syntactically correct Jess or CLIPS program must declare the concepts that it uses, typically by defining deftemplates and/or defclasses. As JessTab creates the deftemplates for a JessTab program, deftemplates are not explicitly stated in the program code; it is, however, possible to infer the deftemplates from references to them within the rules. The ontology extraction technique provided by Cjjtoe uses a two-stage process to build an ontological representation of the unordered data used by a JessTab, Jess, or CLIPS program (henceforth referred to as "the program"). (If the program uses only ordered facts, then no ontology can be extracted, as there are no explicitly stated fact structures.) In the first stage, the program is parsed to extract a list of classes and associated properties representing the program's deftemplates and/or defclasses and slots; in the second stage, the ontology's taxonomical structure is inferred based on the domain values of the properties. Figure 6.5 provides example deftemplates for the elvis-models, doors, and motors concepts from ONT(elevator, [pnr]) (section 4.5.3). Every deftemplate contains a name and a series of slots which describe various attributes of the template; for example, the motors deftemplate contains slots for the horsepower, maximum current, model name, and weight. Every slot definition can contain additional information, such as its type (symbol, string, integer, float, number), valid values (for example, allowed strings or numeric ranges), default values, and cardinality.
The first stage of the ontology extraction algorithm effectively converts these examples into the OWL ontology class and property definitions shown in figure 6.6. The mapping between a deftemplate and an ontology class (as illustrated by, for example, the motors deftemplate in figure 6.5 and the motors class in figure 6.6) is straightforward: an ontology class is created for every deftemplate; any defclasses also map directly to corresponding new ontology classes. The mapping between the slots and ontology properties is summarised in table 6.1. All slots are mapped to datatype properties, as their value type can only be one of those defined in OWL (such as a string or an integer). Where the slot's definition specifies a type, that type is used to determine the range of the property; the domain of the property is set to include the class that represents the deftemplate in which the slot is defined (for example, the horsepower slot of the motors deftemplate in figure 6.5 becomes the horsepower datatype property with the motors class as its domain in figure 6.6). Single slots are mapped to functional properties; multi-slots are mapped to non-functional properties. Unlike Jess and CLIPS programs, JessTab programs are very unlikely to contain any deftemplates or defclasses, as these are provided automatically by JessTab using the ontology loaded in Protégé. As discussed in section 2.6.5, JessTab uses the object-based notation to refer to an ontology's concepts. These references can be used by an alternative parser to extract the names of the classes and properties from the ontology: no further type information can be extracted, so all slots are converted to OWL datatype properties in the extracted ontology. After completing the first stage of the ontology extraction, the ontology consists of a list
(deftemplate elvis-models
   (slot model-name (type STRING)))

(deftemplate doors
   (slot car.safety.edge.weight (type FLOAT))
   (slot model-name (type STRING))
   (slot model.weight.constant (type FLOAT))
   (slot opening.strike.side (type STRING))
   (slot opening.type (type STRING))
   (slot space (type FLOAT))
   (slot speed (type STRING)))

(deftemplate motors
   (slot horsepower (type INTEGER))
   (slot max.current (type INTEGER))
   (slot model-name (type STRING))
   (slot weight (type FLOAT)))
Figure 6.5: Example deftemplates for elevator motors and components.

CLIPS, Jess, JessTab Slot Concept    OWL Property
slot                                 Functional Property
multi-slot                           Property
(type FLOAT)                         Datatype property, range xsd:float
(type INTEGER)                       Datatype property, range xsd:int
(type NUMBER)                        Datatype property, range xsd:float
(type STRING)                        Datatype property, range xsd:string
(type SYMBOL)                        Datatype property, range xsd:string

Table 6.1: Summary of the mappings from a CLIPS, Jess, or JessTab slot definition to an OWL ontology property.
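The slot-to-property mappings of table 6.1 reduce to a small lookup; a Python sketch (the function and field names are illustrative, not taken from the Cjjtoe implementation):

```python
# Range mapping from table 6.1: slot value types to XSD datatypes
XSD_RANGE = {
    "FLOAT": "xsd:float",
    "INTEGER": "xsd:int",
    "NUMBER": "xsd:float",
    "STRING": "xsd:string",
    "SYMBOL": "xsd:string",
}

def slot_to_property(slot_name, slot_kind, slot_type, domain_class):
    """Map a CLIPS/Jess/JessTab slot definition to a description of an
    OWL datatype property (an illustrative sketch of table 6.1)."""
    return {
        "name": slot_name,
        "functional": slot_kind == "slot",   # single slots are functional
        "domain": domain_class,
        "range": XSD_RANGE[slot_type],
    }

p = slot_to_property("horsepower", "slot", "INTEGER", "motors")
print(p["range"], p["functional"])   # → xsd:int True
```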
of classes and a list of properties; the second stage uses these lists to infer possible taxonomic relationships based on the domains of the relevant properties. The assumptions of the taxonomic inference algorithm are discussed in section 6.2.1. Briefly, two classes, and the properties with those classes in their domain, are compared: if the set of properties associated with one class is a superset of the properties associated with the other class, then the first class is inferred to be a subclass of the second. To improve the performance of algorithm 9, the two preprocessing steps outlined in algorithm 8 are applied before algorithm 9 is executed. Firstly, the list of extracted classes, classes, is sorted based on the number of properties with each class in their domain; secondly, initial classes are added to a seed ontology, O. The sorting is performed to make the selection of initial classes quicker (as the initial classes will only be associated with a set number of properties, the sorting means it is not necessary to determine the number of properties associated with each class at the initial class selection stage) and, as discussed below, to reduce the complexity of algorithm 9.

Class(elvis-model complete)
Class(doors complete)
Class(motors complete)
DatatypeProperty(model-name domain(elvis-model doors motors) range(xsd:string))
DatatypeProperty(car.safety.edge.weight domain(doors) range(xsd:float))
DatatypeProperty(model.weight.constant domain(doors) range(xsd:float))
DatatypeProperty(opening.strike.side domain(doors) range(xsd:string))
DatatypeProperty(opening.type domain(doors) range(xsd:string))
DatatypeProperty(space domain(doors) range(xsd:float))
DatatypeProperty(speed domain(doors) range(xsd:string))
DatatypeProperty(horsepower domain(motors) range(xsd:int))
DatatypeProperty(max.current domain(motors) range(xsd:int))
DatatypeProperty(weight domain(motors) range(xsd:float))

Figure 6.6: Abstract OWL syntax [88] representation of the classes created from the deftemplates in figure 6.5, along with the datatype properties inferred from the deftemplates' slots.

The second preprocessing step of adding initial classes to a seed ontology is performed to provide an initial set of potential superclasses for the classes in classes passed to algorithm 9. Algorithm 8 is relatively simple: it iterates through the classes list, adding classes to the ontology until a class associated with more than the minimum number of properties is found. As classes is sorted in increasing number of associated properties before being passed to algorithm 8, the algorithm always terminates with the correct output of a series of initial classes being added to O (those classes associated with zero properties and those associated with the minimum number of properties greater than zero). As classes are added to O they are also removed from classes, to ensure that they are not added again and do not cause errors in algorithm 9. Another consequence of classes being sorted before being passed to algorithm 8 is that, even in the worst case (where all the classes are associated with zero properties, or with the lowest number greater than zero), the time taken for the algorithm to complete will be n time units, where n is the number of classes in classes; in the best case, where the first class in classes is associated with one property and the second class is associated with more than one, the algorithm will complete in 1 time unit. After the preprocessing has been performed, the root class of O will have at least one subclass, and algorithm 9 is applied. Effectively, algorithm 9 takes every class in classes that has not been added to O, and checks whether it can be added to O as a subclass of a class already in O (classes with no properties are excluded, as every class would otherwise be added as a subclass of them).
Where it has not been possible to add a class c as a subclass of an existing class, c is added as a subclass of the root class of O. When determining whether c can be added as a subclass of a (non-root) class, the algorithm uses the addAsSubclass(Class c, Class parent) method, which determines the classes located deepest in any branch of O that can be superclasses of c, and updates O accordingly.
Algorithm 8 Preprocessing steps for the taxonomy inference algorithm.
Input:
  classes, a list of classes sorted in ascending order of the number of properties associated with each class (so the first class is associated with the fewest properties and the last class is associated with the most)
  O, an ontology with no classes or properties
Output:
  O, with some classes from classes added as subclasses of the root class
  classes, with the classes added to O removed
Begin
 1: if the first class, c, in classes has 0 properties then
 2:   Add c and all other classes in classes with 0 properties as subclasses of the root class in O
 3:   Remove all classes that were added to O from classes
 4:   c ⇐ next class in classes (the first class with more than zero properties)
 5:   n ⇐ the number of properties of c
 6:   Add c, and all other classes with n properties, as subclasses of the root class in O
 7:   Remove all classes that were added to O from classes
 8: else
 9:   n ⇐ the number of properties of c (the first class in classes)
10:   Add c, and all other classes with n properties, as subclasses of the root class in O
11:   Remove all classes that were added to O from classes
End
Algorithm 9 Taxonomy inference algorithm.
for all remaining classes c in classes do
  added ⇐ false
  for all direct subclasses sc of the root of O (i.e. every class with the root class of O as its immediate superclass) with at least one associated property do
    if addAsSubclass(c, sc) is true then
      added ⇐ true
  if added is false then
    Add c as a subclass of the root class of O

addAsSubclass(Class c, Class parent)
Require: c, a potential subclass; parent, c's potential superclass
if the properties of c are not a superset of the properties of parent then
  return false
added ⇐ false
for all subclasses psc of parent do
  if addAsSubclass(c, psc) is true then
    added ⇐ true
if added is false then
  Add c as a subclass of parent
return true
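The two algorithms can be realised together in a short Python sketch (all names are illustrative; the thesis gives no implementation code). The ontology is represented as a mapping from each class to its set of direct superclasses, with "root" standing for the root class, and the superset test is taken as strict, matching the ⊃ test of algorithm 7. Run on the example input of figure 6.8, the sketch reproduces the taxonomy of figure 6.9:

```python
def infer_taxonomy(props):
    """props maps each class name to its set of associated properties.
    Returns a mapping from class name to its set of direct superclasses
    ("root" marks direct subclasses of the root class), combining the
    preprocessing of algorithm 8 with algorithm 9."""
    classes = sorted(props, key=lambda c: len(props[c]))
    parents = {}

    # Algorithm 8: seed O with the zero-property classes and those with
    # the minimum non-zero number of properties.
    i = 0
    while i < len(classes) and len(props[classes[i]]) == 0:
        parents[classes[i]] = {"root"}
        i += 1
    if i < len(classes):
        n = len(props[classes[i]])
        while i < len(classes) and len(props[classes[i]]) == n:
            parents[classes[i]] = {"root"}
            i += 1
    remaining = classes[i:]

    def add_as_subclass(c, parent):
        # c can only sit below parent if it strictly extends parent's properties
        if not props[c] > props[parent]:
            return False
        added = False
        # snapshot of parent's current direct subclasses (depth-first descent)
        for psc in [k for k, v in parents.items() if parent in v]:
            if add_as_subclass(c, psc):
                added = True
        if not added:
            parents.setdefault(c, set()).add(parent)
        return True

    # Algorithm 9: place every remaining class below the deepest suitable parents
    for c in remaining:
        added = False
        for sc in [k for k, v in parents.items()
                   if "root" in v and len(props[k]) > 0]:
            if add_as_subclass(c, sc):
                added = True
        if not added:
            parents.setdefault(c, set()).add("root")
    return parents

# Figure 6.8's example input
props = {
    "Temp1": {"slot1", "slot2", "slot3", "slot4"},
    "Temp2": {"slot1", "slot2", "slot3", "slot5"},
    "Temp3": {"slot1", "slot2", "slot4"},
    "Temp4": {"slot1", "slot3", "slot4"},
    "Temp5": {"slot1"},
    "Temp6": {"slot2"},
    "Temp7": set(),
    "Temp8": {"slot3", "slot4"},
}
taxonomy = infer_taxonomy(props)
print(sorted(taxonomy["Temp1"]))   # → ['Temp3', 'Temp4']
```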
6.3. Design and Implementation of PS2 R
163
As the body of algorithm 9 simply iterates through the classes list, calling addAsSubclass for each class, algorithm 9 will always terminate as long as addAsSubclass terminates. When the arguments passed to addAsSubclass are such that c should not be made a subclass of parent, the if statement on the first line of the method evaluates to true and the method exits by returning false. When the if statement evaluates to false, the method will always return true (as c should be added as a, potentially indirect, subclass of parent). This is because lines 5 to 7 simply iterate through all the subclasses of parent, testing whether c can be added as a subclass of one of them. This iteration will always terminate because, when the leaf classes of the ontology are reached, parent will not have any subclasses and the recursive call will not be made. This ensures that addAsSubclass always terminates, which in turn ensures that the main body of algorithm 9 terminates. Further, algorithm 9 always terminates with the correct output, as it simply iterates through all the classes in classes, attempting to add each class as a subclass of a class already in O; if a class is not added as a subclass of an existing class, it is simply added as a subclass of the root class of O. This works because of the sorting of classes before the preprocessing steps: when the algorithm is attempting to add a class c and a suitable direct superclass for c is found (let us call that class s), s will always remain a suitable direct superclass for c, and any other class added to the ontology at a later point that can also be added as a subclass of s will be added either as a sibling of c or as a subclass of c. It will never be the case that a class added to O after c will be added as a superclass of c, and so adding c as a direct subclass of the appropriate "deepest" classes in O, when those classes are found, produces the correct output.
To ensure the "deepest" superclass for c is found, lines 3 to 5 of algorithm 9 iterate through all the direct subclasses of the root class of O (that are associated with at least one property), attempting to add c as a subclass of each existing class. Only classes associated with at least one property are considered, as every class in classes would otherwise be added as a subclass of those classes associated with no properties (due to the preprocessing steps, all classes in classes are associated with at least one property). The addAsSubclass method ensures that, if c is not a suitable subclass of parent, then false is returned; otherwise c is added under the deepest suitable subclass of parent, or under parent itself if no subclass (of parent) is a suitable superclass (of c). The addAsSubclass method uses a depth-first traversal of all of parent's subclasses (provided by the recursive call) to ensure that c is only added as a subclass of the deepest suitable parent. When c is a suitable subclass of any sc (main algorithm, line 4), true is returned by the addAsSubclass call, and the added variable is set to true. If, after the inner for loop has completed, added is false, then c is added as a subclass of the root class of O. This ensures that all the classes extracted from the original Jess, JessTab, or CLIPS program are added to the ontology produced by this algorithm, and that they are added as subclasses of the deepest appropriate classes, which prevents them being added as both direct and indirect subclasses of every suitable class.

From the perspective of analysing how long algorithm 9 takes to terminate, the best case is the one in which none of the classes in classes can be added as subclasses of existing classes in O, as every call to addAsSubclass then returns immediately without recursing; the algorithm completes in the sum of all numbers from n to n+m-1 time units, where n is the number of classes in O associated with at least one property when O is passed to the algorithm, and m is the number of classes in classes (the first class in classes is compared with n classes in O, the second class with n+1 classes, and so on, until the last class, which is compared with n+m-1 classes). In the worst case, where addAsSubclass is called for every class in classes and involves iterating to every leaf class in O, the time taken will be n (where n is the number of classes in classes) multiplied by the time taken by the depth-first traversal of the ontology.

Class(elvis-model complete)
Class(doors partial elvis-model)
Class(motors partial elvis-model)

Figure 6.7: The inferred relationships of the classes in figure 6.6, after the taxonomic inference algorithm has been applied.

Figure 6.7 shows the changes to the classes described in figure 6.6 after the taxonomy inference algorithm has been applied. Once the taxonomical structure has been inferred, an XML encoding of the ontology is returned. This is then associated with the relevant PsSearchResult, which is passed on for indexing and storage in the repository. A second, more detailed example of the taxonomy inference process is provided in figures 6.8 and 6.9. Figure 6.8 provides an example input to algorithm 8: a list of eight classes, c1 to c8, sorted in ascending order of the number of associated properties (values in square brackets indicate a list containing the specified values, so at the input stage classes contains the values c7, c5, c6, c8, c4, c3, c2, and c1). The preprocessing steps remove the first three classes (Temp7 (c7), Temp5 (c5), and Temp6 (c6)) from classes, adding them as subclasses of the root class of O.
Figure 6.9 shows the state of the classes and O variables at the end of every iteration of the main for loop in algorithm 9: the first iteration adds Temp8 (c8) as a subclass of the root class of O; the second iteration adds Temp4 (c4) as a subclass of Temp5 and Temp8; the third iteration adds Temp3 (c3) as a subclass of Temp5 and Temp6; the fourth iteration adds Temp2 (c2) as a subclass of Temp5 and Temp6; and the final iteration adds Temp1 (c1) as a subclass of Temp3 and Temp4.
6.3.3 A Repository of Problem Solvers
The data layer of PS2 R provides the system's PS repository. When users perform Web PS searches, the results are used to update the repository before they are presented; users can also submit their own PSs. The repository is currently provided by a MySQL relational database, accessed through data model and data access objects. The design of the repository's database is represented in the Enhanced Entity Relationship (EER) diagram shown in figure 6.10. The problemsolvers table represents the PSs in the repository: for every PS, the repository stores the URL at which it was found; the program code; the file name; the file type (extension); the associated keywords (which include the user's search terms and extra keywords extracted from the ontology); a reference to the main ontology associated with the PS (typically the ontology extracted by Cjjtoe); references to any additional ontologies; and any other related files. The latter two fields are not used for PSs found through the Web search process, but may be required when a user submits their own PS. The keywords table stores the list of keywords used to describe PSs; the ontology table stores the XML serialisation of related ontologies; and the extrafiles table stores any extra files associated with a PS. The database contains three further bridge tables: ontpsbridge, which links the ontologies and PSs; extrapsfilebridge, which links additional files and PSs; and keywordbridge, which links the PSs and their associated keywords.

Input
The following class and property definitions:
c1 is Class(Temp1 complete)
c2 is Class(Temp2 complete)
c3 is Class(Temp3 complete)
c4 is Class(Temp4 complete)
c5 is Class(Temp5 complete)
c6 is Class(Temp6 complete)
c7 is Class(Temp7 complete)
c8 is Class(Temp8 complete)
DatatypeProperty(slot1 domain(Temp5 Temp4 Temp3 Temp2 Temp1))
DatatypeProperty(slot2 domain(Temp6 Temp3 Temp2 Temp1))
DatatypeProperty(slot3 domain(Temp8 Temp4 Temp2 Temp1))
DatatypeProperty(slot4 domain(Temp8 Temp4 Temp3 Temp1))
DatatypeProperty(slot5 domain(Temp2))
classes is the list [c7, c5, c6, c8, c4, c3, c2, c1]
O is an empty ontology

Output
classes is the list [c8, c4, c3, c2, c1]
O consists of:
Class(Temp7 complete)
Class(Temp5 complete)
Class(Temp6 complete)
DatatypeProperty(slot1 domain(Temp5))
DatatypeProperty(slot2 domain(Temp6))

Figure 6.8: An example input and output for the preprocessing algorithm, algorithm 8.
Adding New Problem Solvers

New PSs are added to the repository both by users submitting their PSs directly and as part of the PS Web search process. To avoid duplication, before a PS is added to the repository its URL is compared with those of the stored PSs: if a PS from that location is already present, then the entry is updated if necessary. The cached version of the PS's code is compared with that of the new PS: if they are not identical, then the old version (and its associated data) is removed from the repository, and the new PS is added. A more sophisticated archiving system (which includes versioning) could be implemented in the future if necessary. The process of adding a PS is straightforward using the database design shown in figure 6.10. A new tuple corresponding to the new PS is inserted into the problemsolver table; new tuples are also added to the ontology, extrafiles, and, if necessary, the keywords tables; finally, the three bridge tables are updated accordingly. When a user submits their own PS to the repository, they are required to provide the name of the PS, the URL of the program code, and associated keywords; they also have the option of specifying the URL of the main ontology and any others that should be associated with their PS, along with any other additional files. There are no restrictions placed on the contents of the additional files: for example, they could provide extra program code or documentation. This allows users to submit PSs that are contained in multiple files, or to provide extra information which may be useful for people using the PS. When a PS is added from the Web search, the set of keywords associated with that PS (in
Input
The following class and property definitions:
c1 is Class(Temp1 complete)
c2 is Class(Temp2 complete)
c3 is Class(Temp3 complete)
c4 is Class(Temp4 complete)
c8 is Class(Temp8 complete)
DatatypeProperty(slot1 domain(Temp4 Temp3 Temp2 Temp1))
DatatypeProperty(slot2 domain(Temp3 Temp2 Temp1))
DatatypeProperty(slot3 domain(Temp8 Temp4 Temp2 Temp1))
DatatypeProperty(slot4 domain(Temp8 Temp4 Temp3 Temp1))
DatatypeProperty(slot5 domain(Temp2))
classes is the list [c8, c4, c3, c2, c1]
O consists of:
Class(Temp7 complete)
Class(Temp5 complete)
Class(Temp6 complete)
DatatypeProperty(slot1 domain(Temp5))
DatatypeProperty(slot2 domain(Temp6))

After Iteration 1
classes is the list [c4, c3, c2, c1]
O consists of:
Class(Temp7 complete)
Class(Temp5 complete)
Class(Temp6 complete)
Class(Temp8 complete)
DatatypeProperty(slot1 domain(Temp5))
DatatypeProperty(slot2 domain(Temp6))
DatatypeProperty(slot3 domain(Temp8))
DatatypeProperty(slot4 domain(Temp8))

After Iteration 2
classes is the list [c3, c2, c1]
O consists of:
Class(Temp7 complete)
Class(Temp5 complete)
Class(Temp6 complete)
Class(Temp8 complete)
Class(Temp4 partial Temp5 Temp8)
DatatypeProperty(slot1 domain(Temp5 Temp4))
DatatypeProperty(slot2 domain(Temp6))
DatatypeProperty(slot3 domain(Temp8 Temp4))
DatatypeProperty(slot4 domain(Temp8 Temp4))

After Iteration 3
classes is the list [c2, c1]
O consists of:
Class(Temp7 complete)
Class(Temp5 complete)
Class(Temp6 complete)
Class(Temp8 complete)
Class(Temp4 partial Temp5 Temp8)
Class(Temp3 partial Temp5 Temp6)
DatatypeProperty(slot1 domain(Temp5 Temp4 Temp3))
DatatypeProperty(slot2 domain(Temp6 Temp3))
DatatypeProperty(slot3 domain(Temp8 Temp4))
DatatypeProperty(slot4 domain(Temp8 Temp4 Temp3))

After Iteration 4
classes is the list [c1]
O consists of:
Class(Temp7 complete)
Class(Temp5 complete)
Class(Temp6 complete)
Class(Temp8 complete)
Class(Temp4 partial Temp5 Temp8)
Class(Temp3 partial Temp5 Temp6)
Class(Temp2 partial Temp5 Temp6)
DatatypeProperty(slot1 domain(Temp5 Temp4 Temp3 Temp2))
DatatypeProperty(slot2 domain(Temp6 Temp3 Temp2))
DatatypeProperty(slot3 domain(Temp8 Temp4 Temp2))
DatatypeProperty(slot4 domain(Temp8 Temp4 Temp3))
DatatypeProperty(slot5 domain(Temp2))

After Iteration 5 (Return Values)
classes is an empty list
O consists of:
Class(Temp7 complete)
Class(Temp5 complete)
Class(Temp6 complete)
Class(Temp8 complete)
Class(Temp4 partial Temp5 Temp8)
Class(Temp3 partial Temp5 Temp6)
Class(Temp2 partial Temp5 Temp6)
Class(Temp1 partial Temp3 Temp4)
DatatypeProperty(slot1 domain(Temp5 Temp4 Temp3 Temp2 Temp1))
DatatypeProperty(slot2 domain(Temp6 Temp3 Temp2 Temp1))
DatatypeProperty(slot3 domain(Temp8 Temp4 Temp2 Temp1))
DatatypeProperty(slot4 domain(Temp8 Temp4 Temp3 Temp1))
DatatypeProperty(slot5 domain(Temp2))

Figure 6.9: An example execution of the taxonomy inference algorithm, algorithm 9.
terms of the tuples added to the keywordbridge table of the database) is the union of the set of search terms (entered by the user) that were matched by that PS and the set of entity (class and property) names used in the extracted PS ontology; when a user submits their own PS, they can define the set of associated keywords themselves. This can result in a sizeable number of keywords being associated with a PS, which should improve the recall of repository searches and provide more keywords for use in the folksonomy of popular PS tags/keywords.

Figure 6.10: EER diagram showing the final, normalised design of the PS2 R repository's database.
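The insert-and-bridge process for a new PS can be sketched with Python's sqlite3 standing in for MySQL; the table and column names below are illustrative simplifications of the schema in figure 6.10 (only the problemsolvers, keywords, and keywordbridge tables are shown):

```python
import sqlite3

# In-memory stand-in for the MySQL repository (illustrative schema)
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE problemsolvers (id INTEGER PRIMARY KEY, url TEXT UNIQUE,
                             name TEXT, filetype TEXT, code TEXT);
CREATE TABLE keywords      (id INTEGER PRIMARY KEY, keyword TEXT UNIQUE);
CREATE TABLE keywordbridge (ps_id INTEGER, keyword_id INTEGER);
""")

def add_ps(url, name, filetype, code, keywords):
    """Add a PS and its keywords, updating the bridge table (a sketch of
    the RepositoryManager's behaviour, not the actual implementation)."""
    cur = db.execute("INSERT INTO problemsolvers (url, name, filetype, code) "
                     "VALUES (?, ?, ?, ?)", (url, name, filetype, code))
    ps_id = cur.lastrowid
    for kw in keywords:
        db.execute("INSERT OR IGNORE INTO keywords (keyword) VALUES (?)", (kw,))
        kw_id = db.execute("SELECT id FROM keywords WHERE keyword = ?",
                           (kw,)).fetchone()[0]
        db.execute("INSERT INTO keywordbridge (ps_id, keyword_id) VALUES (?, ?)",
                   (ps_id, kw_id))

# keywords = matched search terms plus entity names from the extracted ontology
add_ps("http://example.org/design.clp", "design.clp", ".clp",
       "(defrule ...)", {"design", "motors", "doors"})
```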
Searching the Repository for PSs
From the user's perspective, there are two techniques for searching the repository: a keyword-based search, similar to that of a Web search engine, and browsing the tag cloud of PS keywords. From the design perspective, these are the same operation: selecting a tag (keyword) equates to searching using a single keyword. When a search is performed, the keywords provided are matched against those in the keywords table, and any PSs associated with those keywords are returned to the user. The associated sequence diagram (figure 6.11) illustrates the search process: the search terms are passed to the SearchController, which in turn passes them to the RepositoryManager. Access to the repository has been designed using the Abstract Factory pattern [37, chap. 3]. The RepositoryManager knows which RepositoryAccessor it should use (currently an instance of the DatabaseRepositoryAccessor class) and instructs it to perform the search. The DatabaseRepositoryAccessor in turn generates the appropriate query, executes it over the database, and converts the results into instances of PsSearchResult. These are returned to the RepositoryManager, which returns them to the SearchController, which returns them to the presentation layer for display to the user. The Abstract Factory pattern is used for repository access to allow the repository to be easily changed to another database, triplestore, or other storage medium in the future, if required.
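The search flow and the Abstract Factory arrangement just described can be sketched as follows. The class names mirror those in the text, but the method signatures and the in-memory keyword table are illustrative assumptions, not the actual PS2 R implementation (which queries the database).

```python
# Minimal sketch of the repository-access design: the manager depends only on
# an abstract accessor interface, so the database back end could later be
# swapped for a triplestore without changing the search flow.
from abc import ABC, abstractmethod

class PsSearchResult:
    def __init__(self, ps_name):
        self.ps_name = ps_name

class RepositoryAccessor(ABC):
    @abstractmethod
    def search(self, keywords):
        """Return a list of PsSearchResult for PSs matching any keyword."""

class DatabaseRepositoryAccessor(RepositoryAccessor):
    def __init__(self, keyword_table):
        # keyword_table maps a keyword to the PS names associated with it,
        # standing in for the keywords/keywordbridge tables of the database.
        self.keyword_table = keyword_table

    def search(self, keywords):
        matches = set()
        for kw in keywords:
            matches.update(self.keyword_table.get(kw, []))
        return [PsSearchResult(name) for name in sorted(matches)]

class RepositoryManager:
    def __init__(self, accessor: RepositoryAccessor):
        self.accessor = accessor   # concrete accessor chosen in one place only

    def search(self, keywords):
        return self.accessor.search(keywords)

# Hypothetical PS names, for illustration only.
manager = RepositoryManager(DatabaseRepositoryAccessor(
    {"diagnosis": ["elevator-diag"], "design": ["vt-pnr", "elevator-diag"]}))
results = manager.search(["design"])
```

Because the RepositoryManager holds only the abstract RepositoryAccessor, replacing the storage medium requires changing a single construction site, as the text notes.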
Generating a Folksonomy of PS Search Terms To support the user’s search of the repository, a tag cloud visualisation of frequently occurring PS keywords is generated. Figure 6.12 shows the sequence diagram for generating the tag
Figure 6.11: Sequence diagram of searching the repository. cloud, which is very similar to that of searching the repository; the main difference is that this search retrieves the 20 most popular keywords from the keywords table, and passes them to the TagCloudGenerator class which in turn generates a selectable tag cloud/list for inclusion in a Web page. When the user clicks one of the keywords (or tags) in the tag cloud, all the PSs associated with that keyword in the database are displayed. A sample tag cloud is shown in figure 6.13.
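The tag cloud generation step can be illustrated with a small sketch: retrieve the most popular keywords (capped at 20, as described above) and map each keyword's frequency to a display size. The function and parameter names here are hypothetical; PS2 R's TagCloudGenerator produces a selectable tag cloud for a Web page rather than (keyword, size) pairs.

```python
# Sketch of tag-cloud generation: take the most frequent PS keywords and
# scale each to a font size proportional to its frequency. Illustrative
# only -- not the TagCloudGenerator's actual interface.
from collections import Counter

def generate_tag_cloud(keyword_counts, limit=20, min_pt=10, max_pt=24):
    """Return (keyword, font_size) pairs for the `limit` most popular keywords."""
    top = Counter(keyword_counts).most_common(limit)
    if not top:
        return []
    hi = top[0][1]          # highest frequency among the selected keywords
    lo = top[-1][1]         # lowest frequency among the selected keywords
    span = max(hi - lo, 1)
    # Linearly interpolate each frequency into the [min_pt, max_pt] range.
    return [(kw, min_pt + (count - lo) * (max_pt - min_pt) // span)
            for kw, count in sorted(top)]

# Hypothetical keyword frequencies drawn from the keywords table.
cloud = generate_tag_cloud({"diagnosis": 12, "design": 7,
                            "elevator": 3, "computer": 3})
```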
Figure 6.12: Sequence diagram showing the processes for generating the tag cloud/folksonomy of PS tags (keywords) in the repository.
Figure 6.13: A sample tag cloud from the PS2 R.
6.4 Summary
This chapter discusses PS2 R (the Problem Solver Search engine and Repository), an initial system developed to support searching for generic PSs using Web-based search and an associated
repository. The Web-based search of PS2 R uses existing Web search engines to find PSs/programs written in JessTab, Jess, or CLIPS. Search terms provided by the user are passed to the available Web search engines along with the requirement to only return files which have the ".clp" (standard JessTab, Jess, or CLIPS) or ".jess" (non-standard Jess) extension. These results are then parsed to ensure that they are syntactically correct JessTab, Jess, or CLIPS PSs/programs, with all files that fail this test being discarded. The syntactically correct programs are then passed to Cjjtoe, a tool which first builds a list of the concepts used by each program, and then attempts to infer subclass/superclass relationships between those concepts, which are then represented in an OWL ontology. PS2 R then uses these results to maintain a repository of PSs, which can be browsed through a tag cloud or searched using standard keyword-based searching.
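The filtering stages of this pipeline can be sketched as follows. This is an illustration only: the extension check matches the ".clp"/".jess" requirement described above, while the balanced-parenthesis test is a crude stand-in for the full syntactic parse that PS2 R performs; the URLs are examples.

```python
# Sketch of PS2R's result filtering: keep only URLs with the ".clp" or
# ".jess" extensions, then discard files that are not syntactically
# plausible CLIPS/Jess programs (simplified to a parenthesis-balance check).

ALLOWED_EXTENSIONS = (".clp", ".jess")

def has_ps_extension(url):
    """Does the URL have one of the required file extensions?"""
    return url.lower().endswith(ALLOWED_EXTENSIONS)

def looks_like_clips(source):
    """Crude check: non-empty, starts a form, parentheses balance throughout."""
    depth = 0
    for ch in source:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:          # a ")" with no matching "(" -- reject
                return False
    return depth == 0 and source.lstrip().startswith("(")

urls = ["http://example.org/diag.clp", "http://example.org/notes.txt"]
candidates = [u for u in urls if has_ps_extension(u)]
valid = looks_like_clips('(defrule start (fact) => (printout t "ok"))')
```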
Chapter 7
Evaluation

7.1 Overview
This chapter discusses the experiments performed to evaluate the KBS development methodology described in this work. Specifically, this involved evaluation of MAKTab (section 7.2) and the search engine component of PS2 R (section 7.3). Details of the procedures used during these evaluations, along with the results and conclusions, are reported in this chapter. Ideally the MAKTab evaluation would have included a comparative study, in which the performance of MAKTab would be compared with the performance of related tools such as PSM Librarian, CommonKADS, IBROW3, and EXPECT; however, for various reasons discussed in section 7.2.2, such a study was not possible. The evaluation of MAKTab therefore consisted of three experiments outlined in section 7.2.3: an evaluation by the developer building two types of KBS in the elevator domain (Experiment A, section 7.2.3), a user evaluation building two types of KBS in the computer hardware domain (Experiment B, section 7.2.3), and an interface evaluation (Experiment C, section 7.2.3). The results of these experiments are presented and discussed in section 7.2.4. Briefly, Experiment A found that MAKTab could be used to successfully build two large and relatively complex KBSs in the elevator domain, which were both executable and produced correct outputs; during Experiment B MAKTab was successfully used by subjects to build configuration and diagnostic KBSs in the computer hardware domain, and observations of the subjects' use of MAKTab provided suggestions for potential improvements to the tool; potential points for improvement of the tool's interface were also identified in Experiment C. The evaluation of PS2 R consisted of an empirical evaluation which investigated the on-line availability of JessTab, Jess, and CLIPS implementations of the standard PSMs as returned by Web search engines, in order to determine the extent to which PS2 R supports the acquisition of PSs, particularly from the Web.
This evaluation involved a comparison of PS2 R's performance with that of a standard Web search engine (Google) when searching for PSs which use a variety of terms associated with standard PSMs. This comparison found that, although PS2 R returned more focused results (i.e. files which were not syntactically valid programs were not returned), none of the searches returned PSs which could be reused in other domains. However, the extra features of PS2 R, especially the ontology extraction feature, proved very useful when evaluating the search results, particularly for gaining an understanding of the domain in which a program operates.
7.2 Evaluation of MAKTab
MAKTab provides an implementation of the proposed KBS development methodology described in this thesis. To evaluate the methodology, a series of experiments were performed in which KBSs were built using MAKTab. Briefly, KBSs performing diagnosis and configuration in the elevator domain were built by myself; KBSs performing the same tasks were built in the computer hardware domain by a number of subjects; and an evaluation of the tool’s interface was also performed. This section provides details of these experiments, along with results of the studies and conclusions.
7.2.1 Aim of Evaluation
The main aims of the evaluation of MAKTab were to determine if: 1) from the user's perspective, the overall KBS development process is superior to existing KBS development techniques (section 7.2.2), and 2) using the tool itself is an effective approach to building a new KBS (section 7.2.3). Unfortunately, for reasons discussed in section 7.2.2, it was not possible to determine the first of these points. The specific aims of the evaluation process were to evaluate three of the hypotheses of this thesis. These are (as set out in section 1.2): if one has a KBS which is capable of solving propose-and-revise tasks in the elevator domain, and one wishes to develop a diagnostic KBS in the same domain, then:
1. The necessary diagnostic rule set for an elevator diagnostic KBS will not be contained in an existing elevator configuration KBS which uses propose-and-revise, and hence it will be necessary to acquire the additional domain specific rule set (hypothesis 1).
2. A generic diagnostic problem solver (PS), together with the appropriate domain ontology, can be used to "drive" the Knowledge Acquisition (KA) process (hypothesis 2).
3. An ontology developed for use with a propose-and-revise PS for a particular domain will need to be extended when it is used as the basis for acquiring a diagnostic rule set for the same domain (and vice versa) (hypothesis 3).
These can of course be generalised to apply to further pairs of problem solvers. These three hypotheses were assessed empirically.
Support for/against the first hypothesis was gained by performing experiments in which two sets of KBSs were created by reusing the components of existing KBSs; support for/against the second was gained through the use of MAKTab, and experiments focused on determining its effectiveness for KBS development; support for/against the third was gained during both of these experiments, as, having developed two sets of KBSs, it would be clear whether it had been necessary to extend the original domain ontology.
7.2.2 Potential Comparative Studies
In order to determine whether MAKTab is a better tool for building KBSs than existing tools/techniques, ideally a comparative study would be performed. This would involve subjects (typically knowledge engineers) being given the task of building two identical KBSs with at least two different KBS development tools. The length of time required to build each KBS would be recorded, as would some measure of each subject's interactions with the tools used, and the accuracy of the
developed KBSs, to evaluate which tool was the quickest/most efficient for building each type of KBS. Further, after developing each KBS the subjects would be asked to complete a questionnaire which records their opinions of the tool they have just used (by asking, for example, how easy it was to follow what was happening, to create/use the various KBS components, and so on). Upon completion of all the experiments, the subjects would be asked to complete a questionnaire in which they compare and contrast the KBS development tools they have used, provide details of each tool’s strengths and weaknesses, and express views on which tool they prefer. The results of these experiments would then be used to determine which tool was the quickest, on average, for building KBSs, which provided the best user experience, and which the subjects preferred overall; all of which could be used to determine if MAKTab is a superior tool for KBS development. It is unfortunate therefore that, for various reasons, such an experiment is not currently possible. Several KBS development methodologies and their associated tools are discussed in sections 2.2 and 2.4.3 (PSM Librarian, CommonKADS, IBROW3, and EXPECT). Briefly, the main reason why a comparative study cannot be performed with these projects is the lack of complete implementations: section 2.2 discusses in detail the extent to which each of these projects produced a working system; some details are summarised below.
PSM Librarian
The implemented tool of the PSM Librarian approach, PSMTab (for Protégé), does not support the development of an actual executable KBS; it only supports the mapping of instances from one ontology to another, ending with an instantiated version of the PS ontology. Unfortunately, this is not enough to determine if the mapping process has been successfully performed; without executable code it is impossible to determine the correctness of the mappings. The mappings may appear to the user to be correct; however, without being able to execute the KBS and evaluate its results, the user is unable to evaluate their correctness. Evaluation based solely on the appearance of the mappings is akin to evaluating a computer program based on examination of the source code, without using run-time testing to assess its fitness for purpose. A complete comparative study with PSM Librarian is therefore impossible. However, as both approaches include some form of ontology mapping between a domain and a problem solver ontology, it would be possible to compare the ontology mapping tools of PSMTab and MAKTab, to determine which tool subjects prefer to work with.
KADS-I, KADS-II, and CommonKADS
As noted in [114], the CommonKADS project suffered from a lack of implementation. There are no usable support tools for the CommonKADS methodology, nor are there any publicly available implementations of the PSs that are discussed in [10] and [98, chap. 6]. Following the CommonKADS methodology, the developer (or subject in a comparative study) is required to build every component of the KBS from scratch. Therefore, a comparison with CommonKADS would not provide any support for the hypothesis that the methodology described in this thesis is more effective for building KBSs through reuse than the CommonKADS methodology (primarily because the CommonKADS methodology incorporates very little (computational) reuse), and so comparison experiments with CommonKADS are not a suitable form of evaluation.
The IBROW3 Project
IBROW3 was one of the few projects that did produce a working implementation, in the form of a prototype brokering service which used the IBROW3 architecture to develop a KBS. There were also at least two repositories of reusable problem solvers developed [73; 74]. However, as might be expected several years after the project's completion, the working system is no longer available. Without a working system, it is impossible to perform a comparative study.
EXPECT
The EXPECT system was developed to help users extend and adapt an existing KBS to a new domain. As such, its purpose is not to help with the creation of a brand new KBS by reuse, in the style proposed by the methodology described in this thesis; this limits the suitability of EXPECT for a comparative study. Such a study would require the development of a suitable KBS in EXPECT, which subjects could then customise to work in another domain; this is a somewhat different problem from the one MAKTab is designed to address. Such an experiment would be focused solely on the subject using EXPECT to define the PS specific domain knowledge and, if necessary, descriptions of extra domain concepts; as such it would only provide a comparison with MAKTab's KA tool.
7.2.3 Evaluation Techniques
It should be clear from the above that it is not possible to perform a satisfactory comparative study between previous work and the MAKTab approach. Consequently, the implementation (and hence the methodology) was evaluated in three stages. Initially, the tool was used by myself to build diagnostic and configuration KBSs in the elevator domain. This determined whether the tool could be used to build two different, working KBSs in the same domain using components extracted from existing working systems (Experiment A). After completing this, an evaluation was performed in which subjects were asked to build KBSs for diagnosis and configuration in the domain of computer hardware (Experiment B). The results from both Experiments A and B were then used to evaluate hypotheses 1, 2, and 3. Finally, a usability study of MAKTab's interface was carried out (Experiment C).
Elevator Domain Evaluation (Experiment A)

Elevator Configuration
MAKTab was developed based on manual experiments in which diagnostic and configuration KBSs in the elevator domain were built; the first evaluation of MAKTab involved myself using it to repeat these experiments. Elevator configuration provides a particularly challenging domain: the Sisyphus-II VT challenge documentation [123], which was used to provide the task specification for this experiment, contains 46 pages describing the task, consisting of discussions of 14 component types, each with several sub-components; around 190 named parameters, each with at least one associated value assignment expression (calculation); descriptions of 50 separate named constraints (although in some cases a description provides multiple constraints); and around 59 associated fixes. In the Stanford implementation this translates to around 7200 lines of CLIPS code, plus an additional 800 lines for the propose-and-revise method. Developing such a large, complex KBS would provide a significant challenge for MAKTab: if successfully completed, this would show that it is possible to build executable KBSs using the tool and methodology.
During this experiment the following materials were used:
• Protégé with both MAKTab and JessTab plug-ins installed.
• The generic propose-and-revise PS, PS(pnr, -).
• The elevator domain ontology ONT(elevator, [diag]).
• The Sisyphus-II VT specification document [123].

Elevator Diagnosis
To evaluate MAKTab with another type of PS, an elevator diagnostic KBS was developed. The diagnostic knowledge used in this experiment was the same as that used in the corresponding manual experiment (section 4.6.1). Briefly, this was the interview transcripts upon which the original (CLIPS) elevator diagnosis KBS was based, and an extracted version of additional diagnostic knowledge taken from the original system. This gave a total of 90 descriptions relating to elevator malfunctions and their symptoms and repairs, covering over 26 different concepts. During this experiment the following materials were used:
• Protégé with both MAKTab and JessTab plug-ins installed.
• The generic diagnostic PS, PS(diag, -).
• The elevator domain ontology ONT(elevator, [pnr]).
• The interview transcript and additional elevator diagnostic knowledge.
Experimental Evaluation (Experiment B)
For these experiments, a number of Computing Science graduates and PhD students, with a range of familiarity with ontologies, PSs, and KBSs, were asked to build KBSs using MAKTab. The expected users of MAKTab are Computing Scientists and/or Knowledge Engineers, who may or may not be familiar with ontologies, PSs, and KBSs; it was felt therefore that the chosen subjects represented the expected users well, allowing the results of this experiment to be generalised. As Tallis discusses [108, chap. 6], controlled KA experiments of this kind are difficult: subjects are a limited resource, and require familiarity with the domain (or need to be given some training); additionally, experiments are time consuming, which further reduces the number of subjects willing to participate. It was therefore decided to ask subjects to build a highly constrained configuration and/or diagnostic KBS in the computer hardware domain. The domain of computer hardware was used, as opposed to the elevator domain, because it is simpler, and hence building KBSs within it is a more tractable task. For example, a basic computer configuration system requires only a handful of rules. Using this task means that it is possible to perform a thorough test of the system within an experimentally acceptable time period; for example, the computer configuration task that was used can be achieved in around two hours. (As discussed in section 7.2.4, the complete elevator configuration KBS took 16 hours for myself to build; not only would it be difficult to find subjects who could give up at least that length of time, but such a period would also lead to subject fatigue and loss of interest, all of which would adversely affect the results.) Therefore two different experiments were performed:
B1 The development of a propose-and-revise KBS to specify computer hardware configurations, by mapping a computer ontology to a generic propose-and-revise PS, and then extending it (if necessary) with configuration rules.
B2 The development of a diagnostic KBS for computer faults, by mapping a computer hardware ontology to a generic diagnostic PS, and then extending it (if necessary) with diagnostic rules.
In both of these experiments subjects were provided with the following resources to help them perform the experiment:
• Protégé with both MAKTab and JessTab plug-ins installed.
• The generic propose-and-revise PS or the generic diagnosis PS for MAKTab.
• An ontology for the computer hardware domain, taken from an existing KBS (for either propose-and-revise or diagnosis), developed by myself prior to the evaluation process. An initial computer hardware ontology was developed based on the descriptions of computer hardware components provided by two on-line computer retailers, Dabs (http://www.dabs.com) and eBuyer (http://www.ebuyer.com); this was then used (and extended as necessary) by myself to build the two KBSs following the task descriptions that were given to the subjects (see below). The domain ontologies were then extracted from both of these systems, and all links to any reasoning specific knowledge and any concepts that were not used by the final KBS were removed. Subjects were then provided with the computer ontology extracted from the KBS for the other task than the one they were asked to build: for example, subjects building KBS(diag, computer) were provided with PS(diag, -) and my ONT(computer, [pnr]). (The computer domain ontologies were built after the task specification documents had been written, but without reference to them.)
• The MAKTab user manual (appendix C).
• An introductory document briefly describing KBSs, ontologies, Protégé, and JessTab (appendix D).
• A brief introduction to the experiment (appendices E and F).
• Problem solver related documentation describing how to configure PS(pnr, -) and PS(diag, -) (appendices G and H respectively).
• An introductory tutorial document for configuring the generic propose-and-revise or generic diagnostic PS (appendices I and J respectively).
The subjects were also provided with a description of the knowledge their KBS had to contain; they were free to supplement this with additional knowledge if desired. The introduction to computer configuration is included in appendix K, and the introduction to computer diagnosis is included in appendix L. The computer configuration document was written based on my general
knowledge of the task; the computer diagnosis document was written using a subset of the problems and repairs described in [94]. Both documents describe systems which can be represented with a similar number of rules (my KBS(pnr, computer) contained 60 rules, and my KBS(diag, computer) contained 61 rules). Before starting the experiments, subjects were asked to complete a questionnaire in which they provided details of their familiarity with ontologies, Protégé, PSs/PSMs, Jess, KBSs, MAKTab, the domain, and the task their KBS was to address. During the experiments, subjects were encouraged to "think out loud" and detailed notes of their actions were taken by the investigator; these information sources were then used to determine where subjects had difficulties, and so highlight potential areas needing improvement. Further, after completing the experiment, subjects were asked to complete a questionnaire in which they provided details of, and opinions about, their experience of using the system. A sample questionnaire is provided in appendix M. Using the results of these experiments, it should be possible to determine whether:
1. extra diagnostic rules need to be acquired when building a new KBS for, for example, diagnosis, using existing components from, for example, a propose-and-revise based configuration KBS (hypothesis 1);
2. the knowledge from the domain (computer) ontology (relating to computer hardware components) and the generic PS can be used to "drive" the acquisition of any additional PS specific knowledge that is required (hypothesis 2); and
3. the knowledge in a domain ontology from one KBS needs to be extended during the development of a KBS for another type of problem solving (hypothesis 3).
Further, by developing MAKTab, which is capable of producing executable KBSs, this work provides a significant advance over previous attempts at this problem.
The user evaluations should determine whether MAKTab is easy to use, along with providing insights into how it can be improved, so as to provide the Knowledge Engineering community with a methodology and a tool for achieving the long-standing goal of KBS creation by configuring reusable components.
Interface Evaluation (Experiment C)
A usability study [77] was also performed to evaluate the interface of the tool. The basis for this evaluation was the Xerox heuristic evaluation check-list [89], which provides a comprehensive set of criteria for evaluating graphical user interfaces (GUIs). This check-list consists of 339 evaluation criteria grouped into 13 categories developed specifically to assess the user interface of a program. Each category evaluates a different aspect of the interface. The 13 categories are:
• visibility of system status (29 checkpoints),
• match between system and the real world (24 checkpoints),
• user control and freedom (23 checkpoints),
• consistency and standards (51 checkpoints),
• help users recognise, diagnose and recover from errors (21 checkpoints),
• error prevention (15 checkpoints),
• recognition rather than recall (40 checkpoints),
• flexibility and minimalist design (16 checkpoints),
• aesthetic and minimalist design (12 checkpoints),
• help and documentation (23 checkpoints),
• skills (22 checkpoints),
• pleasurable and respectful interaction with the user (17 checkpoints), and
• privacy (3 checkpoints).
This evaluation was performed by myself and by two of the subjects who had participated in Experiment B (after they had completed that experiment). Subjects were asked to provide opinions on each evaluation point (whether or not the system complies with the checkpoint, or whether the checkpoint is not applicable); the results of the study were used to improve the interface where necessary.
7.2.4 Results
Elevator Domain Evaluation (Experiment A)

Elevator Configuration (KBS(pnr, elevator))
In total, it took 16 hours 40 minutes to build the KBS(pnr, elevator) described in [123] using MAKTab. Although this appears an excessively long time, consider that the generated version of the rules consists of 8377 lines of code (using standard Jess formatting), containing 340,979 characters; given an average (fast) touch-typing speed of 40 five-character words per minute, it would take over 23 hours to type those rules manually (without any additional time for thinking about the syntax and semantics of the rules). Of the 16 hours and 40 minutes, around 1 minute was spent using the mapping tool, with the remaining 16 hours 39 minutes spent using the KA tool. As I was familiar with [123], I was aware of the type of domain concepts that it uses, which facilitated a quick analysis of the suitability of the domain concepts in ONT(elevator, [diag]) with respect to the KBS that was being developed. Effectively, it was determined that no relevant knowledge could be acquired through mapping. The bulk of the work was therefore done with the KA tool: in total 651 rules were defined, along with 206 new SystemVariables and 23 subclasses of SystemComponent representing various elevator components, which had a further 80 associated individuals and 39 new associated properties. The following 23 new classes were defined during the experiment (the number in brackets is the number of associated individuals):
• car buffer (2),
• car guiderail (5),
• car guideshoe (1),
• compensation cable (7),
• counterweight between guiderails (3),
• counterweight buffer (2),
• counterweight frame model (1),
• counterweight guiderail (4),
• counterweight guideshoe (1),
• crosshead (5),
• deflector sheave (2),
• doors (6),
• governor cable (1),
• governor (1),
• hoist cable (2),
• machine beam (10),
• machine groove (2),
• machine (4),
• motor generator (4),
• motor (6),
• platform (3),
• safety beam (3), and
• sling (5).
The following 39 properties were also defined to describe various aspects of these components:
• a,
• b,
• bending moment,
• bending moment maximum,
• c,
• compatible with,
• deflection index,
• depth,
• diameter,
• distance,
• hasCrossheadModel,
• height,
• horsepower,
• left offset,
• maximum load,
• max current,
• max load,
• max suspended load,
• min load,
• model-code,
• opening-strike-side,
• opening-type,
• protrusion,
• right offset,
• safety beam constant,
• safety beam model a,
• safety beam model b,
• section modulus,
• sheave diameter,
• sheave height,
• speed,
• stroke,
• ultimate strength,
• unit-weight,
• upgrade-name,
• voltage,
• weight,
• weight-limit, and
• width.
The 651 rules consisted of 273 assignment rules, 76 constraint rules, 62 fix rules, 192 initial value rules, and 48 output rules. When defining these rules, MAKTab automatically created a total of 6864 new individuals to represent the rules (651 individuals), atom lists (1382 individuals), and atoms and their property values (a combined total of 4831 individuals). A further breakdown is provided in table 7.1, which, for each rule type, lists the total number of rules of that type, the number of rules with antecedents, the maximum number of antecedents, the average number of antecedents per rule, the number of rules with consequents, the maximum number of consequents, and the average number of consequents per rule. The executable JessTab KBS(pnr, elevator) that was generated consisted of 8937 lines of code, 8377 lines of which corresponded to PS-RS(pnr, [elevator]) and 560 lines of which corresponded to PS-RS(pnr, -). Along with the 273 assignment rules, 76 constraint rules, and 62 fix rules, PS-RS(pnr, [elevator]) also contained 440 dependency rules generated from the assignments defined in the assignment and fix rules, and 636 property value assignment rules generated from the domain model (see section 4.5.1 for more discussion of the various rule types).

Rule Type                           | Total # | # Rules | Max. # | Avg. # | # Rules | Max. # | Avg. #
                                    | Rules   | w/Ant.  | Ant.   | Ant.   | w/Con.  | Con.   | Con.
SystemVariableConstraintRule        | 76      | 76      | 3      | 1.434  | 76      | 1      | 1.000
FixRule                             | 62      | 62      | 1      | 1.000  | 62      | 1      | 1.000
SystemVariableValueAssignmentRule   | 273     | 145     | 3      | 1.766  | 271     | 1      | 1.000
OutputCalculatedComponentRule       | 16      | 16      | 1      | 1.000  | 0       | -      | -
OuputCalculatedSystemVariablesRule  | 32      | 32      | 6      | 1.438  | 0       | -      | -
InitialComponentSelectionRule       | 19      | 4       | 1      | 1.000  | 15      | 1      | 1.000
InitialSystemVariablesValue         | 173     | 173     | 1      | 1.000  | 0       | -      | -

Table 7.1: Summary of the types of rules defined in KBS(pnr, elevator).
To evaluate the correctness of the generated KBS, it was executed with JessTab. The first time it was executed, it failed to produce a valid design, reporting that several constraints (12, 18, 20, 32, and 42) had been violated and could not be repaired. As I knew that this should not be the case, MAKTab was then used to examine the associated constraint rules; in most cases a mistake had been made when defining the constraint: for example, constraint 42 had been expressed using a "less than" comparison instead of a "greater than", and an argument in the comparison expression for constraint 12 had been placed in the wrong position within the expression. Constraint 32 (which related to the car buffer striking speed minimum) was less clear, as the constraint had been expressed correctly; MAKTab was therefore used to search for all assignment rules which referred to the car-buffer-striking-speed-minimum variable. This found that two rules were attempting to set the variable's value; on examining these rules it was clear that one should have been setting the car-buffer-striking-speed-maximum value, not the minimum value, and the correction was made. After these corrections had been made,
the KBS was regenerated and executed with JessTab, which successfully produced a design specification. The KBS was then executed several times, with the outputs being recorded. To evaluate the KBS's design proposals, they were compared with the sample solution provided in [123]: one design proposal was equal to the sample solution, and the others were similar enough for me to accept them as correct. A solution was judged to be similar enough to the sample solution if there were only minor differences in component selection (such as a more powerful motor being selected) and dependent variables (such as the total weight of the machine assembly and, in turn, the total machine beam impact load, which is dependent on the machine assembly weight) were correctly updated to reflect the slight difference in component selection. Overall this experiment provided a significant test of MAKTab and PS(pnr, -), with many lessons being learnt about the PS configuration process. Although mapping was not required in this experiment, this did illustrate that the KA tool was capable of being used to define all the domain concept knowledge as well as the reasoning/rules. The KA tool was used extensively throughout this experiment, which provided a good opportunity to thoroughly evaluate the support it provides. In this experiment, the descriptions given in [123] were followed sequentially from start to finish; this meant defining various general parameters, knowledge of the elevator and related components, rules regarding component selection, loads and moments, constraints and associated fixes, and finally output selections and initial values. The guided KA process was therefore rarely used to its full extent for PS(pnr, -) (i.e. defining the initial value(s), output selection(s), value assignment(s), constraint(s), and fix(es)); instead, rules relevant to the knowledge being encoded were created as appropriate.
However, subsequences of the guided KA process were frequently used: for example, after defining a constraint rule, a series of fix rules was often subsequently defined. In fact, the description of a constraint often required the creation of (possibly several) new variables, with associated assignment, constraint, and fix rules. When the KA process was used for this task it was found to be very useful, particularly when it was necessary, for example, to define an assignment rule, then multiple associated constraint rules, each with multiple fixes, after a new variable had been created. The automatic copying of one rule’s consequents to a related rule’s antecedents was very useful for various reasons. Firstly, when defining a constraint and related fix rules, it reduced the workload (and so the time taken) by as much as a quarter, as it was only necessary to define one set of antecedents and two sets of consequents. Secondly, when defining multiple fix rules, it made it easy to remember which constraint any given fix rule was being defined for. Automatically suggesting property values for an atom was also useful, reducing the time taken to define the atom and subsequently the rule. This was especially true when the atoms were relatively simple, such as those of the initial selection and output rules: if a domain concept had been selected, it was assigned as the (only) property value for the (only necessary) atom, and so the rule was defined without any interaction on my part. Applying the same value suggestion technique to the object properties associated with any individual created in the process of defining a rule was also very useful, particularly when used in conjunction with the filtering of individuals in the domain concept display. For example, many assignment rules require several variables to calculate a value (for example, the motor torque releveling assignment rule references six different
variables); as an assignment expression/calculation is essentially a mathematical function with an associated list of arguments, before adding a (variable) argument, the domain concept display’s filter feature was used to locate the variable in the list of the 200+ variables; the variable was then selected and an argument added to the expression/calculation, which caused MAKTab to automatically add the (newly) selected variable as the new argument. Adding the variable argument in this way meant the operation took only three mouse clicks/interactions with the tool’s main GUI (two more would be necessary if the variables were not being displayed in the domain concept display). Had this technique not been used, it would have been necessary to add a new argument to the expression/calculation (one click), select the type of argument (two clicks), select to set the value of the argument (two clicks), display the variables (one click), and select the appropriate one (at least two clicks, more if scrolling/filtering is necessary); making eight mouse clicks on the tool’s main GUI and two dialog boxes (a constant number of key presses would also be necessary to define a filter). Although this may not appear significant, considering that every constraint rule required at least one comparison expression, and that almost all of the assignment rules made use of at least one comparison expression and/or an assignment expression, this made the process quicker, less error-prone, and less tedious. To achieve the same operation using Protégé’s Individuals tab (the standard tab for creating individuals) requires (at least) 11 mouse clicks on the tab and four dialog boxes, searching through two lists of individuals (the lists of individuals and variables), and an understanding of how lists are represented in RDF.
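The RDF list representation mentioned here is the usual rdf:first/rdf:rest “cons cell” chain. The following plain-Python sketch (with invented blank-node names, purely for illustration) shows why hand-editing such a list in a generic individuals editor is tedious: even a three-item argument list is six triples over three intermediate nodes.

```python
# How an ordered list is represented in RDF: a chain of anonymous "cons" nodes,
# each with an rdf:first (the item) and an rdf:rest (the remainder, ending in rdf:nil).
# Plain-Python illustration; the blank-node names (_:n0, ...) are invented for readability.

def rdf_list_triples(items):
    """Return the triples encoding `items` as an RDF collection."""
    triples, node = [], "_:n0"
    for i, item in enumerate(items):
        rest = f"_:n{i + 1}" if i + 1 < len(items) else "rdf:nil"
        triples.append((node, "rdf:first", item))
        triples.append((node, "rdf:rest", rest))
        node = rest
    return triples

# A three-argument expression's argument list becomes six triples:
for t in rdf_list_triples(["var:A", "var:B", "var:C"]):
    print(t)
```

A tool that understands this encoding (as MAKTab’s rule definition display does) can present the list as a flat sequence, whereas Protégé’s Individuals tab exposes the cons nodes directly.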
When editing a rule, the rule definition display gives the option of creating a new individual or selecting an existing individual when setting the value of an object property. This was originally designed to allow the easy selection of domain concepts as property values, although it had the unintended benefit of allowing the selection of, for example, arguments from a previous calculation for use in other expressions. For example, the unit weight of the compensation cable (modeled as the compensation cable class with an associated unit-weight property) is used in six assignment rules and one constraint. To use this concept in a rule, it is necessary to create an individual of the relevant type (one that allows selection of a component’s property), then select the compensation cable as the component and the unit-weight as the property (three operations, requiring 12 mouse clicks). Once this individual has been created, however, the option to select an existing individual allows that concept (the compensation cable’s unit weight) to be selected with one operation requiring four mouse clicks; again, a small saving of time which, when repeated, can save significant time overall. This is especially true when one considers that the item being reused can, for example, be a complex calculation or comparison expression which can take significantly more operations to (originally) define than the previous example. The ability to define new domain concepts and variables, and to edit existing domain concepts, from the KA tool was also very useful. As domain knowledge had not been gained from the mapping stage, domain concepts were created when they were described in [123]. This often meant creating a new concept class, associated properties, and individuals (for example, when the sling class is first described in [123, section 5.3], it has three properties a, b, and c and five individuals).
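The saving from selecting an existing individual can be sketched as sharing a single object between rules rather than recreating an equivalent one each time. The class and property names below are illustrative only, not MAKTab’s internal representation.

```python
# Sketch of reusing one "component property" individual across several rules
# (names are hypothetical, for illustration only).

class ComponentProperty:
    """An individual selecting a component's property, e.g. the compensation
    cable's unit weight."""
    def __init__(self, component, prop):
        self.component, self.prop = component, prop

comp_cable_unit_weight = ComponentProperty("compensation-cable", "unit-weight")

# Six assignment rules and one constraint all reference the SAME individual,
# rather than each creating its own copy:
assignment_args = [[comp_cable_unit_weight, "car-weight"]] * 6
constraint_args = [comp_cable_unit_weight, "max-cable-weight"]

# Every reference resolves to one shared object, so a later edit (say, renaming
# the property) propagates to all seven rules at once.
assert all(args[0] is comp_cable_unit_weight for args in assignment_args)
```

Beyond the click-count saving described above, this sharing also keeps the knowledge base consistent: there is one individual to maintain instead of seven.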
Often the class descriptions and associated individuals had to be extended later with extra information (for example, in [123, section 5.5], the crosshead component is described, which requires a new property to be associated with the sling class (hasCrossheadModel)
to associate each sling with the appropriate crosshead). When defining the constraints, it was necessary to remodel the representation of the counterweight between guiderails distance concept.
Originally, it had been modeled as a
SystemVariable individual; however, C-14 placed a constraint on its value, and the associated fix stated that, if the constraint was violated, the counterweight between guiderails distance should be upgraded, providing details of the three valid values and the upgrade relationships between them. To encode the fix correctly, it was necessary to remodel the counterweight between guiderails distance concept as a SystemComponent subclass (counterweight between guiderails) with two associated properties (distance and upgrade) and three individuals. As this type of refactoring was not supported directly by the tool, one of the main challenges was to ensure that all references to the original SystemVariable were updated to refer to the new class. This was achieved by using the filter feature of the existing rules display to list all rules which had the name of the original SystemVariable (“counterweight between guiderails distance”) in their browser slot text description (which is displayed in the existing rules display); these rules were then updated. Although performing this task was made easier by the rule display’s filter feature, it could have been considerably harder for someone not familiar with the tool; as it is a relatively simple task to automate, it may be desirable to add such a feature to the tool in the future. During this experiment, a problem with the rule definition display was also identified: although the display works well for most rules, when an antecedent or consequent has many properties, it can become too large for the relevant part of the GUI. For example, when an expression with more than seven or eight arguments is displayed (especially when those arguments are themselves expressions), the display becomes too large, both vertically and horizontally, for the available screen area, and therefore a significant amount of (typically horizontal) scrolling is necessary.
As expressions are displayed in a prefix style, when defining the more complex expressions, or ones with numerous arguments (for example, the calculation for the machine-beam-bending-moment-left-maximum has the form A*B+C*(D-E), which becomes (+ (* A B) (* C (- D E))) in prefix notation, where A, B, C, D, and E are variables or component properties), it became easy to lose track of which argument is associated with which expression. Although the colour coding of the borders for property values did make the task significantly easier, and all expressions were defined correctly (often after double checking), it is still desirable to improve this part of the tool’s interface. As well as providing a test of the support provided by the KA tool and its GUI, the experiment also provided a good test of the tool’s performance when dealing with large KBSs. During the process of building the KBS, particularly while defining the constraints, the responsiveness of the tool reduced notably. The most likely cause of this was Java memory problems, as closing and reloading Protégé solved the problem. The main potential source of this problem was identified as the core Protégé libraries, which are known to be inefficient when dealing with thousands of individuals, although the control and ontology processing classes of the KA tool and the KA tool’s GUI could also be responsible. Generating the executable KBS also provided a thorough test of the rule generator associated with PS(pnr, -), with several minor bugs being discovered. These bugs were subsequently fixed,
and the rule generator produced a syntactically valid PS-RS(pnr, [elevator]). As these rules were then used as part of KBS(elevator, pnr) to produce a valid design solution, the rule generator can be assumed to be functioning correctly. Finally, this experiment also provided a test of the rules modeled in PS-ONT(pnr, -). Overall, all rules necessary to build KBS(pnr, elevator) were defined satisfactorily, with a valid elevator configuration being produced. The only type of expression which was not accurately modeled was expressions such as “select the motor with the smallest horsepower value greater than the required motor horsepower”. These statements were modeled as “select the motor with horsepower value greater than the required motor horsepower”, as it was not possible to correctly represent the “smallest value greater than” constraint. However, this does not affect the ability of KBS(pnr, elevator) to produce a valid elevator configuration, and this type of statement could be modeled relatively easily with the addition of an appropriate function in PS-ONT(pnr, -). When performing this experiment, I noticed the following interesting curiosities in [123]:
1. Most noticeably, [123] does not actually define any doors. [123, section 5.1] describes the properties of a door, in terms of its opening strike side (two options), speed (two options), opening type (two options) and model name, the door operator component, and how to calculate the weight of the door operator component (the motor that sits on top of the doors and is responsible for opening and closing them), but no actual doors are described. The ONT(elevator, [pnr]) extracted from the Stanford solution defines eight door individuals, appearing to assume that a door is available for each combination of the three properties (opening strike side, speed, and opening type); the same approach was used in this experiment.
2.
The total weight of the car (defined in [123, section 5.5]) is the sum of various weights, such as the cab weight, platform weight, fixture weight, and so on. The weight of the cab is estimated as (130*(the platform width + the platform depth))/12 (the platform is essentially the base of the cab). This fails to take into account any variation in cab height (which, according to constraint C-2, can be between 84 and 240 inches), which will presumably affect the weight of the three walls of the cab and so the overall cab weight; nor does it take the weight of the doors into account (although, as noted above, as no doors are defined, their weight is unknown).
3. Either the formula provided in [123, section 5.14] for calculating the motor horsepower required is expressed incorrectly, or the motor horsepower required value is incorrect in the test case:
• The formula provided is (C*S*(1-P))/33000*Es, where C is the car capacity (“The maximum total car occupant weight, in pounds, that the system must be able to support” [123, section 3]; this is stated as part of the design specification and does not change during design; it is assumed to be between 2000 and 4000 pounds, and is set at 3000 in the test case), S is the (desired) car speed (again, stated as part of the design specification, and unchanged during the design process; in the test case, it is set at 250), P is the counterweight percent, and Es is the motor overall system efficiency.
• The counterweight percent (P) is defined in [123, section 5.8] in the sentence “the counterweight should balance the load on the car side of the hoistway when the car is empty, plus about forty percent (counterweight percent) of the car’s maximum carrying capacity”. This appears to state that the counterweight percent is 40% (the test case in [123, section 5.9] confirms this by stating the initial and final values should be 0.4).
• The motor overall system efficiency (Es) is defined as the motor hoistway efficiency (estimated as 0.95) * the machine efficiency (which is defined by a lookup table based on the car speed and machine model); in the test case its initial value is 0.7695, decreasing to 0.722 for the final value.
• Using the final values from the test case presented in [123, section 9.1], I calculate the required motor horsepower as 9.8455 (to 4 decimal places), calculated as (3000 * 250 * (1 - 0.4))/33000 * 0.722; however, the test case states the final value as 18.8869 (to 4 decimal places).
• The difference is due to the order in which the /33000 * 0.722 is applied: as division and multiplication have equal precedence, they should be evaluated left to right, giving the calculation (with all brackets added) ((3000 * 250 * (1 - 0.4))/33000) * 0.722 = ((3000 * 250 * 0.6)/33000) * 0.722 = 13.636363 * 0.722, which equals 9.8455 (to 4 decimal places). However, it appears the formula has been interpreted as (3000 * 250 * (1 - 0.4))/(33000 * 0.722) in the test case (and in the original Protégé solution discussed in section 4.5.1); this calculation works out as (3000 * 250 * 0.6)/23826 = 450000/23826, which equals 18.8869 (to 4 decimal places).
• This is important, as the motors described in [123] are capable of providing horsepower values of 10, 15, 20, 25, and 30, and so, if the formula has been defined incorrectly, the final solution provided by a KBS may be inappropriate, as a less powerful (and presumably cheaper) motor could have been selected. Note: to allow comparison of KBS(pnr, elevator)’s solutions with the test case, I used the altered version of the formula.
4. There is no initial design specification provided; the test case provides initial values for all the parameters and component properties (for example, the motor horsepower required is initially set to 17.721071). However, in my new system many of these values are calculated when the appropriate values have been set (for example, the car capacity, car speed, counterweight percent, and motor efficiency), and so do not need to be set initially.

Elevator Diagnosis

As expected, KBS(diag, elevator) took considerably less time to build: just 1 hour 33 minutes. Of this, 8 minutes were spent using the mapping tool and the remaining 1 hour 25 minutes were spent using the KA tool. During the mapping stage, two mappings were defined to copy the elvis-components and elvis-models classes from ONT(elevator, [pnr]) to subclasses of PS-ONT(diag, -)’s SystemComponent class; both of these mappings were defined to be applied to one level of subclass, and associated individuals were also copied. Two property equivalence mappings were also defined between the parameter-name and parameter-documentation properties in the domain of the elvis-parameter class
in ONT(elevator, [pnr]) and the name and description properties in the domain of PS-ONT(diag, -)’s SystemVariable class respectively. Execution of these mappings created 33 new SystemComponent subclasses:
• buffer-system,
• cable-system,
• car-system,
• carbuffers,
• carguiderails,
• carguideshoes,
• compensationcables,
• counterweight-system,
• counterweightbetweenguiderails,
• counterweightbuffers,
• counterweightguiderails,
• deflectorsheave-system,
• deflectorsheaves,
• door-system,
• doors,
• elvis-components,
• elvis-models,
• hoistcables,
• hoistway-system,
• machine-system,
• machinebeam-system,
• machinebeams,
• machinegrooves,
• machines,
• motor-system,
• motors,
• opening-system,
• platform-system,
• platforms,
• safety-system,
• safetybeams,
• sling-system, and
• slings.
Additionally, 206 SystemVariable individuals were created. During the KA stage, a total of 173 rules were defined, consisting of 90 cause rules and 83 repair rules. When defining these rules, MAKTab automatically created a further 840 individuals: the atom lists (491 individuals) and the atoms and their property values (a combined total of 349 individuals), in addition to the 173 individuals representing the rules themselves. A further breakdown is provided in table 7.2 which, for each rule type, lists the total number of rules of that type, the number of
rules with antecedents, the maximum number of antecedents, the average number of antecedents per rule, the number of rules with consequents, the maximum number of consequents, and the average number of consequents per rule. During the KA stage it was also necessary to create 20 new classes to represent additional components mentioned in the interview transcript and additional elevator diagnostic knowledge; these were:
• Bearing,
• Brakes,
• Button,
• Car,
• Counterweight,
• Door Operator,
• Elevator,
• Floor Selector,
• Hydraulic Valve,
• Infra Red Multibeam,
• Microprocessor,
• Moving Cables,
• Oil Dash Box,
• Pit,
• Power Cable,
• Power Supply,
• Relays,
• Rollers,
• Safety Device, and
• Sensor.
It was not necessary to create any new individuals or properties.

Rule Type    Total #   Num. Rules   Max. #     Average #   Num. Rules   Max. #   Average #
             Rules     w/Anteced.   Anteced.   Anteced.    w/Con.       Con.     Con.
CauseRule    90        90           2          1.056       90           1        1.000
RepairRule   83        83           1          1.000       83           2        1.012

Table 7.2: Summary of the types of rules defined in KBS(diag, elevator).

The generated JessTab KBS(diag, elevator) consisted of 777 lines of code, of which 316 lines were from PS-RS(diag, [elevator]) and 461 lines were from PS-RS(diag, -). To test the correctness of KBS(diag, elevator), it was run several times with a range of (initial) malfunction descriptions, some known to the system and some not. The diagnoses and repairs provided by the system for these malfunctions were compared with those used to build the system; KBS(diag, elevator) provided the expected diagnoses and repair advice for the reported malfunctions. Overall, this experiment provided another good test for MAKTab, requiring use of both the mapping and KA sub-tools. During the mapping stage, the main challenge was deciding which
classes, if any, could be used by the rules that would be defined during the KA stage. It was clear that some of the model classes in ONT(elevator, [pnr]), which describe the different subcomponents of an elevator (such as the doors), would be required during the KA stage, as would some of the component classes which describe the elevator sub-systems (such as the car), but it was unclear exactly which ones would be required. To simplify the task, it was decided to define one mapping for the component superclass and one for the model superclass, with both mappings being applied to one level of subclass. Although this meant mapping at least two classes (the two superclasses, elvis-models and elvis-components) that were very unlikely to be required by KBS(diag, elevator), it was decided that it was easier to do this (and simply not use irrelevant classes during the KA stage) than to determine exactly which classes were likely to be required by KBS(diag, elevator). The parameters from ONT(elevator, [pnr]) were mapped to variables in PS-ONT(diag, -), as some of them refer to concepts, and it was felt they might be useful during the KA stage (although they might need to be renamed before being used). As with the first experiment, the KA tool was used extensively in this experiment to define the diagnostic and repair rules. All of the defined rules used references to components and associated (textual) descriptions of their state, and textual descriptions of the repairs. The guided KA process worked well, providing a natural ordering to the rule creation, with the automatic addition of a rule’s antecedents being very useful and time-saving when defining multiple causes and repairs for a particular malfunction; similarly, the range of options presented for actions to perform after a repair had been defined was also helpful.
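The chaining just described, where one rule’s consequents are offered as the next rule’s antecedents, can be sketched as follows. The rule structure and function names are illustrative assumptions, not MAKTab’s internal representation.

```python
# Sketch of the guided-KA chaining for diagnosis rules: after a cause rule is
# defined, its consequents are copied to pre-fill the repair rule's antecedents.
# (Illustrative structures only; MAKTab stores rules as ontology individuals.)

def make_cause_rule(malfunction, component, state):
    return {"type": "CauseRule",
            "antecedents": [("malfunction-reported", malfunction)],
            "consequents": [("cause", component, state)]}

def start_repair_rule(cause_rule, repair_text):
    # The tool pre-fills the antecedents from the cause rule's consequents,
    # so the knowledge engineer only has to supply the repair advice.
    return {"type": "RepairRule",
            "antecedents": list(cause_rule["consequents"]),
            "consequents": [("repair", repair_text)]}

cause = make_cause_rule("doors will not close", "door-operator", "broken")
repair = start_repair_rule(cause, "replace the door operator motor")
```

The copying both halves the definition effort for each cause/repair pair and guarantees that the repair rule fires on exactly the diagnosis the cause rule establishes.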
When defining the diagnostic rules, it became apparent that it would be useful to suggest values for a datatype property based on that property’s values for other individuals. For example, the textual descriptions of component states are provided by the SystemVariableOrComponentPropertyDatatypeValueAtom’s hasLiteralValue property. This type of atom was used by all the cause rules to describe component states. The description “broken” was often applied to describe a component state, and had to be typed every time (the descriptions were taken from the interview transcript); the similar term “broke” was also used. To reduce time and potential errors, when a description is being entered during the KA stage, the tool could use an autocomplete-style function to list the other descriptions that contain the (partial) description that has been entered. This would allow recurring descriptions to be entered quickly (for example, many components may be described as “broke”), and would also mean that, when the KBS is executed, the user would know to enter the text “broke” to describe all broken, faulty, etc. component states. This technique could feasibly be used on any property that has a string or numeric value.
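The proposed autocomplete could work along the following lines; this is a sketch of the suggested feature, not something MAKTab currently implements.

```python
# Sketch of the suggested autocomplete for datatype property values: offer every
# previously entered description containing the partial text typed so far.
# (A proposed feature, not an existing MAKTab capability.)

def suggest(partial, previous_values):
    partial = partial.lower()
    # Preserve first-seen order while removing duplicate descriptions.
    seen, matches = set(), []
    for value in previous_values:
        if partial in value.lower() and value not in seen:
            seen.add(value)
            matches.append(value)
    return matches

entered = ["broken", "broke", "loose cable", "broken switch", "broke"]
print(suggest("brok", entered))   # -> ['broken', 'broke', 'broken switch']
```

Presenting matches in first-seen order keeps the most established vocabulary at the top, nudging the knowledge engineer towards consistent state descriptions such as “broke”.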
User Evaluation (Experiment B)

Two subjects developed the computer configuration KBS; both were initially unfamiliar with MAKTab, KBSs, and ontologies; however, they both managed to produce an executable system (which produced a valid design suggestion) in under three hours. In both cases, the subjects read a brief introduction to the experiment (appendix E) and went straight into building the KBSs (little attention was paid to the provided resources, other than the task specification document (appendix K), which describes the computer hardware configuration task). The subjects rapidly identified
which components they required to map, and successfully defined and executed appropriate mappings. Using my solution as the gold standard [11, chap. 30], the subjects produced KBSs similar to it: their systems contained numbers of assignment, constraint, and fix rules comparable to the gold standard, and identical numbers of input and output rules. Both subjects generated executable KBSs from MAKTab (each containing around 5000 lines of code) and ran these with JessTab; both systems produced correct design proposals. Four subjects also developed the computer diagnosis KBS; again, these subjects were not familiar with MAKTab, although they had a range of experience with KBSs and ontologies. Again, in these experiments, the subjects read a brief introduction to the experiment (appendix F) and went straight into building the KBSs, with little attention being paid to the provided resources other than the introduction to computer hardware diagnosis document (appendix L). The subjects managed to correctly identify which mappings should be defined, to define them, and to execute them. The subjects then successfully used the KA tool to add the diagnostic information. Again, the systems developed were similar in rule count to the gold standard. The time taken by the subjects for this experiment ranged between one and two hours. Again, all subjects generated executable diagnostic KBSs, which they successfully ran and tested themselves using JessTab. The results of the questionnaires completed after the experiments found that, once the subjects had become used to MAKTab, they found it easy to use: the guided KA feature was particularly well received, and all subjects were able to use it to input a series of rules. All of the subjects felt they could use MAKTab to build further KBSs.
Computer Hardware Configuration

Two subjects took part in the computer configuration experiment; table 7.3 provides a summary of each subject’s familiarity with the relevant concepts for KBS(pnr, computer). Table 7.4 presents details of the gold standard KBS and the KBSs that were produced by the two subjects. This includes the number of mappings defined and executed during the mapping stage, the time taken by the subject to perform the mapping stage, the number of rules defined using the KBS (broken into the five main types of propose-and-revise rules), and the time taken to complete the KA stage.

Question                             Subject 1           Subject 2
Experienced with ontologies          no                  no
Protégé familiarity                  none                none
PS familiarity                       none                none
Jess familiarity                     relatively little   relatively little
KBSs familiarity                     none                used but not created
Computer configuration familiarity   none                none
pnr algorithm familiarity            none                none
Table 7.3: Summary of each subject’s familiarity with the relevant concepts for the KBS(pnr, computer) experiment.

The subjects both defined seven mappings which copied component classes from the domain ontology to the PS ontology. This showed that the subjects were able to identify the classes in the domain ontology that were relevant to the computer hardware configuration task (as defined by the task specification document), and to define appropriate mappings.

Subject         Mappings   Time -    Rules Defined                                  Time - KA         New
                           Mapping   Input   Output   Assign.   Constr.   Fix                        Concepts
Gold Standard   7          -         12      12       2         12        18        -                 5 Sys. Var. Ind.
Subject 1       7          12 mins   3       4        2         12        18        2 hours 13 mins   5 Sys. Var. Ind.
Subject 2       7          15 mins   5       2        2         12        18        2 hours 21 mins   5 Sys. Var. Ind.

Table 7.4: Summary of the KBSs developed by the subjects in the computer hardware configuration experiment.

During the KA stage, both subjects used the KA tool to define the five variables described in the task specification document. They also used the KA tool to define the two assignment rules, 12 constraints, and 18 associated fixes described in the task specification document. The number of input and output rules varies between KBSs as it is possible to add more than one component or variable to a rule; in the gold standard, one rule was defined for every initial value and output selection, whereas the subjects chose to add more than one initial value specification or output selection to their appropriate rules. Many lessons were learnt from observing the subjects using MAKTab, particularly during the KA stage:
• Firstly, it was noted that the subjects did not use the entire KA process for a concept. Instead, both subjects followed the structure of the task specification document, defining the constraint and related fix(es) for a component (using the guided KA process), then the next constraint and related fix(es), and so on, before moving onto the next component and repeating the process. The initial selection and output selection rules were defined (also using the guided KA process) after all the constraints and fixes had been defined.
• The only difference between the two subjects was that they defined the new variables at different points in the KA process:
– One subject created them and the associated assignment rules before performing KA for the components.
– The other created a new variable when it was first required by a (component related) constraint, and defined the assignment rule for it later (either after they had defined the constraint and related fixes they were working on, or before they defined the initial selection/value rules).
• Initially the subjects had difficulty deciding how to express a constraint, and so tended to take longer to do so; however, once they became familiar with the process, they found it easier and performed it more quickly. Both subjects mentioned this during the experiment and in the questionnaire they completed after the experiment.
Figure 7.1: Time taken by subjects to define the constraint and fix rules for KBS(pnr, computer) (please see section K.6 for definitions of the constraint names).
• Initially, the subjects struggled at both the conceptual level (determining how to convert the textual description provided in the task specification document into a constraint) and the implementation level (using the rule definition interface to define the expression part of the constraint).
– The conceptual problems were caused by the difference between how a constraint was described (for example, “total memory must be more than or equal to the desired memory”) and how the corresponding constraint rule must be defined (if total-memory < desired-memory then constraint-violation).
– The implementation problems were caused by the prefix style used for defining expressions (which was felt to be hard to use), unclear names/descriptions for certain functions (such as “is not member of list”), and the subjects’ unfamiliarity with the domain ontology.
• Once the subjects had spent a little time examining the domain ontology (using Protégé’s Classes tab, or the relevant ontology display from the mapping tool), and had referred to the tutorial document describing how to configure the PS and the MAKTab user manual, their understanding of the task improved, and they were able to define the rules relatively easily and quickly.
The times taken by the subjects to define each constraint and associated fix(es) are shown in figure 7.1: it can be seen that the time taken to define each rule generally decreased as the subjects became more familiar with the tool. This was reflected in the comments provided by the subjects in the questionnaires after completing the experiment: both stated that initially the KA tool was difficult to use, but once they understood how to use it, they found it relatively easy to use, with the main difficulty being caused by the prefix display of expressions.
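The two difficulties can be illustrated with a small sketch: a prefix-expression evaluator of the kind of form the rule definition display presents, applied both to the negated form in which a textual constraint must be encoded and to an arithmetic calculation. The names and representation here are illustrative only, not MAKTab’s or JessTab’s actual encoding.

```python
# Illustration of the two hurdles the subjects faced (names are illustrative):
# (1) a constraint stated as "total memory must be >= desired memory" is encoded
#     as a rule that fires on its NEGATION;
# (2) expressions are entered and displayed in prefix form.
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "<": operator.lt}

def evaluate(expr, bindings):
    """Evaluate a nested prefix expression, e.g. ('<', 'total-memory', 'desired-memory')."""
    if isinstance(expr, tuple):
        op, *args = expr
        return OPS[op](*(evaluate(a, bindings) for a in args))
    return bindings.get(expr, expr)  # variable lookup, or a literal

# "total memory must be more than or equal to the desired memory" becomes a
# test for the violating case:
violation_test = ("<", "total-memory", "desired-memory")
state = {"total-memory": 2048, "desired-memory": 4096}
assert evaluate(violation_test, state)   # constraint violated -> fix rules would fire

# The infix A*B + C*(D-E) is entered as the prefix form (+ (* A B) (* C (- D E))):
calc = ("+", ("*", "A", "B"), ("*", "C", ("-", "D", "E")))
assert evaluate(calc, {"A": 2, "B": 3, "C": 4, "D": 10, "E": 7}) == 18
```

The nesting in `calc` shows why, past a handful of arguments, it becomes easy to lose track of which argument belongs to which sub-expression in a purely prefix display.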
One subject also noted that an (investigator-planned) mismatch between terms used by the domain ontology (for example, “ComputerMemory”) and the task specification document (for example, “Memory”) made the task more difficult. Table 7.5 provides a summary of the responses given by the subjects for this experiment. Firstly, the questionnaires show the subjects found the mapping tool easy to use, and also ranked the KA tool favourably (although the ratings may be more influenced by the rules defined after their initial difficulties had been overcome). In particular, positive ratings were given for the ordering of the rule definition and the copying of a constraint rule’s consequents to a related fix rule’s antecedents.

Question    1   2   3   4   5   6   7   8   9   10   11   12   13   14   15   16   17   18   19   20
Subject 1   2   2   3   1   1   2   1   1   1   1    3    2    3    1    2    1    2    1    2    2
Subject 2   3   2   5   1   1   3   1   2   1   1    2    2    1    1    2    2    3    1    7    1

Table 7.5: Summary of the responses to the questionnaire given by the subjects who took part in the computer configuration experiment (please see section M.2 for the questionnaire).

Although it appears to have taken a relatively long time for the subjects to build their KBSs, it should be noted that neither subject was familiar with KBSs, ontologies, the propose-and-revise algorithm, MAKTab, or the task of designing a new computer. Despite this, both subjects were able to produce executable KBSs which successfully solve the task of designing a computer. These results were therefore very satisfactory.

Computer Hardware Diagnosis

Four subjects took part in the computer hardware diagnosis experiment. Table 7.6 provides a summary of each subject’s familiarity with the relevant concepts for KBS(diag, computer) (note that subject 2 also performed the computer configuration experiment, two weeks earlier). A summary of the KBSs created by these subjects (and the gold standard created by myself) is provided in table 7.7, which lists, for each subject, the number of mappings they defined during the mapping stage, the time it took to define and execute these mappings, the total numbers of cause and repair rules they defined for their KBS, and the time it took to define these rules. Briefly, all subjects managed to correctly identify which mappings should be defined, to define them, and to execute them. They then successfully used the KA tool to add the diagnostic information and generate an executable KBS. Again, the systems developed were similar in rule number to the gold standard.
As expected, the number of rules in the KBSs produced during this experiment varies slightly, mainly due to the different modelling choices made by the subjects; in general, however, all of the necessary problems and repairs were defined in the KBSs. Reasons for the differences in rule numbers are discussed below, along with lessons learnt from observing the subjects in this experiment. During the mapping stage, subject 2 chose to create one mapping to copy the superclass of the components in the computer ontology to a subclass of the PS ontology's SystemComponent class; this mapping was specified to apply to four levels of subclasses, and so copied all the components from the domain ontology to the PS ontology. This subject chose this option as (s)he felt it was easier and quicker than determining exactly which components should be mapped and
7.2. Evaluation of MAKTab

[Table 7.6: Summary of each subject's familiarity with the relevant concepts for the KBS(diag, computer) experiment. The table rated subjects 2-5 on their experience with ontologies (no, yes, no, and yes respectively) and their familiarity with Protégé, PSs, Jess, KBSs, computer diagnosis, and diagnostic algorithms; familiarity with Protégé, PSs, Jess, and KBSs was generally "none" or "basic", computer diagnosis familiarity ranged from basic to detailed, and all four subjects understood the diagnostic algorithms.]
Subject        Number of  Time -   Total Cause  Total Repair  Number New        Time - KA
               Mappings   Mapping  Rules        Rules         Concepts Defined
Gold Standard  10         -        33           30            4 classes         -
Subject 2      1          4 mins   34           29            3 classes         1 hour 28 mins
Subject 3      1          3 mins   30           30            5 classes         43 minutes
Subject 4      10         9 mins   31           30            3 classes         1 hour 4 mins
Subject 5      1          4 mins   32           30            5 classes         50 mins

Table 7.7: Summary of the KBSs built by the subjects in the computer hardware diagnosis experiment.
defining mappings for each of them. Overall, the mapping stage was completed quickly and without any problems by each subject. The KBS produced by subject 2 during the KA stage contains one more cause rule than the gold standard because it contains a rule which was created but not completed, and so should have been deleted; as this rule states that a "nil" component with a "nil" state is caused by the Speaker component, the (executable version of the) rule has no effect on the execution of the KBS. Subject 2 also defined one fewer repair rule, as (s)he described two separate malfunctions (the computer not having the correct modem drivers installed, and not having the correct sound card drivers installed) identically (as the computer not having the correct driver installed), and realised that the fix rule defined for one of these malfunctions would be triggered for the second, so there was no need to define a second, identical repair rule. Subject 2 only felt it necessary to define three new component classes in the PS ontology (for the computer, monitor, and speaker components), choosing to describe malfunctions such as "loose optical drive power supply cable" as relating to the optical drive component, not the optical drive power supply cable, which would have required the creation of a new class.
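The cause/repair rule chaining discussed here can be sketched as a toy forward lookup. The malfunctions, components, and repair advice below are illustrative assumptions, not the subjects' actual rules:

```python
# Minimal sketch of cause/repair rule chaining in a diagnostic KBS: a
# malfunction is traced to a (component, state) cause by a cause rule,
# and a repair rule keyed on that cause then fires. Note that two
# malfunctions sharing a cause (as subject 2 observed) need only one
# repair rule, since both map to the same key.

cause_rules = {
    # malfunction -> (component, faulty state)
    "no-sound": ("sound-card", "wrong-driver"),
    "modem-will-not-dial": ("modem", "wrong-driver"),
}
repair_rules = {
    # (component, faulty state) -> repair advice
    ("sound-card", "wrong-driver"): "install the correct sound card driver",
    ("modem", "wrong-driver"): "install the correct modem driver",
}

def diagnose(malfunction: str) -> str:
    cause = cause_rules.get(malfunction)
    if cause is None:
        return "unknown symptom: no diagnosis available"
    return repair_rules.get(cause, "no repair known for diagnosed cause")

print(diagnose("no-sound"))
print(diagnose("random-freeze"))  # a symptom the toy KBS does not know about
```

The incomplete "nil"/"nil" rule subject 2 left behind corresponds to a cause entry whose key never matches any reported malfunction, which is why it has no effect at run time.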
During the mapping stage, subject 3 also chose to create one mapping for the domain ontology's component superclass and apply it to all the subclasses, which was performed easily and quickly. During the KA stage, subject 3 defined three fewer cause rules than the gold standard, as (s)he chose to model three of the problems differently. In the gold standard, the malfunction of a computer randomly freezing was attributed to the computer overheating, which was in turn attributed to a faulty cooling system; subject 3 chose to attribute the random freezing directly to a faulty cooling system, removing the need for the extra cause rule. A similar approach was taken by subject 3 for problems relating to the modem and hard drive, which account for the three fewer cause rules in their KBS. During the KA stage, subject 3 defined five new component classes to represent the computer, hard disk drive cable, heatsink, monitor, and video cable. Subject 4 chose to copy 10 classes from the domain ontology which they identified as being referred to in the problem specification document. After identifying which classes should be mapped, the mappings were defined without difficulty. During the KA stage, subject 4 used the KA process extensively to define a malfunction, its cause, subsequent causes and repairs, then repeated the process for other causes of the malfunction. Subject 4 defined two fewer cause rules than the gold standard: similar to subject 3, they did not describe the frequently corrupted documents as a hard drive problem, and did not include a rule for the modem failing to dial when the correct driver was installed. Similar to subject 2, subject 4 chose to define three new concepts: computer, monitor, and speaker, attributing other causes to the related component (for example, associating sound recording problems with the computer class instead of a (new) microphone class).
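The "one mapping applied over N levels of subclasses" option that subjects 2 and 3 used can be sketched as follows; the toy component hierarchy is an illustrative assumption:

```python
# Sketch of expanding a single superclass mapping over a subclass tree
# down to a chosen depth, as the subjects did for the component
# superclass. The hierarchy below is a toy illustration.

hierarchy = {
    "Component": ["Computer", "Peripheral"],
    "Computer": ["HardDrive", "Modem"],
    "Peripheral": ["Monitor", "Speaker"],
}

def expand_mapping(root: str, depth: int) -> list:
    """Return all classes the mapping copies, to `depth` levels below root."""
    copied, frontier = [], [root]
    for _ in range(depth):
        next_frontier = []
        for cls in frontier:
            children = hierarchy.get(cls, [])
            copied.extend(children)
            next_frontier.extend(children)
        frontier = next_frontier
    return copied

print(expand_mapping("Component", 1))  # one level: direct subclasses only
print(expand_mapping("Component", 2))  # two levels: the whole toy tree
```

A single mapping expanded this way trades precision for speed, which matches the subjects' stated reason for choosing it: it avoids deciding, class by class, which components need mapping.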
Subject 5 chose to define one mapping for the domain ontology's component superclass and applied it to one level of subclasses; this was performed without difficulty. The rules created by subject 5 during the KA stage covered all the problems and repairs detailed in the problem specification document. These were defined sequentially, with the guided KA process being used to define a malfunction, its causes, their causes and repairs, then the next cause of the malfunction, and so on. This was completed relatively easily, with the only major difficulties arising when it was unclear exactly which component a problem should be associated with (for example, whether problems related to the video cable belonged with the video card or with a new video cable class). In total, subject 5 defined five new component classes for the computer, monitor, speaker, heatsink, and microphone. While observing the subjects using MAKTab, I identified some unexpected ways of using the system. For example, the computer diagnosis document (appendix L) often describes more than one potential cause for a given problem: problem VP-2, for instance, describes multiple causes and repairs for the image displayed on the monitor missing a primary colour. When defining the associated rules, one subject defined all the cause rules first, then went back, selected each rule for editing from the list of rules, and defined the associated fix rules. Table 7.8 provides a summary of the responses given by subject 2 (S2), 3 (S3), 4 (S4), and 5 (S5) to the questionnaires they completed after building their KBS. In general, the ratings and comments were positive, with the mapping tool generally rated as easy/very easy to use, and the KA tool rated similarly. The comments received from these subjects expressed that they liked the way in which the
KA tool guided them through the process of defining a set of rules for a component; all subjects were confident they could use MAKTab to build another KBS. Areas identified by the subjects as causing the most difficulties were the execution of the KBS (one subject) and identifying how a previously defined rule was related to other rules (one subject). Regarding the former point, the current JessTab API does not allow code to be passed to Jess programmatically, and so MAKTab is unable to run the KBS automatically for the user, although better instructions on how to achieve this should be provided. Regarding the latter point, the subject suggested adding the option of having the existing rule display show all the rules (not just either the cause or repair rules), with appropriate highlighting to identify the type of each rule; alternatively, it should be possible to provide a graph visualisation of the defined rules, either based on the rule KA graphs maintained by the KA tool as the rules are defined, or by linking two rules where the first rule's consequents are the same as the second rule's antecedents. The automatic suggestion of textual values (identified in Experiment A) was also identified by the subjects as a desirable feature.

Question   Responses (S2, S3, S4, S5)   Question   Responses (S2, S3, S4, S5)
1          2, 2, 3, 1                   11         3, 2, 2, 2
2          2, 3, 3, 2                   12         1, 3, 2, 1
3          1, 3, 2, 1                   13         3, 2, 2, 1
4          3, 2, 2, 2                   14         1, 2, 1, 1
5          1, 2, 1, 1                   15         3, 5, 2, 2
6          2, 3, 2, 1                   16         2, 3, 2, 1
7          2, 1, 1, 1                   17         2, 2, 2, 1
8          2, 1, 2, 1                   18         2, 2, 2, 1
9          3, 2, 1, 2                   19         1, 2, 1, 1
10         2, 1, 2, 1                   20         1, 4, 1, 1

Table 7.8: Summary of the responses to the questionnaire given by the subjects who took part in the computer hardware diagnosis experiment (please see section M.2 for the questionnaire).
Interface Evaluation (Experiment C)

The interface evaluation provided some useful feedback regarding various aspects of the interface. Tables 7.9 and 7.10 summarise the ratings against the points in [89] (the categories are provided in section 7.2.3; unfortunately, due to copyright and space restrictions, the questionnaire cannot be included here, and the reader is referred to [89] for the questions). The ratings columns provide the ratings of the three subjects in this experiment (S1, S2, and S3; note these do not correspond with subjects 1, 2, and 3 in Experiment B); questions that are not included were rated "N/A" by all subjects. In general, the tool received positive feedback on relevant issues, such as ensuring that the system status is displayed to the user (for example, how the rule currently being defined relates to others), that the interface is consistent (for example, in how text is displayed), and that it has an aesthetically pleasing and minimalist design. Some potential points for improvement were also identified; those already identified and discussed in Experiments A and B are not repeated below:
• MAKTab does not provide an undo function or related features, such as the option to view a summary of the operations that a user has performed.
• MAKTab does not provide a copy and paste function (the closest is the KA tool's "acquire a similar rule" feature).
• The colours used by the KA tool for highlighting the properties in the rule display were not felt to be far enough apart in the colour spectrum.
• The mapping tool does not provide warnings about potential errors when the user defines a mapping, there is limited or no indication of the severity of errors that occur during the mapping process, and no advice is given on the actions needed to correct errors.
• The help provided by the tools regarding their use could be improved. Specifically, context-sensitive help is needed, such as explaining what the options on the GUIs do and instructing the user on, for example, what to do next (for example, after completing a rule definition, instructing them to click the "Acquire Next Rule" button), or how to achieve a particular action (for example, how to define rules related to a rule which was defined previously and so is not currently shown in the rule definition area). Further, it may be desirable to offer multiple levels of help, reflecting the user's experience and knowledge of the tool.

Question  S1 S2 S3     Question  S1 S2 S3     Question  S1 S2 S3     Question  S1 S2 S3
1.1   Y Y Y        1.4   Y Y Y        1.5   Y Y Y        1.7   Y Y Y
1.8   N N N        1.9   Y Y Y        1.12  Y Y Y        1.13  Y Y Y
1.17  Y Y Y        1.18  Y Y Y        1.19  Y Y Y        1.20  Y Y Y
1.21  Y Y Y        1.22  Y Y Y        1.23  Y N Y        1.24  Y Y Y
1.26  Y N N        1.27  Y Y Y        1.28  N N N        1.29  N N/A N/A
2.3   Y Y Y        2.4   N N N        2.7   Y Y Y        2.9   Y Y Y
2.10  Y Y Y        2.11  Y Y Y        2.15  Y Y Y        2.22  Y Y Y
3.4   Y Y Y        3.6   Y N N        3.7   N N N        3.9   Y Y Y
3.10  N N Y        3.11  Y Y Y        3.13  Y Y Y        3.16  N N Y
3.17  Y Y Y        3.18  N N N        3.19  Y Y Y        3.21  N N N
3.22  N/A N N/A    4.2   Y Y Y        4.3   Y Y Y        4.9   Y Y Y
4.12  Y Y Y        4.16  Y N Y        4.17  Y N/A N/A    4.18  Y Y Y
4.19  Y Y Y        4.20  Y N Y        4.21  N Y N/A      4.22  Y Y Y
4.23  Y Y Y        4.24  Y N/A N/A    4.25  Y N/A N/A    4.26  Y Y Y
4.27  Y Y Y        4.29  Y Y Y        4.30  N N N        4.32  Y Y Y
4.33  Y Y Y        4.34  Y Y Y        4.35  Y Y Y        4.36  Y Y Y
4.37  Y Y N/A      4.38  Y Y Y        4.41  Y Y Y        4.46  Y Y Y
4.47  Y Y Y        4.48  N N/A N/A    4.49  N N/A N/A    4.51  Y N N
5.2   Y Y Y        5.3   Y Y Y        5.4   Y Y Y        5.5   Y Y N
5.6   Y Y Y        5.7   Y Y Y        5.8   Y Y Y        5.10  Y Y Y
5.11  Y Y Y        5.12  Y Y Y        5.16  Y N N        5.17  Y Y Y
5.18  Y Y Y        5.19  Y Y Y        5.20  N N N        5.21  N N N
6.5   Y N Y        6.6   Y Y N/A      6.9   Y Y Y        6.10  Y N/A N/A
6.11  N Y N        6.12  N N N        6.14  Y Y Y        6.15  Y N N

Table 7.9: Summary of the interface evaluation checklist results (Y = Yes, N = No, N/A = not applicable).
Question  S1 S2 S3       Question  S1 S2 S3       Question  S1 S2 S3       Question  S1 S2 S3
7.1   Y Y Y          7.2   Y Y Y          7.3   Y Y Y          7.4   Y Y Y
7.5   Y Y Y          7.6   Y Y Y          7.7   Y Y Y          7.9   Y Y N/A
7.10  Y Y Y          7.11  Y Y N          7.12  Y Y Y          7.13  N N N
7.14  Y Y Y          7.15  Y Y Y          7.16  Y Y Y          7.17  Y N N/A
7.18  Y Y Y          7.21  Y Y Y          7.22  Y Y Y          7.23  Y Y Y
7.25  N N N          7.26  Y N Y          7.27  Y Y Y          7.28  Y Y Y
7.29  Y Y Y          7.30  N/A N/A Y      7.33  Y Y Y          7.34  Y Y Y
7.35  N N/A N/A      7.36  Y Y Y          7.37  Y Y Y          7.38  Y N/A N/A
7.39  Y N/A N/A      7.40  Y Y Y          8.6   Y N Y          8.7   N N N
8.9   Y Y Y          8.11  Y N Y          8.13  Y N Y          8.14  Y N/A N/A
8.15  Y Y Y          8.16  N N N          9.1   Y Y Y          9.2   Y N/A N/A
9.3   Y N/A N/A      9.4   Y N/A N/A      9.5   Y N/A N/A      9.6   Y Y Y
9.7   Y Y Y          9.8   Y Y Y          9.9   Y Y Y          9.10  Y N/A N/A
9.12  Y Y Y          10.1  Y Y Y          10.2  Y N N          10.3  Y Y Y
10.4  N N/A N/A      10.5  Y Y Y          10.7  N N N          10.8  N N N
10.9  N N N          10.10 Y Y Y          10.11 Y Y Y          10.12 Y Y Y
10.13 Y Y Y          10.14 Y Y Y          10.15 Y Y Y          10.16 Y Y Y
10.17 Y Y Y          10.18 Y Y Y          10.19 N N N          10.20 N N N
10.21 N N/A N/A      10.22 N/A N/A N/A    10.23 Y Y Y          11.2  Y Y Y
11.4  N N N          11.6  N N N          11.7  Y Y Y          11.8  Y Y Y
11.9  Y Y Y          11.12 Y Y Y          11.14 Y N Y          11.16 Y Y Y
11.17 N N/A N/A      11.22 Y Y Y          12.1  Y Y Y          12.3  Y Y Y
12.4  Y Y Y          12.5  Y N N          12.6  Y Y Y          12.8  Y Y Y
12.9  Y Y Y          12.13 Y Y Y          12.14 N N N          12.15 Y Y Y
12.16 N N/A N/A      12.17 N N N

Table 7.10: Continued summary of the interface evaluation checklist results (Y = Yes, N = No, N/A = not applicable).
7.2.5 Summary
This section has described the experimental evaluations that were performed with MAKTab. Ideally, a comparative study would have been performed, in which the performance of MAKTab was compared with that of related tools such as PSM Librarian, CommonKADS, IBROW3, and EXPECT; however, for various reasons, such a study was not possible. The evaluation of MAKTab therefore consisted of three experiments: an evaluation by the developer building two types of KBS in the elevator domain, a user evaluation building two types of KBS in the computer hardware domain, and an interface evaluation. The developer evaluation consisted of using MAKTab to build KBS(pnr, elevator) by combining PS(pnr, -) with ONT(elevator, [diag]), and to build KBS(diag, elevator) by combining PS(diag, -) with ONT(elevator, [pnr]). Both of these KBSs were relatively complex, and building them provided an extensive test of MAKTab. Building KBS(pnr, elevator) took 16 hours 40 minutes, during which time 651 new rules were defined, along with 309 new domain concepts. MAKTab was able to successfully convert these into executable JessTab code, which formed part of the final KBS. This KBS was then executed several times, each time producing a valid elevator
design proposal. Building KBS(diag, elevator) took 1 hour 33 minutes and involved defining 173 rules and 20 new domain classes. MAKTab successfully converted these into executable JessTab code to produce the final KBS, which, when tested with a range of known and unknown symptoms, produced the correct diagnosis and repair advice. The user evaluation of MAKTab involved a series of subjects using MAKTab to produce either a computer hardware configuration KBS, KBS(pnr, computer), or a computer hardware fault diagnosis KBS, KBS(diag, computer). Two subjects took part in the KBS(pnr, computer) experiment; both built a new KBS in under 2 hours 36 minutes, and then used MAKTab to generate an executable KBS (consisting of around 5000 lines of code) which produced valid design suggestions; while building their KBSs, the subjects found it necessary to create five new domain concepts. Four subjects took part in the KBS(diag, computer) experiment; all successfully built their KBS in between 46 minutes and 1 hour 32 minutes, and the executable KBSs produced correct diagnosis and repair advice; while building their KBS, each subject found it necessary to define between three and five new domain concepts.
7.3 Evaluation of PS2R

7.3.1 Aim of Evaluation
PS2R was developed to support the acquisition of PSs, particularly from the Web. To evaluate the extent to which PS2R achieved this, I performed an empirical evaluation which investigated the online availability of JessTab, Jess, and CLIPS implementations of the standard PSMs, as returned by Web search; additionally, I compared the tool's Web PS search function with manual Web searching, and investigated the process of evaluating the potential reusability of PSs found on the Web. The aims of this experiment were to determine:

1. If PS2R's Web based PS search returned more focused results than a general Web search engine.

2. The number of JessTab, Jess, and CLIPS programs using terminology associated with PSMs that can be found using Web search engines.

3. If each returned program could be considered to be a PS, and if so, the degree to which it is tied to a domain, and the ease with which it could be reused.

4. The processes that were used during the analysis of the returned PSs, and the background knowledge required to perform such analyses.
7.3.2 Experimental Design
This experiment involved performing several Web searches for PSs using both Google and PS2R. A variety of search terms were used to search for files with the standard Jess and CLIPS file extension (".clp") and, additionally, files with the ".jess" extension. When searching with Google, every search result was scanned visually to determine if it appeared to be a syntactically valid program, the degree to which it was related to a domain, and its suitability for reuse; after analysing each search result, the processes that I had used, and the background knowledge required to make these judgements, were recorded. In the case of PS2R, a similar process was used,
the differences being that it was not necessary to scan the program to determine if it appeared to be syntactically valid, and that, when trying to gain an understanding of the retrieved program, I first examined the extracted ontologies to determine if a (potential) understanding of the program could be gained from the ontology, before examining the program code to gain further understanding and determine if the initial understanding (gained from the ontology) had been correct. It is important to highlight that any programs returned by either search engine that were not written in English were discarded. The sets of search terms used during these experiments were derived from keywords associated with PSM descriptions in the literature, particularly [98, chap. 6]. The search terms were grouped into three categories: narrow PS search terms (PS names), broad PS search terms (PS type plus an associated concept name), and domain terms.

Narrow PS Search Terms
Assessment; Classification; Scheduling; Diagnosis; Diagnose; Configuration; Monitoring; Synthesis; Planning; Prediction; Design; Configuration Design; Modelling

Broad PS Search Terms
Assessment Subject; Assessment Resource; Assessment SubjectGroup; Assessment Allocation; Classification Object; Classification Class; Classification Attribute; Classification Feature; Scheduling Job; Scheduling Unit; Scheduling Resource; Scheduling Constraint; Diagnosis Complaint; Diagnosis Symptom; Diagnosis Hypothesis; Diagnosis Differential; Diagnosis Finding; Diagnosis Evidence; Diagnosis Fault; Configuration Component; Configuration Parameter; Configuration Constraint; Configuration Preference; Configuration Requirement; Monitoring Parameter; Monitoring Norm; Monitoring Discrepancy; Synthesis SystemStructure; Synthesis Constraint; Synthesis Preference; Synthesis Requirement; Planning Goal; Planning Action; Planning Plan

Domain Search Terms
Apple; Minerals; Rock; Animal; Load; Loan; Computer; Elevator; Lift; Copier; Car; Plane; Airplane; Plant; Office; Staff; Employees; Gate; Therapy; Treatment; Disease; Illness; Production; Job; Work

Figure 7.2: Search terms used during the PS2R evaluation.
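A crude stand-in for the kind of syntax filtering PS2R applies to search results can be sketched as follows. This balanced-parentheses heuristic is an illustrative assumption only; the real tool parses the retrieved files properly:

```python
# Illustrative heuristic for rejecting search results that cannot be
# valid CLIPS/Jess programs: parentheses must balance and at least one
# familiar construct keyword must appear. This is NOT PS2R's actual
# algorithm, which performs proper syntactic parsing.

def looks_like_clips(source: str) -> bool:
    depth = 0
    for ch in source:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:          # closing paren with no opener
                return False
    if depth != 0:                 # unbalanced overall
        return False
    return any(kw in source for kw in ("defrule", "deffacts", "deftemplate"))

valid = '(defrule hello => (printout t "hi" crlf))'
invalid = "<html><body>defrule ( broken"
print(looks_like_clips(valid), looks_like_clips(invalid))
```

Even this weak filter would reject the HTML pages and fragments that inflate the raw Google result counts, which is the effect the precision figures in the next section quantify.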
7.3.3 Materials
The sets of search terms were based on the template knowledge models (PSM descriptions) provided in [98, chap. 6], which describes 13 different types of PSM. Each PSM description provides the PSM name, related phrases, and potential domains, which were used to derive the three groups of search terms used in this study, giving a total of 72 search terms. These groups are shown in figure 7.2.
7.3.4 Results
A summary of the search results is presented in table 7.11. The table shows the search terms, the (combined) number of results returned by the PS2R search for those terms (for ".jess" and ".clp" files), the (combined) number of results returned by the Google search for those terms, the number of these Google results that appeared to be syntactically valid JessTab, Jess, or CLIPS programs after visually scanning the code, the apparent Google precision rate (apparently valid results/total results), and the actual Google precision rate (number of PS2R results/number of Google results; the syntax parsing performed by PS2R on the search results ensures they are all syntactically valid programs). As all results returned by PS2R are syntactically valid, its precision is always 1.00 and so is not shown in table 7.11. A total of 41 search terms returned no results for both searches and are not included in table 7.11.

Search Terms        PS2R     Google   Apparently     Google Apparent  Google Actual
                    Results  Results  Valid Google   Precision        Precision
                                      Results
Diagnosis Symptom   0        1        1              100.00%          0.00%
Planning Goal       0        1        1              100.00%          0.00%
Planning Action     0        1        1              100.00%          0.00%
Assessment          0        1        1              100.00%          0.00%
Classification      2        3        3              100.00%          66.67%
Diagnosis           5        8        8              100.00%          62.50%
Diagnose            1        2        2              100.00%          50.00%
Configuration       0        7        6              85.71%           0.00%
Planning            3        7        5              71.43%           42.86%
Prediction          0        3        3              100.00%          0.00%
Design              0        10       4              40.00%           0.00%
Apple               0        4        0              0.00%            0.00%
Rock                0        17       2              11.76%           0.00%
Animal              11       16       16             100.00%          68.75%
Load                16       54       50             92.59%           29.63%
Computer            5        25       19             76.00%           20.00%
Lift                0        3        3              100.00%          0.00%
Copier              1        1        1              100.00%          100.00%
Car                 8        22       13             59.09%           36.36%
Plane               0        1        1              100.00%          0.00%
Plant               0        4        4              100.00%          0.00%
Office              2        5        3              60.00%           40.00%
Staff               4        6        4              66.67%           66.67%
Employees           1        2        1              50.00%           50.00%
Gate                0        5        4              80.00%           0.00%
Therapy             0        1        0              0.00%            0.00%
Treatment           1        2        2              100.00%          50.00%
Disease             1        1        1              100.00%          100.00%
Production          0        3        2              66.67%           0.00%
Job                 2        10       2              20.00%           20.00%
Work                6        26       9              34.62%           23.08%
Total               69       253      173            68.38%           27.27%

Table 7.11: Summary of the retrieval rates of PSs from the PS2R evaluation.

Table 7.11 shows that only 27.27% of the 253 results returned by Google are actually valid JessTab, Jess, or CLIPS programs, despite 68.38% (173) appearing to be after a quick visual scan. This shows the benefit of using PS2R: by not returning files that are not valid JessTab, Jess, or CLIPS programs, it considerably reduces the user's workload during the search evaluation process, allowing the user to focus on program analysis. The analysis stage of the Google results found that it was necessary to have a familiarity with JessTab, Jess, and CLIPS concepts to
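The headline precision figures can be re-derived directly from the totals in table 7.11:

```python
# Recomputing the precision figures from Table 7.11's totals: 253 Google
# results, 173 apparently valid on visual inspection, and 69 returned by
# PS2R after syntactic parsing.

google_total = 253
apparently_valid = 173
ps2r_valid = 69

apparent_precision = apparently_valid / google_total
actual_precision = ps2r_valid / google_total

print(f"{apparent_precision:.2%}")  # 68.38%
print(f"{actual_precision:.2%}")    # 27.27%
```

The gap between the two figures is the wasted effort a manual searcher spends on results that look valid at a glance but are not.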
understand the purpose and reusability of the programs that were found. Analysis of the PS2R results found that, although the extracted ontology was useful (typically for gaining an idea of the domain and the application), knowledge of JessTab, Jess, and CLIPS was still required to determine what each program did. Further, analysis of the valid programs determined that they were all relatively simple, with little reuse potential; in fact, every program found by PS2R came from an educational institution and was an example program from a teaching course.
7.3.5 Conclusion
The aim of this experiment was to investigate four issues related to PS search, particularly the quantity of available PSs, their suitability for reuse, how they were analysed, and whether PS2R produced a search engine superior to standard Web search engines in terms of user workload. After performing this experiment, the following conclusions were reached:

• PS2R's Web search did produce more focused results than Google (i.e. Google also returned files that were not JessTab, Jess, or CLIPS programs): Google had an apparent precision of 68.38% and an actual precision of 27.27% (the proportion of Google search results that were determined to be valid programs by PS2R; this assumes the Google Web search and the Google Web Service Search API provide identical results). Due to the syntactic parsing of search results performed by PS2R, its precision is 100% (i.e. all returned programs are syntactically valid).

• PS2R's PS Web search returned 69 syntactically valid JessTab, Jess, or CLIPS programs, which consisted of 49 unique programs (i.e. 20 were returned for multiple searches).

• Analysis of the apparently valid programs from the Google search found that all programs were relatively simple and that it would not be possible to reuse them in another domain.
• Of the extracted ontologies:

– three could not be loaded due to invalid characters in the concept names;

– two did not have any concepts (for various reasons; for example, the program may have used ordered facts);

– 20 provided no assistance in gaining an understanding of the program (for example, it was not possible to determine the domain of the ontology or the task it was being used for);

– 24 assisted in gaining a potential understanding of the program, in terms of either the domain it was operating in (13 ontologies, describing domains including animals, academic courses, electronic circuits, finance, and family relationships) or the task the program was performing (11 ontologies, with tasks including diagnosis of car faults, basic medical diagnosis, animal classification, and the classic AI sticks game).

• Although the extracted ontologies were helpful in some cases, useful ontologies were not extracted from all PSs, and the analysis process required knowledge of JessTab, Jess, and CLIPS.
7.3.6 Summary
The evaluation of PS2R involved a comparison of its performance with that of a standard Web search engine (Google) when searching for PSs using a variety of terms associated with standard PSMs. This comparison found that, although PS2R returned more focused results (i.e. files which were not syntactically correct programs were not returned), none of the searches returned PSs which could be reused in other domains. However, the extra features of PS2R, especially the ontology extraction feature, proved very useful when evaluating the search results, particularly for gaining an understanding of the domain a program works in.
Chapter 8

Conclusions and Future Work

8.1 Overview
This chapter concludes the thesis by discussing each of the hypotheses investigated in this work with respect to the experiments described in chapter 7, and by outlining directions for future work. The MAKTab experimental evaluations served to investigate the first three hypotheses defined for this work; sections 8.2.1, 8.2.2, and 8.2.3 discuss each of these hypotheses with respect to the experiments discussed in section 7.2. Briefly, the results of these experiments provide support for each hypothesis. The empirical evaluation of PS2R, presented in section 7.3, investigated the fourth hypothesis defined for this work, which is discussed in section 8.2.4. Briefly, this experiment failed to find support for this hypothesis. The fifth hypothesis was investigated by the evaluations of PJMappingTab, MAKTab, and PS2R, which all showed that it is possible to provide semi-automated support to help knowledge engineers reuse knowledge sources. There are several options for extending the current version of MAKTab, which are discussed in section 8.3. These include expanding the range of available generic PSs, which should increase the number of potential users; developing new rule generators, especially for alternative formats such as CSPs, due to the advantages they offer for some types of PSs; and applying the MAKTab KBS development methodology in a (Semantic) Web setting.
8.2 Conclusions
The questions that I planned to investigate at the start of this thesis have all been addressed, and each is summarised in this section. Briefly, these questions were: if one has a Knowledge Based System (KBS) which is capable of solving propose-and-revise (pnr) tasks in the elevator domain, then if one wishes to develop a diagnostic KBS in the same domain, I hypothesise that:

• The necessary diagnostic rule set for an elevator diagnostic KBS will not be contained in an existing elevator configuration KBS which uses propose-and-revise, and hence it will be necessary to acquire the additional domain specific rule set (hypothesis 1).

• A generic diagnostic problem solver (PS), together with the appropriate domain ontology, can be used to "drive" the Knowledge Acquisition (KA) process (hypothesis 2).

• An ontology developed for use with a propose-and-revise PS for a particular domain will need to be extended when it is used as the basis for acquiring a diagnostic rule set for the same domain (and vice versa) (hypothesis 3).
These can of course be generalised to apply to further pairs of problem solvers and other domains. Further, I hypothesise that it should be possible to find rule sets providing implementations of well-known problem solvers online, and that it will be possible to semi-automatically reconfigure these rule sets for use in another domain, using a technique such as that described in this thesis (hypothesis 4). Additionally, I hypothesise that it is possible to provide semi-automated support to help knowledge engineers reuse knowledge sources (hypothesis 5).
8.2.1 Hypothesis 1
Two sets of KBSs have been built during this work: one in the elevator domain, KBS(pnr, elevator) and KBS(diag, elevator), and one in the computer hardware domain, KBS(pnr, computer) and KBS(diag, computer). From a comparison of the semantics of the rules contained in each pair of KBSs, it is clear that the propose-and-revise and diagnostic rules are significantly different in functionality. There is no overlap between the actions performed by the propose-and-revise rules and those performed by the diagnostic rules, in either the elevator or computer hardware domains. This is to be expected, as diagnosis is generally performed on an artefact that has been functioning correctly up until the point when a malfunction occurred, i.e. the artefact already had a valid design/configuration. The type of fixes used by propose-and-revise (for example, selecting a narrower elevator door if the shaft is not wide enough) would generally not be an issue when performing diagnosis (if the doors had been too wide, the elevator would not have been built successfully); the same is found in the computer hardware domain. The results of Experiments A and B described in section 7.2.4 therefore support hypothesis 1: the necessary rule set for a diagnostic KBS was not contained in a propose-and-revise based KBS (and vice versa), and so had to be acquired in both the elevator and computer hardware domains. This is significant as previous approaches to reuse-based KBS development assumed that the reasoning knowledge required by a particular PS would be provided by a domain ontology taken from a KBS which used another type of PS; this has been demonstrated not to be the case. Generalising these results demonstrates that hypothesis 1 holds.
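The propose-and-revise style of reasoning contrasted with diagnosis here can be sketched as a minimal propose/check/revise loop. The design slots, constraint, and fix below are illustrative assumptions, not the elevator KBS's actual rules:

```python
# Minimal propose-and-revise loop: propose a design, check constraints,
# apply a fix for the first violation, and repeat until valid. The
# elevator-style numbers are illustrative.

design = {"shaft_width": 100, "door_width": 110}   # initial proposal

constraints = [
    # (name, predicate that is True when the constraint is VIOLATED)
    ("door-too-wide", lambda d: d["door_width"] > d["shaft_width"]),
]
fixes = {
    # the pnr-style fix: select a narrower door when the shaft is too narrow
    "door-too-wide": lambda d: d.update(door_width=d["shaft_width"] - 10),
}

def propose_and_revise(d, max_iterations=10):
    for _ in range(max_iterations):
        violated = [name for name, bad in constraints if bad(d)]
        if not violated:
            return d                      # a valid configuration
        fixes[violated[0]](d)             # revise and re-check
    raise RuntimeError("no valid design found")

print(propose_and_revise(design))  # door narrowed to fit the shaft
```

Diagnosis, by contrast, starts from an already valid configuration and maps symptoms to causes and repairs, so neither rule set can substitute for the other; this is the structural reason behind hypothesis 1.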
8.2.2 Hypothesis 2
The KA technique developed for the MAKTab KA tool uses the (rule) requirements of a generic PS, combined with an appropriate domain ontology, to acquire the rules that the PS needs to enable it to reason with the concepts described in the domain ontology. The underlying KA technique is not tied to any particular type of PS; rather it uses knowledge of how the different types of required rules are related to drive the KA process. This means that the technique should be general enough to be applicable to any type of PS in any domain. During the evaluations described in section 7.2, the KA technique has been successfully applied to two different types of PSs (propose-and-revise configuration and diagnosis) in two different domains (elevator and computer hardware). The experiments in the elevator domain provided an extensive test of the KA process and its usefulness. Developing the KBS(pnr, elevator) required the use of the KA process to define 651 rules that provided the domain specific reasoning component of a KBS, which produces valid elevator configurations. The support provided by the KA tool during the definition of the rules
was found to be useful, as rules were defined in a natural order and more quickly, due to the autocompletion that the ordering enabled. This was reflected in the time taken to create the KBS: 16 hours and 40 minutes. Although this appears to be an excessively long time, consider that the generated version of the rules consists of 8377 lines of code (using standard Jess formatting), containing 340,979 characters; given an average (fast) touch-typing speed of 40 five-character words per minute, it would take over 23 hours to type those rules manually (without any additional time for thinking about the syntax and semantics of the rules). Developing KBS(diag, elevator) required use of the focused rule KA process to define 173 diagnostic rules, which provided the domain specific reasoning component of a KBS that produces valid diagnoses of elevator faults for a given set of symptoms. Again, the KA process facilitated the quick and easy creation of these rules. The experiments in the computer hardware domain also provided a test of the KA process and its usefulness as perceived by the subjects. In the computer configuration experiment, both subjects had little familiarity with KBSs, Jess, the propose-and-revise algorithm, and the computer configuration task; however, by using the KA technique, they were both able to define 39 rules which enabled the resulting KBS to produce valid computer hardware configurations. The KA technique enabled these rules to be defined in a relatively short period of time (2 hours 13 minutes for subject 1, and 2 hours 21 minutes for subject 2), with both subjects generally feeling that the support provided by MAKTab helped with the process. The computer diagnosis experiments also showed that subjects were able to define appropriate diagnostic rules (which then formed part of an executable KBS) in a reasonable time period by using the guided KA technique.
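As a quick sanity check, the arithmetic behind the typing-time estimate above can be reproduced directly (the character count and 40 wpm figure are as quoted in the text; five characters per word is the standard typing-speed convention):

```python
# Estimate how long it would take to type the generated Jess rules manually.
CHARS = 340_979        # characters in the generated rule code (from the text)
WPM = 40               # assumed fast touch-typing speed, words per minute
CHARS_PER_WORD = 5     # standard word length used in typing-speed measures

minutes = CHARS / (WPM * CHARS_PER_WORD)
hours = minutes / 60
print(f"{hours:.1f} hours")  # well over the 23 hours claimed in the text
```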
As in the previous experiment, the four subjects in the computer diagnosis experiment had very little familiarity with KBSs and Jess; however, they did have an understanding of diagnosis and familiarity with computer hardware diagnosis. This prior understanding of the task was probably responsible for these subjects defining more rules in less time than the subjects in the configuration experiment. Briefly, by using the guided KA technique, subject 2 defined 53 rules in 1 hour 28 minutes, subject 3 defined 60 rules in 43 minutes, subject 4 defined 61 rules in 1 hour 4 minutes, and subject 5 defined 62 rules in 50 minutes; the KBSs generated using these rules all provided correct diagnoses and repair advice for symptoms that the KBSs knew about, and appropriate messages for symptoms that they did not. The results of Experiments A and B described in section 7.2.4 therefore support hypothesis 2.
8.2.3 Hypothesis 3
It was necessary to define additional domain concepts in all four KBS creation experiments described in section 7.2 (two KBSs in the elevator domain and two in the computer hardware domain). The domain ontologies used for the elevator KBSs were taken from pre-existing KBSs, and so I had no influence over their contents. When using ONT(elevator, [diag]) to build KBS(pnr, elevator), it was necessary to extend the domain knowledge provided by ONT(elevator, [diag]) with 23 new classes, 286 new individuals, and 39 new properties to facilitate the creation of the propose-and-revise rules. Similarly, when using ONT(elevator, [pnr]) to build KBS(diag, elevator), it was necessary to extend the domain knowledge provided by ONT(elevator, [pnr]) with 20 new classes to facilitate the creation of the diagnostic rules.
During the computer configuration experiments it was necessary to create the domain ontologies that would be used by the participants. To minimise any potential effects of providing ontologies which would support hypothesis 3, the task specification documents were written before the domain ontologies were created (which meant the documents were not influenced by the contents of a given ontology) and were based on texts provided by external sources (to avoid any purposeful difference in the components referenced). The computer domain ontologies were based on an initial ontology derived from component classifications and descriptions used by two online computer hardware retailers (so the ontology was not influenced by the contents of the task specification documents). Further, the initial computer ontology was used to build the two computer KBSs, and then any concepts not used by a KBS were removed. This produced two computer hardware domain ontologies, each of which solely met the requirements of either the configuration or the diagnostic KBS. There were two exceptions to this: KBS(diag, computer) did not use any individuals associated with the component classes, nor did it use the required power value of the separate components; however, both of these were required by KBS(pnr, computer). As it was expected that the creation of KBS(pnr, computer) would take around the maximum desired time for an experiment, it was decided to leave these concepts in ONT(computer, [diag]). Had they not been left in, the participants would have been required to add them, which is essentially an ontology population task that would have evaluated Protégé's individual creation tab, rather than MAKTab. All five participants who took part in Experiment B found it necessary to extend the domain knowledge provided by the domain ontology with either new individuals or new component classes.
Both subjects in the computer configuration experiment found it necessary to define five new SystemVariable individuals; in the computer diagnosis experiment, subject 2 defined three new component classes, subject 3 defined five new component classes, subject 4 defined three new component classes, and subject 5 defined five new component classes. This is significant as previous approaches to reuse-based KBS development assumed that all the necessary domain knowledge for a given PS would be provided by a domain ontology, and that that domain ontology could be used with multiple PSs. The results of Experiments A and B described in section 7.2.4 show that this is not necessarily the case, and so support hypothesis 3.
8.2.4 Hypothesis 4
The PS2R tool provides a basic Web search facility for problem solvers, particularly ones written in JessTab, Jess, or CLIPS. As part of the evaluation of PS2R, multiple Web searches were performed for problem solvers using keywords typically associated with popular types of problem solvers and their applications. Although the parsing facilities of PS2R improved the search process by only returning syntactically valid programs, none of these programs were sufficiently sophisticated to be considered worth reusing. Therefore, at the current time, the search technique provided by PS2R does not support the hypothesis that it is possible to find and reuse existing PSs from the Web. This is either because there are very few examples of good PSs available on the Web in JessTab, Jess, or CLIPS format, or, if there are many such examples, because the search technique is not effective enough to find them.
8.2.5 Hypothesis 5
Three tools have been developed in this work to provide semi-automated support for knowledge engineers reusing knowledge sources, namely PJMappingTab, MAKTab, and PS2R. PJMappingTab supports reusing simple PSs with different domain ontologies by suggesting mappings between the concepts used by a PS and a domain ontology, and updating the concept references in the PS to refer to the concepts in the domain ontology. The evaluation of PJMappingTab found this support to be beneficial when reusing a relatively simple PS with different ontologies, and so PJMappingTab provides semi-automated support for the reuse of knowledge sources (PSs and domain ontologies). The experimental evaluations of MAKTab also found that MAKTab supports the knowledge engineer with the reuse of knowledge components, in particular generic PSs and domain ontologies. Although the mapping tool was unable to suggest mappings between the domain and PS ontologies in these experiments, the subjects were able to define suitable mappings manually using the tool, and the mapping application algorithm successfully mapped the domain concepts to the PS ontology. The various support features of the KA tool, such as the guided KA process, suggesting values for new antecedents and consequents added to a rule, and suggesting antecedents based on previous or related rules' consequents, were found to benefit the rule creation process. Moreover, the rule generation feature was found to be especially useful, with the tool automatically generating executable systems significantly larger than could have been produced manually in the same time. The evaluation of PS2R found that, although no PSs were found that formed good candidates for reuse, the semi-automated support provided by PS2R improved the PS search and evaluation process, and so PS2R supports the knowledge engineer with acquiring components to reuse, the first stage of the reuse process.
In particular, returning only syntactically correct PSs considerably reduces the number of results the user is required to evaluate, compared with identical searches using a standard Web search engine. Further, the ontology extraction mechanism was found to be very useful when examining a PS to gain an understanding of the domain it works in, what it does, and how it does it. These three tools and their evaluations therefore support hypothesis 5: that it is possible to provide semi-automated support to help knowledge engineers reuse knowledge sources.
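The antecedent-suggestion support mentioned above can be illustrated with a small sketch (the data structures and example atoms here are invented for illustration, not MAKTab's actual implementation): if an earlier rule asserts a fact, a later rule is likely to test for it.

```python
from collections import defaultdict

def build_suggestions(rules):
    """rules: dict mapping a rule name to (antecedents, consequents),
    each a list of atom strings. Returns a map from each consequent
    atom to the rules that assert it."""
    asserted_by = defaultdict(list)
    for name, (_, consequents) in rules.items():
        for atom in consequents:
            asserted_by[atom].append(name)
    return asserted_by

def suggest_antecedents(asserted_by):
    """Offer every atom asserted by some rule as a candidate antecedent."""
    return sorted(asserted_by)

rules = {
    "init-counterweight": ([], ["(counterweight-selected)"]),
    "check-shaft-width":  (["(counterweight-selected)"], ["(width-checked)"]),
}
print(suggest_antecedents(build_suggestions(rules)))
# ['(counterweight-selected)', '(width-checked)']
```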
8.3 Future Work
There are various directions in which I would like to take this work. Initially, I would like to investigate applying this approach to the other standard types of PSM, and to explore alternative languages for the generated executable KBSs. As the Semantic Web movement gathers momentum, I am particularly interested in applying this approach on the (Semantic) Web.
8.3.1 Expanding the Range of Generic PSs
The evaluations performed in this work have shown that this approach can be successfully applied to the propose-and-revise and diagnosis problem solvers. These are two of the PSMs that were identified by the knowledge engineering community in the 1980s and 1990s. [98, chap. 6] identifies nine further types of PS, six of which are discussed, namely: classification, assessment,
monitoring, assignment, planning, and scheduling. The classification, assessment, and monitoring PSMs are grouped with diagnosis as analytic tasks, as they all require similar types of input (typically a description of an artefact), use similar reasoning processes, and produce a similar output (again, typically a classification of an artefact derived from the inputs); as such, it is easy to imagine them working well with the MAKTab KBS development approach. The remaining PSMs are grouped with configuration design (the CommonKADS terminology for propose-and-revise based design) as synthetic tasks. In fact, a general description of a synthesis task is provided, which illustrates how, at a general level, these different PSMs are all related; this argument is furthered by the PSM descriptions. The description of the assignment PSM specifies that if the “simple method for assignment (that is provided) is not appropriate ... it is best to use a configuration design method instead” [98, chap. 6]; a similar suggestion is given for the planning PSMs: “if the space of possible plans is large, we advise you to use the method described for configuration design ... (which is) easy to adapt to planning” [98, chap. 6]. These descriptions are encouraging, as they imply that it should be possible to apply the MAKTab methodology to the assignment, planning, and scheduling PSMs, since the approach has been successfully used with configuration design. They also suggest that it should be possible to alter the existing rule generator for PS(pnr, -) to produce rule generators, initially for each of the new synthetic generic PSs, and later possibly a generic rule generator for all synthetic PSs. This argument also applies to the rule generator for PS(diag, -), and its use as the basis of rule generators for the different analytic PSs, and possibly a generic rule generator for all analytic PSs.
I am therefore very interested in investigating this further, as, if successful, it would 1) make more types of PSs available for use with MAKTab, and 2) contribute to the CommonKADS approach by illustrating how their PSMs can be implemented and used to build KBSs. In fact, the template knowledge models described in [98, chap. 6] may provide good starting points for new generic PSs: the domain schema provided by each template knowledge model may provide the domain knowledge structure for a generic PS, and the knowledge model’s reasoning description could be used to guide the development of the generic PS’s domain specific rules and domain dependent rule structure.
8.3.2 Generic SWRL to JessTab Rule Generator
Currently, separate rule generators are used by MAKTab for PS(pnr, -) and PS(diag, -) to meet the requirements of their associated generic code (PS-RS(pnr, -) and PS-RS(diag, -) respectively). However, it may be desirable to also provide a generic SWRL to JessTab rule generator which is capable of converting any SWRL rule into an executable JessTab rule, as this may be of use when developing other types of generic PSs.
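To illustrate the idea, such a translation for the simple case of class and property atoms might be sketched as follows (the atom representation and the Jess pattern syntax for ontology instances are deliberately simplified; real SWRL built-ins, data ranges, and JessTab's Protégé-backed instance handling would all need extra machinery):

```python
# Toy SWRL-to-Jess translation: atoms are tuples, not real SWRL objects.
def atom_to_jess(atom):
    """Translate one simplified SWRL atom, e.g. ('Class', 'Elevator', '?x')
    or ('Property', 'hasDoorWidth', '?x', '?w'), into a Jess pattern."""
    kind = atom[0]
    if kind == "Class":
        _, cls, var = atom
        return f"(object (is-a {cls}) (OBJECT {var}))"
    if kind == "Property":
        _, prop, subj, obj = atom
        return f"({prop} {subj} {obj})"
    raise ValueError(f"unsupported atom: {atom}")

def swrl_to_jess(name, body, head):
    """Build a defrule string: body atoms become the LHS, head atoms
    become assertions on the RHS."""
    lhs = " ".join(atom_to_jess(a) for a in body)
    rhs = " ".join(f"(assert {atom_to_jess(a)})" for a in head)
    return f"(defrule {name} {lhs} => {rhs})"

rule = swrl_to_jess(
    "door-too-wide",
    [("Class", "Elevator", "?x"), ("Property", "hasDoorWidth", "?x", "?w")],
    [("Property", "needsRevision", "?x", "doorWidth")],
)
print(rule)
```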
8.3.3 Tracking New Domain Concepts
The KA tool of MAKTab allows the user to define new domain concepts if they are required by the rules. However, the onus is currently on the user to go back and perform KA to define the relevant rules for each new domain concept (for example, if a new variable is defined, to ensure that it is initialised). It may be desirable for MAKTab to remember any new concepts that are defined and, at some point before the executable rules are generated, prompt the user to create corresponding rules for the new concepts (if they have not already been created).
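A minimal sketch of the proposed bookkeeping (the class and example concept names are hypothetical, not part of MAKTab):

```python
# Track new domain concepts and report any that still lack rules
# before the executable rules are generated.
class ConceptTracker:
    def __init__(self):
        self.new_concepts = set()
        self.covered = set()

    def concept_defined(self, name):
        """Record a concept the user added during KA."""
        self.new_concepts.add(name)

    def rule_defined_for(self, name):
        """Record that a rule now initialises/uses this concept."""
        self.covered.add(name)

    def missing_rules(self):
        """Concepts defined during KA that no rule yet covers."""
        return sorted(self.new_concepts - self.covered)

t = ConceptTracker()
t.concept_defined("max-car-speed")        # e.g. a new system variable
t.concept_defined("door-motor-torque")
t.rule_defined_for("max-car-speed")       # an initialisation rule was created
print(t.missing_rules())                  # ['door-motor-torque']
```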
8.3.4 Ensuring Rule Completeness
During Experiment B (section 7.2.3), it was observed that when building the computer diagnosis KBS, one subject did not complete the definition of a rule before producing the executable system. Essentially the subject defined the value of the component property for an atom in a rule, but did not define the value of the associated malfunction property. Although this did not cause any runtime errors when the KBS was executed, it is easy to imagine a situation where it may have. To help prevent this type of runtime error, it is desirable to extend MAKTab to ensure that all rules have been satisfactorily completed before generating the executable KBS. This could easily be achieved by placing appropriate cardinality constraints on the properties associated with relevant classes in the PS ontology. It would then be possible to validate any defined rule that uses individuals of these classes by comparing the individuals (that make up the defined rule), with their associated class definitions. If an individual does not comply with the relevant cardinality restrictions, then the user would be prompted to correct this error (or delete the related atom/rule) before any rules are generated. This is a relatively simple extension to MAKTab, which would help reduce potential runtime errors in the generated KBSs.
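The proposed check can be sketched as follows (the class name, property names, and constraint table are illustrative, not the actual PS ontology vocabulary):

```python
# Validate rule individuals against minimum-cardinality constraints
# declared for their class, before generating the executable KBS.
MIN_CARDINALITY = {
    # class -> {property: minimum number of values required}
    "DiagnosticAtom": {"component": 1, "malfunction": 1},
}

def validate(individual):
    """Return the properties of `individual` that violate a
    min-cardinality constraint, so the user can be prompted to
    complete (or delete) the atom before rule generation."""
    constraints = MIN_CARDINALITY.get(individual["class"], {})
    return [prop for prop, minimum in constraints.items()
            if len(individual.get(prop, [])) < minimum]

atom = {"class": "DiagnosticAtom",
        "component": ["hard-drive"],
        "malfunction": []}        # the value the subject forgot to define
print(validate(atom))             # ['malfunction']
```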
8.3.5 Alternative Executable Formats
MAKTab stores the rules that are defined using the SWRL formalism; currently, these are then converted into JessTab rule sets. JessTab provides a very good environment for building KBSs using ontologies in Protégé. However, it is easy to imagine a situation where it is desirable for the KBS to be executed independently of Protégé, perhaps as part of a larger system. Further, Jess is not always the most efficient engine for running certain types of PS. Runcie [96] demonstrated that significant reductions in execution time can be gained from using a CSP approach to configuration design. In fact, Runcie developed a technique (and supporting tool) which takes as input a propose-and-revise based KBS, very similar to those generated by MAKTab, and converts it to an executable CSP, which typically executes in a small percentage of the time taken by Jess. For example, the CSP generated from the Stanford Sisyphus II VT solution produces a solution in 0.01 seconds; the CLIPS solution, in contrast, takes several seconds (as does the Protégé JessTab version discussed in section 4.5.1). Further, by using a CSP approach, the fix related knowledge (required by propose-and-revise) was not required. It is therefore desirable to investigate having MAKTab generate KBSs in formalisms other than JessTab, as there are potentially many benefits to be gained by using alternative engines; there are, of course, many disadvantages as well, not least requiring the user to have additional software installed and to move away from the Protégé environment to execute the KBS. As the rules currently defined with MAKTab are expressed in an implementation independent format (SWRL), it should be possible to create new rule generators for MAKTab which convert the SWRL based rules into other appropriate (executable) formalisms.
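By way of illustration, configuration-as-CSP can be sketched with a tiny backtracking solver (a toy example with invented component domains; Runcie's actual encoding is not reproduced here). Each variable is a component slot, each domain a set of catalogue values, and constraints play the role of propose-and-revise fixes: invalid combinations are simply never enumerated, so no fix knowledge is needed.

```python
# Minimal backtracking CSP solver for a configuration-style problem.
def solve(domains, constraints, assignment=None):
    assignment = assignment or {}
    if len(assignment) == len(domains):
        return assignment                      # every slot filled
    var = next(v for v in domains if v not in assignment)
    for value in domains[var]:
        candidate = {**assignment, var: value}
        if all(c(candidate) for c in constraints):
            result = solve(domains, constraints, candidate)
            if result:
                return result
    return None                                # dead end: backtrack

domains = {
    "shaft_width_cm": [200, 250],
    "door_width_cm": [90, 110, 130],
}
constraints = [
    # door must fit the shaft with clearance; skip until both are assigned
    lambda a: ("shaft_width_cm" not in a or "door_width_cm" not in a
               or a["door_width_cm"] + 100 <= a["shaft_width_cm"]),
]
print(solve(domains, constraints))
# {'shaft_width_cm': 200, 'door_width_cm': 90}
```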
8.3.6 Developing KBSs on the (Semantic) Web
Along with potentially providing access to many more ontologies, the Semantic Web should also provide details of how to map from one ontology to another, which can be used to enhance the KBS development process. This potential application of the MAKTab technique on the (Semantic) Web is outlined in
figure 8.1, and is described briefly below.

[Figure 8.1: Building KBSs on the Semantic Web.]

1. Browsing the Semantic Web, the user finds a page (or pages) which provides the domain knowledge they wish to use (to reason about).

2. The user provides the URL(s) of the selected web page(s), and the tool retrieves the associated ontology.

3. The user browses the library of generic PSs, selecting the one which provides the type of reasoning they wish to use.

4. The tool searches its repository of stored mappings for any previously used with the selected generic PS and the user’s domain ontology; further, on the Semantic Web, the tool will be able to use the mapping knowledge associated with the user’s domain ontology (and others on the Semantic Web) to, if necessary, create a sequence of mappings which map the user’s domain ontology to the generic PS through a series of intermediate ontologies (see below for details). After the user checks the mappings, altering them if necessary, the tool executes them, providing the generic PS with knowledge of some of the concepts in the domain. This step corresponds to step 2 in the MAKTab implementation described in section 4.4.

5. Using interactive Web technologies, the tool supports the user in defining the required rules for each appropriate domain concept. New ontological concepts can be added to enhance the representation of the domain if necessary. This step corresponds to step 3 in the MAKTab implementation described in section 4.4.
(Until the Semantic Web vision is realised, steps 1 and 2 can be substituted with the user providing their domain ontology directly to the tool.)
6. Having defined the required rules, the tool generates an executable KBS by combining the generic PS code, the result of converting the user’s defined rules into an executable format, and the enhanced user’s domain ontology. This step corresponds to steps 4 and 6 in the MAKTab implementation described in section 4.4.

7. The tool could then execute the KBS, returning the results to the user.
Domain Ontologies
In MAKTab, domain ontologies are taken from a variety of sources including ontology search engines, repositories such as OntoSearch2 [84], online directories, and existing KBSs (if they can be decoupled from the associated reasoning module). If the Semantic Web vision is fully realised, then ontologies will become much more widely available than they are today. This should mean that in the future, as well as the existing sources, it will also be possible to import ontologies from appropriate Semantic Web Sites, which can then be used as the domain knowledge source for a new KBS. The quality of these ontologies, in terms of accuracy and completeness, will of course vary, but hopefully they will at least provide a loosely structured KB which, using this methodology, can be further developed and extended into a suitable domain knowledge source for a KBS. Tools such as CleOn [102] and RepairTab [62], which deal with detecting and repairing lexical and logical errors in domain ontologies, could be incorporated into the online tool to further improve the quality of an ontology before it is used in KBS development.
Generic Problem Solvers
In MAKTab, a generic PS is provided by the PS ontology, which is loaded into MAKTab before being configured for a particular domain. MAKTab also handles the generation of the executable KBS. When applying the MAKTab methodology on the Semantic Web, it will be desirable to store a repository of generic PSs which can be used for browsing and selection. Ideally, providers of generic PSs will also handle the generation and execution of the final KBSs, to both reduce server workload and increase the overall flexibility of the system (for example, this would allow any programming language to be used for executing the KBS, not just those directly supported on the server). It is also anticipated that the generic PSs stored in PS repositories will not just be implemented in JessTab, but in a wide range of programming languages. It will be important to ensure that these other programming languages can also successfully integrate with instantiated ontologies. Obvious candidates for consideration are CLIPS and Prolog, as Protégé plug-ins already exist for these languages (CLIPSTab [3] and PrologTab (http://prologtab.sourceforge.net/)) which enable them to be used with an ontology in Protégé.
Mapping
As discussed previously, there are two main challenges for the user during the mapping stage: firstly, determining which are the corresponding concepts in the domain ontology and the generic PS; and secondly, determining how the correspondences between these concepts can be defined. These are likely to remain crucial steps in the Semantic Web version of the tool. In general, if one has a source (domain) concept (sc1), a target (PS) concept (tc1), and a set of mapping rules M1 ... Mn (possibly from a repository of mappings) describing how to map between many different
concepts, then showing whether it is possible to find a set of mappings which will transform sc1 to tc1 is likely to be a sizeable search problem, in which many potential configurations of mappings would need to be tried, many of them leading to dead ends. There is perhaps a more effective way of achieving mappings, which was implicit in [8]. This paper specifies that the ontology representing a Semantic Web Site’s content will be associated with a set of mappings between it and other ontologies (either for other Semantic Web Sites, or standard ontologies). Let’s say a given page is represented by ontology O1, which is associated with a set of mappings to change the concepts in O1 to one of several ontologies, say Ox and Oy. So, effectively, O1 is associated with a set of mappings from O1 to Ox and from O1 to Oy; let’s call these Mappings(1, x) and Mappings(1, y) respectively. If we wish to map to an ontology, O2, representing another page’s content, we would first check if Mappings(1, 2) is associated with O1. If not, we could check if specified repositories of mappings have Mappings(x, 2) or Mappings(y, 2), in which case we would know we could map from O1 to O2 in two stages. Of course, in general this mapping process could explore a larger number of mapping stages. Having mapping information centralised in mapping repositories would, in fact, allow the existence of a suitable sequence of mappings to be established relatively easily and cheaply; the actual mapping process then simply requires performing the mappings as specified in the derived sequence.
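The search for such a sequence can be sketched as a breadth-first search over the repository, treating each stored Mappings(a, b) as a directed edge (an illustrative sketch; the ontology names follow the example above):

```python
from collections import deque

def mapping_path(repository, source, target):
    """repository: set of (from_ontology, to_ontology) pairs.
    Returns the ontologies on a shortest mapping sequence, or None."""
    edges = {}
    for a, b in repository:
        edges.setdefault(a, []).append(b)
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path                     # only these mappings need executing
        for nxt in edges.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None                             # no sequence of mappings exists

repo = {("O1", "Ox"), ("O1", "Oy"), ("Oy", "O2")}
print(mapping_path(repo, "O1", "O2"))       # ['O1', 'Oy', 'O2']
```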
8.4 Summary
The MAKTab experimental evaluations served to investigate the first three hypotheses defined for this work. The KBSs built as part of the experiments in both the elevator and computer domains showed that, when developing a new KBS for one particular type of PS (for example, diagnosis), the necessary rule set was not contained in another KBS which used a different type of PS (for example, propose-and-revise), and so it was necessary to acquire new rules as part of the development process. The focused KA technique that was developed was successfully used by me to build two relatively large and complex KBSs in the elevator domain, and by other subjects to build KBSs in the computer domain. All subjects were able to use the KA process to correctly define the required rules and produce KBSs which successfully solved the specified tasks. All subjects also found it necessary to define new domain concepts when building their KBS, indicating that the domain ontology extracted from another KBS was unable to provide all of the domain knowledge required by the new KBS. The fourth hypothesis was investigated experimentally using both PS2R and Google to search for PSs using well known domain and PSM keywords. This experiment found that, although PS2R produced more focused results (by only returning valid programs), unfortunately none of the programs retrieved could be reconfigured for other domains. The fifth hypothesis was investigated through the evaluations of PJMappingTab, MAKTab, and PS2R, which all showed that it is possible to provide semi-automated support to help knowledge engineers reuse knowledge sources. There are several options for extending the current version of MAKTab, and I am particularly interested in expanding the range of available generic PSs, as this should increase the number of potential users. I am also interested in developing new rule generators, especially for alternative
formats such as CSPs due to the advantages that they offer for some types of PSs. Overall, I am especially interested in applying the MAKTab KBS development methodology in a (Semantic) Web setting; as the Semantic Web vision becomes reality, it should provide access to a wealth of new components suitable for building KBSs, particularly ontologies, mappings, and problem solvers. These components have the potential to further automate and improve the MAKTab technique for KBS development. I also believe that a Web based version of MAKTab will be of particular benefit to the knowledge engineering community, and hopefully subsequently to domain experts.
Bibliography

[1] RIF Working Group, Accessed Jan 2008. www.w3.org/2005/rules/.

[2] CLIPS Reference Manual, Volume I, Basic Programming Guide, Accessed December 2008. http://clipsrules.sourceforge.net/documentation/v630/bpg.htm.

[3] R. Ameer. Embedding CLIPS Engine in Protégé. In Proceedings of the Sixth International Protégé Workshop, 2003.

[4] A. B. Bagula. Hybrid traffic engineering: the least path interference algorithm. In SAICSIT '04: Proceedings of the 2004 annual research conference of the South African institute of computer scientists and information technologists on IT research in developing countries, pages 89–96. South African Institute for Computer Scientists and Information Technologists, 2004.

[5] R. Benjamins. Problem-Solving Methods for Diagnosis and their Role in Knowledge Acquisition. International Journal of Expert Systems: Research & Applications, 2(8):93–120, 1995.

[6] V. Benjamins, E. Plaza, E. Motta, D. Fensel, R. Studer, B. Wielinga, G. Schreiber, Z. Zdrahal, and S. Decker. IBROW3: An Intelligent Brokering Service for Knowledge-Component Reuse on the World-Wide Web. In 11th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop (KAW98), 1998.

[7] T. Berners-Lee. Notation 3, A readable language for data on the Web, March 2006. http://www.w3.org/DesignIssues/Notation3.

[8] T. Berners-Lee, J. Hendler, and O. Lassila. The Semantic Web. Scientific American, May 2001.

[9] J. Blythe, J. Kim, S. Ramachandran, and Y. Gil. An Integrated Environment for Knowledge Acquisition. In Proceedings of the 2001 International Conference on Intelligent User Interfaces (IUI-2001), 2001.

[10] J. Breuker and W. V. de Velde, editors. CommonKADS Library for Expertise Modelling: Reusable Problem Solving Components. IOS Press, Amsterdam, The Netherlands, 1994.

[11] B. Buchanan and E. Shortliffe, editors. Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project. Addison-Wesley, Reading, MA, 1985.

[12] H. Chalupsky. OntoMorph: A Translation System for Symbolic Knowledge. In Principles of Knowledge Representation and Reasoning, pages 471–482, 2000.

[13] S. Chapman. SimMetrics - open source Similarity Measure Library, Accessed January 2008. http://www.dcs.shef.ac.uk/~sam/simmetrics.html.

[14] V. K. Chaudhri, M. E. Stickel, J. Thoméré, and R. J. Waldinger. Using Prior Knowledge:
Problems and Solutions. In Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence (AAAI/IAAI'00), pages 436–442. AAAI Press / The MIT Press, 2000.

[15] C. Corbridge, N. Major, and N. Shadbolt. Models Exposed: An Empirical Study. In 9th AAAI-Sponsored Banff Knowledge Acquisition for Knowledge Based Systems, 1995.

[16] D. Corsar and D. Sleeman. Reusing JessTab Rules in Protégé. In M. Bramer, F. Coenen, and T. Allen, editors, Research and Development in Intelligent Systems XXII: Proceedings of AI-2005, the Twenty-Fifth SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence, pages 7–20. Springer, 2005.

[17] D. Corsar and D. Sleeman. Reusing JessTab rules in Protégé. Knowledge-Based Systems, 19(5):291–297, 2006.

[18] M. Crubézy. The PSM Librarian: Download Area, Accessed January 2008. http://protege.stanford.edu/plugins/psmtab/psmtab_download.html.

[19] M. Crubézy and M. A. Musen. Ontologies in Support of Problem Solving. In S. Staab and R. Studer, editors, Handbook on Ontologies, International Handbooks on Information Systems. Springer, 2003.

[20] R. D. Cuthbert and L. Peckham. APL models for operational planning of shipment routing, loading, and scheduling. In WSC '73: Proceedings of the 6th conference on Winter simulation, pages 622–631. ACM Press, 1973.

[21] V. Devedzic. Knowledge Modeling – State of the Art. Integrated Computer-Aided Engineering, 8(3):257–281, 2001.

[22] R. Dieng, O. Corby, A. Giboin, and M. Ribière. Methods and Tools for Corporate Knowledge Management. In 11th Banff Workshop on Knowledge Acquisition, Modelling and Management (KAW98), page 42, 1998.

[23] L. Ding, T. Finin, A. Joshi, R. Pan, R. S. Cost, Y. Peng, P. Reddivari, V. C. Doshi, and J. Sachs. Swoogle: A Search and Metadata Engine for the Semantic Web. In Proceedings of the Thirteenth ACM Conference on Information and Knowledge Management. ACM Press, November 2004.

[24] A. Doan and A. Halevy. Semantic-Integration Research in the Database Community. AI Magazine, 26(1):83–94, 2005.

[25] H. Eriksson. The JESSTAB Approach to Protégé and JESS Integration. In Proceedings of the IFIP 17th World Computer Congress - TC12 Stream on Intelligent Information Processing, pages 237–248. Kluwer, B.V., 2002.

[26] H. Eriksson. Using JessTab to Integrate Protégé and JESS. IEEE Intelligent Systems, 18(2):43–50, 2003.

[27] H. Eriksson. JessTab Manual: Integration of Protégé and Jess. Linköping University, January 2004.

[28] L. Eshelman, D. Ehret, J. McDermott, and M. Tan. MOLE: a tenacious knowledge-acquisition tool. In B. G. Buchanan and D. C. Wilkins, editors, Readings in knowledge acquisition and learning: automating the construction and improvement of expert systems, chapter 3, pages 253–260. Morgan Kaufmann Publishers Inc, 1993.
BIBLIOGRAPHY
215
[29] A. Farquhar, R. Fikes, and J. Rice. The Ontolingua Server: a Tool for Collaborative Ontology Construction. Technical report, Stanford KSL, 1996. [30] C. Fellbaum, editor. WordNet: An Electronic Lexical Database. The MIT Press, Cumberland, RI, 1998. [31] D. Fensel, V. Benjamins, S. Decker, M. Gaspari, R. Groenboom, W. Grosso, M. Musen, E. Motta, E. Plaza, G. Schreiber, R. Studer, and B. Wielinga. The Component Model of UPML in a Nutshell. In First Working IFIP Conference on Software Architecture (WICSA1), 1999. [32] D. Fensel, V. Benjamins, E. Motta, and B. Wielinga. UPML: A Framework for Knowledge System Reuse. In In Proceedings of the International Joint Conference on AI (IJCAI-99), pages 16–23, 1999. [33] D. Fensel and E. Motta. Structured Development of Problem Solving Methods. Knowledge and Data Engineering, 13(6):913–932, 2001. [34] D. Fensel, E. Motta, V. Benjamins, S. Decker, M. Gaspari, R. Groenboom, W. Grosso, F. van Harmelen, M. Musen, E. Plaza, G. Schreiber, R. Studer, A. Teije, and B. Wielinga. The Unified Problem-Solving Method Development Language. ESPRIT project number 27169, IBROW3, Deliverable 1.1, Chapter 1, 1999. [35] A. S. Foundation. Apache Tomcat, Accessed January 2008. http://tomcat.apache. org/. [36] E. Friedman-Hill. Jess In Action: Rule-Based Systems in Java. Manning Publications Co., Greenwich, CT, 2003. [37] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, March 1995. [38] M. Genesereth. Knowledge Interchange Format. In J. Allen and R. Fikes and E. Sandewall, editor, Proceedings of the 2nd International Conference on Principles of Knowledge Representation and Reasoning, pages 599–600. Morgan Kaufmann Publishers, 1991. ´ E-II: ´ [39] J. Gennari, R. Altman, and M. Musen. Reuse with PROTEG from Elevators to Ribosomes. In SSR ’95: Proceedings of the 1995 Symposium on Software Reusability, pages 72–80. ACM Press, 1995. [40] J. H. Gennari, M. A. Musen, R. W. 
Fergerson, W. E. Grosso, M. Crub´ezy, H. Eriksson, N. F. Noy, and S. W. Tu. The Evolution of Prot´eg´e: an Environment for Knowledge-Based Systems Development. nternational Journal of Human-Computer Studies, 58(1):89–123, 2003. [41] J. C. Giarratano and G. Riley. Expert Systems: Principles and Programming. PWS Pulishing Company, 1998. [42] Y. Gil. Knowledge Refinement in a Reflective Architecture. In Proceedings of the Twelfth National Conference on Artificial Intelligence, AAAI ’94, pages 520–526, 1994. [43] Y. Gil and J. Kim. Deriving Expectations to Guide Knowledge Base Creation. In Proceedings of the Fourteenth National Conference on Artificial Intelligent (AAAI-99), July 19-22 1999. [44] C. Golbreich. Combining Rule and Ontology Reasoners for the Semantic Web. In G. Antoniou and H. B. (Eds.), editors, Rules and Rule Markup Languages for the Semantic Web,
BIBLIOGRAPHY
216
pages 6–22. RuleML, Springer-Verlag Berlin and Heidelberg, November 2004. [45] Google. Google SOAP Search API (Beta), Accessed January 2008. http://code. google.com/apis/soapsearch/index.html. [46] J. Grant and D. Beckett. RDF Test Cases. W3C Recommendation, February 2004. http: //www.w3.org/TR/rdf-testcases/\#ntriples. [47] T. Gruber. A Translation Approach to Portable Ontology Specification. Knowledge Acquisition, 5(2):199–220, 1993. [48] M. Horridge. The Next-Generation Prot´eg´e-OWL Editor and the OWL Toolkit. In Proceedings of the 10th International Prot´eg´e Conference, 2007. [49] I. Horrocks, P. F. Patel-Schneider, H. Boley, S. Tabet, B. Grosof, and M. Dean. SWRL: A Semantic Web Rule Language Combining OWL and RuleML. W3C Member Submission, May 2004. http://www.w3.org/Submission/SWRL/ Accessed January 2008. [50] IBROW3.
IBROW3 Homepage.
IBROW3: An Intelligent Brokering Service for
Knowledge-Component Reuse on the World-Wide Web, Esprit project number: 27169, Accessed January 2008.
http://hcs.science.uva.nl/projects/IBROW3/
home.html. [51] J.Gennari, S. Tu, T. Rothenfluh, and M. Musen. Mapping Domains to Methods in Support of Reuse. International Journal of Human Computer Studies, 41:399–424, 1994. [52] jGuru.
jGuru: JavaServer Pages Fundamentals, Short Course Contents.
Sun Devel-
oper Network, Accessed December 2008. http://java.sun.com/developer/ onlineTraining/JSPIntro/contents.html\#JSPIntro3. [53] K. Johar and R. Simha. JWord 2.0, Accessed January 2008. http://www.seas.gwu. edu/˜simhaweb/software/jword/index.html. [54] G. Kahn, S. Nowlan, and J. McDermott. MORE: An Intelligent Knowledge Acquisition Tool. In Proceedings IJCAI-85, 1985. [55] Y. Kalfoglou and M. Schorlemmer. Ontology Mapping: the State of the Art. The Knowledge Engineering Review, 18(1):1–31, 2003. [56] J. Kingston. Pragmatic KADS: A methodological approach to a small knowledge based systems project. Technical report, Artificial Intelligence Applications Institute, University of Edinburgh, 80 South Bridge, Edinburgh EH1 1HN, UK, November 1994. AIAI-TR-110, 1994. [57] H. Knublauch, R. W. Fergerson, N. F. Noy, and M. A. Musen. The Prot´eg´e OWL Plugin: An Open Development Environment for Semantic Web Applications. In S. A. McIlraith, D. Plexousakis, and F. van Harmelen, editors, The Semantic Web - ISWC 2004: Third International Semantic Web Conference, pages 229–243, 2004. [58] J. Kopena. DAMLJessKB, Accessed January 2008. http://edge.cs.drexel.edu/ assemblies/software/damljesskb/. [59] J. Kopena. OWLJessKB, Accessed January 2008. http://edge.cs.drexel.edu/ assemblies/software/owljesskb/. [60] R. Korf. Depth-First Iterative-Deepening: An Optimal Admissible Tree Search. Artificial Intelligence, 27:97–109, 1985. [61] S. Kueng. TortoiseMerge, Accessed January 2008. http://tortoisesvn.tigris.
BIBLIOGRAPHY
217
org/TortoiseMerge.html. [62] J. S. C. Lam. Methods for Resolving Inconsistencies in Ontologies. PhD thesis, University of Aberdeen, 2007. [63] S. H. Liao. Expert System Methodologies and Applications - A Decade Review from 1995 to 2004. Expert Systems with Application, 28(1):93–103, 2005. [64] S. Marcus and J. McDermott. SALT: a Knowledge Acquisition Language for Propose-andRevise Systems. Artificial Intelligence, 39(1):1–38, 1989. [65] S. Marcus, J. Stout, and J. McDermott. VT: An Expert Elevator Designer That Uses Knowledge-Based Backtracking. AI Magazine, 9(1):95–112, Spring 1988. [66] D. McGuinness, R. Fikes, J. Rice, and S. Wilder. An Environment for Merging and Testing Large Ontologies. In Proceedings of the Seventh International Conference of Principles of Knowledge Representation and Reasoning (KR2000), pages 483–493, 2000. [67] D. L. McGuinness and E. Frank van Harmelen. Overview.
W3C Recommendation, Feburary 2004.
OWL Web Ontology Language http://www.w3.org/TR/
owl-features/ Accessed January 2008. [68] C. Mellish and X. Sun. The Semantic Web as a Linguistic Resource: Opportunities for Natural Language Generation. In M. Bramer and F. Coenen and T. Allen, editor, Research and Development in Intelligent Systems XXII Proceedings of AI-2005 the Twenty-fifth SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence, pages 77–87. Springer, 2005. [69] T. Menzies. Limits to Knowledge Level-B Modeling (and KADS). In Proceedings of AI ’95, Australia. World-Scientific, 1995. [70] K. Metaxiotis and J. Psarras. Expert Systems in Business: Applications and Future Directions for the Operations Researcher. Industrial Management & Data Systems, 103(5):361– 368, 2003. [71] Microsoft.
Live Search, Accessed January 2008.
http://dev.live.com/
livesearch/ Accessed January 2008. [72] P. Mitra and G. Wiederhold. Resolving Terminological Heterogeneity In Ontologies. In Proceedings of the ECAI’02 workshop on Ontologies and Semantic Interoperability, 2002. [73] E. Motta, D. Fensel, M. Gaspari, and R. Benjamins. Specifications of Knowledge Components for Reuse. In Proceedings of SEKE ’99, 1999. [74] E. Motta and W. Lu. A Library of Components for Classification Problem Solving. In Proceedings of the 6th Pacific International Knowledge Acquisition Workshop (PKAW 2000), 2000. [75] MySQL. MySQL, Accessed January 2008. http://www.mysql.com/. [76] NASA Johnson Space Center. CLIPS: A Tool for Building Expert Systems, Accessed January 2008. http://clipsrules.sourceforge.net. [77] J. Nielsen and R. Molic. Heuristic evaluation of user interfaces. In Proceedings of the SIGCHI conference on Human factors in computing systems: Empowering people, pages 249–256, New York, NY, USA, 1990. ACM Press. [78] T. E. Nordlander. Constraint Relaxation Techniques & Knowledge Base Reuse. PhD thesis, Department of Computing Science, University of Aberdeen, 2004.
BIBLIOGRAPHY
218
[79] N. Noy and M. Musen. SMART: Automated Support for Ontology Merging and Alignment. In Proceedings of the 12th Workshop on Knowledge Acquisition, Modelling and Management (KAW’99), October 1999. [80] N. Noy and M. Musen. PROMPTDIFF: A Fixed-Point Algorithm for Comparing Ontology Versions. In Proceedings of the 18th National Conference on Artificial Intelligence (AAAI’02), August 2002. [81] N. Noy and M. A. Musen. PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment. In Proceedings of the 17th National Conference on Artificial Intelligence (AAAI’00), July 2000. [82] M. J. O’Connor, H. Knublauch, S. W. Tu, B. Grossof, M. Dean, W. E. Grosso, and M. A. Musen. Supporting Rule System Interoperability on the Semantic Web with SWRL. In Proceedings of the Fourth International Semantic Web Conference (ISWC2005), pages 974– 986. Springer, November 6–10 2005. [83] B. Omelayenko, M. Crubezy, D. Fensel, R. Benjamins, B. Wielinga, E. Motta, M. Musen, and Y. Ding. UPML: The Lanugage and Tool Support for Making the Semantic Web Alive, chapter 5, pages 141–170. Spinning the Semantic Web. The MIT Press, 2003. [84] J. Z. Pan, E. J. Thomas, and D. H. Sleeman. ONTOSEARCH2: Searching and Querying Web Ontologies. In IADIS International Conference WWW/Internet 2006 (University of Murcia), pages 211–219, 2006. [85] J. Park and M. Musen. VM-in-Prot´eg´e: A study of software reuse. In Proceedings of the Ninth World Congress on Medical Informatics (MEDINFO’98). IOS Press, 1998. [86] J. Y. Park. The Virtual Knowledge Constructor: A Schema for Mapping Across Ontologies in Knowledge-Based Systems. PhD Thesis Proposal. Stanford University, 1999. [87] J. Y. Park, J. Gennari, and M. Musen. Mappings for Reuse in Knowledge-Based Systems. In 11th Workshop on Knowledge Acquisition, Modelling and Management KAW 98, 1998. [88] P. F. Patel-Schneider, I. Patrick Hayes, and E. Ian Horrocks. OWL Web Ontology Language Semantics and Abstract Syntax. W3C Recommendation, February 2004. 
http://www. w3.org/TR/owl-semantics/ Accessed January 2008. [89] D. Pierotti. ration, 2000.
Heuristic Evaluation - A System Checklist.
Xerox Corpo-
http://www.stcsig.org/usability/topics/articles/
he-checklist.html Accessed January 2008. [90] E. Prud’hommeaux and E. Andy Seaborne. SPARQL Query Language for RDF. W3C Recommendation, January 2008. http://www.w3.org/TR/rdf-sparql-query/ Accessed Janurary 2008. [91] E. Rahm and P. Bernstein. A Survey of Approaches to Automatic Schema Matching. VLDB Journal: Very Large Data Bases, 10(4):334–350, 2001. [92] S. Ramachandran and J. Blythe. Knowledge Acquisition using an English-Based Method Editor. In Proceedings of the Tenth Banff Knowledge Acquisition for Knowledge-Based Systems Workshop (KAW-99), 1999. [93] T. M. H. Reenskaug. MVC XEROX PARC 1978-79, Accessed January 2008. http: //heim.ifi.uio.no/˜trygver/themes/mvc/mvc-index.html.
BIBLIOGRAPHY
219
[94] M. Rosenthal. Computer Repair with Diagnostic Flowcharts: Troubleshooting PC Hardware Problems from Boot Failure to Poor Performance. Foner Books, 2003. [95] T. Rothenfluh, J. Gennari, H. Eriksson, A. Puerta, S. Tu, and M. Musen. Reusable Ontolo´ E-II ´ Solutions to gies, Knowledge-Acquisition Tools, and Performance Systems: PROTEG Sisyphus-2. International Journal of Human-Computer Studies, 44:303–332, 1996. [96] T. J. Runcie. Reuse of Knowledge Bases & Problem Solvers Explored in the VT Domain. PhD thesis, Department of Computing Science, University of Aberdeen, 2008. [97] A. T. Schreiber and W. P. Birmingham. The Sisyphus-VT Initiative. International Journal of Human-Computer Studies, 44(3–4):275–280, March 1996. [98] G. Schreiber, R. de Hoog, H. Akkermans, A. Anjewierden, N. Shadbolt, and W. V. de Velde. Knowledge Engineering and Management: the CommonKADS methodology. The MIT Presss, December 1999. [99] G. Schreiber, B. Wielinga, R. de Hoog, H. Akkermanns, and W. V. de Velde. CommonKADS: A Comprehensive Methodology for KBS Development. IEEE Expert, 9(6):28– 37, 1994. [100] A. Seaborne. RDQL: A Query Language for RDF. W3C Member Submission, January 2004. http://www.w3.org/Submission/2004/SUBM-RDQL-20040109/. [101] M. Sintek.
OKBC Tab Website, Accessed January 2008.
http://protege.
stanford.edu/plugins/okbctab/okbc_tab.html. [102] D. H. Sleeman and Q. H. Reul. CleanONTO: Evaluating Taxonomic Relationships in Ontologies. In D. Vrandecic, M. C. Surez-Figueroa, A. Gangemi, and Y. Sure, editors, Proceedings of 4th International EON Workshop on Evaluation of Ontologies for the Web, 2006. [103] D. Smith, J. Park, and M. Musen.
Therapy Planning as Constraint Satisfaction: A
Computer-Based Antiretroviral Therapy Advisor for the Management of HIV. In Proceedings of the AMIA Fall Symposium, American Medical Informatics Association, 1998. [104] Stanford Medical Informatics, Stanford University. Prot´eg´e Website, Accessed January 2008. http://protege.stanford.edu. [105] G. Stumme and A. Maedche. Ontology Merging for Federated Ontologies on the Semantic Web. In Proceedings of the International Workshop for Foundations of Models for Information Integration (FMII-2001), September 2001. [106] Sun Microsystems. The Java EE 5 Tutorial, October 2008. http://java.sun.com/ javaee/5/docs/tutorial/doc/index.html Accessed December 2008. [107] Sun Microsystems. Java EE at a Glance, Accessed December 2008. http://java. sun.com/javaee/. [108] M. Tallis. A Script-Based Approach to Modifying Knoweldge-Based Systems. PhD thesis, Faculty of the Graduate School, University of Southern California, December 2000. [109] M. Tallis and Y. Gil. A Script-Based Approach to Modifying Knowledge-Based Systems. In Proceedsing of the Fourteenth National Conference on Artificial Intelligent, AAAI-97, July 27-31 1997. [110] The Rule Markup Initiative. RuleML Homepage, Accessed January 2008. http://www. ruleml.org.
BIBLIOGRAPHY
220
[111] E. Thomas, Y. Zhang, D. Sleeman, A. D. Preece, C. McKenzie, and J. Wright. OntoSearch: Retrieval and Reuse of Ontologies. In J. Rogers and A. Rector, editors, Proceedings AIME05 workshop on Biomedical Ontology Engineering. AIME Directorate, 2005. [112] Thomas Vander Wal.
Folksonomy, February 2007.
http://vanderwal.net/
folksonomy.html Accessed December 2008. [113] S. Tu and M. Musen. Episodic Refinement of Episodic Skeletal Plan Refinement. International Journal of Human–Computer Studies, 48:475–497, 1992. [114] A. Valente, J. Breuker, and W. V. de Velde. The CommonKADS library in perspective. International Journal of Human-Computer Studies, 49(4):391–416, October 1998. [115] W. van Melle, E. H. Shortliffe, and B. G. Buchanan. EMYCIN: A Knowledge Engineer’s Tool for Constructing Rule-Based Expert Systems. In Buchanan and Shortliffe [11], chapter 15, pages 302–313. [116] W3C. Web Ontology Language (OWL). W3C Recommendation, 2004. http://www. w3.org/2004/OWL/. [117] W3C. Resource Description Framework (RDF). W3C Recommendation, Accessed January 2008. http://www.w3.org/RDF/. [118] S. White. Enhancing Knowledge Acquisition with Constraint Technology. PhD thesis, Department of Computing Science, University of Aberdeen, 2000. [119] S. White and D. Sleeman. A Constraint-Based Approach to the Description & Detection of Fitness-for-Purpose. Electronic Transactions on Artificial Intelligence, 4:155–183, 2002. [120] T. White. Can’t Beat Jazzy: Introducing the Java Platform’s Jazzy New Spell Checker API, Accessed January 2008. http://www-106.ibm.com/developerworks/java/ library/j-jazzy/. [121] Wikipedia. Tag cloud, Accessed December 2008. http://wn.wikipedia.org/ wiki/Tag_cloud. [122] Yahoo! Developer Network. Yahoo! Search Web Services, Accessed January 2008. http: //developer.yahoo.com/search/ Accessed January 2008. [123] G. R. Yost.
Configuring elevator systems.
ftp://ftp-smi.stanford.edu/pub/protege/S2-
WFW.ZIP, February 1994. Accessed April 2006.
221
Appendix A
Example Application of Mapping Algorithms

A.1 Introduction
Section 5.3 describes algorithm 2, the algorithm used by the mapping tool to map data from one ontology to another; it also describes algorithm 3, a suggested algorithm for performing certain types of mapping. This appendix provides a detailed walk-through of the execution of these two algorithms when mapping data from a source ontology to a target ontology using a given set of mappings.
A.2 The Mapping Components

A.2.1 The Source Ontology
Figure A.1 shows the fragments of the ontology that will be used as the source ontology in this example. The mappings used in this example reference six classes in this ontology, two object properties, six datatype properties, and seven individuals. To improve the readability of the example, all the components in the source ontology have been given abstract names: class names have the form "SCX" ("Source Class X"), where X is a number; property names have the form "spX" ("source property X"); and individual names have the form "siX-Y" ("source individual X-Y"), where X is the number of the class the individual is an instance of and Y is a unique identifier for that individual (for example, si1-1 is the first individual of type SC1, si1-2 is the second individual of type SC1, and so on).
A.2.2 The Target Ontology
Figure A.2 shows the fragments of the ontology that will be used as the target ontology in this example. The mappings used in this example reference seven classes in this ontology, two object properties, and five datatype properties. As with the source ontology, all the components in the target ontology have been given abstract names to improve readability: class names have the form "TCX" ("Target Class X"), where X is a number; property names have the form "tpX" ("target property X"); and individual names (individuals are created during the mapping process) have the form "tiX-Y" ("target individual X-Y"), where X is the number of the class the individual is an instance of and Y is a unique identifier for that individual (for example, ti1-1 is the first individual of type TC1, ti1-2 is the second individual of type TC1, and so on).
A.2.3 Mappings
Figure A.3 describes the set of mappings that will be applied to the source ontology described above. Six translation mappings are defined between various properties in the source ontology
and properties in the target ontology; when these mappings are applied, all the individuals of type SC1, SC2, and SC3 will, if appropriate, be mapped into corresponding individuals in the target ontology. To improve the readability of the walk-through, translation mappings have been defined between properties with the same identifying number. A further four direct mappings are also defined, each of which creates a new concept in the target ontology based on a concept in the source ontology.

Class(so:SC1 complete)
Class(so:SC2 complete)
Class(so:SC3 complete)
Class(so:SC4 complete)
Class(so:SC5 complete)
Class(so:SC6 complete)
ObjectProperty(so:sp1 domain(so:SC1) range(so:SC1))
ObjectProperty(so:sp2 domain(so:SC2) range(so:SC3))
DatatypeProperty(so:sp3 domain(so:SC1) range(xsd:string))
DatatypeProperty(so:sp4 domain(so:SC2) range(xsd:string))
DatatypeProperty(so:sp5 domain(so:SC2) range(xsd:string))
DatatypeProperty(so:sp6 domain(so:SC3) range(xsd:string))
DatatypeProperty(so:sp7 domain(so:SC6) range(xsd:int))
DatatypeProperty(so:sp8 domain(so:SC6) range(xsd:int))
Individual(so:si1-1 type(so:SC1) value(so:sp1 si1-2) value(so:sp2 si3-1) value(so:sp3 "previous"))
Individual(so:si1-2 type(so:SC1) value(so:sp1 si1-1) value(so:sp1 si1-3) value(so:sp2 si3-2) value(so:sp3 "hello") value(so:sp3 "world"))
Individual(so:si1-3 type(so:SC1) value(so:sp1 si1-2) value(so:sp3 "descriptive text"))
Individual(so:si2-1 type(so:SC2) value(so:sp4 "for") value(so:sp5 "until"))
Individual(so:si2-2 type(so:SC2) value(so:sp4 "while") value(so:sp5 "repeat"))
Individual(so:si3-1 type(so:SC3) value(so:sp6 "document"))
Individual(so:si3-2 type(so:SC3) value(so:sp6 "paper") value(so:sp6 "canvas"))

Figure A.1: Relevant fragments of the source ontology.
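For experimentation, the source ontology fragment of Figure A.1 can be rendered as simple in-memory structures. The following sketch is purely illustrative: the thesis works with OWL ontologies in Protégé, and the dict-based encoding and variable names below are assumptions, not part of MAKTab.

```python
# Hypothetical in-memory encoding of the Figure A.1 fragment.
# "string"/"int" mark datatype ranges; other ranges are classes.
source_properties = {
    "sp1": ("SC1", "SC1"),
    "sp2": ("SC2", "SC3"),
    "sp3": ("SC1", "string"),
    "sp4": ("SC2", "string"),
    "sp5": ("SC2", "string"),
    "sp6": ("SC3", "string"),
    "sp7": ("SC6", "int"),
    "sp8": ("SC6", "int"),
}

# individual name -> (type, {property: [values]})
source_individuals = {
    "si1-1": ("SC1", {"sp1": ["si1-2"], "sp2": ["si3-1"], "sp3": ["previous"]}),
    "si1-2": ("SC1", {"sp1": ["si1-1", "si1-3"], "sp2": ["si3-2"],
                      "sp3": ["hello", "world"]}),
    "si1-3": ("SC1", {"sp1": ["si1-2"], "sp3": ["descriptive text"]}),
    "si2-1": ("SC2", {"sp4": ["for"], "sp5": ["until"]}),
    "si2-2": ("SC2", {"sp4": ["while"], "sp5": ["repeat"]}),
    "si3-1": ("SC3", {"sp6": ["document"]}),
    "si3-2": ("SC3", {"sp6": ["paper", "canvas"]}),
}

# For example, the individuals of type SC1 that the walk-through maps first:
sc1_individuals = sorted(n for n, (t, _) in source_individuals.items() if t == "SC1")
print(sc1_individuals)  # ['si1-1', 'si1-2', 'si1-3']
```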
A.3 Walk-Through
Algorithm 2 receives the following inputs:

• ed - an empty list which will be used for storing error messages,
• w - an empty list which will be used for storing warning messages,
• so - the source ontology defined in figure A.1,
• mappings - the list of mappings defined in figure A.3,
• T1 - a table of previously mapped concepts,
• T2 - a table of incomplete mappings.
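The control flow these inputs feed can be paraphrased in a short sketch: direct creation mappings are applied first, then translation mappings are applied class by class. This is a hypothetical Python rendering of the behaviour described in the text, not the thesis's actual algorithm 2; the dict encoding of mappings is an assumption.

```python
# Illustrative paraphrase of algorithm 2's top-level structure.
def run_mapping(ed, w, so, mappings, T1, T2):
    direct = [m for m in mappings if m["kind"] == "direct"]
    translation = [m for m in mappings if m["kind"] == "translation"]

    # First loop: apply every direct creation mapping (here just recorded).
    additions = [("direct", m["name"]) for m in direct]

    # Then group the translation mappings by their source class, so each
    # class's individuals can be mapped with all relevant mappings at once.
    by_class = {}
    for m in translation:
        by_class.setdefault(m["source_class"], []).append(m["name"])
    return additions, by_class

mappings = (
    [{"kind": "direct", "name": f"DM{i}"} for i in range(1, 5)]
    + [{"kind": "translation", "name": "TM1", "source_class": "SC1"},
       {"kind": "translation", "name": "TM2", "source_class": "SC1"},
       {"kind": "translation", "name": "TM3", "source_class": "SC1"},
       {"kind": "translation", "name": "TM4", "source_class": "SC2"},
       {"kind": "translation", "name": "TM5", "source_class": "SC2"},
       {"kind": "translation", "name": "TM6", "source_class": "SC3"}]
)
additions, by_class = run_mapping([], [], None, mappings, {}, {})
print(by_class["SC1"])  # ['TM1', 'TM2', 'TM3']
```

Grouping by source class mirrors the walk-through below, where SC1's individuals are processed with TM1–TM3 before the algorithm moves on to SC2 and SC3.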
Class(to:TC1 complete)
Class(to:TC2 complete)
Class(to:TC3 complete restriction(to:tp6 minCardinality(2)))
Class(to:TC4 complete)
Class(to:TC5 complete)
Class(to:TC6 complete)
Class(to:TC7 complete)
ObjectProperty(to:tp1 domain(to:TC1))
ObjectProperty(to:tp2 domain(to:TC2))
DatatypeProperty(to:tp3 domain(to:TC1) range(xsd:string))
DatatypeProperty(to:tp4 domain(to:TC2) range(xsd:int))
DatatypeProperty(to:tp5 domain(to:TC2) range(xsd:string))
DatatypeProperty(to:tp6 domain(to:TC3) range(xsd:string))
DatatypeProperty(to:sp7 domain(to:TC6) range(xsd:int))

Figure A.2: Relevant fragments of the target ontology before the mapping process.

Translation Mappings
TM1 - so:sp1 in domain so:SC1 maps to to:tp1 in domain to:TC1
TM2 - so:sp2 in domain so:SC1 maps to to:tp2 in domain to:TC1
TM3 - so:sp3 in domain so:SC1 maps to to:tp3 in domain to:TC1
TM4 - so:sp4 in domain so:SC2 maps to to:tp4 in domain to:TC2
TM5 - so:sp5 in domain so:SC2 maps to to:tp5 in domain to:TC2
TM6 - so:sp6 in domain so:SC3 maps to to:tp6 in domain to:TC3

Direct Mappings
DM1 - map so:SC4 to an individual of type to:TC4
DM2 - copy so:SC5 to a subclass of to:TC5
DM3 - copy so:sp7 in domain so:SC6 to a property in the domain to:TC6
DM4 - map so:sp8 in domain so:SC6 to an individual of type to:TC7

Figure A.3: The mappings that will be applied between the source and target ontologies.

The for loop on the first line of the algorithm applies all the direct creation mappings that were passed to this algorithm. In this case, mappings DM1, DM2, DM3, and DM4 will be applied; the resultant additions to the target ontology are described in figure A.4. After the direct creation mappings have been applied, the algorithm proceeds to apply all of the translation mappings. A step-by-step walk-through of this process, describing the variable values at each step, is provided in tables A.1 through A.12.
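As an illustration of how the four direct creation mappings might produce the additions shown in figure A.4, the following hypothetical sketch encodes each mapping as a tuple and emits the corresponding target-ontology statement. The apply_direct function, the mapping-kind names, and the output strings are illustrative assumptions, not the tool's actual implementation; the domain used for sp7 follows figure A.4.

```python
# Illustrative application of the four direct creation mappings (DM1-DM4).
def apply_direct(mapping):
    kind, source, target = mapping
    if kind == "map-class-to-individual":     # DM1: class SC4 -> individual of TC4
        return f"Individual(to:{source} type(to:{target}))"
    if kind == "copy-class-to-subclass":      # DM2: class SC5 -> subclass of TC5
        return f"Class(to:{source} partial to:{target})"
    if kind == "copy-property":               # DM3: property sp7 copied across
        return f"DatatypeProperty(to:{source} domain(to:{target}) range(xsd:int))"
    if kind == "map-property-to-individual":  # DM4: property sp8 -> individual of TC7
        return f"Individual(to:{source} type(to:{target}))"
    raise ValueError(f"unknown direct mapping kind: {kind}")

direct_mappings = [
    ("map-class-to-individual", "SC4", "TC4"),
    ("copy-class-to-subclass", "SC5", "TC5"),
    ("copy-property", "sp7", "TC7"),
    ("map-property-to-individual", "sp8", "TC7"),
]
for m in direct_mappings:
    print(apply_direct(m))
```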
Each table provides the step number, the line number that has just been executed, and a description of the value of each variable that has been defined at that line number. With the exception of the final two columns (labelled “T1” and “T2”), if a cell does not have a value, then the variable is not within the scope of the code executing at that line number. When either column describing T1 or T2 is empty, this means the relevant table (T1 or T2) is empty. Square brackets around a comma-separated list of values indicate that the value of the variable is a list with the specified contents (an empty list is represented by “[ ]”). Where the value of a variable is an instance (of a Java class), the values of the properties are described in a semi-colon separated list. The column labelled “ti” describes an
Class(to:SC5 partial to:TC5)
DatatypeProperty(to:sp7 domain(to:TC7) range(xsd:int))
Individual(to:SC4 type(to:TC4))
Individual(to:sp8 type(to:TC7))

Figure A.4: Additions to the target ontology after the direct creation mappings have been applied.
individual in the target ontology; to maintain the table layout the individual is described using a frame-like notation: "(<name> (<property> <values>) )", where <name> is the name of the individual, <property> is the name of a property, and <values> are the values of that property (any number of property values may be described). Each row of table T1 (displayed in the column labelled "T1") is written using a <source>:<target> notation, where <source> is the name of a concept in the source ontology and <target> is the name of the concept created in the target ontology based on <source> (for example, SI1-1:TI1-1 should be read as "SI1-1 was mapped to TI1-1"); rows are separated by semi-colons. Each row of table T2 (displayed in the column labelled "T2") is written using a <source>:<target>:<property> notation, where <source> is the name of a concept in the source ontology, <target> is the name of an individual in the target ontology, and <property> is the name of a property of <target> whose value should be set to the mapped value of <source> once <source> has been mapped (for example, SI1-2:TI1-1:tp1 should be read as "when SI1-2 is mapped, the mapped value should be added to the value of tp1 for TI1-1"). Also note that, to improve the layout, the namespaces ("so:" for the source ontology and "to:" for the target ontology) have been omitted from concept names. For example, at step 1, line 4 has just been executed. At this point sc is set to SC1, tm is a list consisting of TM1, TM2, and TM3, ltc is set to TC1, ed and w are empty lists, and T1 and T2 are both empty; the variables tc, si, error, bi, m, results, and ti have not yet been defined. Similarly, at step 7, line 19 has just been executed: sc is set to SC1, tm is a list consisting of TM1, TM2, and TM3, ltc is set to TC1, tc is set to TC1, si is set to SI1-1, error is set to false, bi is set to a new builder instance with the property newInstanceName set to TI1-1 and tp1 set to an empty list, m is set to TM1, results is an empty list, as are ed and w, ti has not yet been defined, T1 is still empty, and T2 has one row, SI1-2:TI1-1:tp1.
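The roles of T1 and T2 can be illustrated with a small sketch. Here T1 and T2 are rendered as a hypothetical Python dict and list, and the record_mapping helper, which resolves queued forward references once a source individual has itself been mapped, is an assumption rather than the tool's actual bookkeeping code.

```python
# Hypothetical rendering of the two bookkeeping tables.
T1 = {}   # completed mappings: source concept -> target concept
T2 = []   # incomplete mappings: (source, target individual, target property)

def record_mapping(source, target):
    """Complete a mapping and resolve any T2 rows waiting on it."""
    T1[source] = target
    resolved = [(t, p, target) for (s, t, p) in T2 if s == source]
    T2[:] = [row for row in T2 if row[0] != source]
    return resolved   # (target individual, property, value to add)

# While mapping SI1-1, its sp1 value SI1-2 is not yet mapped, so the
# reference is queued in T2; it is resolved when SI1-2 is mapped.
T2.append(("SI1-2", "TI1-1", "tp1"))
record_mapping("SI1-1", "TI1-1")
print(record_mapping("SI1-2", "TI1-2"))  # [('TI1-1', 'tp1', 'TI1-2')]
```

This deferred resolution is why, in the tables, T2 shrinks as the walk-through proceeds: each row disappears once the source individual it waits on has been mapped.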
Additionally, consider step 29, where ti has the value (TI1-2 (tp1 TI1-1) (tp2 ) (tp3 "hello" "world")), which describes an individual called TI1-2 that has the value TI1-1 for the property tp1, no value for property tp2, and the strings "hello" and "world" as the values of property tp3. Briefly, steps 1 to 49 map the individuals associated with SC1 to the target ontology (as individuals of TC1) using the translation mappings TM1, TM2, and TM3. Steps 2 to 15 map SI1-1 to TI1-1 by applying TM1 (steps 5 to 7), TM2 (steps 8 to 10) and TM3 (steps 11 to 15); steps 16 to 32 map SI1-2 to TI1-2 by applying TM1 (steps 20 to 22), TM2 (steps 23 to 25) and TM3 (steps 26 to 28); finally, steps 33 to 49 map SI1-3 to TI1-3 by applying TM1 (steps 37 to 39), TM2 (steps 40 to 42) and TM3 (steps 43 to 45). The algorithm then moves on to the next source class
with mappings associated with it, SC2, in steps 50 to 72, where it attempts to map the individuals associated with SC2 to the target ontology by applying mappings TM4 and TM5 to SI2-1 (steps 53 to 62) and SI2-2 (steps 63 to 72). An error is thrown when attempting to apply TM4, as sp4 (the source property of the mapping) has range string, which is defined to be incompatible with the range of tp4 (the target property of the mapping), which is integer. Each error is caught (at steps 59 and 69) and the error message is added to ed (the error messages "ERR-MSG-1" and "ERR-MSG-2" are used to maintain the table layout; in reality a more useful explanation would be returned). The algorithm then moves on to the next source class with mappings associated with it, SC3, in steps 73 to 99, where it uses mapping TM6 to map SI3-1 (steps 76 to 88) and SI3-2 (steps 89 to 99) to the target ontology as individuals TI3-1 and TI3-2. When applying TM6 to SI3-1, a warning is thrown (step 81) because SI3-1 has only one value for the sp6 property, whereas the target property (tp6) has a minimum cardinality of two; although the mapping can be applied, the resultant individual (TI3-1) is inconsistent with the class definition of TC3. Therefore, the warning message "Warning Msg 1" (as with the error messages, this abbreviated message is used to preserve the table layout, and a more useful message would be returned in reality) is added to w (step 83) and the mapped value ("document") is added to results.
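The two failure modes described above, the range clash for TM4 and the cardinality warning for TM6, can be sketched as a single check. The check_translation function and its message texts are illustrative assumptions; the actual tool records fuller explanations in ed and w.

```python
# Illustrative check distinguishing errors (incompatible ranges, nothing
# mapped) from warnings (too few values for a minimum-cardinality
# restriction, but the values are still mapped).
def check_translation(values, source_range, target_range, min_card=None):
    errors, warnings = [], []
    if source_range != target_range:
        errors.append(f"range {source_range} is incompatible with {target_range}")
        return [], errors, warnings
    if min_card is not None and len(values) < min_card:
        warnings.append(f"{len(values)} value(s) given; minimum cardinality is {min_card}")
    return list(values), errors, warnings

# TM4 on SI2-1: sp4 (string) -> tp4 (int) raises an error; nothing is mapped.
res, ed, w = check_translation(["for"], "string", "int")
print(res, ed)

# TM6 on SI3-1: one sp6 value where tp6 has minCardinality(2) only warns;
# the value "document" is still mapped into results.
res, ed, w = check_translation(["document"], "string", "string", min_card=2)
print(res, w)
```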
Table A.1: Walk-through of the mapping algorithm for applying the translation mappings (1).

Table A.2: Walk-through of the mapping algorithm for applying the translation mappings (2).

Table A.3: Walk-through of the mapping algorithm for applying the translation mappings (3).

Table A.4: Walk-through of the mapping algorithm for applying the translation mappings (4).
Table A.4: Walk-through of the mapping algorithm for applying the translation mappings (4).
Table A.5: Walk-through of the mapping algorithm for applying the translation mappings (5).
Table A.6: Walk-through of the mapping algorithm for applying the translation mappings (6).
Table A.7: Walk-through of the mapping algorithm for applying the translation mappings (7).
Table A.8: Walk-through of the mapping algorithm for applying the translation mappings (8).
Table A.9: Walk-through of the mapping algorithm for applying the translation mappings (9).
Table A.10: Walk-through of the mapping algorithm for applying the translation mappings (10).
Table A.11: Walk-through of the mapping algorithm for applying the translation mappings (11).
Table A.12: Walk-through of the mapping algorithm for applying the translation mappings (12).
Table A.13: Applying mapping TM1 to SI1-1; returns [].

Target Ontology after Mapping
Class(to:TC1 complete)
Class(to:TC2 complete)
Class(to:TC3 complete restriction(to:tp6 minCardinality(2)))
Class(to:TC4 complete)
Class(to:TC5 complete)
Class(to:TC6 complete)
Class(to:TC7 complete)
Class(to:SC5 partial to:TC5)
ObjectProperty(to:tp1 domain(to:TC1))
ObjectProperty(to:tp2 domain(to:TC2))
DatatypeProperty(to:tp3 domain(to:TC1) range(xsd:string))
DatatypeProperty(to:tp4 domain(to:TC2) range(xsd:string))
DatatypeProperty(to:tp5 domain(to:TC2) range(xsd:string))
DatatypeProperty(to:tp6 domain(to:TC3) range(xsd:string))
DatatypeProperty(to:sp7 domain(to:TC7) range(xsd:int))
Individual(to:ti1-1 type(to:TC1) value(to:tp1 ti1-2) value(to:tp2 ti3-1) value(to:tp3 “previous”))
Individual(to:ti1-2 type(to:TC1) value(to:tp1 ti1-1) value(to:tp1 ti1-3) value(to:tp2 ti3-2) value(to:tp3 “hello”) value(to:tp3 “world”))
Individual(to:ti1-3 type(to:TC1) value(to:tp1 ti1-2) value(to:tp3 “descriptive text”))
Individual(to:ti3-1 type(to:TC3) value(to:tp6 “document”))
Individual(to:ti3-2 type(to:TC3) value(to:tp6 “paper”) value(to:tp6 “canvas”))
Individual(to:sc4 type(to:TC4))
Individual(to:sp8 type(to:TC7))

Figure A.5: The target ontology after all the mappings have been applied.

Figure A.5 shows the target ontology after all of the mappings have been applied. At this point, control flow reaches line 29 of algorithm 2, where the algorithm attempts to complete any incomplete mappings recorded in T2, using the mapped values in T1 to finish the definitions of those individuals in the target ontology that could not be completed when they were created and were not completed by later mappings. Following this, line 35 displays the errors to the user, which in this case are “ERR-MSG-1” and “ERR-MSG-2”; line 37 then displays the warnings to the user, which in this case is “Warning Msg 1”. The mapping application algorithm then terminates.
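The completion step at line 29 can be sketched as follows. This is a simplified illustration under stated assumptions: the name complete_mappings, the colon-separated entry format, and the nested-dictionary target structure are hypothetical, not the thesis's actual code. Each deferred T2 entry records which source individual's value could not yet be translated, and for which target individual and property; once T1 holds the final source-to-target map, the deferred values can be filled in.

```python
def complete_mappings(t1, t2, target):
    """Resolve deferred T2 entries of the (assumed) form
    "sourceIndividual:targetIndividual:targetProperty" using the
    completed source-to-target individual map t1."""
    remaining = []
    for entry in t2:
        source, individual, prop = entry.split(":")
        if source in t1:
            # the source individual has since been mapped: add its target
            # name as a value of the property that was waiting for it
            target.setdefault(individual, {}).setdefault(prop, []).append(t1[source])
        else:
            remaining.append(entry)  # still unresolvable: reported to the user
    return remaining

# For example, the deferred entry SI3-1:TI1-1:tp2 completes to ti1-1
# holding ti3-1 as a tp2 value once SI3-1 has been mapped:
t1 = {"SI3-1": "TI3-1"}
target = {}
left = complete_mappings(t1, ["SI3-1:TI1-1:tp2"], target)
print(target)  # {'TI1-1': {'tp2': ['TI3-1']}}
print(left)    # []
```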
At various steps, algorithm 3 is used to apply a mapping to the property values of an individual (for example, at step 6 when TM1 is applied to the value of sp1 for SI1-1, at step 9 when TM2 is applied to the value of sp2 for SI1-1, and at step 12 when TM3 is applied to the value of sp3 for SI1-1). Tables A.13 to A.25 provide walk-throughs for the various invocations of algorithm 3.
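The per-value behaviour these walk-throughs trace can be illustrated with a short sketch. Everything below is illustrative only: the names apply_mapping, t1, and t2, and the use of a naming convention to distinguish individuals from literals, are assumptions rather than the thesis's implementation. It follows the pattern the tables show: a value that is an already-mapped source individual is translated via T1, a value whose target is not yet known is recorded in T2 for later completion, and literal values are copied through unchanged.

```python
def apply_mapping(values, t1, t2, target_individual, target_prop):
    """Map one source property's values onto a target property.

    values: the source individual's property values (individuals or literals)
    t1: dict of source individual -> target individual mapped so far
    t2: list of deferred entries "source:targetIndividual:targetProperty"
    """
    mapped = []
    for v in values:
        if v.startswith("SI"):        # a source individual (this prefix test
            if v in t1:               # is a convention for this sketch only)
                mapped.append(t1[v])  # already mapped: use its target name
            else:
                # not yet mapped: defer, to be completed later from T1
                t2.append(f"{v}:{target_individual}:{target_prop}")
        else:
            mapped.append(v)          # a literal value: copied through
    return mapped

# Mirrors the pattern of Table A.16 (TM1 applied to SI1-2): SI1-1 is
# already mapped to TI1-1, while SI1-3 is deferred into T2.
t1 = {"SI1-1": "TI1-1"}
t2 = []
result = apply_mapping(["SI1-1", "SI1-3"], t1, t2, "TI1-2", "tp1")
print(result)  # ['TI1-1']
print(t2)      # ['SI1-3:TI1-2:tp1']
```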
Table A.14: Applying mapping TM2 to SI1-1; returns [].
Table A.15: Applying mapping TM3 to SI1-1; returns [“previous”].
Table A.16: Applying mapping TM1 to SI1-2; returns [TI1-1].
Table A.17: Applying mapping TM2 to SI1-2; returns [].
Table A.18: Applying mapping TM3 to SI1-2; returns [“hello”, “world”].
Table A.19: Applying mapping TM1 to SI1-3; returns [TI1-2].
Table A.20: Applying mapping TM2 to SI1-3; returns [].
Table A.21: Applying mapping TM3 to SI1-3; returns [“class”].
Table A.22: Applying mapping TM4 to SI2-1; an error is thrown at line 2 as the properties are not compatible.
Table A.23: Applying mapping TM4 to SI2-2; an error is thrown at line 2 as the properties are not compatible.
Table A.24: Applying mapping TM6 to SI3-1; returns [“document”].
Table A.25: Applying mapping TM6 to SI3-2; returns [“paper”, “canvas”].
Appendix B
Example Application of the KA Algorithms

Section 5.4 describes the three KA-related algorithms (algorithms 4, 5, and 6), which work together to guide the user through the KA process. This appendix provides a detailed walk-through illustrating the application of these algorithms when performing KA for an example problem solver.
B.1 Example PS Ontology
Figure B.1 provides a visualisation of the parts of a PS ontology relevant to the KA process (the different rule types and their associated meta-data individuals) that are used in this example.
B.2 Walk-Through
Consider the situation where the user has loaded the above PS ontology into the KA tool, selected a SystemComponent individual (called concept), and clicked on the start KA button. Algorithm 5 is the first to be executed, with concept being passed as input. The first step of the algorithm is to call algorithm 4 to build the list of KA graphs. Algorithm 4 proceeds as follows:

Input
selectedConcept is concept
ir is a list containing RuleType1, RuleType2, and RuleType3
nodes is an empty list which will contain KaGraphNode instances
kaGraphs is an empty list which will contain KaGraphNode instances

Algorithm 4 then proceeds by going through every rule type in ir and building the rule KA graph if appropriate:

Begin Algorithm 4
{Algorithm 4 Step 1 - r is RuleType1}
if RuleType1 is relevant to concept (which it is) then
if nodes (currently empty) contains a node for RuleType1 (which it does not) then
{code not executed}
else
KaGraphNode ruleType1KaNode is set to a new KaGraphNode
ruleType1KaNode.ruleType is set to RuleType1
ruleType1KaNode is added to nodes
buildGraphFor(ruleType1KaNode, concept) is called
ruleType1KaNode is then added to kaGraphs
{End Algorithm 4 Step 1}

The flow of control of buildGraphFor(ruleType1KaNode, concept) is:
Figure B.1: Fragments of a PS ontology relevant to the KA process.
Input
n is ruleType1KaNode, the KaGraphNode that the graph should be built for; ruleType1KaNode.ruleType is RuleType1, and the next rule types for RuleType1 are RuleType4, RuleType5, and RuleType6
selectedConcept is concept
nodes contains the single item ruleType1KaNode

Begin buildGraphFor(ruleType1KaNode, concept)
for every next rule type, nr, for ruleType1KaNode.ruleType (which are RuleType4, RuleType5, and RuleType6) do
{buildGraphFor(ruleType1KaNode, concept) Step 1 - nr is RuleType4}
if RuleType4 is relevant to concept (which it is) then
if nodes contains a node for RuleType4 (which it does not) then
{code not executed}
else
KaGraphNode ruleType4KaNode is set to a new KaGraphNode
ruleType4KaNode.ruleType is set to RuleType4
ruleType4KaNode is added to ruleType1KaNode.nextRuleTypes
ruleType4KaNode is added to nodes
buildGraphFor(ruleType4KaNode, concept) is called
{End buildGraphFor(ruleType1KaNode, concept) Step 1}

The call to buildGraphFor(ruleType4KaNode, concept) then proceeds:

Input
n is ruleType4KaNode, the KaGraphNode that the graph should be built for; ruleType4KaNode.ruleType is RuleType4, and the next rule type for RuleType4 is RuleType5
selectedConcept is concept
nodes contains ruleType1KaNode and ruleType4KaNode

Begin buildGraphFor(ruleType4KaNode, concept)
for every next rule type, nr, for ruleType4KaNode.ruleType (which is RuleType5) do
{buildGraphFor(ruleType4KaNode, concept) Step 1 - nr is RuleType5}
if RuleType5 is relevant to concept (which it is) then
if nodes contains a node for RuleType5 (which it does not) then
{code not executed}
else
KaGraphNode ruleType5KaNode is set to a new KaGraphNode
ruleType5KaNode.ruleType is set to RuleType5
ruleType5KaNode is added to ruleType4KaNode.nextRuleTypes
ruleType5KaNode is added to nodes
buildGraphFor(ruleType5KaNode, concept) is called
{End buildGraphFor(ruleType4KaNode, concept) Step 1}
End buildGraphFor(ruleType4KaNode, concept)

After this call:
• ruleType4KaNode.nextRuleTypes contains ruleType5KaNode
• nodes contains ruleType1KaNode, ruleType4KaNode, ruleType5KaNode, ruleType7KaNode, and ruleType8KaNode
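The recursive construction being traced here - create a node for each relevant next rule type, but reuse a node that already exists in nodes, so that cycles among rule types terminate - can be sketched as follows. This is an illustrative simplification: the names, and the next_of and relevant parameters, are assumptions for the sketch, not MAKTab's actual code.

```python
class KaGraphNode:
    def __init__(self, rule_type):
        self.rule_type = rule_type
        self.next_rule_types = []      # child KaGraphNodes

def build_graph_for(node, concept, nodes, next_of, relevant):
    """Attach a child node for every relevant next rule type,
    reusing existing nodes so cyclic rule-type definitions terminate."""
    for nr in next_of.get(node.rule_type, []):
        if not relevant(nr, concept):
            continue
        existing = next((n for n in nodes if n.rule_type == nr), None)
        if existing is not None:
            node.next_rule_types.append(existing)   # reuse: no recursion
        else:
            child = KaGraphNode(nr)
            nodes.append(child)
            node.next_rule_types.append(child)
            build_graph_for(child, concept, nodes, next_of, relevant)

# The example PS ontology's rule-type graph, including the cycle
# RuleType4 -> RuleType5 -> RuleType7 -> RuleType4:
next_of = {
    "RuleType1": ["RuleType4", "RuleType5", "RuleType6"],
    "RuleType4": ["RuleType5"],
    "RuleType5": ["RuleType7", "RuleType8"],
    "RuleType7": ["RuleType4"],
}
root = KaGraphNode("RuleType1")
nodes = [root]
build_graph_for(root, "concept", nodes, next_of, lambda r, c: True)
print([n.rule_type for n in nodes])
# ['RuleType1', 'RuleType4', 'RuleType5', 'RuleType7', 'RuleType8', 'RuleType6']
```

Note that the node creation order matches the walk-through: RuleType7's recursion ends by reusing the existing RuleType4 node rather than creating a new one.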
The call to buildGraphFor(ruleType5KaNode, concept) then proceeds: Input n is ruleT ype5KaN ode, the KaGraphNode that the graph should be built for, ruleT ype5KaN ode.ruleType is RuleType5, and the next rule types for RuleType5 are RuleType7 and RuleType8 concept is the selected concept nodes contains ruleT ype1KaN ode, ruleT ype4KaN ode, and ruleT ype5KaN ode Begin buildGraphFor(ruleT ype5KaN ode, concept) for every next rule type, nr, for ruleT ype5KaN ode.ruleType (which are RuleType7 and RuleType8) do {buildGraphFor(ruleT ype5KaN ode, concept) Step 1 - nr is RuleType7} if RuleType7 is relevant to concept (which it is) then if nodes contains a node for RuleType7 (which it does not) then {code not executed} else KaGraphNode ruleT ype7KaN ode is set to a new KaGraphNode ruleT ype7KaN ode.ruleType is set to RuleType7 ruleT ype7KaN ode is added to ruleT ype5KaN ode.nextRuleTypes ruleT ype7KaN ode is added to nodes buildGraphFor(ruleT ype7KaN ode, concept) is called {End buildGraphFor(ruleT ype5KaN ode, concept) Step 1} The method call to buildGraphFor(ruleT ype7KaN ode, concept) proceeds: Input n is ruleT ype7KaN ode, the KaGraphNode that the graph should be built for, ruleT ype7KaN ode.ruleType is RuleType7, and the next rule type for RuleType7 is RuleType4 concept is the selected concept nodes contains ruleT ype1KaN ode, ruleT ype4KaN ode, ruleT ype5KaN ode, and ruleT ype7KaN ode Begin buildGraphFor(ruleT ype7KaN ode, concept) for every next rule type, nr, for ruleT ype7KaN ode.ruleType (which is RuleType4) do {buildGraphFor(ruleT ype7KaN ode, concept) Step 1 - nr is RuleType4} if RuleType4 is relevant to concept (which it is) then if nodes contains a node for RuleType4 (which it does) then ruleT ype4KaN ode is retrieved from nodes ruleT ype4KaN ode is added to ruleT ype7KaN ode.nextRuleTypes else {code not executed} {End buildGraphFor(ruleT ype7KaN ode, concept)} The control flow then reverts back to buildGraphFor(ruleType5KaNode, concept), moving on to the next 
iteration: {buildGraphFor(ruleT ype5KaN ode, concept) Step 2 - nr is RuleType8} if RuleType8 is relevant to concept (which it is) then if nodes contains a node for RuleType8 (which it does not) then
B.2. Walk-Through
252
{code not executed}
else
  KaGraphNode ruleType8KaNode is set to a new KaGraphNode
  ruleType8KaNode.ruleType is set to RuleType8
  ruleType8KaNode is added to ruleType5KaNode.nextRuleTypes
  ruleType8KaNode is added to nodes
  buildGraphFor(ruleType8KaNode, concept) is called
{End buildGraphFor(ruleType5KaNode, concept) Step 2 - nr is RuleType8}
End buildGraphFor(ruleType5KaNode, concept)

As RuleType8 has no next rule types, the call buildGraphFor(ruleType8KaNode, concept) does nothing, as its main for loop does not execute. Control therefore reverts to buildGraphFor(ruleType1KaNode, concept), which moves on to the next item in the list:

{buildGraphFor(ruleType1KaNode, concept) Step 2 - nr is RuleType5}
{nodes now contains ruleType1KaNode, ruleType4KaNode, ruleType5KaNode, ruleType7KaNode, and ruleType8KaNode}
if RuleType5 is relevant to concept (which it is) then
  if nodes contains a node for RuleType5 (which it does) then
    ruleType5KaNode is retrieved from nodes
    ruleType5KaNode is added to ruleType1KaNode.nextRuleTypes
  else
    {code not executed}
{End buildGraphFor(ruleType1KaNode, concept) Step 2}

{buildGraphFor(ruleType1KaNode, concept) Step 3 - nr is RuleType6}
if RuleType6 is relevant to concept (which it is) then
  if nodes contains a node for RuleType6 (which it does not) then
    {code not executed}
  else
    KaGraphNode ruleType6KaNode is set to a new KaGraphNode
    ruleType6KaNode.ruleType is set to RuleType6
    ruleType6KaNode is added to ruleType1KaNode.nextRuleTypes
    ruleType6KaNode is added to nodes
    buildGraphFor(ruleType6KaNode, concept) is called
{End buildGraphFor(ruleType1KaNode, concept) Step 3 - nr is RuleType6}
End buildGraphFor(ruleType1KaNode, concept)

ruleType1KaNode is added to kaGraphs
End Algorithm 4 Step 1

When this call to buildGraphFor(ruleType1KaNode, concept) is finished, the variables have the following values:
  ruleType1KaNode.nextRuleTypes is a list containing ruleType4KaNode, ruleType5KaNode, and ruleType6KaNode
  ruleType4KaNode.nextRuleTypes is a list containing ruleType5KaNode
  ruleType5KaNode.nextRuleTypes is a list containing ruleType7KaNode and ruleType8KaNode
  ruleType6KaNode.nextRuleTypes is an empty list
  ruleType7KaNode.nextRuleTypes is a list containing ruleType4KaNode
  ruleType8KaNode.nextRuleTypes is an empty list
  nodes contains ruleType1KaNode, ruleType4KaNode, ruleType5KaNode, ruleType7KaNode, ruleType8KaNode, and ruleType6KaNode

Algorithm 4 continues with the next iteration of the for loop, with r equal to RuleType2:

Algorithm 4 Step 2 - r is RuleType2
{nodes now contains ruleType1KaNode, ruleType4KaNode, ruleType5KaNode, ruleType7KaNode, ruleType8KaNode, and ruleType6KaNode}
if RuleType2 is relevant to concept (which it is not) then
  {code not executed}
End Algorithm 4 Step 2

Algorithm 4 continues with the next iteration of the for loop, with r equal to RuleType3:

Algorithm 4 Step 3 - r is RuleType3
if RuleType3 is relevant to concept (which it is) then
  if nodes contains a node for RuleType3 (which it does not) then
    {code not executed}
  else
    KaGraphNode ruleType3KaNode is set to a new KaGraphNode
    ruleType3KaNode.ruleType is set to RuleType3
    ruleType3KaNode is added to nodes
    buildGraphFor(ruleType3KaNode, concept) is called
    ruleType3KaNode is then added to kaGraphs
End Algorithm 4 Step 3

The control flow for buildGraphFor(ruleType3KaNode, concept) is:
Input
  n is ruleType3KaNode, the KaGraphNode that the graph should be built for; ruleType3KaNode.ruleType is RuleType3, and the next rule type for RuleType3 is RuleType11
  concept is the selected concept
  nodes contains ruleType1KaNode, ruleType4KaNode, ruleType5KaNode, ruleType7KaNode, ruleType8KaNode, ruleType6KaNode, and ruleType3KaNode
Begin buildGraphFor(ruleType3KaNode, concept)
for every next rule type, nr, for ruleType3KaNode.ruleType (which is RuleType11) do
  {buildGraphFor(ruleType3KaNode, concept) Step 1 - nr is RuleType11}
  if RuleType11 is relevant to concept (which it is) then
    if nodes contains a node for RuleType11 (which it does not) then
      {code not executed}
    else
      KaGraphNode ruleType11KaNode is set to a new KaGraphNode
      ruleType11KaNode.ruleType is set to RuleType11
      ruleType11KaNode is added to ruleType3KaNode.nextRuleTypes
      ruleType11KaNode is added to nodes
      buildGraphFor(ruleType11KaNode, concept) is called
  {End buildGraphFor(ruleType3KaNode, concept) Step 1}
End buildGraphFor(ruleType3KaNode, concept)

The control flow for buildGraphFor(ruleType11KaNode, concept) is:
Input
  n is ruleType11KaNode, the KaGraphNode that the graph should be built for; ruleType11KaNode.ruleType is RuleType11, and the next rule type for RuleType11 is RuleType8
  concept is the selected concept
  nodes contains ruleType1KaNode, ruleType4KaNode, ruleType5KaNode, ruleType7KaNode, ruleType8KaNode, ruleType6KaNode, ruleType3KaNode, and ruleType11KaNode
Begin buildGraphFor(ruleType11KaNode, concept)
for every next rule type, nr, for ruleType11KaNode.ruleType (which is RuleType8) do
  {buildGraphFor(ruleType11KaNode, concept) Step 1 - nr is RuleType8}
  if RuleType8 is relevant to concept (which it is) then
    if nodes contains a node for RuleType8 (which it does) then
      ruleType8KaNode is retrieved from nodes
      ruleType8KaNode is added to ruleType11KaNode.nextRuleTypes
    else
      {code not executed}
  {End buildGraphFor(ruleType11KaNode, concept) Step 1}
End buildGraphFor(ruleType11KaNode, concept)

Control then returns to buildGraphFor(ruleType3KaNode, concept), which returns control to algorithm 4 step 3. As there are no more initial rule types, algorithm 4 finishes, returning kaGraphs.
End Algorithm 4
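The recursive graph-building behaviour traced above can be sketched as follows. This is a minimal Python illustration, not MAKTab's actual implementation; the names follow the walk-through's terminology, and the relevance test and next-rule-type table are assumed to be supplied by the PS ontology:

```python
class KaGraphNode:
    """A node in a KA graph, wrapping a rule type (terminology from the walk-through)."""
    def __init__(self, rule_type):
        self.rule_type = rule_type
        self.next_rule_types = []  # child KaGraphNodes


def build_graph_for(n, concept, nodes, next_types_of, is_relevant):
    """Link n to a node for every next rule type of n.rule_type relevant to concept.

    nodes maps rule types to already-created KaGraphNodes: an existing node is
    re-used (and not recursed into again), which is what stops the RuleType7 ->
    RuleType4 cycle in the example from looping forever.
    """
    for nr in next_types_of.get(n.rule_type, []):
        if not is_relevant(nr, concept):
            continue
        if nr in nodes:
            n.next_rule_types.append(nodes[nr])
        else:
            child = KaGraphNode(nr)
            nodes[nr] = child
            n.next_rule_types.append(child)
            build_graph_for(child, concept, nodes, next_types_of, is_relevant)
```

Running this with the example's rule-type relationships (RuleType1 to RuleType4, RuleType5 and RuleType6; RuleType4 to RuleType5; RuleType5 to RuleType7 and RuleType8; RuleType7 back to RuleType4) and every type relevant reproduces the final variable states listed in the trace.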
The final variable states after the algorithm completes are:
•nodes contains ruleType1KaNode, ruleType4KaNode, ruleType5KaNode, ruleType7KaNode, ruleType8KaNode, ruleType6KaNode, ruleType3KaNode, and ruleType11KaNode
•ruleType3KaNode.nextRuleTypes is a list containing ruleType11KaNode
•ruleType11KaNode.nextRuleTypes is a list containing ruleType8KaNode
•kaGraphs contains ruleType1KaNode and ruleType3KaNode

Figure B.2 illustrates the RuleKaNode instances that have been created during the above KA process. The kaGraphs variable is then passed back from algorithm 4 to algorithm 5, which proceeds:

kaGraphs is set to a list containing ruleType1KaNode and ruleType3KaNode, as discussed above
first is set to ruleType1KaNode
Figure B.2: The RuleKaNodes that are created to represent the rules and relationships described in the example PS ontology.
rule1_1 is set to a new (rule) instance of RuleType1 (the value of first.ruleType)
RuleGraphNode rule1_1GraphNode is set to a new RuleGraphNode, with its rule property set to rule1_1 and its typeNode set to ruleType1KaNode
rule1_1 is displayed to the user

After the user has defined the rule rule1_1 and clicks on the define next rule button, algorithm 5 is invoked, this time entering its acquiring-the-next-rule section:

Input
  concept, the selected domain concept
  rule1_1GraphNode, the RuleGraphNode for the rule that the user has just defined
Begin
if the user has selected another concept (which they have not) then
  {code not executed}
KaGraphNode tempTypeNode is set to null
{rule1_1GraphNode.typeNode is ruleType1KaNode, the value of its nextRuleTypes property is the list containing ruleType4KaNode, ruleType5KaNode, and ruleType6KaNode, and so the following is executed}
The user is asked to select the type of the next rule from RuleType4, RuleType5, and RuleType6. Assuming the user selects RuleType4
tempTypeNode is set to ruleType4KaNode
if tempTypeNode is not null (which is true) then
  rule4_1 is set to a new (rule) instance of type RuleType4
  RuleGraphNode rule4_1GraphNode is set to a new RuleGraphNode
  rule4_1GraphNode.rule is set to rule4_1
  rule4_1GraphNode.previousRule is set to rule1_1GraphNode
  rule4_1GraphNode is added to rule1_1GraphNode.nextRules
  rule4_1GraphNode.rule (rule4_1) is displayed to the user
End

Once the user has completed the rule rule4_1 and clicks on the acquire next rule button, algorithm 5 is invoked again:

Input
  concept, the selected domain concept
  rule4_1GraphNode, the RuleGraphNode for the rule that the user has just defined
Begin
if the user has selected another concept (which they have not) then
  {code not executed}
KaGraphNode tempTypeNode is set to null
{rule4_1GraphNode.typeNode is ruleType4KaNode, the value of its nextRuleTypes property is the list containing ruleType5KaNode, and so the following is executed}
tempTypeNode is set to ruleType5KaNode
if tempTypeNode is not null (which is true) then
  rule5_1 is set to a new (rule) instance of type RuleType5
  RuleGraphNode rule5_1GraphNode is set to a new RuleGraphNode
  rule5_1GraphNode.rule is set to rule5_1
  rule5_1GraphNode.previousRule is set to rule4_1GraphNode
  rule5_1GraphNode is added to rule4_1GraphNode.nextRules
  rule5_1GraphNode.rule (rule5_1) is displayed to the user
End

After the user has defined the rule rule5_1 and clicks on the acquire next rule button, algorithm 5 is invoked again:

Input
  concept, the selected domain concept
  rule5_1GraphNode, the RuleGraphNode for the rule that the user has just defined
Begin
if the user has selected another concept (which they have not) then
  {code not executed}
KaGraphNode tempTypeNode is set to null
{rule5_1GraphNode.typeNode is ruleType5KaNode, the value of its nextRuleTypes property is the list containing ruleType7KaNode and ruleType8KaNode, and so the following is executed}
The user is asked to select the type of the next rule from RuleType7 and RuleType8. Assuming the user selects RuleType7
tempTypeNode is set to ruleType7KaNode
if tempTypeNode is not null (which is true) then
  rule7_1 is set to a new (rule) instance of type RuleType7
  RuleGraphNode rule7_1GraphNode is set to a new RuleGraphNode
  rule7_1GraphNode.rule is set to rule7_1
  rule7_1GraphNode.previousRule is set to rule5_1GraphNode
  rule7_1GraphNode is added to rule5_1GraphNode.nextRules
  rule7_1GraphNode.rule (rule7_1) is displayed to the user
End

Once the user has completed the rule rule7_1 and clicks on the acquire next rule button, algorithm 5 is invoked again:

Input
  concept, the selected domain concept
  rule7_1GraphNode, the RuleGraphNode for the rule that the user has just defined
Begin
if the user has selected another concept (which they have not) then
  {code not executed}
KaGraphNode tempTypeNode is set to null
{rule7_1GraphNode.typeNode is ruleType7KaNode, the value of its nextRuleTypes property is the list containing ruleType4KaNode, and so the following is executed}
tempTypeNode is set to ruleType4KaNode
if tempTypeNode is not null (which is true) then
  rule4_2 is set to a new (rule) instance of type RuleType4
  RuleGraphNode rule4_2GraphNode is set to a new RuleGraphNode
  rule4_2GraphNode.rule is set to rule4_2
  rule4_2GraphNode.previousRule is set to rule7_1GraphNode
  rule4_2GraphNode is added to rule7_1GraphNode.nextRules
  rule4_2GraphNode.rule (rule4_2) is displayed to the user
End

Once the user has completed the rule rule4_2 and clicks on the acquire next rule button, algorithm 5 is invoked again:

Input
  concept, the selected domain concept
  rule4_2GraphNode, the RuleGraphNode for the rule that the user has just defined
Begin
if the user has selected another concept (which they have not) then
  {code not executed}
KaGraphNode tempTypeNode is set to null
{rule4_2GraphNode.typeNode is ruleType4KaNode, the value of its nextRuleTypes property is the list containing ruleType5KaNode, and so the following is executed}
tempTypeNode is set to ruleType5KaNode
if tempTypeNode is not null (which is true) then
  rule5_2 is set to a new (rule) instance of type RuleType5
  RuleGraphNode rule5_2GraphNode is set to a new RuleGraphNode
  rule5_2GraphNode.rule is set to rule5_2
  rule5_2GraphNode.previousRule is set to rule4_2GraphNode
  rule5_2GraphNode is added to rule4_2GraphNode.nextRules
  rule5_2GraphNode.rule (rule5_2) is displayed to the user
End

After the user has defined the rule rule5_2 and clicks on the acquire next rule button, algorithm 5 is invoked again:

Input
  concept, the selected domain concept
  rule5_2GraphNode, the RuleGraphNode for the rule that the user has just defined
Begin
if the user has selected another concept (which they have not) then
  {code not executed}
KaGraphNode tempTypeNode is set to null
{rule5_2GraphNode.typeNode is ruleType5KaNode, the value of its nextRuleTypes property is the list containing ruleType7KaNode and ruleType8KaNode, and so the following is executed}
The user is asked to select the type of the next rule from RuleType7 and RuleType8. Assuming the user selects RuleType8
tempTypeNode is set to ruleType8KaNode
if tempTypeNode is not null (which is true) then
  rule8_1 is set to a new (rule) instance of type RuleType8
  RuleGraphNode rule8_1GraphNode is set to a new RuleGraphNode
  rule8_1GraphNode.rule is set to rule8_1
  rule8_1GraphNode.previousRule is set to rule5_2GraphNode
  rule8_1GraphNode is added to rule5_2GraphNode.nextRules
  rule8_1GraphNode.rule (rule8_1) is displayed to the user
End

After the user has defined the rule rule8_1 and clicks on the acquire next rule button, algorithm 5 is invoked again:

Input
  concept, the selected domain concept
  rule8_1GraphNode, the RuleGraphNode for the rule that the user has just defined
Begin
if the user has selected another concept (which they have not) then
  {code not executed}
KaGraphNode tempTypeNode is set to null
{rule8_1GraphNode.typeNode is ruleType8KaNode, the value of its nextRuleTypes property is an empty list, and so determineNextOptions is called (see below)}
Assuming the user selects to create a rule of type RuleType6 from the options presented by determineNextOptions(rule8_1GraphNode)
tempTypeNode is set to ruleType6KaNode
if tempTypeNode is not null (which is true) then
  rule6_1 is set to a new (rule) instance of type RuleType6
  RuleGraphNode rule6_1GraphNode is set to a new RuleGraphNode
  rule6_1GraphNode.rule is set to rule6_1
  rule6_1GraphNode.previousRule is set to rule1_1GraphNode
  rule6_1GraphNode is added to rule1_1GraphNode.nextRules
  rule6_1GraphNode.rule (rule6_1) is displayed to the user
End

The call to determineNextOptions(rule8_1GraphNode) proceeds as follows:

Input
  rule8_1GraphNode, the rule that was just defined
Begin
options is set to a new list
The option to add a new rule similar to rule8_1 is added to options
RuleGraphNode temprn is set to rule5_2GraphNode (the value of rule8_1GraphNode's previousRule property)
while temprn does not equal null do

Iteration 1
An option to add a rule similar to rule5_2 is added to options
for every KA graph node in temprn's (rule5_2GraphNode) typeNode's (ruleType5KaNode) nextRuleTypes property (which are ruleType8KaNode and ruleType7KaNode) do
  {The following options are added to options during the execution of this for loop}
  An option to define another rule of type RuleType8 related to rule5_2 is added to options
  An option to define a rule of type RuleType7 related to rule5_2 is added to options
temprn is set to rule4_2GraphNode

Iteration 2
An option to add a rule similar to rule4_2 is added to options
for every KA graph node in temprn's (rule4_2GraphNode) typeNode's (ruleType4KaNode) nextRuleTypes property (which is ruleType5KaNode) do
  if a rule of type RuleType5 is in temprn's nextRules property then
    An option to define another rule of type RuleType5 related to rule4_2 is added to options
temprn is set to rule7_1GraphNode

Iteration 3
An option to add a rule similar to rule7_1 is added to options
for every KA graph node in temprn's (rule7_1GraphNode) typeNode's (ruleType7KaNode) nextRuleTypes property (which is ruleType4KaNode) do
  if a rule of type RuleType4 is in temprn's nextRules property then
    An option to define another rule of type RuleType4 related to rule7_1 is added to options
temprn is set to rule5_1GraphNode

Iteration 4
An option to add a rule similar to rule5_1 is added to options
for every KA graph node in temprn's (rule5_1GraphNode) typeNode's (ruleType5KaNode) nextRuleTypes property (which are ruleType8KaNode and ruleType7KaNode) do
  {The following options are added to options during the execution of this for loop}
  An option to define a rule of type RuleType8 related to rule5_1 is added to options
  An option to define another rule of type RuleType7 related to rule5_1 is added to options
temprn is set to rule4_1GraphNode

Iteration 5
An option to add a rule similar to rule4_1 is added to options
for every KA graph node in temprn's (rule4_1GraphNode) typeNode's (ruleType4KaNode) nextRuleTypes property (which is ruleType5KaNode) do
  if a rule of type RuleType5 is in temprn's nextRules property then
    An option to define another rule of type RuleType5 related to rule4_1 is added to options
temprn is set to rule1_1GraphNode

Iteration 6
An option to add a rule similar to rule1_1 is added to options
for every KA graph node in temprn's (rule1_1GraphNode) typeNode's (ruleType1KaNode) nextRuleTypes property (which are ruleType4KaNode, ruleType5KaNode, and ruleType6KaNode) do
  {The following options are added to options during the execution of this for loop}
  An option to define another rule of type RuleType4 related to rule1_1 is added to options
  An option to define a new rule of type RuleType5 related to rule1_1 is added to options
  An option to define a new rule of type RuleType6 related to rule1_1 is added to options
temprn is set to null

The option to perform KA again for concept, starting with a rule of type RuleType1 (ruleType1KaNode's ruleType value), is added to options
if currently performing KA for the last rule in kaGraphs (which is false) then
  {code not executed}
else
  The option to perform KA for the graph starting with a rule of type RuleType3 (ruleType3KaNode's ruleType value) is added to options

The user is presented with the following options:
•Define a rule similar to rule8_1
•Define a rule similar to rule5_2
•Define another rule of type RuleType8 related to rule5_2
•Define a rule of type RuleType7 related to rule5_2
•Define a rule similar to rule4_2
•Define another rule of type RuleType5 related to rule4_2
•Define a rule similar to rule7_1
•Define another rule of type RuleType4 related to rule7_1
•Define a rule similar to rule5_1
•Define a rule of type RuleType8 related to rule5_1
•Define another rule of type RuleType7 related to rule5_1
•Define a rule similar to rule4_1
•Define another rule of type RuleType5 related to rule4_1
•Define a rule similar to rule1_1
•Define another rule of type RuleType4 related to rule1_1
•Define a new rule of type RuleType5 related to rule1_1
•Define a new rule of type RuleType6 related to rule1_1
•Perform KA again starting with a rule of type RuleType1
•Perform KA for the graph starting with RuleType3
End

After the user has defined the rule rule6_1 and clicks on the acquire next rule button, algorithm 5 is invoked:

Input
  concept, the selected domain concept
  rule6_1GraphNode, the RuleGraphNode for the rule that the user has just defined
Begin
if the user has selected another concept (which they have not) then
  {code not executed}
KaGraphNode tempTypeNode is set to null
{rule6_1GraphNode.typeNode is ruleType6KaNode, the value of its nextRuleTypes property is an empty list, and so determineNextOptions(rule6_1GraphNode) is called (see below)}
Assuming the user selects to perform KA starting with a rule of type RuleType3
tempTypeNode is set to ruleType3KaNode
if tempTypeNode is not null (which is true) then
  rule3_1 is set to a new (rule) instance of type RuleType3
  RuleGraphNode rule3_1GraphNode is set to a new RuleGraphNode
  rule3_1GraphNode.rule is set to rule3_1
  rule3_1GraphNode.previousRule is left as null
  rule3_1GraphNode.rule (rule3_1) is displayed to the user
End

The call to determineNextOptions(rule6_1GraphNode) proceeds as follows:

Input
  rule6_1GraphNode, the rule that was just defined
Begin
options is set to a new list
The option to add a new rule similar to rule6_1 is added to options
RuleGraphNode temprn is set to rule1_1GraphNode (the value of rule6_1GraphNode's previousRule property)
while temprn does not equal null do

Iteration 1
An option to add a rule similar to rule1_1 is added to options
for every KA graph node in temprn's (rule1_1GraphNode) typeNode's (ruleType1KaNode) nextRuleTypes property (which are ruleType4KaNode, ruleType5KaNode, and ruleType6KaNode) do
  {The following options are added to options during the execution of this for loop}
  An option to define another rule of type RuleType4 related to rule1_1 is added to options
  An option to define a new rule of type RuleType5 related to rule1_1 is added to options
  An option to define another rule of type RuleType6 related to rule1_1 is added to options
temprn is set to null

The option to define another set of rules related to concept, starting with a rule of type RuleType1 (ruleType1KaNode's ruleType value), is added to options
if currently performing KA for the last rule in kaGraphs (which is false) then
  {code not executed}
else
  The option to perform KA for the graph starting with a rule of type RuleType3 (ruleType3KaNode's ruleType value) is added to options

The user is presented with the following options:
•Define a new rule similar to rule6_1
•Define another rule of type RuleType4 related to rule1_1
•Define a new rule of type RuleType5 related to rule1_1
•Define another rule of type RuleType6 related to rule1_1
•Define another set of rules related to concept starting with a rule of type RuleType1
•Perform KA for the graph starting with a rule of type RuleType3
End
Once the user has completed the rule rule3_1 and clicks on the acquire next rule button, algorithm 5 is invoked again:

Input
  concept, the selected domain concept
  rule3_1GraphNode, the RuleGraphNode for the rule that the user has just defined
Begin
if the user has selected another concept (which they have not) then
  {code not executed}
KaGraphNode tempTypeNode is set to null
{rule3_1GraphNode.typeNode is ruleType3KaNode, the value of its nextRuleTypes property is the list containing ruleType11KaNode, and so the following is executed}
tempTypeNode is set to ruleType11KaNode
if tempTypeNode is not null (which is true) then
  rule11_1 is set to a new (rule) instance of type RuleType11
  RuleGraphNode rule11_1GraphNode is set to a new RuleGraphNode
  rule11_1GraphNode.rule is set to rule11_1
  rule11_1GraphNode.previousRule is set to rule3_1GraphNode
  rule11_1GraphNode is added to rule3_1GraphNode.nextRules
  rule11_1GraphNode.rule (rule11_1) is displayed to the user
End
Once the user has completed the rule rule11_1 and clicks on the acquire next rule button, algorithm 5 is invoked again:

Input
  concept, the selected domain concept
  rule11_1GraphNode, the RuleGraphNode for the rule that the user has just defined
Begin
if the user has selected another concept (which they have not) then
  {code not executed}
KaGraphNode tempTypeNode is set to null
{rule11_1GraphNode.typeNode is ruleType11KaNode, the value of its nextRuleTypes property is the list containing ruleType8KaNode, and so the following is executed}
tempTypeNode is set to ruleType8KaNode
if tempTypeNode is not null (which is true) then
  rule8_2 is set to a new (rule) instance of type RuleType8
  RuleGraphNode rule8_2GraphNode is set to a new RuleGraphNode
  rule8_2GraphNode.rule is set to rule8_2
  rule8_2GraphNode.previousRule is set to rule11_1GraphNode
  rule8_2GraphNode is added to rule11_1GraphNode.nextRules
  rule8_2GraphNode.rule (rule8_2) is displayed to the user
End

Once the user has completed the rule rule8_2 and clicks on the acquire next rule button, algorithm 5 is invoked again:

Input
  concept, the selected domain concept
  rule8_2GraphNode, the RuleGraphNode for the rule that the user has just defined
Begin
if the user has selected another concept (which they have not) then
  {code not executed}
KaGraphNode tempTypeNode is set to null
{rule8_2GraphNode.typeNode is ruleType8KaNode, the value of its nextRuleTypes property is an empty list, so determineNextOptions(rule8_2GraphNode) is called}
Assuming the user selects to finish KA
tempTypeNode is set to null
the KA process completes

The call to determineNextOptions(rule8_2GraphNode) proceeds as follows:

Input
  rule8_2GraphNode, the rule that was just defined
Begin
options is set to a new list
The option to add a new rule similar to rule8_2 is added to options
RuleGraphNode temprn is set to rule11_1GraphNode (the value of rule8_2GraphNode's previousRule property)
while temprn does not equal null do

Iteration 1
An option to add a rule similar to rule11_1 is added to options
for every KA graph node in temprn's (rule11_1GraphNode) typeNode's (ruleType11KaNode) nextRuleTypes property (which is ruleType8KaNode) do
  An option to define another rule of type RuleType8 related to rule11_1 is added to options
temprn is set to rule3_1GraphNode

Iteration 2
An option to add a rule similar to rule3_1 is added to options
for every KA graph node in temprn's (rule3_1GraphNode) typeNode's (ruleType3KaNode) nextRuleTypes property (which is ruleType11KaNode) do
  An option to define another rule of type RuleType11 related to rule3_1 is added to options
temprn is set to null

The option to define another set of rules related to concept, starting with a rule of type RuleType3 (ruleType3KaNode's ruleType value), is added to options
if currently performing KA for the last rule in kaGraphs (which is true) then
  The option to finish performing KA for concept is added to options
else
  {code not executed}

The user is presented with the following options:
•Define a new rule similar to rule8_2
•Define a new rule similar to rule11_1
•Define another rule of type RuleType8 related to rule11_1
•Define a new rule similar to rule3_1
•Define another rule of type RuleType11 related to rule3_1
•Define another set of rules related to concept starting with a rule of type RuleType3
•Finish performing KA for concept
End

Figure B.3 illustrates the various RuleGraphNode instances that have been created by the above KA process.
Figure B.3: The RuleGraphNodes created during the example KA session.
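The backtracking loop of determineNextOptions traced above can be sketched as follows. This is a simplified Python illustration, not MAKTab's code: it collects only the per-rule options gathered inside the while loop, and omits the trailing options (restarting KA, moving to the next graph, finishing) and the "another"/"new" wording distinction seen in the walk-through:

```python
class RuleGraphNode:
    """A defined rule, its KA-graph type node, and a link back to the rule before it."""
    def __init__(self, rule, type_node, previous_rule=None):
        self.rule = rule
        self.type_node = type_node
        self.previous_rule = previous_rule


def determine_next_options(rule_graph_node):
    """Walk back up the chain of defined rules, offering follow-up options:
    for the rule just defined and each predecessor, a 'similar rule' option
    plus one option per next rule type of that predecessor's type node."""
    options = ["Define a rule similar to %s" % rule_graph_node.rule]
    temprn = rule_graph_node.previous_rule
    while temprn is not None:
        options.append("Define a rule similar to %s" % temprn.rule)
        for ka_node in temprn.type_node.next_rule_types:
            options.append("Define a rule of type %s related to %s"
                           % (ka_node.rule_type, temprn.rule))
        temprn = temprn.previous_rule
    return options
```

With a two-rule chain such as rule1_1 followed by rule4_1, this yields an option list of the same shape as the one presented to the user above.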
Appendix C
MAKTab User Manual

C.1 Introduction
This chapter provides introductions to the various concepts that are used throughout both this user manual and the MAKTab related documents. The chapter starts with a (very) brief introduction to Knowledge-Based Systems (KBSs), explaining what they are, why they are useful and how they are created. This is followed by brief introductions to some of their underlying technologies: first ontologies; secondly Protégé, a tool for creating and editing ontologies; and thirdly MAKTab. Finally, the process of building KBSs using ontologies, specifically within the Protégé environment, is discussed, as is the MAKTab tool, a plug-in to Protégé which helps with the task of building a KBS, primarily by reusing existing artefacts, such as ontologies and problem solvers (PSs).
C.1.1 Brief Introduction to KBS
A Knowledge Based System (KBS) is an Artificial Intelligence (AI) system which attempts to imitate human problem solving by applying some form of reasoning to domain knowledge, which is stored in a knowledge base. An early KBS (or Expert System, as they are also known) was the MYCIN system, which helped diagnose and recommend treatment for certain blood infections. Another, the XCON system, helped in the configuration of computer systems to match user requirements; the spirit of this system is still around today in on-line computer retailing stores which allow the shopper to customise the components of their new system to meet their requirements. Systems for designing elevators have also been built, and other industries such as marketing, finance, banking and forecasting have all made use of KBSs. KBSs are developed for various reasons, such as providing decision support, archiving rare skills, preserving the knowledge of retiring personnel, and aggregating domain knowledge from several experts. KBSs are capable of storing significant amounts of information, often encoding complex reasoning processes which, once captured in the system, are available for everyone to exploit. KBSs are generally composed of two discrete components: a domain knowledge base, which contains specific knowledge relating to a particular domain, for example a knowledge base of elevator parts; and some form of reasoning, such as a set of rules, which uses the information in the (domain) knowledge base to perform design or diagnosis. A design KBS in the elevator domain, for example, would consist of a knowledge base of elevator components and a set of configuration (design) rules which specify how the components in the elevator knowledge base could be combined to produce the design of a working elevator. A high level view of this is shown in figure C.1.
Figure C.1: High level view of a KBS for elevator design.
C.1.2 Brief Introduction to Ontologies
One commonly quoted definition of an ontology is "an explicit specification of a conceptualisation" [47]. Essentially, this means that an ontology provides a formal model of some concept or domain. In KBSs, ontologies are typically used to define both the domain the KBS is working in and, sometimes, the reasoning the KBS uses. Briefly, an ontology consists of a series of classes, each of which represents some concept or entity; this could be, for example, the motor of an elevator. A class is essentially a schema which defines how instances of that class must look, similar to how a table in a database defines what each tuple (or row) must look like. Each class has a name and a series of associated properties; for example, the ElevatorMotor class may have properties such as maximumSupportedWeight, horsePower, modelName, and so on. When defining instances, the allowed values a property can take vary depending on the formalism used to represent the ontology. Typical valid values are strings, numeric values (integers, floats), Booleans, another instance and, sometimes, a class. For example, the maximumSupportedWeight property of an ElevatorMotor may have as its value 1000Kg, and its manufacturedBy value may be an instance of the Manufacturer class. A class can be instantiated; that is, an instance of the class can be created to represent an actual example of that concept. For example, an instance of the ElevatorMotor class could be BigMotor, which has the property values: maximumSupportedWeight 5000Kg, horsePower 10000, modelName 5000kgBigMotor, and manufacturedBy LiftMotorsInc.
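The BigMotor example can be sketched in code. The following Python dataclasses are purely illustrative (the thesis's ontologies are written in Frames or OWL, not Python); the property names and values are taken from the text:

```python
from dataclasses import dataclass

# Illustrative only: a class is a schema, and an instance fills in its properties.
@dataclass
class Manufacturer:
    name: str

@dataclass
class ElevatorMotor:
    modelName: str
    maximumSupportedWeight: int  # in kg
    horsePower: int
    manufacturedBy: Manufacturer  # a property whose value is another instance

# The BigMotor instance described in the text.
big_motor = ElevatorMotor(
    modelName="5000kgBigMotor",
    maximumSupportedWeight=5000,
    horsePower=10000,
    manufacturedBy=Manufacturer("LiftMotorsInc"),
)
```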
Protégé

Ontologies are typically defined using some specific language or formalism, such as Frames or OWL. All of these languages have their own particular syntax for writing ontologies. In general, people tend to find it easier to create ontologies using a dedicated tool, an ontology editor, rather than writing them in, for example, XML. One of the most widely used ontology editors is Protégé. There are two versions of Protégé: Protégé Frames, which stores ontologies based on Frames, and Protégé OWL, which enables the creation of ontologies in OWL. Both versions allow you to define classes and their properties, as well as new instances of the classes. For more information on Protégé, see the Protégé website -
http://protege.stanford.edu.
C.1.3 Brief Introduction to KBS in Protégé
An instantiated ontology (i.e. one which has instances associated with it) can be viewed as a knowledge base: the instances provide information about the ontology's domain. It would therefore be useful to be able to make use of this knowledge base. Ontologies have become very popular, with countless numbers of them being created and made publicly available; potentially this provides many knowledge bases which could be used as part of KBSs, if they could be combined with some reasoning facility. Protégé has been designed so that developers can write plug-ins, which allow extra functionality to be added to it. Plug-ins are able to access the ontology/knowledge base within Protégé, and are available for use within the Protégé environment itself. Henrik Eriksson developed a plug-in called JessTab, which facilitates the application of Jess rules over a knowledge base within Protégé. Jess is a Java version of the popular Expert System Shell CLIPS; an Expert System Shell is a program which facilitates the development of KBSs, so Jess provides a Java program which can be used for building KBSs. To build a KBS using Jess, it has to be provided with a knowledge base (which could be an instantiated ontology in Protégé) and a set of inference rules, which it can then apply to the knowledge base in an attempt to deduce conclusions. As illustrated in figure C.2, JessTab provides Jess with access to the knowledge base (ontology instances) that has been defined as an ontology loaded in Protégé. As such, to develop a KBS in Protégé using JessTab, all that needs to be provided are the inference rules.
Figure C.2: Illustration of how JessTab links the Protégé ontology and its individuals with Jess.
C.1.4 Brief Introduction to MAKTab KBS Development Methodology
MAKTab is another plug-in for Protégé, designed to help with the development of KBSs within the Protégé framework. MAKTab (short for MApping and Knowledge Acquisition Tab) provides a series of partial (referred to as generic) PSs, which can be used to provide the reasoning capabilities of a KBS. Essentially, a generic PS is the specification of a domain-independent strategy which realises a common task. As PSs are designed to be domain independent, in order
to provide some worthwhile function they must be configured with relevant domain knowledge. MAKTab is designed to support this process. MAKTab provides various generic PSs which are designed to be configured to work within various domains. Each PS consists of an ontology, which defines the domain concepts and the types of inference rules that it uses to perform its reasoning. The configuration process involves first providing the PS with the domain knowledge, by having the user define mappings for copying the concepts from a domain ontology/knowledge base to the PS ontology; and secondly helping the user define rules relevant to those concepts which allow the PS to work within the domain. After these rules are defined, MAKTab converts them into an executable form, for example JessTab rules, which provides an executable KBS.
The rules that a KBS is composed of are often related, and MAKTab uses this characteristic to structure and guide the process of defining new rules for a particular domain/application. For example, consider a KBS concerned with the diagnosis of an elevator. At a simple level, diagnosis is concerned with determining the causes of some observed fault, and providing some knowledge regarding the repair of this fault. Often one fault is caused by another: this gives rise to three simple but related rule types. In the first, the faults underlying a symptom are determined (for example, if the elevator will not move vertically, then either the motor is jammed or the level selection buttons are broken); in the second, the causes of a fault are determined (for example, a broken elevator motor can be caused by either no power being received or the cogs having broken); in the third, repairs for faults are suggested (for example, if there is no power being received then ensure the power connection has not been switched off). MAKTab uses this structure to ask for the symptoms related to a concept (e.g. the elevator) to be defined, along with the faults that cause the symptoms; it then asks for the causes of these faults, and finally for the repair actions for these causes. Of course, the causes of the faults may themselves have causes, and so on, until eventually something repairable is reached.
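The symptom → fault → cause → repair chain above can be sketched as follows. This is a hand-written Python illustration of the elevator example, not MAKTab's internal representation; the fault and cause tables are invented for the sketch.

```python
# Sketch of the three related diagnosis rule types described above. Each
# dictionary stands in for one rule type; diagnose() follows causes of
# causes until something repairable is reached.

SYMPTOM_FAULTS = {"will-not-move-vertically": ["motor-jammed", "buttons-broken"]}
FAULT_CAUSES = {"motor-jammed": ["no-power", "cogs-broken"]}
CAUSE_REPAIRS = {"no-power": "ensure the power connection has not been switched off"}

def diagnose(symptom):
    """Collect repairs for every cause reachable from the symptom's faults."""
    repairs = {}
    queue = list(SYMPTOM_FAULTS.get(symptom, []))
    while queue:
        fault = queue.pop()
        if fault in CAUSE_REPAIRS:
            repairs[fault] = CAUSE_REPAIRS[fault]
        queue.extend(FAULT_CAUSES.get(fault, []))  # causes may have causes
    return repairs

print(diagnose("will-not-move-vertically"))
# {'no-power': 'ensure the power connection has not been switched off'}
```

The worklist loop mirrors the recursive KA ordering: faults are expanded into causes until a repairable cause is found.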
C.2 Installing MAKTab
This section describes how to install MAKTab and the other relevant software. MAKTab is a plug-in to Protégé, which is therefore also required. Further, as MAKTab currently produces executable JessTab rules, both Jess and JessTab are required in order to run these. All of these systems require Java to be installed.
C.2.1 Prerequisites
Before MAKTab can be installed and used, Java, Protégé, Jess and JessTab must all be installed. This section describes where to get these programs, and how to install them.
Java Java is required to run Protégé. It is recommended that the latest version of Java, which is currently Java SE 6, is installed. The Sun Java 6 downloads can be found at http://java.sun.com/javase/downloads/index.jsp. The current required version is Java JDK 6 Update 2 or Java Runtime Environment (JRE) 6 Update 2. Installation instructions can also be found at this location.
Protégé The Protégé ontology and knowledge base editor can be downloaded from http://protege.stanford.edu/download/registered.html. The current latest version is Protégé 3.3, which MAKTab is known to work with; it is therefore recommended that this version be used, although MAKTab is also known to work with Protégé 3.3 beta. For either of these Protégé versions, the “Full” or “Basic” download/install of Protégé will work. Instructions for installing Protégé are also available on the download page.
C.2.2 Installing MAKTab
Download the MAKTab zip file and unzip it to the plug-ins directory in the main Protégé directory (for Protégé 3.3, this is usually something like C:\Program Files\Protege 3.3\plugins\ on a Windows based system).
JessTab MAKTab currently generates executable JessTab rules which provide the executable KBS. JessTab must be installed as a Protégé plug-in before the executable rules produced by MAKTab can be run. If the “Full” version of Protégé was installed, JessTab is probably already installed. If not, then see http://www.ida.liu.se/~her/JessTab/ for instructions on downloading and installing JessTab. JessTab requires Jess to work correctly. Jess can be downloaded from http://www.jessrules.com/jess/download.shtml. Installation instructions are also available at the Jess download page. To enable JessTab, once Jess has been downloaded and unzipped, copy the jess.jar file to the JessTab installation directory (typically something like C:\Program Files\Protege 3.3\plugins\se.liu.ida.JessTab\).
C.3 Using MAKTab
MAKTab aids the development of new KBSs by helping the (re)use of an existing domain ontology/knowledge base with an existing PS (which provides some particular type of reasoning) to produce a KBS which applies that type of reasoning in the ontology's domain. This is achieved by performing two steps, illustrated in figure C.3. Essentially, in the first step relevant domain knowledge is mapped from the domain ontology to the PS, and in the second step this domain knowledge provides the context in which the new domain rules required by the PS are defined. These rules are then converted into an executable form, currently JessTab rules.
C.3.1 Enabling MAKTab
MAKTab is a tab plug-in for Protégé, and so before it can be used, Protégé must be told to display it. To do this:
1. Load an ontology (either a domain or PS ontology).
2. When the ontology has loaded, select the Project option from the menu bar at the top of the Protégé window, then select Configure... from the menu that is displayed. This should bring up the dialog window which is used to select which tabs should be displayed in Protégé.
Figure C.3: High level overview of the MAKTab KBS development process.
3. Select the Tab Widgets option (from the options at the top of the dialog window). A list of available tabs is then displayed. Find MAKTab in that list: it may be necessary to scroll through the list (the tabs should be listed in alphabetical order).
4. Click the box on the left of MAKTab (a tick or cross should appear to indicate that MAKTab should be displayed). If MAKTab is not on the list, see section C.2.2 for installation instructions.
5. Click the OK button; MAKTab should then be displayed next to the standard Protégé tabs such as Classes, Slots (for frame based ontologies) or Properties (for OWL based ontologies), and Instances (for frame based ontologies) or Individuals (for OWL based ontologies).
6. Select MAKTab by clicking on the MAKTab tab near the top of the Protégé window.
C.4 Using the Mapping Tool
MAKTab consists of two tabs: a Mapping tab (this is initially displayed) and a Knowledge Acquisition (KA) tab. The Mapping tab allows the definition of mappings between two ontologies. Essentially mappings define how the information (classes, properties and individuals) should be copied from one ontology (the source ontology) to another (the target ontology). MAKTab provides two general categories of mappings: class level and property level. Class level mappings allow the copying of data associated with a class in the source ontology to the target ontology. An example of this would be to copy a class from the source ontology to the target ontology. These
mappings are discussed further in section C.4.3. Property level mappings allow information associated with a property to be copied from the source ontology to the target ontology. For example, a property mapping can copy the values of a particular property of a class's individuals in the source ontology (e.g. the values of the name property) to a property of instances of a class in the target ontology (e.g. the model-name property). These mappings are also discussed further in section C.4.3.
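The distinction between the two mapping categories can be sketched as follows. The dict-based ontology structure, the function names, and the Elevator/Product example are invented for illustration; they do not reflect MAKTab's internal model.

```python
# Sketch: a class level mapping copies a class (and its individuals) into
# the target ontology; a property level mapping copies a property value
# between individuals. Ontologies are modelled as plain dicts here.

source = {"Elevator": {"individuals": [{"name": "lift-1"}]}}
target = {"Product": {"individuals": [{}]}}

def copy_class(cls, src, tgt):
    """Class level: copy a class and its individuals into the target."""
    tgt[cls] = {"individuals": [dict(i) for i in src[cls]["individuals"]]}

def map_property(src_ind, src_prop, tgt_ind, tgt_prop):
    """Property level: copy one individual's property value to another."""
    tgt_ind[tgt_prop] = src_ind[src_prop]

copy_class("Elevator", source, target)
map_property(source["Elevator"]["individuals"][0], "name",
             target["Product"]["individuals"][0], "model-name")
print(target["Product"]["individuals"][0])  # {'model-name': 'lift-1'}
```

The second call mirrors the name → model-name example in the text: the same value appears in the target under a different property.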
Figure C.4: A screenshot of the mapping tool in MAKTab.
C.4.1 Loading the Source Ontology
Before mappings can be defined, the source and target ontologies must be imported into MAKTab. There are two options for importing the source ontology, which are displayed on the left hand third of the Mapping tab titled Source Ontology, which is labelled 1 in figure C.4. The two options are:
1. Import the currently loaded ontology, by clicking on the Import Current button.
2. Load an ontology from another file, by clicking on the Import External button and using the file selection dialog that is displayed to select the ontology file.
Once the ontology has been loaded, it is possible to save the ontology by clicking the Save Ontology button. Note this will save the ontology to the same file that it was loaded from.
C.4.2 Loading the Target Ontology
The target ontology is displayed on the right hand third of the Mapping tab titled Target Ontology Display, which is labelled 3 in figure C.4. The options for importing the target ontology are identical to those detailed in section C.4.1 for importing the source ontology.
The Ontology Display Once the (source or target) ontology has been imported, the standard Protégé tree based ontology view is displayed. Classes can be selected by clicking on them; if a class has subclasses, double clicking on it will display them. When a class is selected, the properties of the class are displayed below the class tree. These can also be selected by clicking on them. The individuals of the selected class can be viewed by clicking on the Individuals tab (below the class tree).
C.4.3 Defining A Mapping
The centre third of the Mapping tab (labelled 2 in figure C.4) is used to define mappings between the source and target ontologies. The mapping definition area features a drop down list which displays the types of mapping that are suitable for the selected class/property in the source ontology display. To create a new mapping:
1. Select the type of mapping that should be created from the Select mapping type drop down list.
2. Click on the Set type button to the right of the drop down list.
Defining a Class Level Mapping A class level mapping allows the copying of information related to a class in the source ontology to, typically, a class in the target ontology. Examples of this type of mapping are copying a class from the source ontology to the target ontology, or creating an instance of a class in the target ontology with the name of a class in the source ontology. These mappings are further described below.
Copy A Class Mappings allow the copying of a class from the source ontology to the target ontology. Figure C.5 shows a defined Copy A Class Mapping. To create a new Copy A Class Mapping:
1. Select the class in the source ontology that is to be copied.
2. Select the Copy A Class Mapping option from the mapping type drop down box.
3. Click on the Set type button.
4. Select how “deep” this mapping should be applied: i.e. how many levels of subclasses of the class being copied should also be copied. Set this to 0 to just copy the selected class, to 1 to copy the selected class and its direct subclasses, to 2 to copy the selected class, its direct subclasses and their direct subclasses, and so on.
5. Select the class in the target ontology which the copied class should become a subclass of. The mapping definition area will update the Target Superclass Name field to display the name of the selected class.
6. If the individuals of the class should also be copied, ensure that the box beside Also copy individuals has a tick or a cross in it. Do this by clicking on the box (clicking it again will remove the tick/cross).
7. Ensure that the mapping is stored by clicking on the Store Mapping button at the bottom of the Protégé window; if this is not clicked, the defined mapping will not be stored, and so will not be executed.
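The "depth" option in step 4 can be sketched as a depth-limited recursive copy. The class hierarchy below is a hypothetical stand-in, not MAKTab's data model.

```python
# Sketch of the depth option: depth 0 copies only the class, depth 1 adds
# its direct subclasses, depth 2 adds their subclasses too, and so on.
# The hierarchy is invented for the example.

source = {
    "Component": {"subclasses": ["Motor", "Door"]},
    "Motor": {"subclasses": ["Cog"]},
    "Door": {"subclasses": []},
    "Cog": {"subclasses": []},
}

def copy_with_depth(name, depth, src, copied=None):
    """Copy a class plus `depth` levels of its subclasses."""
    copied = {} if copied is None else copied
    copied[name] = src[name]
    if depth > 0:
        for sub in src[name]["subclasses"]:
            copy_with_depth(sub, depth - 1, src, copied)
    return copied

print(sorted(copy_with_depth("Component", 1, source)))
# ['Component', 'Door', 'Motor']  (Cog would need depth 2)
```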
Figure C.5: A screenshot of a defined Copy A Class Mapping.
Class to Instance Mappings Sometimes it is not necessary to copy a class, and just creating an instance (to represent the class) of a class in the target ontology will suffice. The Class to Instance Mapping type enables this. Figure C.6 shows a screenshot of a defined Class to Instance Mapping. To create a Class to Instance Mapping:
1. Select the class in the source ontology which the mapping is going to be defined for.
2. Select the Class to Instance Mapping option from the mapping type drop down box.
3. Click on the Set type button.
4. Select the type of the new instance by selecting a class in the target ontology display.
5. The mapping definition area will be updated to display the name of that class, and allow the specification of the values for all the properties of the class. The values can be specified by clicking on the box next to the property name, and entering/selecting the relevant value. (Note: if the target class does not have any properties, then no properties will be displayed in the mapping definition area.)
6. Once all of the desired values have been entered, ensure that the mapping is stored by clicking on the Store Mapping button at the bottom of the window; if this is not clicked, the mapping will not be stored, and so will not be executed.
Figure C.6: A screenshot of a defined Class To Individual Mapping.
Defining a Property Mapping Property level mappings can be used to define how the individuals of a class in the source ontology should be copied to create new individuals of a class in the target ontology. When the mappings are executed, all the property mappings for a particular class are applied to every individual of that class in order to create new individuals of the target class. The exceptions to this are Copy a Property mappings and Property to Individual mappings. There are various types of property level mapping: Property Renaming Mappings, which allow the value of a property of an individual (e.g. the name property) to be copied directly to a property of an individual of the target class (e.g. title); Simple Property Concatenation Mappings, which allow the values of one or more properties of a source class's individual to be concatenated to provide the value of one property of a new target class individual (e.g. the concatenation of firstname and surname properties to form the name property value); Property to Individual Mappings, which create an instance of a class in the target ontology with the name of a property in the source ontology; and Copy a Property Mappings, which copy a property from the source ontology to the target ontology.
Property Renaming Mappings are one of the mapping types that can be applied to the individuals of a class in the source ontology to create new individuals in the target ontology. This type of mapping simply defines that two properties (one in the source ontology, one in the target ontology) are equivalent; figure C.7 provides a screenshot showing a defined mapping of this type. To define a Property Renaming Mapping:
1. Select a class in the source ontology display.
2. Select a property of that class in the property list below the source ontology display.
3. Select the Property Renaming Mapping type from the mapping type drop down box.
4. Click on the Set type button.
5. Select a target class in the target ontology display.
6. Select a property from the properties list below the target ontology display.
7. When the source and target classes and properties have been selected, the mapping definition area will be updated to display the relevant details. Make sure you click on the Store Mapping button to store the mapping.
Figure C.7: A screenshot of a defined Property Renaming Mapping.
Simple Property Concatenation Mappings are another type of mapping which is applied to all the individuals of the source class at execution time. This time, however, the values of multiple properties of the source class's individuals can be concatenated together to give the value of the target property; figure C.8 shows a screenshot of a defined mapping of this type. To define a Simple Property Concatenation Mapping:
1. Select a class in the source ontology display.
2. Select a property of that class in the property list below the source ontology display.
3. Select the Simple Property Concatenation Mapping type from the mapping type drop down box.
4. Click on the Set type button.
5. Select a target class in the target ontology display.
6. Select a property from the properties list below the target ontology display.
7. If desired, add more properties from the source class by clicking on the Add Source Property button in the mapping definition area. This will display a drop down list of the properties of the selected source class: use this to select the next property that should be used. It is possible to add as many properties as desired. To remove a property from the concatenation list, click on the Remove button next to the drop down list displaying that property.
8. Click on the Store Mapping button to store the mapping.
Figure C.8: A screenshot of a defined Property Concatenation Mapping.
Property to Individual Mappings are very similar to Class to Instance Mappings (see section C.4.3); the only difference is that to define a Property to Individual Mapping, you select the class in the source ontology, then the property, and set Property to Individual Mapping as the mapping type. Figure C.9 shows a screenshot of a defined mapping of this type. These mappings are only applied once at execution time (i.e. not to every individual, unlike the previous property level mappings).
Copy a Property Mappings are very similar to Copy a Class Mappings (section C.4.3), except that with this type of mapping the selected property in the source ontology is copied into the domain of the selected class in the target ontology. Figure C.10 shows a screenshot of a defined mapping of this type.
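The way property level mappings apply to every individual of the source class can be sketched as follows. The firstname/surname/familyname property names are invented for illustration, echoing the concatenation example in the text.

```python
# Sketch: each source individual yields one target individual. A renaming
# copies one property's value under a new name; a concatenation joins the
# values of several properties into one. Not MAKTab's internal model.

individuals = [
    {"firstname": "Ada", "surname": "Lovelace"},
    {"firstname": "Alan", "surname": "Turing"},
]

def apply_property_mappings(indivs, renamings, concatenations):
    """Create one target individual per source individual."""
    out = []
    for ind in indivs:
        new = {tgt: ind[src] for src, tgt in renamings.items()}
        for tgt, srcs in concatenations.items():
            new[tgt] = " ".join(ind[s] for s in srcs)
        out.append(new)
    return out

result = apply_property_mappings(
    individuals,
    renamings={"surname": "familyname"},                # Property Renaming
    concatenations={"name": ["firstname", "surname"]},  # Concatenation
)
print(result[0])  # {'familyname': 'Lovelace', 'name': 'Ada Lovelace'}
```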
C.4.4 Executing the Mappings
To execute the mappings that have been defined and stored, click on the Execute Mappings button near the bottom of the Protégé window. This will apply all the stored mappings and then delete them; this means that it is possible to define new mappings and execute them without re-executing previously executed mappings.
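This execute-then-clear behaviour can be sketched as follows; modelling mappings as callables is purely an illustrative device, not how MAKTab stores them.

```python
# Sketch: stored mappings are applied once and then forgotten, so a second
# execution never repeats previously executed mappings.

stored_mappings = []

def store_mapping(mapping):
    stored_mappings.append(mapping)

def execute_mappings(target):
    while stored_mappings:
        mapping = stored_mappings.pop(0)
        mapping(target)  # apply the mapping, then discard it

target = {}
store_mapping(lambda t: t.update({"Elevator": {}}))
execute_mappings(target)
execute_mappings(target)  # no-op: the store is already empty
print(target, stored_mappings)  # {'Elevator': {}} []
```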
Figure C.9: A screenshot of a defined Property To Individual Mapping.
Figure C.10: A screenshot of a defined Copy A Property Mapping.
C.4.5 Automatic Mapping Suggestions
MAKTab can attempt to suggest mappings between the source and target ontologies. To run this function, click on the Suggest Mappings button, located near the bottom of the window. MAKTab then looks for classes and properties which it thinks are similar, and suggests Property Renaming Mappings for them. Suggested mappings will be displayed when the relevant class or property is selected.
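One plausible way such suggestions could be computed is by string similarity over names. The sketch below uses Python's `difflib` purely for illustration; MAKTab's actual similarity heuristic is not documented here, and the example names are invented.

```python
# Sketch of a name-similarity suggester: flag source/target name pairs
# whose similarity exceeds a threshold. difflib is an illustrative choice,
# not necessarily what MAKTab uses internally.
from difflib import SequenceMatcher

def suggest_mappings(source_names, target_names, threshold=0.7):
    """Return (source, target, score) triples for similar-looking names."""
    suggestions = []
    for s in source_names:
        for t in target_names:
            score = SequenceMatcher(None, s.lower(), t.lower()).ratio()
            if score >= threshold:
                suggestions.append((s, t, score))
    return suggestions

for s, t, score in suggest_mappings(["name", "model-name"],
                                    ["title", "modelname"]):
    print(f"suggest renaming mapping: {s} -> {t}")
```

With these inputs only the model-name/modelname pair clears the threshold; name/title is lexically dissimilar even though a human would map them, which is why the suggestions still need user review.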
C.4.6 Viewing/Editing Mappings
Stored mappings can be viewed and edited by selecting the relevant class and/or property in the source ontology. When a class (or property) is selected in the source ontology, if a mapping has been defined for it, it will be displayed in the mapping definition area (assuming mappings have not been executed since it was stored). The mapping can then be edited as normal: clicking the Store mapping button will update the stored mapping for the class/property.
C.5 Using the KA Tool
The MAKTab Knowledge Acquisition (KA) tool is designed to enable the definition of the rules which will configure the generic PS for a particular domain. Each generic PS has an ontology associated with it, which is used during the mapping stage. The main purpose of this ontology is to provide the KA tool with information about the required rules, to enable it to help automate the process of defining them. This information includes: what rules are required, what each rule should contain, and the order in which the different types of rules should be acquired. This section discusses using the KA tool, including how to select the PS ontology, entering and editing the rules, and how to generate the executable rules that are central to the KBS implementation. To start using the KA tool, select the KA tab within MAKTab (it is located near the top of the screen, under the main Protégé tabs).
C.5.1 Select Ontology
Before the KA process can start, the KA tool must be provided with the PS ontology. For this, the PS ontology can be either the source (see section C.4.1) or the target (see section C.4.2) ontology used by the mapping tool. Select the relevant option by clicking on the button to the left of the correct option (the button should then fill in to indicate it has been selected, as shown in figure C.11 for selecting the target ontology). Then click on the Start KA button.
Figure C.11: Selecting the target ontology as the ontology to use during KA.
C.5.2 The KA Interface
The interface of the KA tool, shown in figure C.12, is split into three main areas. At the top left (labelled 1 in figure C.12) is the PS concept area: this is used to select a subclass of PSConcept (generally these are SystemVariable and SystemComponent, but can vary depending on the PS and the actions taken during the mapping stage). When a class is selected, the individuals of that class will be displayed in the list below the class selection box. When an individual from this list is selected, the KA process can be initialised for it. Alternatively, KA can be initialised to acquire rules relating to the selected class. Defined rules are displayed at the top right (labelled 2 in figure C.12) of the KA tab; this area can be used to edit an existing rule, create a new rule of a particular type, or delete an existing rule. The main section of the interface (labelled 3 in figure C.12), is where the rules are defined, edited and so on. The bottom section (labelled 4 in figure C.12) provides access to functions such as validating the defined rules and generating the executable rules.
Figure C.12: The interface of the KA tool in MAKTab.
C.5.3 Selecting a Concept for KA/Starting KA
The central idea of the KA tool is that it is used to define all the rules that relate to a particular domain concept, and then define the rules relating to the next concept, then the next and so on, until all the necessary rules have been defined. To select a concept individual and start KA for it:
1. Select the type (class) of the individual to start KA with by using the class selection drop down box, labelled Select a concept.
2. Select the individual from the list below the class selection box.
3. Click on the Start KA for X button, where X is the name of the selected individual.
If there are many individuals shown, it is possible to filter the individuals listed by entering some text in the Filter field; only individuals whose names contain the entered text will then be displayed.
To select a concept type and start KA for it:
1. Select the type (class) by using the class selection drop down box.
2. Click the box next to Acquire for Class X, where X is the name of the selected class. A tick or a cross should then appear in the box to indicate it has been selected.
3. Click on the Start KA for X button, where X is the name of the selected class.
C.5.4 Adding New/Editing Domain Concepts
There are various options for editing the domain concepts available for KA: adding a new type of concept, editing the description of a concept type, and adding a new instance of a concept.
Adding a New Concept Type It may be desirable to add a new type of concept, if it was not included in the domain ontology used in the mapping phase. To do this, click on the Create New Concept Type button. The standard Protégé class definition interface will then be displayed. For instructions on using the Protégé class definition interface, see the Protégé manual.
Editing a Concept Description It may also be desirable to alter the definition of a concept, for example to add new properties to, or remove existing properties from, its display. To do this, once the type (class) has been selected, click on the Create New Concept Type button. The standard Protégé class definition interface will then be displayed. For instructions on using the Protégé class definition interface, see the Protégé manual.
Adding a New Concept Individual It may be desirable to add a new individual, if it was not included in the domain ontology which was used in the mapping phase. To do this, once the type (class) of the new individual has been selected, click on the Create New X button (where X is the name of the selected class). The standard Protégé individual definition interface will then be displayed; use this to specify the values of the various properties of the new individual. For instructions on using the Protégé individual definition interface, see the Protégé manual.
C.5.5 The Rule Definition Interface
The rule definition area of the interface (marked 3 in figure C.12) is where new rules are created and existing rules are edited. It is composed of 5 main parts, shown in figure C.13:
1. The antecedent description text, which describes the purpose of the antecedents in the rule currently being defined.
2. The antecedents.
3. The consequent description text, which describes the purpose of the consequents in the rule currently being defined.
4. The consequents.
5. Various buttons for adding antecedents and consequents, acquiring another rule of this type, and acquiring the next rule.
Figure C.13: The rule definition area of the KA tool.
C.5.6 Adding an Antecedent to the Rule
Once the KA process is started, or when creating a new rule, it will probably be necessary to add new antecedents to the rule. To add an antecedent, click on the Add Antecedent button (shown in area 5 of figure C.13). Depending on the type of rule, either the antecedent display area will be updated to display the new atom, or a dialog window will be displayed asking which type of antecedent (atom) to add.
Selecting Antecedent Type Depending on the rule, it may be possible to add various types of antecedent to the rule. If this is the case, MAKTab needs to be told which type of antecedent to add. It does this by displaying a dialog, shown in figure C.14, listing the available options.
1. The list on the left hand side shows all the types of atoms that can be added. Select one of them to find out more about it.
2. The areas on the right hand side then update to provide information about the atom: the top area displays its name, and the bottom area provides a description of the atom.
3. Select the suitable atom type, then click the OK button.
4. The operation can be stopped by clicking the Cancel button.
Figure C.14: Dialog for selecting the type of atom to add as an antecedent or consequent.
C.5.7 Removing an Antecedent
Any antecedent can be removed by clicking the Remove button on the right hand side of the atom display.
C.5.8 Adding a Consequent to the Rule
Adding consequents is almost exactly the same as adding antecedents: the only difference is that the Add Consequent button is used.
Selecting Consequent Type As with antecedents, it may be necessary to select the type of the consequent. As this process is identical to that of selecting the type of a new antecedent, please see section C.5.6 for instructions on how to do this.
C.5.9 Removing a Consequent
Any consequent can be removed by clicking the Remove button on the right hand side of the atom display.
C.5.10 Creating a Similar Rule
It is often desirable to produce a rule with the same set of antecedents but with different consequents. The Create a Similar Rule button allows this: it creates a new rule of the same type as the one currently being defined, and assigns it a copy of the antecedents, allowing a different set of consequents to be defined. To use this function:
1. Click on the Create a Similar Rule button below the rule definition part of the interface.
2. MAKTab then creates a new rule, and assigns it the same set of antecedents as the previous rule.
3. The new rule can then be edited as normal.
C.5.11 Acquiring the Next Rule
As discussed previously, the KA tool of MAKTab walks through the acquisition/definition of all the rules relevant to a particular domain concept. Once the definition of the current rule is complete, the next rule can be defined by:
1. Clicking on the Acquire Next Rule button.
2. It may be necessary to select the type of rule to acquire next, depending on the relations between rules: see below for help with this.
3. When possible, MAKTab attempts to copy the consequents of the previous rule in the path (which may not necessarily be the rule that was acquired most recently), to be the antecedents of the next (new) rule, in order to provide a starting point for the definition of the new rule.
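The seeding behaviour in step 3 can be sketched as follows. The `Rule` class and function names are a minimal stand-in invented for this example, not MAKTab's internal rule representation.

```python
# Sketch of step 3 above: the consequents of the previous rule in the
# path become the starting antecedents of the next rule to be acquired.

class Rule:
    def __init__(self, rule_type, antecedents=None, consequents=None):
        self.rule_type = rule_type
        self.antecedents = list(antecedents or [])
        self.consequents = list(consequents or [])

def acquire_next(previous, next_type):
    """Start the next rule with the previous rule's consequents copied in."""
    return Rule(next_type, antecedents=previous.consequents)

symptom_rule = Rule("symptom",
                    antecedents=["elevator will not move vertically"],
                    consequents=["motor is jammed"])
cause_rule = acquire_next(symptom_rule, "cause")
print(cause_rule.antecedents)  # ['motor is jammed']
```

This is why rule acquisition feels like walking a chain: each rule's conclusions become the conditions the next rule elaborates.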
Selecting Rule Type It can be the case that the PS defines that a particular type of rule is related to several other types of rules. In the KA tool, this means that once a rule of that type has been defined, there are several options for the type of rule to define next. When this happens, MAKTab asks the user to select which type of rule to acquire next. To choose the type of the next rule that will be defined:
1. MAKTab will display a rule type selection dialog, shown in figure C.15: at the top of the dialog is a description of the current rule, and on the left is a list of the available next rule types; when a rule type is selected, the various fields on the right side of the dialog are updated to display the details of that rule type.
2. Select a rule type from the Next Rule Types list.
3. Use the information displayed on the right of the dialog to determine which type of rule to acquire next. Eventually at least one rule of each type should be acquired, so if there is no preference for which rule type to choose, select the first rule type.
4. Click on the OK button to start defining the next rule.
5. The operation can be cancelled at any point by clicking on the Cancel button.
C.5.12 Viewing Existing Rules
All the rules that have already been defined can be browsed using the Browse Existing Rules area at the top right of the KA tool interface (labelled 2 in figure C.12). To browse the existing rules:
1. Use the Available rule types drop down box to select the type of rules to display.
2. The list below the drop down box will then be updated to display all the rules of that type.
C.5.13 Editing an Existing Rule
It may at some point be necessary to edit an existing rule; this can be done using the Defined Rules area of the KA tool:
1. Use the Available rule types drop down box to select the type of rule to display.
Figure C.15: Dialog for selecting the type of the next rule to be acquired.
Figure C.16: The atom display interface.
2. The list below the drop down box will then be updated to display all the rules of that type.
3. Select the rule in the list that needs to be edited.
4. Click on the Edit Rule button.
5. The rule definition area will now be updated to display the rule, which can then be edited in the usual way.
C.5.14 Creating a New Rule
Likewise, it may be desirable to add a new rule of a certain type. To create a new rule:
1. Use the Available rule types drop down box to select the type of the new rule.
2. Click on the Create New Rule button, below the list of existing rules.
3. The rule definition area will now update to display a new rule, which can then be edited in the usual way.
C.5.15 Deleting an Existing Rule
It may be the case that a rule is no longer required and should be deleted from the PS/KBS. To delete a rule:
1. Use the Available rule types drop-down box to select the type of the rule that is to be deleted.
2. The list below the drop-down box will then be updated to display all the rules of that type.
3. Select the rule in the list to delete.
4. Click on the Delete Selected Rule button, below the list of existing rules.
Note that this will delete the rule, but it will leave the actual atoms of the antecedent and consequent lists, so that they can be used by other rules.
C.5.16
Editing an Atom
When an atom is added to a list of antecedents or consequents, it will usually need to be edited. A screenshot of the atom display is shown in figure C.16. The atom display shows the properties of the atom: for each property, the property name and its current value (or a message if no value has been set) are displayed. To set or change the value of a property:
1. Left click on the black downwards arrow to the right of the name of the property.
Figure C.17: The menu of available options for editing the value of a property of an atom.
2. This will bring up a menu of options, the range of which will depend on the type of the property. When the property is a datatype property (i.e. one which can have, for example, strings, numeric values, Booleans, and so on as its value), the menu only has the option to edit the value; clicking on this option will bring up a dialog with the relevant part of the individual definition form, which can be used to set the value; when the value has been edited satisfactorily, close the dialog.
3. If the property is an object property, it allows other individuals to be set as its value. In this case, a range of options are displayed in the menu. These options are:
(a) Create a New Value - allows the definition of a new individual, which is then set as the value of the property.
(b) Select an Alternative Value - allows the selection of an individual already defined in the ontology to be set as the value of the property.
(c) Change the Type of this Value - the property may allow different types of individuals to be set as the value (the type of an individual is the class that it is an individual of); this option (which is only displayed when there is already a value set) allows the class of the individual which is currently the value to be changed.
When an individual is set as the value of a property, the atom display will show the properties of that individual as defined by the individual's browser text option (see the Protégé manual for information on setting this). This can lead to a slightly more complex interface, especially when properties of an individual have other individuals as their values; to keep the interface clear, boxes are displayed around each property to separate the display of each property's value.
C.5.17
Creating Expressions
Expressions are often required in PSs: for example, it may be necessary to state that property x of instance y < property a of instance b. In MAKTab, expressions are just another type of atom. Each expression essentially has two properties: a function symbol and a list of arguments. In general, each argument can be:
1. A literal value.
2. A SystemVariable.
3. A property of a SystemComponent class.
4. Another expression.
However, the permitted values vary depending on the PS. This means that creating an expression in MAKTab is just like creating any other type of atom. The only slightly unusual aspect of creating expressions in MAKTab is that it (typically) uses prefix notation for building expressions (MAKTab can also be configured to use postfix notation). In prefix notation the operation is listed before the arguments, instead of between them as in infix notation. For example, property x of instance y < property a of instance b is expressed as < property x of instance y property a of instance b.
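The prefix layout can be sketched in a few lines of illustrative Python (this is not part of MAKTab; MAKTab stores expressions as atoms, not token lists): a prefix expression is a function symbol followed by its arguments, each of which may itself be a value or a further expression.

```python
# Illustrative sketch: evaluating a prefix-notation expression given as a
# flat token list, where each operation is followed by its two arguments.
def eval_prefix(tokens):
    """Evaluate a prefix expression, e.g. ["<", 3, 5] for infix 3 < 5."""
    ops = {"<": lambda a, b: a < b,
           ">": lambda a, b: a > b,
           "+": lambda a, b: a + b}
    it = iter(tokens)

    def walk():
        tok = next(it)
        if tok in ops:                  # operation first, then arguments
            left = walk()
            right = walk()
            return ops[tok](left, right)
        return tok                      # literal value

    return walk()

# infix: 3 < 5  ->  prefix: < 3 5
print(eval_prefix(["<", 3, 5]))  # True
```

Nested expressions work the same way: the flat list `["+", 1, "+", 2, 3]` corresponds to the infix expression 1 + (2 + 3).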
C.5.18
Adding Arguments to Expressions and Lists in General
When an expression is being defined, MAKTab displays the function followed by a list of arguments, which can be empty when no arguments have yet been defined. It may be necessary to add more arguments to the list than MAKTab provides space for by default. New arguments can be added to the list by using one of the buttons below an argument, as shown in figure C.18:
1. +b - adds a new atom (argument) to the list of arguments before the one above the button.
2. +a - adds a new atom (argument) to the list of arguments after the one above the button.
3. del - removes the atom (argument) above the button from the list.
Figure C.18: Adding an argument to a list.
C.5.19
Generating the Executable KBS
Once all the rules required for the generic PS to work in the particular domain have been defined, all that is required to generate the executable KBS is to click on the Generate Rules button located near the bottom of the window. Follow any prompts that are displayed, and the code for the executable KBS will be displayed in a dialog box. Select this code (right click on the code, and select Select All), and copy it to the system clipboard (right click on the code, and select Copy). Currently, MAKTab generates a JessTab rule set which provides the KBS; to execute this, switch to JessTab and paste the rules into the text entry box near the bottom of the window (click in the box, and press ctrl+v). If the rules do not run automatically, enter (reset)(run) at the JessTab entry box, and press the return key.
C.5.20
Useful Functions
The generic PSs provide a range of useful functions which can be used during the rule definition process. One of particular use is the member$ function, which tests if the value of the first argument is a member of the value of the second argument. For example, if a Shelf class has a canSupportItem property, which lists the items the shelf can support, then to see if it can support the item defined by the Item class' name property, define the expression: member$ Item name, Shelf canSupportItem. Here member$ is the function, Item name is a SystemComponent property as the first argument, and Shelf canSupportItem is a SystemComponent property as the second argument.
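A Python analogue of this membership test may help clarify what the expression computes (the data here is hypothetical; in MAKTab the test runs inside the generated rule set):

```python
# Python analogue (hypothetical data) of the member$ test described above:
# is the item's name among the items the shelf can support?
shelf = {"canSupportItem": ["books", "plates", "vases"]}
item = {"name": "books"}

def member(value, value_list):
    """Mimics member$: True if value occurs in value_list."""
    return value in value_list

print(member(item["name"], shelf["canSupportItem"]))  # True
```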
Appendix D
A Short Introduction to Knowledge-Based Systems
D.1
Introduction
This chapter introduces the various concepts that are used throughout both this user manual and the MAKTab-related documents. The chapter starts with a (very) brief introduction to Knowledge-Based Systems (KBSs), explaining what they are, why they are useful and how they are created. This is followed by brief introductions to some of their underlying technologies: first ontologies; second Protégé, a tool for creating and editing ontologies; and third MAKTab. Finally, the process of building KBSs using ontologies, specifically within the Protégé environment, is discussed, as is the MAKTab tool, a plug-in to Protégé which helps with the task of building a KBS, primarily by reusing existing artefacts, such as ontologies and problem solvers (PSs).
D.1.1
Brief Introduction to KBS
A Knowledge Based System (KBS) is an Artificial Intelligence (AI) system which attempts to imitate human problem solving by applying some form of reasoning to domain knowledge, which is stored in a knowledge base. An early KBS (or Expert System, as they are also known) was the MYCIN system, which helped diagnose and recommend treatment for certain blood infections. Another, the XCON system, helped configure computer systems to match user requirements; the spirit of this system survives today in on-line computer retailing stores which allow the shopper to customise the components of their new system to meet their requirements. Systems for designing elevators have also been built, and other industries such as marketing, finance, banking and forecasting have all made use of KBSs. KBSs are developed for various reasons, such as providing decision support, archiving rare skills, preserving the knowledge of retiring personnel, and aggregating domain knowledge from several experts. KBSs can store significant amounts of information, often including complex reasoning processes, which, once encoded into the system, are available for everyone to exploit. KBSs are generally composed of two discrete components: a domain knowledge base, which contains specific knowledge relating to a particular domain, for example a knowledge base of elevator parts; and some form of reasoning, such as a set of rules, which uses the information in the (domain) knowledge base to perform design or diagnosis. A design KBS in the elevator domain, for example, would consist of a knowledge base of elevator components, and a set of configuration (design) rules, which specify how the components in the elevator knowledge base
Figure D.1: High level view of a KBS for elevator design.
could be combined together to produce the design of a working elevator. A high level view of this is shown in figure D.1.
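The two-component structure can be sketched in a few lines of Python (a minimal, hypothetical illustration; real KBSs such as those built with MAKTab represent both parts far more richly):

```python
# Minimal sketch of the two KBS components (hypothetical data): a domain
# knowledge base of elevator parts, plus one configuration (design) rule
# that reasons over it.
knowledge_base = [
    {"type": "Motor", "name": "SmallMotor", "maxLoad": 500},
    {"type": "Motor", "name": "BigMotor", "maxLoad": 5000},
]

def select_motor(required_load):
    """Design rule: select a motor able to carry the required load."""
    for component in knowledge_base:
        if component["type"] == "Motor" and component["maxLoad"] >= required_load:
            return component["name"]
    return None  # no suitable component in the knowledge base

print(select_motor(1000))  # BigMotor
```

The knowledge base could be swapped for another domain's components without touching the rule's structure, which is the separation the text describes.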
D.1.2
Brief Introduction to Ontologies
One commonly quoted definition of an ontology is “an explicit specification of a conceptualisation” [47]. Essentially, this means that an ontology provides a formal model of some concept or domain. In KBSs, ontologies are typically used as a way to define both the domain the KBS is working in, and sometimes the reasoning the KBS uses. Briefly, an ontology consists of a series of classes, each of which represents some concept or entity: this could be, for example, the motor of an elevator. A class is essentially a schema which defines how instances of that class must look, similar to how a table in a database defines what each tuple (or row) must look like. Each class has associated with it a name and a series of properties; for example, the ElevatorMotor class may have properties such as maximumSupportedWeight, horsePower, modelName, and so on. When defining instances, the allowed values a property can take vary depending on the formalism used to represent the ontology. Typical valid values are strings, numeric values (integers, floats), Booleans, another instance and, sometimes, a class. For example, the maximumSupportedWeight property of the ElevatorMotor class may have as its value 1000Kg, and as its manufacturedBy value it may have an instance of the Manufacturer class. A class can be instantiated, that is, an instance of the class can be created to represent an actual instance of that concept; for example, an instance of the ElevatorMotor class might be BigMotor, which has property values: maximumSupportedWeight 5000Kg, horsePower 10000, modelName 5000kgBigMotor, and manufacturedBy LiftMotorsInc.
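The class-as-schema idea can be illustrated with the ElevatorMotor example from the text (a sketch only: Python dataclasses stand in for ontology classes, and are much weaker than a real ontology formalism):

```python
# Sketch of classes and instances: dataclasses play the role of ontology
# classes; property names follow the ElevatorMotor example in the text.
from dataclasses import dataclass

@dataclass
class Manufacturer:
    name: str

@dataclass
class ElevatorMotor:
    modelName: str
    maximumSupportedWeight: int  # kg
    horsePower: int
    manufacturedBy: Manufacturer  # an object property: its value is another instance

# The BigMotor instance, with the property values given in the text.
BigMotor = ElevatorMotor(
    modelName="5000kgBigMotor",
    maximumSupportedWeight=5000,
    horsePower=10000,
    manufacturedBy=Manufacturer(name="LiftMotorsInc"),
)
print(BigMotor.manufacturedBy.name)  # LiftMotorsInc
```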
Protégé
Ontologies are typically defined using some specific language or formalism, such as Frames or OWL. All of these languages have their own particular syntax for writing ontologies. In general, people tend to find it easier to create ontologies using a dedicated tool (an ontology editor) rather than writing them in, for example, XML. One of the most widely used ontology editors is Protégé. There are two versions of Protégé: Protégé Frames, which stores ontologies based on Frames, and
Protégé OWL, which enables the creation of ontologies in OWL. Both versions allow you to define classes and their properties, as well as new instances of the classes. For more information on Protégé, see the Protégé website - http://protege.stanford.edu.
D.1.3
Brief Introduction to KBS in Protégé
An instantiated ontology (i.e. one which has instances associated with it) can be viewed as a knowledge base: the instances provide information about the ontology’s domain. It would therefore be useful to be able to make use of this knowledge base. Ontologies have become very popular, with countless numbers of them being created and made publicly available. Potentially this provides many knowledge bases which could be used as part of KBSs, if they could be combined with some reasoning facility. Protégé has been designed so that developers can write plug-ins, which allow extra functionality to be added to it. Plug-ins are able to access the ontology/knowledge base within Protégé, and are available for use within the Protégé environment itself. Henrik Eriksson developed a plug-in called JessTab, which facilitates the application of Jess rules over a knowledge base within Protégé. Jess is a Java version of the popular Expert System Shell called CLIPS; an Expert System Shell is a program which facilitates the development of KBSs, so Jess provides a Java program which can be used for building KBSs. To build a KBS using Jess, it has to be provided with a knowledge base (which could be an instantiated ontology in Protégé), and a set of inference rules, which it can then apply to the knowledge base in an attempt to deduce conclusions. As illustrated in figure D.2, JessTab provides Jess with access to the knowledge base (ontology instances) that has been defined as an ontology loaded in Protégé. As such, to develop a KBS in Protégé using JessTab, all that needs to be provided are the inference rules.
Figure D.2: Illustration of how JessTab links the Protégé ontology and its individuals with Jess.
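The way a rule engine such as Jess repeatedly matches rules against a working memory of facts can be caricatured in a few lines of Python (a toy sketch only, far simpler than Jess's actual matching algorithm; the facts and rules here are hypothetical):

```python
# Toy forward-chaining loop in the spirit of Jess's (reset)(run): rules
# fire over a set of facts until no new fact can be deduced.
facts = {("motor", "jammed")}
rules = [
    # if the motor is jammed, then the elevator cannot move
    (("motor", "jammed"), ("elevator", "cannot-move")),
    # if the elevator cannot move, then call an engineer
    (("elevator", "cannot-move"), ("action", "call-engineer")),
]

def run(facts, rules):
    """Fire every applicable rule until the fact set stops growing."""
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            if antecedent in facts and consequent not in facts:
                facts.add(consequent)
                changed = True
    return facts

print(("action", "call-engineer") in run(facts, rules))  # True
```

Note how the second rule can only fire once the first has added its conclusion: chains of inference emerge from independent rules.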
D.1.4
Brief Introduction to MAKTab KBS Development Methodology
MAKTab is another plug-in for Protégé which is designed to help in the development of KBSs within the Protégé framework. MAKTab (short for MApping and Knowledge Acquisition Tab)
provides a series of partial (referred to as generic) PSs, which can be used to provide the reasoning capabilities of a KBS. Essentially, a generic PS is the specification of a domain-independent strategy which realises a common task. As PSs are designed to be domain independent, in order to provide some worthwhile function they must be configured with relevant domain knowledge; MAKTab is designed to support this process. MAKTab provides various generic PSs which are designed to be configured to work within various domains. Each PS consists of an ontology, which defines the domain concepts and the types of inference rules that it uses to perform its reasoning. The configuration process involves, first, providing the PS with the domain knowledge by having the user define mappings for copying the concepts from a domain ontology/knowledge base to the PS ontology; and, second, helping the user define rules relevant to those concepts which allow the PS to work within the domain. After these rules are defined, MAKTab converts them into an executable form, for example JessTab rules, which provides an executable KBS. The rules that a KBS is composed of are often related, and MAKTab uses this characteristic to structure and guide the process of defining new rules for a particular domain/application. For example, consider a KBS concerned with the diagnosis of an elevator. At a simple level, diagnosis is concerned with determining the causes of some observed fault, and providing some knowledge regarding the repair of this fault.
Often one fault is caused by another: this gives rise to three simple but related rule types. In the first, the faults causing a symptom are determined (for example, if the elevator will not move vertically, then either the motor is jammed or the level selection buttons are broken); in the next rule type, the causes of a fault are determined (for example, a broken elevator motor can be caused either by no power being received or by the cogs having broken); in the final type of rule, repairs for causes are suggested (for example, if no power is being received, then ensure the power connection has not been switched off). MAKTab uses this structure to ask for the symptoms related to a concept (e.g. the elevator) to be defined, along with the faults that cause those symptoms; it then asks for the causes of these faults, and finally for the repair actions for these causes. Of course, the causes of the faults may themselves have causes, which can in turn have further causes, and so on, until eventually something can be repaired.
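The symptom, fault, cause and repair chain described above can be sketched as three related rule tables (illustrative Python using the elevator examples from the text; the data and structure are hypothetical simplifications of MAKTab's rule types):

```python
# Sketch of the symptom -> fault -> cause -> repair chain: three rule
# tables, mirroring the three related rule types described in the text.
symptom_rules = {"will not move vertically": ["motor jammed", "buttons broken"]}
cause_rules = {"motor jammed": ["no power received", "cogs broken"]}
repair_rules = {"no power received": "ensure the power connection is switched on"}

def diagnose(symptom):
    """Follow the rule chain from an observed symptom to repair actions."""
    repairs = []
    for fault in symptom_rules.get(symptom, []):
        for cause in cause_rules.get(fault, []):
            if cause in repair_rules:
                repairs.append(repair_rules[cause])
    return repairs

print(diagnose("will not move vertically"))
```

A fuller version would recurse through cause_rules, reflecting the point that causes can themselves have causes.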
Appendix E
MAKTab KBS Development Introduction - Computer Configuration
E.1
Introduction
Thank you for agreeing to take part in this experiment. During this experiment, you will be asked to use MAKTab to develop a Knowledge Based System (KBS) which helps design/configure new computers. This KBS will use knowledge of computer hardware to produce the specification of a new computer, in terms of the components that it will consist of. Before starting this experiment you will be provided with various materials to help you with the development of the KBS. You are free to ask for any assistance while looking at these materials, prior to starting the development of the computer configuration KBS; these materials will be available when you are using the tool.
E.2
Methodology
This experiment is structured as follows:
1. After reading this document, you will be provided with the following:
(a) A Short Introduction to Knowledge Based Systems, which describes MAKTab, along with the various related concepts: ontologies, problem solvers (PSs), Knowledge Based Systems (KBSs) and Protégé. You are encouraged to read this document if you are unfamiliar with any of these concepts.
(b) How to Adapt the Generic Propose-and-Revise Problem Solver for a Domain, which describes the propose-and-revise algorithm: a PS which deals with designing new artefacts (in this case computers). This document describes both the general algorithm and how you can adapt it to work in a domain (in this case computer hardware). You are encouraged to read this document to help understand what is required when building the KBS.
(c) MAKTab Tutorial for Building a Propose-and-Revise Based KBS, which provides a simple walk-through example of how to use MAKTab to build a propose-and-revise based KBS. If you are unfamiliar with MAKTab, you are encouraged to do the walk-through exercise. You will also be provided with the MAKTab User Manual.
(d) An Introduction to Building a Desktop Computer. This document describes how the various computer hardware components are put together to build a new computer; it describes the various components (their functions, etc.) along with constraints on
how they can be combined together to produce a working system. It also provides a description of all the components in the domain ontology that you will be provided with.
2. After reading the introductory materials and completing the walk-through (if desired), when you are ready to start building the KBS you will be provided with a computer domain ontology and the generic propose-and-revise PS, for use with MAKTab; along with a working version of Protégé with MAKTab installed, which you will use to build the KBS.
3. While building the KBS you are free to refer back to any of the introductory/tutorial materials that you have been provided with.
4. Once you have entered all the rules for the KBS using MAKTab, generate the executable rules and test that they work. The introduction to building computers document provides a list of the components in the domain ontology along with some sample valid configurations, which you can use to determine whether the KBS you have developed produces the correct results.
E.3
And Finally...
Feel free to ask any questions you may have regarding the experiment. Thank you once again for taking part in this experiment.
Appendix F
MAKTab KBS Development Introduction - Computer Diagnosis
F.1
Introduction
Thank you for agreeing to take part in this experiment. During this experiment, you will be asked to use MAKTab to develop a Knowledge Based System (KBS) which helps diagnose faults with computers, specifically computer hardware. This KBS will use knowledge of the different computer hardware components, including the faults associated with the various components, symptoms of these faults, and repairs for these faults, to help produce accurate diagnoses for a given set of symptoms. Before starting this experiment you will be provided with various materials to help you with the development of the KBS. You are free to ask for any assistance while looking at these materials, prior to starting the development of the computer diagnosis KBS.
F.2
Methodology
This experiment is structured as follows:
1. After reading this document, you will be provided with the following:
(a) A Short Introduction to Knowledge Based Systems, which describes MAKTab, along with the various related concepts: ontologies, problem solvers (PSs), Knowledge Based Systems (KBSs) and Protégé. You are encouraged to read this document if you are unfamiliar with any of these concepts.
(b) How to Adapt the Generic Diagnostic Problem Solver for a Domain, which describes the diagnosis PS that you will use as the basis for the KBS you are building. This document describes both the general algorithm and how you can adapt it to work in a domain (in this case computer hardware). You are encouraged to read this document to help understand what is required when building the KBS.
(c) MAKTab Tutorial for Building a Diagnosis Based KBS, which provides a simple walk-through example of how to use MAKTab to build a diagnosis based KBS. If you are unfamiliar with MAKTab, you are encouraged to do the walk-through exercise. You will also be provided with the MAKTab User Manual.
(d) An Introduction to Computer Hardware Fault Diagnosis. This document describes the faults associated with various computer hardware components and, where relevant, repairs for each of these faults.
2. After reading the introductory materials and completing the walk-through (if desired), when you are ready to start building the KBS you will be provided with a computer domain ontology and the generic diagnosis PS, for use with MAKTab; along with a working version of Protégé with MAKTab installed, which you will use to build the KBS.
3. While building the KBS you are free to refer back to any of the introductory/tutorial materials that you have been provided with.
4. Once you have entered all the rules for the KBS using MAKTab, generate the executable rules and test that they work. The introduction to computer diagnosis document provides a set of test cases which you can use to determine whether the KBS you have developed produces the correct results.
F.3
And Finally...
Feel free to ask any questions you may have regarding the experiment. Thank you once again for taking part in this experiment.
Appendix G
How to Adapt the Generic Propose-and-Revise Problem Solver for a Domain
G.1
The General Algorithm
Propose-and-revise is a design problem solver (PS): it attempts to design an artefact based on knowledge about the objects that make up that artefact. To do this, the algorithm uses three types of knowledge about the objects:
1. Rules which define how the various objects are combined to produce the artefact.
2. Constraints which enforce some limitation(s) on how these objects can be combined.
3. Fixes which define what to do if, during the design process, a constraint is violated.
Once the PS has been provided with this knowledge, it must be provided with an initial list of objects with which to start the design process; this is called the initial design. The PS then applies the following steps to build the artefact (also illustrated in figure G.1):
1. Repeat the following steps:
(a) Propose an extension to the design, using the configuration rules.
(b) Check if any of the constraints have been violated.
(c) If a constraint has been violated, and there are appropriate fixes:
i. Apply the fix.
ii. Return to step 1(b).
(d) Go back to step 1(a).
2. Until there are no more remaining design extensions and there are no violated constraints, in which case a solution has been found; or until there are no more fixes for violated constraints, in which case no solution can be found.
Consider the following example. You are faced with the problem of designing a shelf, 2m long by 1m wide, to hold various objects. There are a variety of options for the material of the shelf, all of which will break if too much weight is placed on them. Consider the following materials: chipboard, which can support a load of up to 2 kg/m2; oak, which can support 4 kg/m2; and steel, which can support up to 8 kg/m2. Apart from the maximum load, price is also an important factor in the choice of material: chipboard is the cheapest of the three options, costing £10/m2; oak is the most expensive, costing £50/m2;
Figure G.1: Flow diagram illustration of the propose-and-revise algorithm.
and steel is therefore the mid-priced material, costing £25/m2. This information is represented in table G.1.

Material     Price (£/m2)   Max. Supported Load (kg/m2)
Chipboard    10             2
Oak          50             4
Steel        25             8

Table G.1: Sample materials for constructing a shelf.

For cost reasons, you may initially decide to use the cheapest material, and so opt for chipboard; this initial design state is shown top left in figure G.2 - the maximum shelf load is the size of the shelf (in m2) * the material's maximum load (per m2). However, there is a constraint on the shelf which states that it must be able to hold up to 7 kg; this constraint has therefore been violated (figure G.2, State 1). There are two possible fixes for this violation (figure G.2, middle left): to choose either an oak or a steel shelf (both of these will support the required 7 kg load). The choice of which material to use is then based on which option is more desirable. Desirability can be based on anything: lower priced options may be more desirable than higher priced ones, for
example; perhaps cost is not an issue but supported weight is, so shelves which can hold more are more desirable; perhaps a balance of both; or there may be other requirements, such as using a particular material to fit the environment the shelf will be put in (i.e. the colour of the material), and so on. In this example, we assume desirability is based on cost (low cost equals high desirability), in which case the steel shelf would be chosen and the design is revised, using steel as the material (figure G.2, State 2). This revised design is then re-evaluated: there are no violated constraints (the steel shelf can support the required 7 kg), and so the design process is complete. The cost of the shelf is also calculated, using the formula: cost = size of shelf (in m2) * material cost per m2.
Figure G.2: Illustration of how the propose-and-revise algorithm solves the shelf design problem.
Along with the domain knowledge represented in table G.1, this example also uses the following:
• An initial selection of chipboard as the shelf’s material.
• An initial assignment that the required load of the shelf is 7 kg.
• Two assignment rules, which specify:
– The maximum supported shelf load is: the size of the shelf * the maximum load of the material being used.
– The cost of the shelf is the area of the shelf (m2) * the cost of the material (per m2) being used.
• A constraint rule, which specifies: if the capacity of the shelf material is less than the required load, then there is a “shelf not strong enough” violation.
• A fix rule, which specifies: if there is a “shelf not strong enough” violation, then choose the cheapest material whose maximum supported load is over the required load.
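The whole shelf example can be traced in a short, illustrative Python sketch of the propose-and-revise loop (a simplification: the real PS works with rules, atoms and an ontology rather than hard-coded dictionaries):

```python
# Worked sketch of the shelf example: the propose-and-revise loop with
# the materials of table G.1, a 2 m2 shelf and a required load of 7 kg.
materials = {                     # price in £/m2, max load in kg/m2
    "chipboard": {"price": 10, "max_load": 2},
    "oak":       {"price": 50, "max_load": 4},
    "steel":     {"price": 25, "max_load": 8},
}
AREA, REQUIRED_LOAD = 2.0, 7.0    # shelf size (m2) and required load (kg)

def propose_and_revise():
    design = "chipboard"          # initial component selection rule
    while True:
        # assignment rule: capacity = shelf size * material's max load
        capacity = AREA * materials[design]["max_load"]
        if capacity >= REQUIRED_LOAD:     # constraint rule satisfied
            # assignment rule: cost = shelf area * material price
            return design, AREA * materials[design]["price"]
        # fix rule: cheapest material whose capacity meets the load
        fixes = [m for m in materials
                 if AREA * materials[m]["max_load"] >= REQUIRED_LOAD]
        if not fixes:
            return None, None     # no fix available: no solution
        design = min(fixes, key=lambda m: materials[m]["price"])

print(propose_and_revise())  # ('steel', 50.0)
```

As in the text, the chipboard proposal violates the capacity constraint, and the fix rule revises the design to steel (cheaper than oak among the viable materials).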
G.2
Requirements
This simple example demonstrates the requirements of the propose-and-revise PS, which are:
1. Information about the objects in the domain that it will be working in (in this case, materials for shelving).
2. Information about other factors that need to be considered in the design, called variables (e.g. the load the shelf must support, the maximum load the shelf can support, and the cost of the shelf).
3. Initial selection information regarding which domain objects to start the design with (here, to start with the chipboard material).
4. Initial values for the other factors (e.g. the size of the shelf is 2 m2, the required load is 7 kg, and so on).
5. How to calculate the values of factors that are dependent on other factors (e.g. the cost of the shelf and the maximum load the shelf can support).
6. Constraint rules which place restrictions on the domain objects or other factors (the shelf must be able to hold at least the required load).
7. Fix rules which specify what to do when a specific constraint is violated (select a shelf material which supports at least the required load).
8. Which values (both domain objects and other factors) should be displayed as output once the design is complete (here, the material and cost of the shelf).
G.3
How to Configure the Generic Propose and Revise PS for a Domain
Section G.2 describes the types of knowledge the generic propose-and-revise PS requires in order to work in a domain. This section provides some more details on how this knowledge is formulated.
G.3.1
Step 1 – Mapping
Before the PS can be provided with the domain specific knowledge/rules, it needs to be provided with knowledge of the domain. This is achieved by mapping the domain knowledge from a domain ontology to the propose-and-revise PS ontology; please see the MAKTab User Manual for instructions on performing mapping within MAKTab. Typically, when using the propose-and-revise PS, it is desirable to map domain knowledge, both classes and individuals, to the SystemComponent class of the propose-and-revise ontology, which is used by the PS to store the domain objects. Preferably this mapping is achieved by copying classes as subclasses of SystemComponent, or by creating new individuals of SystemComponent.
G.3.2
Step 2 – Rule Knowledge Acquisition
After some knowledge has been mapped to the PS ontology, the knowledge acquisition (KA) tab of MAKTab can be used to define the domain specific knowledge/rules described in section G.2. This section discusses how these rules are defined, in terms of the order in which they are acquired, along with the structure of each rule. When KA is started for a particular domain concept, the KA tool follows a “path” which it uses to guide the user in defining the rules related to the selected domain concept. Selecting a domain object and pressing the Start KA button in MAKTab will result in the following path of rule acquisition/definition being followed, which is shown in figure G.3. First, either an Initial Component Selection Rule is defined, which defines the components (domain objects) that the algorithm should start the design process with; or an Initial System Variables Value rule is defined, in which the (possibly initial) value of any SystemVariables can be defined (the SystemVariable class is used by the PS to represent any features of the domain which it would not be correct to represent as domain objects (SystemComponents), such as the total cost of the shelf). Then either an Output Calculated Component Rule is defined, which allows the specification of the components (domain objects) that should be displayed after the design is completed; or an Output Calculated System Variables Rule is defined, which specifies the SystemVariables that should be displayed after the algorithm completes. Then a System Variable Value Assignment Rule is defined, which specifies how to calculate a value (typically for a SystemVariable); then a System Variable Constraint Rule is defined, which specifies constraints on SystemVariables or components; and finally a Fix Rule is defined, which specifies a fix for a violated constraint.
Figure G.3: Illustration of the path the KA tool follows when acquiring rules for propose-and-revise, which starts with one of the rules on the left, then one of the rules in the middle, and finally the three rules on the right.
G.3. How to Configure the Generic Propose and Revise PS for a Domain
G.3.3 Propose and Revise Rules
The generic propose-and-revise PS defines seven types of rule, which are used to express the domain specific reasoning knowledge required to configure the generic propose-and-revise PS for a domain. In this section, each rule is described in terms of its purpose and the allowed types of antecedents and consequents.
InitialComponentSelectionRules
Purpose: To define the components (domain objects) that the algorithm should start the design process with. For example, in the shelf example above, to specify that it should start with the chipboard shelf.
General Notes: It is possible to define one InitialComponentSelectionRule listing all the components (domain objects) that the PS should use, but it is equally acceptable to define a separate rule for every initial component. This type of rule does not need any consequents; it simply requires a list (of one or more items) of the initial components as its antecedents.
Allowed Antecedents:
• SystemComponentIndividualAtom (section G.3.4).
Allowed Consequents: Not required.
InitialSystemVariablesValue
Purpose: To specify the value of a SystemVariable or of a property of a SystemComponent before the propose-and-revise algorithm is run. For example, in the shelf example, to specify that the initial value of the SystemVariable for required shelf capacity is 7 kg.
General Notes: As with InitialComponentSelectionRules, it is possible to define one InitialSystemVariablesValue rule to define the values of all the SystemVariables, but it is also acceptable to define a separate rule for every SystemVariable. Again, these rules do not require any consequents.
Allowed Antecedents:
• SystemVariableOrPropertyIndividualValueAssignmentAtom (section G.3.4).
• SystemVariableOrPropertyDataTypeValueAssignmentAtom (section G.3.4).
Allowed Consequents: Not required.
OutputCalculatedComponentRule
Purpose: To specify that a calculated SystemComponent (or a list of them) should be displayed after the propose-and-revise algorithm has been executed. For example, to specify that the selected shelf should be displayed.
General Notes: This type of rule does not require any consequents.
Allowed Antecedents:
• SystemComponentClassAtom (section G.3.4).
Allowed Consequents: Not required.
OutputCalculatedSystemVariablesRule
Purpose: To specify that the calculated value of a SystemVariable should be displayed after the propose-and-revise algorithm has been executed. For example, to specify that the value of the required shelf capacity SystemVariable should be displayed.
General Notes: This type of rule does not require any consequents.
Allowed Antecedents:
• SystemVariableAtom (section G.3.4).
Allowed Consequents: Not required.
SystemVariableValueAssignmentRule
Purpose: To specify that if some SystemVariables or properties of SystemComponents have certain values, then the value of a SystemVariable or a SystemComponent property should be set to a certain value.
General Notes: It is not necessary for there to be any antecedents in this rule: for example, a rule can state that the total weight of the shelf is the weight of the shelf plus the weight of the objects it holds, without having to state that the shelf and the objects it holds must each have a weight.
Allowed Antecedents:
• SystemVariableOrPropertyIndivdualValueAtom (section G.3.4).
• SystemVariableOrPropertyDatatypeValueAtom (section G.3.4).
• SystemVariableOrPropertyDatatypeValueExpressionAtom (section G.3.4).
Allowed Consequents:
• SystemVariableOrPropertyIndividualValueAssignmentAtom (section G.3.4).
• SystemVariableOrPropertyDatatypeValueAssignmentAtom (section G.3.4).
• SystemVariableOrPropertyDatatypeValueExpressionAssignmentAtom (section G.3.4).
SystemVariableConstraintRule
Purpose: To define a constraint on the value of a SystemVariable, or on a property of a SystemComponent.
General Notes: Follows the SystemVariableValueAssignmentRule in the KA flow, and precedes the FixRule.
Allowed Antecedents:
• ConstraintAtom (section G.3.4).
Allowed Consequents:
• ConstraintViolationAtom (section G.3.4).
FixRule
Purpose: To specify a fix that should be applied in response to a constraint violation. A fix is associated with a desirability value in the range 1 to 10, where 1 is most desirable and 10 least desirable.
General Notes: Follows the SystemVariableConstraintRule in the KA flow.
Allowed Antecedents:
• ConstraintViolationAtom (section G.3.4).
Allowed Consequents:
• FixAtom (section G.3.4).
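To make the antecedent/consequent structure concrete, the following sketch encodes a FixRule as lists of typed atoms. The class and property names here (`Atom`, `Rule`, `hasDesirability`) are illustrative assumptions for this sketch and not the PS ontology's actual API; only the atom type names and the `hasViolatedConstraintName`/`newIndividualValue` properties come from the descriptions in this appendix.

```python
from dataclasses import dataclass, field

# Illustrative encoding of a propose-and-revise rule: a rule type plus
# lists of typed atoms for its antecedents and consequents.
@dataclass
class Atom:
    atom_type: str                # e.g. "ConstraintViolationAtom"
    properties: dict = field(default_factory=dict)

@dataclass
class Rule:
    rule_type: str
    antecedents: list = field(default_factory=list)
    consequents: list = field(default_factory=list)

# A FixRule: if the (hypothetical) "capacity" constraint is violated,
# apply an UpgradeFix with desirability 1 (most desirable).
fix = Rule(
    rule_type="FixRule",
    antecedents=[Atom("ConstraintViolationAtom",
                      {"hasViolatedConstraintName": "capacity"})],
    consequents=[Atom("UpgradeFix",
                      {"hasDesirability": 1,          # assumed property name
                       "newIndividualValue": "Steel"})],
)
assert fix.antecedents[0].atom_type == "ConstraintViolationAtom"
```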
G.3.4 Atoms
SystemComponentIndividualAtom
This atom is used to select an individual of type SystemComponent, or of a subclass of it. It has one property: hasComponent, which is used to select a SystemComponent (or subclass) individual.
SystemVariableOrPropertyIndividualValueAssignmentAtom
This atom allows the specification of an individual as the value for a SystemVariable or a property of a SystemComponent. The hasVariableOrProperty property allows the selection of a SystemVariable or SystemComponentPropertyAtom which will be assigned some value. The hasIndividualValue property specifies the individual that will be set as the value of the selected SystemVariable or SystemComponent property.
SystemVariableOrPropertyDatatypeValueAssignmentAtom
This atom allows the specification of a datatype value (e.g. a string or numeric value) for a SystemVariable or a property of a SystemComponent. The hasVariableOrProperty property allows the selection of a SystemVariable or SystemComponentPropertyAtom which will have its value set; the hasDatatypeValue property specifies that value. There is also an isANumericValue property, which can be set to true or false to indicate whether the value is numeric.
SystemVariableOrPropertyDatatypeValueExpressionAssignmentAtom
This atom allows a value, determined by evaluating some expression, to be set as the value of a SystemVariable or a property of a SystemComponent. The hasAssignDatatypeResultOfExpression property allows the selection/definition of an expression, which will be evaluated and the result assigned to the selected SystemVariable or SystemComponent property.
SystemComponentClassAtom
This atom allows the selection of the SystemComponent class, or one of its subclasses. The swrl:classPredicate property is used to select the class.
SystemVariableAtom
This atom allows the selection of a SystemVariable individual. The hasSystemVariable property is used to select the SystemVariable.
ConstraintViolationAtom
Specifies that a constraint has been violated. The hasViolatedConstraintName property is used to specify the name of the constraint that has been violated.
FixAtom
Specifies a Fix which should be applied. There are various types of fixes, which fall into two categories: predefined, i.e. fully defined by the rules, and determined at run time, in which part of the fix is selected based on certain conditions. All fixes have a name and a desirability. Predefined fixes have an assignToSystemVariableOrProperty property which specifies the SystemVariable or SystemComponent property that will have its value altered. The predefined fixes are:
• UpgradeFix, which specifies a (component) upgrade through the newIndividualValue property;
• AssignIndividualFix, which specifies an individual that should be set as the value;
• IncreaseFix and DecreaseFix, which specify that a (numeric) value should be increased or decreased, respectively, by the amount specified in the deltaAmount property;
• AssignLiteralValueFix, which specifies that a newLiteralValue should be assigned;
• AssignValueOfSystemVariableOrComponentProperty, which specifies that the value of a variable or a property of a particular selected component should be set as the value of another variable or component's property;
• ExpressionFix, which specifies that the result of evaluating an expression defined by the hasAssignDataTypeResultOfExpression property should be assigned.
Run time determined fixes include PropertyValueBasedFix, which specifies that a SystemComponent of the type (class) selected by the replaceCurrentComponentType property should be replaced with an individual that satisfies the expression defined by the assignIndividualWithPropertiesThatSatisfy property.
ConstraintAtom
Specifies a constraint by setting the value of the hasConstraint property to be an instance of the Constraint class.
SystemVariableOrPropertyIndivdualValueAtom
This atom type allows the specification that a particular SystemVariable or a property of a SystemComponent must have a specific value, which must be an individual, as specified by the hasIndividualValue property.
SystemVariableOrPropertyDatatypeValueAtom
This atom type allows the specification that a particular SystemVariable or a property of a SystemComponent must have a specific value of a particular datatype (for example, a string, numeric or Boolean value), as specified by the hasDataTypeValue property.
SystemVariableOrPropertyDatatypeValueExpressionAtom
This atom type allows the specification that a particular SystemVariable or a property of a SystemComponent must have a specific value, calculated by evaluating the expression from the hasResultOfExpression property.
Appendix H
How to Adapt the Generic Diagnostic Problem Solver for a Domain
H.1 How to Adapt the Generic Diagnostic Problem Solver for a Domain
The generic diagnostic problem solver (PS) provided with MAKTab can be configured to provide the reasoning functionality of a diagnostic Knowledge-Based System in some domain. This configuration is achieved by defining a series of diagnostic rules for the chosen domain. These rules typically define that a particular system state (typically a faulty component or symptom) can be caused by some other system state (for example, another faulty component). Alternatively, the rules may specify that a particular (faulty) system state can be repaired by performing some (set of) action(s). The remainder of this document provides more details on the diagnostic algorithm used by the generic diagnostic PS, along with an example of diagnosis in the car domain; finally, a description of how to configure the generic diagnostic PS to work in a particular domain using MAKTab is provided.
H.1.1 The General Diagnostic Algorithm
Once the diagnostic rules have been defined for a particular domain, the PS builds a graph of the system states described in the rules. This graph records how one system state can be caused by another system state, which can in turn be caused by another, and so on until a repair action is defined. When executed, the PS uses this graph, along with knowledge about the state of the components in the domain (which it acquires during execution), to determine a diagnosis and suggest possible repairs. For example, consider the car domain. A typical system state may be that the engine will not start. In this example, this can be caused by two other system states: either the starter motor is faulty, or the battery is bad (a generic term which will be further expanded later). The usual action for repairing a faulty starter motor is to replace it. A bad battery can be caused by bad battery connections, or a flat battery. Bad battery connections can be repaired by resetting them. A flat battery can be caused by a faulty alternator, poor electrolyte content, or the lights being left on. Repairs for these states would be to replace the alternator, replace the battery fluid, and replace the battery, respectively. The graph which the diagnostic PS uses to represent this situation is shown in figure H.1. The cause rules it describes are listed in table H.1, and the repair rules are shown in table H.2.
Figure H.1: Example diagnosis graph for the car domain.

State                  Can be caused by state
Engine will not start  Bad battery
Engine will not start  Starter motor faulty
Bad battery            Bad connections
Bad battery            Flat battery
Flat battery           Alternator faulty
Flat battery           Poor electrolyte
Flat battery           Lights left on

Table H.1: Example cause rules in the car domain.
H.1.2 Requirements
The car example demonstrates the domain information that the generic diagnostic PS requires when being configured for a domain:
1. Information about the objects in the domain that it will be working with; in the above example these are components of a car engine.
2. Information about the states each component can be in. This can refer to the component as a whole (e.g. the battery being flat), or a property of the component (e.g. the electrolyte content of the battery being low).
3. Cause rules which specify how one system state can be caused by another (e.g. a flat battery can be caused by a faulty alternator).
4. Repair rules which specify how one system state can be repaired (e.g. replacing the alternator).
State                 Can be repaired by
Starter motor faulty  Replace starter motor
Bad connections       Reset connections
Alternator faulty     Replace alternator
Poor electrolyte      Replace battery fluid
Lights left on        Replace battery

Table H.2: Example repair rules in the car domain.
Once this information has been defined in MAKTab, it can be used to create an executable diagnostic PS for that particular domain.
H.1.3 Configuring the Generic Diagnostic Problem Solver
This section describes how to configure the generic diagnostic PS with the domain specific information it requires to work in a particular domain. Initially, the generic PS is provided with some information about the domain (typically descriptions of the objects/components of the domain) by defining mappings between a domain ontology and the generic diagnostic PS ontology; this information is then used by MAKTab's KA tool, which acquires the cause and repair rules for the domain concepts.
Step 1 – Mapping
Before MAKTab can be used to define the rules, it is necessary to provide it with knowledge about the domain. This is achieved by mapping the domain knowledge from a domain ontology (for example, an ontology of engine parts) to the diagnostic PS ontology. Please see the MAKTab User Manual for instructions on how to perform mapping with MAKTab. When using the diagnostic PS there are typically two ways of approaching the mapping. The simplest approach involves mapping the relevant domain classes and instances to subclasses (and associated instances) of the SystemComponent class of the PS ontology. The alternative approach involves more complex mappings, in which the domain components are mapped directly to instances of the various diagnostic related atom classes.
Step 2 – Guided Rule Acquisition
The Knowledge Acquisition (KA) tool of MAKTab can be used to define the cause and repair rules based around the mapped domain knowledge. This section discusses the ordering of the rule acquisition, along with the structure of the cause and repair rules, including details of the atoms that they consist of. When KA is started for a particular domain concept, the KA tool follows a "path" which it uses to guide the definition of the rules related to the selected domain concept. Selecting a domain concept and pressing the Start KA button on MAKTab will result in the following path of rule acquisition/definition being followed: first a Cause Rule will be acquired, which defines how one system state can cause another. This rule has the basic form "State X can be caused by State Y", where State X and State Y are states of the domain concepts. After a Cause Rule, either another Cause Rule will be acquired, or a Repair Rule, depending on which option is relevant at that time (the choice of which is left to the user). If the Cause Rule option is selected, then another Cause Rule is acquired, and the same process is carried out. If the Repair Rule option is selected, then
a rule of the form "State Y can be repaired by Z" is acquired (where Y is a state of a domain concept, and Z is some action). A Repair Rule does not have any rules which follow it. This path is represented in figure H.2.

Figure H.2: The path followed by MAKTab when acquiring diagnostic rules: first a Cause Rule is acquired, followed by either another Cause Rule, or a Repair Rule. No rules are acquired after a Repair Rule.
H.1.4 Diagnostic Rules
As described in the previous section, the generic diagnostic problem solver uses two types of rules to define the reasoning: cause rules and repair rules. In this section each rule is described in terms of its purpose, and the allowed types of antecedents and consequents.
Cause Rule
Purpose: To state that when a domain object or variable is in a particular state, this can be caused by another domain object or variable being in a particular state. This is achieved by selecting a domain concept (either a SystemVariable, a SystemComponent, or a property of a SystemComponent) and defining the value (or state) that it has. The value (state) of a domain concept can be another instance, a literal (for example, a string or numeric) value, or the result of some expression.
Allowed Antecedents:
1. SystemVariableOrComponentPropertyDatatypeValueAtom.
2. SystemVariableOrComponentPropertyInstanceValueAtom.
3. SystemVariableOrComponentPropertyDatatypeExpressionValueAtom.
Allowed Consequents:
1. SystemVariableOrComponentPropertyDatatypeValueAtom.
2. SystemVariableOrComponentPropertyInstanceValueAtom.
3. SystemVariableOrComponentPropertyDatatypeExpressionValueAtom.
Repair Rule
Purpose: To state that when a domain concept (either a SystemVariable, a SystemComponent, or a property of a SystemComponent) is in a particular state, it can be repaired by performing a particular action.
Allowed Antecedents:
1. SystemVariableOrComponentPropertyDatatypeValueAtom.
2. SystemVariableOrComponentPropertyInstanceValueAtom.
3. SystemVariableOrComponentPropertyDatatypeExpressionValueAtom.
Allowed Consequents:
1. RepairAtom.
H.1.5 Atoms
Each rule is composed of a list of antecedents and a list of consequents. These lists are composed of atoms of particular types. The allowed types of atoms for antecedents and consequents are restricted depending on the rule (see the previous section). This section provides a description of each of the atom types (classes) that are used by the diagnostic PS.
SystemVariableOrComponentPropertyDatatypeValueAtom
This atom type allows the specification that a particular SystemVariable (e.g. car acceleration), SystemComponent (e.g. the car battery), or property of a SystemComponent (e.g. the car battery's fluid level) is in a state defined by some literal value (e.g. a textual value such as "broken" or a numeric value such as 5).
SystemVariableOrComponentPropertyInstanceValueAtom
This atom type allows the specification that a particular SystemVariable, SystemComponent, or property of a SystemComponent is in a state defined by some other instance in the PS ontology.
SystemVariableOrComponentPropertyDatatypeExpressionValueAtom
This atom type allows the specification that a particular SystemVariable, SystemComponent, or property of a SystemComponent is in a state defined by evaluating an expression (for example, the battery fluid level < 5).
ComponentPropertyAtom
This type of atom is used by the other atom types to identify the property of a SystemComponent that is being referred to; for example, the property "fluid level" of the class "battery", which is used as part of some test.
ExpressionAtom
This type of atom is used to define an expression, for example: battery fluid level = 0.
RepairAtom
A repair atom is used to define a repair action, which can be performed in order to fix a faulty state. A repair consists of a textual description of the action which must be taken.
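An expression atom such as "battery fluid level < 5" can be pictured as a property, a comparison operator, and a value, tested against the component knowledge the PS acquires at run time. The sketch below is a hedged illustration of that idea; the function and dictionary names are assumptions, not the PS ontology's vocabulary.

```python
import operator

# Comparison operators an expression atom might use.
OPS = {"<": operator.lt, ">": operator.gt, "=": operator.eq}

def holds(component_properties: dict, prop: str, op: str, value) -> bool:
    """True if the component's property satisfies the expression,
    e.g. holds(battery, "fluid level", "<", 5)."""
    return OPS[op](component_properties[prop], value)

# Run-time knowledge about the battery component (illustrative values).
battery = {"fluid level": 3}
assert holds(battery, "fluid level", "<", 5)   # the "low fluid" state holds
```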
Appendix I
MAKTab Tutorial for Building a Propose-and-Revise Based KBS
I.1 Introduction
MAKTab provides a generic propose-and-revise problem solver (PS) which can be used to build a Knowledge Based System (KBS) for design in a particular domain. This document provides a tutorial on how to use MAKTab along with the generic propose-and-revise PS to build a simple design KBS for designing shelving. This document is designed to complement How to Adapt the Generic Propose-and-Revise Problem Solver for a Domain and the MAKTab User Manual.
I.2 The Domain: Shelving
The file ShelfMaterials.pprj contains a simple ontology describing the domain of shelf materials. This ontology describes each material in terms of its name, the maximum weight it can support (per m²), and its cost per m² (£/m²). There are three types of materials, represented as three instances of the Material class: Chipboard, which costs £10/m² and can support a maximum load of 2 kg/m²; Steel, which costs £25/m² and can support up to 8 kg/m²; and Oak, which costs £50/m² and can support up to 4 kg/m². When designing a new shelf, several factors must be taken into account: the maximum weight the shelf must be capable of supporting, the total cost, how it will fit in with the environment it is being placed in, the size of the objects it will support, size constraints of the wall it will be attached to, how it will be attached to the wall, and so on; all of these factors make designing a shelf a potentially tricky problem. In this tutorial you are going to build a basic shelf design KBS, which will select the appropriate material to satisfy the following requirements:
• The shelf must be able to support a maximum load of 7 kg.
• The shelf must be 2 m².
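The propose-and-revise behaviour the tutorial builds can be previewed as a toy loop: propose a material, check the capacity constraint, and apply an upgrade fix when it is violated. This is a sketch written from the description above; the upgrade order and variable names are assumptions, not the generated KBS itself.

```python
# Materials from ShelfMaterials.pprj: (name, cost £/m², max load kg/m²),
# listed in an assumed upgrade order (cheapest first).
MATERIALS = [
    ("Chipboard", 10, 2),
    ("Steel", 25, 8),
    ("Oak", 50, 4),
]
SHELF_AREA = 2      # m² (tutorial requirement)
REQUIRED_LOAD = 7   # kg (tutorial requirement)

def design_shelf():
    """Propose each material in turn; revise (upgrade) on violation."""
    for name, cost_per_m2, load_per_m2 in MATERIALS:
        supported = SHELF_AREA * load_per_m2     # "Shelf Supported Load"
        if supported >= REQUIRED_LOAD:           # capacity constraint holds
            return name, SHELF_AREA * cost_per_m2, supported
    return None  # no material satisfies the constraint

print(design_shelf())  # ('Steel', 50, 16)
```

Chipboard supports only 2 × 2 = 4 kg, so the constraint is violated and the fix upgrades the material; Steel supports 2 × 8 = 16 kg, which satisfies the 7 kg requirement at a total cost of £50.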
I.3 Building the Shelf Design KBS
You will build the shelf design KBS by using MAKTab with the generic propose-and-revise PS, to define the (selection, constraint and fix) rules necessary for the propose-and-revise PS to work in the shelf domain. The KBS will be developed by mapping the shelf (domain) knowledge stored in the Shelf ontology to the PS ontology, and then defining the various rules relevant for shelf design.
I.3.1 Loading the Propose and Revise Ontology
The first task is to load the propose-and-revise ontology into Protégé. To do this in Protégé:
1. Select File from the menu.
2. Select Open.
3. Navigate to where you saved the files related to this tutorial, and select the Generic-PnR.pprj file.
4. Click on the OK button.
I.3.2 Select MAKTab
You will use MAKTab to build the KBS. Select it by clicking on the MAKTab tab, located beside the other Protégé tabs, such as Classes, Instances (for frame ontologies) or Individuals (for OWL ontologies).
I.4 Mapping
You will now be presented with the mapping interface. This interface will be used to define mappings from the Shelf domain ontology to the propose-and-revise ontology; these mappings will allow MAKTab to transfer the domain knowledge to the PS ontology, so that it can then be used when defining the propose-and-revise rules relevant to the shelf domain.
I.4.1 Load the Shelf Ontology as the Source Ontology
The shelf ontology will be used as the source ontology for the mappings. To do this:
1. Using the Source Ontology area (left hand side), click on the Import External button.
2. This will display a file selection dialog. Use this to select the ShelfMaterials.pprj file, and click on the OK button.
The Shelf ontology will now be displayed as the source ontology.
I.4.2 Import the Propose and Revise Ontology as the Target Ontology
The propose-and-revise ontology will be used as the target ontology. To do this:
1. Using the Target Ontology area (right hand side), click on the Import Current button. This will import the ontology currently loaded in Protégé (in this case, the propose-and-revise ontology) as the target ontology.
I.4.3 Define a Copy Class Mapping
The next task is to define mappings between the two ontologies. For this tutorial, and as recommended for the propose-and-revise ontology, you will define a Copy a Class Mapping to copy the Material class from the Shelf ontology, to a subclass of the SystemComponent class of the propose-and-revise ontology: 1. Click on the Material class in the source ontology.
2. If it is not selected, select the Copy a Class Mapping option from the Select mapping type drop down box at the top of the Mapping Definition area.
3. Click on the Set Type button.
4. Click on the SystemComponent class in the target ontology. It may be necessary to first double click on the PSSystem class in order to display the SystemComponent class. The mapping definition area will be updated to show the SystemComponent class as the Target Superclass.
5. As you will need the individuals of the Material class, click on the box next to Also copy individuals in the mapping display area.
6. Click on the Store Mapping button near the bottom of the window.
This completes the necessary mappings for this tutorial. In other applications it will probably be necessary to define various other mappings; please see the MAKTab User Manual for guidance with this.
I.4.4 Execute the Mappings
As it is only necessary to define one mapping for this tutorial, the mappings can now be executed. This will copy the Material class and its individuals to the propose-and-revise ontology.
1. Click on the Execute Mappings button, near the bottom of the window.
2. After a short pause, a dialog stating that the mappings have been executed successfully will be displayed. Click the OK button on this dialog.
You can check that the mappings have been successfully applied by clicking on the Material class in the Target Ontology: the list below the main ontology display will be updated to display the individuals of the Material class.
I.5 Rule Knowledge Acquisition
Now that mapping has been performed, the propose-and-revise PS ontology contains some knowledge about the concepts in the domain (i.e. the different types of shelf materials). You will now define the rules that will enable it to perform some reasoning with those concepts. As described in How to Adapt the Generic Propose-and-Revise Problem Solver for a Domain, there are seven different types of rules that can be defined. This tutorial will involve creating one Initial SystemComponent Selection rule, one Constraint rule, one Fix rule, two Initial SystemVariable Value rules, one Output SystemComponents rule and two Output SystemVariables rules. The resulting system will be a KBS which selects the appropriate material for a 2 m² shelf which can support a maximum load of 7 kg.
I.5.1 Select the KA Tab
To start the KA, click on the KA tab within MAKTab (it is to the right of the Mapping tab, below the standard Protégé tabs).
I.5.2 Select KA Ontology
It is necessary to tell MAKTab which ontology should be used for KA. In this tutorial, this is the target ontology from the mapping stage:
1. Click on the The mapping tab's target ontology option.
2. Click on the Start KA button.
The KA interface will now be displayed.
I.5.3 Create a new SystemVariable for the Shelf Supported Load
The first task in building this KBS is to define a new system variable which will be used to calculate the maximum weight the shelf is able to support. This is calculated by multiplying the size of the shelf (in m²) by the maximum load the material of the shelf can support (in kg/m²). This is achieved by creating a new individual of type SystemVariable, and defining three relevant rules: one for setting the initial value, one defining that it should be displayed by the PS, and one defining how to calculate the value.
1. Using the drop down box labelled Select a concept in the Problem Solver Concepts area of the KA tab (located at the top left), select the SystemVariable class.
2. Click on the Create New Instance of SystemVariable button.
3. In the dialog that is displayed, enter "Shelf Supported Load" in the name field.
4. Click on the OK button.
5. The list of instances will be updated, showing the new Shelf Supported Load SystemVariable.
I.5.4 Start KA for Shelf Supported Load
There are three rules that should be defined for the Shelf Supported Load variable: an Initial SystemVariables Value rule, which defines the initial value of the variable; an Output SystemVariables rule, which tells the PS to display the value of the variable after successful execution; and a SystemVariable Value Assignment rule, which defines how to calculate the Shelf Supported Load. To start the KA process:
1. Select Shelf Supported Load in the list of PS concepts.
2. Click on the Start KA for Shelf Supported Load button.
3. MAKTab now starts the KA process.
Define an Initial SystemVariable Value Rule for Shelf Supported Load
MAKTab now displays a new Initial SystemVariables Value rule for the Shelf Supported Load SystemVariable, and attempts to fill in some of the information. Use this rule to set the initial value of Shelf Supported Load to 0:
• The antecedents list (entitled Please provide a list of initial values for any system variable that will be used in the design process) should already contain an antecedent. However, this antecedent is of the wrong type: MAKTab tries to assign an individual as the value of Shelf Supported Load, but it is necessary to assign a literal value ("0"). Change the type of the antecedent by clicking on the Change Type button next to the antecedent.
• You will then be prompted to select the new type of the antecedent. There should be two options, listed under Suitable Types. Select (by clicking on) Assign a SystemVariable or a SystemComponent's property a datatype value, and click on the OK button. The interface is then updated to display the new antecedent.
• It may be necessary to specify Shelf Supported Load as the value of the concept property of the antecedent. To do this:
1. Right click on the text concept, and click on the Select an Alternative Instance option from the menu that is displayed.
2. Use the resulting dialog to select the SystemVariable class from the list on the left hand side, and then Shelf Supported Load from the list of individuals on the right. Then click on the OK button.
3. The interface is then updated to display the Shelf Supported Load SystemVariable as the value of the concept property.
• Next it is necessary to specify the value of Shelf Supported Load by setting the value of the assign value property:
1. Right click on the text "Right click to assign a value for assign value".
2. Select (by clicking on) the Edit Value option from the menu that is displayed.
3. Enter "0" into the displayed text field.
4. Close the dialog by clicking on the cross at the top right of it.
• Next it is necessary to specify whether the value is numeric by setting the value of the is numeric? property:
1. Right click on the text "Right click to assign a value for is numeric?".
2. Select (by clicking on) the Edit Value option from the menu that is displayed.
3. Click on the box next to the text "IsANumericValue"; a tick should appear in the box to indicate that the value is true.
4. Close the dialog by clicking on the cross at the top right of it.
• As consequents are not required by this type of rule, if there are any consequents displayed, remove them by clicking on the Remove button to the right of the consequent's display.
Define an Output SystemVariables Rule for Shelf Supported Load
Now that the Shelf Supported Load SystemVariable has an initial value assignment, click on the Acquire Next Rule button to define an Output SystemVariables rule for Shelf Supported Load.
1. This type of rule specifies which SystemVariables the PS should display when its execution completes successfully. MAKTab should have created a new rule, and added one antecedent and one consequent, each specifying Shelf Supported Load.
2. As there is no need for consequents in this type of rule, click on the Remove button to the right of the consequent (the consequents are displayed on the bottom, below the text "Consequents are not necessary in this type of rule").
Once you have defined this rule, click on the Acquire Next Rule button. The next rule type is a SystemVariable Value Assignment rule.
Define a SystemVariable Value Assignment Rule for Shelf Supported Load
This rule will instruct the PS that the value of Shelf Supported Load should be calculated from the values of other features of the shelf being designed: in this case, the size of the shelf and the maximum supported load of the material. This value will be recalculated every time the PS proposes using a different material (as will be the case in this tutorial). To define a rule which tells the PS how to calculate the value of Shelf Supported Load:
1. MAKTab may have added an antecedent and a consequent to the new SystemVariable Value Assignment Rule. The purpose of this rule is to tell the PS how to calculate the value of the Shelf Supported Load. This will be done by altering the consequent, and as such there is no need for the antecedent, so click on the Remove button next to the antecedent (if one was added automatically).
2. MAKTab will have added a consequent to the rule. However, this consequent is not of the type required here, as it is necessary to assign a numeric value which is calculated by evaluating an expression; change the type of the consequent by clicking on the Change Type button next to the consequent.
3. The consequent type selection dialog is then displayed, as there are a variety of valid consequent types for this rule. Select the type for assigning a calculated datatype value from the list of Suitable Types, and click on the OK button.
4. The interface will then be updated to display the new consequent. The concept property should have the Shelf Supported Load SystemVariable set as its value. The next task is to define the expression that will be used to calculate the value.
5. The assign result of calculating property will have an expression as its value; as such there are two properties displayed: function and arguments. You will use this property to define the expression 2 * the value of the Material’s maxSupportedWeight property:
(a) Right click on the text function, and select Select an Alternative Instance from the menu that is displayed. A dialog will then be displayed showing all the functions
in the list on the right hand side; select * from the list and click on the OK button. The interface will be updated to show this as the value of the function property.
(b) The arguments property is a list of the arguments that will be used by the function (in this case, the * function). Here the arguments will be “2” and a SystemComponent Property, which is used to specify that the maxSupportedWeight property of the Material should be used. The display will initially have one argument in this list, which will likely be the Shelf Supported Load SystemVariable. Each argument is displayed as the property argument. Right click on argument, and select Make a New Individual from the menu that is displayed.
(c) An argument type selection dialog is then displayed. Select SystemComponent Property from the list of Suitable Types displayed on the left hand side of the dialog; then click on the OK button.
(d) The value of the argument property is now another instance, and as such two further properties are displayed: the class and property properties.
(e) Set the value of the class property to the Material class by right clicking on the class text and selecting the Select an Alternative Individual option from the menu; a dialog showing all the classes of the ontology is displayed; select Material from the list, and click on the OK button.
(f) Set the value of the property property to the maxSupportedWeight property by right clicking on the property text and selecting the Select an Alternative Individual option from the menu; a dialog showing all the properties used in the ontology is displayed; select maxSupportedWeight from the list, and click on the OK button.
(g) Add another argument to the list of arguments by clicking on the +a button under the first argument.
(h) The argument type selection dialog is displayed again. This time select the Literal type from the list, and click on the OK button.
(i) The interface is updated to display another argument to the right of the original; this is also titled argument, but has a value property inside its display area. Right click on the value text, and select Make a New Instance. The display will be updated to show the value property; right click on the text “Right click to assign a value for has value” and select Edit Value. Enter “2” into the text field of the dialog that is then displayed, and close the dialog by clicking on the cross at its top right. This completes the expression.
6. Finish the consequent by setting the value of the is numeric? property: right click on the text “Right click to assign a value for is numeric?” under is numeric, select Edit Value, click on the box next to IsANumericValue in the dialog that is displayed, and close the dialog.
The PS now knows the initial value of Shelf Supported Load, how to calculate it, and that it must display it after execution; the definition of rules for Shelf Supported Load is complete.
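The expression just assembled — the * function applied to the Material's maxSupportedWeight property and the literal 2 — is a small expression tree. The sketch below shows how such a tree might be evaluated against the currently selected material; it is an illustrative assumption about the representation, not MAKTab's actual one, and the sample maxSupportedWeight value is made up.

```python
# Illustrative sketch: a consequent's expression as a function name plus an
# argument list, evaluated against the currently selected components.
FUNCTIONS = {"*": lambda a, b: a * b, "<": lambda a, b: a < b}

def evaluate(expr, selected):
    """Resolve each argument, then apply the named function."""
    args = []
    for arg in expr["arguments"]:
        if arg["type"] == "Literal":
            args.append(float(arg["value"]))
        elif arg["type"] == "SystemComponentProperty":
            # Look up the property on the currently selected component.
            args.append(selected[arg["class"]][arg["property"]])
    return FUNCTIONS[expr["function"]](*args)

# Shelf Supported Load = 2 * maxSupportedWeight of the current Material.
expr = {"function": "*",
        "arguments": [{"type": "SystemComponentProperty",
                       "class": "Material", "property": "maxSupportedWeight"},
                      {"type": "Literal", "value": "2"}]}
current = {"Material": {"maxSupportedWeight": 15.0}}  # made-up sample value
shelf_supported_load = evaluate(expr, current)
```

Because the expression refers to the Material by class rather than by instance, re-evaluating it after the PS proposes a different material automatically recalculates Shelf Supported Load.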
I.5.5 Create a new SystemVariable for the Required Load
The next task in building this KBS is to define a new system variable which will be used to keep track of the maximum load the shelf must be able to support. This is achieved by creating another SystemVariable individual.
1. Using the drop down box labelled Select a concept in the Problem Solver Concepts area of the KA tab (located at the top left), select the SystemVariable class.
2. Click on the Create New Instance of SystemVariable button.
3. In the dialog that is displayed, enter “Maximum Load” in the name field.
4. Click on the OK button.
5. The list of individuals will be updated, showing the new SystemVariable.
I.5.6 Start KA for Maximum Load
There are two rules that should be defined for the Maximum Load SystemVariable: an Initial SystemVariables Value rule and an Output SystemVariables rule.
1. Select Maximum Load from the list of SystemVariable individuals.
2. Click on the Start KA for Maximum Load button.
3. MAKTab now starts the KA process.
Define an Initial SystemVariables Value rule for Maximum Load
MAKTab now displays a new Initial SystemVariables Value rule for the Maximum Load SystemVariable, and attempts to fill in some of the information. Use this rule to set the initial value of Maximum Load to 7:
• The antecedents list (entitled Please provide a list of initial values for any system variable that will be used in the design process) should already contain one antecedent. However, this antecedent is of the wrong type: MAKTab tries to assign the value of Maximum Load to be an individual; it is necessary, however, to assign a literal value (“7”) as the value. To change the type of this antecedent, click on the Change Type button to the right of the antecedent’s display.
• You will then be prompted to select the new type of the antecedent. There should be two options, listed under Suitable Types. Select (by clicking on) the type for setting a datatype value, and click on the OK button. The interface is then updated to display the new antecedent in the list of antecedents.
• It may be necessary to specify Maximum Load as the value of the concept property of the antecedent. To do this:
1. Right click on the text concept, and click on the Select an Alternative Instance option from the menu that is displayed.
2. Use the resulting dialog to select the SystemVariable class from the list on the left hand side, and then Maximum Load from the list of individuals on the right. Then click on the OK button.
3. The interface is then updated to display the Maximum Load SystemVariable as the value of the concept property.
• Next it is necessary to specify the value of Maximum Load by setting the value of the assign value property:
1. Right click on the text “Right click to assign a value for assign value”.
2. Select (by clicking on) the Edit Value option from the menu that is displayed.
3. Enter “7” into the displayed text field.
4. Close the dialog by clicking on the cross at the top right of it.
• Next it is necessary to specify whether the value is numeric by setting the value of the is numeric? property:
1. Right click on the text “Right click to assign a value for is numeric?”.
2. Select (by clicking on) the Edit Value option from the menu that is displayed.
3. Click on the box next to the text “IsANumericValue”; a tick should appear in the box to indicate that the value is true.
4. Close the dialog by clicking on the cross at the top right of it.
• As consequents are not required by this type of rule, if there are any consequents displayed, remove them by clicking on the Remove button to the right of the consequent’s display.
Define an Output SystemVariables Rule
Now that the Maximum Load SystemVariable has an initial value assignment, click on the Acquire Next Rule button to define an Output SystemVariables rule for this SystemVariable.
1. This type of rule specifies which system variables the PS should display the value of when it finishes its execution. MAKTab should have created a new rule, and added one antecedent and one consequent, each specifying Maximum Load.
2. As there is no need for consequents in this type of rule, click on the Remove button to the right of the consequent (the consequents are displayed at the bottom, below the text “Consequents are not necessary in this type of rule”).
Once you have defined this rule, click on the Acquire Next Rule button. The next rule type is a SystemVariable Value Assignment Rule, which is used to specify that the value of a SystemVariable should be calculated based on the values of other features of the artefact under design. This is not the case for the Maximum Load variable that we are currently defining rules for, so if MAKTab has suggested any antecedents or consequents automatically, remove them and then click on the Acquire Next Rule button again to acquire a Constraint rule.
I.5.7 Define a Constraint Rule for Maximum Load
The Maximum Load variable has one constraint associated with it: namely, that the Maximum Load must be less than or equal to the Shelf Supported Load of the currently selected material for the shelf. (The propose-and-revise algorithm starts by selecting the specified material as the current material, and then checks the constraints for any violations; if there is a violation then another type of material will become the currently selected material.) To define this constraint:
1. Click on the Add Antecedent button, located near the bottom of the window.
2. The antecedent display area will be updated to show a constraint antecedent. A constraint rule essentially consists of one or more constraint antecedents, which combine to define a constraint: if they are all satisfied then the constraint is violated. This constraint states that the Maximum Load must be less than or equal to the maximum supported weight of the shelf material (stored in the Shelf Supported Load SystemVariable). This is expressed in a constraint rule as: if Shelf Supported Load < Maximum Load then the constraint is violated. At first there appears to be an inconsistency between how the constraint is described and how it is expressed in the rule. This is because of the way the PS will work: the violation will only be noted if the rule is satisfied, and in this case the rule should only be satisfied if Shelf Supported Load is less than the Maximum Load it must support. Expressing the constraint in the rule as it is described in natural language (if Maximum Load ≤ Shelf Supported Load) would be incorrect: the rule would be activated when the Maximum Load can actually be supported by the shelf. For this reason the constraint is expressed slightly differently, so that the rule is only activated when Maximum Load > Shelf Supported Load: this is the desired effect. If in doubt, expressing the constraint in the natural language form of “If x Then y” can make the transformation more obvious.
3. The Constraint antecedent has one property called constraint. A constraint is essentially an expression. Expressions have two properties: a function (specified in the function property) and a list of arguments (specified in the arguments property). Right click on the text function, which will bring up the menu for defining the function for this expression; select the Select an Alternative Instance option.
4. This brings up a dialog showing all the individuals of the Function class from the PS ontology. Select the < option from the list on the right, and click on the OK button. The interface will be updated to show the < function as the value of the function property.
5. It is now necessary to define the arguments of the expression. These are the Maximum Load SystemVariable and the maxSupportedWeight property of the Material class. The arguments property will currently have one argument, in the box titled argument. Right click on the title (argument) and select Make a New Instance. The first argument is going to be the Material class’s maxSupportedWeight property; to define this as an argument, using the argument type selection dialog that is displayed, select the SystemComponent Property type from the list on the left of the dialog, and then click on the OK button.
6. The interface will be updated to display the properties for selecting a SystemComponent Property. These are class, which will be used to specify the Material class, and property,
which will be used to specify the maxSupportedWeight property. Right click on the text class and select Select an Alternative Instance from the menu that pops up. Use the resulting dialog to select the Material class from the list on the right of the dialog, and click on the OK button.
7. Now right click on the text property and select Select an Alternative Instance from the menu that pops up. Use the resulting dialog to select the maxSupportedWeight property, and click on the OK button. The interface is updated to show that the maxSupportedWeight property of the currently selected Material will be used as the first argument of the constraint’s expression.
8. Add another argument by clicking on the +a button below the existing argument display.
9. As the next argument will be the SystemVariable, select the SystemVariable type using the dialog that is displayed, and click on the OK button.
10. The interface is updated to display the new argument for the SystemVariable. Right click on the text SystemVariable and select Select an Alternative Instance.
11. A dialog is then displayed showing all the instances of the SystemVariable class. Select the Maximum Load instance, and click on the OK button.
12. The GUI is then updated to display the final expression, which, in natural language, states that the value of the currently selected Material’s maxSupportedWeight property is less than the value of the Maximum Load.
13. Add a consequent to this rule by clicking on the Add Consequent button.
14. Consequents of this type of rule are simply the name of the constraint that has been violated. Right click on the text Right click to assign a value for violated constraint name on the consequent display, and select the Edit Value option. A dialog is then displayed which is used to specify the name of this constraint; enter “Shelf material too weak” in the text field and then close the dialog by clicking on the cross at the top right of the dialog.
15. The constraint rule has now been successfully defined. The natural language version of the rule is: if the value of the currently selected Material’s maxSupportedWeight property is less than the value of the Maximum Load, then the constraint “Shelf material too weak” has been violated.
16. Click on the Acquire Next Rule button to define a fix rule for when this constraint is violated.
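The inverted logic of the constraint — the rule fires, signalling a violation, only when maxSupportedWeight < Maximum Load — can be sketched in a few lines. This is an illustrative Python sketch with made-up material data, not the generated rule code.

```python
# Illustrative sketch of the constraint just defined: the rule fires (i.e.
# the constraint is VIOLATED) only when the selected material is too weak.
def check_constraints(selected_material, maximum_load):
    violations = []
    # Constraint rule: if maxSupportedWeight < Maximum Load -> violated.
    if selected_material["maxSupportedWeight"] < maximum_load:
        violations.append("Shelf material too weak")
    return violations

# Made-up example materials to exercise the check.
chipboard = {"name": "Chipboard", "maxSupportedWeight": 5}
steel = {"name": "Steel", "maxSupportedWeight": 50}

violations_chipboard = check_constraints(chipboard, 7)  # too weak
violations_steel = check_constraints(steel, 7)          # no violation
```

Writing the check the "natural language" way round (maximum_load <= maxSupportedWeight) would report a violation exactly when the shelf is strong enough, which is why the rule uses the inverted form.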
Define a Fix Rule
Having defined a constraint, it is now necessary to define what to do if that constraint is violated. For this constraint, the obvious fix is to pick a material that is able to support the Maximum Load. To define such a fix:
I.5. Rule Knowledge Acquisition
323
1. The interface should already be displaying the interface for defining a fix rule. As part of this it should have the “Shelf material too weak” violated constraint in the antecedent display, leaving you to define the actions that are necessary to remedy this violation (i.e. the fix) as the consequents.
2. Click on the Add Consequent button, located near the bottom of the window.
3. The interface is updated to display a new Fix consequent. Right click on the fix consequent display and select Make a New Instance from the menu that pops up.
4. There are a variety of options for which type of fix to create, each of which is described in How to Adapt the Generic Propose and Revise Problem Solver for a Domain. You are going to define a PropertyValueBasedFix, so select that option from the list and click on the OK button.
5. The interface now updates to display the properties associated with this type of fix. Starting on the left and moving right, there are properties for: the name of the fix; the desirability of the fix; specifying which (currently selected) component will be replaced; and defining an expression which will be used when applying the fix to select suitable new components.
6. Change the value of the name property (right click on the property, select Edit Value) to “Select strong enough material”.
7. Change the value of the desirability property (right click on the property, select Edit Value) to “1”.
8. Change the value of the replace current component property (right click on the property, select Select an Alternative Instance) to be the Material class (select Material from the list on the right of the dialog, and click on the OK button).
9. Change the value of the select using equation property to be the following equation: ≤ Maximum Load, Material maxSupportedWeight. To do this:
(a) Right click on the function text, select Select an Alternative Instance, and select the ≤ function (then click on the OK button).
(b) Right click on the argument text for the argument that is displayed, select Select an Alternative Instance, and using the dialog that is displayed select SystemVariable from the list on the left and Maximum Load from the list on the right; then click on the OK button.
(c) Add another argument to the list of arguments by clicking on the +a button below the argument display.
(d) Click on the OK button on the dialog that is then displayed.
(e) Right click on the argument text above the display of the new argument that has just been added (not the one you previously changed), and select Select an Alternative Instance from the menu that pops up.
I.5. Rule Knowledge Acquisition
324
(f) Using the dialog that is displayed, select SystemComponent Property from the list on the left if it is not already selected, and select Material maxSupportedWeight from the list on the right; then click on the OK button.
10. This completes the definition of the new fix rule.
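The PropertyValueBasedFix above can be pictured as: when the constraint is violated, replace the current Material with one whose maxSupportedWeight satisfies the selection equation Maximum Load ≤ maxSupportedWeight. The sketch below is illustrative only — the candidate materials and their weights are invented for the example.

```python
# Illustrative sketch of applying a PropertyValueBasedFix: select a
# replacement Material satisfying the fix's selection equation.
MATERIALS = [
    {"name": "Chipboard", "maxSupportedWeight": 5},
    {"name": "Oak", "maxSupportedWeight": 20},
    {"name": "Steel", "maxSupportedWeight": 50},
]

def apply_fix(maximum_load, candidates):
    """Return the first candidate satisfying the selection equation:
    Maximum Load <= maxSupportedWeight."""
    for material in candidates:
        if maximum_load <= material["maxSupportedWeight"]:
            return material
    return None  # no material can support the load: the fix fails

replacement = apply_fix(7, MATERIALS)
```

With a Maximum Load of 7, Chipboard (5) is rejected and the first sufficiently strong material is chosen instead.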
I.5.8 Select the Initial Material to Use
There are now just two more rules to define: one which selects Chipboard as the initial material to be used, and one which specifies that the PS should display the valid type of material after it has solved the problem.
1. Using the Problem Solver Concepts area at the top left of the KA interface, select the Material class from the drop down list.
2. Select Chipboard from the list of Material instances displayed below the drop down box.
3. Using the Defined Rules area at the top right of the KA interface, select Initial SystemComponent Selection from the Available rule types drop down box.
4. Click on the Create New Rule button located at the bottom of the Defined Rules area.
5. MAKTab will create a new Initial Component Selection Rule with antecedents and consequents for initially selecting Chipboard. Remove the consequent.
This completes the KA that is required for the chipboard material.
I.5.9 Define an Output SystemComponent Rule for Material
It is now necessary to define a new output calculated component rule, specifying that the selected material type should be displayed. To do this:
1. Select the Material class in the drop down box in the Problem Solver Concepts area (if it is not already selected), and click on the Acquire for Class Material button located below the list of Material instances.
2. Select Output SystemComponents from the Available rule types drop down list in the Browse Defined Rules area.
3. Click on the Create New Rule button.
4. MAKTab will create a new output calculated component rule, with an antecedent and a consequent for the Material class. Remove the consequent.
This completes the definition of rules related to the Material class.
I.5.10 Generate the Problem Solver
This completes the definition of the rules required for the generic propose-and-revise PS to work in the domain of shelf design. All that is required now is to generate the executable PS by clicking on the Generate Rules button located near the bottom of the window. Follow any prompts that are displayed, then copy the rules from the rules dialog, switch to JessTab, and paste the rules into the text entry box near the bottom of the window. If the rules do not run automatically, enter “(reset)(run)” at the JessTab entry box, and press enter.
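The generated rules implement the classic propose-and-revise control loop: propose a design (starting from the initial material), check the constraints, and apply fixes until no constraint is violated. The sketch below is a simplified, illustrative Python rendering of that loop with invented data — the real PS state lives in Jess facts, not Python objects.

```python
# Illustrative sketch of the propose-and-revise control loop that the
# generated rules implement (simplified).
def propose_and_revise(initial_material, materials, maximum_load):
    current = initial_material
    tried = {current["name"]}
    # Constraint check: Shelf Supported Load must cover Maximum Load.
    while current["maxSupportedWeight"] < maximum_load:
        # Revise: apply the fix, selecting a strong-enough material.
        candidates = [m for m in materials
                      if m["name"] not in tried
                      and maximum_load <= m["maxSupportedWeight"]]
        if not candidates:
            return None  # no applicable fix: the design fails
        current = candidates[0]
        tried.add(current["name"])
    return current  # a design satisfying all constraints

materials = [{"name": "Chipboard", "maxSupportedWeight": 5},
             {"name": "Oak", "maxSupportedWeight": 20},
             {"name": "Steel", "maxSupportedWeight": 50}]
result = propose_and_revise(materials[0], materials, 7)
```

Starting from Chipboard with a Maximum Load of 7, the constraint is violated once and the fix revises the design to a stronger material.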
I.6 Extensions
This is a very simple KBS. It may be desirable to make it more complex, to provide a better, more appropriate KBS. One way in which this could be done is to alter the constraint rule defined in I.5.7, adding another antecedent which states that the name of the currently selected Material is equal to “Chipboard”. This way the constraint will only be violated if the currently selected material is chipboard and it is unable to hold the minimum required weight. There would be two fixes associated with this violation. The first would be to use the Steel Material for the shelf, by defining a fix rule similar to that defined in I.5.7, but which states in the satisfaction equation that the name of the Material must be equal to “Steel” (= Material name “Steel”); this fix would have desirability 1 (1 is the highest desirability, 10 the lowest). A second, similar fix rule would specify that the Oak material should be used; as oak is more expensive, this fix is less desirable, and should be assigned desirability 2.
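The two competing fixes described above differ only in their selection equation and desirability, and the PS tries fixes in desirability order. A small illustrative sketch (the fix names are invented):

```python
# Illustrative: two fixes for the "chipboard too weak" violation, tried
# in desirability order (1 = highest desirability, 10 = lowest).
fixes = [
    {"name": "Use Oak", "desirability": 2, "material": "Oak"},
    {"name": "Use Steel", "desirability": 1, "material": "Steel"},
]

order = [f["material"]
         for f in sorted(fixes, key=lambda f: f["desirability"])]
# Steel (desirability 1) is proposed first; Oak is the fallback.
```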
Appendix J
MAKTab Tutorial for Building a Diagnosis Based KBS
J.1 Introduction
This document provides a step by step guide to configuring the generic diagnosis problem solver (PS) provided with MAKTab to build a simple diagnosis KBS for a car engine. Before reading this document, you should be familiar with the MAKTab User Manual, as well as the How to adapt the Generic Diagnostic Problem Solver for a Domain document, which describe using MAKTab and the generic diagnostic PS respectively.
J.2 Introduction to the Domain: Car Diagnosis
Diagnosis of faults in the car domain is introduced in How to adapt the Generic Diagnostic Problem Solver for a Domain, in which a very basic outline of some engine related faults, their causes, and repairs is provided. In this tutorial, you are going to use MAKTab along with the generic diagnostic PS to build that KBS. The relevant engine parts have been modelled in the ontology stored in the file EngineParts.pprj.
J.3 Building the Car Diagnosis KBS
This KBS will be developed by performing the following steps:
1. The engine parts will be mapped from the EngineParts domain ontology to the generic diagnostic PS.
2. MAKTab will then be used to acquire the fault, cause, and repair rules required by the PS for the various engine parts.
J.3.1 Loading the Generic Diagnosis Problem Solver
The first task is to load the diagnosis ontology into Protégé. To do this from within Protégé:
1. Select File from the menu.
2. Select Open.
3. Navigate to where you saved the files, and select the Generic-Diagnosis.pprj file.
4. Click on the OK button.
J.3.2 Select MAKTab
You will use MAKTab to build the KBS. Select it by clicking on the MAKTab tab.
J.4 Mapping
You will now be presented with the mapping interface. This interface will be used to define mappings from the EngineParts domain ontology to the diagnosis ontology, which will allow MAKTab to “map” the knowledge from the domain ontology to the PS ontology, so that it can be used when defining the diagnostic rules relevant to the car domain.
J.4.1 Load the Engine Parts Ontology as the Source Ontology
The engine parts ontology will be used as the source ontology for the mappings. To do this:
1. On the Source Ontology, click on the Import External button.
2. This will display a file selection dialog. Use this to select the EngineParts.pprj file, and click on the OK button.
The EngineParts ontology will now be displayed as the source ontology.
J.4.2 Import the Diagnosis Ontology as the Target Ontology
The diagnosis ontology will be used as the target ontology. To do this: 1. On the Target Ontology, click on the Import Current button. This will import the current ontology (the diagnosis ontology) as the target ontology.
J.4.3 Define Mappings
MAKTab provides a range of different types of mappings to copy the domain knowledge from the domain ontology to the PS ontology. For this tutorial, we are going to define a series of ClassToIndividual mappings, in which instances of the SystemComponent class in the PS (diagnosis) ontology will be created to represent the various classes in the EngineParts ontology. This is because in the KBS we will only be referring to the engine parts at a high level, and do not need to know any information about the properties of the engine parts, or the actual different instances of the components that are available. A class to instance mapping will be defined for the Engine, StarterMotor, Battery, Alternator, and Lights classes. The following instructions walk through creating a mapping for the Engine class, and should be repeated for the other classes, so that after executing the mappings the SystemComponent class in the PS ontology will have five individuals associated with it.
1. Select the Engine class in the Source Ontology Display.
2. Use the drop down box in the mapping definition area (in the centre of the mapping interface) to select the Class to Instance Mapping (it may be called Class to Individual Mapping).
3. Click on the Set type button.
4. Click on the SystemComponent class in the Target Ontology Display (it may be necessary to first double click on the PSConcept class if the SystemComponent class is not visible).
5. The mapping display will update to show the properties of the SystemComponent class: these are name and description. In the text field next to name enter “Engine” and, if desired, enter a short descriptive text in the text field beside description.
6. Click on the Store Mapping button located near the bottom of the window.
Repeat these steps for the StarterMotor, Battery, Alternator, and Lights classes, changing the value of the name property as appropriate. Once all the mappings have been defined, click on the Execute Mappings button, located at the bottom of the window. This should execute all the mappings, and five individuals will be created of type SystemComponent (the Target Ontology Display should show this by having a 5 in brackets after the SystemComponent class name).
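The effect of executing the five ClassToIndividual mappings can be sketched as: each selected domain class yields one SystemComponent individual in the PS ontology. The Python below is an illustrative sketch, not MAKTab's internal mapping machinery; the description text is invented.

```python
# Illustrative sketch of executing ClassToIndividual mappings: each mapped
# domain class becomes one individual of the PS ontology's SystemComponent.
domain_classes = ["Engine", "StarterMotor", "Battery", "Alternator", "Lights"]

def execute_mappings(classes):
    system_components = []
    for cls in classes:
        # name and description are the two properties set in the mapping.
        system_components.append({"name": cls,
                                  "description": f"The {cls} component"})
    return system_components

individuals = execute_mappings(domain_classes)
count = len(individuals)  # the Target Ontology Display would show (5)
```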
J.5 Defining Engine Part Diagnostic Rules
Following the specification of the domain in How to Adapt the Generic Diagnostic Problem Solver for a Domain, it is relatively straightforward to see the links between the rules. This section walks through creating some of the rules (a cause rule and a repair rule); it is left to you to do the rest.
J.5.1 Select the KA Tab
To start the KA, click on the KA tab within MAKTab (it is to the right of the Mapping tab, below the standard Protégé tabs).
J.5.2 Select KA Ontology
It is necessary to tell MAKTab which ontology should be used for KA. In this tutorial, this is the target ontology from the mapping stage:
1. Click on the The target ontology from the mapping stage option.
2. Click on the Start KA button.
The KA interface will now be displayed.
J.5.3 Start KA for Engine
The first step is to define a rule which states that the engine will not start. There are two causes for this, and so it will demonstrate nicely how to define diagnosis rules.
1. Using the concept selection drop down box in the Problem Solver Concepts area at the top left of the KA interface, select the SystemComponent class.
2. The individuals associated with the SystemComponent class are displayed in a list below the concept selection drop down box. Select Engine by clicking on it, and then click on the Start KA for Engine button.
J.5.4 Define Engine Will Not Start Problem
The rule definition area will update to display a cause rule, which can be used to specify that a malfunction can be caused by some other system state. Here this will be used to define a rule that states the engine failing to start can be caused by a faulty starter motor.
1. The list of antecedents displayed below The following component states currently contains one antecedent. This antecedent has two properties: one for selecting a component, and another for describing that component’s state. The first property, described by the text Concept, should already be set to Engine.
2. The second property, described as with state, currently does not have a value, and so double click to assign value for with state is displayed (in red). Either double click on the (red) text, or click on the arrow to the right of with state and select Edit Value.
3. Enter the text “will not start” into the text field in the dialog box that pops up and press enter. The pop up box will disappear, and the value for with state changes to will not start.
4. Now it is necessary to define the cause of the engine not starting. This is done by adding consequents to the rule; the consequents are displayed below can be caused by (just below the antecedents). There should be one consequent, which has the engine concept selected. It is necessary to change this to the starter motor. To do this, click on the arrow to the right of concept and select Select an alternative value.
5. Select the SystemComponent class from the list on the left hand side of the dialog that is displayed. This will display all the SystemComponent individuals in the list on the right; select the StarterMotor individual from this list, and click on the OK button.
6. The consequent will now display the StarterMotor as the concept, with the with state value reading double click to assign value (in red). Either double click on the (red) text, or click on the arrow to the right of with state and select Edit Value.
7. Enter the text “faulty” into the text field in the dialog box that pops up and press enter. The pop up box will disappear, and the value for with state changes to faulty.
That is all that is necessary for this rule.
Other rules may require multiple antecedents or consequents, to describe a series of malfunctions or that a series of states are the cause. This can be done by adding antecedents or consequents using the appropriate button on the rule definition display.
J.5.5 Define a Fix Rule
It is now necessary to define a fix rule for the faulty starter motor. To do this, click on the Acquire Next Rule button.
1. A dialog box will appear asking which type of rule should be acquired next. As a cause rule can be followed either by another cause rule (for example, to specify that the faulty starter motor is caused by another faulty component) or by a repair rule (which specifies, for example, how to repair the faulty starter motor), it is necessary to tell MAKTab which type of rule should be defined. For this example, select the Repair rule from the list of rules, and click on the OK button.
2. The rule definition area will update to display a repair rule. Repair rules state that some component state(s), in this case the faulty starter motor, can be repaired by performing some action. MAKTab will have added the faulty starter motor as the first antecedent of this rule. As this is all that is necessary for this rule, there is no need to alter the antecedents.
3. A consequent should already have been added to this rule. If not, click on the Add Consequent button. Consequents in repair rules are described with "repair by" and "performing action"; the value of "performing action" has yet to be set, and so "double click to assign value" is displayed. Either double click on this text, or click on the arrow to the right of "performing action" and select "Edit Value".
4. For this example, use the simple repair advice of replacing the StarterMotor: enter the text "replace starter motor" into the text field of the dialog box that pops up and press enter. The pop-up box will disappear, and the value for "performing action" will change to "replace starter motor".
This completes this rule and this example. Feel free to add some more rules for the starter motor, either by clicking on the Acquire Next Rule button and selecting an appropriate option from the list that is displayed, or by clicking on the Start KA for StarterMotor button again; also feel free to select alternative components and define rules for them.
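The cause and repair rules built in the steps above can be sketched as simple data structures. The Python fragment below is purely illustrative (MAKTab stores rules in its PS ontology, not in this format, and the dictionary keys are assumed names):

```python
# Illustrative sketch of the two rules acquired above; NOT MAKTab's
# internal representation, just a plain-Python rendering of the idea.

cause_rule = {
    # "Engine will not start" can be caused by "StarterMotor faulty"
    "antecedents": [{"concept": "Engine", "with_state": "will not start"}],
    "consequents": [{"concept": "StarterMotor", "with_state": "faulty"}],
}

repair_rule = {
    # "StarterMotor faulty" can be repaired by replacing the starter motor
    "antecedents": [{"concept": "StarterMotor", "with_state": "faulty"}],
    "repair_by": [{"performing_action": "replace starter motor"}],
}

def diagnose(observation, rules):
    """Return the possible causes of an observed component state."""
    return [cause
            for rule in rules
            for cause in rule["consequents"]
            if observation in rule["antecedents"]]

print(diagnose({"concept": "Engine", "with_state": "will not start"},
               [cause_rule]))
```

Chaining works the same way: the consequent of a cause rule (the faulty starter motor) becomes the antecedent of the repair rule, which is exactly how the Acquire Next Rule button links them.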
J.5.6
Generate the Problem Solver
Once all the relevant rules have been defined, the generic diagnosis PS can work in the domain of car motors. All that is required now is to generate the executable PS by clicking on the Generate Rules button located near the bottom of the window. Follow any prompts that are displayed, then copy the rules from the rules dialog, switch to JessTab, and paste the rules into the text entry box near the bottom of the window. If the rules do not run automatically, enter "(reset)(run)" at the JessTab entry box and press enter; then follow the instructions that are displayed to use the program.
Appendix K
An Introduction to Building a Desktop Computer
K.1
Introduction
Building a desktop computer can be a daunting task: there are various components that can be used, some of which are required and some of which are optional. Having selected which types of components one would like in a computer, the tricky task is then to select components that are compatible with each other. This document describes how to do just that: it provides a general introduction to each of the required components, and then describes how to go about selecting ones that are compatible, along with the various other constraints which must be considered during the design process, and what to do if any of these constraints are violated. Using this document as a guide, not only should you be able to design a new basic desktop computer, but you should also be able to build a Knowledge-Based System to do the job for you.
K.2
The Basics
A basic desktop computer, of the kind you will be asked to build a system to design, is composed of between seven and twelve components. Any desktop must have a case, a power supply, a motherboard, a processor (also referred to as a CPU), some memory (also referred to as RAM), a hard drive, and an optical drive (e.g. a DVD reader or writer, a CD-ROM drive, etc.).
K.3
The Task
When building a new desktop computer, the main task is to select the correct combination of components which a) work correctly together, b) produce a system which meets the requirements for which it is being built, and usually c) do so at an acceptable price. Although this can seem a daunting task at first, once you are familiar with how it is achieved, which is described in the remainder of this document, it is relatively simple.
K.4
Selecting Components
One of the trickiest things about building a new computer is selecting parts such as the memory, CPU, hard drive, and motherboard that are all compatible with one another. Don't be overwhelmed by the part names, acronyms, model numbers, and so on. The key is to make sure that everything fits with the motherboard. If you follow the specifications of the motherboard, you can't go wrong. This section briefly describes the seven required components of a computer.
K.4.1
The Motherboard
The motherboard is the main board around which a PC is built. It is the centre of the PC in the sense that every system component connects to the motherboard, directly or indirectly. The motherboard you choose determines which processors are supported, how much and what type of memory the system can have, what type of video adapters can be installed, how many and what type of optical and hard drives can be added, how many other internal cards can be added, the speed of communication ports, and many other key system characteristics. One of the most important features of the motherboard is its size. The size of the motherboard, referred to as its form factor, determines which cases can be used. Generally motherboards come in one of two form factors (sizes): ATX, which is the standard size, and microATX, which is a smaller version. This means that a case that only supports a microATX sized motherboard will be unable to support an ATX one; the reverse, however, is only sometimes true (some ATX supporting cases can also support microATX motherboards). The other key feature is the chipset that it uses. The chipset, often referred to as the socket type, determines the type of processor that can be used with the motherboard: the motherboard processor socket and the processor socket type must be the same. Example socket types include Socket 478 and Socket 775, used by Intel Pentium 4 and Celeron processors; socket LGA775, which is used by the newer Intel Core 2 Duo processors; Socket A, used by AMD Athlon XP processors; and Socket 939, used by AMD 64 processors. With regards to memory, a motherboard will only support certain types.
There are several characteristics which determine memory's suitability: the speed of the memory (a motherboard typically only supports certain speeds of memory, and the speed is related to the type of the memory); the type of integrity checking supported (there are two options here, ECC and non-ECC, both of which are normally supported by the motherboard); the maximum amount of memory the motherboard supports; and the number of memory modules that can be added, as a motherboard will only have a set number of memory slots (the physical places on the motherboard into which memory is added). The motherboard also has a variety of other slots used for connecting other devices such as hard disk drives, optical drives, and video cards; these include IDE, SATA, PCI-Express 16, PCI-Express 1, AGP, and PCI slots, all of which are used by the various other components. Obviously, the motherboard must have enough of these slots to allow all of the components to be added.
K.4.2
The Case
The case (or chassis) is the foundation of any system. Its obvious purpose is to support the power supply, motherboard, drives, and other components. Less obvious purposes include containing the radio-frequency interference produced by internal components, ensuring proper system cooling, and subduing the noise produced by the various components with moving parts. There are at least five types of cases: Server cases, which are large cases designed to hold lots of hard drives; Micro cases, which are very small cases often used for entertainment systems; Midi Tower cases, which are popular upright cases, just not as tall as Full Tower cases; Full Tower cases, which are upright cases, usually taller than the similar Midi Towers; and Desktop cases, which are the popular horizontal cases often placed under the monitor. The choice of case
is usually based on the requirements of the system, the surrounding environment where it will be used, and so on. The main influence the case has on the other components is that it must be big enough to accommodate the motherboard, which is where the motherboard form factor (size) is used; the case must also have enough drive bays to hold the (optical and hard) drives (each drive requires its own bay).
K.4.3
The Processor (a.k.a. CPU)
There are two brands of processor: Intel and AMD. Both offer extensive catalogues of processors. The choice of which to use is often determined by price: AMD processors are often cheaper than comparable Intel ones, although AMD motherboards are often more expensive, which can make the overall cost about equal; selection is therefore often left to the builder's preference. Original processors were based on 32-bit technology; newer processors such as the AMD 64 range use 64-bit technology. There are two factors which affect a processor's performance: the clock speed, quoted in MHz or GHz (1 GHz equals 1000 MHz), and the amount of cache. The processor cache is a small piece of very fast memory the processor uses to store data while performing its operations. Modern processors have two types of cache, level one and level two; generally, the more cache, the faster the processor works. The socket type of the processor determines which motherboards it can be used with (it must be the same as the chipset/supported processor socket of the motherboard).
K.4.4
Computer Memory (a.k.a. RAM)
Computer Memory comes in modules of varying density. Typical densities are 256MB, 512MB, 1GB and 2GB. Each module has a number of pins, which are used to transfer data between the memory module and other components. If using more than one module, it is good practice to use modules of the same density. Another feature of memory is its integrity checking capability, which can either be ECC (Error-Checking and Correcting) or non-ECC. The type of a module's integrity checking must be supported by the motherboard, and it must also be the same as any existing modules in the system. There are various types of memory, notably DDR and the newer DDR2 versions; the older SDRAM is still used in some systems. Essentially the type just refers to how the data is transferred from the memory to the other components; most modern systems use either DDR or DDR2, whichever is supported by the motherboard. Another influencing factor in memory choice is the memory speed: typical speeds are PC2700 and PC3200 for DDR memory; PC2-4200, PC2-5300 and PC2-6400 for DDR2 memory; and PC133 for SDRAM memory. Typically the memory type and speed are combined to make the name of a particular memory module (e.g. DDR PC2700 is DDR memory with speed PC2700).
K.4.5
Power Supply Unit (PSU)
The power supply unit (PSU) is responsible for providing the system with power. The important point to remember with PSUs is that they must be capable of providing at least the amount of power required by all of the components of the system. It is recommended that a power supply is capable of providing 30% more power than the maximum required by all of the
components. To calculate this, add up the required power of all the components, and then multiply by 1.3. It is often desirable to have a larger power supply if upgrading in the future is likely.
K.4.6
Hard Disk Drive
The hard disk drive is the computer’s permanent store for data. The main influencing factor in choosing a hard disk drive is its capacity (typically given in GB). Secondary factors such as the amount of cache it has (this is a different cache from that of the CPU) and the interface it uses both affect the speed of reading/writing data from/to the hard disk. There are two types of interfaces currently used by hard disk drives: the older, slower, IDE interface, and the newer SATA interface. Each hard drive in a system requires a free IDE or SATA slot on the motherboard, and a free 3.5 inch drive bay in the case.
K.4.7
Optical Drives
Optical drives, such as CD-ROM, CD-RW, DVD-ROM, and DVD-RW drives allow the computer to read and write to CDs and DVDs; the choice of which to use depends on what the system will be used for. All optical drives place the same requirements on the rest of the system: the interface which it uses (like hard disk drives this is either IDE or SATA) must be supported by the motherboard, there must be a free slot of the correct type on the motherboard so the drive can be connected, and each drive will require a 5.25 inch bay in the case. There are various read/write speeds associated with optical drives, depending on their type.
K.5
System Variables
There are various variables which must be taken into account when designing a new computer, some of which are listed below.
K.5.1
Desired minimum memory
Often when building a new system, the designer will require that it has a certain amount of memory, rated either in MB (typically 512 MB) or, more usually, in GB (typically 1 GB (1024 MB), 1.5 GB (1536 MB), or 2 GB (2048 MB)).
K.5.2
Number of memory modules
Often more than one memory module is used, either to provide a higher total memory than is possible using only one memory module, or to save money, as, for example, two 1 GB memory modules are often cheaper than one 2 GB module. At the start of the design process, this variable's value is 0, as no memory modules have been added to the system.
K.5.3
The total memory
This is the total amount of memory that the system has. It can be calculated by multiplying the density of the currently selected memory module by the number of memory modules present (assuming all modules have the same density).
K.5.4
Desired hard drive capacity
As with the total memory, it is often desirable that a computer have at least a set amount of storage capacity provided by the hard drive. The desired capacity is typically quoted in GB.
K.5.5
Total required power
This is the power required by all of the components (specifically the motherboard, hard drive, processor, memory, and optical drive) plus an extra 30%: total required power = (required power of the motherboard + required power of the hard drive + required power of the processor + required power of the optical drive + (required power of the memory * number of memory modules)) * 1.3.
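As a quick sketch, the formula above can be written directly in Python; the wattage values in the example call are illustrative, not taken from this document:

```python
def total_required_power(motherboard_w, hard_drive_w, cpu_w,
                         optical_drive_w, memory_module_w, num_modules):
    """Total required power, per the formula above (all values in watts)."""
    base = (motherboard_w + hard_drive_w + cpu_w + optical_drive_w
            + memory_module_w * num_modules)
    return base * 1.3  # add the 30% safety margin

# Illustrative wattages: 30 + 10 + 65 + 15 + (5 * 2) = 130 W before the margin.
print(total_required_power(30, 10, 65, 15, 5, 2))
```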
K.6
Constraints and Fixes
As described above, there are various constraints that must be satisfied when building a computer from scratch. Each component has its own set of requirements, which limits the range of other components it can be used with in a working system. Due to the wide range of available components, if the selection of one component causes a constraint to be violated elsewhere in the overall system, it is often possible to replace the "offending" component with another, more suitable version. Below is a list of the constraints which must be kept in mind when selecting the components to build a new basic computer; they are also illustrated in Figure K.1. Each component is discussed both in terms of the constraints on it and the constraints it places on other components.
K.6.1
Motherboard
The motherboard is the most important component in terms of constraints it places on the choice of other components. Every other component must be compatible with the motherboard. Generally, the compatibility of a component and motherboard is based on matching the available sockets (or slots) of the motherboard with the socket type required by the component.
Constraint MB-1 The primary constraint on the motherboard is that it supports the socket type of the processor. If it does not, then either another processor must be selected, which has the same socket type as the motherboard; or another motherboard must be selected, which supports the socket type of the processor. The former is more desirable than the latter, as it has less impact on other components.
Constraint MB-2 The motherboard also imposes some constraints on the memory (RAM). Firstly, the desired memory must be less than or equal to the maximum supported memory of the motherboard; there are two fixes for this: the first is to replace the motherboard with one that supports the desired amount of memory; the second is to reduce the desired memory by 128MB until it gets to the maximum supported by the motherboard.
Constraint MB-3 Another constraint imposed on the choice of memory, is that the number of memory modules must be less than or equal to the number of memory slots on the motherboard. If this is not the case, then another type of memory module, with a higher density, should be selected, and the number of memory modules set to zero.
Constraint MB-4 A final motherboard/memory constraint is that the speed of the memory must be supported by the motherboard, if this is not the case, then alternative, supported memory should be chosen.
K.6.2
Processor
The selection of the processor is perhaps the second most important choice in terms of overall system design. There are primarily two options, based on manufacturer: either an AMD processor or an Intel processor. Once that has been decided, the choice of motherboard is restricted to one that supports the processor's socket type; see Constraint MB-1.
K.6.3
Memory
There are a variety of constraints placed on the memory by the motherboard (see Constraint MB-2, Constraint MB-3, and Constraint MB-4); this leaves one other memory related constraint.
Constraint M-1 The total memory must be more than or equal to the desired memory. If this is not the case, then add another memory module.
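The M-1 fix amounts to a simple loop. The sketch below is illustrative only (the function and parameter names are assumed, not from the text):

```python
def fix_m1(module_density_mb, desired_mb, num_modules=0):
    """Constraint M-1 fix: add memory modules until the total memory
    (density * number of modules) is at least the desired memory."""
    while module_density_mb * num_modules < desired_mb:
        num_modules += 1  # the fix: add another module
    return num_modules

# Two 1024 MB modules are needed to reach a 2048 MB target.
print(fix_m1(1024, 2048))
```

Note that the loop terminates because each added module strictly increases the total memory; in a real design Constraint MB-3 (number of memory slots) would bound how many times this fix can be applied.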
K.6.4
Case
The main constraint on the case is that the form factor (size) of the motherboard must be supported by the case. This constraint should be expressed against the type of case that is to be used: so, assuming that the system being designed will use a Desktop case, the Desktop case must support the motherboard’s form factor.
Constraint C-1 The motherboard form factor (size) must be supported by the case: if it is not then the motherboard will not fit the case; the most desirable fix for this violation is to select another case which supports the motherboard’s form factor; an alternative, less desirable, fix is to replace the motherboard for one whose form factor is supported by the case.
K.6.5
Hard Drive
Assuming the system being designed will use a hard drive with a SATA interface, there are three constraints relevant to the hard drive, regarding capacity, the interface, and available bays in the case.
Constraint HD-1 The hard drive must have a capacity equal to or larger than the desired hard drive capacity. If this constraint is violated, then select a hard drive whose capacity is at least the same as the desired capacity (most desirable fix), or reduce the desired hard drive capacity (least desirable fix).
Constraint HD-2 The interface used by the hard drive, which in this system is SATA, must be supported by the motherboard. If the motherboard does not have at least one SATA slot, then either select a hard drive which uses the IDE interface (most desirable), or select another motherboard which has at least 1 SATA slot (least desirable).
Constraint HD-3 The case must have at least one 3.5 inch bay for the hard disk drive to be stored in; if it does not, then select a case with at least one 3.5 inch drive bay.
K.6.6
Optical Drive
Optical drive refers to a class of drives, including CD-ROMs, CD-RWs, DVD-ROMs, DVD-RWs, and so on. Typically the designer of a system will know which type of drive they wish to have, and so constraints will only need to be defined for that particular type of optical drive. Assuming that the system being designed will use a DVD-RW which uses the IDE interface, there are two constraints related to optical drives, regarding the interface and available case bays.
Constraint OD-1 The interface of the optical drive, which in this system is IDE, must be supported by the motherboard. If the motherboard does not have an IDE slot, then either select an optical drive which uses the SATA interface (more desirable), or select another motherboard which has at least one IDE slot (less desirable).
Constraint OD-2 The case must have at least one 5.25 inch bay for the optical drive to be stored in; if it does not, then select a case with at least one 5.25 inch drive bay.
K.6.7
Power Supply Unit (PSU)
Constraint PSU-1 The PSU must be able to provide enough power for all of the components that will be added to the system. If the total required power is more than the PSU is able to provide, then replace the PSU with one that is capable of providing more power.
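The constraint/fix pattern of this section is essentially a propose-and-revise loop: check each constraint and, on violation, apply the most desirable fix first. The sketch below shows only Constraint MB-1, with invented component data (the design and catalogue values are illustrative, not from this document):

```python
# Minimal propose-and-revise style sketch of Constraint MB-1:
# the motherboard and processor socket types must match.

design = {"motherboard": {"socket": "Socket 939"},
          "cpu": {"socket": "LGA775"}}

catalogue = {"cpu": [{"socket": "Socket 939"}, {"socket": "LGA775"}]}

def mb1_violated(d):
    """MB-1 check: do the motherboard and CPU socket types differ?"""
    return d["motherboard"]["socket"] != d["cpu"]["socket"]

def fix_mb1(d):
    """Preferred MB-1 fix: swap the CPU for one matching the motherboard,
    as this has less impact on the other components than swapping the
    motherboard itself."""
    for cpu in catalogue["cpu"]:
        if cpu["socket"] == d["motherboard"]["socket"]:
            d["cpu"] = cpu
            return

if mb1_violated(design):
    fix_mb1(design)
print(design["cpu"]["socket"])
```

A full design system would hold one check/fix pair per constraint (MB-1 to PSU-1) and repeat the check-and-fix cycle until no constraint is violated.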
K.7
Initial Selections and Values
When starting the design process, it is necessary to select a series of components which will be used to start the design, and to specify the values of the various system variables. For this task, start with the following components:
1. Motherboard - 6627MA-RS2H.
2. Power Supply Unit (PSU) - EarthWatts EA380.
3. Central Processing Unit (CPU) - Athlon 64 3500+.
4. DVD-RW - DH-20A3P.
5. Hard drive - ST3160211AS.
6. Desktop case - HQ95A.
7. Memory - Crucial 1GB DDR PC3200.
Further, for this task, set the initial values of the various system variables as follows:
1. Desired minimum memory to 2048 MB.
2. Number of memory modules to 0.
3. The total memory to 0.
Figure K.1: Illustration of the restrictions involved in building a desktop computer.
4. The desired hard drive capacity to 300 GB.
5. The total required power to 0.
K.8
Output Selections
After the design process is complete, it is desirable to know which components have been selected, as well as the values of any variables. Select the following types of components to be displayed after the design is complete:
1. Motherboard.
2. Power Supply Unit (PSU).
3. Central Processing Unit (CPU).
4. DVD-RW.
5. Hard drive.
6. Desktop case.
7. Memory.
Further, specify that the values of the following variables should also be displayed:
1. Desired minimum memory.
2. Number of memory modules.
3. The total memory.
4. The desired hard drive capacity.
5. The total required power.
Appendix L
An Introduction to Computer Hardware Fault Diagnosis
L.1
Introduction
Correctly diagnosing malfunctions with computer hardware can be a tricky, but important, task. To someone unfamiliar with the area, the few obvious, often unhelpful, symptoms that are displayed, such as the computer not switching on, the monitor not displaying anything, or the system producing a seemingly random series of beeps, provide very little help in diagnosing the underlying problem. Being able to accurately diagnose the faulty component means that the relevant steps can be taken to remedy the problem and get the system working correctly again. Different components have different symptoms when they fail, and often it is not the expected component that is causing the fault; this makes the diagnosis process tricky. Often identifying the faulty component requires testing various other components in order to identify the culprit. The remainder of this document describes faults, failures, and possible repairs associated with the different components of a computer system, specifically power supply problems; video problems; motherboard, CPU and memory problems; hard drive problems; optical drive problems; modem problems; and sound problems.
L.2
Power Problems
The power supply unit (PSU) is responsible for providing the computer system with a constant power supply from the building’s mains electric outlets. However, as described below, diagnosis of power related problems is not usually as simple as “if the computer does not switch on then the PSU has failed” as there are a variety of reasons for power related problems.
L.2.1
Problem PSU-1
One reason for the computer not powering up is that the power switch on the case has failed. If this is the case, then the case’s power switch should be replaced.
L.2.2
Problem PSU-2
Alternatively, when the computer is not powering up, it is likely that the PSU has failed. Attempting to repair a PSU is extremely difficult (and potentially dangerous), and so the standard action is to replace it.
L.3
Video Problems
There are various problems associated with the video/graphics system of the computer. Most of these problems exhibit themselves by causing some effect to the image displayed on the monitor;
of course it can also be responsible when no image is displayed at all. Below are some of the performance and failure problems associated with the video/graphics card component.
L.3.1
Problem VP-1
A wavy image on the monitor can be caused by a failing video/graphics card. The advice in this case is to replace the video card.
L.3.2
Problem VP-2
Another problem which shows itself by affecting the image displayed by the monitor is when the displayed image is missing a primary colour. If there are also bent or damaged pins in the video cable (the cable connecting the computer and monitor), then the video cable is faulty and needs to be replaced. This problem can also be caused by the video cable not being securely fitted at both ends; attaching the cable more securely can solve the problem. Other causes for the monitor displaying an image with a missing primary colour are either a failing monitor, which needs replacing, or a failed video card, which will also need replacing.
L.3.3
Problem VP-3
Often the video being displayed on the monitor can become slow, which can mean that the application requires a better video card; this can only be fixed by upgrading the video card.
L.4
Motherboard, CPU, and Memory Problems
The motherboard, CPU and memory are the source of many hardware related problems. Problems here are often interrelated and need to be dealt with as a group, and so problems related to all three components are described in this section.
L.4.1
Problem MCM-1
During normal use, sometimes the computer randomly freezes: becoming unresponsive to all input. This can be caused by an overheating computer, which is typically caused by a fault in the cooling system (e.g. a broken case fan). To remedy this problem, fix the cooling system (e.g. by replacing the broken case fans). Random freezing can also be caused by an overheating CPU. This is usually caused by a broken heatsink (which needs to be replaced).
L.4.2
Problem MCM-2
The computer freezing on the boot screen is typically caused by either a broken CPU (which can be tested by swapping it for another CPU) or a broken motherboard; in both cases, the broken component needs to be replaced.
L.4.3
Problem MCM-3
The main cause of a computer randomly rebooting is faulty memory, which should be removed and/or replaced.
L.5
Hard Drive Problems
The hard drive is responsible for storing and accessing all the files that are used by the operating system, the programs, as well as all the user’s personal files. Therefore being able to detect a problem with it before it fails completely can be vital, as it gives the user a chance to back up their important personal files.
L.5.1
Problem HD-1
A typical hard drive problem is that accessing it is slow and noisy. This can happen when the drive is almost full (usually over 80% full), in which case the only way to solve it is to delete some files. It can, however, also be caused by the ball bearings in the hard drive becoming worn, in which case the drive is failing and needs replacing.
L.5.2
Problem HD-2
Another hard drive related problem is that it starts developing a lot of bad sectors, causing data errors. Often the first symptom of this is the user's documents being frequently corrupted. There are various causes of this: the hard drive cable could be faulty and need replacing; or it can happen when the drive is almost full.
L.6
Optical Drive Problems
The term “optical drive” refers to a group of drives including CD-ROM, CD-RW, DVD-ROM, and DVD-RW drives. CDs and DVDs provide one of the main methods for installing new programs and getting data onto a hard disk drive; recordable versions provide a cheap method for backing up data; optical drives are therefore frequently used, and problems with them are common.
L.6.1
Problem OD-1
One common problem for optical drives is that the tray will not eject. If the computer system is working then this can be caused by the optical drive not receiving power, which is typically caused by a loose connection in the power supply cable. Reseating the drive power supply cable can often solve this problem. Alternatively, this can also be caused by a broken drive motor, which cannot be fixed and so the drive will need to be replaced.
L.6.2
Problem OD-2
Another problem is that the drive will not read data discs. The problem often resides with a dirty disc, and cleaning the disc can often result in it being readable. Alternatively, the source of the problem can be a faulty connecting cable, which must be replaced.
L.7
Modem Problems
The modem is responsible for providing a computer with a way to connect to the internet via a dial-up connection (one which uses the standard telephone line to send and receive data from the Internet). As such, it is relatively important that it works correctly, but there are a variety of problems that can occur.
L.7.1
Problem M-1
The modem failing to dial the Internet Service Provider (ISP) can be caused by having the incorrect modem driver installed on the computer. If the correct driver is installed, and it is not producing a dial tone, then the modem is probably faulty and needs replacing.
L.8
Sound Problems
Although not essential, a correctly functioning sound system can be important to some users: videos are often not as good without the sound; games without sound are also a reduced experience; and listening to music when the computer is not producing sound is impossible. This section describes some sound related problems, along with their causes and possible repairs.
L.8.1
Problem S-1
One of the most commonly occurring sound related problems is that no sound is being produced, which can have a variety of causes. Some simple causes of this are: the volume being set too low; the speaker jack not being connected properly; the wrong driver being installed; or a faulty sound card. These problems can be fixed, respectively, by setting the volume higher; plugging in the speaker jack properly; downloading the correct driver from the Internet and installing it; and replacing the card.
L.8.2
Problem S-2
A recording related problem is that no sound is being recorded. This can be due to the microphone not being enabled in the sound settings option of the operating system.
L.8.3
Problem S-3
Another common complaint is that musical CDs are not being played correctly. This is usually caused by a faulty audio patch cable between the drive and the sound card, which should be refitted or replaced.
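The symptom, cause, and repair knowledge described throughout this appendix follows a single pattern, which can be sketched as a simple lookup table. The entries below merely restate problems PSU-1/PSU-2 and part of S-1 in illustrative Python (a diagnosis KBS built with MAKTab would instead encode these as cause and repair rules):

```python
# Each symptom maps to a list of (possible cause, suggested repair) pairs;
# the entries restate a few of the problems described in the text.

FAULTS = {
    "computer does not power up": [
        ("case power switch has failed", "replace the power switch"),
        ("PSU has failed", "replace the PSU"),
    ],
    "no sound produced": [
        ("volume set too low", "turn the volume up"),
        ("faulty sound card", "replace the sound card"),
    ],
}

def advise(symptom):
    """Return the (cause, repair) pairs recorded for an observed symptom."""
    return FAULTS.get(symptom, [])

for cause, repair in advise("computer does not power up"):
    print(f"{cause}: {repair}")
```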
Appendix M
Questionnaires
M.1
Pre-Experiment Questions
Before you start this experiment, to help us gain an understanding of your level of knowledge relating to the concepts used in this experiment, we would greatly appreciate it if you could answer the following questions. Please circle the appropriate answer.
Have you previously created one or more significant ontologies (i.e. an ontology that has formed the basis of an application, and was not created by following a tutorial or other guide)?
• Yes.
• No.
How familiar are you with the functionality provided by Protégé:
• Not at all.
• Familiar with the basics.
• Can do just about everything in Protégé.
How much experience do you have with problem solvers:
• None.
• Used them, but not created them.
• Have both used and created them.
How much experience do you have with Jess:
• None.
• Relatively little experience.
• Fairly experienced using it.
• Very experienced with it.
How much experience do you have with KBSs:
• None.
• Have used them, but never created them.
• Have both used and created them.
How familiar are you with computer configuration:
• Not familiar.
• Have upgraded computer(s).
• Have designed and built computer(s).
How familiar are you with computer diagnosis:
• Not familiar.
• Familiar with basic problems (e.g. ones that can be resolved without having to remove the case).
• Familiar with more detailed problems (e.g. ones that typically require replacing internal components).
How familiar are you with the propose-and-revise algorithm:
• Not familiar.
• Have understanding of the basic algorithm.
• Have implemented a system using it.
How familiar are you with diagnostic algorithms:
• Not familiar.
• Have understanding of a basic algorithm.
• Have implemented a system using one.
M.2
Questions Regarding the Experiment you have just performed
We would greatly appreciate it if you could take the time to answer the following questions:
M.2.1
General Questions
What feature of MAKTab did you find most difficult to use and why?
What feature of MAKTab did you find most useful and why?
If necessary, do you feel you could currently use MAKTab to build another new KBS?
Please provide any further comments on the mapping tool.
Please provide any further comments on the KA tool.
Thinking in terms of building a new KBS, please provide any comments on the tool.
What feature(s) could be added to the tool to improve it?