Proceedings of the

ECAI 2010 Workshop on Intelligent Engineering Techniques for Knowledge Bases (IKBET)

affiliated with the

19th European Conference on Artificial Intelligence
August 16-20, 2010, Lisbon, Portugal

Alexander Felfernig, Franz Wotawa (eds.)

Preface

Knowledge bases are encountered in AI areas such as configuration, planning, diagnosis, semantic web, game playing, and others. Very often the amount and arrangement of the elements in the knowledge base outstrips the capability of a knowledge engineer to survey them and to effectively perform extension and maintenance operations - a situation that triggers an enormous demand to extend and improve existing development and maintenance practices. This demand is further boosted by emerging Social Semantic Web applications that require the support of distributed engineering scenarios. Intelligent development and maintenance approaches are of high relevance and have attracted lasting research and interest from industrial partners. This is demonstrated by numerous publications, for example, in the IJCAI, AAAI, ECAI, K-CAP, CP, IUI, and ISWC conference and workshop series.

These working notes include research papers dealing with various aspects of knowledge engineering, including overviews of commercial and open source systems, automated testing and debugging, knowledge representation and reasoning, knowledge base development processes, and open research issues. Eleven papers demonstrate the wide range of knowledge engineering research and the diversity and complexity of the problems to be solved. Besides the contributions of research institutes, we have received three industry-related contributions which show the relevance of intelligent knowledge engineering techniques in the commercial context. The authors of the best papers from the workshop have been invited to submit an extended version of their paper to the AICOM special issue on "Intelligent Engineering Techniques for Knowledge Bases". This journal special issue will be available in Spring 2011.

Alexander Felfernig and Franz Wotawa
August 2010


Organization

Chairs
Alexander Felfernig, Graz University of Technology
Franz Wotawa, Graz University of Technology

Program Committee
Alexander Felfernig, Graz University of Technology
Dieter Fensel, University of Innsbruck
Gerhard Friedrich, University of Klagenfurt
John Gero, George Mason University
Albert Haag, SAP Germany
Dietmar Jannach, University of Dortmund
Tomi Männistö, Helsinki University of Technology
Monika Mandl, Graz University of Technology
Lars Hvam, Technical University of Denmark
Gregory Provan, Cork Constraint Computation Centre
Monika Schubert, Graz University of Technology
Steffen Staab, University of Koblenz
Gerald Steinbauer, Graz University of Technology
Markus Stumptner, University of South Australia
Juha Tiihonen, Helsinki University of Technology
Franz Wotawa, Graz University of Technology
Markus Zanker, University of Klagenfurt


Table of Contents

Modelling Product Families for Product Configuration Systems with Product Variant Master
  N. H. Mortensen, L. Hvam, and A. Haug ... 1

Sigma: An Integrated Development Environment for Logical Theories
  A. Pease and C. Benzmüller ... 7

Characterization of Configuration Knowledge Bases
  J. Tiihonen ... 13

Efficient Explanations for Inconsistent Constraint Sets
  A. Felfernig, M. Schubert, M. Mandl, G. Friedrich, and E. Teppan ... 21

On Classification and Modeling Issues in Distributed Model-based Diagnosis
  F. Wotawa and I. Pill ... 27

Diagnosis Discrimination for Ontology Debugging
  K. Shchekotykhin and G. Friedrich ... 33

On the Way to High-Level Control for Resource-Limited Embedded Systems with Golog
  A. Ferrein and G. Steinbauer ... 39

A Domain Language for Expressing Engineering Rules and a Remarkable Sub-Language
  G. Narboni ... 45

Protocols for Governance-free Loss-less Well-organized Knowledge Sharing
  P. Martin ... 51

Knowledge-based Implementation of Metalayers - The Reasoning-Driven Architecture
  L. Hotz and S. von Riegen ... 57

Challenges of Knowledge Evolution in Practice
  A. Falkner and A. Haselböck ... 67


Modelling Product Families for Product Configuration Systems with Product Variant Master

Niels Henrik Mortensen, Lars Hvam, Anders Haug¹

Abstract. This article presents an evaluation of applying a suggested method for modelling product families for product configuration, based on theory for modelling mechanical products, systems theory and object-oriented modelling. The modelling technique includes a so-called product variant master and CRC-cards for modelling and visualising the parts and properties of a complete product family. The modelling techniques include:
• Customer, engineering and part views on the product assortment, to model the properties, functions and structure of the product family. This also makes it possible to map the links between the three views.
• Modelling of characteristics of the product variants in a product family.
• Modelling of constraints between parts in the product family.
• Visualisation of the entire product family on a poster, e.g. 1x2 meters.
The product variant master and CRC-cards are means to bridge the gap between domain experts and IT-developers, thus making it possible for the domain experts (e.g. engineers from product development) to express their knowledge in a form that is understandable both for the domain experts and the IT-developers. The product variant master and CRC-cards have been tested and further developed in cooperation with several industrial companies. This article reports experiences from applying the modelling technique in three different companies. Based upon these experiences, the utility of the product variant master is evaluated.

Significance. Product configuration systems are increasingly used in industrial companies as a means for efficient design of customer-tailored products. The design and implementation of product configuration systems is a new and challenging task for industrial companies and calls for a scientifically based framework to support the modelling of the product families to be implemented in the configuration systems.

Keywords. Mass Customization, modularization, product modelling, product configuration, object-oriented system development.

1. INTRODUCTION

Customers worldwide require personalised products. One way of obtaining this is to customise the products by use of product configuration systems (Tseng and Piller, 2003; Hvam et al., 2008).

A product configuration system is based on a model of the product portfolio. A product model can be defined as a model that describes a product's structure, function and other properties, as well as a product's life cycle properties, e.g. manufacturing, assembly, transportation and service (Krause et al., 1993; Hvam, 1999). A product model used as a basis for a product configuration system also includes a definition of the rules for generating variants in the product assortment (Hvam et al., 2008; Schwarze, 1996). A product configuration system is a knowledge-integrated or intelligent product model, which means that the model contains knowledge and information about the products and, based on this, is able to derive new specifications for product instances and their life cycle properties.

Experiences from a considerable number of industrial companies have shown that these product configuration systems are often constructed without the use of a strict modelling technique. As a result, many of the systems are unstructured and undocumented, and they are therefore difficult or impossible to maintain or develop further. Thus, there is a need for a modelling technique which can ensure that product configuration systems are properly structured and documented, so that the configuration systems can be continually maintained and developed further. In order to cope with these challenges, a technique for modelling product families for product configuration has been developed which makes it possible to document the product configuration systems in a structured way.

This article evaluates the experiences from applying the suggested method (product variant master and CRC-cards) for modelling product families for product configuration systems. The suggested method is based on three theoretical domains:
• Object-oriented modelling (Bennett et al., 1999; Booch et al., 1999; Felfernig et al., 2000; Hvam, 1999)
• Systems theory (Skyttner, 2005; Bertalanffy, 1969)
• Modelling mechanical products (Hubka, 1988; Schwarze, 1996)
The theory for modelling mechanical products and systems theory are used for defining the structure in the PVM and the CRC-cards, and thereby the structure of the configuration system, reflecting the product families to be modelled and the user requirements of the configuration system.

¹ Centre for Product Modelling, Technical University of Denmark. www.productmodels.org, www.man.dtu.dk, www.sam.sdu.dk


2. MODELLING PRODUCT FAMILIES FOR PRODUCT CONFIGURATION SYSTEMS

2.1. The Product Variant Master

A company's product range often appears to be large and have a vast number of variants. To obtain an overall view of the products, the product range is drawn up in a so-called product variant master (Hvam et al., 2008).

Figure 1. Principles of the Product Variant Master

A product variant master consists of two parts. The first of these, the “part-of” model (the left-hand side of the product variant master), contains those modules or parts which appear in the entire product family. For example, a car consists of a chassis, motor, brake system etc. Each module/part of the product range is marked with a circle. It is also possible to specify the number of such units used in the product - for example, 1 motor and 4 wheels in each car. The individual modules/parts are also modeled with a series of attributes which describe their properties and characteristics.
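To make the part-of view concrete, the following is a minimal sketch (in Python; the class and attribute names are ours, for illustration only, and do not come from the paper) of how the car example could be captured as a part-of hierarchy with quantities and attributes:

  from dataclasses import dataclass, field

  @dataclass
  class Part:
      """A module/part in the part-of view of a product variant master."""
      name: str
      quantity: int = 1                                # number of units in the product
      attributes: dict = field(default_factory=dict)   # properties/characteristics
      parts: list = field(default_factory=list)        # sub-parts ("part-of" links)

  # The car example from the text: 1 chassis, 1 motor, 4 wheels.
  car = Part("car", parts=[
      Part("chassis"),
      Part("motor", attributes={"power_kW": (50, 150)}),
      Part("wheel", quantity=4, attributes={"diameter_inch": (14, 18)}),
  ])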

The other part of the product variant master (the right-hand side) describes how a product part can appear in several variants. A motor, for example, can be a petrol or diesel motor. This is shown on the product variant master as a generic structure, where the generic part is called the motor, and the specific parts are called petrol motor and diesel motor, respectively. The two types of structure, "part-of" and "kind-of", are analogous to the structures of aggregation and specialization within object-oriented modelling. The individual parts are also described with attributes, as in the part-of model. In the product variant master, a description is also given of the most important connections between modules/parts, i.e. rules for which modules/parts are permitted to be combined. This is done by drawing a line between the two modules/parts and writing the rules which apply for combining the modules/parts concerned. In a similar manner, the life-cycle systems to be modelled are described in terms of masters that, for example, describe the production system or the assembly system. The individual modules/parts in the product variant master are further described on so-called CRC cards, which are discussed in more detail in the next section. It is normally possible to create the product variant master in such a way that the products are described from the customer's point of view at the top, followed by a description of the product range seen from an engineering point of view, while the products are described from a production point of view at the bottom, as shown in figure 2.

Figure 2. Structure of the Product Variant Master

In connection with the specification of the product variant master and the final object-oriented class model, CRC cards are used to describe the individual object classes (Bellin & Simone, 1997; Hvam & Riis, 2003; Hvam et al., 2008). CRC stands for "Class, Responsibility, and Collaboration". In other words, this is where a description is made of what defines the class, including the class' name, its possible place in a hierarchy, together with a date and the name of the person responsible for the class. In addition, the class' task (responsibility), the class' attributes and methods, and the classes it collaborates with (collaboration) are given. Figure 3 shows an example of a CRC card.

Figure 3. CRC-card

In connection with the modelling of products, a sketch of the product part is added, with a specification of the attributes of the product part. The CRC cards are filled in gradually during the object-oriented analysis. The CRC cards can be associated with both the product variant master and the OOA model. The purpose of the CRC cards is to document detailed knowledge about attributes and methods for the individual object classes, and to describe the classes' mutual relationships. The CRC cards serve as documentation for both domain experts and system developers and thus, together with the product variant master and the class diagram, become an important means of communicating and documenting knowledge within the project group. A more detailed description of the individual fields on the CRC cards is as follows (a small data-structure sketch follows the list):

Class name: The CRC card is given a name that is unique, so that the class can be identified in the overall structure.

Date: Each card is dated with the date on which the card was created, and a date for each time the card is revised.

Author/version: The person who created the card and/or has the responsibility for revising the card, and the card's version number.

Responsibilities: A short text describing the mission of the class. This makes it easier to get a rapid general view of what the class does.

Aggregation and generalization: The class' position in relation to other classes is described by specifying the class' place in generalization-specialization structures or aggregation structures. This is done by describing which superclasses or subclasses are related to the class within either a generalization-specialization hierarchy or an aggregation hierarchy. Generalization categorises classes with common properties from general to specific classes in the so-called generalization-specialization hierarchies, also known as inheritance hierarchies, because the specialized classes further down in the hierarchy "inherit" general properties from the general (higher level) classes. The other type of structure, aggregation, is one in which a higher-level class (the whole) consists of a number of subordinate classes (the parts). Using ordinary language, decomposition of an aggregation structure can be expressed by the phrase "has a" and aggregation by "is part of".

Sketch: When working with product descriptions, it is convenient to include a sketch which describes, in a concise and precise manner, the attributes included in the class. Geometric relationships are usually easier to explain by using a sketch/drawing than by using words.

Class attributes: The various parameters, such as height-interval, width-interval etc., which the class knows about itself, are described in the field "Attributes". Attributes are, as previously mentioned, described by their names, datatype (if any), range of values and units (for example, Length, [Integer], [1..10] mm). It is often convenient to express attributes in table form.

Class methods: What the class does (for example, calculation of an area) is placed in the field "Methods", which, as stated, can be divided into system methods and product methods. Methods can be described in natural language, with the help of tables, with pseudo code, by using a formal notation such as the Object Constraint Language (OCL) (Warmer & Kleppe, 1999), which is a standard under UML, or by using the notation from individual items of configuration software.
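As an illustration of the fields just described, they map naturally onto a small data structure. This is our own sketch (Python), not tooling from the paper; the attribute example reuses the (Length, [Integer], [1..10] mm) format given above:

  from dataclasses import dataclass, field

  @dataclass
  class CRCCard:
      """A CRC card: Class, Responsibility, and Collaboration."""
      class_name: str                                     # unique class name
      date: str                                           # created/revised dates
      author_version: str                                 # author and version number
      responsibilities: str                               # short mission statement
      superclasses: list = field(default_factory=list)    # generalization links
      part_of: list = field(default_factory=list)         # aggregation links
      sketch: str = ""                                    # reference to a drawing
      attributes: dict = field(default_factory=dict)      # name -> (type, range, unit)
      methods: dict = field(default_factory=dict)         # what the class does
      collaborations: list = field(default_factory=list)  # collaborating classes

  motor_card = CRCCard(
      class_name="Motor",
      date="2010-08-16",
      author_version="NN / v1",
      responsibilities="Describes the motor variants of the car family.",
      part_of=["Car"],
      attributes={"Length": ("Integer", "[1..10]", "mm")},
  )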

3. CASE STUDY

3.1. The Industry Applications

The Product Variant Master has been applied in several manufacturing companies. In the following we shall give a brief introduction to three of those companies and their configuration projects, and we will discuss the experiences and lessons learned by using the product variant master and CRC-cards for modelling product families for product configuration systems.

Company A

Company A is an engineering and industrial company with an international market leading position within the area of development and manufacturing of cement plants. The company has a turnover around 1 billion USD. A modern cement plant typically produces 2-10,000 tonnes of clinkers per day (TPD), and the sales price for a 4,000 TPD plant is approx. 100 million USD. Every complete cement plant is customized to suit the local raw material and climatic conditions, and the lead-time from signing the contract to start-up is around 2½ years. The company has initiated a project on the development of a product configuration system for the selection of the main machines of the factory and the overall determination of capacity, emissions etc., based on 2-300 input parameters covering e.g. raw materials, types and qualities of finished goods, geographical conditions, energy supply etc. The configuration system is meant to be used in the early sales phase for making budget quotations, including an overall dimensioning of the cement factory, a process diagram and a price estimate. In the product analysis, the cement factory was divided into 9 process areas. Based on the process areas, the model was structured hierarchically, starting with scope list and mass flow and leading to the selection of solution principles described as arrangements and lists of machines. The product analysis was carried out by using the product variant master and the CRC-cards. The product variant master proved to be a useful tool for modelling the overall cement milling processes and machine parts of the cement factory. The product variant master was built up through a series of meetings between the modelling team and the product experts at the company. The detailed attributes and constraints in the model were described on the CRC-cards by the individual product experts. The product variant master and the CRC-cards formed the basis for making a formal object-oriented model of the configuration system. Based on the product variant master, an object-oriented class structure was derived, and the CRC-cards were checked for inconsistency and logical errors. The product variant master was drawn using MS Visio; the CRC-cards were typed in MS Word documents. The product configuration system was implemented in an object-oriented standard configuration software, "iBaan eConfiguration Enterprise". The OOA model was derived by the programmer of the configuration system based on the PVM and the CRC-cards. The CRC-cards refer both to the product variant master and to the object-oriented class model, which represents the exact structure of the product configuration system in "iBaan eConfiguration Enterprise". The experiences are summarised in table 1 below.

Table 1. PVM and CRC-cards in company A

Modelling tool for domain experts - how the PVM was received by the domain experts:
The domain experts were engineers from product design and project engineering. The domain experts could easily learn the modelling techniques for the purpose of evaluation of the models and input of information. The modelling was carried out by the configuration team and external consultants.

The possibilities for structuring product knowledge:
The customer, engineering and part views have proved to be useful when modelling these products of high complexity. Structuring the PVM requires knowledge of domain theory for modelling mechanical products.

Visualisation and decision making on the product families in focus:
The product variant master proved able to capture and visualise the product assortment, in this case on a high level of abstraction. There is a need to control the discussions to avoid irrelevant details in the model, and it is important to involve the right domain experts in order to make the decisions needed on the product variants to include in the configuration system.

PVM as a basis for programming the configuration system:
Based on the PVM, an OOA-model was set up to be used as a basis for implementing and maintaining the configuration software. It was easy to transform the PVM into a formal OOA model.

PVM and CRC cards as documentation when maintaining and updating the configuration system:
The PVM is only used in the analysis phase and is not maintained while the configuration system is running. The OOA-model is currently being updated and used in the running phase of the system. The company has over the last 10 years increased the level of documentation and searched for IT-systems to manage the documentation task.

The product configuration system is currently being updated and further developed by a task force which is responsible for updating and further developing the model and the configuration system in cooperation with the product specialists. Maintaining the documentation in Visio and Word has proved to be tedious work, as the same attributes and rules have to be updated in several systems. In order to improve this, the company has during the last year implemented an IT-system for documentation, which means that the rules and attributes now only have to be entered once into the documentation system. The project at Company A has proven that the use of the three views in the PVM can be useful when modelling complex products. Finally, the configuration system has been implemented in two different product configuration systems. The first configuration system was not object-oriented, while the second ("iBaan eConfiguration Enterprise") was fully object-oriented. Applying an object-oriented system made it considerably easier to implement and maintain the product configuration system.

3.1.1 Company B

Company B is an international engineering company that has a market leading position within the area of design and supply of spray drying plants. The company creates approx. 340 million USD in turnover a year. The products are characterised as highly individualised for each project. The configuration system was implemented in 2004 at company B and is in many ways similar to the configuration system at company A. The project focuses on the quotation process. During the development of the product model, the need for an effective documentation system emerged. Early in the project it was decided to separate the documentation system from the configuration software, due to the lack of documentation facilities in the standard configuration systems. Lotus Notes Release 5 is implemented throughout the company as a standard application, and all the people involved in the configuration project have the necessary skills to operate this application. The documentation tool is, therefore, based on the Lotus Notes application. The documentation system is built as a hierarchical file sharing system. The UI is divided into two main parts, the product variant master and the CRC cards. However, the product variant master is used in a different way: only the whole-part structure of the product variant master is applied in the documentation tool. Main documents and responses are attached to the structure of the product variant master. The configuration system is implemented in Oracle Configurator. Since the standard configuration software does not provide full object orientation, the CRC cards described above have been changed to fit the partly object-oriented system. The fields for generalization and aggregation have been erased; the aggregation relations can be seen from the product variant master, and generalization is not supported. To ease the domain experts' overview, an extra field to describe rules has been added. In this way, methods (does) have been divided into two fields, does and rules. "Does" could be e.g. print BOM, while "Rules" could be a table stating valid variants. The CRC card is divided into three sections. The first section contains a unique class name, date of creation and a plain language text explanation of the responsibility of the card. The second section is a field for sketches. This is very useful when different engineers need quick information about details. The sketch describes in a precise manner the attributes that apply to the class. The last section contains three fields for knowledge insertion and collaborations. Various parameters such as height, width etc., which the class knows about itself, are specified in the "knows" field. The "knows" field contains the attributes, and the "rules" field describes how the constraints work. The "does" field describes what the class does, e.g. print or generate a BOM. Collaborations specify which classes collaborate with the information on the CRC card in order to perform a given action. The documentation system has been in use throughout the project period. As mentioned, the documentation system was created by using standard Notes templates. This imposes limitations on the functionality of the application. Class diagrams are not included in the documentation tool; they must be drawn manually. Implementing class diagrams would demand a graphical interface, which is not present in the standard Lotus Notes application. Table 2 below lists the main findings from company B.


Table 2. PVM and CRC-cards in company B

Modelling tool for domain experts - how the PVM was received by the domain experts:
The modelling is carried out by the configuration team and reviewed by the domain experts/engineers. However, the domain experts contributed parts of the model reflecting their special fields of competence. It was easy for the domain experts to learn to read and review the PVM and CRC-cards.

The possibilities for structuring product knowledge:
The model includes complex and highly engineered products. The three views in the PVM were used to clarify the interdependencies between customer requirements, main functions in the products and the bill of material (BOM) structure. These interdependencies were expressed as rules and constraints in the configuration model.

Visualisation and decision making on the product families in focus:
The PVM was used to determine the preferred solutions to enter into the configuration system. The links between the three views provided insight into the consequences of delimiting which customer requirements to meet, the main functions and the number of main BOMs to include in the configuration system. The sketches on the CRC-cards have proved to be very useful in the communication with the domain experts.

PVM as a basis for programming the configuration system:
The PVM and the CRC-cards were used as documentation for programming the configuration system. No formal OOA model was made.

PVM and CRC cards as documentation when maintaining and updating the configuration system:
The PVM was transferred into a file structure in the company's file sharing system (Lotus Notes). The CRC-cards are stored and currently updated in Lotus Notes. Domain experts are responsible for updating the individual CRC-cards.

Lessons learned from the project at Company B were that an IT-based documentation tool is necessary in order to secure an efficient handling of the documents in the product model (product variant master and CRC-cards). The configuration system is implemented in Oracle Configurator, which is not a fully object-oriented configuration system. This means that e.g. inheritance between object classes in the product configuration system is not possible. The company uses a variant of the product variant master, where only the whole-part structure on the left-hand side of the product variant master is applied. However, the experiences from the project are that the revised product variant master and the CRC-cards still secure a structure and documentation of the system.

3.1.2 Company C

Company C produces data centre infrastructure such as uninterruptible power supplies, battery racks, power distribution units, racks, cooling equipment, accessories etc. The total turnover is approx. 4 billion USD (2008). The company has applied the PVM and CRC-cards since 2000. Today, the company has 8-9 product configuration systems. The company has formed a configuration team with approx. 25 employees situated in Kolding, Denmark. The configuration team is responsible for development and maintenance of the product configuration systems, which are used worldwide.

Table 3. PVM and CRC-cards in company C

Modelling tool for domain experts - how the PVM was received by the domain experts:
The PVM is used for modelling the product families in a cooperation between the configuration team and engineers from product development.

The possibilities for structuring product knowledge:
Relative to companies A and B, this company has a more "flat" structure in the PVM and configuration system, meaning that the level of structuring is lower than at companies A and B. The PVM and CRC-cards are set up by the configuration team and afterwards discussed with product development.

Visualisation and decision making on the product families in focus:
The product configuration systems are set up after the product development project is completed. The PVM and CRC-cards lead to clarification of decisions on product variants that haven't been made in the development project.

PVM as a basis for programming the configuration system:
The configuration system is programmed based on an object-oriented model made from the PVM and the CRC-cards.

PVM and CRC cards as documentation when maintaining and updating the configuration system:
Only the CRC-cards are used for maintenance and update of the product configuration systems. The CRC-cards are stored and currently updated in Lotus Notes. The configuration team members are responsible for updating the individual CRC-cards.

The product assortment is modeled in a co-operation between the configuration team and the product development teams. The product variant master and the CRC cards are used in the modeling process and document the configuration systems throughout programming and maintenance of the product configuration systems. Similar to Company B, company C has developed a Lotus Notes based documentation tool - called the CRC-card database - to handle the documentation of the models. The configuration systems are implemented in Cincom Configurator, which is a rule based configuration system. Lessons learned from company C are that the need for an IT-based documentation tool is even bigger at this company than at Company B. Running a configuration team with 25 employees, which has to communicate with product development teams around the world, requires a structured procedure for building the product configuration systems, as well as a Web-based documentation tool which can be accessed by employees worldwide. At company C, a Notes based documentation tool has been developed, similar to the Notes application at company B. The documentation system has now been running for 6 years. The experiences from running the documentation system are that the structure in the product variant master and the CRC-card database form a solid basis for communicating with e.g. product designers and for maintaining and further developing the product configuration systems. However, the fact that the documentation system is separated from the configuration software means that the attributes and rules have to be represented both in the documentation tool and in the configuration system, with only limited possibilities of relating the two systems to one another. This means that the configuration team at company C needs to be disciplined about updating the documentation system every time a change is made in the configuration system.

4. CONCLUSION

The proposed modelling techniques are based on well-known and proven theoretical elements: theory for modelling mechanical products, systems theory and object-oriented modelling. The aim of the product variant master and CRC-cards is to serve as a tool for engineers and programmers working with design, implementation and maintenance of product configuration systems. The experiences from applying the procedure in the three industrial companies described above show that the product variant master and CRC-cards contribute to defining, structuring and documenting product configuration systems. The product variant master and the CRC-cards make it possible to communicate and document the product assortment, including rules and constraints on how to design a customer-tailored product. The product variant master is developed based on the basic principles in object-oriented modelling and programming. However, only a very small part of the standard configuration systems based on rules and constraints and including an inference engine are fully object-oriented. In order to meet this situation, the product variant master and the CRC cards have been changed to fit configuration systems which are not fully object-oriented. The experiences from applying the procedure when building configuration systems in non object-oriented systems are positive. However, structuring and maintaining configuration systems is considerably easier in an object-oriented standard configuration system. Furthermore, an integration of the documentation tool and the configuration systems would considerably ease the work of design, implementation and maintenance of product configuration systems.

5. REFERENCES

Bellin, D. & Simone, S. S. (1997). The CRC Card Book. Addison-Wesley.
Bennett, S., McRobb, S. & Farmer, R. (1999). Object-Oriented Systems Analysis and Design using UML. McGraw-Hill.
Bertalanffy, L. von (1969). General System Theory.
Booch, G., Rumbaugh, J. & Jacobson, I. (1999). The Unified Modeling Language User Guide. Addison-Wesley.
Felfernig, A., Friedrich, G. E. & Jannach, D. (2000). UML as Domain Specific Language for the Construction of Knowledge-based Configuration Systems. International Journal of Software Engineering and Knowledge Engineering, 10(4).
Hubka, V. (1988). Theory of Technical Systems. Springer.
Hvam, L. (1994). Application of product modelling - seen from a work preparation viewpoint. Ph.D. thesis, Dept. of Industrial Management and Engineering, Technical University of Denmark.
Hvam, L. (1999). A procedure for building product models. Robotics and Computer-Integrated Manufacturing, 15, 77-87.
Hvam, L. & Malis, M. (2002). A Knowledge Based Documentation Tool for Configuration Projects. In: Tseng, M. M. & Piller, F. T. (eds.), Mass Customization for Competitive Advantage.
Hvam, L. & Riis, J. (2003). CRC-cards for product modeling. Computers in Industry, 50(1).
Hvam, L., Mortensen, N. H. & Riis, J. (2008). Product Customization. Springer.
Jackson, P. (1999). Introduction to Expert Systems, 3rd edition. Addison-Wesley.
Krause, F. L., Kimura, F. & Kjellberg, T. (1993). Product Modelling. Annals of the CIRP, 42(2).
Schwarze, S. (1996). Configuration of Multiple-Variant Products. BWI, Zürich.
Skyttner, L. (2005). General Systems Theory.
Tiihonen, J., Soininen, T., Männistö, T. & Sulonen, R. (1996). State-of-the-practice in Product Configuration - a Survey of 10 Cases in the Finnish Industry. In: Tomiyama, T., Mäntylä, M. & Finger, S. (eds.), Knowledge Intensive CAD, volume 1. Chapman & Hall, pp. 95-114.
Tseng, M. M. & Piller, F. T. (eds.) (2003). The Customer Centric Enterprise - Advances in Mass Customization and Personalization. Springer. ISBN 3-540-02492-1.
Warmer, J. & Kleppe, A. (1999). The Object Constraint Language. Addison-Wesley.
Whitten, J. L., Bentley, L. D. & Dittman, K. C. (2001). Systems Analysis and Design Methods. Irwin/McGraw-Hill, New York.


Sigma: An Integrated Development Environment for Logical Theories
A. Pease and C. Benzmüller

[The body of this paper (pages 7-12) did not survive text extraction. The only recoverable fragments are two SUO-KIF examples: the assertion

  (authors Dickens OliverTwistBook)

and the natural-language format statements

  (format EnglishLanguage authors "%1 is %n the &%author of %2")
  (format it authors "%1 è l' &%autore di %2")

which render the authors relation in English and Italian, respectively.]

On the Way to High-Level Control for Resource-Limited Embedded Systems with Golog
A. Ferrein and G. Steinbauer

[Page 43 of this paper did not survive text extraction, apart from a pbLua code fragment for the LEGO Mindstorms NXT. Reconstructed here (a missing closing parenthesis has been restored), it spins the robot's two motors in opposite directions until the light sensor reads a value below a black threshold, then stops both motors:]

  repeat
    nxt.OutputSetSpeed(MOTOR_A, NO_RAMP, 50)
    nxt.OutputSetSpeed(MOTOR_B, NO_RAMP, -50)
  until nxt.InputGetStatus(SENSOR_1) < BLACK_LIMIT
  nxt.OutputSetSpeed(MOTOR_A, NO_RAMP, 0)
  nxt.OutputSetSpeed(MOTOR_B, NO_RAMP, 0)


[4] G. De Giacomo, Y. Lespérance, and H. Levesque, 'ConGolog, a concurrent programming language based on the situation calculus', Artificial Intelligence, 121(1-2), 109-169, (2000).
[5] Alexander Ferrein, 'lua.golog: Towards a non-prolog implementation of Golog for embedded systems', in Cognitive Robotics, eds., Gerhard Lakemeyer, Hector Levesque, and Fiora Pirri, number 100081 in Dagstuhl Seminar Proceedings. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, Germany, (2010). To appear.
[6] Alexander Ferrein and Gerhard Lakemeyer, 'Logic-based robot control in highly dynamic domains', Robotics and Autonomous Systems, Special Issue on Semantic Knowledge in Robotics, 56(11), 980-991, (2008).
[7] Alexander Ferrein, Gerald Steinbauer, Graeme McPhillips, and Anet Potgieter, 'RoboCup Standard Platform League - Team Zadeat - An Intercontinental Research Effort', in International RoboCup Symposium, Suzhou, China, (2008).
[8] Giuseppe De Giacomo, Yves Lespérance, Hector J. Levesque, and Sebastian Sardina, Multi-Agent Programming: Languages, Tools and Applications, chapter IndiGolog: A High-Level Programming Language for Embedded Reasoning Agents, 31-72, Springer, 2009.
[9] Henrik Grosskreutz and Gerhard Lakemeyer, 'ccgolog - A logical language dealing with continuous change', Logic Journal of the IGPL, 11(2), 179-221, (2003).
[10] Ralph Hempel. pbLua - scripting for the LEGO NXT. http://www.hempeldesigngroup.com/lego/pblua/, 2010. (Last visited on May 21, 2010.)
[11] Ashwin Hirschi, 'Traveling Light, the Lua Way', IEEE Software, 24(5), 31-38, (2007).
[12] Roberto Ierusalimschy, Luiz Henrique de Figueiredo, and Waldemar Celes Filho, 'Lua - An Extensible Extension Language', Software: Practice and Experience, 26(6), 635-652, (1996).
[13] Roberto Ierusalimschy, Luiz Henrique de Figueiredo, and Waldemar Celes Filho, 'The Evolution of Lua', in Proceedings of History of Programming Languages III, pp. 2-1 - 2-26. ACM, (2007).
[14] Kurt Konolige, Karen Myers, Enrique Ruspini, and Alessandro Saffiotti, 'The Saphira architecture: A design for autonomy', Journal of Experimental and Theoretical Artificial Intelligence, 9, 215-235, (1997).
[15] Hector J. Levesque and Maurice Pagnucco, 'Legolog: Inexpensive experiments in cognitive robotics', in Proceedings of the Second International Cognitive Robotics Workshop, Berlin, Germany, (2000).
[16] Hector J. Levesque, Raymond Reiter, Yves Lespérance, Fangzhen Lin, and Richard B. Scherl, 'GOLOG: A logic programming language for dynamic domains', The Journal of Logic Programming, 31(1-3), 59-83, (1997). Reasoning about Action and Change.
[17] J. McCarthy, 'Situations, Actions and Causal Laws', Technical report, Stanford University, (1963).
[18] H. Pham, Applying DTGolog to Large-scale Domains, Master's thesis, Department of Electrical and Computer Engineering, Ryerson University, Toronto, Canada, 2006.
[19] Raymond Reiter, Knowledge in Action. Logical Foundations for Specifying and Implementing Dynamical Systems, MIT Press, 2001.
[20] Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach (Second Edition), Prentice Hall, 2003.
[21] Paul Talaga and Jae C. Oh, 'Combining AIMA and LEGO Mindstorms in an artificial intelligence course to build real-world robots', Journal of Computing Science in Colleges, 24(3), 56-64, (2009).
[22] The Debian Project. The Computer Language Benchmarks Game. http://shootout.alioth.debian.org/. Retrieved Jan 30th 2009.


A domain language for expressing engineering rules and a remarkable sub-language

Guy A. Narboni¹

Abstract. We describe and analyse a constraint language for expressing logic rules on variables having a finite integer range. Our rules can involve linear arithmetic operations and comparisons to capture common engineering knowledge. They are capable of bidirectional inferences, using standard constraint reasoning techniques. Further, we show that for a restricted sub-language the basic propagation mechanism is complete. For systems in that class, local consistency implies global consistency. This noteworthy property can be instrumental in assessing the scalability of an Expert System, which in practice is a cause for concern with the growth of data. When the knowledge base passes a syntactic check, we can guarantee that its solution is search-free and therefore backtrack-free. By way of an illustration, we examine the worst case complexity of a dynamic product configuration example.

¹ Implexe, France, email: [email protected]

1 INTRODUCTORY EXAMPLE

Rules are a natural means for stating general principles, legal or technical. For instance in Europe, any installation distributing electrical power must comply with the IEC standards. A mandatory safety requirement stipulates that: in a humid environment, the degree of protection provided by enclosures must be rated IPX1. Formally, we have a logical implication:

  env = humid → code = IPX1

In that statement, the environment characteristic can only take two values (dry or humid), whereas the IP code has a wider but still finite set of choices (including the IP3X, IP4X and IPX1 values). The above rule therefore pertains to propositional calculus. With respect to Boolean satisfiability, an equivalent formulation, using binary variables only, is given by the clausal system:

  h → c1
  c3 ∨ c4 ∨ c1

where h is true when the environment is humid and where c1 (resp., c3, c4) is true when the degree of protection is IPX1 (resp., IP3X, IP4X). The additional disjunction ensures that the equipment has indeed an IP code (i.e., c3, c4 and c1 cannot be simultaneously false)².

It appears that, although the clause h → c1 is both Krom and Horn (it has exactly 2 literals, and no more than one is positive), the resulting system is neither Krom nor Horn. Now, those are the two best known classes of rules for which the Boolean satisfiability problem can be solved in reasonable time. So, should we fear the worst case behaviour (i.e., a combinatorial explosion) when having to reason with larger rule sets translating conditions of such a simple kind? Until proven otherwise, presumably.

In this paper, we show that for some specific systems (including the above example), such guarantees can be given. To start with, we define a small constraint language for expressing logic rules on variables having a finite integer range. We observe that forward and backward inferences can be performed rule-wise, by narrowing the variables' domains. Then, assuming that consistency is checked locally, at the rule level, using the general propagation mechanism for constraint solving, we prove that, in a special case, we get global consistency for free. Rules in that remarkable case are easily recognizable from their syntactic structure. Coming from tractability theory, this result generalizes the Horn propositional case. Despite the relative narrowness of the class of rules identified, we give a full-fledged example to which it is applicable. We thus conclude that it should be of practical interest for the handling of vast knowledge bases, in areas such as dynamic product configuration where scalability is a major concern.

² If need be, 3 extra clauses will forbid multiple selections: c3 ∧ c4 → false, c3 ∧ c1 → false, c4 ∧ c1 → false.

2 LOGIC RULES

We shall resort to first order logic for expressing the rules. Rules are clearly easier to read and write using attribute-value pairs, since variables are directly associated to real world domain values. Modeling tools for engineers such as the Component Description Language for Xerox machines [1] or the EXPRESS product data specification language [2] often follow that convention. We'll further assume that:
• the problem's dimension n is known (we have a finite number of attributes)
• all attribute domains are finite and, for convenience, integer valued (the latter is not a true restriction, since one can always use a one-to-one mapping for encoding values).


2.1 Language definition

In our language, constants are integers. Variables correspond to attribute names. The functional expressions allowed are linear expressions with integer coefficients. The domain of interpretation is the set of integers with the usual arithmetic operations and the comparison relations =, ≤ and ≥. A fact or atomic formula is a diophantine equation or inequality, i.e., a constraint of the form:

  linexp comparator constant

where linexp denotes a linear expression in (at most) n variables. An atomic formula can be used to state a fact or a basic restraint. Often, there will be just one variable with a coefficient of 1, the other variables having zero coefficients, as in code ≥ 1. A conjunctive formula is defined recursively: an atomic formula is a conjunctive formula, and so is the conjunction of an atomic formula with a conjunctive formula. For instance, the constraint model Ax ≤ b of an integer program (IP) is a conjunctive formula. A rule or conditional formula is an implication constraint:

  conjunctive formula → atomic formula

It is a restricted form of rule with possibly several antecedents but only one consequent. An example is a car configuration statement excluding the air conditioning option when the battery and the engine are of small size (e.g., 1 on a scale of 3), which writes:

  battery = 1 (small) ∧ engine = 1 (small) → aircond = 0 (excluded)

Lastly, a knowledge base is a finite conjunction of, say m, formulas which can be either rules or facts¹. We can view it formally as a quantifier-free Presburger arithmetic formula.

¹ We'll include the facts into the rule base by likening a fact to a rule with no antecedent.

2.2 Global consistency checking

Proving the consistency of a knowledge base is a central issue in automated deduction. This question applies to constraint programs made of logic rules. During an interactive session, new facts or restraints are typically added to the rule base, which grows incrementally. Therefore, a truth maintenance system repeatedly needs to check the satisfiability of a closed formula:

  ∃x R1(x) ∧ ... ∧ Rm(x) ∧ x ∈ D1 × ... × Dn   (1)

where each n-ary constraint Ri in the conjunct states a fact or a rule, and each unary range predicate xj ∈ Dj in the cross-product sets a domain constraint (i.e., an integer constraint plus a pair of linear inequalities, basically). An inference system will be complete if it is able to answer no to the existential query as soon as the constraint system becomes inconsistent. The formula will be satisfiable if there is some instantiation of the variables by constants from the domains which makes it true (an instantiation makes each atomic constraint either true or false, and consequently, the formula can be evaluated to be true or false). Because there are finitely many possible assignments, the satisfiability problem is decidable. However, the decision procedure is equivalent in complexity to SAT solving. E.g., determining whether there exists a valid configuration for a set of rules can be a formidable challenge (the general problem being NP-complete) [3]. In practice, solution methods involve a search process, and don't scale well.

2.3 Finite domains solvers

General-purpose solvers routinely used in constraint programming languages are by design incomplete. Their deductive power is a trade-off between inference speed and strength. They check local consistency conditions that are necessary but not sufficient for ensuring global consistency. Most of the time, the initial domain filtering step (propagation) will have to be followed up by, and interlaced with, a programmable enumeration phase (labeling), in order to answer the question.

3 INFERENCES AS DOMAIN REDUCTIONS

3.1 Database semantics

A logic rule defines a relation. Interestingly, without resorting to constraints, our introductory example rule could not be expressed intensionally as a pure Prolog program, because of the Horn limitation. To do so, we would need to extend logic programs with disjunction [4] [5]. However, the semantics of the virtual relation R a logic rule defines can be expressed by an extensional database:

  {x | R(x) ∧ x ∈ D1 × ... × Dn}

The set of points (or tuples) that satisfies the formula corresponds to the disjunctive normal form. Assume we use the following encodings in our example:
• 0 for dry, 1 for humid
• -1, 0 and 1 for IP3X, IP4X and IPX1, respectively.
Then, the intensional model is the logical implication:

  env = 1 → code = 1

together with the finite domain constraints at the source of the disjunctions: env ∈ {0, 1} and code ∈ {-1, 0, 1}. The extensional model is the relational table:

  R(env, code) = {(0, -1), (0, 0), (0, 1), (1, 1)}

The compatibility constraint thus rules out 2 out of 6 possibilities. Figure 1 depicts the relation R on a 2-dimensional lattice. We've highlighted the convex hull of its solution points, which shows a least element, (0, -1), in the plane. We'll go back later over this special geometrical property.

Figure 1. 2D representation

3.2 Compiled vs. interpreted approaches

Once a relation is explicitly stored in an extensional table, the search for solutions is linear in the size of that table. However, the number of rows grows exponentially with the number of variables. With decision diagrams, compiled versions may enjoy much more compact forms (depending, of course, on the variable ordering). But again, there is no guarantee that the process of generating the table will terminate in reasonable time. The main advantage of the compilation approach is that the bulk of the work can be carried out off-line. The drawback is that the rule base then becomes static. When one has to deal with dynamicity, the alternative is a logic rule interpreter. This is the choice implicitly made in the sequel, through the use of a constraint solver for implementing the inference engine.
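As a concrete illustration of sections 3.1 and 3.2, the following sketch (Python, ours, not from the paper) generates the extensional table of the introductory rule by enumerating the cross-product of the encoded domains:

  from itertools import product

  # Integer encodings from section 3.1:
  # env: 0 = dry, 1 = humid; code: -1 = IP3X, 0 = IP4X, 1 = IPX1.
  DOMAINS = {"env": [0, 1], "code": [-1, 0, 1]}

  def rule(env, code):
      # The intensional model: env = 1 -> code = 1.
      return env != 1 or code == 1

  # Extensional model: keep the tuples of the cross-product
  # that satisfy the rule.
  R = [(env, code)
       for env, code in product(DOMAINS["env"], DOMAINS["code"])
       if rule(env, code)]

  print(R)  # [(0, -1), (0, 0), (0, 1), (1, 1)]

As the text notes, this compiled approach is only viable off-line: the table size grows exponentially with the number of attributes.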

What propagation guarantees

Local consistency is a prerequisite for global consistency. The preceding examples required only bounds-consistency checking (stronger degrees of consistency can be defined). Before moving to the system level (i.e., to the consistency analysis of a set of rules), we need to recall the main properties of the general filtering mechanism which is at work with a finite domain solver [6]. While preserving equivalence, the propagation mechanism basically transforms formula (1) into formula (2): % x R1(x) " ..." Rm(x) " x & D1 ' ...' Dn (2)

3.3 Constrained queries The standard conditions that can be expressed in the where clause of a database selection query closely resemble our conjunctive formulas —at least, they share the same syntax.

The structural constraints Ri of (1) are unchanged. Only the variable domains may change and be contracted from Dj to Dj. The reduced domain Dj * Dj is the result of a fixpoint computation in which all inter-related rules take part. Note that domain constraints are the sole ones handled globally.

3.3.1 Modus ponens The SQL query: select code from R where env = 1 gives a unique solution: Answer(code) = {1} code = 1 is also the result obtained when applying the rule in forward chaining mode, in a cause and effect relationship. This corresponds to the classical inference: from env = 1 and env = 1 code = 1 we draw code = 1 Through the angle of constraint solving, the effect of applying a rule amounts to domain narrowing. One local consistency check performed on the domains is the following: when the domain De of env is reduced to {1}, the domain Dc of code is narrowed down to Dc ( {1}. Thus, if Dc ( {1} is empty, inconsistency is immediately detected. On the other hand, a query like: select code from R where env = 0 yields three solutions for the IP code since there is no restriction then on the degree of protection. Consequently, there will be no domain reduction.

3.4 What propagation guarantees

Local consistency is a prerequisite for global consistency. The preceding examples required only bounds-consistency checking (stronger degrees of consistency can be defined). Before moving to the system level (i.e., to the consistency analysis of a set of rules), we need to recall the main properties of the general filtering mechanism at work in a finite domain solver [6]. While preserving equivalence, the propagation mechanism basically transforms formula (1) into formula (2):

∃x R1(x) ∧ ... ∧ Rm(x) ∧ x ∈ D'1 × ... × D'n    (2)

The structural constraints Ri of (1) are unchanged. Only the variable domains may change and be contracted from Dj to D'j. The reduced domain D'j ⊆ Dj is the result of a fixpoint computation in which all inter-related rules take part. Note that domain constraints are the only ones handled globally.

No matter how local consistency is defined and implemented, the following property holds:
1. if there are any solutions to the system, then the solutions lie within the bounds specified by the cross-product of the reduced domains. In other words, there are no solutions outside. Should one domain become empty, then we have a proof that the system is inconsistent.
When no domain is empty, the following properties equally hold:
2. for each constraint Ri, the formula (3i) is satisfiable:
∃x Ri(x) ∧ x ∈ D'1 × ... × D'n    (3i)
3. in particular, for each variable xj, the formulas (3ijmin) and (3ijmax) are satisfiable:
∃x Ri(x) ∧ xj = min(D'j) ∧ x ∈ D'1 × ... × D'n    (3ijmin)
∃x Ri(x) ∧ xj = max(D'j) ∧ x ∈ D'1 × ... × D'n    (3ijmax)
In other words, each rule does its best to narrow down the domains (e.g., by checking the consistency of partial assignments involving maximum and minimum domain values). When all domains reduce to a point, i.e., when min(D'j) = max(D'j), these properties guarantee that the unique corresponding assignment is indeed a solution. However, in general, nothing certifies that a satisfying assignment for one constraint is also a satisfying assignment for another constraint.
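As an illustration of the fixpoint computation (our sketch, not the paper's), propagators can be rerun generically until the domains stabilize, yielding the reduced domains D'j of formula (2); an empty domain signals inconsistency, per property 1.

def fixpoint(domains, propagators):
    # domains: dict var -> set of values; propagators mutate domains in place.
    changed = True
    while changed:
        changed = False
        for prop in propagators:
            before = {v: set(d) for v, d in domains.items()}
            prop(domains)
            if any(not d for d in domains.values()):
                raise ValueError("inconsistent: empty domain")
            if domains != before:
                changed = True
    return domains

# The single rule env = 1 -> code = 1, recast as a propagator.
def rule1(d):
    if d["env"] == {1}:
        d["code"] &= {1}
    if 1 not in d["code"]:
        d["env"] &= {0}

print(fixpoint({"code": {-1, 0}, "env": {0, 1}}, [rule1]))
# {'code': {-1, 0}, 'env': {0}} -- the modus tollens reduction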

We now come to a very specific case for which we can prove, thanks to a distinguished witness point, that local consistency automatically enforces global consistency.

4 A REMARKABLE CLASS OF CONDITIONAL CONSTRAINTS

Using vector notation, we write x ≥ y whenever x is greater than y component-wise. The binary relation ≥ defines a partial order on the space. If x and y are two integer points, their least upper bound is the point z = max {x, y} with integer coordinates zj = max {xj, yj}.

4.1 Semilattices

A partially ordered set S is said to be a join- (resp., meet-) semilattice if for all elements x and y of S, the least upper bound (resp., the greatest lower bound) of the pair {x, y} exists. If S is a finite join-semilattice, it follows that every non-empty subset has a least upper bound. So, if S is non-empty, it has a least upper bound, max(S), which is the maximum (or greatest element) of S. Now, with S referring to the extensional database, this is the remarkable property that every constraint of our class will exhibit: the presence of a "max" (resp., a "min") element. Following [7], we can express the same property conveniently by referring to the intensional relation R.

4.2 Max-closed constraints

4.2.1 Motivation for study

Cooper and Jeavons claim that max-closed constraints [7] give rise to a class of tractable problems which "generalizes the notion of a Horn formula in propositional logic to larger domain sizes". That is precisely what we are aiming at.

4.2.2 Definition

A relation R is said to be max-closed if: whenever R(x) and R(y) hold, then R(max {x, y}) holds.

4.2.3 Example

Constraints setting lower and upper bounds on the domains of variables, like 2 ≤ xj or xj ≤ 5, are max-closed. More generally, all unary constraints are max-closed.

4.2.4 Property of max-closed systems

Consider a system of m max-closed constraints Ri in n unknowns ranging over finite domains. Then, if none of the reduced domains is empty, the greatest element of the cross-product D'1 × ... × D'n satisfies the formula:

∃x R1(x) ∧ ... ∧ Rm(x) ∧ x ∈ D'1 × ... × D'n    (2)

4.2.5 Proof

If D'1 × ... × D'n is not empty, its maximum element is (max(D'1), ..., max(D'n)). Let us first show that it satisfies the formula:

∃x Ri(x) ∧ x ∈ D'1 × ... × D'n    (3i)

Consider the partial assignment xj = max(D'j). By definition of the reduced domains, there is an instantiation x(j) of x which satisfies the formula:

∃x Ri(x) ∧ xj = max(D'j) ∧ x ∈ D'1 × ... × D'n    (3ijmax)

We can thus exhibit n points x(1), ..., x(n) satisfying Ri. Since Ri is max-closed, max {x(1), ..., x(n)} belongs to Ri, and it corresponds to the point (max(D'1), ..., max(D'n)). Since (max(D'1), ..., max(D'n)) satisfies x ∈ D'1 × ... × D'n, as well as each constraint Ri(x), it also satisfies their conjunction. Hence the property.
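The definition is directly executable. Below is a small Python check (ours, for illustration) that a relation given in extension is max-closed, applied to the type-m inequality x + y ≥ 2z − 1 over the grid {0, 1, 2}³; a disequality shows the contrast.

from itertools import product

def is_max_closed(rows):
    # rows: a relation in extension, as a set of equal-length tuples.
    rows = set(rows)
    return all(tuple(map(max, x, y)) in rows for x in rows for y in rows)

# Extension of the type-m inequality x + y >= 2*z - 1 over {0,1,2}^3.
R = {p for p in product(range(3), repeat=3) if p[0] + p[1] >= 2 * p[2] - 1}
print(is_max_closed(R))  # True

# A disequality, by contrast, is not max-closed:
S = {p for p in product(range(3), repeat=2) if p[0] != p[1]}
print(is_max_closed(S))  # False: max{(0,1),(1,0)} = (1,1) violates x != y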

4.2.6 Alternative view

Geometrically speaking, the extensional database of a max-closed constraint is a finite semilattice with a greatest element. The non-empty intersection of such semilattices is a semilattice with the same property.

4.2.7 Main consequence

A max-closed system that is locally consistent is globally consistent. Finite domain propagation is therefore sufficient for global consistency checking.

4.2.8 Corollary

Any system of max-closed constraints is polynomial time solvable. This follows from the fact that the propagation algorithm is polynomial in the size of the constraint system and of the integer grid used for finitely discretizing the space (we assume here that the grid is large enough to carry out all the computations accurately — though the price to pay can be a huge "constant" time).

4.3 Sub-language definition

We now restrict our attention to rule-based systems only containing constraints of a special syntactic kind:
• inequalities of type m
• conditionals of type h.
By showing that those constraints have the max-closed property, we will derive a scalability certificate for the knowledge base. This means that sizeable problem instances can be checked for consistency without resorting to enumeration.

4.3.1 max-closed inequalities

Assuming that all the aj's are non-negative coefficients, we define a constraint of type m as:
• either of the form: a1x1 + ... + anxn ≥ b
• or of the form: a1x1 + ... + ak−1xk−1 + ak+1xk+1 + ... + anxn ≥ akxk + b
In other words, a constraint of type m is a linear inequality a'x ≥ b with at most one negative coefficient (a'k = −|ak|).

Property. Every inequality of type m is max-closed.

To simplify the proof, we write:
a1x1 + ... + akxk + ... + anxn ≥ a⁺xk + b
where xk can appear on both sides with non-negative coefficients (the trick is: if a⁺ = 0, we are in the first case; if ak = 0, in the second). Now assume we have:
a1x1 + ... + anxn ≥ a⁺xk + b
and:
a1y1 + ... + anyn ≥ a⁺yk + b
Let zj = max {xj, yj}. Since zi ≥ xi,
a1z1 + ... + anzn ≥ a1x1 + ... + anxn ≥ a⁺xk + b
Since zi ≥ yi,
a1z1 + ... + anzn ≥ a1y1 + ... + anyn ≥ a⁺yk + b
Hence:
a1z1 + ... + anzn ≥ max {a⁺xk + b, a⁺yk + b}
that is:
a1z1 + ... + anzn ≥ a⁺zk + b
Let us briefly mention a direct and important consequence of this property: linear integer systems involving constraints of type m form a polynomial class [8]. When using finite domain solvers, propagation implements a global consistency check as well.

4.3.2 max-closed conditionals

We define a rule of type h as a conditional formula:
Ax ≤ b → ax ≥ a⁺xk + b
where
• A is a non-negative matrix (i.e., all entries of A are ≥ 0)
• the consequent is an inequality of type m.

Property. Every conditional of type h is max-closed.

Assume again that we have two satisfying instantiations x and y:
a1x ≤ b1 ∧ ... ∧ amx ≤ bm → ax ≥ a⁺xk + b
a1y ≤ b1 ∧ ... ∧ amy ≤ bm → ay ≥ a⁺yk + b
Let zj = max {xj, yj}. By definition ai ≥ 0, so we always have aix ≤ aiz and aiy ≤ aiz. Hence, if a1z ≤ b1 ∧ ... ∧ amz ≤ bm is true, the conditions:
a1x ≤ b1 ∧ ... ∧ amx ≤ bm
a1y ≤ b1 ∧ ... ∧ amy ≤ bm
are both true. We therefore have ax ≥ a⁺xk + b and ay ≥ a⁺yk + b. The latter being a max-closed system (should the consequent be any type of max-closed constraint, the validity of the conclusion would not be affected), we infer:
az ≥ a⁺zk + b
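The max-closure of a type-h conditional can also be verified by brute force on a small grid. The following Python check (ours, for illustration) uses the conditional (x ≤ 1) → (y ≥ x + 1) over {0, 1, 2}², whose antecedent matrix is non-negative and whose consequent is of type m:

from itertools import product

def holds(x, y):
    # Conditional of type h: (x <= 1) -> (y >= x + 1)
    return (not (x <= 1)) or (y >= x + 1)

R = {(x, y) for x, y in product(range(3), repeat=2) if holds(x, y)}
assert all(tuple(map(max, a, b)) in R for a in R for b in R)
print("the conditional is max-closed on this grid")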

4.3.3 Level of generality

Clearly, our sub-language is not universal. So, what does it encompass? First, it allows writing rules with:
• true antecedents: 0x ≤ 1 (i.e., no conditions, for facts)
• a false consequent: 0x ≥ 1 (for expressing integrity constraints).
So, right- and left-hand sides can be considered optional. Second, by a duality argument, we have the min-closed property for the reverse order:
Ax ≥ b → ax ≤ a⁺xk + b
Min-closed systems that are locally consistent are ipso facto globally consistent (the property is lost, of course, if min-closed and max-closed constraints are mixed together). Third, we can express Horn propositional clauses x1 ∧ ... ∧ xn−1 → xn in two different ways:
• either as (dual) inequalities of type m: x1 + ... + xn−1 ≤ xn + (n−2)
• or as (dual) conditionals of type h: x1 ≥ 1 ∧ ... ∧ xn−1 ≥ 1 → xn ≥ 1
with x ∈ {0,1}ⁿ. The fact that propagation implements unit resolution explains the good behaviour of finite domain constraint solvers in that case. Finally, the theory applies to our introductory example. We can rewrite env = 1 → code = 1 as a (dual) conditional of type h: env ≥ 1 → 0 ≤ code − 1, i.e., as: env ≥ 1 → code ≥ 1. Therefore, our single rule enjoys the min-closed property.
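To see the Horn encoding at work, here is a quick Python check (ours) that the 0/1 solutions of the dual type-m inequality x1 + ... + xn−1 ≤ xn + (n − 2) are exactly the models of the Horn clause x1 ∧ ... ∧ xn−1 → xn, for n = 3:

from itertools import product

n = 3
for x in product((0, 1), repeat=n):
    ineq = sum(x[:-1]) <= x[-1] + (n - 2)
    horn = (not all(x[:-1])) or x[-1]   # x1 ^ ... ^ x_{n-1} -> xn
    assert ineq == horn
print("inequality and Horn clause agree on all", 2**n, "assignments")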

4.3.4 Constraint solving with SAT solvers

Provided that no addition is involved, a similar conclusion could be reached with the unary representation of integer intervals suggested by Bailleux et al. [9]. When translating a constraint into a propositional logic formula for SAT solving, this alternative mapping cleverly encodes a domain of size n using a vector of n−1 booleans bi, with the correspondence: bi = 1 if and only if x is at least the (i+1)th element of the domain. The bi's are then connected by a chain of Horn clauses bn−1 → bn−2, ..., b2 → b1, rendering the logical entailment x ≥ i → x ≥ i−1. For domains of size 2, we regain the standard encoding. For a domain of size 3 such as {−1, 0, 1}, we need 2 booleans: x = −1 is encoded as [0,0], x = 0 as [1,0] and x = 1 as [1,1]. This way, our single (function-free) introductory rule directly translates into a Horn system, thus confirming that solving can go without developing a full decision tree. The class property is more general and removes the need for a reformulation. As a side effect, the theory is able to explain why, without going into many experiments, such SAT encodings can be more efficient.
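For illustration (our sketch, not from [9]), the order encoding of the domain {−1, 0, 1} and the translation of env = 1 → code = 1 into Horn clauses can be spelled out in Python:

# Order encoding of code in {-1, 0, 1} with two booleans:
#   b1 = (code >= 0), b2 = (code >= 1); env in {0, 1} keeps one boolean e.
encode = {-1: (0, 0), 0: (1, 0), 1: (1, 1)}

def horn_clauses_hold(e, b1, b2):
    # Chain clause of the encoding: b2 -> b1 (code >= 1 implies code >= 0),
    # plus the translated rule env >= 1 -> code >= 1, i.e. e -> b2.
    return (not b2 or b1) and (not e or b2)

for code, (b1, b2) in encode.items():
    for e in (0, 1):
        ok = horn_clauses_hold(e, b1, b2)
        assert ok == (code == 1 or e == 0)  # rule satisfied iff env=1 => code=1
print("Horn translation agrees with the rule on all assignments")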

4.4 Application to configuration problems

We conclude this paper with an example from the literature on dynamic product configuration. For the sake of brevity, we use the simplified version of [10], which describes the possible variants of a hierarchical car model with 2 first-level characteristics (frame and package) and 3 components (engine, aircond, sunroof), the latter 2 being optional. The product structure of Figure 2 can easily be flattened, leading to a problem in 8 unknowns, subject to the configuration rules listed in Table 1. In our modeling, domains are ordered from left to right in their definition statement. For optional components, attributes have a supplementary "not available" (na) value. When a component is not "active" (in the sense that it is not part of the solution), its attributes are assigned the na value (any other assignment would be irrelevant).

Figure 2. The 8 car characteristics:
Frame: frame ∈ {hatchback, sedan, convertible}
Package: package ∈ {standard, deluxe, luxury}
Engine — Size: engine_size ∈ {small, med, large}
Aircond — Type: aircond_opt ∈ {discarded, selected}, aircond_type ∈ {na, ac1, ac2}
Sunroof — Type, Glass: sunroof_opt ∈ {discarded, selected}, sunroof_type ∈ {na, sr2, sr1}, sunroof_glass ∈ {na, plain, tinted}

Table 1. Car configuration rules
A1 Sunroof option included if Package = luxury
A2 Sunroof option included if Package = deluxe
A3 Aircond option included if Package = luxury
A4 Sunroof option excluded if Frame = convertible
C5 Package = standard excludes Frame = convertible
C6 Package = standard excludes Aircond.Type = ac2
C7 Package = luxury excludes Aircond.Type = ac1
C8 Sunroof.Type = sr1 and Aircond.Type = ac2 excludes Sunroof.Glass = tinted
A9 Aircond option included if Sunroof.Type = sr1

Part of the configuration rules are "activity" rules (A) for selecting options. The others are "compatibility" rules (C). The translation of the configuration rules gives 8 logic rules:

Table 2. Translation into logic rules
1-2  package ≥ deluxe → sunroof_opt ≥ selected
3    package ≥ luxury → aircond_opt ≥ selected
4    frame ≥ convertible → sunroof_opt ≤ discarded
5    package = standard → frame ≤ sedan
6    package = standard → aircond_type ≤ ac1
7    package ≥ luxury → aircond_type ≥ ac2
8    sunroof_type ≥ sr1 ∧ aircond_type ≥ ac2 → sunroof_glass ≤ plain
9    sunroof_type ≥ sr1 → aircond_opt ≥ selected

We end up adding 6 extra rules in order to keep the "not available" value for discarded options (e.g., aircond_type = na iff aircond_opt = discarded):

10   aircond_opt ≥ selected → aircond_type ≥ ac1
11   aircond_type ≥ ac1 → aircond_opt ≥ selected
12   sunroof_opt ≥ selected → sunroof_type ≥ sr2
13   sunroof_type ≥ sr2 → sunroof_opt ≥ selected
14   sunroof_opt ≥ selected → sunroof_glass ≥ plain
15   sunroof_glass ≥ plain → sunroof_opt ≥ selected
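The flattened model is small enough to verify by brute force in Python (our sketch; the paper's point is precisely that propagation avoids this enumeration). Encoding each ordered domain as 0, 1, 2, ... and the 14 rules as implications, one can count the valid configurations among the 2916 candidate assignments:

from itertools import product

# Ordered domains encoded as integers (order as in their definition statements).
doms = [3, 3, 3, 2, 3, 2, 3, 3]  # frame, package, engine_size, aircond_opt,
                                 # aircond_type, sunroof_opt, sunroof_type, sunroof_glass

def ok(v):
    fr, pk, _, aco, act, sro, srt, srg = v
    rules = [
        (pk >= 1, sro >= 1),                 # 1-2: deluxe or luxury => sunroof
        (pk >= 2, aco >= 1),                 # 3: luxury => aircond selected
        (fr >= 2, sro <= 0),                 # 4: convertible => sunroof discarded
        (pk == 0, fr <= 1),                  # 5: standard => frame <= sedan
        (pk == 0, act <= 1),                 # 6: standard => aircond_type <= ac1
        (pk >= 2, act >= 2),                 # 7: luxury => aircond_type >= ac2
        (srt >= 2 and act >= 2, srg <= 1),   # 8: sr1 & ac2 => glass <= plain
        (srt >= 2, aco >= 1),                # 9: sr1 => aircond selected
        (aco >= 1, act >= 1), (act >= 1, aco >= 1),  # 10, 11
        (sro >= 1, srt >= 1), (srt >= 1, sro >= 1),  # 12, 13
        (sro >= 1, srg >= 1), (srg >= 1, sro >= 1),  # 14, 15
    ]
    return all((not a) or c for a, c in rules)

sols = [v for v in product(*(range(d) for d in doms)) if ok(v)]
print(len(sols), "valid configurations out of", 3*3*3*2*3*2*3*3)  # out of 2916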

Nearly all of the rules are of type h. The only exceptions are conditions 5 and 6, laid down on the standard package assignment. Hence, a complete consistency diagnosis can be achieved with at most one variable split, i.e., with two checks. In comparison, the product of the domain sizes would give a gross overestimate of 2916 tries. For this configuration problem, the worst-case complexity is consequently quite low.

5 SUMMARY

We have defined a domain language with a logical semantics whose statements are sets of facts (generalized here to conjunctions of integer linear equations and inequalities, i.e., to IP models) and sets of rules (i.e., conjunctions of conditionals). A rule is just a high-level constraint to satisfy, and should be understandable at the knowledge level. By viewing a constraint as a set of low-level rewrite rules, the authors of [11] pursue a diametrically opposed goal (which amounts to describing how to procedurally enforce local consistency at the implementation level). Rather than programming experts, the target users for our declarative modeling language are domain experts. The reductions entailed by the rules on the variables' domains are akin to inferences and are not limited to the usual forward chaining mode. We have identified a sub-language with a remarkable property and linked it to the theory of max-closed constraints. Knowledge bases expressed in that language can be certified conflict-free in polynomial time, by a propagation engine. Global consistency of the constraint system automatically derives from local consistency. There is no search involved, and therefore no concern for scalability. Although severely restricted, this sub-language generalizes Horn propositional logic to finite domains that are non-binary. On the practical side, we have given an insight into the applicability of this theory to technical engineering problems where maintaining the consistency of a large and possibly dynamic model is central. A similar analysis could also provide some guidance on how to derive efficient encodings for handing the formula to check over to a SAT solver.

REFERENCES

[1] Fromherz M. and Saraswat V., Model-Based Computing: Using Concurrent Constraint Programming for Modeling and Model Compilation. In Principles and Practice of Constraint Programming - CP'95 Industrial Session (Montanari & Rossi eds.), LNCS 976, 629-635, Springer, 1995.
[2] ISO 10303-11 standard. Industrial automation systems and integration — Product data representation and exchange — Part 11: Description methods: The EXPRESS language reference manual, ISO, 1994.
[3] Soininen T. and Niemelä I., Developing a declarative rule language for applications in product configuration. In Proceedings of the First International Workshop on Practical Aspects of Declarative Languages - PADL'99 (Gupta ed.), LNCS 1551, 305-319, Springer, 1998.
[4] Lobo J., Minker J. and Rajasekar A., Theory of Disjunctive Logic Programs. In Computational Logic: Essays in Honor of Alan Robinson (Lassez ed.), MIT Press, 1991.
[5] Balduccini M., Gelfond M. and Nogueira M., Answer set based design of knowledge systems. Annals of Mathematics and Artificial Intelligence, 47:1-2, 183-219, 2006.
[6] Cohen J. (guest editor), Special issue on Concepts and Topics in Constraint Programming Languages. Constraints, 4:4, Kluwer, 1999.
[7] Jeavons P. and Cooper M., Tractable Constraints on Ordered Domains. Artificial Intelligence, 79:2, 327-339, 1995.
[8] Chandrasekaran R., Integer programming problems for which a simple rounding type of algorithm works. In Progress in Combinatorial Optimization (Pulleyblank ed.), 101-106, Academic Press, 1984.
[9] Bailleux O., Boufkhad Y. and Roussel O., New Encodings of Pseudo-Boolean Constraints into CNF. In Theory and Applications of Satisfiability Testing - SAT'09, 181-194, Springer, 2009.
[10] Narboni G., A Tentative Approach for Integrating Sales with Technical Configuration. In Papers from the 2007 AAAI Workshop on Configuration (O'Sullivan & Orsvarn eds.), TR WS-07-03, 41-44, AAAI Press, 2007.
[11] Apt K. and Monfroy E., Constraint Programming viewed as Rule-based Programming. Theory and Practice of Logic Programming, 1:6, 713-750, 2001.


Protocols for Governance-free Loss-less Well-organized Knowledge Sharing

Philippe MARTIN 1

Abstract. This article first lists typical problems of knowledge sharing approaches based on static files or semi-independently created ontologies or knowledge bases (KBs). Second, it presents a protocol permitting people to collaboratively build and evaluate a well-organized KB without having to discuss or agree. Third, it introduces extensions of this support to allow a precise collaborative evaluation of information providers and pieces of information. Fourth, it shows how a global virtual KB can be based on individual KBs partially mirroring each other.

1 INTRODUCTION

Ontology repositories – and, more generally, the Semantic Web – are often envisaged as composed of many small static (semi-)formal files (e.g., RDF or RDFa documents) more or less independently developed, hence loosely interconnected and with many implicit redundancies or inconsistencies between them [16] [4]. These relations are difficult to recover, manually or automatically. This "static file based approach" – as opposed to a "collaboratively-built well-organized large knowledge base (cbwoKB) server approach" – makes knowledge re-use tasks complex to support and to perform correctly or efficiently, especially in a collaborative way. Most Semantic Web related research works are intended to support such tasks (ontology creation, retrieval, comparison and merging). However, most often, they lead people to create new files – thus contributing to the problems of knowledge re-use – instead of inserting their knowledge into a cbwoKB. Such a KB may be on a single machine or may be a global virtual KB distributed into various correlated KBs on several Web servers and/or the machines of a peer-to-peer network. Except for WebKB-2 [13] (webkb.org) – the tool implementing the new techniques described in this article – no other ontology/KB server has an ontology-based protocol permitting and enforcing or encouraging people to interconnect their knowledge into a cbwoKB, while keeping it well-organized (this means that detected partial redundancies or inconsistencies are prevented or made explicit via relations of specialization, identity and/or correction) and without forcing them to agree on terminology or beliefs. Indeed, i) this is often but wrongly assumed to be impossible or to involve centralization or domain restrictions, ii) this requires the users to see and write (semi-)formal knowledge representations, and iii) this does not directly re-use already existing ontologies. Furthermore, supporting a cbwoKB also requires proposing and managing a large general ontology (WebKB-2 does so). Other KB servers/editors (e.g., Ontolingua, OntoWeb, Ontosaurus, Freebase, CYC and semantic wiki servers) have no such protocols and i) let every authorized user modify what other ones have entered (this discourages information entering or leads to edit wars), or ii) require all/some users to approve or not the changes made in the KB, possibly via a workflow system (this is bothersome for the evaluators, may force them to make arbitrary selections, and is a bottleneck in information sharing that often discourages information providers). By avoiding these two governance problems and leading to a well-organized KB, such a cbwoKB protocol forms a basis for scalable knowledge sharing, even when multiple communities are involved. Actually, unlike with other approaches, the same cbwoKB can be used by many communities with partially overlapping focus, since the KB is organized and can be filtered/queried/browsed by each person according to her needs or according to a community viewpoint. Even if built by many communities, a (virtual) cbwoKB is unlikely to be huge since i) redundancies are reduced, and ii) "well organized knowledge" (as opposed to data) is difficult to build. However, a cbwoKB can be used to index or relate the content of databases. In any case, the bigger and better organized the cbwoKB, the easier information is to access and compare. Since building a cbwoKB can partly re-use resources of more classic (i.e., less organized) Semantic Web solutions or database solutions, it can be built incrementally to overcome the limitations of these solutions when they become clear and annoying to the users. Section 2 presents the knowledge representation model used by the rules of the collaborative "KB editing" protocol of WebKB-2. Section 3 presents these rules and introduces many as-yet-unpublished ideas. Due to space restrictions and for readability reasons, the model and rules are presented via sentences rather than in a fully formal way. Furthermore, as with most methodological rules, the "completeness" criterion does not apply well to these rules. Collaborative evaluation of knowledge representations is an extension of collaborative KB editing since, for precision and re-use purposes, evaluations should themselves be knowledge representations. In this article, the collaboration scheme of WebKB-2 is not developed but quickly introduced in the last point of the collaborative "KB editing" protocol. This protocol is not restricted to a physical KB: Section 4 shows how a global virtual cbwoKB can be composed of correlated cbwoKBs of users or communities of users. Section 5 concludes and recalls that the presented knowledge sharing approaches are complementary. WebKB-2 has been applied to the collaborative representation of many domains by students (for learning purposes), researchers (for knowledge sharing and evaluation purposes) and, currently, experts in the classification of coral species. Due to space restrictions, no evaluation by these users is reported in this article.

1 University of La Reunion, and adjunct researcher at Griffith University; email: [email protected]


2 LANGUAGE MODEL FOR THE PROTOCOL

The cbwoKB editing protocol used in WebKB-2 is not tied to any particular knowledge representation (KR) language or inference mechanism (hence, this is not the point of this article and no comparison is made of such mechanisms). It only requires that conflicts between knowledge representations – i.e., partial redundancies or inconsistencies between terms or statements – are detected by some inference mechanism or by people (hence, the protocol also works with informal pieces of knowledge as long as they can be interrelated by semantic relations). The more conflicts between statements are detected, the more relations are set between the statements to solve the conflicts, and hence the more the KB is kept organized and thus exploitable. The model for the protocol – i.e., its view of a KB (whichever KR language it actually uses) – is a set of objects which are either terms or statements. Every object has at least one associated source (creator, believer, interpreter, source file or language) represented by a formal term. A formal term is a unique identifier for anything that can be thought of, i.e., either a source, a statement or a category. It has a unique meaning which may be made partially/totally explicit by its creator via definitions with necessary and/or sufficient conditions. An identifier may be a URI or, if it is not a creator identifier, may include the identifier of its creator (this is the classic solution to avoid lexical conflicts between terms from various sources). An informal term is one name of one or several objects. Two objects may share one or several names but cannot share identifiers. A statement is a sentence that is either formal, semi-formal or informal. It is informal if it cannot be translated into a logic formula, for example because it does not have a formal grammar with an interpretation in some logic. Otherwise, it is formal if it only uses formal terms, and semi-formal if it uses some informal terms. A statement is either a category definition or a belief. A belief must have a source that is its creator and that believes in it and/or that has represented (and hence interpreted) a statement from some other source. Finally, a category is either a type of objects or an individual (object). A type (a "class" in OWL) is either a relation type or a concept type. An individual is an instance of a first-order type. The KR model of WebKB-2, its associated notations and its inference mechanism must also be mentioned in this section for illustration purposes. Although graph-based, this model is equivalent to the model of KIF (Knowledge Interchange Format; http://logic.stanford.edu/kif/dpans.html), i.e., it permits using first-order logic plus collections (sets, lists, ...) and contexts (meta-statements that restrict the interpretation of statements). WebKB-2 allows the use of several notations: RDF/XML (an XML format for knowledge using the RDF model), the KIF standard notation and other ones which are here collectively called KRLX. These KRLX languages were specially designed to ease knowledge sharing: they are expressive, intuitive and normalizing, i.e., they guide users to represent things in ways that are automatically comparable. One of them is a formal controlled English named FE. It will be used for the examples along with KIF. These languages can be used for creating assertion/query commands, and these commands can be sent to the WebKB-2 server via the HTTP/CGI protocol, from an application or from a WebKB-2 Web form.
Other communication interfaces are being implemented: one based on SOAP and one based on OKBC (Open Knowledge Base Connectivity; http://www.ai.sri.com/˜okbc) to query (or be queried by) frame-based tools or servers, e.g., Loom, SRI and the GKB-Editor. Here are some examples of terms in KRLX that highlights its

(creator u1 ˆ(defrelation u1#bird (?b) :=> (exists ((?f pm#flight)) (pm#agent ?b ?f)))) and (believer u1 ˆ(forall ((?b u1#bird)) (exists ((?f flight)) (agent ?b ?f)))).

When the creator of an object is not explicitly specified, WebKB2 exploits its “default creator” related rules and variables to find this creator during the parsing. Similarly, unless already explicitly specified by the creator, WebKB-2 uses the “parsing date” for the creation date of a new object. The creator of a belief is also encouraged to add contextualizing relations on it (at least temporal and spatial relations must be specified). RDF/XML – the W3C recommended linearisation of RDF – and OWL – the W3C recommended language ontology – are currently not particularly well suited for the cbwoKB editing protocol or, more generally, for the representation or interconnection of expressive statements from different users in a same KB. • They offer no standard way to associate a believer, creator or interpreter to every object in an RDF/XML file. Since 2003, RDF/XML has no bagID keyword, thus no way to represent contexts and hence believers or beliefs. XML name-space prefixes (e.g., u1:bird), Dublin Core relations and statement reification do not permit to do this. This is likely a temporary only constraint since many RDF-related languages or systems extend RDF in this direction: Notation3 (N3), Sesame, Virtuoso, ... • RDF and OWL – like almost all description logics – do not permit their users to distinguish definitions from universal quantifications. More precisely, they do not offer a universal quantifier. N3 does (Turtle, the RDF-restricted subset of N3, does not). The distinction is important since, as noted in the documentation of KIF [9], a universally quantified statement (belief) may be false while a definition cannot. A definition may be said to be “neither true nor false” or “always true by definition”. A user u1 is perfectly entitled to define u1#cat as a subtype of wn#chair; there is no inconsistency as long as the ways u1#cat is further defined or used respect the constraints associated with wn#chair. A definition may be changed by its creator but then the meaning of the defined term is changed rather than corrected. This distinction is important for a cbwoKB editing protocol since it leads to different conflict resolution strategies: “term cloning” and “lossless correction” (Point 5 and Point 6 of the next section). • Many natural language sentences are difficult to represent in RDF/XML+OWL or N3+OWL, since they do not yet have various kinds of numerical quantifiers, contexts, collections, modalities, ... (FE has concise syntactic sugar for the different kinds). However, at least N3 might soon be extended. • Like most formal languages, RDF/XML and N3 do not accept – or have a special syntax for – the use of informal objects instead of formal objects. KRLX does and this permits WebKB2 to create one specialization/generalization hierarchy catego-

52

rizing all objects. More precisely, this is an “extended specialization/generalization” hierarchy since in WebKB-2 the classic “generalization” relation between formal objects (logical implication) has been extended to apply to informal objects too.

2. Adding, modifying or removing a term is done by adding, modifying or removing at least one statement (generally, one relation) that uses this term. A new term can only be added by specializing another term (e.g., via a definition), except for process types which, for convenience purposes, can also be added via subprocess/superprocess relations. In WebKB-2, every new statement is also automatically categorized into the extended specialization hierarchy. A new informal statement must also be connected via an argumentation relation to an already stored statement. In summary, all objects are manually or automatically inserted in the extended specialization hierarchy and/or the subprocess hierarchy, and thus can be easily searched and compared. However, it is clear that if one user (say, u2) enters a term (say, u2#object) that is implicitly semantically close to another user’s term (say, u1#thing) but does not manually relates them or manages to give u2#object a definition that is not automatically comparable to the definition of u1#thing (i.e., there is no partial redundancies between the two definition) then the two terms cannot be automatically related by the system and the implicit redundancy cannot be rejected by the system. Here, the problem is that u2 has not respected the following “best practice” rule (which is part of WebKB-2 normalization rules): “always relate a term to all existing terms in the KB via the most important or common relations: i) transitive relations, especially (extended) specialization/generalization relations and mereological relations (to specify parts, containers, ...), ii) exclusion/correction relations (especially via subtype partitions), iii) instance/type relations, iii) basic relations from/to processes, iv) contextualizing relations (spatial, temporal, modal, ) and v) argumentation relations”. 3. If adding, modifying or removing a statement introduces an implicit redundancy (detected by the system) in the shared KB, or if this introduces a detected inconsistency between statements believed by the user having done this action, this action is rejected. Thus, in case of an addition, the user must refine his statement before trying to add it again or he must first modify at least one of his already entered statements. An “implicit” redundancy is a redundancy between two statements without a relation between them making the redundancy explicit. Such a relation is typically an equivalence relation in case of total redundancy and an extended specialization relation (e.g., an “example” relation) in case of partial redundancy. As illustrated in the previous section, the detection of implicit extended specializations between two objects reveals an inconsistency or a total/partial redundancy. It is often not necessary to distinguish between these two cases to reject the newly entered object. Extended “instantiations” (one example was given in the previous section) are exceptions: they do not reveal an inconsistency or a total/partial redundancy that needs to be made explicit, since adding an instantiation is giving an example for a more general statement. It is important to reject an action introducing a redundancy instead of silently ignoring it because this often permits the author of the action to detect a mistake, a bad interpretation or a lack of precision (on his part or not). At the very least, this reminds the users that they should check what has already been represented on a subject before adding something on this subject. 4. 
If the addition of a new term u1#T by a user u1 introduces an inconsistency with statements of other users, this action is rejected by the system. Indeed, such a conflict reveals that u1 has directly or indirectly used at least one term from another user in his definition of u1#T and has misunderstood the meaning of this term. The addition by a user u2 of a definition to u1#T is actually

For its cbwoKB editing protocol, WebKB-2 detects partial redundancies or inconsistencies between objects by exploiting exclusion and extended specialization relations between objects. A statement Y is an extended specialization of a statement X (i.e., Y includes the information of X and hence either contradicts it or makes it redundant) if X structurally matches a part of Y and if each of the terms in this part of Y is identical or an extended specialization of its counterpart term in X. For example, WebKB-2 can detect that u2#`Tweety can be agent of a flight with duration at least 2.5 hour' (which means “u2 believes that

Tweety can fly for at least 2.5 hours”) is an extended specialization (and an “extended instantiation”) of both u1#`every bird can be agent of a flight' and u1#`2 bird can be agent of a flight'. In KIF, the first of these two statements can be written: (believer u1 '(modality possible '(forall ((?b bird)) (exists ((?f flight)) (agent ?b ?f))))

Furthermore, these last two statements can be found to be extended specializations of (and redundant with) respectively u2#`75% of bird can be agent of a flight' and u2#`at least 1 bird can be agent of a flight'. Similarly, this last graph can be found to be inconsistent with u3#`no bird can be agent of a flight'. WebKB-2 uses graph matching for detecting extended specializations. Other inference mechanisms could be used. This matching takes into account numerical quantifiers and measures, not just existential and universal quantifiers. Apart from this, it is similar to the classic graph matching for a specialization (or, conversely, a generalization, which is a logical deduction) between positive conjunctive existential formulas (with or without an associated positive context, i.e., a meta-statement that does not restrict its truth domain). This classic graph matching is sound and complete with respect to first-order logic and can be computed with polynomial complexity if the query graph (X in the above description) has no cycle [5]. Apart from this restricted case, graph matching for detecting an extended specialization is not always sound and complete. However, this operation works with languages of any complexity (it is not restricted to OWL or FOL) and the results of searches for extended specializations of a query graph are always "relevant".
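As a rough illustration of the idea (our sketch; WebKB-2's actual matcher handles quantifiers, measures and contexts, which this toy version does not), a specialization test over simple relation triples with a type hierarchy can be written as follows. All names here are illustrative, not taken from WebKB-2.

# Toy extended-specialization test on triples (subject, relation, object).
# specializes(a, b) holds if a equals b or is declared a specialization of b.
SPEC = {("eagle", "bird"), ("soaring", "flight")}

def specializes(a, b):
    if a == b:
        return True
    return any(x == a and specializes(y, b) for x, y in SPEC)

def statement_specializes(y_triples, x_triples):
    # Y extended-specializes X if every triple of X is matched, term by term,
    # by some triple of Y whose terms are identical or more specific.
    return all(any(all(specializes(ty, tx) for ty, tx in zip(yt, xt))
                   for yt in y_triples) for xt in x_triples)

X = [("bird", "agent_of", "flight")]
Y = [("eagle", "agent_of", "soaring")]
print(statement_specializes(Y, X))  # True: each term of Y specializes X's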

3 COLLABORATIVE KB EDITING PROTOCOL

The rules of the protocol are intended for each object to be connected to at least one other object via relations of specialization/generalization, identity and/or argumentation. These rules also permit a loss-less information integration since they do not force knowledge selections to be made. They apply to the addition, modification or removal of an object in the KB, e.g., through a graphical interface or via the parsing of a new command in a new input file. This does not serialize objects in the KB, and waiting until the whole input file is parsed would not permit detecting more partial redundancies or inconsistencies between the objects. The word "user" is here used as a synonym for "source".
1. Any user can add and use any object, but an object may only be modified or removed by its creator.

2. Adding, modifying or removing a term is done by adding, modifying or removing at least one statement (generally, one relation) that uses this term. A new term can only be added by specializing another term (e.g., via a definition), except for process types which, for convenience purposes, can also be added via subprocess/superprocess relations. In WebKB-2, every new statement is also automatically categorized into the extended specialization hierarchy. A new informal statement must also be connected via an argumentation relation to an already stored statement. In summary, all objects are manually or automatically inserted in the extended specialization hierarchy and/or the subprocess hierarchy, and thus can be easily searched and compared. However, it is clear that if one user (say, u2) enters a term (say, u2#object) that is implicitly semantically close to another user's term (say, u1#thing) but does not manually relate them, or manages to give u2#object a definition that is not automatically comparable to the definition of u1#thing (i.e., there is no partial redundancy between the two definitions), then the two terms cannot be automatically related by the system and the implicit redundancy cannot be rejected by the system. Here, the problem is that u2 has not respected the following "best practice" rule (which is part of WebKB-2's normalization rules): "always relate a term to all existing terms in the KB via the most important or common relations: i) transitive relations, especially (extended) specialization/generalization relations and mereological relations (to specify parts, containers, ...), ii) exclusion/correction relations (especially via subtype partitions), iii) instance/type relations, iv) basic relations from/to processes, v) contextualizing relations (spatial, temporal, modal, ...) and vi) argumentation relations".
3. If adding, modifying or removing a statement introduces an implicit redundancy (detected by the system) in the shared KB, or if this introduces a detected inconsistency between statements believed by the user having done this action, this action is rejected. Thus, in case of an addition, the user must refine his statement before trying to add it again, or he must first modify at least one of his already entered statements. An "implicit" redundancy is a redundancy between two statements without a relation between them making the redundancy explicit. Such a relation is typically an equivalence relation in case of total redundancy, and an extended specialization relation (e.g., an "example" relation) in case of partial redundancy. As illustrated in the previous section, the detection of implicit extended specializations between two objects reveals an inconsistency or a total/partial redundancy. It is often not necessary to distinguish between these two cases to reject the newly entered object. Extended "instantiations" (one example was given in the previous section) are exceptions: they do not reveal an inconsistency or a total/partial redundancy that needs to be made explicit, since adding an instantiation is giving an example for a more general statement. It is important to reject an action introducing a redundancy instead of silently ignoring it, because this often permits the author of the action to detect a mistake, a bad interpretation or a lack of precision (on his part or not). At the very least, this reminds the users that they should check what has already been represented on a subject before adding something on this subject.
4. If the addition of a new term u1#T by a user u1 introduces an inconsistency with statements of other users, this action is rejected by the system. Indeed, such a conflict reveals that u1 has directly or indirectly used at least one term from another user in his definition of u1#T and has misunderstood the meaning of this term. The addition by a user u2 of a definition to u1#T is actually a belief of u2 about the meaning of u1#T. This belief should be rejected if it is found (logically) inconsistent with the definition(s) of u1#T by u1. An example is given in Point 6.
5. If the addition, modification or removal of a statement defining an already existing term u1#T by a user u1 introduces an inconsistency involving statements directly or indirectly re-using u1#T and created or believed by other users (i.e., users different from u1), u1#T is automatically cloned to solve this conflict and ensure that the original interpretation of u1#T by these other users is still represented. Indeed, such a conflict reveals that these other users had a more general interpretation of u1#T than u1 had or now has. Assuming that u2 is this other user or one of these other users, the term cloning of u1#T consists in creating u2#T with the same definitions as u1#T except for one, and then replacing u1#T by u2#T in the statements of u2. The difficulty is to choose a relevant definition to remove for the overall change of the KB to be minimal. In the case of term removal by u1, term cloning simply means changing the creator's identifier in this term to the identifier of one of the other users (if this generated term already exists, some suffix can be added). In a cbwoKB server, since statements point to the terms they use, changing an identifier does not require changing the statements. In a global virtual cbwoKB distributed on several servers, identifier changes in one server need to be replicated to other servers using this identifier. Manual term cloning is also used in knowledge integrations that are not lossless [6]. In a cbwoKB, it is not true that beliefs and term definitions "have to be updated sooner or later". Indeed, in a cbwoKB, every belief must be contextualized in space and time, as in
u3#` `75% of bird can be agent of a flight' in place France and in period 2005 to 2006'
(such contexts are not shown in the other examples of this article). If needed, u3 can create the formal term u3#75%_of_birds_fly__in_France_from_2005_to_2006 to refer to this last belief. Due to the possibility of contextualizing beliefs, it is rarely necessary to create formal terms such as u2#Sydney_in_2010. Most common formal terms, e.g., u3#bird and wordnet1.7#bird, never need to be modified by their creators. They are specializations of more general formal terms, e.g., wn#bird (the fuzzy concept of bird shared by all versions of the WordNet ontologies). What certainly evolves in time is the popularity of a belief or the popularity of the association between an informal term and a concept. If needed, this changing popularity can be represented by different statements contextualized in time and space.
6. If adding, modifying or removing a belief introduces an implicit potential conflict (partial/total inconsistency or redundancy) involving beliefs created by other creators, it is rejected. However, a user may still represent his belief (say, b1) – and thus "loss-less correct" another user's belief that he does not believe in (say, b2) – by connecting b1 to b2 via a corrective relation. E.g., here are two FE statements by u2, each of which corrects a statement made earlier by u1:
u2#` u1#`every bird is agent of a flight' has for corrective_restriction u2#`most healthy flying_bird can be agent of a flight' '
and
u2#` u1#`every bird can be agent of a flight' has for corrective_generalization u2#`75% of bird can be agent of a flight' '.
In the second case, u2's belief generalizes u1's belief and corrects it since otherwise u2 would not have needed to add it. In the first case, u2's belief specializes u1's belief (except for a quantifier which is generalized) and corrects it. In both cases, WebKB-2 detects the conflict by simple graph-matching. If instead of the belief `every bird can be agent of a flight' (all birds can fly), u1 entered the definition `any bird can be agent of a flight', i.e., if he gave a definition to the type named "bird", there are two cases (as implied by the rules of the two previous points):
• u1 originally created this type (u1#bird); then, u2's attempt to correct the definition is rejected, or
• u1 added a definition to another source's type, say wn#bird, since this type from WordNet has no associated constraint preventing the adding of such a definition; then i) the types u1#bird and u2#bird are automatically created as clones (and subtypes of) wn#bird, ii) the definition of u1 is automatically changed into `any u1#bird is agent of a flight', and iii) the belief of u2 is automatically changed into u2#`75% of u2#bird can be agent of a flight'.

In WebKB-2, users are encouraged to provide argumentation relations on corrective relations, i.e., a meta-statement using argument/objection relations on the statement using the corrective relation. However, to normalize the shared KB, they are encouraged not to use an objection relation but a "corrective relation with argument relations on it". Thus, not only are the objections stated, but a correction is given and may be agreed to by several persons, including the author of the corrected statement (who may then remove it). Even more importantly, unlike objection relations, most corrective relations are transitive relations, and hence their use permits better organization of argumentation structures, thus avoiding redundancies and easing information retrieval. The use of corrective relations makes explicit the disagreement of one user with (his interpretation of) the belief of another user. Technically, there is no inconsistency: an assertion A may be inconsistent with an assertion B, but a belief that "A is a correction of B" is technically consistent with a belief in B. Thus, the shared KB can remain consistent. For problem-solving purposes, application-dependent choices between contradictory beliefs often have to be made. To make them, an application designer can exploit i) the statements describing or evaluating the creators of the beliefs, ii) the corrective/argumentation and specialization relations between the beliefs, and, more generally, iii) their evaluations via meta-statements (see Point 7). For example, an application designer may choose to select only the most specialized or restricted beliefs of knowledge providers having worked for more than 10 years in a certain domain. Thus, this approach is unrelated to defeasible logics: it does not solve the problems of performing problem-solving-like inferencing with imprecise knowledge and thus contradictory statements, but it leads to more precise knowledge and permits delaying the choices between contradictory beliefs until they can be made, that is, in the context of applications (meanwhile, as in WebKB-2, graph-matching based mechanisms can be used to perform semantic search and checks without being affected by contradictions between beliefs from different sources). The approach also avoids the problems associated with classic "version management" (furthermore, as explained above, in a cbwoKB, formal objects do not have to evolve in time). This approach assumes that all beliefs can be argued against and hence be "corrected". This is true only in a certain sense. Indeed, among beliefs, one can distinguish "observations", "interpretations" ("deductions" or "assumptions"; in this approach, axioms are considered to be definitions) and "preferences"; although all these kinds of beliefs can be false (their authors can lie, make a mistake or assume a wrong fact), most people would be reluctant to argue against self-referencing beliefs such as u2#"u2 likes flowers" and u2#"u2 is writing this sentence". Instead of trying to formalize this into exceptions, the editing protocols of WebKB-2 rely on the reluctance of people to argue against such beliefs that should not be argued against.
7. To support more knowledge filtering or decision making possibilities and lead the users to be careful and precise in their contributions, a cbwoKB server should propose "default measures" deriving a global evaluation of each statement/creator from i) users' individual evaluations of these objects, and ii) global evaluations of these users. These measures should not be hard-coded but explicitly represented (and hence be executable) to let each user adapt them – i.e., combine their basic functions – according to his goals or preferences. Indeed, only the user knows the criteria (e.g., originality, popularity, acceptance, ..., number of arguments without objections on them) and weighting schemes that suit him. Then, since the results of these evaluations are also statements, they can be exploited by queries on the objects and/or their creators. Furthermore, before browsing or querying the cbwoKB, a user should be given the opportunity to set "filters for certain objects not to be displayed (or be displayed only in small fonts)". These filters may set conditions on statements about these objects or on the creators of these objects. They are automatically executed queries over the results of queries. In WebKB-2, filtering is based on a search for extended specializations, as for conceptual querying. Filters are useful when the user is overwhelmed by information in an insufficiently organized part of the KB. The KB server Co4 [7] had protocols based on peer-reviewing for finding consensual knowledge; the result was a hierarchy of KBs, the uppermost ones containing the most consensual knowledge while the lowermost ones were the private KBs of contributing users. Establishing "how consensual a belief is" is more flexible in a cbwoKB: i) each user can design his own global measure for what it means to be consensual, and ii) KBs of consensual knowledge need not be generated.

The approach described in the above points is incremental and works on semi-formal KBs. Indeed, the users can set corrective or specialization relations between objects even when WebKB-2 cannot detect an inconsistency or redundancy. As noted, a new informal statement must be connected via an argumentation relation (e.g., a corrective relation) or an extended specialization relation to an already stored statement. For this relation to be correct, this new statement should generally not be composed of several sub-statements. However, allowing the storing of (small) paragraphs within a statement eases the incremental transformation of informal knowledge into (semi-)formal knowledge and allows doing so only when needed. This is necessary for the general acceptance of the approach. The techniques described in this article do not seem particularly difficult for information technology amateurs, since the minimum they require is for the users to set the above-mentioned relations from/to each term or statement. Hence, these techniques could be used in semantic wikis to avoid the governance problems cited in the introduction and other problems caused by their lack of structure. More generally, the presented approach removes or reduces the file-based approach problems listed in the previous section, without creating new problems. Its use would allow merging of (the information discussed or provided by the members of) many communities with similar interests, e.g., the numerous different communities working on the Semantic Web. The hypotheses of this approach are that i) conflicts can always be solved by adding more precision (e.g., by making their sources explicit: different "observations", "interpretations" or "preferences"), ii) solving conflicts in a loss-less way most often increases or maintains the precision and organization of the KB, and iii) different, internally consistent, ontologies do not have to be structurally modified to be integrated (strongly inter-related) into a unique consistent semantic network. None of the various kinds of integrations or mappings of ontologies that I made invalidated these hypotheses.

4 DISTRIBUTION IN A VIRTUAL KB

One cbwoKB server cannot support knowledge sharing for all communities. For scalability purposes, the cbwoKB servers of communities or persons should be able to interact so as to act as one global virtual cbwoKB (gv cbwoKB), without a central brokering system, without restrictions on the content of each KB, and without necessarily asking each server to register to a particular super-community or peer-to-peer (P2P) network. For several cbwoKB servers to be seen as a gv cbwoKB, it should not matter which KB a user or agent chooses to query or update first. Hence, object additions/updates made in one KB should be replicated into all the other KBs that have a scope which covers these objects; idem for queries when this is relevant. Given these specifications, current approaches for collaboration between KB servers/owners (e.g., [4] [14], which are based on integrating changes made in other KBs, and [15], which also uses a workflow system) or distributed querying between a few KB servers (e.g., as described by [12]) are insufficient. Indeed, they are based on partial descriptions of the content of each KB or on predefined roles for each KB owner or user, and the redundancies or inconsistencies between the KBs are not made explicit. This often makes it difficult to find the relevant KBs to search or add in, and to integrate query results. As in the previous sections, a solution is to let the knowledge indexation and distribution be made at the object level instead of the document/KB/community/owner level. The requirement is that for every term T stored in a cbwoKB server, the KB must either
• have a Web-accessible formal description specifying that it is committed to be a "nexus" for T, i.e., that i) it stores any statement S on T (if S is inserted in another KB of this gv cbwoKB, it is also inserted in this KB), or ii) it associates to T the URLs of cbwoKB servers permitting to find or store any statement on T, or
• not be a "nexus" for T and hence associate to T either i) the URLs of all cbwoKB servers that have advertised themselves to be a nexus for T, or ii) the URL of at least one server that stores these URLs of nexus servers for T.
Thus, via forwards between servers, all objects using T can be added or found in all the nexus for T (see the sketch after this section). This requirement refines the 4th rule of the Linked Data approach [3]: "link things to their related ones in some other data sets". Indeed, to obtain a gv cbwoKB, the data sets must be cbwoKB servers and there must be at least one nexus for each term. A consequence is that when the scopes of two nexus overlap, they share common knowledge and there are no implicit redundancies or inconsistencies between them. Thus, the gv cbwoKB has a unique ontology distributed on the various cbwoKB servers. The difficult task is, whenever the owners of a new cbwoKB server want to join a gv cbwoKB, to integrate their ontology into the global one (they must find some nexus of the gv cbwoKB, only one if it has a nexus for its top-level type). This integration task is at the core of most knowledge sharing/re-use approaches. In this one, it is done only by the owners of the new cbwoKB; once this is done, regularly and (semi-)automatically integrating new knowledge from/to other nexus is much easier since a common ontology is shared. Thus, it can be envisaged that one initial cbwoKB server be progressively joined by other ones to form a more and more general gv cbwoKB. The key point of the approach is the formal commitment to be a nexus for a term (and hence to be a cbwoKB, since direct searches/additions by people must be allowed). There is currently no standard vocabulary to specify this, e.g., from the W3C, the Dublin Core or voiD [10] (a vocabulary for discovering linked datasets). It is in the interest of a competitive company to advertise that it hosts a nexus for a certain term, e.g., apartment_for_rent_in_Sydney for a real estate agent covering the whole of Sydney. If the actual coverage of a nexus is less than the advertised one, a competitor may publish this. In a business environment, it is in the interest of a competitive company to check what its competitors or related companies offer and, if it is legal, integrate their public information in its cbwoKB. It is also in its interest to refer to the most comprehensive KBs/nexus of its related companies. To sum up, the approach could be technically and socially adopted. Since its result is a gv cbwoKB, it can be seen as a way to combine advantages commonly attributed to "distributed approaches" and "centralized approaches".
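A minimal sketch of the nexus-based forwarding just described (ours; class and method names are hypothetical, since, as the text notes, no standard commitment vocabulary exists yet): each server either stores statements on a term or knows a server that does, so additions can be forwarded transparently.

# Toy model of nexus-based forwarding between cbwoKB servers (hypothetical API).
class Server:
    def __init__(self, name, nexus_terms):
        self.name = name
        self.nexus_terms = set(nexus_terms)  # terms this server is a nexus for
        self.statements = {}                 # term -> list of statements
        self.directory = {}                  # term -> a server known as a nexus

    def add(self, term, statement):
        if term in self.nexus_terms:
            self.statements.setdefault(term, []).append(statement)
        else:
            self.directory[term].add(term, statement)  # forward to a nexus

s1 = Server("s1", ["bird"])
s2 = Server("s2", [])
s2.directory["bird"] = s1
s2.add("bird", "75% of birds fly")  # forwarded: the statement lands on s1
print(s1.statements)                # {'bird': ['75% of birds fly']}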

5

CONCLUSION

This article first aimed to show that a (gv)cbwoKB is technically and socially possible. To that end, Section 4 presented a protocol permitting, enforcing or encouraging people to incrementally interconnect their knowledge into a well-organized (formal or semi-formal) KB without having to discuss and agree on terminology or beliefs. As noted, all other knowledge-based cooperation protocols that currently exist seem to work on the comparison or integration of whole KBs, not on the comparison and loss-less integration of all their objects into a same KB. Other required elements for a (gv)cbwoKB – and for which WebKB-2 implements research results – were also introduced (Section 5 and Section 6) or simply mentioned: expressive and normalizing notations, methodological guidance, a large general ontology, and an initial cbwoKB core for the application domain of the intended cbwoKB. Already explored kinds of applications were cited. One currently explored is the collaborative representation and classification by Semantic Web experts of "Semantic Web related techniques". This means that in the medium term Semantic Web researchers will be able and invited to represent and compare their techniques in WebKB-2, instead of just indexing their research via domain-related terms, as was the case in the KA(2) project [2] or with the Semantic Web Topics Ontology [1]. More generally, the approach proposed in this article seems interesting for collaboratively-built corporate memories or catalogues, e-learning, e-government, e-science, e-research, etc. [11] describes a "Knowledge Web" to which teachers and researchers could add "isolated ideas" and "single explanations" at the right place, and suggests that this Knowledge Web could and should "include the mechanisms for credit assignment, usage tracking and annotation that the Web lacks" (pp. 4-5). However, [11] does not give indications on what such mechanisms could be. The cbwoKB elements described by this article can be seen as a basis for such mechanisms.

A second aim of this article (mainly via Section 2) was to show that – in the long term, or when creating a new KB for general knowledge sharing purposes – using a cbwoKB does or can provide more possibilities, at essentially no greater cost, than the mainstream approach [16] [3] where knowledge creation and re-use involves searching, merging and creating (semi-)independent (relatively small) ontologies or semi-formal documents. The problem – and the related debate – is more social than technical: which formalisms and user-centric methodologies will people accept to learn and use to gain precision and flexibility at the expense of more initial involvement? The answers depend on the considered kinds of users and time frames (now or in the long-term future). A cbwoKB is much more likely to be adopted by a small community of researchers but could incrementally grow to a larger and larger community. In any case, research on the two approaches is complementary: i) techniques of knowledge extraction or merging ease the creation of a cbwoKB, ii) the results of applying these techniques with a cbwoKB as input would be better, and iii) these results would be easier to retrieve, compare, combine and re-use if they were stored in a cbwoKB.

REFERENCES

[1] ISWC 2006. OWL specification of the Semantic Web topics ontology. http://lsdis.cs.uga.edu/library/resources/ontologies/swtopics.owl, 2006.
[2] V.R. Benjamins, D. Fensel, A. Gomez-Perez, D. Decker, M. Erdmann, E. Motta, and M. Musen. Knowledge annotation initiative of the knowledge acquisition community: (KA)2. KAW 1998, 11th Knowledge Acquisition for Knowledge Based Systems Workshop, April 18-23, 1998.
[3] C. Bizer, T. Heath, and T. Berners-Lee, 'Linked data - the story so far', International Journal on Semantic Web and Information Systems, 5(3), 1-22, (2010).
[4] P. Casanovas, N. Casellas, C. Tempich, D. Vrandecic, and R. Benjamins, 'OPJK and DILIGENT: ontology modeling in a distributed environment', Artif. Intell. Law, 15(2), 171-186, (2007).
[5] M. Chein and M.-L. Mugnier, 'Positive nested conceptual graphs', ICCS 1997, LNAI 1257, 95-109, (1997).
[6] R. Djedidi and A. Aufaure, 'Define hybrid class resolving disjointness due to subsumption', Web page, (2010).
[7] J. Euzenat, 'Corporate memory through cooperative creation of knowledge bases and hyper-documents', KAW 1996, (36)1-18, (1996).
[8] J. Euzenat, O. Mbanefo, and A. Sharma, 'Sharing resources through ontology alignment in a semantic peer-to-peer system', Cases on Semantic Interoperability for Information Systems Integration: Practice and Applications, 107-126, (2009).
[9] M.R. Genesereth, 'Knowledge Interchange Format', Draft proposed American National Standard (dpANS), NCITS.T2/98-004, (1998).
[10] M. Hausenblas, 'Discovery and usage of linked datasets on the web of data', Nodalities Magazine, 4, 16-17, (2008).
[11] W.D. Hillis, 'Aristotle (the knowledge web)', Edge Foundation, Inc., 138, (May 6, 2004).
[12] J. Lee, J. Park, M. Park, C. Chung, and J. Min, 'An intelligent query processing for distributed ontologies', Systems and Software, 83(1), 85-95, (Jan. 2010).
[13] Ph. Martin and M. Eboueya, 'For the ultimate accessibility and reusability', 589-606, (July 2008).
[14] N.F. Noy and T. Tudorache, 'Collaborative ontology development on the (semantic) web', AAAI Spring Symposium on Semantic Web and Knowledge Engineering (SWKE), (March 2008).
[15] R. Palma, P. Haase, Y. Wang, and M. d'Aquin. Propagation models and strategies. Deliverable 1.3.1 of NeOn (Lifecycle Support for Networked Ontologies; NEON EU-IST-2005-027595), Jan. 2008.
[16] N. Shadbolt, T. Berners-Lee, and W. Hall, 'The semantic web revisited', IEEE Intelligent Systems, 21(3), 96-101, (May/June 2006).
[17] J. Sowa. Theories, models, reasoning, language, and truth. Web document, http://www.jfsowa.com/logic/theories.htm, Dec. 2005.


Knowledge-based Implementation of Metalayers
The Reasoning-Driven Architecture

Lothar Hotz and Stephanie von Riegen
HITeC e.V. c/o Fachbereich Informatik, Universität Hamburg, Germany, email: {hotz,svriegen}@informatik.uni-hamburg.de

Abstract. The Meta Principle, as it is considered in this paper, relies on the observation that some knowledge engineering problems can be solved by introducing several layers of descriptions. In this paper, a knowledge-based implementation of such layers is presented, where on each layer a knowledge-based system, consisting as usual of a knowledge model and separate inference methods, is used for reasoning about the layer below it. Each layer thus represents, and infers about, knowledge located on the layer below it.

1 Introduction

Typically, knowledge engineering has the goal to create a model of the knowledge of a certain domain, like car periphery supervision [33], drive systems [29], or scene interpretation [16]. For knowledge-based tasks, like constructing or diagnosing a specific car periphery system, a strict separation is made into a domain model, which covers the knowledge of a certain domain, and a system model, which covers the knowledge of a concrete system or product of the domain. The domain model and the system models are represented with a knowledge-modeling language, which in turn is interpreted, because of its defined semantics, by a knowledge-based system. Examples are a terminology box (TBox) as a domain model representing e.g. knowledge about an animal domain, and an assertional box (ABox) as a system model representing e.g. a specific animal. The TBox and ABox are represented with a certain Description Logic [5], which is in turn interpreted by a Description Logic reasoner like PELLET or RACER.

In this paper, we use configuration systems (configurators) as knowledge-based systems. In such systems, a domain model (also called configuration model) is used for constructing a system model (also called configuration). Configurators typically combine a broad range of inference mechanisms like constraint solving, rule-based inference, taxonomical reasoning, and search mechanisms. The domain model and the system model constitute two layers: in the domain model all knowledge about possible systems is expressed, in the system model all knowledge about one real system is expressed. Often, these two layers are sufficient for expressing the knowledge needed for a certain knowledge-based task. However, in some domains more layers may be needed for adequately representing the domain knowledge.

We take the biological classification as an example for multiple layers. In [32], a detailed discussion of alternative modelings for biological classification is supplied. Figure 1 presents an extract of the traditional biological classification of organisms established by Carolus Linnaeus. Each biological rank joins organisms according to shared physical and genetic characteristics. We conceive of a rank as knowledge about the next layer. The main ranks are kingdom, phylum, class, order, genus, species, and breed, which again are divided into categories. Each category unifies organisms with certain characteristics. For instance, the Mammal class includes organisms with glands only; thus a downward specialisation from Mammal to its subcategories is depicted in Figure 1. For clarity, only extracts of the ranks and categories are given; for example, the rank of kingdom contains more than the category animal, among others plants, bacteria, and fungi. The ranks represent an additional layer (BioClM) above the domain model of the biological classification. The categories of the ranks form the domain model layer (BioClD), and each of them is an instance of the corresponding rank. The system model layer (BioClS) covers specific individuals (also called domain objects), e.g. Tux the penguin. Given this classification, the need for multiple layers becomes directly evident: it is understandable that a King Penguin is an instance of Breed, but it would be improper to denote Tux as a Breed, which would hold if King Penguin were a specialization of Breed.
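This distinction can be made concrete with a small CLOS sketch (our illustration, not taken from the paper; it uses the metaobject protocol via the closer-mop compatibility library, assumed to be available). Making Breed a metaclass whose instances are breed classes lets King Penguin be an instance of Breed, while Tux, an instance of King Penguin, is not:

(ql:quickload "closer-mop")          ; MOP compatibility layer (assumption)

(defclass breed (standard-class) ()) ; the rank Breed as a metaclass

;; allow classes whose metaclass is BREED to inherit from standard-object
(defmethod closer-mop:validate-superclass ((class breed) (super standard-class))
  t)

(defclass king-penguin () () (:metaclass breed))

(defparameter *tux* (make-instance 'king-penguin))

(typep *tux* 'king-penguin)     ; => T   (Tux is a King Penguin)
(typep (class-of *tux*) 'breed) ; => T   (King Penguin is a Breed)
(typep *tux* 'breed)            ; => NIL (Tux is not a Breed)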

[Figure 1 shows four layers connected by is-a and instance-of links: BioClMM (Biological Rank), BioClM (the ranks Kingdom, Phylum, Class, Order, Genus, Species, Breed), BioClD (categories such as Animal, Chordate, Echinoderms, Mammal, Aves, Asteroidea, Carnivora, Forcipulatida, Felidae, Asteriidae, Felis, Asterias, Domestic Cat, Penguin, Great Penguin, Common Sea Star, Aptenodytes patagonicus, Europ. Shorthair, King Penguin), and BioClS (the individuals Patrick, Black Pete, and Tux).]

Figure 1. Biological Classification represented with several layers. Figure inspired by [4].

Because each layer specifies knowledge about knowledge in the layer below it, we also speak of a metalayer approach or of metaknowledge because, as [28] pointed out: "Metaknowledge is knowledge about knowledge, rather than knowledge from a specific domain such as mathematics, medicine or geology." From a knowledge engineering viewpoint, metaknowledge is created at the same time as knowledge [27]. For supporting metaknowledge, representation facilities are needed that allow the adequate representation of these types of knowledge, here called metaknowledge models. A strict separation of these knowledge types from each other and the encapsulation of metalayer facilities are further requirements for a metalayer approach [6]. Furthermore, for facilitating the use and maintenance of the layers, each layer should, if possible, be realized with the same modeling facilities. For being able to reason about the models on a metalayer, and not only to specify them, a declarative language with a logic-based semantics should be used for implementing each layer. For allowing domain-specific models on the layers, each layer should be extensible.

In this paper, we propose a Reasoning-Driven Architecture, RDA (rooted in the Model-Driven Architecture, MDA; see [23] and Section 2). The RDA consists of an arbitrary number of layers. Each layer consists of a model and a knowledge-based system; both represent the facilities used for reasoning about the next lower layer. For this task, we consider the Component Description Language (CDL) as a knowledge-modeling language, which was developed for representing configuration knowledge [15]. CDL combines ontology reasoning similar to the Web Ontology Language (OWL) [2] with constraint reasoning [30] and rules [13] (see Section 3). Because knowledge-based configuration can be seen as model construction [8, 16, 15], this technology provides a natural starting point for implementing the modeling layers of RDA. Typically, a configuration model (located at the domain model layer) generically represents concrete configurations, which themselves are located on the system model layer. The crucial point of RDA is to provide each layer with a model that represents knowledge about the next lower layer (see Section 4) and to use a knowledge-based configuration system to infer about this layer (see Section 5). By introducing a configuration system on each layer of the RDA, we enable reasoning tasks like consistency checking, model construction, and model enhancement on each layer, i.e. also on metalayers.

A natural application of such an RDA is to support the knowledge acquisition process needed for knowledge-based systems. In a first phase of a knowledge acquisition process, the typically tacit knowledge about a domain is extracted by applying knowledge elicitation methods and intensive interaction between a knowledge engineer and the domain expert (knowledge elicitation phase). The result is a model sketch, which in turn is formalized during the domain representation phase. During this phase, a domain model is created. The domain model has to be expressed with the facilities of a knowledge-modeling language. The RDA can, e.g., be used to check such knowledge models for consistency with the knowledge-modeling language.





2 Reasoning-Driven Architecture

For defining the Reasoning-Driven Architecture (RDA), we borrow the notion of layers from the Model-Driven Architecture [24, 22, 12, 25]. In MDA, the main task is to specify modeling facilities that can be used for defining models (metamodeling), see for example [25]: "A metamodel is a model that defines the language for expressing a model". Or translated to the terms used here: "A metaknowledge model (called Meta-CDL-KB, see below and Section 3) is a knowledge model that defines CDL, which in turn is used for expressing a domain model". MDA provides four layers for modeling (see Figure 2, MDA view): M2 is the language layer, which is realized by (or is a (linguistic, s.b.) instance-of) a metamodel located on the M3 layer. The language is used for creating a model of a specific system on the M1 layer. The system model represents a system which is located in reality (M0 layer; not shown in the figure for brevity) [9]. Please note that each layer contains elements which are instances of classes of the layer above. Typically, a specific implementation in a tool ensures that a system model on M1 conforms to a model on M2 and that a model on M2 conforms to a metamodel on M3.

For clarifying our approach, we will use R1, R2, R3 for RDA, which roughly correspond to M1, M2, M3 in MDA, respectively. Ri stands for "Reasoning Layer i". We separate R1 into several reasoning layers, because for knowledge-based systems one single model on this layer is not sufficient. This is due to the above-mentioned separation of domain model and system model. R1 consists of a domain model specified with concepts and constraints of CDL (denoted by R1C) and knowledge instances (denoted R1I) representing the system model. Furthermore, the corresponding reasoning facilities of CDL allow reasoning about entities on layer R1 (see Figure 2, CDL view). The RDA itself consists of multiple copies of the CDL view for representing and reasoning about distinct types of metaknowledge on different layers. These layers are denoted with R1Mi (i >= 0), R1D, and R1S for an arbitrary number of metamodel layers, one domain model layer, and one system model layer, respectively. Because each of these layers is realized with the same knowledge-based facilities, i.e. CDL concepts and instances, we do not extend CDL with the notion of a metaclass, which has instances that act as classes and can again have instances (e.g. like OWL Full [2] or like MDA/UML implementations with stereotypes [4]). Thus, the layers of R1 are not in one system but are clearly separated into several systems, here called Knowledge Reflection Servers. Each server is realized with the typical concept/instance scheme. Hence, each server can be realized with a typical knowledge-based system like Description Logic systems, rule-based systems, or, as in our case, a configuration system based on CDL. Through a mapping between those servers, concepts on one layer are identified with instances of the next higher layer. This mapping is a one-to-one mapping and is based on a metaknowledge model (see Figure 2, RDA view, and Section 4).

Following [4], we distinguish between a linguistic and an ontological instance-of relation. However, we explicitly name the internal implementation instance-of relation as such, which is a simple UML type-instance relation in [4]. The implementation instance-of relation is provided through the instantiation protocol of the underlying implementation language Common Lisp and its Common Lisp Object System (CLOS) [20, 21] (see Figure 2: classes are instances of the predefined metaclass standard-class). The linguistic instance-of relation originates in the notion of classes and objects known from programming languages. In the case of CDL, this relation is realized with the macroexpansion facilities of Common Lisp. Among others, Figure 2 depicts concept definitions (define-concept) of CDL and their linguistic instance-of expansion to defclass of CLOS. The ontological instance-of relation represents relationships between knowledge elements, in CDL between concepts and instances (see above). As we will see in Section 3, the main feature of CDL is the use of its inference techniques like constraint propagation. By representing the knowledge of a domain with the modeling facilities of CDL, these inference techniques can be applied for model construction. This representation is basically a generic description of the domain objects of a domain at hand, i.e. CDL is used for specifying a domain model. For the representation of concrete domain objects, this description is instantiated and a system model is constructed. The created instances are related to each other through relations. Furthermore, instances can be checked for concept membership.
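The linguistic instance-of relation can be pictured with a grossly simplified sketch of such a macroexpansion (this is not the actual CDL macro; the metaconcept bookkeeping via symbol plists is an assumption made here only for illustration):

(defmacro define-concept (&key name specialization-of metaconcept)
  ;; Simplified sketch: a concept definition expands into a CLOS class
  ;; (the linguistic instance-of relation) and records its metaconcept.
  `(progn
     (defclass ,name ,(when specialization-of (list specialization-of)) ())
     (setf (get ',name 'metaconcept) ',metaconcept)
     ',name))

;; (macroexpand-1 '(define-concept :name Aves :specialization-of Chordate
;;                                 :metaconcept Class-m))
;; => (PROGN (DEFCLASS AVES (CHORDATE) ()) ...)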



A configuration system supports mainly three tasks (see Figure 2, CDL view):

1. It enables the expression of a configuration model that is consistent with the configuration language which the system implements. For this task, the configuration system performs consistency checks of given configuration models (or parts of them) against the language specification. As a result, the configuration model conforms to the configuration language. However, the creation of the configuration model is done manually during knowledge acquisition.
2. On the basis of the configuration model, the configuration system supports the creation of configurations (system models) that are consistent with the configuration model. For this task, the system interprets the logical expressions of the configuration model and creates configurations according to these definitions. As a result, the configurations conform to the configuration model.
3. The configuration model can be defined in textual form or supported by a graphical user interface that enables the creation of concepts and constraints. Thus, the configuration system supplies a user interface for expressing the configuration model and for guiding the configuration process.

Thus, a configuration system supplies means for supporting the step from a domain model (R1CD) to a system model (R1IS) (see Figure 2, CDL view), and it can check configuration models represented with the configuration language of the system for compliance with the language. However, the development of the configuration model is not supported with general inference techniques but is system dependent.

In RDA, this instantiation facility is used for supporting the step from the configuration language to the domain model. By applying the configuration system to a domain model that contains every model of a language, i.e. by applying it to a metaknowledge model (the Meta-CDL-KB), the configuration of a domain model for any specific domain is supported (Figure 2, RDA view, R1CM). This is achieved because of the general applicability of the language constructs of CDL, which are based on logic (see Section 3). Furthermore, other advantages of configuration systems, like a declarative representation of the configuration model or the use of inference techniques, can be applied to the Meta-CDL-KB. Thus, the construction of metamodel-compliant domain models is supported with this approach. However, the question arises: How can CDL be represented with CDL? (see Section 4.1).

[Figure 2 shows three aligned views. MDA view: metalayer M3 (metamodel), language layer M2 (language), system model layer M1 (model in the language, system model), connected by linguistic instance-of links. CDL view: knowledge implementation layer R3 (CLOS internals), knowledge modeling language layer R2 (CDL concept and instance level, related by the implementation instance-of), and knowledge model layer R1 with the domain model as concepts (R1CD, realized by a knowledge engineer) and the system model as instances (R1IS, realized by a configurator), related by the ontological instance-of; macroexpansion of CDL provides the linguistic instance-of, e.g. (define-concept Aves ...) expands to (defclass Aves ...). RDA view: the CDL view copied for the layers R1CMM (Meta-CDL-KB, BioClMM), R1IMM, R1CM (Meta-CDL-KB, BioClM), R1IM, R1CD (BioClD), and R1IS (BioClS), with one-to-one mappings identifying the concepts of one layer with instances of the next higher layer, e.g. (define-concept Aves ...) on BioClD with (make-individual Class-m :name Aves ...) on BioClM.]

Figure 2. Comparing MDA, CDL, and RDA. The MDA view is refined when applying CDL with its domain and system model on R1 (R1C, R1I). The CDL view is copied three times when using RDA for the biological classification domain.



3 Comprehensive Knowledge Representation

This section is organized as follows. First, a brief overview of the knowledge representation language CDL (Component Description Language) [15] is given in Section 3.1. Section 3.2 presents parts of the metamodel of CDL which have to be modeled on R1M. CDL will be used in the following sections to realize the envisioned metalevels through knowledge-based implementations.

3.1 A Sketch of CDL

CDL mainly consists of two modeling facilities: a concept hierarchy and constraints. Models consisting of concepts and constraints belong to R1D; for the biological classification domain this layer is named BioClD (see Figure 1).

The concept hierarchy contains concepts, which represent domain objects, a specialization hierarchy (based on the is-a relation), and structural relations. Concepts gather all properties that a certain set of domain objects has under a unique name. A specialization relation relates a super-concept to a sub-concept, where the latter inherits the properties of the former. A structural relation is given between a concept c and several other concepts r, which are called relative concepts. With structural relations, a compositional hierarchy based on the has-parts relation can be modeled, as well as other structural relationships. For example, the structural relation has-differences connects a species with its differentiating characteristics, which is not a decomposition, to be precise. Parameters specify the attributes of a domain object with value intervals, value sets (enumerations), or constant values. Parameters and structural relations of a concept are also referred to as properties of the concept. Instances are instantiations of concepts and represent concrete domain objects. When instantiated, the properties of an instance are initialized by the values or value ranges specified in the concept. Figure 3 gives examples of concept definitions. The structural relation has-differences is defined, which relates one biological class with several characteristics and one characteristic with several classes. The concept for a biological class (Mammal) is defined with such a relation, including number-restricted structural relations. The left side of the operator :> gives the super-concept of all relative concepts and restricts the total number of instances in the relation; the right side lists the relative concepts with their minimal and maximal numbers. The characteristics Hair and Fur are optional and only one of them can be used for describing a mammal, because exactly 6 characteristics are needed for specifying a mammal. Except for the metaconcept specification, typical ontological definitions are given.

Constraints summarize conceptual constraints, constraint relations, and constraint instances. Conceptual constraints consist of a condition and an action part. The condition part specifies a structural situation of instantiated concepts. If this structural situation is fulfilled by some instances (i.e. the instances match the structural situation), the constraint relations that are formulated in the action part are instantiated to constraint instances. Constraint relations can represent restrictions between properties, like all-different-p or ensure-relation. The constraint relation ensure-relation establishes a relation of a given name between two instances. It is used for constructing structural relations and thus provides the main facilities for creating the resulting constructions. Before establishing a relation between given instances, ensure-relation checks whether the relation already exists. The constraint relation all-different-p ensures that all objects in a given set are of a different type. Please note that such kinds of constraints extend typical constraint technology, which is based on primitive datatypes like numbers or strings [19]. Constraints are multi-directional, i.e. they are propagated regardless of the order in which constraint variables are instantiated or changed. At any given time, the remaining possible values of a constraint variable are given as structural relations, intervals, value sets, or constant values. Constraint relations are used in the action part of conceptual constraints. Figure 6(c) also gives an example of such a conceptual constraint in CDL, however already on a metalayer. It shows how instances, which are selected through the structural situation, can be checked for being of a different type. If this check is fulfilled, the constraint is consistent, otherwise inconsistent.

A configuration system performs knowledge processing on the basis of logical mappings as given in [31] for a predecessor of CDL. Thus, the configuration system applies inference techniques such as taxonomical reasoning, value-related computations like interval arithmetic [18], establishing structural relations, and constraint propagation. The structural relation is the main mechanism that causes instantiations and thus leads to an extended configuration: if such a relation is given between a concept c and several relative concepts r, then, depending on what exists first as instances in the configuration (c or one or more of the relative concepts r), instances for the other part of the relation may be created and the configuration grows. This capability, together with the fact that descriptions, i.e. models, of systems are constructed, leads to the use of configuration systems for constructing models. For a detailed description of CDL and its use in a configuration system, we refer to [15].
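The two constraint relations just described can be pictured with a small Common Lisp sketch (ours, not CDL's actual implementation; the global hash-table relation store is an assumption made for illustration):

(defparameter *relations* (make-hash-table :test #'equal))

(defun ensure-relation (source relation target)
  "Establish RELATION between SOURCE and TARGET unless it already holds."
  (pushnew target (gethash (list source relation) *relations*)))

(defun all-different-p (&rest instances)
  "True if all INSTANCES are of pairwise different classes."
  (let ((classes (mapcar #'class-of instances)))
    (= (length classes) (length (remove-duplicates classes)))))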


(define-relation
  :name has-differences
  :inverse differentiate
  :domain Class-m
  :range Characteristics
  :mapping m-n)

(define-concept :name Characteristic :specialization-of domain-root
  :metaconcept Characteristic-m)

(define-concept :name Glands :specialization-of Characteristic
  :metaconcept Characteristic-m)
(define-concept :name Hair :specialization-of Characteristic
  :metaconcept Characteristic-m)
(define-concept :name Fur :specialization-of Characteristic
  :metaconcept Characteristic-m)
(define-concept :name MiddleEarBones :specialization-of Characteristic
  :metaconcept Characteristic-m)
(define-concept :name WarmBlooded :specialization-of Characteristic
  :metaconcept Characteristic-m)

(define-concept :name Animal :specialization-of domain-root
  :metaconcept Kingdom-m)
(define-concept :name Chordate :specialization-of Animal
  :metaconcept Phylum-m)
(define-concept :name Mammal :specialization-of Chordate
  :has-differences ((:type Characteristic :min 6 :max 6) :>
                    (:type Glands :min 1 :max 1)
                    (:type Hair :min 0 :max 1)
                    (:type Fur :min 0 :max 1)
                    (:type MiddleEarBones :min 3 :max 3)
                    (:type WarmBlooded :min 1 :max 1))
  :metaconcept Class-m)

Figure 3. Example of CDL concept definitions from the domain of biological classification.

3.2 Parts of the Metamodel of CDL

Languages are typically defined by describing their abstract syntax, their concrete syntax, and consistency rules. For describing CDL's abstract syntax, we introduce three metalevel facilities: a knowledge element, a taxonomical relation between knowledge elements, and a compositional relation between knowledge elements. These facilities are not to be mixed up with the above-mentioned CDL facilities: concepts, specialization relations, and structural relations. The abstract syntax for concepts and conceptual constraints of CDL is given in Figure 5; a concrete syntax for CDL is given in Figure 3, for example. A CDL concept is represented with a knowledge element with the name concept (see Figure 4(a)), and a CDL structural relation is represented with the knowledge element relation-descriptor. The fact that CDL concepts can have several structural relations is represented with a compositional relation with the name has-relations. Parameters are represented similarly. Structural relations are defined in a further part of the metamodel (see Figure 4(b)). The fact that a concept is related by a structural relation to other concepts (the relative concepts) is represented with three knowledge elements and three compositional relations in a cyclic manner. Figure 4(c) provides the metamodel for a conceptual constraint with its structural situation and action part. A structural situation consists of a concept expression, which in turn consists of a variable and a conditioned concept. The action part consists of a number of constraint relations that should hold if the structural situation is fulfilled. Several consistency rules define the meaning of the syntactic constructs. For example, one rule for the structural relation defines that the types of the relative concepts of a structural relation have to be sub-concepts of the concept on the left side of the operator :> (rule-5). Additionally, consistency rules are given that check CDL instances; e.g. one rule defines when instances match a conceptual constraint (rule-6). All rules are given in [15].

4 Metamodels

In this section, the metamodels needed for R1M and R1MM (Section 4.1 and [14]) and their extensions for modeling the biological classification domain (Section 4.2) are presented.

4.1 CDL in CDL

As discussed in Section 2, R1M and R1MM will be realized with CDL. Thus, the goal is to define the metamodel of CDL (as sketched in Section 3.2) using CDL itself. In fact, CDL provides all knowledge representation facilities needed for this purpose. The result is a metaconfiguration model called the Meta-CDL knowledge base (Meta-CDL-KB). In Section 3.2, parts of the metamodel of CDL are defined using three modeling facilities, namely knowledge elements, taxonomical relations, and compositional relations. These modeling facilities are mapped to the CDL constructs concept, specialization relation, and structural relation, respectively. Figure 5 shows how the knowledge elements shown in Figure 4(a) can be represented with the meta-concepts concept-m, relation-descriptor-m, and parameter-m. The consistency rules of CDL also have to be represented. This is achieved by defining appropriate constraints. In Figure 5, a conceptual constraint is represented which checks the types of a structural relation. (For a complete mapping of the CDL consistency rules to conceptual constraints, see [15].) Furthermore, instances can be represented on the metalevel by including the metaconcept instance-m. Having instances available, conceptual constraints and their matching instances can be represented (see Figure 5). The fact that instances fulfill a certain conceptual constraint is represented by establishing appropriate relations using the constraint relation ensure-relation. Please note that self references can also be described, e.g. a concept-m is related to itself via the has-superconcept-m relation (compare the loop in Figure 4 with Figure 5). Other approaches also use metalevels for defining their language, e.g. UML [34, 24, 26, 25]. In contrast to these approaches, we use a knowledge representation language with a logic-based semantics on the metalevel, i.e. CDL instead of UML derivatives like EMOF [24]. Doing so, the inference techniques provided by the knowledge representation language can be used, e.g. constraint propagation. This enables the realisation of the Knowledge Reflection Server as introduced in the next section.
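The one-to-one mapping between the layers can be sketched as follows (our illustration, assuming the simplified keyword syntax of Figure 3; the produced make-individual form is returned as data, not evaluated here):

(defun lift-to-metalayer (concept-form)
  "Translate a concept definition on one layer into an instance
description (a make-individual form) on the next upper layer."
  (destructuring-bind (op &key name specialization-of metaconcept
                          &allow-other-keys)
      concept-form
    (declare (ignore op))
    `(make-individual ,metaconcept
                      :name ,name
                      :has-superconcept-m ,specialization-of)))

;; (lift-to-metalayer
;;  '(define-concept :name Aves :specialization-of Chordate :metaconcept Class-m))
;; => (MAKE-INDIVIDUAL CLASS-M :NAME AVES :HAS-SUPERCONCEPT-M CHORDATE)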



[Figure 4 shows the metamodel diagrams: knowledge elements (language construct, concept, property, relation descriptor, parameter descriptor, structural specificator with minimum/maximum, concept instance, conceptual constraint, structural situation, concept expression, action part, constraint call, structural variable, conditioned concept) connected by taxonomical relations and named compositional relations such as has-concept, has-superconcept, has-relations, has-parameters, has-instances, has-spec, has-structural-situation, and has-calls, with cardinality defaults like 0..n, 1..n, and 1..1.]

Figure 4. Metamodel for a) a concept, b) a structural relation, and c) a conceptual constraint.

(define-concept :name concept-m
  :specialization-of named-domain-object-m
  :concept-of-dom-m (:type domain-m)
  :superconcept-of-m (:type concept-m :min 0 :max inf)
  :in-some-m (:type some-m :min 0 :max inf)
  :has-superconcept-m (:type concept-m :min 0 :max 1)
  :has-relations-m (:type relation-descriptor-m :min 0 :max inf)
  :has-parameters-m (:type parameter-m :min 0 :max inf)
  :has-instances-m (:type instance-m :min 0 :max inf))

(define-concept :name relation-descriptor-m
  :specialization-of named-domain-object-m
  :relation-of-m (:type concept-m)
  :has-left-side-m (:type some-m :min 1 :max 1)
  :has-right-side-m (:type some-m :min 0 :max inf)
  :has-relation-definition-m (:type relation-definition-m :min 1 :max 1))

(define-concept :name some-m
  :specialization-of domain-object-descriptor-m
  :parameters ((lower-bound [0 inf]) (upper-bound [0 inf]))
  :in-relation-left-m (:type relation-descriptor-m)
  :in-relation-right-m (:type relation-descriptor-m)
  :some-of (:type concept-m))

(define-concept :name instance-m
  :specialization-of named-domain-object-m
  :instance-of-dom-m (:type domain-m)
  :instance-of-m (:type concept-m)
  :matching-instance-of-m (:type conceptual-constraint-m)
  :has-relations-m (:type relation-descriptor-m :min 0 :max inf)
  :has-parameters-m (:type parameter-m :min 0 :max inf))

(define-concept :name conceptual-constraint-m
  :specialization-of named-domain-object-m
  :structural-situation (:type concept-expression-m :min 1 :max inf)
  :constraint-calls (:type constraint-call-m :min 1 :max inf)
  :matching-instances (:type instance-m :min 0 :max inf))

(define-conceptual-constraint :name consistency-rule-5
  :structural-situation
    ((?c :name concept-m)
     (?rd :name relation-descriptor-m :relation-of-m ?c)
     (?svt :name some-m :in-relation-left-m ?rd)
     (?stdi :all :name some-m :in-relation-right-m ?rd))
  :constraint-calls ((all-isp ?stdi ?svt)))

(define-conceptual-constraint :name instance-consistency-rule-6
  :structural-situation
    ((?cc :name conceptual-constraint-m)
     (?i :name instance-m :self (:condition (instance-matches-cc-p *it* ?cc))))
  :constraint-calls
    ((ensure-relation ?i matching-instance-of ?cc)
     (ensure-relation ?cc matching-instances ?i)))

Figure 5. Formalizing the knowledge elements shown in Figure 4(a) and some consistency rules with CDL concepts.




4.2 Extension of Meta-CDL-KB for Biological Classification

Because the metalayers are also realized with a knowledge-modeling language (here CDL), they can be extended by simply adding concept and constraint definitions to the metaknowledge base. Thus, the modeling facilities provided by such languages can be used not only for specifying the language itself (see Section 4.1) but also for domain-specific extensions on the metalayers. In Figure 6, the extensions of R1MM (a) and R1M (b and c) for the biological classification domain are sketched, yielding BioClMM and BioClM, respectively. BioClMM consists simply of one concept which specifies a biological rank (Figure 6(a)). On BioClM, besides the concept definitions that define the metaconcepts used in BioClD (see Figures 3 and 1), the conceptual constraint Specific-Characteristics is defined on the metalayer. This conceptual constraint checks every combination of biological classes for having specific characteristics by comparing their difference models on BioClD. Thus, with this conceptual constraint on BioClM it is specified that a biological class should have a unique combination of characteristics. With the constraint, also classes of different phyla are tested (e.g. chordates and echinoderms). On BioClD, these kinds of constraints are hard to define because they are typically not related to one specific concept but to several. Furthermore, such constraints are usually part of some modeling guidelines; e.g. for biological classification, such documents state that the definitions of biological classes should be unique. Thus, with the approach presented here, a modeling of modeling guidelines on the metalayer is achieved.
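The gist of the uniqueness check can be sketched as a plain predicate (ours, for illustration only; the plist representation of classes and their differentiating characteristics is an assumption, not the CDL encoding):

(defun unique-characteristics-p (classes)
  "True if no two biological classes have the same set of
differentiating characteristics."
  (let ((diffs (mapcar (lambda (c)
                         (sort (copy-list (getf c :differences)) #'string<))
                       classes)))
    (= (length diffs)
       (length (remove-duplicates diffs :test #'equal)))))

;; (unique-characteristics-p
;;  '((:name mammal :differences (glands hair middle-ear-bones warm-blooded))
;;    (:name aves   :differences (feathers warm-blooded))))
;; => T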

5 Knowledge Reflection Servers

Each layer described in Section 2 is realized through a Knowledge Reflection Server (KRS). Every server monitors the layer below it and consists of the appropriate model and a configuration system which interprets the model. This has the advantage of using declarative models at each metalayer as well as the possibility to apply inference techniques, e.g. constraint programming, at the metalayer. Each server supplies knowledge-based services that can be called by a server below it for obtaining a judgement of the models it uses. For example, a KRS monitors the activities during the construction of the domain model BioClD, i.e. during the domain representation phase. If, e.g., a concept c of the domain is defined with define-concept on R1D, the KRS on R1M is informed (see Figure 7) for checking its consistency. Furthermore, a KRS
• supplies services like check-knowledge-base and add-conceptual-constraint,
• creates appropriate instances of metaconcepts of the Meta-CDL-KB, e.g. concept-m or conceptual-constraint-m,
• uses constraint propagation for checking the consistency rules,
• applies the typical model configuration process for completing the configuration, e.g. adds mandatory parts,
• checks the consistency of created domain-specific concepts, e.g. of BioClD,
• can supply new concepts for the layer below, which may be computed by machine learning methods,
• monitors the reasoning process, e.g. for evaluating reasoning performance, and thus makes reasoning explicit,
• can create and use explanations,
• may solve conflicts that occur during the domain representation phase,
• may apply domain-specific metaknowledge, e.g. "ensuring specific differentiating characteristics of biological classes" with metaconstraints as shown in Figure 6.

We implemented parts of the KRS services based on the configuration system KONWERK [10], but have not yet finished an extensive evaluation. The Meta-CDL-KB and its extensions were used for checking versions of knowledge bases for the biological classification domain. Thereby, a mapping of concept definitions of BioClD to instance descriptions of BioClM was realized, i.e. concept definitions on one layer are instance definitions on the next upper layer. Furthermore, the concrete syntax for defining concepts in BioClD was extended, thus metaproperties can be specified in BioClD. Both the implementation issues as well as the interfaces for the server functionality could be realized straightforwardly because of the flexibility of the underlying implementation language Common Lisp. However, the reasoning facilities provided by KONWERK could directly be used for scrutinizing the layers. The validation of this approach was shown with the help of the following three scenarios, which also illustrate the use of the KRS:
• First, metaknowledge modeling is adequately enabled; no workarounds with specialisations, as in [32], are needed.
• Checking domain-dependent constraints: When a new biological class is introduced in BioClD, the KRS advises, according to the Specific-Characteristics constraint (e.g. see Figure 6(c)) on BioClM, whether it is a biological class or not.
• Checking domain-independent consistency rules: Irrespective of the kind of domain, the domain-independent rules, like the number restrictions (see Figure 5), are checked at all times. For example, when a new kind of mammal is introduced, the defined restrictions, like the number of middle ear bones, have to be conformed to.
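To make the server idea tangible, here is a minimal sketch of such a service in Common Lisp (our simplification, not the KONWERK-based implementation): a server holds the lifted facts of the layer below, e.g. produced by a mapping like the lift-to-metalayer sketch above, plus a set of consistency rules, and re-checks the rules whenever the lower layer reports a new definition.

(defstruct krs
  (facts nil)   ; instance descriptions lifted from the layer below
  (rules nil))  ; consistency rules: predicates over the fact list

(defun krs-check-definition (server fact)
  "Sketch of a check-knowledge-base-style service: record a definition
reported by the layer below and re-check all consistency rules."
  (push fact (krs-facts server))
  (every (lambda (rule) (funcall rule (krs-facts server)))
         (krs-rules server)))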

6 Discussion and Related Work

The main properties of the Reasoning-Driven Architecture realized with the Knowledge Reflection Servers (KRS) as described in the previous sections are:
• the introduction of a model on one layer that represents the knowledge facilities used on the layer below it (i.e. metaknowledge models),
• the use of existing knowledge-based systems with their reasoning facilities on each layer, especially on metalayers; this enables reflection about knowledge,
• the mapping of concepts of one layer to instances of the next higher layer; this approach has the potential of using more tractable instance-related inference methods instead of concept reasoning,
• the support of declarative knowledge modeling on several metalayers; this enables the modeling of knowledge and metaknowledge at the same time. Metaknowledge is typically specified in modeling guidelines; thus, the described approach enables the modeling of modeling guidelines.


a)
(define-concept :name Biological-Rank-mm
  :specialization-of concept-m)

b)
(define-concept :name Kingdom-m
  :specialization-of concept-m
  :metaconcept Biological-Rank-mm)

(define-concept :name Phylum-m
  :specialization-of Kingdom-m
  :metaconcept Biological-Rank-mm)

(define-concept :name Class-m
  :specialization-of Phylum-m
  :has-relations-m (:type Difference-descriptor-m :min 1 :max 1)
  :metaconcept Biological-Rank-mm)

(define-concept :name Difference-descriptor-m
  :specialization-of relation-descriptor-m
  :relation-of-m (:type Class-m)
  :has-left-side-m (:type Characteristic-some-m :min 1 :max 1)
  :has-right-side-m (:type Characteristic-some-m :min 0 :max inf))

(define-concept :name Characteristic-some-m
  :specialization-of some-m
  :in-relation-left-m (:type Difference-descriptor-m)
  :in-relation-right-m (:type Difference-descriptor-m)
  :some-of (:type Characteristic-m))

(define-concept :name Characteristic-m
  :specialization-of concept-m
  :has-relations-m (:type Difference-descriptor-m :min 1 :max 1)
  :metaconcept Biological-Rank-mm)

c)
(define-conceptual-constraint :name Specific-Characteristics
  :structural-situation
    ((?c1 :name Class-m)
     (?c2 :name Class-m :self #'(not-eq *it* ?c1))
     (?r1 :name Difference-descriptor-m :relation-of-m ?c1)
     (?r2 :name Difference-descriptor-m :relation-of-m ?c2)
     (?d1 :all :name Characteristic-some-m :in-relation-right-m ?r1)
     (?d2 :all :name Characteristic-some-m :in-relation-right-m ?r2))
  :action-part ((all-different ?d1 ?d2)))

Figure 6. Extending Meta-CDL-KB with concepts and conceptual constraints for the domain of biological classification, i.e. parts of BioClM .

[Figure 7 shows stacked Knowledge Reflection Servers. In a singular activity (e.g. in [Hotz 09]), an editor for BioClMM (e.g. Biological-Rank-mm) creates the metaknowledge model Meta-CDL-KB with domain-specific metaknowledge extensions (BioClMM). A configuration system on R1MM uses this model to scrutinize/construct the knowledge model on R1M (instantiations of e.g. Biological-Rank-mm), edited e.g. as Class-m of BioClM. A configuration system on R1M, using the Meta-CDL-KB with the extensions BioClM, scrutinizes/constructs the domain-specific model BioClD (e.g. Mammal, instantiations of e.g. Class-m), edited on R1DS. Finally, a configuration system on R1DS constructs the system model BioClS (instantiations of e.g. King Penguin), with the reasoning system and the configurator's user interface guiding the configuration process through user interaction, e.g. for Tux of BioClS.]

Figure 7. Monitoring the construction of metamodels and domain models through Knowledge Reflection Servers realized with configuration systems. A server using BioClM scrutinizes BioClD and a server using BioClMM scrutinizes BioClM.

The use of reasoning methods in RDA is achieved by replacing the Meta Object Facility (MOF) of MDA [23], which is based on UML, with CDL. Other knowledge-representation languages like the Web Ontology Language (OWL) [2] could also be considered for use on the layers. However, CDL is quite expressive, e.g. constraints can be expressed on each layer. For realizing the KRS, even more important for us was the possibility to add server technologies to the knowledge-representation language CDL. By replacing the Meta-CDL-KB with a metamodel for OWL (e.g. the Ontology Definition Metamodel (ODM) [26]), one could also use RDA for scrutinizing the construction of domain models written in OWL. With the Meta-CDL-KB, however, a knowledge-based implementation of a metamodel is provided and was used from a knowledge-based system (here a configurator). [3] and [11] also present approaches that include semantics on the metalayer, similar to the Meta-CDL-KB metamodel. However, these approaches emphasise neither the use of reasoning methods on each layer nor the capability to define domain-specific extensions on the metalayer. Furthermore, the RDA presented in this paper allows for an arbitrary number of metalayers. By introducing a configurator which allows the definition of procedural knowledge for controlling the used reasoning techniques, the realization of metastrategies on the metalayers can be considered in our approach (see also [17]). (See Section 2 for further comparison to MDA and OWL.)

The creation of a metamodel for CDL with the aid of CDL has its tradition in self-referencing approaches like Lisp-in-Lisp [7] or the metaobject protocol, which implements CLOS (the Common Lisp Object System) with CLOS [21]. Such approaches demonstrate the use of the respective language and provide reflection mechanisms. With our approach, such reflection mechanisms are extended from object-oriented reflection (e.g. about introspection of methods) to knowledge-based reflection (e.g. about the concepts and constraints used for modeling a domain). Thus, our approach provides reflection about knowledge and a way to self-awareness of agents.

A Knowledge Reflection Server is basically an implementation of a configuration tool on the basis of the Meta-CDL-KB, i.e. of a configuration model. A typical configuration tool is implemented with a programming language and an object model implemented with it. During this implementation, one has to ensure correct behavior of model construction and the inference techniques. By using CDL, this behavior (e.g. the consistency rules) is declaratively modeled, not procedurally implemented. The basis for this declarative realization is, of course, the procedural implementation of the inference techniques – so to speak, a bootstrapping process. However, our approach gives indications on how to open up the implementation of configuration systems or other knowledge-based systems for allowing domain-specific extensions and extensions to the inference methods and the used knowledge-modeling language.

The approach of the Metacognitive Loop presented in [1] considers the use of metalevels for improving learning capabilities and human-computer dialogs. Similar to our approach, it points out the need for enhancing agents with capabilities to reason about their cognitive capabilities for gaining self-awareness and a basis for decisions about what, when, and how to learn. However, our approach stems more from the ontology and technical-use point of view and supports the idea of using metalevels from that side.


7 Conclusion

This paper presents a technology for using knowledge-based systems on diverse metalayers. The main ingredients for this task are models about knowledge (metamodels). Through the use of knowledge-based systems as they are, a Reasoning-Driven Architecture is provided. It enables reasoning facilities on each metalayer, as opposed to the Model-Driven Architecture, which focuses on transformations. The Reasoning-Driven Architecture is realized through a hierarchy of Knowledge Reflection Servers based on configuration systems. Future work will include metastrategies for conducting reasoning methods on the metalayers, a complete implementation of the servers, and industrial experiments in the field of knowledge engineering.

REFERENCES

[1] Michael L. Anderson and Donald R. Perlis, 'Logic, Self-awareness and Self-improvement: the Metacognitive Loop and the Problem of Brittleness', J. Log. and Comput., 15(1), 21-40, (2005).
[2] Grigoris Antoniou and Frank Van Harmelen, 'Web Ontology Language: OWL', in Handbook on Ontologies in Information Systems, pp. 67-92, Springer, (2003).
[3] T. Asikainen and T. Männistö, 'A Metamodelling Approach to Configuration Knowledge Representation', in Proc. of the Configuration Workshop at the 21st Int. Joint Conference on Artificial Intelligence (IJCAI 2009), Pasadena, California, (2009).
[4] Colin Atkinson and Thomas Kühne, 'Model-Driven Development: A Metamodeling Foundation', IEEE Softw., 20(5), 36-41, (2003).
[5] F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P. Patel-Schneider, The Description Logic Handbook, Cambridge University Press, 2003.
[6] Gilad Bracha and David Ungar, 'Mirrors: design principles for meta-level facilities of object-oriented programming languages', in OOPSLA '04: Proceedings of the 19th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, pp. 331-344, New York, NY, USA, (2004). ACM.
[7] R.A. Brooks, R.P. Gabriel, and G.L. Steele Jr., 'Lisp-in-Lisp: High Performance and Portability', in Proc. of the Int. Joint Conf. on AI (IJCAI-83), (1983).
[8] M. Buchheit, R. Klein, and W. Nutt, 'Constructive Problem Solving: A Model Construction Approach towards Configuration', Technical Report TM-95-01, Deutsches Forschungszentrum für Künstliche Intelligenz, Saarbrücken, (January 1995).
[9] Dragan Gašević, Dragan Djuric, Vladan Devedzic, and Bran Selic, Model Driven Architecture and Ontology Development, Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2006.
[10] A. Günter and L. Hotz, 'KONWERK - A Domain Independent Configuration Tool', Configuration Papers from the AAAI Workshop, 10-19, (July 19 1999).
[11] P. Haase, R. Palma, and M. d'Aquin, 'Updated Version of the Networked Ontology Model', Project Deliverable D1.1.5, NeOn Project, (2009). www.neon-project.org.
[12] W. Hesse, 'More Matters on (Meta-)Modelling: Remarks on Thomas Kühne's "Matters"', Journal on Software and Systems Modeling, 5(4), 369-385, (2006).
[13] Ian Horrocks, Peter F. Patel-Schneider, Harold Boley, Said Tabet, Benjamin Grosof, and Mike Dean, 'SWRL: A Semantic Web Rule Language Combining OWL and RuleML', W3C member submission, World Wide Web Consortium, (2004).
[14] L. Hotz, 'Construction of Configuration Models', in Configuration Workshop, 2009, eds., M. Stumptner and P. Albert, Workshop Proceedings IJCAI, Pasadena, (2009).
[15] L. Hotz, Frame-based Knowledge Representation for Configuration, Analysis, and Diagnoses of Technical Systems (in German), volume 325 of DISKI, Infix, 2009.
[16] L. Hotz and B. Neumann, 'Scene Interpretation as a Configuration Task', Künstliche Intelligenz, 3, 59-65, (2005).
[17] L. Hotz, K. Wolter, T. Krebs, S. Deelstra, M. Sinnema, J. Nijhuis, and J. MacGregor, Configuration in Industrial Product Families - The ConIPF Methodology, IOS Press, Berlin, 2006.
[18] E. Hyvönen, 'Constraint Reasoning based on Interval Arithmetic: the Tolerance Propagation Approach', Artificial Intelligence, 58, 71-112, (1992).
[19] U. John, Konfiguration und Rekonfiguration mittels constraintbasierter Modellierung, Infix, St. Augustin, 2002. In German.
[20] Sonya E. Keene, Object-Oriented Programming in Common Lisp: A Programmer's Guide to CLOS, Addison-Wesley, 1989.
[21] G. Kiczales, J. des Rivieres, and D. G. Bobrow, The Art of the Metaobject Protocol, MIT Press, Cambridge, MA, 1991.
[22] T. Kühne, 'Matters of (Meta-)Modeling', Journal on Software and Systems Modeling, 5(4), 369-385, (2006).
[23] OMG, MDA Guide Version 1.0.1, omg/03-06-01, Object Management Group, 2003.
[24] OMG, Meta Object Facility Core Specification, version 2.0, formal/2006-01-01, Object Management Group, 2006.
[25] OMG, Unified Modeling Language: Infrastructure, version 2.1.1, formal/07-02-06, Object Management Group, 2007.


[26] OMG, Ontology Definition Metamodel, Version 1.0, Object Management Group, 2009.
[27] Gilbert Paquette, 'An Ontology and a Software Framework for Competency Modeling and Management', Educational Technology & Society, 10(3), 1-21, (2007).
[28] J. Pitrat, 'Metaconnaissance, avenir de l'Intelligence Artificielle', Technical report, Hermes, Paris, (1991).
[29] K.C. Ranze, T. Scholz, T. Wagner, A. Günter, O. Herzog, O. Hollmann, C. Schlieder, and V. Arlt, 'A Structure-Based Configuration Tool: Drive Solution Designer DSD', 14. Conf. Innovative Applications of AI, (2002).
[30] D. Sabin and E.C. Freuder, 'Configuration as Composite Constraint Satisfaction', in Proceedings of the Artificial Intelligence and Manufacturing Research Planning Workshop, pp. 153-161, AAAI Press, (1996).
[31] C. Schröder, R. Möller, and C. Lutz, 'A Partial Logical Reconstruction of PLAKON/KONWERK', in DFKI-Memo D-96-04, Proceedings of the Workshop on Knowledge Representation and Configuration WRKP'96, ed., F. Baader, (1996).
[32] Stefan Schulz, Holger Stenzhorn, and Martin Boeker, 'The ontology of biological taxa', in ISMB, volume 24, pp. 313-321, (2008).
[33] K. Wolter, T. Krebs, and L. Hotz, Mass Customization - Challenges and Solutions, chapter Model-based Configuration Support for Software Product Families, 43-62, Springer, 2006.
[34] Dong Yang, Ming Dong, and Rui Miao, 'Development of a Product Configuration System with an Ontology-Based Approach', Comput. Aided Des., 40(8), 863-878, (2008).


Challenges of Knowledge Evolution in Practice

Andreas Falkner and Alois Haselböck
Siemens AG Österreich, Austria, email: {andreas.a.falkner,alois.haselboeck}@siemens.com

Abstract. This paper enumerates some of the most important challenges which arise in practice when changing a configurator knowledge base: redesign of the knowledge base, schema evolution of the databases, upgrade of configuration instances which are already in the field, and adaptation of solver, UI, I/O, and test suites. There are research theories for some of these challenges, but only few of them are already available in tools and frameworks. So we do not provide solutions here, but we want to stimulate a deeper discussion on knowledge evolution. Moreover, we give a self-contained real-world example as a proposal for future research.


1 INTRODUCTION

Product configurators have a long history in artificial intelligence (see [1], [2]) and nowadays various systems based on AI methods are available: scientific frameworks and tools (like Choco, CHR, DLV, etc.) as well as commercial programs (like those from Camos, Configit, ILOG, Oracle, SAP, Tacton, etc.). However, many of those systems do not cope well with a scenario which arises in practice whenever long-living products are configured: knowledge evolution. Knowledge about the product grows and changes over time, e.g. new types of product components become available, old ones run out of production, experiences in production or operation cause tightening or loosening of assembly restrictions between components' properties, etc. The corresponding configurator's knowledge base (types, relations, attributes, constraints) must be adjusted accordingly. This may also affect existing configuration instances based upon that knowledge base. The necessary reconfiguration – i.e., in general, the modification of an existing product individual to meet new requirements – has been subject to only little research work, e.g. by [3], who suggest explicit rules (conditions and operations) and invariants for necessary changes of instances. A more recent paper [4] concentrates on replacing (faulty) components with new or better ones and poses some research questions still to be answered.

This paper identifies some aspects of knowledge evolution worth researching in detail. As an example we use a fictitious people counting system for museums. The structure of the problem is similar to problems we encountered in different real-world domains of our technical product configurators. First we describe the original problem and model it with an object-oriented (UML) and a constraint-based (GCSP) approach. Then we specify some exemplary changes. This leads to several challenges of how to cope with the implied knowledge evolution: schema evolution, data upgrade, solver adaptations, and adaptations to UI, I/O, and tests.

2 ORIGINAL PROBLEM DESCRIPTION

A people counting system (e.g. for use in museums) consists of door sensors, counting zones (rooms), and communication units (see [5] for more details).

A door sensor detects everybody who moves through its door (directed movement detection). There can be doors without a sensor. Any number of rooms may be grouped into a counting zone. Each zone knows how many persons are in it (by combining the information from the sensors at doors leading out of the zone). Zones may overlap or include other zones, i.e. a room may be part of several zones. A communication unit can control at most 2 door sensors and at most 2 zones. If a unit controls a sensor which contributes to a zone on another unit, then the two units need a direct connection: one is a partner unit of the other and vice versa. Each unit can have at most 2 partner units. This relation is symmetric but neither transitive nor reflexive.

PartnerUnits problem: Given a consistent configuration of door sensors and zones, find a valid assignment of communication units (with at most 2 partner units each). We assume that a solution requires only the minimal number of units, i.e. the smallest integer greater than or equal to half the maximum of the number of zones and the number of door sensors: ceil(max(#zones, #sensors) / 2). Since this number is known in advance, we treat the task as a constraint satisfaction problem, not as a constraint optimization problem.

Example 1: Rooms 1 to 8 with eleven doors, nine of them having a door sensor (named D01, D12, D26, D34, etc. by the rooms which they connect). Eight zones, named by the rooms which they contain: Z1 (white), Z2345678, Z2367, Z2378 (light grey), Z4, Z45 (dark grey), Z456, Z6 (medium grey). The relation between zones and door sensors can be represented as a bipartite graph (see Figure 2). A solution using the minimum of five units is shown in Table 1.

Figure 1. Room layout of example 1

Unit  Zones          DoorSensors  PartnerUnits
U1    Z1, Z2345678   D01, D78     U2
U2    Z2367, Z45     D12, D56     U1, U3
U3    Z2378, Z6      D34, D67     U2, U4
U4    Z456, Z4       D26, D36     U3, U5
U5    -              D45          U4

Table 1: A minimal solution of example 1

context ComUnit inv:
  myPartnerUnitsSensor = sensors.zones.unit->excluding(self)->asSet() and
  myPartnerUnitsZone = zones.sensors.unit->excluding(self)->asSet() and
  myPartnerUnitsSensor->union(myPartnerUnitsZone)->size() <= 2
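To complement the OCL formulation, the following is a small executable sketch in Python of the same partner-unit computation and cardinality checks. The encoding (dictionaries mapping each unit to its sensors and zones, and each zone to its contributing sensors) and all function names are our own illustration, not part of any tool mentioned in this paper.

# Hypothetical encoding: unit2sensors and unit2zones map a unit to the
# sets it controls; zone2sensors maps a zone to its contributing sensors.

def partner_units(u, unit2sensors, unit2zones, zone2sensors):
    """All units other than u linked to u via a zone2sensor pair."""
    sensor_owner = {s: v for v, ss in unit2sensors.items() for s in ss}
    zone_owner = {z: v for v, zs in unit2zones.items() for z in zs}
    partners = set()
    for z, sensors in zone2sensors.items():
        for s in sensors:
            if zone_owner[z] == u:    # zone on u, contributing sensor elsewhere
                partners.add(sensor_owner[s])
            if sensor_owner[s] == u:  # sensor on u, fed zone elsewhere
                partners.add(zone_owner[z])
    partners.discard(u)
    return partners

def is_valid(unit2sensors, unit2zones, zone2sensors):
    """Check the at-most-2 bounds on sensors, zones, and partner units."""
    units = set(unit2sensors) | set(unit2zones)
    return all(
        len(unit2sensors.get(u, set())) <= 2
        and len(unit2zones.get(u, set())) <= 2
        and len(partner_units(u, unit2sensors, unit2zones, zone2sensors)) <= 2
        for u in units
    )

A CSP solver then only needs to search over the unit2sensors and unit2zones assignments subject to this check.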

4.1 Redesign Knowledge Base (KB -> KB’)

Figure 5. UML diagram of the changed PartnerUnits problem

Figure 6. GCSP formulation of the modified PartnerUnits problem (fragment: constraint DoorSensor.equalOutside: outside = unit.outside)

Figure 5 shows a UML diagram which models the changed knowledge base: DoorSensor and ComUnit have a new integer attribute which distinguishes connections to outside systems (0 is used for sensors without outside connection). Zone gets two subclasses - ZoneWithSensors for the original zone type (to which the zone2sensor association remains attached) and ZoneWithZones for the new zone type, which groups other zones via a zone2zone association.

In general, the following changes are possible and should be handled in a way that the other challenges can be solved as well:
• Change property names, attribute types, association domains and cardinalities, inheritance hierarchy, etc.
• Add new component types, attributes, or associations.
• Delete component types, attributes, or associations.
• Change, add, or delete constraints.


In our example, the knowledge engineer had to decide which class for zones (if any) to reuse in the model - and he opted for the abstract general class and against the old semantics (the latter would have required a new superclass of the existing Zone and the new ZoneWithZones, as well as a redirection of unit2zone instead of zone2sensor). Furthermore, he decided on a new attribute for ComUnit instead of a new class. Thus the existing constraints and cardinalities can stay unchanged; only the computation of the partnerunits association must be extended with the path over zone2zone. A new constraint for outside systems is necessary (the outside numbers of a sensor and its unit must be equal). Furthermore, implicit knowledge strongly suggests disallowing cycles in the zone2zone association (omitted for brevity). Challenges: Are there any patterns or rationales for putting such design decisions on a sound basis?
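As an illustration of the chosen redesign, the following sketch renders the class structure of Figure 5 as Python dataclasses. This is our own transliteration of the UML: class and attribute names follow the text above, everything else is assumed.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Zone:                      # abstract general class, reused as-is
    id: str

@dataclass
class ZoneWithSensors(Zone):     # original zone semantics
    sensors: List["DoorSensor"] = field(default_factory=list)

@dataclass
class ZoneWithZones(Zone):       # new: a zone grouping other zones
    zones: List[Zone] = field(default_factory=list)

@dataclass
class DoorSensor:
    id: str
    outside: int = 0             # 0 = no outside connection

@dataclass
class ComUnit:
    id: str
    outside: int = 0             # new attribute instead of a new class
    sensors: List[DoorSensor] = field(default_factory=list)
    zones: List[Zone] = field(default_factory=list)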

4.2 Schema Evolution (DB -> DB’)

When existing configuration instances (I) shall be accessed with the new knowledge base (KB’), the schema of the database (DB) in which the instances are stored must be changed as well: the instances must be upgraded to the new database schema DB’ (part of KB’). In our example, the schema evolution must change the type of all Zone instances to ZoneWithSensors. In all ComUnit instances, it must add the outside attribute and set it to 0 (the initial value). In all DoorSensor instances, it must add the outside attribute and set it to the correct value (which must be supplied externally). The second action may be derived automatically from KB’; the others cannot. A way to formalize this transformation of a source schema to a target schema is the theory of data exchange [8]. For our example, the source schema is defined by the following predicates:

{sensor/1, zone/1, unit/1, zone2sensor/2, unit2sensor/2, unit2zone/2, unit2unit/2}

The solution instance shown in Table 1 is therefore represented by:

{sensor(d01), sensor(d12), …, zone(z1), zone(z2345678), zone(z2367), …, unit(u1), unit(u2), …, zone2sensor(z1, d01), zone2sensor(z1, d12), …, unit2sensor(u1, d01), unit2sensor(u1, d78), …, unit2zone(u1, z1), unit2zone(u1, z2345678), …, unit2unit(u1, u2), unit2unit(u2, u1), …}

The target schema is:

{sensor/1, zone4zones/1, zone4sensors/1, unit/1, sensor_outside/2, unit_outside/2, zone2sensor/2, unit2sensor/2, unit2zone/2, unit2unit/2}

Data exchange transformation rules map a source configuration to a target configuration (we assume that information about the outside system of sensors is available in an externally defined table named sensor_info):

zone(X) -> zone4sensors(X)
unit(X) -> unit(X), unit_outside(X, 0)
sensor(X) and sensor_info(X, V) -> sensor(X), sensor_outside(X, V)
zone2sensor(X, Y) -> zone2sensor(X, Y)
unit2sensor(X, Y) -> unit2sensor(X, Y)
unit2zone(X, Y) -> unit2zone(X, Y)
unit2unit(X, Y) -> unit2unit(X, Y)

The Chase procedure then generates a target instance by applying the transformation rules to the source instance:

{sensor(d01), sensor(d12), …, sensor_outside(d01, 1), sensor_outside(d12, 0), …, zone4sensors(z1), zone4sensors(z2345678), …, unit(u1), unit(u2), …, unit_outside(u1, 0), unit_outside(u2, 0), …, zone2sensor(z1, d01), zone2sensor(z1, d12), …, unit2sensor(u1, d01), unit2sensor(u1, d78), …, unit2zone(u1, z1), unit2zone(u1, z2345678), …, unit2unit(u1, u2), unit2unit(u2, u1), …}
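Because these rules are non-recursive, the Chase reduces here to a single pass over the source facts. The following Python sketch illustrates this; the fact encoding and function names are our own assumptions, and sensor_info is assumed to be given as a dictionary with an entry for every sensor.

# Facts are (predicate, args) tuples, e.g. ("zone2sensor", ("z1", "d01")).

def chase(source_facts, sensor_info):
    """Apply the data exchange rules above in one pass (no recursion needed)."""
    target = set()
    for pred, args in source_facts:
        if pred == "zone":
            target.add(("zone4sensors", args))            # zone(X) -> zone4sensors(X)
        elif pred == "unit":
            target.add(("unit", args))                    # unit(X) -> unit(X),
            target.add(("unit_outside", args + (0,)))     #   unit_outside(X, 0)
        elif pred == "sensor":                            # sensor(X) and sensor_info(X, V)
            target.add(("sensor", args))                  #   -> sensor(X), sensor_outside(X, V)
            target.add(("sensor_outside", args + (sensor_info[args[0]],)))
        else:
            target.add((pred, args))                      # relations are copied unchanged
    return target

Applying chase to the source instance above with sensor_info = {"d01": 1, "d12": 0, ...} produces the target instance listed.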

The performance of this Chase depends on which logical expressions are used in the transformation rules. In general, there might be various changes of different difficulty, e.g. renamings, type changes, value shifts, conditional changes, etc. Challenges: What are complete, clear, fast, and reliable methods to specify the necessary changes for schema evolution?

4.3 Upgrade Instances (I -> I’)

The new configuration instance is now in the schema of DB’, but not necessarily consistent with all constraints of KB’. In order to obtain a consistent I’, the violated constraints must be repaired. KB’ might contain a new constraint which contradicts some part of a configuration that is valid with respect to the old KB. Examples include simple cases like new versions of components, but also complicated relations between objects. In our example, D01 should be placed on a separate unit with outside number 1. Just changing the value of the outside attribute of its unit is not enough, as there are other zones and sensors on that unit. Moving D01 to a new unit U6 is possible because at present U1 has just one partner unit, so its second partner unit can be set to U6. The resulting configuration instance is shown in Table 2.


Unit  Zones          DoorSensors  PartnerUnits
U1    Z1, Z2345678   D78          U2, U6
U2    Z2367, Z45     D12, D56     U1, U3
U3    Z2378, Z6      D34, D67     U2, U4
U4    Z456, Z4       D26, D36     U3, U5
U5    -              D45          U4
U6    -              D01          U1

Table 2: A solution for the upgraded configuration

This solution is not minimal, but it is well acceptable because minimality of the solution (i.e. using as few units as possible) is less important than minimality of changes during the upgrade (i.e. the upgraded configuration should be as close to the original configuration as possible). Furthermore, there is no trivial way to a better solution: trying to reuse U5 instead of a new U6 just by exchanging D01 and D45 would cause U2 to need 3 partner units (U1, U3, U4), violating another cardinality constraint.



Challenges: How does the system know what to do (e.g. not setting DoorSensor.outside to 0)? Is it useful to specify a white list (e.g. directions of repair) or a black list (e.g. do not change user-set values) for changes? Which repair actions ensure that I’ is as close to I as possible? Can this closeness be formally defined in terms of an optimization function, shifting the upgrade task from a search problem to an optimization problem? In general, it is not possible to detect contradicting constraints by just checking the set of all constraints in the KB: the question is whether at least one configuration fulfils all constraints. If the constraint language is powerful enough (e.g. has quantifiers and multiplication), this question is undecidable; for simpler languages the best known algorithms may still be of exponential complexity. An alternative approach based on diagnosis is described in [9].
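One conceivable way to make this closeness precise - our own sketch, not a definition taken from the literature cited here - is to count the atomic facts that differ between the old and the upgraded configuration and to minimize that count:

def change_distance(old_facts, new_facts):
    """Number of facts added or removed by the upgrade (symmetric difference)."""
    return len(old_facts ^ new_facts)

Under such a measure, the upgrade of Table 2 (adding U6 and moving only D01) would be preferred over repairs that reshuffle several units, turning the upgrade task into an optimization problem as the challenge suggests.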

4.4 Solver Adaptation

If domain-independent solvers are used, then (in the best case) no solver adaptations should be necessary. However, achieving sufficient performance often requires domain-specific algorithms. In a paper submitted to the forthcoming AI EDAM special issue on configuration, we present several domain-independent approaches which all fail to find solutions for large-scale configurations (i.e. those containing many instances), even simple ones. Such problem instances can only be solved with domain-specific heuristics exploiting the problem structure, or after mapping the problem to special algorithms from appropriate fields, e.g. graph theory [10]. Unfortunately, the changed problem no longer maps to a bipartite graph, which changes the problem structure considerably. Domain-specific solving must be adapted, usually at relatively high cost. The same is true not only for our problem but also for other domain-specific implementations. Challenges: Are there approaches to reduce the need for domain-specific algorithms or to support their adaptation to changed requirements?

4.5 UI Extension

The user interface must support the configuration work as well as possible. In our example, the configuration process is severely affected by the new functionality: ZoneWithZones and ZoneWithSensors have different associations. Outside sensors and units should be visualized distinctively so that they can be easily distinguished. It must become possible to manually place sensors on units (according to the requirements of the outside system) before automatically placing the other sensors and zones. Some GUI features can be derived directly from the knowledge base (e.g. classes, attributes, associations) but in general need to be filtered and/or parameterized (formatted) for better usability. Some tools therefore allow defining certain UI properties as part of the knowledge base. Challenges: To what extent can the UI be defined as a natural extension of the knowledge base without introducing redundancy or overhead?

4.6 Adjust I/O

Our example does not contain explicit input or output interfaces. Typically, input could be product information in external databases (code numbers of components, variations, prices, etc.) or customer requirements for a product individual (e.g. in XML format). Possible outputs comprise bills of material, production plans, etc. They must be adjusted to the changed model, e.g. to output the new attributes' values or to read in prices of the new types. If many different output formats are involved - as is typical for configuration tasks in engineering-oriented companies - synchronization may get complicated. A typical source of problems for changes concerning inputs is using the correct version of the input data, i.e. the one matching the current knowledge schema. Challenges: How can external data and output formats be synchronized easily and reliably with the changed model?

4.7 Upgrade Test Suites

Each knowledge base needs tests to ensure quality. This is especially true for safety-critical domains like railway systems. Common developer knowledge is that testing effort is as high as implementation effort; for knowledge base development it tends to be much higher, as the specification language is very concise due to declarative, multi-functional constraints. Tests should include checking and solving with various combinations of set and free variables contributing to each constraint so that a high coverage of all scenarios is reached. Regression test cases are usually based on problem statements and reference solution instances. These instances need to be upgraded, too (see sections 4.2 and 4.3). Challenges: How can test efforts be reduced, e.g. by automatically creating part of the tests? How can regression test cases be upgraded when they fail because of changes in the knowledge base? How can errors in the changed knowledge base be distinguished from necessary changes in the test cases?
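A minimal sketch of such a regression check, reusing the hypothetical is_valid checker sketched after the OCL invariant above: it re-validates every stored reference solution against the changed knowledge base and reports failures, which must then be classified manually as knowledge base errors or as test cases that need an upgrade.

def regression_failures(reference_solutions, is_valid):
    """reference_solutions: name -> (unit2sensors, unit2zones, zone2sensors)."""
    failures = []
    for name, (u2s, u2z, z2s) in reference_solutions.items():
        if not is_valid(u2s, u2z, z2s):
            failures.append(name)   # KB error, or test case needing upgrade?
    return failures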

5 CONCLUSION

We presented a real-world example of knowledge evolution and its effects on the design of the knowledge base and the upgrade of existing configuration instances (databases). Although various modeling languages for product configuration have been suggested over the last years (e.g. [11]), we do not see one which copes well with all of the presented challenges. Substantial research activities are required to improve the efficiency and quality of the redesign of knowledge bases for product configurators. A first step is the definition of the challenges in more detail and in a formal way.

REFERENCES

[1] D. Sabin and R. Weigel, Product Configuration Frameworks - A Survey, IEEE Intelligent Systems 13(4), pp. 42-49, 1998.
[2] A. Felfernig, Standardized Configuration Knowledge Representations as Technological Foundation for Mass Customization, IEEE Transactions on Engineering Management 54(1), pp. 41-56, 2007.
[3] T. Männistö, T. Soininen, J. Tiihonen, and R. Sulonen, Framework and Conceptual Model for Reconfiguration, AAAI-99 Workshop on Configuration, pp. 59-64, 1999.
[4] P. Manhart, Reconfiguration - A Problem in Search for Solutions, IJCAI-05 Workshop on Configuration, pp. 64-67, 2005.
[5] A. Falkner, A. Haselböck, and G. Schenner, Modeling Technical Product Configuration Problems, ECAI-10 Workshop on Configuration (to appear), 2010.
[6] A. Felfernig, G. Friedrich, D. Jannach, and M. Zanker, Configuration Knowledge Representation Using UML/OCL, Proceedings of the 5th International Conference on the Unified Modeling Language, pp. 49-62, Springer-Verlag, 2002.
[7] G. Fleischanderl, G. Friedrich, A. Haselböck, H. Schreiner, and M. Stumptner, Configuring Large-Scale Systems with Generative Constraint Satisfaction, IEEE Intelligent Systems 13(4), pp. 59-68, 1998.
[8] R. Fagin, P.G. Kolaitis, R.J. Miller, and L. Popa, Data Exchange: Semantics and Query Answering, Theoretical Computer Science 336, pp. 89-124, 2005.
[9] A. Felfernig, G. Friedrich, D. Jannach, and M. Stumptner, Consistency-Based Diagnosis of Configuration Knowledge Bases, Technical Report KLU-IFI-99-2, University of Klagenfurt, 1999.
[10] B.W. Kernighan and S. Lin, An Efficient Heuristic Procedure for Partitioning Graphs, The Bell System Technical Journal, 1970.
[11] T. Soininen, J. Tiihonen, T. Männistö, and R. Sulonen, Towards a General Ontology of Configuration, AI EDAM 12, pp. 357-372, 1998.



