Model-Based Systems

The Model-Based Systems (MBS) paradigm refers to a methodology that allows for the description of various kinds of systems for various tasks in a uniform way. For example, MBS has been used to specify monitoring tasks in medical systems, for planning in cognitive systems, and for control and diagnosis in hardware and software systems. Consequently, research in MBS is spread across various application domains and different tasks. Because many scientific workshops are application-specific or system- and task-oriented, it is difficult to exchange experiences and novel concepts across the various application domains and tasks. In recent years MBS technology has increasingly contributed to mastering the inherent and ever increasing complexity of software and software-enabled systems. Thus it is the aim of this workshop to cross-fertilize the established concepts in model-based software engineering and MBS technology to further leverage model-oriented techniques in the software engineering domain.

The MBS 2008 workshop attracted researchers and practitioners dealing with modeling for specific reasoning tasks, knowledge representation, qualitative reasoning, and related areas such as model-based testing and fault detection and localization. MBS 2008, the Workshop on Model-Based Systems, is the fourth in a series of workshops on this topic. Previous workshops were co-located with ECAI 2004 in Valencia, Spain, IJCAI 2005 in Edinburgh, United Kingdom, and ECAI 2006 in Riva del Garda, Italy.

The submissions to MBS 2008 cover a wide range of topics within the area of model-based systems. They range from application-oriented solutions to modeling problems, including automated generation and debugging of models, to more theoretical contributions in the areas of diagnosis, qualitative reasoning and testing. The good mixture of theoretical and application-oriented articles from various domains promises a very interesting and fruitful workshop.
Finally, we would like to thank all the authors who have submitted to this workshop. Moreover, we would like to thank all members of the program committee for their careful reviews.

Bernhard Peischl, Neal Snooke, Gerald Steinbauer and Cees Witteveen
July 2008
Organizing Committee

Bernhard Peischl, Technische Universität Graz, Austria
Neal Snooke, University of Wales, Aberystwyth, UK
Gerald Steinbauer, Technische Universität Graz, Austria
Cees Witteveen, Delft University of Technology, The Netherlands
Program Committee

Gautam Biswas, Vanderbilt University, USA
Bert Bredeweg, Universiteit van Amsterdam, The Netherlands
Marie-Odile Cordier, IRISA, Campus de Beaulieu, France
Carlos J. Alonso González, Universidad de Valladolid, Spain
Bernhard Peischl, Technische Universität Graz, Austria
Claudia Picardi, Università di Torino, Italy
Belarmino Pulido Junquera, Universidad de Valladolid, Spain
Martin Sachenbacher, Technische Universität München, Germany
Paulo Salles, Universidade de Brasília, Brazil
Neal Snooke, University of Wales, Aberystwyth, UK
Gerald Steinbauer, Technische Universität Graz, Austria
Cees Witteveen, Delft University of Technology, The Netherlands
Table of Contents

Comparing GDE and Conflict-based Diagnosis
  Ildikó Flesch, Peter J.F. Lucas . . . . . . . . . . . . . . . . . . . 1

On computing minimal conflicts for ontology debugging
  Kostyantyn Shchekotykhin, Gerhard Friedrich, Dietmar Jannach . . . . 7

Supporting Conceptual Knowledge Capture Through Automatic Modelling
  Jochem Liem, Hylke Buisman, Bert Bredeweg . . . . . . . . . . . . . 13

Automated Learning of Communication Models for Robot Control Software
  Alexander Kleiner, Gerald Steinbauer, Franz Wotawa . . . . . . . . . 19

Relaxation of Temporal Observations in Model-Based Diagnosis of Discrete-Event Systems
  Gianfranco Lamperti, Federica Vivenzi, Marina Zanella . . . . . . . 25

The Concept of Entropy by means of Generalized Orders of Magnitude Qualitative Spaces
  Llorenç Roselló, Francesc Prats, Mónica Sánchez, Núria Agell . . . . 31

Model-based Testing using Quantified CSPs: A Map
  Martin Sachenbacher, Stefan Schwoon . . . . . . . . . . . . . . . . 37
Comparing GDE and Conflict-based Diagnosis

Ildikó Flesch¹ and Peter J.F. Lucas²

Abstract. Conflict-based diagnosis is a recently proposed method for model-based diagnosis, inspired by consistency-based diagnosis, that incorporates a measure of data conflict, called the diagnostic conflict measure, to rank diagnoses. The probabilistic information that is required to compute the diagnostic conflict measure is represented by means of a Bayesian network. The general diagnostic engine is a classical implementation of consistency-based diagnosis and incorporates a way to rank diagnoses using probabilistic information. Although conflict-based and consistency-based diagnosis are related, the way the general diagnostic engine handles probabilistic information to rank diagnoses is different from the method used in conflict-based diagnosis. In this paper, both methods are compared to each other.
1 INTRODUCTION

In the last two decades, research into model-based diagnostic software has become increasingly important, mainly because the complexity of devices for which such software can be used has risen considerably, and troubleshooting faults in such devices has therefore become increasingly difficult. Basically, two types of model-based diagnosis are distinguished in the literature: (i) consistency-based diagnosis [2, 8], and (ii) abductive diagnosis [7]. In consistency-based diagnosis a diagnosis has to be consistent with the modelled system behaviour and observations made on the actual system, whereas in abductive diagnosis the observations have to be implied by the modelled system given the diagnosis [1]. In this paper, we focus on consistency-based diagnosis as implemented in the general diagnostic engine, GDE for short [2]. In addition, particular probabilistic extensions to consistency-based diagnosis as implemented in GDE are considered [2]. There is also a third kind of model-based diagnosis that can best be seen as a translation of consistency-based diagnosis from a mixed logical-probabilistic setting to a purely probabilistic setting, using a statistical measure of information conflict. The method has been called conflict-based diagnosis; it exploits Bayesian-network representations for the purpose of model-based diagnosis [4].

Although both GDE and conflict-based diagnosis take consistency-based diagnosis as a foundation, the way uncertainty is handled, as well as the way in which diagnoses are ranked, are different. The aim of this paper is to shed light on the differences and similarities between these two approaches to model-based diagnosis. It is shown that conflict-based diagnosis yields a ranking that, under particular circumstances, is more informative than that obtained by GDE.

¹ Department of Computer Science, Maastricht University, email: [email protected]
² Institute for Computing and Information Sciences, Radboud University Nijmegen, email: [email protected]
The paper is organised as follows. In Section 2, the necessary basic concepts from model-based diagnosis, including GDE, and the use of Bayesian networks for model-based diagnosis are reviewed. Next, in Section 3, the basic concepts from conflict-based diagnosis are explained. What can be achieved by the method of probabilistic reasoning in GDE is subsequently compared to the method of conflict-based diagnosis in Section 4. Finally, in Section 5, the paper is rounded off with some conclusions.
2 PRELIMINARIES

2.1 Model-based Diagnosis

In the theory of consistency-based diagnosis [8, 2, 3], the structure and behaviour of a system are represented by a logical diagnostic system SL = (SD, COMPS), where
• SD denotes the system description, which is a finite set of logical formulae, specifying structure and behaviour;
• COMPS is a finite set of constants, corresponding to the components of the system that can be faulty.
The system description consists of behaviour descriptions and connections. A behavioural description is a formula specifying normal and abnormal (faulty) functionality of the components. An abnormality literal of the form Ac is used to indicate that component c is behaving abnormally, whereas literals of the form ¬Ac are used to indicate that component c is behaving normally. A connection is a formula of the form ic ≡ oc′, where ic and oc′ denote the input and output of components c and c′, respectively.

A logical diagnostic problem is defined as a pair PL = (SL, OBS), where SL is a logical diagnostic system and OBS is a finite set of logical formulae, representing observations.

Adopting the definition from [3], a diagnosis in the theory of consistency-based diagnosis is defined as follows. Let ∆C consist of the assignment of abnormal behaviour, i.e. Ac, to the set of components C ⊆ COMPS and normal behaviour, i.e. ¬Ac, to the remaining components COMPS − C; then ∆C is a consistency-based diagnosis of the logical diagnostic problem PL iff the observations are consistent with both the system description and the diagnosis; formally:

  SD ∪ ∆C ∪ OBS ⊭ ⊥.

Here, ⊭ stands for the negation of the logical entailment relation ⊨, and ⊥ represents a contradiction. Usually, one is in particular interested in subset-minimal diagnoses, i.e. diagnoses ∆C where the set C is subset-minimal. Thus, a subset-minimal diagnosis assumes that a subset-minimal set of components is faulty; this often corresponds to the most likely diagnosis.
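To make the set-theoretic notion concrete, the following sketch (our own illustration, with hypothetical helper names that are not from the paper) filters a collection of diagnoses, each represented as the set C of abnormal components, down to the subset-minimal ones:

```python
def subset_minimal(diagnoses):
    """Keep only diagnoses whose abnormal-component set C is subset-minimal."""
    minimal = []
    for c in diagnoses:
        # c is subset-minimal iff no other diagnosis is a strict subset of it.
        if not any(other < c for other in diagnoses):
            minimal.append(c)
    return minimal

# Hypothetical diagnoses of some problem, as sets of abnormal components:
candidates = [frozenset({"X1"}), frozenset({"X1", "A2"}), frozenset({"X2", "R1"})]
print(subset_minimal(candidates))  # {X1} and {X2, R1}; {X1, A2} is subsumed by {X1}
```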
[Figure 1 shows the full-adder circuit with gates X1, X2, A1, A2 and R1; the first output is predicted 0 but observed 1, and the second output is predicted 1 but observed 0.]

Figure 1. Full adder with all outputs computed under the assumption of normality and observed and predicted outputs; i1 (1), ī2 (0) and i3 (1) indicate the inputs of the circuit and o1 (1) and ō2 (0) its observed outputs.
EXAMPLE 1 Figure 1 presents the full-adder example, which consists of two AND gates (A1 and A2), one OR gate (R1) and two exclusive-OR (XOR) gates (X1 and X2). Note that the predicted output ō1 contradicts the observation o1, which is also the case for the output of gate R1. As a consequence, the assumption that all components are behaving normally is invalid; thus, this is not a consistency-based diagnosis. However, a consistency-based diagnosis would be to assume the malfunctioning of component X1, as this would restore consistency. □
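The reasoning in this example can be reproduced mechanically. The sketch below is our own encoding of the circuit (not from the paper) under the weak fault model: an abnormal gate's output is unconstrained, so a candidate ∆C is a consistency-based diagnosis iff some choice of outputs for the abnormal gates agrees with the observations.

```python
from itertools import product

GATES = ["X1", "X2", "A1", "A2", "R1"]

def consistent(abnormal, i1=1, i2=0, i3=1, o1=1, o2=0):
    """True iff assuming exactly the gates in `abnormal` to be faulty is
    consistent with the observed inputs/outputs of the full adder."""
    # Outputs of abnormal gates are free; try every completion.
    for free in product([0, 1], repeat=len(abnormal)):
        val = dict(zip(abnormal, free))
        x1 = val.get("X1", i1 ^ i2)   # X1 = i1 XOR i2
        x2 = val.get("X2", x1 ^ i3)   # X2 = X1 XOR i3 (sum output o1)
        a1 = val.get("A1", i1 & i2)   # A1 = i1 AND i2
        a2 = val.get("A2", x1 & i3)   # A2 = X1 AND i3
        r1 = val.get("R1", a1 | a2)   # R1 = A1 OR A2 (carry output o2)
        if x2 == o1 and r1 == o2:
            return True
    return False

print(consistent([]))      # False: the all-normal assumption is inconsistent
print(consistent(["X1"]))  # True: assuming only X1 faulty restores consistency
```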
2.2 GDE

Next, GDE is briefly described, where [2] is used as a point of reference; however, the terminology defined above in this paper is adopted throughout this section. For example, where [2] speaks of a 'candidate', in this paper the term 'diagnosis' is used. The logical reasoning implemented by GDE can best be seen as an efficient implementation of consistency-based diagnosis. GDE can also deal with uncertainty by attaching a prior probability of malfunctioning to components. After an observation is made, the prior probability becomes a posterior probability, conditioned on this observation.

Based on new observations, there may be previous diagnoses which become inconsistent with the observations and the system description. The set of diagnoses that are still possible is denoted by R and called the set of remaining diagnoses; it can be partitioned into two disjoint subsets: (i) the set of diagnoses that imply the observations, called the set of selected diagnoses and denoted by S, and (ii) the set of diagnoses that neither predict nor contradict the observations, called the set of uncommitted diagnoses, denoted by U. By definition, R = S ∪ U and S ∩ U = ∅. The posterior probability of a set of behaviour assumptions that is either inconsistent (not in R), a selected diagnosis (in S), or an uncommitted diagnosis (in U) is computed as follows:

  P(∆C | OBS) = 0                        if ∆C ∉ R
  P(∆C | OBS) = P(∆C) / P(OBS)           if ∆C ∈ S      (1)
  P(∆C | OBS) = (P(∆C)/m) / P(OBS)       if ∆C ∈ U

where m = 1/P(OBS | ∆C). Finally, the probability P(OBS) is computed as follows:

  P(OBS) = Σ_{∆C ∈ R} P(OBS, ∆C)
         = Σ_{∆C ∈ S} P(OBS, ∆C) + Σ_{∆C ∈ U} P(OBS, ∆C)      (2)
         = Σ_{∆C ∈ S} P(∆C) + Σ_{∆C ∈ U} P(∆C)/m.

Computation of P(∆C) is made easy in GDE by assuming independence between components behaving normally or abnormally. One of the consequences of this assumption is the following proposition.

Proposition 1 Let PL = (SL, OBS) be a logical diagnostic problem with associated joint probability distribution P as defined above for GDE, such that P(Ac) ≪ P(¬Ac) for each c ∈ COMPS, and let ∆C and ∆C′ be two consistency-based diagnoses that are both in either S or U. Then it holds that

  P(∆C | OBS) ≥ P(∆C′ | OBS)   if C ⊆ C′.

Proof. The result follows from the assumption of independence together with P(Ac) ≪ P(¬Ac):

  P(∆C) = ∏_{c ∈ C} P(Ac) · ∏_{c ∈ COMPS−C} P(¬Ac)
        ≥ ∏_{c ∈ C′} P(Ac) · ∏_{c ∈ COMPS−C′} P(¬Ac) = P(∆C′).

Filling this result into Equation (1) gives the requested outcome. □

For further detail of GDE the reader is referred to the paper by De Kleer and Williams [2]. The following example illustrates how GDE works.

Table 1. Comparison of the values of the diagnostic conflict measure and GDE for the full-adder circuit with observations OBS = ω = {i1, ī2, i3, o1, ō2} and the probability distribution P, assuming that P(ac) = P(oc | ac) = 0.001 (entries '–' correspond to eliminated diagnoses).

   k | X2 R1 X1 A1 A2 | conf[Pδk](ω) | GDE's P(∆k | OBS)
   1 |  1  1  1  1  1 |      –       |        –
   2 |  1  1  1  1  0 |      –       |        –
   3 |  1  1  1  0  1 |      –       |        –
   4 |  1  1  1  0  0 |      –       |        –
   5 |  1  1  0  1  1 |   −0.4255    |  0.99402
   6 |  1  1  0  1  0 |   −0.4255    |  9.9502 · 10⁻⁴
   7 |  1  1  0  0  1 |   −0.3006    |  9.9502 · 10⁻⁴
   8 |  1  1  0  0  0 |   −0.3006    |  9.9601 · 10⁻⁷
   9 |  1  0  1  1  1 |      –       |        –
  10 |  1  0  1  1  0 |      –       |        –
  11 |  1  0  1  0  1 |      –       |        –
  12 |  1  0  1  0  0 |      –       |        –
  13 |  1  0  0  1  1 |   −0.3006    |  9.9502 · 10⁻⁴
  14 |  1  0  0  1  0 |   −0.3006    |  9.9601 · 10⁻⁷
  15 |  1  0  0  0  1 |   −0.3006    |  9.9601 · 10⁻⁷
  16 |  1  0  0  0  0 |   −0.3006    |  9.9701 · 10⁻¹⁰
  17 |  0  1  1  1  1 |      –       |        –
  18 |  0  1  1  1  0 |   −0.1249    |  9.9502 · 10⁻⁴
  19 |  0  1  1  0  1 |      –       |        –
  20 |  0  1  1  0  0 |      0       |  9.9502 · 10⁻⁷
  21 |  0  1  0  1  1 |   −0.1247    |  9.9502 · 10⁻⁴
  22 |  0  1  0  1  0 |   −0.1249    |  9.9601 · 10⁻⁷
  23 |  0  1  0  0  1 |    0.0002    |  9.9601 · 10⁻⁷
  24 |  0  1  0  0  0 |      0       |  9.9701 · 10⁻¹⁰
  25 |  0  0  1  1  1 |      0       |  9.9502 · 10⁻⁴
  26 |  0  0  1  1  0 |      0       |  9.9601 · 10⁻⁷
  27 |  0  0  1  0  1 |      0       |  9.9601 · 10⁻⁷
  28 |  0  0  1  0  0 |      0       |  9.9701 · 10⁻¹⁰
  29 |  0  0  0  1  1 |      0       |  9.9601 · 10⁻⁷
  30 |  0  0  0  1  0 |      0       |  9.9701 · 10⁻¹⁰
  31 |  0  0  0  0  1 |      0       |  9.9701 · 10⁻¹⁰
  32 |  0  0  0  0  0 |      0       |  9.9801 · 10⁻¹³

EXAMPLE 2 Reconsider the full-adder shown in Figure 1, where
each component can only be normal or abnormal. Assume that the probability of faulty behaviour of a component is equal to P(Ac) = 0.001. Without any observations, the diagnosis space consists of 2⁵ = 32 members, where the diagnosis ∆∅ = {¬Ac | c ∈ COMPS} is the most probable diagnosis with probability P(∆∅) = (1 − P(Ac))⁵ = (0.999)⁵ ≈ 0.995. When more components are assumed to be faulty, the probabilities decrease quickly to very small values. Now, suppose that OBS = {i1, ī2, i3, o1, ō2}. The new probabilities obtained from GDE are shown in the right-most column of Table 1, where '1' for a component means normal behaviour and '0' means abnormal behaviour. The diagnoses ∆k, for k = 1, …, 4, 9, …, 12, 17 and 19, are eliminated by these observations. Furthermore, since there are no diagnoses in the set R that imply the two output observations, the set S is empty and, thus, the set of uncommitted diagnoses U is equal to R. Then, the posterior probability of a diagnosis ∆k can be computed as follows:

  P(∆k | OBS) = (P(∆k)/m) / (Σ_{∆C ∈ U} P(∆C)/m) = P(∆k) / Σ_{∆C ∈ U} P(∆C),

where Σ_{∆C ∈ U} P(∆C) ≈ 1.002 · 10⁻³. □

In the example, the ∆k's that can still be diagnoses become about 1000 times more likely when conditioning on the observations than without observations. However, either with or without observations, the diagnosis with the fewest number of abnormality assumptions is the most likely one. Thus the resulting diagnostic reasoning behaviour is very similar to that obtained by exploiting the concept of subset-minimal diagnosis.

2.3 Bayesian Networks and the Conflict Measure

Let P(X) be a joint probability distribution of the set of discrete binary random variables X. A single random variable taking the value 'true' or 'false' is written as y and ȳ, respectively. If we refer to arbitrary values of a set of variables X, sometimes a single variable, this will be denoted by (italic) x. Let U, W, Z ⊆ X be disjoint sets of random variables; then U is said to be conditionally independent of W given Z if, for each value u, w and z,

  P(u | w, z) = P(u | z), with P(w, z) > 0.      (3)

A Bayesian network B is defined as a pair B = (G, P), where G = (V, E) is an acyclic directed graph with set of vertices V and set of arcs E, and P is the associated joint probability distribution of the set of random variables X, which is associated 1–1 with V. We will normally use the same names for variables and their associated vertices. The factorisation of P respects the independence structure of G as follows: P(x) = ∏_{y ∈ x} P(y | π(y)), where π(y) denotes the values of the parent set of vertex Y. Finally, we will frequently make use of marginalising out particular variables W, written as P(u) = Σ_w P(u, w).

Bayesian networks specify probabilistic patterns that must be fulfilled by observations. Observations are random variables that obtain a value through an intervention, such as a diagnostic test. The set of observations is denoted by ω. The conflict measure has been proposed as a tool for the detection of potential conflicts between observations and a given Bayesian network and is defined as [5]:

  conf(ω) = log [ P(ω1)P(ω2) ··· P(ωm) / P(ω) ],      (4)

with ω = ω1 ∪ ω2 ∪ ··· ∪ ωm.

The interpretation of the conflict measure is as follows. A zero or negative conflict measure means that the denominator is equally likely or more likely than the numerator. This is interpreted as meaning that the joint occurrence of the observations is in accordance with the probabilistic patterns in P. A positive conflict measure, however, implies negative correlation between the observations and P, indicating that the observations do not match P very well.

[Figure 2 shows a Bayesian network with vertex u and arcs to v and n; P(u) = 0.2, P(v | u) = 0.8, P(v | ū) = 0.01, P(n | u) = 0.9, P(n | ū) = 0.1.]

Figure 2. Example of a Bayesian network.

EXAMPLE 3 Consider the Bayesian network shown in Figure 2, which describes that stomach ulcer (u) may give rise to both vomiting (v) and nausea (n). Now, suppose that a patient comes in with the symptoms of vomiting and nausea. The conflict measure then has the following value:

  conf({v, n}) = log [ P(v)P(n) / P(v, n) ] = log (0.168 · 0.26 / 0.1448) ≈ −0.5.

As the conflict measure assumes a negative value, there is no conflict between the two observations. This is consistent with medical knowledge, as we do expect that a patient with stomach ulcer displays symptoms of both vomiting and nausea. As a second example, suppose that a patient has only symptoms of vomiting. The conflict measure now obtains the following value:

  conf({v, n̄}) = log (0.168 · 0.74 / 0.0232) ≈ log 5.36 ≈ 0.7.

As the conflict measure is positive, there is a conflict between the two observations, which is in accordance with medical expectations. □
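The two values in this example can be checked with a few lines of code. The sketch below is our own encoding of the network of Figure 2 (the function names are ours); it computes the conflict measure with base-10 logarithms, which matches the numbers quoted above:

```python
from math import log10

# CPTs of the network in Figure 2: U -> V, U -> N.
P_u = 0.2
P_v_given = {True: 0.8, False: 0.01}   # P(v | u), P(v | not-u)
P_n_given = {True: 0.9, False: 0.1}    # P(n | u), P(n | not-u)

def joint(v, n):
    """P(V=v, N=n), marginalising over U."""
    total = 0.0
    for u, pu in ((True, P_u), (False, 1 - P_u)):
        pv = P_v_given[u] if v else 1 - P_v_given[u]
        pn = P_n_given[u] if n else 1 - P_n_given[u]
        total += pu * pv * pn
    return total

def marginal_v(v): return joint(v, True) + joint(v, False)
def marginal_n(n): return joint(True, n) + joint(False, n)

def conf(v, n):
    """conf(omega) = log [P(v)P(n) / P(v, n)] for omega = {V=v, N=n}."""
    return log10(marginal_v(v) * marginal_n(n) / joint(v, n))

print(round(conf(True, True), 2))    # -0.52: vomiting and nausea fit together
print(round(conf(True, False), 2))   # 0.73: vomiting without nausea conflicts
```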
2.4 Bayesian Diagnostic Problems

A Bayesian diagnostic system is denoted as a pair SB = (G, P), where P is a joint probability distribution of the vertices of G, interpreted as random variables, and G is obtained by mapping a logical diagnostic system SL = (SD, COMPS) to a Bayesian diagnostic system SB as follows [6]:
1. component c is represented by its input Ic and output Oc vertices, where inputs are connected by an arc to the output;
2. to each component c there belongs an abnormality vertex Ac, which has an arc pointing to the output Oc.
Figure 3 shows the Bayesian diagnostic system corresponding to the logical diagnostic system shown in Figure 1. Let O denote the set of all output variables and I the set of all input variables, let o and i denote (arbitrary) values of the set of output and input variables, respectively, and let δC = {ac | c ∈ C} ∪ {āc | c ∈ COMPS − C} be the set of values of the abnormality variables Ac, with c ∈ COMPS. The latter definition establishes a link between ∆C in logical diagnostic systems and the abnormality variables in Bayesian diagnostic systems.
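The two mapping rules can be turned into a small graph-construction routine. The sketch below is a hypothetical representation of our own (not the authors' code): components are given with their input vertices, and the routine emits the arc set of the resulting Bayesian diagnostic system.

```python
def to_bayesian_diagnostic_system(components):
    """components: dict mapping a component name to the list of its input
    vertices (circuit inputs such as 'I1', or outputs of other components
    such as 'OX1').  Returns the arcs of the resulting directed graph."""
    arcs = []
    for c, inputs in components.items():
        for i in inputs:
            arcs.append((i, f"O{c}"))    # rule 1: inputs point to the output Oc
        arcs.append((f"A{c}", f"O{c}"))  # rule 2: abnormality vertex Ac points to Oc
    return arcs

# The full adder of Figure 1, with connections as in the circuit:
full_adder = {
    "X1": ["I1", "I2"], "X2": ["OX1", "I3"],
    "A1": ["I1", "I2"], "A2": ["OX1", "I3"], "R1": ["OA1", "OA2"],
}
arcs = to_bayesian_diagnostic_system(full_adder)
print(("AX1", "OX1") in arcs, ("OX1", "OX2") in arcs)  # True True
```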
[Figure 3 shows a directed graph with input vertices I1, I2, I3, abnormality vertices AX1, AX2, AA1, AA2, AR1 and output vertices OX1, OX2, OA1, OA2, OR1.]

Figure 3. The graphical representation of a Bayesian diagnostic system corresponding to the full-adder in Figure 1.

Due to the independences that hold for a Bayesian diagnostic system, it is possible to simplify the computation of the joint probability distribution P by exploiting the following properties:

Property 1: the joint probability distribution of a set of output variables O can be factorised as follows:

  P(o) = Σ_{i,δC} P(i, δC) ∏_{c ∈ COMPS} P(oc | π(oc));      (5)

Property 2: the input variables and abnormality variables are mutually independent of each other; formally: P(i, δC) = P(i)P(δC).

Recall that logical diagnostic problems are logical diagnostic systems augmented with observations; Bayesian diagnostic problems are defined similarly. The input and output variables that have been observed are now referred to as Iω and Oω, respectively. The unobserved input and output variables will be referred to as Iu and Ou, respectively. The set of actual observations is then denoted by ω = iω ∪ oω. Thus, a Bayesian diagnostic problem PB = (SB, ω) consists of (i) a Bayesian diagnostic system representing the components, their behaviour and interaction, and (ii) a set of observations ω [4].

In Bayesian diagnostic problems, the normal behaviour of component c is expressed in a probabilistic setting by the assumption that a normally functioning component yields an output value with probability of either 0 or 1. Thus, P(oc | π(oc)) ∈ {0, 1} when the abnormality variable Ac ∈ π(Oc) takes the value 'false', i.e. is āc. For the abnormal behaviour of a component c it is assumed that the random variable Oc is conditionally independent of its parent set π(Oc) if component c is assumed to function abnormally, i.e. Ac takes the value 'true', written as: P(oc | π(oc)) = P(oc | ac). Thus, the fault behaviour of an abnormal component cannot be influenced by its environment. We use the abbreviation P(oc | ac) = pc. Note that this assumption is not made when a component is behaving normally, i.e. when āc holds.

3 CONFLICT-BASED DIAGNOSIS

There exists a 1–1 correspondence between a consistency-based diagnosis ∆C of a logical diagnostic problem PL and a δC for which it holds that P(ω | δC) ≠ 0, if PB is the result of the mapping described above applied to PL. The basic idea behind conflict-based diagnosis is that the conflict measure can be used to rank these consistency-based diagnoses (cf. [4]). We start with the definition of the diagnostic conflict measure.

Definition 1 (diagnostic conflict measure) Let PB = (SB, ω) be a Bayesian diagnostic problem. The diagnostic conflict measure, denoted by conf[PδC](·, ·), is defined for P(ω | δC) ≠ 0 as:

  conf[PδC](iω, oω) = log [ P(iω | δC) P(oω | δC) / P(iω, oω | δC) ],      (6)

with observations ω = iω ∪ oω.

Using the independence properties of Bayesian diagnostic problems we obtain [4]:

  conf[PδC](iω, oω) = log [ Σ_i P(i) Σ_{ou} ∏_c P(oc | π(oc)) / ( Σ_{iu} P(iu) Σ_{ou} ∏_c P(oc | π(oc)) ) ],

where π(Oc) may include input variables from I.

The diagnostic conflict measure can take positive, zero and negative values, each having a different diagnostic meaning. Note that the numerator of the diagnostic conflict measure is defined as the probability of the individual occurrence of the inputs and outputs, whereas the denominator is defined as the probability of the joint occurrence of the observations. Intuitively, if the probability of the individual occurrence of the observations is higher than that of the joint occurrence, then the observations do not support each other. Thus, more conflict between diagnosis and observations yields higher (more positive) values of the diagnostic conflict measure. This means that the sign of the diagnostic conflict measure, negative, zero or positive, can already be used to rank diagnoses in a qualitative fashion. This interpretation gives rise to the following definition.

Definition 2 ((minimal) conflict-based diagnosis) Let PB = (SB, ω) be a Bayesian diagnostic problem and let δC be a consistency-based diagnosis of PB (i.e. P(ω | δC) ≠ 0). Then, δC is called a conflict-based diagnosis if conf[PδC](ω) ≤ 0. A conflict-based diagnosis δC is called minimal if, for each conflict-based diagnosis δC′, it holds that conf[PδC](ω) ≤ conf[PδC′](ω).

In general, the diagnostic conflict measure has the important property that its value can be seen as the overall result of a local analysis of component behaviours under particular logical and probabilistic normality and abnormality assumptions. A smaller value of the diagnostic conflict measure is due to a higher likelihood of dependence between observations, and this indicates a better fit between observations and component behaviours. Consider the following example.
EXAMPLE 4 Reconsider the full-adder circuit example from Figure 1. Let, as before, ω = {i1, ī2, i3, o1, ō2}. The diagnostic conflict measures for all the possible diagnoses are listed in Table 1. As an example, the diagnostic conflict measures for the diagnoses δ5, δ6, δ7 and δ8 are compared to one another for the case that the probability pX1 = P(oX1 | aX1) = 0.001, and it is explained what it means that, according to Table 1, conf[Pδ5](ω) = conf[Pδ6](ω) < conf[Pδ7](ω) = conf[Pδ8](ω).

First, the diagnoses δk, for k = 6 and k = 7, will be considered in more detail in order to explain the meaning of the diagnostic conflict measure. The difference in value of the diagnostic conflict measure for these two diagnoses can be explained by noting that for δ6 it is assumed that the adder A1 functions normally and A2 abnormally, whereas for δ7 it is the other way around. The diagnostic conflict measure of the diagnosis δ6 is lower than that for δ7, because if A1 functions normally, then its output has to be equal to 0, whereas if A2 functions normally, then its output has to be equal to 1. Note that it has been observed for R1 that the output is equal to 0. Because 0 is the output of the OR gate R1, its inputs must be 0; therefore, the assumption that A1 functions normally with output 0 offers a better explanation for the output 0 of the R1 gate than the assumption in δ7 that A2 functions normally (which yields output value 1). Furthermore, since in both diagnoses δ6 and δ7 component X1 is assumed to be faulty, and the output of X1 acts as the input of A2, the assumption about the output of A2 is already relaxed. This also explains the preference of diagnosis δ6 above δ7 and why δ6 is ranked higher than δ7.

Next, the diagnoses δ7, δ8, δ13, δ14, δ15, and δ16 are compared to one another and we explain why it is reasonable that these diagnoses have the same diagnostic conflict measure value (−0.3006). Note that both diagnoses δ7 and δ8 include the faulty assumptions aX1 and aA1, and δ13, δ14, δ15 and δ16 include the faulty behaviours aR1 and aX1. Note that for both {aX1, aA1} and {aR1, aX1}, one input of X2 and the two inputs of R1 are relaxed. Therefore, they yield the same qualitative information about fault behaviour of the system. Below, these results are compared with those by GDE. □

The example above illustrates that comparing the value of the diagnostic conflict measure for different diagnoses gives considerable insight into the behavioural abnormality of a system.
4 COMPARISON

In this section, the diagnostic conflict measure and GDE's probabilistic method are compared to each other in terms of the difference in ranking they give rise to. To start, the main differences between the diagnostic conflict measure and GDE are summarised, which is followed by an example. The example is used to illustrate that the diagnostic conflict measure yields a ranking that, for the probability distribution defined earlier, conveys more useful diagnostic information than the ranking by GDE. The following facts summarise the differences and similarities between the diagnostic conflict measure and GDE:
1. an abnormality assumption ∆C is a diagnosis according to GDE iff its associated diagnostic conflict measure is defined, i.e. [4] P(ω | δC) ≠ 0 ⇔ SD ∪ ∆C ∪ OBS ⊭ ⊥;
2. computation of the diagnostic conflict measure requires the conditional probability pc = P(oc | ac), i.e. the probability that the component's output is oc when the component is faulty; this probability is assumed to be always 0 or 1 by GDE;
3. in GDE the probability P(ac), i.e. the probability that component c functions abnormally, acts as the basis for ranking diagnoses; this probability is not needed to rank diagnoses using conflict-based diagnosis, because it is summed out in the computation of the diagnostic conflict measure;
4. the ranking of a conflict-based diagnosis is based on a local analysis of interactions between inputs and outputs of components, taking into account the probability of particular faulty behaviours of components, and thus can be interpreted as a measure of how well the diagnosis, observations and system behaviour match; GDE offers nothing that is to some extent similar;
5. in GDE, assuming more components to be functioning abnormally renders a diagnosis less likely, as proved in Proposition 1; a similar property does not hold for conflict-based diagnosis using the diagnostic conflict measure.
All properties above have already been discussed extensively. Therefore, only the last issue is illustrated.

EXAMPLE 5 Consider the Bayesian diagnostic problem discussed above. Table 1 summarises the results of GDE and conflict-based diagnosis, which makes it easier to compare the results. Note that δk ≡ ∆k and ω ≡ OBS. Consider again the Bayesian diagnostic problem PB with set of observations ω = {i1, ī2, i3, o1, ō2} and the two diagnoses δ5 = δ{X1} = {āX2, āR1, aX1, āA1, āA2} and δ6 = δ{X1,A2} = {āX2, āR1, aX1, āA1, aA2}. According to Table 1, the posterior probabilities computed by GDE are equal to P(∆5 | OBS) = 0.99402 and P(∆6 | OBS) = 9.9502 · 10⁻⁴. Thus, ∆5 is much more likely than ∆6, which is due to the inclusion of an extra abnormality assumption in ∆6 in comparison to ∆5. Consequently, the ranking obtained is compatible with subset-minimality. However, using the diagnostic conflict measure gives, according to Table 1, for both diagnoses the value of −0.4255. This means that relaxing one extra logical and probabilistic constraint, i.e. A2 in addition to X1, has no effect on the likelihood of the diagnosis in this case. Next, consider the diagnoses ∆7 and ∆6, which both have the same number of components assumed to be abnormal, and thus obtain the same ranking according to GDE. However, δ6 and δ7 have a different diagnostic conflict measure, as explained in Example 4. □

This example again illustrates that GDE and conflict-based diagnosis rank diagnoses differently. Conflict-based diagnosis really looks into the system behaviour and, based on a local analysis of the strength of the various constraints, comes up with a ranking.
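The GDE numbers quoted from Table 1 can be reproduced by brute force. The following sketch is our own encoding of the full adder under the weak fault model (helper names are ours, not the authors'); it enumerates the diagnosis space, keeps the consistent candidates, and normalises their priors as in Equation (1), where S = ∅ so the factor m cancels out:

```python
from itertools import combinations, product

GATES = ["X1", "X2", "A1", "A2", "R1"]
P_AB = 0.001  # prior probability of abnormal behaviour, P(Ac)

def consistent(abnormal, i1=1, i2=0, i3=1, o1=1, o2=0):
    """Weak fault model: outputs of abnormal gates are unconstrained."""
    for free in product([0, 1], repeat=len(abnormal)):
        val = dict(zip(abnormal, free))
        x1 = val.get("X1", i1 ^ i2)
        a1 = val.get("A1", i1 & i2)
        a2 = val.get("A2", x1 & i3)
        if val.get("X2", x1 ^ i3) == o1 and val.get("R1", a1 | a2) == o2:
            return True
    return False

def prior(abnormal):
    return P_AB ** len(abnormal) * (1 - P_AB) ** (5 - len(abnormal))

# R: remaining diagnoses; here S is empty, so U = R and m cancels out.
R = [frozenset(c) for r in range(6) for c in combinations(GATES, r)
     if consistent(list(c))]
total = sum(prior(d) for d in R)                   # ~ 1.002e-3, as in Example 2
posterior = {d: prior(d) / total for d in R}
print(len(R))                                      # 22
print(round(posterior[frozenset({"X1"})], 5))      # 0.99402
```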
5 CONCLUSION AND FUTURE WORK Conflict-based diagnosis is a new concept in the area of model-based diagnosis that has been introduced recently [4]. In this paper, we have compared this new method with the well-known probabilistic method employed in GDE. It was shown that the probabilistic method underlying conflict-based diagnosis yields detailed insight into the behaviour of a system. As the obtained information differs from information obtained from GDE, it may be useful as an alternative or complementary method. In the near future, we intend to implement the method as part of a diagnostic reasoning engine in order to build up experience with regard to the practical usefulness of the method.
REFERENCES
[1] L. Console and P. Torasso. A spectrum of logical definitions of model-based diagnosis. Computational Intelligence, 7:133–141, 1991.
[2] J. de Kleer and B.C. Williams. Diagnosing multiple faults. Artificial Intelligence, 32:97–130, 1987.
[3] J. de Kleer, A.K. Mackworth, and R. Reiter. Characterizing diagnoses and systems. Artificial Intelligence, 56:197–222, 1992.
[4] I. Flesch, P.J.F. Lucas and Th.P. van der Weide. Conflict-based diagnosis: Adding uncertainty to model-based diagnosis. Proc. IJCAI-2007, pp. 380–388, 2007.
[5] F.V. Jensen. Bayesian Networks and Decision Graphs. Springer-Verlag, New York, 2001.
[6] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Francisco, CA, 1988.
[7] D. Poole, R. Goebel and R. Aleliunas. A logical reasoning system for defaults and diagnosis. In: The Knowledge Frontier, eds. N. Cercone and G. McCalla, Springer-Verlag, pp. 331–352, 1987.
[8] R. Reiter. A theory of diagnosis from first principles. Artificial Intelligence, 32:57–95, 1987.
On computing minimal conflicts for ontology debugging

Kostyantyn Shchekotykhin and Gerhard Friedrich1 and Dietmar Jannach2

Abstract. Ontology debugging is an important stage of the ontology life-cycle and supports a knowledge engineer during the ontology development and maintenance processes. Model-based diagnosis is the basis of many recently suggested ontology debugging methods. The main difference between the proposed approaches is the method of computing the required conflict sets, i.e., sets of axioms such that at least one axiom of each set must be changed (or removed) to make the ontology coherent. Conflict set computation is, however, the most time-consuming part of the debugging process. Consequently, the choice of an efficient conflict set computation method is crucial for ensuring the practical applicability of an ontology debugging approach. In this paper we evaluate and compare two popular minimal conflict computation methods: QUICKXPLAIN and SINGLEJUST. First, we analyze the best and worst cases of the required number of coherency checks of both methods on a theoretical basis, assuming a black-box reasoner. Then, we empirically evaluate the run-time efficiency of the algorithms in both black-box and glass-box settings. Although both algorithms were designed to view the reasoner as a black box, the exploitation of specific knowledge about the reasoning process (glass-box) can significantly speed up the run-time performance in practical applications. Therefore, we present modifications of the original algorithms that can also exploit specific data from the reasoning process. Both a theoretical analysis of best- and worst-case complexity as well as an empirical evaluation of run-time performance show that QUICKXPLAIN is preferable over SINGLEJUST.
1 MOTIVATION

With an increasing number of applications that rely on ontologies, these knowledge bases are getting larger and more complex. Thus, corresponding knowledge bases can include definitions of thousands of concepts and roles from different domains. RDF search engines like Watson [2], for instance, facilitate the creation of composite ontologies that reuse the definitions of concepts and roles published on the Web. Moreover, the community of ontology users is getting more heterogeneous and nowadays includes many members from various industrial and scientific fields. Hence, different faults can easily be introduced during the creation and maintenance of ontologies. Recent debugging methods as described in [4, 8, 9, 11] help the user to localize, understand, and correct faults in ontologies and are already implemented in popular ontology development tools like Protégé3 or Swoop4. All currently suggested approaches for ontology debugging aim at the automated computation of a set of changes to the ontology that restore the coherence of its terminology (diagnosis). In order to accomplish this task efficiently, current diagnosis approaches are based on the computation of axiom subsets that define an incoherent terminology (conflict sets).

Diagnosis techniques: Currently, two approaches are used for the computation of diagnoses in ontology debugging: Pinpointing [12] and Reiter's model-based diagnosis (MBD) [10]. Pinpoints are used to avoid the computation of minimal hitting sets of conflict sets by approximating minimal diagnoses by their supersets. However, the pinpoints themselves are computed on the basis of all minimal conflicts. In contrast, in MBD approaches (minimal) conflicts are computed on demand and diagnoses are computed with increasing cardinality by constructing a hitting-set tree (HSTREE). Consequently, this method will find those diagnoses first that suggest minimal changes, and it avoids both the computation of very implausible multi-fault diagnoses and the costly computation of the set of all minimal conflicts. Note that Reiter's original proposal does not work correctly for non-minimal conflicts [5] and shows limited performance in this case. A modified diagnosis method which avoids these deficits was, however, introduced in [4]. The general question whether pinpoints or leading diagnoses are more appropriate as an output of the debugging process is still open.

Conflict computation: Current approaches like SINGLEJUST [8] and QUICKXPLAIN [4] treat the underlying reasoner either as a black-box or as a glass-box. In (invasive) glass-box approaches the developer of the debugging method can exploit specifics of the theorem prover. In [9], for instance, a conflict computation approach was proposed which requires modifications of existing reasoning algorithms, as its aim is to compute sets of conflicts during the reasoning process as efficiently as possible. The main drawback of such glass-box approaches, however, is that they can be used only for a particular description logic [1], like SHOIN(D) [9]. Hence, only a particular reasoner (or even a version of a reasoner) and a particular type of logic can be used. Moreover, glass-box modifications to reasoning systems often remove existing optimizations and thus are typically slower than their non-modified analogues. In addition, glass-box approaches to conflict set minimization do not guarantee the minimality of the returned conflict sets, and further (black-box) minimization is required [8]. On the other hand, black-box algorithms are completely independent of the reasoning process and just use the boolean outputs of the theorem prover. These algorithms are therefore logic-independent and can exploit the full power of highly optimized reasoning methods. Still, in the case of an unsatisfiable set of axioms, all the axioms are considered as a conflict since no further information is available. Conflicts are typically minimized by additional calls to a theorem prover. In order to make black-box approaches applicable in cases where theorem proving is expensive, the number of such calls must
1 University Klagenfurt, Austria, email: [email protected]
2 Dortmund University of Technology, Germany, email: dietmar.jannach@udo.edu
3 http://www.co-ode.org
4 http://code.google.com/p/swoop/
QUICKXPLAIN. This algorithm (listed in Figure 1) takes two parameters as input: B, a set of axioms that are considered correct by a knowledge engineer, and C, a set of axioms which should be analyzed by the algorithm. QUICKXPLAIN follows a divide-and-conquer strategy and splits the input set of axioms C into two subsets C1 and C2 on each recursive call. If the conflict set is a subset of either C1 or C2, the algorithm significantly reduces the search space. If, for instance, the splitting function is defined as split(n) = n/2, then the search space is halved with just one call to the reasoner. Otherwise, the algorithm re-adds some axioms ax ∈ C2 to C1. With the splitting function defined above, the algorithm will add half of all axioms of the set C2. The choice of the splitting function is crucial since it affects the number of required coherency checks. The knowledge engineer can define a very effective splitting function for a concrete problem, e.g., if there exists some a priori knowledge about the faulty axioms of an ontology. However, in the general case it is recommended to use the function that splits the set C of all axioms into two subsets of the same size, since the path from the root of the recursion tree to a leaf will then contain at most log2 n nodes. Thus, if the cardinality of the searched minimal conflict set is |CS| = k, then in the best case, i.e. when all k elements belong to a single subset C1, the number of required coherency checks is log2(n/k) + 2k. The worst case for QUICKXPLAIN is observed when the axioms of a minimal conflict set always belong to different sets C1 and C2, e.g., if a minimal conflict set has two axioms and one is positioned at the first place of the set C and the other one at the last. In this case the number of coherency checks is 2k(log2(n/k) + 1) [6]. Note that we modified the original QUICKXPLAIN algorithm (Figure 1) such that it can be used with both black- and glass-box approaches.
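To make the two bounds above concrete, the following sketch evaluates them for a sample ontology size (Python; the function names and the example values n = 1000, k = 8 are illustrative, and split(n) = n/2 is assumed as in the text).

```python
import math

def qx_best_case(n, k):
    # best case: all k conflict axioms end up in one split half
    # -> log2(n/k) + 2k coherency checks
    return math.log2(n / k) + 2 * k

def qx_worst_case(n, k):
    # worst case: conflict axioms always split across C1 and C2
    # -> 2k(log2(n/k) + 1) coherency checks
    return 2 * k * (math.log2(n / k) + 1)

# example: ontology with n = 1000 axioms, minimal conflict of size k = 8
print(round(qx_best_case(1000, 8)))   # -> 23
print(round(qx_worst_case(1000, 8)))  # -> 127
```

The narrow gap between the two bounds for a fixed k is what gives QUICKXPLAIN the small interval of possible check counts discussed below.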
The algorithm issues two types of calls to a reasoner, isCoherent(T) and getConflictSet_glassBox(). The first function returns true if the given terminology T is coherent. getConflictSet_glassBox() returns a set of axioms AX that are responsible for the incoherence (CS ⊆ AX). This function can only be used if the reasoner supports glass-box debugging. If this is the case, the reasoner will be able to return the set AX which was generated during the reasoning process. If only black-box usage is possible, then a practical implementation of an ontology debugger should override this function with one that returns an empty set. In this case the modified algorithm is equal to the original one given in [6]. Moreover, the first part of the algorithm (lines 1-4) is required to check whether an ontology is actually incoherent. This check is required for two reasons. First, the result of conflict set computation for an already coherent ontology using a reasoner as a black-box will include all axioms of this ontology; second, the feedback of a glass-box reasoner executed at this stage can significantly reduce the search space of QUICKXPLAIN. The same also holds for SINGLEJUST.
be minimized. In current systems, two main approaches for conflict set computation are used: SINGLEJUST [8] and QUICKXPLAIN [4]. In general, both of them can be used in glass-box and black-box settings. In this paper we show that QUICKXPLAIN is preferable over SINGLEJUST in both settings, based on a theoretical analysis of the best and worst cases and an empirical performance evaluation for a simulated average case. In addition, we propose modifications to the original algorithms to further improve the run-time performance in glass-box settings. The remainder of the paper is organized as follows: Section 2 provides a theoretical study of conflict set computation methods and includes a brief description of the main algorithms as well as an analysis of extreme cases. In Section 3 we present the results of an empirical evaluation of QUICKXPLAIN and SINGLEJUST in both black- and glass-box settings. The paper closes with a discussion of future work.
2 COMPUTING MINIMAL CONFLICTS
We will focus on the comparison of two popular algorithms, QUICKXPLAIN [6] and SINGLEJUST [8]. The presented comparison is possible because the application scenarios and strategies of these algorithms are similar. Both methods are designed to compute only one minimal conflict set per execution. The combinations QUICKXPLAIN + HSTREE [4] and SINGLEJUST + HSTREE (also referred to as ALLJUST) [8] are used to obtain a set of minimal diagnoses or to enumerate minimal conflict sets (justifications). Therefore, QUICKXPLAIN and SINGLEJUST can be compared both theoretically and empirically.

Algorithm: QUICKXPLAIN(B, C)
Input: trusted knowledge B, set of axioms C
Output: minimal conflict set CS
(1) if isCoherent(B ∪ C) or C = ∅ return ∅;
(2) AX ← getFaultyAxioms(C);
(3) if AX ≠ ∅ then C ← C ∩ AX;
(4) return computeConflict(B, B, C)

function computeConflict(B, ∆, C)
(5) if ∆ ≠ ∅ and not isCoherent(B) then return ∅;
(6) if |C| = 1 then return C;
(7) int n ← |C|; int k ← split(n);
(8) C1 ← {ax1, ..., axk} and C2 ← {axk+1, ..., axn};
(9) CS1 ← computeConflict(B ∪ C1, C1, C2);
(10) if CS1 = ∅ then C1 ← getFaultyAxioms(C1);
(11) CS2 ← computeConflict(B ∪ CS1, CS1, C1);
(12) return CS ← CS1 ∪ CS2;
SINGLEJUST. This algorithm (see Figure 2) follows an expand-and-shrink strategy and has two main loops. The first one creates a set CS that includes all axioms of the minimal conflict set, and the second one minimizes CS by removing axioms that do not belong to the minimal conflict set. The algorithm includes two functions, select and fastPruning, that can be tuned to improve its performance. The first function starts by selecting a predefined number of axioms num from the given set. The number of axioms that are selected can grow by a certain factor f (see [7]). The fastPruning function implements a pruning strategy for CS with a sliding-window technique. The pruning algorithm takes the size of the window
function getFaultyAxioms(C)
(13) AX ← getConflictSet_glassBox();
(14) if AX = ∅ then return C;
(15) else return AX;

Figure 1. Generalized QUICKXPLAIN algorithm
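For experimentation, the divide-and-conquer scheme of Figure 1 can be sketched in a black-box form as follows (Python; the glass-box getFaultyAxioms feedback is omitted, isCoherent is abstracted as a caller-supplied predicate, and the toy coherency check in the example is purely illustrative).

```python
def quickxplain(background, candidates, is_coherent):
    """Return one minimal conflict among `candidates`, given trusted
    `background` axioms and a black-box coherency check."""
    if not candidates or is_coherent(background + candidates):
        return []  # nothing to analyze, or already coherent

    def compute(b, delta, c):
        # if the last extension `delta` already made b incoherent,
        # the whole conflict lies inside b and nothing of c is needed
        if delta and not is_coherent(b):
            return []
        if len(c) == 1:
            return list(c)
        k = len(c) // 2          # split(n) = n/2
        c1, c2 = c[:k], c[k:]
        cs1 = compute(b + c1, c1, c2)
        cs2 = compute(b + cs1, cs1, c1)
        return cs1 + cs2

    return compute(background, background, candidates)

# toy example: axioms are integers; the "ontology" is incoherent
# exactly when axioms 2 and 6 are both present (a 2-axiom conflict)
toy_coherent = lambda axioms: not {2, 6} <= set(axioms)
print(sorted(quickxplain([], list(range(1, 9)), toy_coherent)))  # [2, 6]
```

Each recursive call either discards a whole half of C or narrows the search to it, which is the source of the logarithmic factor in the check counts discussed above.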
Algorithm: SINGLEJUST(B, C)
Input: trusted knowledge B, set of axioms C
Output: minimal conflict set CS

(1) if isCoherent(B ∪ C) or C = ∅ return ∅;
(2) AX ← getFaultyAxioms(C);
(3) if AX ≠ ∅ then C ← C ∩ AX;
(4) return computeConflict(B, C)

function computeConflict(B, C)
(5) CS ← B;
(6) do
(7) CS ← CS ∪ select(C \ CS);
(8) while (isCoherent(CS));
(9) CS ← fastPruning(getFaultyAxioms(CS));
(10) for each ax ∈ CS do
(11) CS ← CS \ {ax}
(12) if isCoherent(CS) then CS ← CS ∪ {ax};
(13) else CS ← getFaultyAxioms(CS);
(14) return CS;

Figure 3. Intervals for the numbers of possible coherency checks required to identify a minimal conflict set of cardinality k = 8 in an ontology of n axioms. QUICKXPLAIN parameters: split(n) = n/2. SINGLEJUST: number of axioms on the first iteration num = 50, increment factor f = 1.25, window size window = 10.
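The expand-and-shrink strategy of SINGLEJUST, including the sliding-window pruning, can likewise be sketched in black-box form (Python; the glass-box getFaultyAxioms call is omitted, the chunk-growth rule is a simplified reading of the num/f scheme described in the text, and all names are illustrative).

```python
def fast_pruning(cs, is_coherent, window):
    # partition cs into parts of size `window`; drop every part whose
    # removal leaves the set incoherent (the part is not needed)
    pruned = list(cs)
    for i in range(0, len(cs), window):
        part = set(cs[i:i + window])
        trial = [ax for ax in pruned if ax not in part]
        if not is_coherent(trial):
            pruned = trial
    return pruned

def single_just(background, candidates, is_coherent,
                num=2, f=1.25, window=2):
    if not candidates or is_coherent(background + candidates):
        return []
    # expand: grow CS in chunks (num, then roughly num*f, ...) until incoherent
    cs, rest, chunk = list(background), list(candidates), num
    while is_coherent(cs) and rest:
        cs, rest = cs + rest[:chunk], rest[chunk:]
        chunk = int(chunk * f) + 1
    # shrink: sliding-window pruning, then per-axiom minimization
    cs = fast_pruning(cs, is_coherent, window)
    for ax in list(cs):
        trial = [a for a in cs if a != ax]
        if not is_coherent(trial):   # still incoherent: ax is not needed
            cs = trial
    return cs

toy_coherent = lambda axioms: not {2, 6} <= set(axioms)
print(sorted(single_just([], list(range(1, 9)), toy_coherent)))  # [2, 6]
```

Note how the final per-axiom loop guarantees minimality regardless of how coarse the pruning step was, at the cost of one coherency check per remaining axiom.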
ter Concept and rewrite the coherency-checking function such that isCoherent(C, Concept) returns false if Concept is unsatisfiable with respect to the terminology defined by a set of axioms C. Otherwise this function should return true. The algorithms presented in Figures 1 and 2 can also exploit the structural relations between axioms by means of specifically implemented functions split, select and fastPruning. One can thus, for instance, select and/or partition axioms so that axioms with intersecting sets of concepts are considered first.
Figure 2. Generalized SINGLEJUST algorithm
window and the set of axioms CS as input, and outputs a set of axioms CS′ ⊆ CS. In the form in which it was implemented in the OWL-API5, the pruning algorithm partitions the input set CS with n axioms into p = n/window parts Pi, i = 1, ..., p, and then sequentially tests the coherency of each set CSi = CS \ Pi, i = 1, ..., p. Note also that the OWL-API includes two variants of the pruning method, one with constant and one with shrinking window size. In the further analysis and evaluation we consider only the variant with the constant window size. Let us consider the best and worst cases for SINGLEJUST. In the best case, all axioms of a minimal conflict set CS belong to some partition set Pi. Thus, given an axiom set C of cardinality n that contains a minimal conflict set CS of cardinality k, the algorithm will make at most 1 + p + min(window, num) coherency checks. In the worst case, the first iteration will require at least log_f(1 − (1 − f)·n/num) coherency checks, and the sliding window and final minimization require p + min(k/p, 1)·n checks, if all k axioms of the minimal conflict set belong to different partitions Pi. The theoretical analysis of the two algorithms thus shows that QUICKXPLAIN has a smaller interval of possible numbers of coherency checks in comparison to SINGLEJUST (see Figure 3)6. Note also that the interaction with the reasoner used in SINGLEJUST in Figure 2 is organized in the same way as in QUICKXPLAIN, i.e., by means of the functions isCoherent and getFaultyAxioms. However, if a black-box approach to ontology debugging is used, the modified algorithm presented in Figure 2 is equal in terms of the number of coherency checks to the original one suggested in [8]. Moreover, both generalized algorithms can also be used to detect conflict sets that cause the unsatisfiability of a certain concept. This is possible if we introduce one more input parame-
3 EMPIRICAL EVALUATION
The theoretical analysis of the algorithms showed that QUICKXPLAIN is preferable over SINGLEJUST since it has a much lower variation in the number of required reasoner calls. Nevertheless, the extreme conditions of the discussed best and worst cases are rather specific. Therefore, an analysis of the average case has to be done in order to make the comparison complete. However, evaluating this case is problematic, since there are no publicly available collections of incoherent ontologies published on the Web that are suitable for such tests. Moreover, there is no a priori knowledge on the distribution of conflicts. In other words, we do not know how the faulty axioms are most often positioned in an ontology. Therefore, we simulated the occurrence of faults in ontologies in order to obtain a measure of the numbers of coherency checks required by QUICKXPLAIN and SINGLEJUST. These statistics can then be used to calculate the average number of required coherency checks. Moreover, for our purposes it is enough to generate and then compute only one conflict set, since none of the analyzed algorithms can improve its performance on subsequent executions by using data from previous runs. The test case generation method was designed under the following assumptions: (1) All axioms have the same probability of being part of a conflict set (uniform distribution). Thus, for an ontology with n axioms, the probability for each axiom to be a source of a conflict is 1/n. (2) The cardinalities of minimal conflict sets follow the binomial distribution with the number of trials t equal to the maximal length of the dependency path from an axiom selected according to the first assumption to all axioms that directly or indirectly depend on concepts declared in the selected axiom. The value t corresponds to the maximum possible cardinality of a conflict that can be generated for the selected axiom. The success probability, which is the second
5 Unfortunately, the authors of the SINGLEJUST algorithm did not provide a specification of the fast pruning method in either [7] or [8]. Therefore, we analyzed the OWL-API (http://owlapi.sourceforge.net/, checked on June 7, 2008) implementation that is referred to by the authors in [8].
6 The values of the SINGLEJUST parameters were taken from the OWL-API implementation.
Figure 4. Average number of consistency checks for QUICKXPLAIN and SINGLEJUST using the reasoner as a black-box

Figure 5. Average running times for black-box QUICKXPLAIN and SINGLEJUST
we cannot predict the number of axioms that will be returned by a glass-box method on each iteration of the conflict computation algorithm. The computed conflict set can include extra axioms that do not belong to the searched minimal conflict set, because of nondeterminism in the reasoning algorithm (e.g., for max-cardinality restrictions) or because of special features of the tracing algorithm itself. Therefore, our empirical evaluation of different combinations of QUICKXPLAIN and SINGLEJUST is based on two different glass-box implementations: invasive [9] and naïve non-invasive. In general, all glass-box methods implement tracing of the axioms that were used by the reasoner to prove the unsatisfiability of a concept. The invasive method first stores all correspondences between source axioms and internal data structures of the reasoner and tracks the changes in the internal data structures during the normalization and absorption phases (see [9] for details). In [9], it is also suggested to add tracing to the SHOIN(D) tableaux expansion rules to enable very precise tracking of axioms, so that the resulting axiom set will be as small as possible. The main drawback of this approach, however, is that such a modification disables many key optimizations which are critical for the excellent performance of modern OWL reasoners [8]. In the non-invasive approach that we developed for our evaluation, we only track which concepts were unfolded by the reasoner and then search, using the OWL-API, for all axioms in which these concepts are defined. This method does not analyze the details of the reasoning process, and thus the resulting set of axioms equals the set returned by the invasive method only in the best case. However, such an approach can have a shorter execution time, since it does not require changes in the optimized reasoning process except for the insertion of a logging method for unfolded concepts.
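The non-invasive idea — log which concepts the reasoner unfolded, then map them back to their defining axioms — reduces to a simple lookup. The sketch below is a schematic illustration only: the axiom representation and the defines relation are invented for the example and are not OWL-API calls.

```python
def axioms_for_unfolded(unfolded_concepts, axioms, concepts_defined_by):
    """Collect every axiom that defines at least one concept the
    reasoner unfolded while proving unsatisfiability."""
    unfolded = set(unfolded_concepts)
    return [ax for ax in axioms if concepts_defined_by(ax) & unfolded]

# schematic ontology: each axiom is (id, set of concepts it defines)
axioms = [("ax1", {"A"}), ("ax2", {"B", "C"}), ("ax3", {"D"})]
defined = lambda ax: ax[1]

hits = axioms_for_unfolded({"B", "D"}, axioms, defined)
print([ax_id for ax_id, _ in hits])  # ['ax2', 'ax3']
```

The returned set over-approximates the minimal conflict exactly as described above, which is why the black-box minimization step is still applied afterwards.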
Pellet 1.5.1 already includes an implementation of the invasive method (explanations of clashes) and can also be configured to turn on the logging which is required for the non-invasive method. The only modification to the reasoner was to add fail-fast behavior to the satisfiability check. By default, Pellet searches for all unsatisfiable concepts. However, for the minimal conflict set computation algorithm it is enough to find just one such concept, since in this case the terminology is already incoherent. We performed the tests of the glass-box methods using the same test bundle that was used for the black-box tests. The evaluation shows that QUICKXPLAIN is faster in both approaches (see Figures 6 and 7). When using the feedback from the glass-box satisfiability check, QUICKXPLAIN performed better than SINGLEJUST in all
parameter of the distribution, is set to 1/t. Hence, minimal conflict sets of smaller cardinality are more likely to appear. The process starts with the generation of a uniformly distributed number i in the interval [1, n]. This number corresponds to an axiom axi that is the initial axiom of a conflict set. Then the algorithm queries a reasoner for the taxonomy of the given ontology to find all axioms that contain concept definitions that are either directly or indirectly dependent on a concept defined in axi. The length of the longest dependency path t is then used to generate the value c, which corresponds to the minimal cardinality of a conflict set according to the second assumption. Next, we retrieve all axioms which define concepts that subsume one of the concepts declared in axi such that the subsumption is made over the c − 1 other concepts (C1 ⊑ ... ⊑ Cc−1 ⊑ Cc). If more than one axiom is retrieved, then we randomly select one of them (denoted axj). Both axioms are modified to create a conflict (e.g., by inserting a new concept definition in axi and its negation in axj). Thus, the generation method produces faults that correspond to the local or propagated causes of unsatisfiability that were observed by Wang et al. [14] in many real-world ontologies. Note that the real cardinality of a minimal conflict is unknown prior to the execution of a conflict computation algorithm, since we do not investigate all possible dependencies of the modified axioms. In the tests we used Pellet 1.5.1⁷ as a reasoner and the SSJ statistical library⁸ to generate the required random numbers. As can be seen in Figure 4 and Figure 5, in the average case (after 100 simulations) QUICKXPLAIN outperformed SINGLEJUST in all eight test ontologies: MyGrid (8179 axioms), Sweet-JPL (3833 axioms), BCS3 (432 axioms), Galen (3963 axioms), MGED Ontology (236), Bike9 (215), Gene Ontology (1759) and Sequence Ontology (1745). In this test we measured both the number of checks and the elapsed time.
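The two sampling assumptions of the test-case generator can be sketched as a small routine (Python; the dependency-path computation is stubbed out with a given t, since it would require a reasoner, and the function name is illustrative).

```python
import random

def sample_fault(n, max_path_length, rng):
    """Sample a seed axiom index (uniform over n axioms) and a target
    conflict cardinality c ~ Binomial(t, 1/t), per assumptions (1)-(2)."""
    i = rng.randint(1, n)       # assumption (1): uniform seed axiom
    t = max_path_length         # longest dependency path from axiom i
    c = sum(rng.random() < 1.0 / t for _ in range(t))  # assumption (2)
    return i, c

rng = random.Random(42)
samples = [sample_fault(1000, 10, rng) for _ in range(5000)]
# Binomial(t, 1/t) has mean 1, so small cardinalities dominate
mean_c = sum(c for _, c in samples) / len(samples)
print(mean_c)
```

The mean cardinality hovers around 1 regardless of t, matching the observation that small conflicts are the most likely.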
Note also that the results that we obtained when using Pellet can in general also be transferred to other reasoners, which have been shown to have comparable performance in these settings [13]. All experiments were performed on a MacBook Pro (Intel Core Duo) 2 GHz with 2 GB RAM and a 1.5 GB maximum Java memory heap size. Besides using the reasoner as a black-box, both variants of QUICKXPLAIN and SINGLEJUST can also be used in glass-box settings. However, the theoretical analysis of these cases is not trivial, since
7 http://pellet.owldl.com/
8 http://www.iro.umontreal.ca/~simardr/ssj/indexe.html
Figure 6. Average running times for non-invasive glass-box
REFERENCES
Figure 7.
work, it would be useful if ontology editors like Protégé or Swoop would support anonymous user feedback for debugging purposes, such as statistics on the number of conflict sets and their average cardinality. This data would help to make even more precise evaluations of the average case. Finally, note that in the context of this work we generally understand axioms as valid description logic statements of any kind. Concept definitions whose left-hand sides are atomic are the most frequent form of axioms, since available ontology editors mainly support this presentation. However, if the terminology includes axioms with a different structure, approaches like those presented in [3, 7] can be used to transform the axioms. These approaches support fine-grained debugging of ontologies, which will allow us to locate faults within parts of the original axioms. Although the evaluation in this paper was limited to the more coarse-grained case, the conflict computation techniques can also be applied to the fine-grained debugging approaches.
[1] The Description Logic Handbook: Theory, Implementation, and Applications, eds., Franz Baader, Diego Calvanese, Deborah L. McGuinness, Daniele Nardi, and Peter F. Patel-Schneider, Cambridge University Press, 2003.
[2] Mathieu d'Aquin, Claudio Baldassarre, Laurian Gridinoc, Sofia Angeletou, Marta Sabou, and Enrico Motta, 'Watson: A gateway for next generation semantic web applications', in Poster session of the International Semantic Web Conference, ISWC, (2007).
[3] Gerhard Friedrich, Stefan Rass, and Kostyantyn Shchekotykhin, 'A general method for diagnosing axioms', in DX'06 - 17th International Workshop on Principles of Diagnosis, eds., C.A. Gonzalez, T. Escobet, and B. Pulido, pp. 101–108, Penaranda de Duero, Burgos, Spain, (2006).
[4] Gerhard Friedrich and Kostyantyn Shchekotykhin, 'A General Diagnosis Method for Ontologies', Proceedings of the 4th International Semantic Web Conference (ISWC-05), 232–246, (2005).
[5] Russell Greiner, Barbara A. Smith, and Ralph W. Wilkerson, 'A correction to the algorithm in Reiter's theory of diagnosis', Artificial Intelligence, 41(1), 79–88, (1989).
[6] Ulrich Junker, 'QUICKXPLAIN: Preferred explanations and relaxations for over-constrained problems', in Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI), pp. 167–172, San Jose, CA, USA, (2004).
[7] Aditya Kalyanpur, Debugging and repair of OWL ontologies, Ph.D. dissertation, University of Maryland, College Park, MD, USA, 2006. Adviser: James Hendler.
[8] Aditya Kalyanpur, Bijan Parsia, Matthew Horridge, and Evren Sirin, 'Finding all justifications of OWL DL entailments', in Proc. of ISWC/ASWC 2007, Busan, South Korea, volume 4825 of LNCS, pp. 267–280, Berlin, Heidelberg, (November 2007). Springer Verlag.
[9] Aditya Kalyanpur, Bijan Parsia, Evren Sirin, and James Hendler, 'Debugging unsatisfiable classes in OWL ontologies', Web Semantics: Science, Services and Agents on the World Wide Web, 3(4), 268–293, (2005).
[10] Raymond Reiter, 'A theory of diagnosis from first principles', Artificial Intelligence, 32(1), 57–95, (1987).
[11] Stefan Schlobach, 'Diagnosing terminologies', in Proc. of AAAI, eds., Manuela M. Veloso and Subbarao Kambhampati, pp. 670–675, AAAI Press / The MIT Press, (2005).
[12] Stefan Schlobach, Zhisheng Huang, Ronald Cornet, and Frank van Harmelen, 'Debugging incoherent terminologies', J. Autom. Reason., 39(3), 317–349, (2007).
[13] Evren Sirin, Bijan Parsia, Bernardo Cuenca Grau, Aditya Kalyanpur, and Yarden Katz, 'Pellet: A practical OWL-DL reasoner', Technical report, UMIACS, (2005).
[14] H. Wang, M. Horridge, A. Rector, N. Drummond, and J. Seidenberg, 'Debugging OWL-DL Ontologies: A Heuristic Approach', Proceedings of the 4th International Semantic Web Conference (ISWC-05), 745–757, (2005).
Average running times for invasive glass-box
the cases. Note also that the difference in the average running times for QUICKXPLAIN and SINGLEJUST in the invasive and non-invasive glass-box settings is not significant, as both glass-box methods can in general reduce the search space of the minimal conflict computation algorithms very rapidly.
4 CONCLUSIONS & FUTURE WORK
Adequate debugging support is an important prerequisite for the broad application of ontologies in real-world scenarios, and in recent years different techniques for the automated detection of problematic chunks in knowledge bases have been developed. One of the most critical and time-intensive tasks in most debugging approaches is the detection of small sets of axioms that contain the faults (conflict sets). In general, efficient conflict computation and minimization is central not only in debugging scenarios, as conflict sets are also helpful to compute justifications for axioms and assertions, which in turn can serve as the basis of an explanation facility [8]. In this paper we have analyzed two recent proposals for the identification of conflicts, both in black-box and glass-box application scenarios. Both the theoretical analysis as well as an empirical evaluation showed that QUICKXPLAIN is currently the more efficient method for these purposes. Due to the lack of publicly available mass data about typical ontology faults, artificial tests had to be used in the experiments. For future
Supporting Conceptual Knowledge Capture Through Automatic Modelling

Jochem Liem and Hylke Buisman and Bert Bredeweg
Human Computer Studies Laboratory, Informatics Institute, Faculty of Science, University of Amsterdam, The Netherlands.
Email: {jliem,bredeweg}@science.uva.nl,
[email protected]

Abstract. Building qualitative models is still a difficult and lengthy endeavour for domain experts. This paper discusses progress towards an automated modelling algorithm that learns Garp3 models based on a full qualitative description of the system's behaviour. In contrast with other approaches, our algorithm attempts to learn the causality that explains the system's behaviour. The algorithm achieves good results when recreating four well-established models.
cision trees). The choices in the qualitative tree can be seen as conditional inequalities for specific model fragments in our approach. As with the inductive process modelling approach, equations are used to represent the causality, in this case Qualitative Differential Equations (QDEs). Similar work to QUIN learns models for the JMorven language, which uses fuzzy quantity spaces to specify variables [7]. However, this work also uses QDEs, which leaves the representation of causality implicit.
1 Introduction In this paper we focus on the ground work required to advance towards an automated modelling program. The input is considered to have a qualitative representation, i.e. a state graph that represents the possible situations that can emerge from a system, and the values of the quantities in each situation. Furthermore, the input is assumed to have no noise nor any inconsistencies. The completed algorithm is envisioned to support researchers in articulating their conceptual understanding. As such it will help to establish theories that explain the phenomena provided as input data.
2 Related Work Recently, researchers in the machine learning community proposed inductive process modelling as a new research agenda [4]. They argue that models should not just be accurate, but should also provide explanations (often requiring variables, objects and mechanisms that are unobserved). In their work, quantitative process models are learned from numerical data. Based on changing continuous variables (observations), a set of variables, a set of processes (described by generalized functional forms), and constraints such as variable type information, a specific process model is generated that explains the observed data and predicts unseen data accurately. As in our approach, there is an explicit notion of individual processes, variables (quantities) and subtype hierarchies to represent different types. Our approach differs from this work in two ways. Firstly, we learn qualitative models based on qualitative data, making our approach a viable alternative when no numerical data is available. Secondly, our approach represents causality more explicitly through causal dependencies. We argue that this representation provides a better explanation than equations. However, our generated models cannot perform numerical predictions. An earlier approach to learning qualitative models is Qualitative Induction (QUIN) [1]. QUIN searches for qualitative patterns in numeric data and outputs the result as ”qualitative trees” (similar to de-
3 QR Model and Simulation Workbench: Garp3

The automatic model building algorithm is implemented in Garp3 [2]. Garp3 allows modellers to represent their knowledge about the structure and the important processes in their system as model fragments, which can be considered formalisations of the knowledge that applies in certain general situations. Next to model fragments, different scenarios can be modelled. These represent specific start states of a system. Garp3 can run simulations of models based on a particular scenario. The result of such a simulation is a state graph, in which each state represents a particular possible situation of the system, and the transitions represent the possible ways a situation can change into another. The simulation engine takes a scenario as input, and finds all the model fragments that apply to that scenario. The consequences of the matching model fragments are added to the scenario to create a state description from which new knowledge can be inferred, such as the derivatives of quantities. Given the completed state description, the possible successor states are inferred. The complete state graph is generated by applying the reasoning to the new states.

In Garp3 the structure of a system is represented using entities (objects) and configurations (relations). For example, a lion hunting a zebra would be represented as two entities (lion and zebra) and a configuration (hunts). Quantities represent the features of entities and agents that change during simulation. A quantity has a magnitude and a derivative, representing its current value and trend. The magnitude and derivative are each defined by a quantity space that represents the possible values they can have. Such a quantity space is defined by a set of alternating point and interval values. We use Mv (Q1) to refer to the current value of the magnitude of a quantity.
Ms (Q1), the sign of the magnitude, indicates whether the magnitude is positive, zero or negative (Ms (Q1) ∈ {+, 0, −}).

1 http://www.garp3.org
Dv (Q1 ) refers to the current value of the derivative of a quantity, which has a value from the predefined derivative quantity space (Dv (Q1 ) ∈ {−, 0, +}). Ds (Q1 ) refers to the current sign of a derivative. Note that the predefined values of derivatives completely correspond to the possible signs of the derivative.
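As a concrete illustration, this quantity bookkeeping can be sketched in Python. The class, the integer encoding of magnitudes, and all names below are our assumptions for illustration, not Garp3's actual representation:

```python
from dataclasses import dataclass

def sign(x):
    """Map an integer to its sign in {-1, 0, +1}."""
    return (x > 0) - (x < 0)

@dataclass
class QuantityState:
    """State information of one quantity in one state of the state graph.
    The magnitude is encoded as an index into the quantity space, with 0
    for the point value zero; the derivative is one of -1, 0, +1."""
    mv: int  # Mv(Q): magnitude value
    dv: int  # Dv(Q): derivative value

    @property
    def ms(self):
        """Ms(Q): sign of the magnitude."""
        return sign(self.mv)

    @property
    def ds(self):
        """Ds(Q): sign of the derivative; coincides with Dv(Q)."""
        return sign(self.dv)

q = QuantityState(mv=2, dv=-1)
print(q.ms, q.ds)  # 1 -1
```

With this encoding, checks on Ms and Ds reduce to integer comparisons.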
3.1 Causality

Garp3 explicitly represents causality using indirect and direct influences. Direct influences are represented as Q1 →(I+) Q2. Influences can be either positive or negative. The positive influence will increase Dv (Q2) if Ms (Q1) = +, decrease it if Ms (Q1) = −, and have no effect when Ms (Q1) = 0. For a negative influence, it is vice versa. The indirect influences, called proportionalities, are represented as Q1 →(P+) Q2. Similar to influences, proportionalities can be either positive or negative. The positive proportionality will increase Dv (Q2) if Ds (Q1) = +, have no effect if it is stable, and decrease it if it is below zero. For a negative proportionality, it is vice versa.

3.2 Other Behavioural Ingredients

Other behavioural ingredients in Garp3 are operators, inequalities, value assignments and correspondences. Operators (+ and −) are used to calculate the magnitude value of quantities (e.g. Q1 − Q2 = Q3, to indicate Mv (Q1) − Mv (Q2) = Mv (Q3)). Inequalities can be placed between different model ingredient types: (1) magnitudes (Mv (Q1) = Mv (Q2)), (2) derivatives (Dv (Q1) < Dv (Q2)), (3) values (Q1 (point(Max)) = Q2 (point(Max))), (4) operator relations (Mv (Q1) − Mv (Q2) < Mv (Q3) − Mv (Q4)), and (5) combinations of 1, 2, 3 and 4 (although only between either magnitude or derivative items). Value assignments simply indicate that a quantity has a certain qualitative value (Mv (Q1) = Q1 (Plus)). Finally, correspondences indicate that from certain values of one quantity, values of another quantity can be inferred. There are quantity correspondences (Q1 ↔(Qqs) Q2) and value correspondences (Q1 (Plus) →(Qv) Q2 (Plus)), which can both be either directed or undirected. The value correspondence indicates that if Mv (Q1) = Q1 (Plus), then Mv (Q2) = Q2 (Plus). If the value correspondence is bidirectional, the reverse inference is also possible. Quantity correspondences can be considered a set of value correspondences between each consecutive pair of the values of both quantities. There are also inverse quantity space correspondences (Q1 ↔(Q−1qs) Q2) that indicate that the first value in Q1 corresponds to the last value in Q2, the second to the one before last, etc.

4 Algorithm Requirements and Approach

4.1 Assumptions and Scoping

The goal of the automatic model building algorithm is to take a state graph and a scenario as input, and generate the model that provides an explanation for the behaviour. Our approach focusses on the generation of causal explanations. Several assumptions are made to scope the work; in further research these assumptions can be alleviated. Firstly, the input is assumed to have no noise or inconsistencies. Secondly, the state graph is assumed to be a full envisionment of the system's behaviour, and a model is assumed to be buildable using a single model fragment. From a causal explanation point of view, it is reasonable to assume that influences and proportionalities never disappear, but that their effects are only nullified when quantities become zero or stable. Thirdly, the algorithm is focussed on causal explanation and less on structure. Therefore, the entity hierarchy is assumed known.

4.2 Input and Output

The algorithm takes a complete state graph as input, which includes (1) the quantity names, (2) the quantity spaces, (3) the magnitudes and derivatives of the quantities in the different states, (4) the observable inequalities, and (5) the state transitions. Furthermore, the algorithm is provided with the scenario that should produce the state graph, which consists of: (1) the entities, agents and assumptions involved, (2) structural information about the configurations between them, (3) the quantities and their initial values, and (4) the inequalities that hold in the initial state. The output of the algorithm is one or more Garp3 qualitative models that explain (are consistent with) the input and that can be immediately simulated.

4.3 Algorithm Design Approach

Since the semantics of model ingredients are formally defined, one would assume that it is clear how each ingredient manifests itself in the simulation results of a model. Otherwise, how would the implementation of a simulation engine have been possible? However, in practice it is hard even for expert modellers to pinpoint the model ingredients that are responsible for certain (lack of) behaviour. This has several reasons. Firstly, a large set of inequalities is derived during qualitative simulation, of which the implications (other inequalities) are difficult to foresee. Secondly, the engine has a lot of intricacies (such as second order derivatives) which make simulation results hard to predict. Thirdly, the branching in the state graph that results from ambiguity is difficult for people to completely envision. For these reasons, an iterative algorithm design approach is chosen. Well-established models are ordered by complexity, and attempts are made to generate them using their own output. Each of the models requires a different (and increasingly large) set of considerations that must be dealt with. The models chosen are Tree and Shade, Communicating Vessels, Deforestation, Population Dynamics and a set of other even more complex models2. Tree and Shade is the least complex model, containing only a few quantities and causal dependencies, and no conditions, causal interactions, inequalities or operator relations. Communicating Vessels is more complex, as it contains causal interactions, an operator, and inequalities. The Deforestation model is different from the previous models as it contains many clusters linked to each other by proportionalities. Population Dynamics is again more complex, due to the large number of quantities, interactions and conditions.
4.4 Causality and Clusters

4.4.1 Causal Paths

Important for the algorithm is the concept of causal paths. These are series of quantities connected by influences and proportionalities. A causal path is defined as a set of quantities that starts with an influence, which is followed by an arbitrary number of proportionalities. For example: Q1 →(I+) Q2 →(P+) ... →(P−) Qn−1 →(P+) Qn. A quantity that has no proportionalities leading out of it ends the causal path. If a quantity has more than one proportionality leading out of it, multiple causal paths can be defined.

Since each influence represents the causal effect of a process, a causal path can be seen as the cascade of effects of a process. Given this perspective, certain successions of causal relations become unlikely. For example, the causal path Q1 →(I+) Q2 →(I+) Q3 →(P−) Q4 →(I+) Q5 would imply there are many active processes with short or no cascading effects.

2 The models and references to articles are available at http://www.garp3.org
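The path enumeration implied by this definition can be sketched in plain Python. The edge-list encoding and all names are our assumptions for illustration:

```python
def causal_paths(influences, proportionalities):
    """Enumerate causal paths: one influence edge followed by an arbitrary
    chain of proportionality edges. Edges are (source, target) pairs.
    A path ends at a quantity with no outgoing proportionality; a quantity
    with several outgoing proportionalities yields several paths."""
    prop_out = {}
    for s, t in proportionalities:
        prop_out.setdefault(s, []).append(t)

    def extend(path):
        # Successors not already on the path (avoids feedback loops).
        succs = [t for t in prop_out.get(path[-1], []) if t not in path]
        if not succs:
            yield path
        for t in succs:
            yield from extend(path + [t])

    paths = []
    for s, t in influences:
        paths.extend(extend([s, t]))
    return paths

# Q1 -I+-> Q2, with Q2 -P+-> Q3 and Q2 -P+-> Q4, gives two branching paths:
print(causal_paths([("Q1", "Q2")], [("Q2", "Q3"), ("Q2", "Q4")]))
# [['Q1', 'Q2', 'Q3'], ['Q1', 'Q2', 'Q4']]
```

Note that a feedback proportionality back to the actuating quantity simply terminates the path instead of looping.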
4.4.2 Direction of Causality

An important issue in scientific enquiry is the problem of correlation and causality. This issue appears when trying to derive causal relations from the state graph. For example, Ds (Q1) = Ds (Q2) can be caused by Q1 →(P+) Q2, Q2 →(P+) Q1, or even Q3 →(P+) Q1 and Q3 →(P+) Q2. Another example of this is in the communicating vessels model. Ideally, a model capturing the idea of a contained liquid would distinguish between Volume, Height and Bottom pressure, and have a particular causal account (Volume →(P+) Height →(P+) Bottom pressure). However, from the model's behaviour this causality may not be derivable, e.g. when the width of the containers doesn't change. As a result, the unique role of the quantities involved can only be inferred when the required variation for that is apparent in the input state graph. Therefore, it is considered the modeller's responsibility to provide simulation examples which will allow the algorithm to make these critical distinctions. However, it can be considered the responsibility of the tool to indicate to the modeller that the causality between certain sets of quantities cannot be derived, and that examples showing these differences should be provided.
4.5 Minimal Covering

The key requirement of the model building algorithm is that it explains the input behaviour. However, a second requirement is that the generated model does not contain redundant dependencies. That is, the algorithm should return the minimal set of dependencies that explains the behaviour. Two dependencies are considered substitutionary if they have the same effect on the simulation result (i.e. removing one of them would have no effect, but removing both would). Complementary dependencies are responsible for different aspects of the behaviour, and both have to be present to explain the data. The aim is to create an algorithm that is minimally covering, i.e. its output should only contain complementary dependencies.
4.4.3 Clusters

The algorithm makes use of a specific subset of causal paths called clusters. We define clusters as groups of quantities that exhibit "equivalent" behaviour. More specifically, a set of quantities constitutes a cluster if their values either correspond (Q1 ↔(Qqs) Q2) or inversely correspond (Q1 ↔(Q−1qs) Q2) to each other. Additionally, the corresponding derivatives should be equal (Dv (Q1) = Dv (Q2)), while inversely corresponding derivatives should be each other's inverse (Dv (Q1) = −Dv (Q2)). A further constraint is that the corresponding quantities (not inverse) in a cluster must be completely equivalent. Therefore, Mv (Q1) = Mv (Q2) must always hold. If an inequality holds between two quantities, they are considered not to belong to the same cluster.

During implementation it became obvious that clusters are not meaningful when quantities within a cluster belong to different entities. The reason for this originates from the idea of 'no function in structure' [5]. Clusters involving multiple entities would integrate causality across individual structural units, which is undesired. Therefore, clusters can only contain quantities that belong to the same entity. Quantities cannot be a member of more than one cluster. If Q1 and Q2 are in a cluster, and Q1 and Q3 are in a cluster, then Q1, Q2 and Q3 must be in the same cluster. After all, if Q1 and Q2 have equivalent behaviour, and Q1 and Q3 have equivalent behaviour, by transitivity Q2 and Q3 have to exhibit equivalent behaviour.

5 Algorithm

5.1 Finding Naive Dependencies

The goal of this step is to find (non-interacting) dependencies that are valid throughout the entire model (i.e. are not conditional). These causal relations are called naive dependencies, and provide the basis for the rest of the algorithm.

5.1.1 Consistency Rules

Naive dependencies are identified using consistency rules. Each pair of quantities is checked using these rules to determine which of them potentially holds throughout the state graph. These rules make use of Mv (Qx), Ms (Qx), Dv (Qx) and Ds (Qx) of each quantity in a pair, and of inequalities that hold between them. These statements are referred to as the state information of a quantity. The consistency rules are derived from the semantics of the causal dependencies (see the section on Garp3). Examples of rules (that should hold throughout the state graph) are:

(1) Q1 →(I+) Q2 if Ms (Q1) = Ds (Q2)
(2) Q1 →(I−) Q2 if Ms (Q1) = −Ds (Q2)
(3) Q1 →(P+) Q2 if Ds (Q1) = Ds (Q2)
(4) Q1 →(P−) Q2 if Ds (Q1) = −Ds (Q2)
(5) Q1 (Vx) ↔(Qv) Q2 (Vy) if Mv (Q1) = Q1 (Vx) ⟹ Mv (Q2) = Q2 (Vy)
(6) Q1 ↔(Qqs) Q2 if ∀Vn (Q1 (Vn) ↔(Qv) Q2 (Vn))
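Rules like (1) and (3) amount to checking a sign equation in every state of the state graph. A minimal sketch, with our own encoding of state information (the paper does not prescribe one):

```python
from collections import namedtuple

QS = namedtuple("QS", "ms ds")  # sign of magnitude, sign of derivative

def holds_throughout(states, rule):
    """A naive dependency is kept only if its rule holds in every state."""
    return all(rule(s) for s in states)

def rule_i_plus(q1, q2):
    """Rule (1): q1 -I+-> q2 is consistent iff Ms(q1) = Ds(q2)."""
    return lambda s: s[q1].ms == s[q2].ds

def rule_p_plus(q1, q2):
    """Rule (3): q1 -P+-> q2 is consistent iff Ds(q1) = Ds(q2)."""
    return lambda s: s[q1].ds == s[q2].ds

# Two states of a tiny system where Q1 influences Q2, and Q2 drives Q3.
states = [
    {"Q1": QS(+1, 0), "Q2": QS(0, +1), "Q3": QS(0, +1)},
    {"Q1": QS(+1, 0), "Q2": QS(+1, +1), "Q3": QS(+1, +1)},
]
print(holds_throughout(states, rule_i_plus("Q1", "Q2")))  # True
print(holds_throughout(states, rule_p_plus("Q2", "Q3")))  # True
```

Every quantity pair is tested against every rule in this way; only the rules that survive all states become naive dependencies.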
5.1.2 Redundancy

The set of dependencies that is found contains a lot of redundancy, i.e. many dependencies are substitutionary. For example, in the communicating vessels model height →(P+) pressure can be substituted by pressure →(P+) height. The remainder of the algorithm selects the correct substitutionary groups, and uses the selected naive dependencies to derive more complex dependencies.
5.2 Determining Clusters

This step tries to determine clusters within the set of naive dependencies. The algorithm searches for quantities belonging to the same entity that exhibit equivalent behaviour, and tries to expand these candidate clusters by adding other quantities. Quantities are only added if they exhibit behaviour equivalent to the quantities already contained in the candidate cluster. If no more quantities can be added to a candidate cluster, the algorithm searches for other candidate clusters. By only considering models composed of clusters, the space of possible models is significantly reduced.

The validity of the candidate clusters is checked by determining whether there is overlap between the clusters. All clusters that overlap are removed. An alternative would be to only remove clusters until no more overlap is present; however, in practice no situations were encountered where this was desirable. An example of a found cluster is volume, height and pressure in the communicating vessels model. Note that these clusters are still missing influences (their actuators); these are determined later in the algorithm.
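The transitivity and single-membership constraints on clusters make this a union-find problem. A sketch under our own encoding (correspondence pairs and an entity map; not Garp3 internals):

```python
def merge_clusters(pairs, entity_of):
    """Group quantities with equivalent behaviour into clusters.
    `pairs` are (q1, q2) correspondences found among the naive dependencies;
    pairs spanning different entities are rejected ('no function in
    structure'), and membership is closed under transitivity, so every
    quantity ends up in at most one cluster."""
    parent = {}

    def find(q):
        parent.setdefault(q, q)
        while parent[q] != q:
            parent[q] = parent[parent[q]]  # path halving
            q = parent[q]
        return q

    for a, b in pairs:
        if entity_of[a] != entity_of[b]:
            continue  # clusters may not span entities
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    clusters = {}
    for q in parent:
        clusters.setdefault(find(q), set()).add(q)
    return [c for c in clusters.values() if len(c) > 1]

entity_of = {"volume": "vessel", "height": "vessel",
             "pressure": "vessel", "flow": "pipe"}
pairs = [("volume", "height"), ("height", "pressure"), ("pressure", "flow")]
print(merge_clusters(pairs, entity_of))  # one cluster: volume, height, pressure
```

The pressure/flow pair is dropped because the two quantities belong to different entities, matching the constraint discussed in Section 4.4.3.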
5.3 Generating Causal Paths

This step returns the possible causal orderings within clusters based on the cluster and naive dependency sets. For each cluster a valid causal ordering is returned. Through backtracking other possible orderings are generated. The quantities in a cluster can be either connected in a linear fashion (Q1 →(P+) Q2 →(P+) Q3) or using branching (Q1 →(P+) Q2 and Q1 →(P+) Q3). The algorithm prefers linear orderings, as branching does not often occur in practice. Additionally, the reduction of possible models is a significant advantage. Another constraint that reduces the number of possible models is requiring clusters that belong to entities of the same type to have the same causal ordering. For example, if for one container Volume →(P+) Height →(P+) Pressure, then for other containers the same causal ordering must hold.

5.4 Actuating Clusters

The goal of the actuating clusters step is to connect clusters by identifying cluster actuations. This step takes the set of clusters with established causal orderings and the naive dependencies as input. Clusters can either be actuated by another cluster, or act as an actuator themselves. Furthermore, clusters can be connected by propagating an actuation. In a model, each cluster should take part in at least one of these kinds of relations, such that all clusters are related in some way. Otherwise, the model would include two separate non-interacting subsystems. When one cluster actuates another, there is an influence relation between the two. Actuations are the most important form of connecting clusters, since these connections are the cause of change in the system. They are also the easiest to detect, due to the specific way influences manifest themselves in the state information. For this reason, actuations by influences are identified first. Two types of actuations through influences are distinguished: (1) equilibrium seeking mechanisms (ESMs) and (2) external actuators.

5.4.1 Equilibrium Seeking Mechanisms

ESMs are better known as flows, and are common in qualitative models. Flows cause two unequal quantities to equalize. The flow in the communicating vessels model has a non-zero value when the pressures in the two containers are unequal. The flow changes the volumes of the containers, and thus causes the pressures to equalize. An ESM holds under the following two conditions: (1) Q1 = Q2 − Q3, where Q1 ∈ C1, Q2 ∈ C2, Q3 ∈ C3, and the C's are clusters, and (2) Q4 →(I−) Q5 and Q4 →(I+) Q6, where Q4 ∈ C1, Q5 ∈ C2, Q6 ∈ C3. Note that in many cases Q1 = Q4, such as in the communicating vessels model.

5.4.2 Finding Calculus Relations

The algorithm reduces the search space of finding ESMs using four constraints. Firstly, all quantities involved in the operator should be in different clusters (C1, C2 and C3 are unequal). Secondly, the set of naive dependencies should contain at least one influence from Q1 (to serve as an actuation). Thirdly, both Q2 and Q3 should be at the end of the causal paths within their cluster, as in most cases this is the most meaningful interpretation. Finally, Q2 and Q3 are required to be of the same type, as only things of the same type can be subtracted.

5.4.3 External Actuators

External actuators are causes of change located more at the edges of the system compared to ESMs. To identify external actuators, the algorithm considers the influences in the naive dependencies that are not part of an ESM. Again, the minimal covering principle is applied to keep the number of dependencies to a minimum. As a result, a cluster will never have more than one incoming actuation. An actuation is only considered between C1 and C2 if the set of naive dependencies contains influences between each possible pair of quantities, such that ∀Qx ∈ C1, ∀Qy ∈ C2 (Qx →(I+) Qy). This removes the influences in the set of naive dependencies that are consistent with the behaviour by chance. Alternative actuations are returned through backtracking. In the future, actuations may be chosen based on the structure of the system, as causal relations are more likely to occur parallel to structurally related entities.

5.4.4 Feedback

A common pattern in qualitative models is feedback: a proportionality originating from the end of a causal path to the quantity actuating the causal path. Feedbacks are simply added if the naive dependencies contain one. The algorithm always adds feedback at the end of causal paths, since this is what happens in the investigated models. However, it could be the case that feedbacks from halfway along a causal chain are also possible.

5.5 Linking Clusters by Propagation

This step connects the clusters that have not yet been connected through proportionalities, based on the naive dependencies. As within clusters, the causal ordering of the clusters cannot be distinguished. Therefore all possibilities are generated. Furthermore, the same design choices as with finding causal paths within clusters have been made: only linear orderings of clusters are allowed (i.e. no branching).
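The two ESM conditions of Section 5.4.1 can be checked mechanically. A minimal sketch; the triple/pair encodings of minus relations and influences, and the communicating-vessels quantity names, are our assumptions:

```python
def is_esm(c1, c2, c3, minus_relations, influences):
    """Check the ESM conditions from Section 5.4.1:
    (1) some Q1 = Q2 - Q3 with Q1 in C1, Q2 in C2, Q3 in C3, and
    (2) some Q4 in C1 with Q4 -I--> Q5 (Q5 in C2) and Q4 -I+-> Q6 (Q6 in C3).
    `minus_relations` holds (q1, q2, q3) triples meaning q1 = q2 - q3;
    `influences` maps a sign ('+' or '-') to a set of (source, target) pairs."""
    cond1 = any(q1 in c1 and q2 in c2 and q3 in c3
                for q1, q2, q3 in minus_relations)
    cond2 = any((q4, q5) in influences["-"] and (q4, q6) in influences["+"]
                for q4 in c1 for q5 in c2 for q6 in c3)
    return cond1 and cond2

# Communicating vessels: the flow equals the pressure difference and
# drains one volume while filling the other.
minus = [("flow", "pressure_l", "pressure_r")]
infl = {"-": {("flow", "volume_l")}, "+": {("flow", "volume_r")}}
c1 = {"flow"}
c2 = {"pressure_l", "volume_l"}
c3 = {"pressure_r", "volume_r"}
print(is_esm(c1, c2, c3, minus, infl))  # True
```

Here Q1 and Q4 coincide (both are the flow), as the text notes is common.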
5.6 Setting Initial Magnitudes

An influence has no effect if the magnitude of the quantity from which it originates is unknown. Therefore this step assigns initial values to quantities. Note that this step first generates a set of candidate assignments. When a value can be derived in another way than through assignment, it is removed from the set of value assignment candidates.
There are six ways to assign initial magnitudes. Firstly, if a value assignment for the quantity is present in the scenario, it requires no initialisation. Secondly, if the magnitude can be derived through a correspondence, the value is known. Thirdly, the result of a minus operator can be derived if an inequality between its arguments is known. Based on the possible magnitudes of the result this inequality can be derived. Either this inequality is present in the scenario, or multiple inequalities should be made assumable by adding them as conditions in multiple model fragments. Garp3 automatically assumes unprovable values and inequalities if they are conditions in model fragments. Note that generating the conditional inequalities is currently beyond the scope of the algorithm, as it involves adding model ingredients to multiple model fragments. Fourthly, it is possible that a certain magnitude holds everywhere throughout the state graph. In this case, a value assignment is added as a (conditionless) consequence. Fifthly, a value could hold under certain conditions. However, this would require value assignments with conditional inequalities in separate model fragments. Therefore, it is currently beyond the scope of the algorithm. Finally, multiple model fragments could be created in which the magnitudes are present as conditions. Garp3 will generate the different states that would result by assuming each of the values. As with the conditional value assignments, having value assignments as conditions in multiple model fragments is currently beyond the scope of the algorithm.
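The candidate pruning for the in-scope cases (the first, second and fourth of the six above) can be sketched roughly as follows; the function name, parameters and example values are our assumptions, not Garp3's API:

```python
def initial_assignments(influence_sources, scenario_values,
                        correspondence_derivable, holds_everywhere):
    """Generate one candidate initial-magnitude assignment per quantity
    that actuates an influence, then drop candidates whose value is
    already obtainable another way:
      - present in the scenario (first case),
      - derivable via a correspondence (second case),
      - identical in every state, so addable as a conditionless
        consequence (fourth case).
    The remaining cases involve conditional model fragments and stay
    out of scope here, as in the text."""
    candidates = set(influence_sources)
    candidates -= set(scenario_values)           # first case
    candidates -= set(correspondence_derivable)  # second case
    return {q: holds_everywhere[q]               # fourth case
            for q in candidates if q in holds_everywhere}

assignments = initial_assignments(
    influence_sources={"flow", "growth_rate"},
    scenario_values={"flow": "zero"},
    correspondence_derivable=set(),
    holds_everywhere={"growth_rate": "plus"})
print(assignments)  # {'growth_rate': 'plus'}
```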
5.7 Dependency Interactions

This step identifies dependency interactions (influences or proportionalities) based on the input behaviour. Dependency interactions are detected in the same way as naive dependencies, i.e. using a set of consistency rules. Interactions are not found as naive dependencies, as the individual dependencies are not consistent with the entire state graph (an interaction results in more behaviour than a single dependency). The algorithm assumes that the interaction consists of opposing dependencies, such as birth versus death and immigration versus emigration.
6 Results

The Tree and Shade model [3] is successfully modelled by the algorithm. It returns two models, representing both possible directions of causality between Size and Shade. The initial magnitude assignment correctly finds the conditionless value assignment on Growth rate. The model's simulation results are equivalent to those of the original model.

The dependencies of the communicating vessels model are correctly found. The algorithm returns six models, one for each possible causal ordering of amount, height and pressure. The algorithm also correctly identifies the ESM-based actuations of the clusters, by properly finding the minus operator. Furthermore, all necessary causal dependencies and correspondences are identified. Model fragments that allow the assumption of initial values are missing (due to the fact that the algorithm generates a single model fragment). Adding an inequality between the pressures of the containers in the scenario allows the model to simulate without problems.

The deforestation model (containing the entities 'Woodcutters', 'Vegetation', 'Water', 'Land' and 'Humans') is successfully modelled, including setting initial magnitudes using conditions. The simulation is equivalent to that of the original model. The causal ordering does differ, as it does not capture the branching of the causal paths in the original model. The resulting model, however, is not considered wrong by experts, and is arguably better than the original. Over 2000 models are returned when generating all possible results, due to the many possible causal orderings.

The population dynamics model [3] generates the correct models for the open and closed population scenarios. However, the initial values are not set. The algorithm does not yet give correct results for the heating/boiling, R-Star [6] and Ants' Garden [8] models. For the heating model this is due to inequalities that hold under specific conditions, which are not handled by the algorithm. The R-Star and Ants' Garden models are large models that resulted from specific research projects. As such, these models are an order of magnitude more complex than the other models. It is therefore not surprising that the algorithm in its current form cannot cope with them.
7 Conclusions & Future Work

This paper presents preliminary work towards an algorithm that automatically determines a Garp3 qualitative model, using an enumeration of all possible system behaviour as input. The algorithm uses consistency rules to determine the causal dependencies that hold within the system. Using the concept of clusters, the search space is significantly reduced. Accurate results are generated for a set of well-established models. The results seem to suggest that it is possible to derive causal explanations from the behaviour of a system, and that model building support through an automatic model building algorithm is viable.

There are several algorithm improvements planned. The first improvement is to have a generalised representation for the ambiguity within and between clusters, that is, a single representation for the complete model space. For simulation purposes an arbitrary instantiation can be chosen, as each one has an equivalent result. Secondly, the algorithm has to be improved to be able to create multiple model fragments in order to deal with conditional model ingredients. Thirdly, means have to be developed to compare generated state graphs with the desired state graph.
ACKNOWLEDGEMENTS We would like to thank the referees for their insightful comments.
REFERENCES

[1] Ivan Bratko and Dorian Šuc, 'Learning qualitative models', AI Mag., 24(4), 107–119, (2004).
[2] B. Bredeweg, A. Bouwer, J. Jellema, D. Bertels, F. Linnebank, and J. Liem, 'Garp3 - a new workbench for qualitative reasoning and modelling', in 20th International Workshop on Qualitative Reasoning (QR06), eds., C. Bailey-Kellogg and B. Kuipers, pp. 21–28, Hanover, New Hampshire, USA, (July 2006).
[3] Bert Bredeweg and Paulo Salles, Handbook of Ecological Modelling and Informatics, chapter Mediating conceptual knowledge using qualitative reasoning, WIT Press, 2008. (in press).
[4] W. Bridewell, P. Langley, L. Todorovski, and S. Džeroski, 'Inductive process modeling', Machine Learning, 71, 1–32, (2008).
[5] J. de Kleer and J. S. Brown, 'A qualitative physics based on confluences', Artificial Intelligence, 24(1-3), 7–83, (December 1984).
[6] T. Nuttle, B. Bredeweg, and P. Salles, 'R-Star - a qualitative model of plant growth based on exploitation of resources', in 19th International Workshop on Qualitative Reasoning (QR'05), eds., M. Hofbaur, B. Rinner, and F. Wotawa, pp. 47–53, Graz, Austria, (May 2005).
[7] Wei Pang and George M. Coghill, 'Advanced experiments for learning qualitative compartment models', in 21st International Workshop on Qualitative Reasoning (QR-07), ed., C. Price, (2007).
[8] P. Salles, B. Bredeweg, and N. Bensusan, 'The Ants' Garden: Qualitative models of complex interactions between populations', Ecological Modelling, 194(1-3), 90–101, (2006).
Automated Learning of Communication Models for Robot Control Software Alexander Kleiner
1
and Gerald Steinbauer
Abstract. Control software of autonomous mobile robots comprises a number of software modules which show very rich behaviors and interact in a very complex manner. These facts among others have a strong influence on the robustness of robot control software in the field. In this paper we present an approach which is able to automatically derive a model of the structure and the behavior of the communication within a componentorientated control software. Such a model can be used for on-line model-based diagnosis in order to increase the robustness of the software by allowing the robot to autonomously cope with faults occurred during runtime. Due to the fact that the model is learned form recorded data and the use of the popular publisher-subscriber paradigm the approach can be applied to a wide range of complex and even partially unknown systems.
1
2
and Franz Wotawa
2
the type of a particular communication path, e.g, whether the communication occurs on a regular basis or sporadically. Finally, the model includes information about which inputs and outputs of the software modules have a functional relation, e.g, which output is triggered by which input. The model is specified by a set of logic clauses and uses a component-based modeling schema [1]. Please refer to [8, 7] for more details. The diagnosis process itself uses the well known consistency-based diagnosis techniques of Reiter [5]. The models of the control software and the communication were created by hand by analyzing the structure of the software and its communication behavior during runtime. Because of the complexity of such control software or the possible lack of information about the system it is not feasible to do this by hand for large or partially unknown systems. Therefore, it is desirable that such models can be created automatically either from a formal specification of the system or from observation of the system. In this paper we present an approach which allows the automatic extraction of all necessary information from the recorded communication between the software modules. The algorithm provides all information needed for model-based diagnosis. It provides a communication graph showing which modules communicate, the desired behavior of the particular communication paths and the relation between the inputs and outputs of the software modules. These model learning approach was originally developed for and tested with the control software of the Lurker robots [2] used in the RoboCup rescue league. This control software uses the IPC communication framework [6], which is a very popular event-based communication library used by a number of robotic research labs worldwide. However, the algorithm simply can be adapt to other event-based communication frameworks, such as for instance Miro. 
The next section describes in more detail how the model is extracted from the observed communication.
1 Introduction
Control software of autonomous mobile robots comprises a number of software modules which show very rich behaviors and interact in a complex manner. Because of this complexity, and for other reasons like bad design and implementation, there is always the possibility that a fault occurs at runtime in the field. Such faults can have different characteristics, like crashes of modules, deadlocks, or wrong data leading to a hazardous decision of the robot. This situation can occur even if the software is carefully designed, implemented and tested. In order to have truly autonomous robots operating for a long time without, or with limited, possibility for human intervention, e.g., planetary rovers exploring Mars, such robots must have the capability to detect, localize and cope with such faults. In [8, 7] the authors presented a model-based diagnosis framework for the control software of autonomous mobile robots. The control software is based on the robot control framework Miro [10, 9] and has a client-server architecture where the software modules communicate by exchanging events. The idea is to use the different communication behaviors between the modules of the control software in order to monitor the status of the system and to detect and localize faults. The model comprises a graph specifying which modules communicate with each other. Moreover, the model has information about
2 Model Learning
Control systems based on IPC use an event-based communication paradigm. A software module that wants to provide data publishes an event containing the data. Other software modules that want to use these data subscribe for the appropriate event and are automatically informed when such an event is available. A central software module of IPC is in charge of all aspects of this communication. Moreover, this central module is able to record all communication details: the type of each event, the time the event was published or consumed, the content of the event, and the names of the publishing and the receiving modules.
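To make the recording step concrete, the sketch below models such a log with hypothetical field names (the actual IPC log format differs) and shows how the modules and labeled edges of a communication graph fall out of it directly.

```python
from collections import namedtuple

# A recorded communication event; the field names are assumptions for
# illustration, not the actual IPC log format.
Event = namedtuple("Event", "name time sender receiver")

def communication_graph(log):
    """Collect the modules and labeled edges observed in an event log."""
    modules, edges = set(), set()
    for e in log:
        modules.update((e.sender, e.receiver))
        edges.add((e.sender, e.receiver, e.name))  # duplicates collapse
    return modules, edges

log = [
    Event("msg-odometry", 0.00, "Odometry", "SelfLoc"),
    Event("msg-objects", 0.10, "Vision", "Tracker"),
    Event("msg-odometry", 0.08, "Odometry", "SelfLoc"),
    Event("msg-velocities", 0.12, "Tracker", "User"),
]
modules, edges = communication_graph(log)
```

Repeated events between the same pair of modules collapse into a single labeled edge, which is exactly the edge semantics used below.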
¹ Institut für Informatik, Albert-Ludwigs-Universität Freiburg, Georges-Köhler-Allee, D-79110 Freiburg, Germany, [email protected]
² Institute for Software Technology, Graz University of Technology, Inffeldgasse 16b/II, A-8010 Graz, Austria, {steinbauer,wotawa}@ist.tugraz.at
The collected data is the basis for our model-learning algorithm. Figure 1 depicts such data for a small example control software comprising only five modules with a simple communication structure. This example is used in the following description of the model-learning algorithm. The control software comprises two data paths. One is the path for the self-localization of the robot; the two modules in this path, Odometry and SelfLoc, provide data on a regular basis. The other is the path for object tracking; the module Vision provides new data on a regular basis, while the module Tracker provides data only if new data is available from the module Vision. The figure shows when the different events were published. Based on this recorded communication, we extract the communication model step by step.

2.1 The communication graph

In a first step the algorithm extracts a communication graph from the data. The nodes of the graph are the different software modules. The edges represent the different events exchanged between the modules. Each event is represented by at least one edge; if the same event is received by multiple modules, there is an edge from the publishing module to every receiving module. Figure 2 depicts the communication graph for the above example. This graph shows the communication structure of the control software. Moreover, it shows the relation of inputs and outputs of the different software modules, because each node knows its connections. Such a communication graph is not only useful for diagnosis purposes; it can also expressively visualize the relation of modules in a larger or partially unknown control software. Formally, the communication graph is defined as follows:

Definition 1 (CG) A communication graph (CG) is a directed graph with a set of nodes M and a set of labeled edges C where:

• M is a set of software modules sending or receiving at least one event.
• C is a set of connections between modules; the direction of an edge points from the sending to the receiving module, and the edge is labeled with the name of the related event.

Please note that the communication graph may contain cycles. Usually such cycles emerge from acknowledgement mechanisms between two modules. The algorithm for the creation of the communication graph is straightforward. It starts with an empty set of nodes M and an empty set of edges C and iterates through all recorded communication events. If either the sender or the receiver of an event is not yet in the set of nodes, it is added. If there is no edge with the proper label pointing from the sending to the receiving node, a new edge with the appropriate label is added between the two modules. Moreover, we define two functions: in : M ↦ 2^C, which returns the edges pointing to a node, and out : M ↦ 2^C, which returns the edges pointing from a node.

2.2 The communication behavior

In a next step the behavior, or type, of each event connection is determined. For this determination we use the information of the node the event connection originates from, the recorded information of the event related to the connection, and all events related to the sending node. We distinguish the following types: triggered event connection (1), periodic event connection (2), bursted event connection (3) and random event connection (4). In order to describe the behavior of a connection formally, we define a set of connection types CT = {periodic, triggered, bursted, random} and a function ctype : C ↦ CT which returns the type of a particular connection c ∈ C. The type of an event connection is determined by tests such as measurements of the mean and the standard deviation of the time between the occurrences of events on the connection, and comparison or correlation of the occurrences of two events. The criteria used to assign an event connection to one of the four categories are summarized below:

triggered A triggered event only occurs if its publishing module recently received a trigger event. In order to determine whether an event connection is a triggered event connection, the events on connection c ∈ out(m) are correlated with the events on the set of input connections of the software module, I = in(m). If the number of events on connection c which are correlated with an event on a particular connection t ∈ in(m) exceeds a certain threshold, connection t is named a trigger of connection c. The correlation test looks for the occurrence of the trigger event prior to the observed event. Note that each trigger event can only trigger one event. If connection c is correlated with at least one connection t ∈ in(m), connection c is categorized as a triggered connection. Usually, such connections are found in modules that perform calculations only if new data is available.

periodic On a periodic event connection the same event regularly occurs with a fixed frequency. From the time stamps of the occurrences of all events we calculate a discrete distribution of the time difference between two successive events. If there is high evidence in the distribution for one particular time difference, the connection is periodic with a period equal to the estimated time difference. For a purely periodic event connection one gets a distribution close to a Dirac impulse. Usually, such connections are found with modules providing data at a fixed frame rate, such as a module sending data from a video camera.
bursted A bursted event is similar to a periodic event, but its regular occurrence can be switched on and off for periods of time. An event connection is classified as bursted if there exist time periods in which the criteria of the periodic event connection hold. Usually, such connections are found with modules which perform specific measurements only if the central controller explicitly enables them, e.g., a complete 3D laser scan.

random For random event connections none of the above categories match, and therefore no useful information about the behavior of the connection can be derived. Usually, such connections are found in modules which provide data only if some specific circumstance occurs in the system or its environment.

Figure 1. Recorded communication of the example robot control software. The peaks indicate the occurrence of the particular event.

Figure 2. Communication graph learned from the recorded data of the example control software. (Modules: Vision, Odometry, Tracker, Selfloc, User; edges: msg-odometry @ 12 Hz, msg-objects @ 2 Hz, msg-velocities @ 2 Hz, msg-pose @ 6 Hz.)

In the case of the above example, the algorithm correctly classified the event connections odometry, objects and pose as periodic, and the connection velocity as triggered with the trigger objects.
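The classification criteria above can be sketched as follows. The thresholds, the correlation window and the simple mean/standard-deviation test are illustrative assumptions, and the bursted case (the periodic criteria holding piecewise) is omitted for brevity.

```python
def ctype(timestamps, trigger_timestamps=None,
          rel_std_max=0.1, window=0.05, hit_ratio=0.9):
    """Classify one event connection from its event time stamps.
    A sketch of the criteria above; all thresholds are assumptions,
    and the bursted case is omitted for brevity."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean = sum(gaps) / len(gaps)
    std = (sum((g - mean) ** 2 for g in gaps) / len(gaps)) ** 0.5
    if std <= rel_std_max * mean:  # near-Dirac inter-arrival distribution
        return "periodic"
    if trigger_timestamps is not None:
        # fraction of events preceded by a trigger event within `window`
        hits = sum(any(0 <= t - tt <= window for tt in trigger_timestamps)
                   for t in timestamps)
        if hits / len(timestamps) >= hit_ratio:
            return "triggered"
    return "random"
```

With events every 0.1 s the gap distribution is degenerate and the connection is classified as periodic; with irregular gaps but each event shortly after a candidate trigger, it is classified as triggered.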
2.3 The observers

In order to monitor the actual behavior of the control software, the algorithm instantiates an observer for each event connection. The type of the observer is determined by the type of the connection and its parameters, estimated by the methods described before. An observer raises an alarm if there is a significant discrepancy between the currently observed behavior of an event connection and the behavior learned beforehand during normal operation. The observer provides as an observation O the atom ok(l) if the behavior is within the tolerance, and the atom ¬ok(l) otherwise, where l is the label of the corresponding edge in the communication graph. The observation OBS of the complete control software is the union of all individual observations:

OBS = ⋃_{i=1}^{n} O_i

where n is the number of observers. The following observers are used:

triggered This observer raises an alarm if, within a certain timeout after the occurrence of a trigger event, no corresponding event occurs, or if the trigger event is missing prior to the occurrence of the corresponding event. In order to be robust against noise, the observer uses a majority vote over a number of succeeding events, e.g., 3 votes.

periodic This observer raises an alarm if there is a significant change in the frequency of the events on the observed connection. The observer checks whether the frequency of successive events varies too much from the specified frequency. For this purpose, the observer estimates the frequency of the events within a sliding time window.

bursted This observer is similar to the observer above. It differs in the fact that it starts the frequency check only if events occur and does not raise an alarm if no events occur.

random This is a dummy observer which always provides the observation ok(l). It is implemented for completeness.

2.4 The system description

The communication graph together with the types of the connections is a sufficient specification of the communication behavior of the robot control software. This specification can be used to derive a system description for the diagnosis process, i.e., a description of the desired or nominal behavior of the system. In order to be usable in the diagnosis process, the system description is automatically written down as a set of logical clauses. This set can easily be derived from the communication graph and the behavior of the connections. The algorithm to derive the system description starts with an empty set SD. For every event connection, clauses are added to the system description in two steps. In the first step, a clause for forward reasoning is added. The clause specifies that if a module works correctly, all related inputs and outputs behave as expected. Depending on the type of the connection, we add the following clause to SD. If connection c is
triggered, we add the clause

¬AB(m) ∧ ⋀_{t ∈ trigger(c), t ∈ in(m)} ok(t) → ok(c)

The SelfLoc module continuously provides its pose estimate to the visualization module User. The module Vision provides position measurements of objects. The module Tracker uses these measurements to estimate the velocities of the objects; new velocity estimations are only generated if new data is available. The velocity estimates are also visualized by the GUI. Figure 1 shows the recorded communication of this example. Figure 2 depicts the communication graph extracted from the recorded data. It correctly represents the actual communication structure of the example and shows the correct relation of event producers and event consumers. Moreover, the algorithm correctly identified the types of the event connections, as can be seen in the system description the algorithm derived, which is depicted in Figure 3. It also instantiated the correct observers for the four event connections: a periodic event observer for odometry, objects and pose, and a triggered event observer for velocities.
and the clause ¬AB(m) → ok(c) otherwise. ¬AB(m) means that module m is not abnormal and works as expected. The atom ok(c) specifies that connection c behaves as expected. In a second step, a clause for backward reasoning is added. It specifies that if all output connections c′ of module m behave as expected, the module itself behaves as expected. We add the clause

⋀_{c′ ∈ out(m)} ok(c′) → ¬AB(m)
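The two-step clause generation can be sketched as follows; clauses are rendered as plain strings for readability, which is an assumption of this sketch rather than the authors' actual representation.

```python
def system_description(edges, conn_type, trigger):
    """Derive forward and backward clauses (as strings) from the
    communication graph, mirroring the two-step procedure above."""
    sd = []
    for sender, _receiver, event in sorted(edges):  # forward clauses
        if conn_type[event] == "triggered":
            pre = f"~AB({sender}) & " + " & ".join(
                f"ok({t})" for t in sorted(trigger[event]))
        else:
            pre = f"~AB({sender})"
        sd.append(f"{pre} -> ok({event})")
    for m in sorted({s for s, _r, _e in edges}):  # backward clauses
        outs = sorted({e for s, _r, e in edges if s == m})
        sd.append(" & ".join(f"ok({e})" for e in outs) + f" -> ~AB({m})")
    return sd

# The four connections of the example control software
edges = {("Vision", "Tracker", "objects"), ("Odometry", "Selfloc", "odometry"),
         ("Tracker", "User", "velocities"), ("Selfloc", "User", "pose")}
conn_type = {"objects": "periodic", "odometry": "periodic",
             "velocities": "triggered", "pose": "periodic"}
trigger = {"velocities": ["objects"]}
sd = system_description(edges, conn_type, trigger)
```

For the example this reproduces the eight clauses of Figure 3, with the triggered connection velocities picking up the extra ok(objects) precondition.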
Figure 3 depicts the system description obtained for the above example control software.
3 Model-based diagnosis
For the detection and localization of faults we use the consistency-based diagnosis technique of [5]. A fault detectable by the derived model causes a change in the behavior of the system; if an inconsistency between the modeled and the observed behavior emerges, a fault has been detected. Formally, we define this by:
Figure 3. The system description automatically derived for the example control software:
1. ¬AB(Vision) → ok(objects)
2. ¬AB(Odometry) → ok(odometry)
3. ¬AB(Tracker) ∧ ok(objects) → ok(velocities)
4. ¬AB(Selfloc) → ok(pose)
5. ok(objects) → ¬AB(Vision)
6. ok(odometry) → ¬AB(Odometry)
7. ok(velocities) → ¬AB(Tracker)
8. ok(pose) → ¬AB(Selfloc)
SD ∪ OBS ∪ {¬AB(m) | m ∈ M} ⊨ ⊥

where the latter set states the assumption that all modules work as expected. In order to localize the module responsible for a detected fault, we calculate a diagnosis Δ, where Δ is a set of modules m ∈ M which we have to declare faulty (changing ¬AB(m) to AB(m)) in order to resolve the above contradiction. We use our implementation³ of this diagnosis process for the experimental evaluation of the models. Please refer to [8, 7] for the details of the diagnosis process.
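The diagnosis step can be illustrated with a brute-force sketch over module subsets. This is not the authors' implementation (which follows Reiter's consistency-based method [5]); it computes minimal diagnoses for small systems under a simplified model in which a healthy module implies ok outputs.

```python
from itertools import chain, combinations

def diagnoses(produces, obs):
    """Consistency-based fault localization in the spirit of [5], by brute
    force over module subsets; a sketch for small systems only.
    produces: module -> set of its output connections
    obs: connection -> True (ok) or False (not ok)"""
    modules = sorted(produces)

    def consistent(delta):
        # every module assumed healthy must have all its outputs observed ok
        return all(m in delta or
                   not any(obs.get(c) is False for c in produces[m])
                   for m in modules)

    subsets = chain.from_iterable(
        combinations(modules, r) for r in range(len(modules) + 1))
    found = [set(d) for d in subsets if consistent(set(d))]
    return [d for d in found if not any(e < d for e in found)]  # minimal only

produces = {"Vision": {"objects"}, "Odometry": {"odometry"},
            "Tracker": {"velocities"}, "Selfloc": {"pose"}}
obs = {"objects": True, "odometry": True, "velocities": False, "pose": True}
```

With velocities observed as not ok, the single minimal diagnosis blames the Tracker module.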
4 Experimental Results

Figure 3 depicts the extracted system description. Clauses 1 to 4 describe the forward reasoning; clauses 5 to 8 describe the backward reasoning. Clause 3 states that the module Tracker works correctly only if a velocity event occurs only after a trigger event. Clause 6, for instance, states that if all output connections of the module Odometry work as expected, the module itself works as expected. This automatically generated system description was used in several diagnosis tests: we randomly shut down modules and evaluated whether the faults were correctly detected and localized. For this simple example the faults were always properly identified.
In order to show the potential of our model-learning approach, it has been tested on three different types of robot control software. We evaluated whether the approach is able to derive an appropriate model reflecting all aspects of the behavior of the system. The derived model was evaluated by the system engineer who developed the system. Moreover, we injected artificial faults, like module crashes, into the system and evaluated whether the faults could be detected and localized using the derived model.
4.2 Autonomous exploration robot Lurker
In a second experiment we recorded the communication of the control software of the rescue robot Lurker [2] while the robot was autonomously exploring an unknown area. The robot is shown in Figure 4. The control software of this robot is far more complex than in the example above, since it comprises all software modules enabling a rescue robot to autonomously explore an area after a disaster. Figure 5 shows the communication graph derived from the recorded data, clearly showing the complex structure of the control software. From the communication graph and the categorized event connections, a system description with 70 clauses, 51 atoms and 35 observers was derived. After a double check with the system engineer of the control software, it was confirmed that the automatically derived model captures the behavior of the system.
4.1 A small example control software
The example software from the introduction comprises five modules. The module Odometry provides odometry data on a regular basis. This data is consumed by the module SelfLoc, which performs pose tracking by integrating the odometry data and continuously provides a pose estimate to a visualization module.
³ The implementation can freely be downloaded at http://www.ist.tugraz.at/mordams/.
Figure 5. Communication graph of the Lurker robot. (Edge labels show event names and rates, e.g., inertia @ 25 Hz, rangescan_ranges @ 7 Hz, kalman_pose @ 6 Hz, heightmap @ 2 Hz.)
Figure 7 depicts the communication graph derived from the recorded data. It clearly shows that the control software for teleoperation has a far less complex communication structure than in the autonomous case. From the communication graph and the categorized event connections, a system description with 44 clauses, 31 atoms and 22 observers was derived.
Figure 4. The autonomous rescue robot Lurker of the University of Freiburg.

5 Related Research

There are many proposed and implemented systems for fault detection and repair in autonomous systems; due to lack of space we refer to only a few. The Livingstone architecture by Williams and colleagues [4] was used on the space probe Deep Space One to detect failures in the probe's hardware and to recover from them. Model-based diagnosis has also been successfully applied for fault detection and localization in digital circuits and car electronics, and for software debugging of VHDL programs [1]. In [3] the authors show how model-based reasoning can be used for the diagnosis of a group of robots in the health care domain; the system model comprises interconnected finite state automata. All these methods have in common that the models of the system behavior are generated by hand.

4.3 Teleoperation Telemax robot
In a final experiment we recorded data during a teleoperated run with the bomb-disposal robot Telemax. The robot is shown in Figure 6.
Figure 6. The teleoperated robot Telemax.

6 Conclusion and Future Work
In this paper we presented an approach for the automated learning of communication models for robot control software. The approach uses recorded event communication and is able to automatically extract a model of the behavior of the communication within component-oriented control software. Moreover, the approach is able to derive a system description which can be used for model-based diagnosis. The approach was successfully tested on IPC-based
Figure 7. Communication graph of the Telemax robot. (Edge labels show event names and rates, e.g., velocity @ 7 Hz, flipper_axes @ 10 Hz, image_message @ 3 Hz, battery_status @ 1 Hz.)
robot control software like that of the rescue robot Lurker. IPC is a widely used basis for robot control software; therefore, our approach is instantly usable on many different robot systems. Currently, we are working on a port for Miro-based systems, which will further increase the number of potential target systems of our approach. Moreover, we are working on the recognition of additional event types in order to enrich the generated models. We believe that considering the content of the events will lead to significantly better models and diagnoses. For this modeling, techniques from Qualitative Reasoning seem promising, but it is an open question how such qualitative models can be automatically learned from recorded data.
REFERENCES
[1] Gerhard Friedrich, Markus Stumptner, and Franz Wotawa, ‘Model-based diagnosis of hardware designs’, Artificial Intelligence, 111(2), 3–39, (1999).
[2] Alexander Kleiner and Christian Dornhege, ‘Real-time Localization and Elevation Mapping within Urban Search and Rescue Scenarios’, Journal of Field Robotics, (2007).
[3] Roberto Micalizio, Pietro Torasso, and Gianluca Torta, ‘On-line monitoring and diagnosis of a team of service robots: A model-based approach’, AI Communications, 19(4), 313–340, (2006).
[4] Nicola Muscettola, P. Pandurang Nayak, Barney Pell, and Brian C. Williams, ‘Remote agent: To boldly go where no AI system has gone before’, Artificial Intelligence, 103(1-2), 5–48, (August 1998).
[5] Raymond Reiter, ‘A theory of diagnosis from first principles’, Artificial Intelligence, 32(1), 57–95, (1987).
[6] Reid Simmons, ‘Structured Control for Autonomous Robots’, IEEE Transactions on Robotics and Automation, 10(1), (1994).
[7] Gerald Steinbauer, Martin Mörth, and Franz Wotawa, ‘Real-Time Diagnosis and Repair of Faults of Robot Control Software’, in RoboCup 2005: Robot Soccer World Cup IX, volume 4020 of Lecture Notes in Computer Science, pp. 13–23. Springer, (2006).
[8] Gerald Steinbauer and Franz Wotawa, ‘Detecting and locating faults in the control software of autonomous mobile robots’, in 16th International Workshop on Principles of Diagnosis (DX-05), pp. 13–18, Monterey, USA, (2005).
[9] Hans Utz, Advanced Software Concepts and Technologies for Autonomous Mobile Robotics, Ph.D. dissertation, University of Ulm, Neuroinformatics, 2005.
[10] Hans Utz, Stefan Sablatnög, Stefan Enderle, and Gerhard K. Kraetzschmar, ‘Miro – middleware for mobile robot applications’, IEEE Transactions on Robotics and Automation, Special Issue on Object-Oriented Distributed Control Architectures, 18(4), 493–497, (August 2002).
Relaxation of Temporal Observations in Model-Based Diagnosis of Discrete-Event Systems

Gianfranco Lamperti and Federica Vivenzi and Marina Zanella¹
Abstract. Temporal observations play a major role in model-based diagnosis of discrete-event systems. Although the reaction of a system generates a sequence of visible labels, in real contexts, where the system is large and distributed, what is perceived by the observer is uncertain in nature, namely an uncertain temporal observation. On the one hand, the degradation of the sequence of labels to the temporal observation has never been the subject of formal investigation. On the other, considerable effort has been spent on similarity-based diagnosis, where temporal observations are checked for subsumption, a property that enables reuse of model-based reasoning. The notion of coverage between temporal observations was proposed to check subsumption efficiently. This paper unifies the concepts of degradation and coverage by means of the new notion of observation relaxation. This consists of three algebraic operators applied to the domain of temporal observations, namely logical relaxation, temporal relaxation, and augmentation. A formal result is that any temporal observation relevant to the reaction of a system can always be represented by a relaxation expression.
1 INTRODUCTION

Model-based diagnosis of discrete-event systems (DESs) [2] has been an active research area in this first decade of the 2000s [4, 13, 3, 14, 6, 15]. A diagnosis task takes as input an observation of the system to be diagnosed. In case such a system is discrete, its observable events range over a finite domain of discrete values. An observation is temporally uncertain if the generation order of the observed events is not precisely known; what is known instead is a partial order that conforms to the actual generation order: an event can be observed before another that was generated by the DES before it, and, given the reception order of events, it is impossible to devise the relative emission order of all pairs of events belonging to the observation. Therefore, several sequences of observable events comply with a temporally uncertain observation. Features and models of DES observations have been investigated for diagnosis purposes in several directions:

• Defining the different kinds of uncertainty affecting a given observation [9];
• Splitting an uncertain observation into sub-observations to be incrementally considered by diagnostic tasks [10, 14];
• Studying the effects of some properties (such as correct slicing [7] or stratification [12]) of a fragmented uncertain observation on the diagnostic results, and recognizing whether a given fragmented uncertain observation exhibits such properties;
• Proposing algorithms for comparing uncertain observations in order to reuse model-based reasoning [11, 12].

All these research lines assume an existing uncertain observation. This paper instead assumes an existing, completely certain sequence of observable events (the sequence of events generated by the considered DES) and addresses how such a sequence is transformed into an uncertain observation, by introducing the notion of relaxation. The aim of the paper is to provide a formal framework for mimicking what happens in the real world, so as to endow the notion of an uncertain observation with a physical motivation. However, after relaxation had been defined, the authors found out that this notion was strictly related to the notions of subsumption and coverage already introduced for quite different purposes [11, 12]. Owing to space reasons, the proofs of all propositions and theorems in the paper are omitted.

2 TEMPORAL OBSERVATION

Discrete-event systems are dynamic systems, typically modeled as networks of components. Each component is a communicating automaton [1] that reacts to input events by state transitions which possibly generate new events towards other components. When a DES reacts, it performs a sequence of transitions, some of which are visible. For each visible transition, an observable label is generated. The whole sequence of these observable labels (ordered according to the generation order) is the signature of the reaction. However, what is actually perceived by the external observer about the reaction is a degradation of the signature, namely the temporal observation. Formally, let L be a finite domain of labels, possibly including the null label ε. A temporal observation is a DAG

O = (N, L, A)   (1)

where N is the set of nodes, with each N ∈ N being marked with a non-empty subset of L, and A : N ↦ 2^N defines the arcs. A temporal precedence relationship ≺ among the nodes of the graph is defined as follows:

• If N ↦ N′ ∈ A then N ≺ N′;
• If N ≺ N′ and N′ ≺ N″ then N ≺ N″;
• If N ↦ N′ ∈ A then there is no N″ ∈ N such that N ≺ N″ ≺ N′.

Based on the last property, we say that O is in canonical form (that is, without any redundant temporal precedence). When no precedence relationship is defined between N and N′, such nodes are temporally unrelated. The set of candidate labels marking a node N is the logical content of the node, written ‖N‖. We assume ‖N‖ ≠ {ε}.
¹ Università di Brescia, Italy, e-mail: [email protected], federica.vivenzi.gmail.com, [email protected]
Figure 1. Certain observation (left) and uncertain observations O2 (center) and O1 (right).
An observation which includes a node whose logical content is not a singleton is affected by logical uncertainty. An observation which includes a pair of temporally unrelated nodes is affected by temporal uncertainty. A temporal observation where none of the above uncertainties holds is a linear observation. The signature of the reaction is in fact a linear observation. An uncertain observation O implicitly incorporates several candidate signatures, where each candidate is determined by selecting one label from each node in N without violating the temporal constraints imposed by the precedence relationships.
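For small observations the set of candidate signatures can be enumerated directly. The brute-force sketch below tries every total order consistent with the precedence arcs, with the empty string standing in for the null label ε; real implementations use the index space instead.

```python
from itertools import permutations

def candidate_signatures(nodes, labels, arcs):
    """Enumerate the candidate signatures of a (small) uncertain temporal
    observation: try every total order consistent with the precedence arcs
    and every label choice per node. '' plays the role of the null label."""
    sigs = set()
    for order in permutations(nodes):
        pos = {n: i for i, n in enumerate(order)}
        if any(pos[a] >= pos[b] for a, b in arcs):
            continue  # this order violates a temporal precedence
        seqs = [""]
        for n in order:
            seqs = [s + l for s in seqs for l in labels[n]]
        sigs.update(seqs)  # null labels vanish on concatenation
    return sigs

# Observation O2 of Example 1 below: N'1 first, N'2 and N'3 unordered,
# N'4 last (the node names are shorthand for the primed nodes).
sigs = candidate_signatures(
    nodes=["N1", "N2", "N3", "N4"],
    labels={"N1": {"a"}, "N2": {"b", ""}, "N3": {"c", "d"}, "N4": {"a", ""}},
    arcs={("N1", "N2"), ("N1", "N3"), ("N2", "N4"), ("N3", "N4")},
)
```

The enumeration yields exactly the twelve candidate signatures listed in Example 1, including the original signature abc.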
Figure 2. Isp(O1) (left) and two-step construction of Isp(O2) (right).
Example 1. Assume that a DES reaction generates the signature abc. Shown on the left in Fig. 1 is the corresponding completely certain (and therefore necessarily linear) observation, whose only candidate signature is abc. This is the observation considered by the diagnosis process in case no degradation has occurred. If, instead, a degradation has occurred, the resulting temporally uncertain observation may be, for instance, the one displayed in the center of the same figure, O2 = (N2, L2, A2), where N2 = {N′1, …, N′4} and L2 = {a, b, c, d, ε}. Node N′1 incorporates the first observable label, namely a. Then, either N′2 or N′3 follows, each of which involves two candidate labels, where ε is null. The last generated node is N′4, with a and ε being the final candidate labels. This means that the observer cannot devise the reciprocal emission order of the observable events relevant to nodes N′2 and N′3 since, for instance, the difference between their time tags was less than the synchronization error between the clocks of the two distinct channels transmitting these events from the system to the observer. Moreover, possibly owing to the observer's limited discrimination ability, the observer does not know whether there is actually an event relevant to N′2 or only noise. Likewise, the observer cannot discriminate which is the actual label relevant to N′3. Finally, node N′4 is due to noise on the transmission channel (in fact, no a label was generated after the first one); however, the observer does not know whether what was received is pure noise or label a instead. Based on the content of each node and on the partial temporal relationships among nodes, it is easy to show that ‖O2‖ includes the candidate signatures ac, ad, abc, abd, aca, ada, acb, adb, abca, abda, acba, adba, each of which is obtained by selecting one label for each node without violating the temporal constraints, and removing the null label. Notice that this observation is actually a degradation of the given signature abc, since ‖O2‖ includes the signature itself.

Assumption 1. Let O be the temporal observation relevant to a signature S. Among the candidate signatures of O is S.

Based on Assumption 1, all candidate signatures but one are spurious. However, the mode in which the signature S degrades to an observation O is, generally speaking, nondeterministic and, therefore, unpredictable, thereby making it impossible to ascribe O to S. The signature of a reaction may incidentally not have degraded; in such a case, the temporal observation is completely certain (a linear observation).

3 SUBSUMPTION

In similarity-based diagnosis of DESs [11], it is essential to understand whether the solution of the diagnosis problem ℘′ at hand can be supported by the knowledge yielded in solving a previous (different) diagnosis problem ℘, with the latter being stored in a knowledge base. Among other constraints, ℘ can be reused only if the observations O′ and O relevant to ℘′ and ℘, respectively, are linked by a subsumption relationship,
O ⊒ O′
(2)
namely, only if O subsumes O′. O subsumes O′ if and only if the set of candidate signatures of O includes all the candidate signatures of O′. The subsumption relationship is defined in terms of regular-language containment, relevant to the corresponding index spaces. The index space of an observation O, namely Isp(O), is a deterministic automaton with the property that its regular language is the set of candidate signatures of O, namely Lang(Isp(O)) = ‖O‖ [11]. Therefore, O subsumes O′ if and only if Lang(Isp(O)) ⊇ Lang(Isp(O′)).
(3)
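Equation (3) reduces subsumption checking to a containment test between the regular languages of two deterministic automata. As a rough, hedged illustration (not the algorithm of [11], and with a DFA encoding of our own), L(small) ⊆ L(big) can be decided by exploring the product automaton and searching for a state where the allegedly smaller DFA accepts while the larger one does not:

```python
# Illustrative sketch only: deciding Lang(big) ⊇ Lang(small) by a product
# construction. A DFA is encoded (our convention) as a triple
# (start, accepting_states, delta), with delta a dict (state, symbol) -> state;
# missing entries lead to an implicit non-accepting dead state.
from collections import deque

def dfa_contains(big, small, alphabet):
    """Return True iff the language of `small` is contained in that of `big`."""
    DEAD = object()  # implicit sink state

    def step(delta, q, s):
        return DEAD if q is DEAD else delta.get((q, s), DEAD)

    start = (big[0], small[0])
    seen, queue = {start}, deque([start])
    while queue:
        qb, qs = queue.popleft()
        # Counterexample: `small` accepts here while `big` does not.
        if qs in small[1] and qb not in big[1]:
            return False
        for s in alphabet:
            nxt = (step(big[2], qb, s), step(small[2], qs, s))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True
```

On the index spaces of Example 2, such a test would confirm the containment of languages; the exploration visits a number of product states bounded by the product of the two state counts (plus the dead state).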
The reason why observation subsumption supports reuse can be roughly explained as follows. The solution of ℘ yields an automaton Δ, a sort of diagnoser based on O, where each state is marked by a set of diagnoses and each transition is marked by a label in L ∪ {ε}. The regular language of Δ is the subset of the signatures relevant to O that comply with the model of the system, namely Lang(Δ) ⊆ ‖O‖. The same applies to a new problem ℘′ relevant to O′. However, if O ⊒ O′, that is, ‖O‖ ⊇ ‖O′‖, then Lang(Δ) ⊇ Lang(Δ′). In other words, Δ contains all the signatures of Δ′. This allows the diagnosis engine to reuse Δ in order to generate Δ′ based on O′. The advantage stems from the fact that such an operation is far more efficient than generating Δ′ from scratch, which would require heavy, low-level model-based reasoning.
Example 2. Suppose that a diagnosis problem inherent to observation O2, displayed in the center of Fig. 1, has to be solved. Let us assume that the portion of the knowledge base inherent to the considered DES does not include an observation equal to it, while it includes observation O1 = (N1, L1, A1), displayed on the right of the same figure, where N1 = {N1, …, N5} and L1 = {a, b, c, d, f, ε}. Then, a subsumption check has to be performed to ascertain whether O1 subsumes O2. This could be done by building the deterministic automaton generating the language of either observation, that is, the index space. This is a two-step process, as illustrated for observation O2 on the right of Fig. 2: first a nondeterministic automaton is drawn from the observation graph, then the equivalent deterministic automaton is built [8]. The deterministic automaton generating the language of observation O1 is displayed on the left of Fig. 2. It is easy to check that Lang(Isp(O1)) ⊇ Lang(Isp(O2)); therefore O1 subsumes O2.
Figure 3. Temporal relaxation.
Logical relaxation (λ): the logical content of a node is extended with a set of labels. Temporal relaxation (τ): a temporal constraint is removed by the following actions: (1) an arc N ↦ N′ is deleted, (2) for each parent node Np of N, an arc Np ↦ N′ is inserted, and (3) for each child node Nc of N′, an arc N ↦ Nc is inserted. Augmentation (α): a new node N is inserted, where ‖N‖ ⊋ {ε}, and possibly connected with other nodes in such a way that no new temporal constraint is generated between the previous nodes.²
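The three actions of the temporal-relaxation operator can be sketched over a plain successor-set encoding of the observation graph (the encoding and function name are ours, for illustration; the sketch also omits the suppression of redundant, transitively implied arcs required by the precedence relationship assumed for observations):

```python
# Hedged sketch of the temporal-relaxation operator τ: remove the arc
# n -> n_prime, then reconnect so that every other (implicit) temporal
# constraint is preserved. `arcs` maps each node to its set of successors.
def temporal_relaxation(arcs, n, n_prime):
    out = {k: set(v) for k, v in arcs.items()}   # work on a copy
    out[n].discard(n_prime)                      # (1) delete the arc
    for p, succs in out.items():                 # (2) parents of n point to n'
        if n in succs and p != n:
            succs.add(n_prime)
    for c in arcs.get(n_prime, set()):           # (3) n points to children of n'
        out[n].add(c)
    return out
```

On the scenario of Fig. 3 (parents N1, N2 of N; children N3, N4 of N′), this yields exactly the four inserted arcs described in Example 4.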
4 COVERAGE
Checking observation subsumption by regular-language containment may be prohibitive in real applications. In order to cope with this complexity, an alternative checking technique, based on the notion of coverage, was proposed in [12], where it was proven that coverage entails subsumption, that is, it is a sufficient condition for subsumption. Definition 1. (Coverage) Let O = (N, L, A) and O′ = (N′, L′, A′) be two temporal observations, where N = {N1, …, Nn} and N′ = {N′1, …, N′n′}. We say that O covers O′, written
Operators λ and α do not alter the existing temporal constraints of O′, the former because it affects the logical content only, the latter by definition. A doubt may arise about operator τ: does it change the existing temporal constraints among the nodes of O′? The answer, provided by Proposition 1 below, is that it removes just one temporal constraint between a pair of nodes while leaving all the other ones unchanged. In this sense, τ is the finest-grained temporal relaxation operator.
O ⊵ O′, (4)
if and only if there exists a subset N̄ of N, with N̄ = {N̄1, …, N̄n′}, having the same cardinality as N′, such that, denoting N̂ = N \ N̄, we have:
Example 4. Shown in Fig. 3 is the effect of the temporal-relaxation operator τ applied to the observation on the left, where the temporal precedence between N and N′ is removed. According to the definition, and as outlined on the right of the figure, the removal of the arc N ↦ N′ is accompanied by the insertion of four arcs: two arcs from the parents of N to N′, namely N1 ↦ N′ and N2 ↦ N′, and two arcs from N to the children of N′, namely N ↦ N3 and N ↦ N4. These allow the relaxed observation to keep all the (implicit) temporal constraints other than N ↦ N′.
(ε-coverage): ∀N ∈ N̂ (ε ∈ ‖N‖);
(Logical coverage): ∀i ∈ [1 .. n′] (‖N̄i‖ ⊇ ‖N′i‖);
(Temporal coverage): for each path N̄i ⇝ N̄j in O such that both N̄i and N̄j are in N̄, and all (if any) intermediate nodes of the path are in N̂, we have N′i ≺ N′j in O′.
Example 3. With reference to the observations in Fig. 1, it is easy to show that O1 ⊵ O2. Assume the subset of N1 being N̄1 = {N2, N1, N4, N5}. Hence, N̂1 = {N3}. Clearly, ε-coverage holds, as ε ∈ ‖N3‖. Logical coverage holds too, as ‖N2‖ ⊇ ‖N′1‖, ‖N1‖ ⊇ ‖N′2‖, ‖N4‖ ⊇ ‖N′3‖, and ‖N5‖ ⊇ ‖N′4‖. It is easy to check that temporal coverage occurs as well. For instance, for the path ⟨N1, N3, N5⟩, where N3 ∈ N̂1, we have N′2 ≺ N′4 in O2.
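For illustration only, the three conditions can be verified mechanically once a candidate alignment between the nodes of O and O′ has been fixed; the search for that alignment is the job of the coverage-checking algorithm in [5] and is not sketched here. The data encoding (label sets, successor sets) and all names below are our assumptions:

```python
EPS = 'eps'  # stands for the null label ε

def transitive_closure(arcs):
    """All pairs (a, b) such that a path a ⇝ b exists."""
    pairs = set()
    for a in arcs:
        stack = list(arcs[a])
        while stack:
            b = stack.pop()
            if (a, b) not in pairs:
                pairs.add((a, b))
                stack.extend(arcs.get(b, set()))
    return pairs

def covers(O, Op, align):
    """Check ε-, logical, and temporal coverage of Op by O under a given
    alignment: align[i] is the node of O matched with node i of Op.
    O and Op are pairs (contents, arcs): contents maps node -> label set,
    arcs maps node -> successor set."""
    cont, arcs = O
    cont_p, arcs_p = Op
    matched = set(align.values())
    discarded = set(cont) - matched                          # the set N̂
    if any(EPS not in cont[n] for n in discarded):           # ε-coverage
        return False
    if any(not cont[align[i]] >= cont_p[i] for i in align):  # logical coverage
        return False
    prec_p = transitive_closure(arcs_p)
    inv = {v: k for k, v in align.items()}
    for i in align:                                          # temporal coverage
        stack, seen = [align[i]], set()
        while stack:
            for s in arcs.get(stack.pop(), set()):
                if s in matched:             # path via discarded nodes only
                    if (i, inv[s]) not in prec_p:
                        return False
                elif s not in seen:
                    seen.add(s)
                    stack.append(s)
    return True
```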
Proposition 1. Let O = τ(O′), where N ↦ N′ is the arc removed from O′ by τ. Then, for each pair (Ni, Nj) in O′ such that (Ni, Nj) ≠ (N, N′) and Ni ≺ Nj in O′, we have Ni ≺ Nj in O. Example 5. With reference to Fig. 3 and Example 4, it is easy to check that each temporal constraint in the observation outlined on the left (other than N ↦ N′) is preserved within the observation outlined on the right (the result of temporal relaxation), such as, for instance, N2 ≺ N4 or N1 ≺ N′, as claimed by Proposition 1. By contrast, such constraints would have been implicitly removed if, after the removal of N ↦ N′, we had not inserted the additional arcs required by τ.
5 RELAXATION
Relaxation transforms an observation by relaxing its logical and temporal constraints. Definition 2. (Relaxation) Let O = (N, L, A) and O′ = (N′, L′, A′) be two temporal observations. We say that O is a relaxation of O′, written O ↠ O′ (5)
Example 6. Consider observations O1 and O2 displayed in Fig. 1. We show that O1 can be generated by a relaxation expression applied
iff O can be obtained from O′ by the application of a (possibly empty) sequence of the relaxation operators λ, τ, and α, defined as follows:
² Formally, for each pair of nodes N1 and N2 where N1 ≠ N and N2 ≠ N, if N1 ≺ N2 in the augmented graph, then N1 ≺ N2 in the original graph too.
Figure 5. Observations O (left) and O′ (right). Figure 4. Relaxed observations Oλ (left), Oτ (center), and Oα (right).
Theorem 2. Relaxation is equivalent to coverage: O ↠ O′ ⟺ O ⊵ O′.
to O2 as follows: O1 = α(τ(λ1(λ2(λ3(λ4(O2))))))
(6)
(9)
Theorem 2 allows us to test relaxation by coverage. An algorithm for testing coverage is provided in [5].
where λ1 is the logical relaxation extending ‖N′1‖ with {b} and, similarly, λ2, λ3, and λ4 extend ‖N′2‖, ‖N′3‖, and ‖N′4‖ with {a}, {b}, and {b}, respectively. After the application of the logical relaxations, the intermediate resulting observation, namely Oλ, is shown on the left of Fig. 4. Note how the topology of O2 is preserved, while the logical content of each node has been extended. Then, τ is the temporal relaxation of Oλ obtained by removing the temporal constraint between N′1 and N′2, resulting in the new observation Oτ displayed in the center of Fig. 4. Based on the definition of temporal relaxation, we should insert an arc from each parent of N′1 to N′2, which is not applicable, as N′1 has no parents. Besides, we should also insert an arc from N′1 to N′4, but this would violate the third condition on the temporal precedence relationship assumed for observations, which preserves the observation graph from redundant arcs (in fact, N′1 ≺ N′4 by means of the intermediate node N′3). Finally, augmentation α is applied to Oτ by inserting a new node N′α between N′2 and N′4, where ‖N′α‖ = {f, ε}. This yields the precedence relationships N′2 ≺ N′α ≺ N′4, thereby preserving all the temporal constraints in Oτ, as required. The resulting observation, namely Oα, is displayed on the right of Fig. 4. Note how Oα in fact coincides with O1 (right of Fig. 1). In other words, O1 is a relaxation of O2.
Corollary 2.1. Coverage entails subsumption: O ⊵ O′ ⟹ O ⊒ O′.
(10)
Note 2. Coverage is stronger than subsumption: O ⊒ O′ ⇏ O ⊵ O′.
(11)
Note 2 is a consequence of Theorem 2 and Note 1. Although, generally speaking, relaxation is not equivalent to subsumption, such an equivalence holds when the subsumed observation is linear. Proposition 2. Let O′ be a linear observation. Then, O ⊒ O′ ⟹ O ↠ O′.
(12)
With O′ linear, a corollary of Proposition 2 and Theorem 2 is the equivalence of subsumption, coverage, and relaxation. Corollary 2.1. Let O′ be a linear observation. Then, O ⊒ O′ ⟺ O ↠ O′ ⟺ O ⊵ O′.
(13)
Theorem 1. Relaxation entails subsumption: (7)
A final corollary concerns the nature of the temporal observation with respect to the relevant signature.
Example 7. On the one hand, in Example 6 we have shown that O1 is a relaxation of O2, where O1 and O2 are displayed in Fig. 1. On the other hand, in Example 2 we have also shown that O1 subsumes O2. This is consistent with Theorem 1.
Corollary 2.2. Let O be the degradation of a signature S, that is, let O be a temporal observation of S that complies with Assumption 1. Then, O is a relaxation of S, and, vice versa, a relaxation of S is a degradation of S.
O ↠ O′ ⟹ O ⊒ O′.
Note 1. Relaxation is stronger than subsumption: O ⊒ O′ ⇏ O ↠ O′.
Example 9. To support the claims of Proposition 2 and its two corollaries, we consider observations Oℓ and O2 in Fig. 1. Example 1 has shown that ‖O2‖ includes the only signature of the linear observation Oℓ; therefore we can conclude that O2 ⊒ Oℓ. Now we show that O2 ↠ Oℓ, as O2 can be generated by a relaxation expression applied to Oℓ as follows:
(8)
Example 8. To be convinced of the claim of Note 1, it suffices to show an example in which subsumption holds while relaxation does not. Consider observations O and O′ displayed in Fig. 5. Notice how, unlike O, O′ does not force any temporal constraint between its two nodes. Incidentally, both observations involve just one candidate signature, namely S = aa. Thus, since ‖O‖ = ‖O′‖ = {aa}, both observations subsume each other, in particular O ⊒ O′. However, it is clear that O is not a relaxation of O′.
O2 = τ(λc(λb(α(Oℓ))))
(14)
where augmentation α is applied to Oℓ by inserting a new node, say Nd, whose logical content is {a, ε}, as a child of Nc; λb and λc extend ‖Nb‖ and ‖Nc‖ with {ε} and {d}, respectively; τ removes the temporal constraint between Nb and Nc, which requires deleting arc Nb ↦ Nc and inserting two new arcs, namely Na ↦ Nc and Nb ↦ Nd. The resulting observation equals O2; thus O2 ↠ Oℓ.
Theorem 1 and Note 1 offer evidence that relaxation is only a sufficient condition for subsumption, not a necessary one. However, experimental results suggest that, when relaxation does not hold, subsumption is unlikely to hold either.
It can be checked that O2 ⊵ Oℓ, where Oℓ = (Nℓ, Lℓ, Aℓ), with Nℓ = {Na, Nb, Nc}, and O2 = (N2, L2, A2). Assume N̄2 = {N′1, N′2, N′3} and N̂2 = {N′4}. Clearly, ε-coverage holds, as ε ∈ ‖N′4‖. Logical coverage holds too, as ‖N′1‖ ⊇ ‖Na‖, ‖N′2‖ ⊇ ‖Nb‖, and ‖N′3‖ ⊇ ‖Nc‖. Temporal coverage trivially occurs, as no path in O2 has N′4 as an intermediate node. Thus, we conclude that O2 ⊵ Oℓ. All these conclusions are consistent with Proposition 2 and Corollary 2.1. Given that, as shown in Example 1, O2 is a degradation of Oℓ, they are also consistent with Corollary 2.2.
6 CONCLUSION

This paper, which is theoretical in nature, deals with the notion of an uncertain DES observation. While in previous works by the authors an uncertain observation was the output of an undefined uncertainty function [12] applied to the certain sequence of events generated by a DES, this nondeterministic function has now been formally substantiated. Three relaxation operators have been introduced, which are the formal counterpart of the physical effects that make what is observed different from what has been generated. Roughly, the logical operator corresponds to the superposition of noise on an event transmitted from the DES to the observer, so that the observer cannot univocally detect which event was actually generated out of a set of possible events. The same operator also corresponds to the observer's inability to exactly recognize a specific event, owing to limited discrimination capabilities. The temporal operator can account for several effects: for instance, it corresponds to synchronization errors among the clocks of distinct channels that convey events to the observer, so that the observer cannot uncover the relative emission order of every pair of events; it also corresponds to the transmission of events to the observer over distinct channels in a scenario wherein no time tag is available; etc. Finally, the augmentation operator corresponds to pure noise transmitted on a communication channel from the DES to the observer, where the observer is unable to distinguish whether what it has received is just noise or an event. These three operators are the causes of the two kinds of uncertainty affecting observations: both the logical and the augmentation operator cause logical uncertainty, while the temporal operator causes temporal uncertainty.

An outcome of the paper (Corollary 2.2) is that the application of such operators to a sequence of certain events generated by a DES (a signature) produces an uncertain observation that fulfills the assumption always made in previous works by the authors as to the uncertain observation considered within a diagnostic problem. However, the relaxation operators are not only the generators of an uncertain observation: they can also be applied to an existing uncertain observation, thus obtaining a still more uncertain observation. While delving into this possibility, the authors found that:

- the relaxation of an uncertain observation subsumes such an observation;
- relaxation and coverage are equivalent notions.

The first point provides a rationale for interpreting conclusions drawn (and proven) by previous research on similarity-based diagnosis [11]. Based on the second point, since coverage is a sufficient condition for observation subsumption, so is relaxation. Therefore, in principle, subsumption checking could be performed as relaxation checking. Checking whether an observation is a relaxation of another one means checking whether the former can be obtained starting from the latter by applying the relaxation operators: this is actually a planning problem. The convenience of such a check from the computational viewpoint is an interesting topic for future research. Research is ongoing to state the conditions under which an observation is subsumed by another one that neither covers nor relaxes it. Roughly, this may occur if the subsumed observation includes (at least) a set of temporally unrelated nodes all having the same logical content. If preprocessing is performed that detects and removes such temporal uncertainty, a new observation is obtained that has the same extension as the former and whose subsuming observations necessarily cover/relax it. Performing subsumption checking on this equivalent observation promises to enhance the effectiveness of similarity-based diagnosis of DESs.

REFERENCES
[1] D. Brand and P. Zafiropulo, 'On communicating finite-state machines', Journal of the ACM, 30(2), 323–342, (1983).
[2] C.G. Cassandras and S. Lafortune, Introduction to Discrete Event Systems, volume 11 of The Kluwer International Series in Discrete Event Dynamic Systems, Kluwer Academic Publishers, Boston, MA, 1999.
[3] L. Console, C. Picardi, and M. Ribaudo, 'Process algebras for systems diagnosis', Artificial Intelligence, 142(1), 19–51, (2002).
[4] R. Debouk, S. Lafortune, and D. Teneketzis, 'Coordinated decentralized protocols for failure diagnosis of discrete-event systems', Journal of Discrete Event Dynamic Systems: Theory and Application, 10, 33–86, (2000).
[5] A. Ducoli, G. Lamperti, E. Piantoni, and M. Zanella, 'Coverage techniques for checking temporal-observation subsumption', in Eighteenth International Workshop on Principles of Diagnosis – DX'07, pp. 59–66, Nashville, TN, (2007).
[6] E. Fabre, A. Benveniste, S. Haar, and C. Jard, 'Distributed monitoring of concurrent and asynchronous systems', Journal of Discrete Event Dynamic Systems, 15(1), 33–84, (2005).
[7] A. Grastien, M.O. Cordier, and C. Largouët, 'Incremental diagnosis of discrete-event systems', in Sixteenth International Workshop on Principles of Diagnosis – DX'05, pp. 119–124, Monterey, CA, (2005).
[8] J.E. Hopcroft, R. Motwani, and J.D. Ullman, Introduction to Automata Theory, Languages, and Computation, Addison-Wesley, Reading, MA, third edn., 2006.
[9] G. Lamperti and M. Zanella, 'Diagnosis of discrete-event systems from uncertain temporal observations', Artificial Intelligence, 137(1–2), 91–163, (2002).
[10] G. Lamperti and M. Zanella, 'Dynamic diagnosis of active systems with fragmented observations', in Sixth International Conference on Enterprise Information Systems – ICEIS'2004, pp. 249–261, Porto, Portugal, (2004).
[11] G. Lamperti and M. Zanella, 'Flexible diagnosis of discrete-event systems by similarity-based reasoning techniques', Artificial Intelligence, 170(3), 232–297, (2006).
[12] G. Lamperti and M. Zanella, 'On monotonic monitoring of discrete-event systems', in Eighteenth International Workshop on Principles of Diagnosis – DX'07, pp. 130–137, Nashville, TN, (2007).
[13] J. Lunze, 'Diagnosis of quantized systems based on a timed discrete-event model', IEEE Transactions on Systems, Man, and Cybernetics – Part A: Systems and Humans, 30(3), 322–335, (2000).
[14] Y. Pencolé and M.O. Cordier, 'A formal framework for the decentralized diagnosis of large scale discrete event systems and its application to telecommunication networks', Artificial Intelligence, 164, 121–170, (2005).
[15] R. Su and W.M. Wonham, 'Global and local consistencies in distributed fault diagnosis for discrete-event systems', IEEE Transactions on Automatic Control, 50(12), 1923–1935, (2005).
The Concept of Entropy by means of Generalized Orders of Magnitude Qualitative Spaces¹
Llorenç Roselló and Francesc Prats and Mónica Sánchez² and Núria Agell³
represent it. Taking into account that entropy can be used to measure information, this work is intended to be a first step towards this measure by means of orders of magnitude qualitative spaces.
Abstract. A new concept of generalized absolute orders of magnitude qualitative spaces is introduced in this paper. The new structure makes it possible to define sets of qualitative labels of any cardinality, and is consistent with the classical structure of qualitative spaces of absolute orders of magnitude and with the classical interval algebra. In addition, the algebraic structure of these spaces ensures initial conditions for adapting measure theory to a qualitative environment. This theory provides the appropriate framework in which to introduce the concept of entropy and, consequently, the opportunity to measure the gain or loss of information when working within qualitative spaces. The results obtained are significant in terms of situations which arise naturally in many real applications when dealing with different levels of precision.
The concept of entropy has its origins in the nineteenth century, particularly in thermodynamics and statistics. The theory has been developed from two aspects: the macroscopic, as introduced by Carnot, Clausius, Gibbs, Planck and Carathéodory, and the microscopic, developed by Maxwell and Boltzmann [15]. The statistical concept of Shannon's entropy, related to the microscopic aspect, is a measure of the amount of information [18], [2]. In order to define the concept of information within the QR framework, this paper adapts the basic principles of Measure Theory [8], [5] to give OM a structure in which to define the concept of entropy and, consequently, the concept of information.
1 INTRODUCTION
Section 2 defines the concept of generalized absolute orders of magnitude qualitative spaces. In Section 3, the algebraic structure of these spaces is analyzed in order to ensure initial conditions in which to adapt Measure Theory. A measure and the concept of entropy in the generalized absolute orders of magnitude spaces are given in Sections 4 and 5, respectively. The paper ends with several conclusions and an outline of some proposals for future research.
Qualitative Reasoning (QR) is a subarea of Artificial Intelligence that seeks to understand and explain human beings’ ability for qualitative reasoning [6], [11]. The main objective is to develop systems that permit operating in conditions of insufficient numerical data or in the absence of such data. As indicated in [22], this could be due to both a lack of information as well as to an information overload. A main goal of Qualitative Reasoning is to tackle problems in such a way that the principle of relevance is preserved; that is to say each variable has to be valued with the level of precision required [7]. It is not unusual for a situation to arise in which it is necessary to work simultaneously with different levels of precision, depending on the available information, in order to ensure interpretability of the obtained results. To this end, the mathematical structures of Orders of Magnitude Qualitative Spaces (OM) were introduced.
2 GENERALIZED ABSOLUTE ORDERS OF MAGNITUDE QUALITATIVE SPACES S∗g

Order of magnitude models are an essential piece among the theoretical tools available for qualitative reasoning about physical systems ([10], [19]). They aim at capturing order-of-magnitude commonsense inferences ([21]), such as those used in the engineering world. Order of magnitude knowledge may be of two types: absolute or relative. Absolute orders of magnitude are represented by a partition of R, each element of the partition standing for a basic qualitative class. A general algebraic structure, called Qualitative Algebra or Q-algebra, was defined on this framework ([9]), providing a mathematical structure which unifies sign algebra and interval algebra through a continuum of qualitative structures built from the roughest to the finest partition of the real line. The most referenced order of magnitude Q-algebra partitions the real line into 7 classes, corresponding to the labels: Negative Large (NL), Negative Medium (NM), Negative Small (NS), Zero (0), Positive Small (PS), Positive Medium (PM) and Positive Large (PL). Q-algebras and their algebraic properties have been extensively studied ([13], [22]).
The word information appears constantly in QR. However, its meaning is as yet undefined within a qualitative context. The implicit and explicit use of the term and concept addresses the need to define and, perhaps paradoxically, to quantify it. This work presents a way of measuring the amount of information of a system when orders of magnitude descriptions are used to
¹ This work has been partly funded by MEC (Spanish Ministry of Education and Science), AURA project (TIN2005-08873-C02). The authors would like to thank their colleagues of the GREC research group on knowledge engineering for helpful discussions and suggestions.
² Polytechnical University of Catalonia, Barcelona, Spain, email: [email protected], [email protected], [email protected]
³ Esade, Ramon Llull University, Barcelona, Spain, email: [email protected]
It is important to remark that the function B : I → P(X) determines the elements of S and S∗g, and the cardinality of the set I ⊂ R determines the cardinality of S and therefore that of S∗g.
Order of magnitude knowledge may also be of a relative type, in the sense that a quantity is qualified with respect to another quantity by means of a set of binary order-of-magnitude relations. The seminal relative orders of magnitude model was the formal system FOG ([14]), based on three basic relations used to represent the intuitive concepts of "negligible with respect to" (Ne), "close to" (Vo) and "comparable to" (Co), and described by 32 intuition-based inference rules. The relative orders of magnitude models that were proposed later improved FOG not only in the necessary aspect of a rigorous formalisation, but also by permitting the incorporation of quantitative information when available and the control of the inference process, in order to obtain valid results in the real world ([12], [3], [4]).
The classical orders of magnitude qualitative spaces [22] verify the conditions of the generalized model that has just been introduced. These models are built from a set of ordered basic qualitative labels determined by a partition of the real line. Let X be the real interval [a1, an), and consider the partition of this set given by {a2, . . . , an−1}, with a1 < a2 < . . . < an−1 < an. The set of basic labels is S = {B1, . . . , Bn−1}, where, for 1 ≤ i ≤ n − 1, Bi is the real interval [ai, ai+1). The set of indexes is I = {1, 2, . . . , n − 1}.
In ([20], [22]) the conditions under which an absolute orders of magnitude model and a relative orders of magnitude model are consistent are analysed, and the constraints that consistency implies are determined and interpreted.
This paper proposes a further step towards the generalization of qualitative orders of magnitude. This generalization makes it possible to define orders of magnitude as either a discrete or continuous set of labels, providing the theoretical basis on which to develop a Measure Theory in this context.
Figure 1. Classical qualitative labels: the basic labels B1, …, Bn−1 over the landmarks a1 < a2 < … < an.
For 1 ≤ i < j ≤ n − 1 the non-basic label [Bi , Bj ) is: [Bi , Bj ) = {Bi , Bi+1 , . . . , Bj−1 },
Definition 1 Let X be a non-empty set, I a subset of R, and B : I → P(X) an injective function. Then each B(t) = Bt ⊂ X is a generalized basic label on X and the set S of generalized basic labels on X is S = {Bt | t ∈ I}.
and it is interpreted as the real interval [ai , aj ). For 1 ≤ i ≤ n − 1 the non-basic label [Bi , B∞ ) is: [Bi , B∞ ) = {Bi , Bi+1 , . . . , Bn−1 }, and it is interpreted as the real interval [ai , an ).
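As a small, hedged sketch (the encoding and function names are ours, not part of the paper), a value can be mapped to its basic label by binary search over the landmarks, and a non-basic label [Bi, Bj) can be mapped back to its real interval [ai, aj):

```python
import bisect

def basic_label(landmarks, x):
    """Index i of the basic label Bi = [a_i, a_{i+1}) containing x.
    `landmarks` is the sorted list [a_1, ..., a_n]; values outside
    [a_1, a_n) are rejected."""
    if not landmarks[0] <= x < landmarks[-1]:
        raise ValueError("value outside [a_1, a_n)")
    return bisect.bisect_right(landmarks, x)  # 1-based label index

def non_basic_interval(landmarks, i, j):
    """The non-basic label [Bi, Bj) read back as the real interval [a_i, a_j)."""
    return (landmarks[i - 1], landmarks[j - 1])
```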
Note that if t ≠ t′, then Bt ≠ Bt′.
The complete universe of description for the Orders of Magnitude Space is the set
Definition 2 If i, j ∈ I, with i < j, the generalized non-basic label [Bi , Bj ) is defined by
Sn = { [Bi , Bj ) | Bi , Bj ∈ S, i ≤ j} ∪ { [Bi , B∞ ) | Bi ∈ S},
[Bi , Bj ) = {Bt | t ∈ I, i ≤ t < j}.
which is called the absolute orders of magnitude qualitative space with granularity n, also denoted OM (n). In this case, S∗g = {∅} ∪ Sn .
In the case i = j ∈ I, the convention [Bi , Bi ) = {Bi } will be used. If necessary, [Bi , Bi ) = {Bi } can be identified with the basic label Bi .
There is a partial order relation ≤P in Sn “to be more precise than”, given by: L1 ≤P L2 ⇐⇒ L1 ⊂ L2 . The least precise label is denoted by ? and it is the label [B1 , B∞ ), which corresponds to the interval [a1 , an ).
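With a label [Bi, Bj) encoded as the index pair (i, j), using j = ∞ for [Bi, B∞) (an illustrative convention of ours), the relation ≤P reduces to containment of index intervals:

```python
# Sketch: L1 ≤P L2 ("L1 is more precise than L2") holds iff the set of
# basic labels spanned by L1 is included in the one spanned by L2.
INF = float('inf')

def more_precise(l1, l2):
    (i1, j1), (i2, j2) = l1, l2
    return i2 <= i1 and j1 <= j2
```

For instance, every label satisfies L ≤P ?, with the least precise label ? encoded as (1, ∞).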
Definition 3 If i ∈ I, the generalized non-basic label [Bi , B∞ ) is defined by [Bi , B∞ ) = {Bt | t ∈ I, i ≤ t}.
Note that B∞ is a symbol, not a basic label. Definition 4 The set of Generalized Orders of Magnitude S∗g is: S∗g = {∅} ∪ {[Bi , Bj ) | i, j ∈ I, i ≤ j} ∪ {[Bi , B∞ ) | i ∈ I}.
In this definition of S∗g the basic label Bi has been identified with the singleton {Bi }.
Figure 2. The space Sn, ordered by precision: from the least precise label ? at the top (abstraction) down to the basic labels at the bottom (precision).
3. If [Bi , Bj ), [Bk , Bl ) ∈ S∗g such that [Bi , Bj ) ⊂ [Bk , Bl ), then two cases are considered:
This structure permits working with all different levels of precision from the label ? to the basic labels.
(a) If Bk = Bi or Bl = Bj , it suffices to take D0 = [Bi , Bj ) and D1 = [Bk , Bl ).
In some theoretical works, orders of magnitude qualitative spaces are constructed by partitioning the whole real line (−∞, +∞) instead of a finite real interval [a1, an). However, in most real-world applications the involved variables do have a lower bound a1 and an upper bound an, and values less than a1 or greater than an are then considered outliers and are not treated like the other values.
(b) Otherwise, take D0 = [Bi , Bj ), D1 = [Bi , Bl ) and D2 = [Bk , Bl ). The cases [Bi , Bj ) ⊂ [Bk , B∞ ) and [Bi , B∞ ) ⊂ [Bk , B∞ ) are proved in a similar way.
The classical sign algebra S = {−, 0, +} was the first absolute orders of magnitude space considered by the QR community. It corresponds to the case S = {B−1 = (−∞, 0), B0 = {0}, B1 = (0, +∞)}. The sign algebra is obtained via a partition of the real line given by a unique landmark, 0. The classical orders of magnitude qualitative spaces are built from partitions via a set of landmarks {a2, . . . , an−1}, and the classical interval algebra is built from the finest partition of the real line, whose landmarks are all the real numbers.
Definition 6 A class A of subsets of a non-empty set X is called an algebra when it is closed under finite unions and under complementation of its elements. If finite unions are replaced by countable unions, it is called a σ-algebra. The smallest σ-algebra that contains S∗g ⊂ P(X) is called the σ-algebra generated by S∗g, denoted by Σ(S∗g).
It is worth remarking on the significance of the presented mathematical formalism: it permits lumping together a family of S∗g forming a continuum from the sign algebra S = {−, 0, +} to the interval algebra corresponding to S = R.
Definition 7 Let X be a non-empty set and C ⊂ P(X), with ∅ ∈ C. A measure on C is a mapping µ : C → [0, +∞] satisfying the following properties:
3 THE MEASURE SPACE (P(X), Σ(S∗g), µ∗)
1. µ(∅) = 0.
2. For any sequence (En)_{n=1}^{∞} of disjoint sets of C such that ⋃_{n=1}^{+∞} En ∈ C, then
To introduce the classical concept of entropy by means of qualitative orders of magnitude spaces, Measure Theory is required. This theory seeks to generalize the concept of “length”, “area”and “volume”, understanding that these quantities need not necessarily correspond to their physical counterparts, but may in fact represent others. The main use of the measure is to define the concept of integration for orders of magnitude spaces. First, it is necessary to define the algebraic structure on which to define a measure.
µ(⋃_{n=1}^{+∞} En) = Σ_{n=1}^{+∞} µ(En).
Any measure µ on the whole P(X), when restricted to S∗g, gives a measure on S∗g. Definition 8 Let µ be a measure on S∗g. The outer measure of an arbitrary subset A of X is defined by: µ∗(A) = inf{ Σ_{k∈N} µ([Bs_k, Bt_k)) | A ⊂ ⋃_{k∈N} [Bs_k, Bt_k) }.
Definition 5 A class of sets ℑ is called a semi-ring if the following properties are satisfied:
1. ∅ ∈ ℑ. 2. If A, B ∈ ℑ, then A ∩ B ∈ ℑ. 3. If A, B ∈ ℑ, A ⊂ B, then ∃n ∈ N, n ≥ 1 and ∃ D1 , D2 , . . . , Dn such that A = D0 ⊂ D1 ⊂ . . . ⊂ Dn = B, with Dk − Dk−1 ∈ ℑ, ∀k ∈ {1, . . . , n}.
Carathéodory's theorem [8] assures that the µ∗ of Definition 8 is a measure on Σ(S∗g), and (P(X), Σ(S∗g), µ∗) is called a measure space. It can be proved that, since S∗g is a semi-ring, µ∗|S∗g = µ. In this measure space an integration with respect to µ∗ can be defined, and, because µ∗|S∗g = µ, in any integration on S∗g the measure µ∗ can be replaced by µ.
Proposition 1 S∗g is a semi-ring.
Proof:
1. ∅ ∈ S∗g by definition.
2. If [Bi, Bj), [Bk, Bl) ∈ S∗g, it is trivial to check that [Bi, Bj) ∩ [Bk, Bl) ∈ S∗g, taking into account the relative position between the real intervals [i, j) and [k, l). The cases of the intersections [Bi, Bj) ∩ [Bk, B∞) and [Bi, B∞) ∩ [Bk, B∞) are analogous.

4 ENTROPY BY MEANS OF S∗g
Once the integration in S∗g has been defined, entropy can then be considered. To introduce the concept of entropy by means of qualitative orders of magnitude, it is necessary to consider the qualitativization function between the set to be qualitatively described and the space of qualitative labels, S∗g .
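As an illustration, a qualitativization function for precise real values can be sketched by storing the landmarks a_1 < . . . < a_n and returning the basic label whose interval contains the value. The index-pair encoding of labels and the landmark values below are hypothetical, chosen only to make Q concrete; for a precise value the minimum (most precise) label is simply the basic label containing it.

```python
import bisect

# Landmarks a_1 < ... < a_n partition the known range into basic labels
# B_i = [a_i, a_{i+1}); a label [B_i, B_j) is encoded as the pair (i, j).
# (Hypothetical minimal representation, for illustration only.)
A = [0.0, 1.0, 2.0, 5.0, 10.0]         # a_1 .. a_5, basic labels B1..B4

def qualitativize(value):
    """Q(a_t) for a precise real value: the most precise label describing
    it is the basic label B_i whose interval [a_i, a_{i+1}) contains it."""
    i = bisect.bisect_right(A, value)  # 1-based index of greatest a_i <= value
    if not 1 <= i < len(A):
        raise ValueError("value outside the known range [a_1, a_n)")
    return (i, i + 1)                  # the basic label B_i

assert qualitativize(1.5) == (2, 3)    # 1.5 lies in [a_2, a_3) = [1, 2), so B2
assert qualitativize(7.0) == (4, 5)    # 7.0 lies in [a_4, a_5) = [5, 10), so B4
```

Imprecise elements of Λ would instead be mapped to the minimum non-basic label covering them; the sketch only covers the precise case.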
To simplify the notation, let us express the elements of S∗g with calligraphic letters; thus, for example, elements [Bi , Bj ) or [Bi , B∞ ) shall be denoted by E. Let Λ be the set that represents a magnitude or a feature that is qualitatively described by means of the labels of S∗g . Since Λ can represent both a continuous magnitude, such as position or temperature, and a discrete feature, such as salary or colour, Λ can be considered as the range of a function a : I ⊂ R → Y, where Y is a convenient set. For instance, if a is a room temperature during a period of time I = [t0 , t1 ], then Λ is the range of temperatures during this period of time. Another example arises when I = {1, . . . , n} and Λ = {a(1), . . . , a(n)} corresponds to n people whose eye colour we aim to describe. In general, Λ = {a(t) = at | t ∈ I}.

The process of qualitativization is given by a function Q : Λ → S∗g , where at 7→ Q(at ) = Et = the minimum label (with respect to the inclusion ⊂) which describes at , i.e. the most precise qualitative label describing at . All the elements of the set Q−1 (Et ) are "representatives" of the label Et or "are qualitatively described" by Et . They can be considered qualitatively equal.

The function Q induces a partition in Λ by means of the equivalence relation: a ∼Q b ⇐⇒ Q(a) = Q(b).

This partition will be denoted by Λ/ ∼Q , and its equivalence classes are the sets Q−1 (Q(aj )) = Q−1 (Ej ), ∀j ∈ J ⊂ I. Each of these classes contains all the elements of Λ which are described by the same qualitative label.

Definition 9 Let µ be a measure on S∗g such that

   ∫_{∪_{i∈I} {Bi}} dµ = 1.

The entropy H with respect to the partition Λ/ ∼Q is the integral:

   H(Λ/ ∼Q ) = − ∫_{Q(Λ)} log µ dµ,    (1)

where Q(Λ) is the set of labels mapped by Q (logarithms are to the base 2). The expression (1) can be written as:

   H(Λ/ ∼Q ) = − Σ_{j∈J} log(µ(Ej ))µ(Ej ).    (2)

As in most definitions of entropy, it gives a measure of the amount of information. In Definition 9, entropy can be interpreted as the measure of the amount of information provided by the knowledge of Λ by means of Q.

Elements of Λ represented by quite precise labels will provide a bigger contribution to the entropy H than those represented by less precise labels. Consider the particular case in which Q maps all the elements of Λ to the same label: Q(Λ) = {E}; then Λ/ ∼Q = Λ and H(Λ/ ∼Q ) = −µ(E ) log(µ(E )) ≠ 0.

Nevertheless, the inner features of the orders of magnitude structure considered introduce some differences between the entropy defined in (1) and the entropy defined by Rokhlin [15] and Shannon [18], as can be seen in the following example:

Example 1 Suppose that Q maps each element of Λ to the same label E ∈ S∗g ; then the induced partition Λ/ ∼Q contains only one class, equal to Λ, and the entropy defined in equation (1) is H(Λ/ ∼Q ) = −µ(E ) log µ(E ). In the classical interpretation of entropy, the knowledge about Λ induced by this particular Q would lead to an entropy equal to zero, because this trivial partition of Λ is understood to provide no information at all. On the contrary, in the approach presented in this paper, although Q maps the whole set to the same label, it can still give certain information about Λ: the intrinsic information provided by the measure of the label itself.

Two different measures that illustrate this fact are considered in the following examples. On the one hand, the first differs from Shannon's classical interpretation of entropy as noted in Example 1: although Q maps each element of Λ to the same label E ∈ S∗g , the entropy is not equal to zero. On the other hand, the entropy corresponding to Example 3 behaves like the classical interpretation of Shannon and Rokhlin, in the sense just discussed. Example 2 takes into account the lengths of the intervals corresponding to the labels, and Example 3 is related to the cardinality of the set of representatives of each label.

Example 2 Let us define a particular measure µ on {∅} ∪ Sn as follows. For the basic labels Bi = [ai , ai+1 ), with i = 1, . . . , n − 1, let

   µ(Bi ) = (ai+1 − ai ) / (an − a1 ).

This measure is proportional to the knowledge of imprecision about the magnitude, and it is normalized with respect to the "basic" known range given by the length an − a1 . For the non-basic labels the measure is, for i, j = 1, . . . , n − 1, i < j:

   µ([Bi , Bj )) = Σ_{k=i}^{j−1} µ(Bk ) = (aj − ai ) / (an − a1 ),

and for i = 1, . . . , n − 1:

   µ([Bi , B∞ )) = Σ_{k=i}^{n−1} µ(Bk ) = (an − ai ) / (an − a1 ).

Example 3 Another interpretation of the entropy defined in equation (1) is obtained by defining another measure µ over {∅} ∪ Sn as follows. For each Et ∈ {∅} ∪ Sn ,

   µ(∅) = 0,   µ(Et ) = card(Q−1 (Et ))/card(Λ).

This case recovers the classical interpretation of Shannon and Rokhlin in the sense that if Q maps all the elements of Λ to the same label, then the partition gives no information about Λ, because the entropy is H(Λ/ ∼Q ) = −1 · log 1 = 0. Moreover, the entropy reaches its maximum when different elements of Λ are mapped to different labels Et ∈ Sn , i.e., when Q is an injective map from Λ into Sn . This maximum is H(Λ/ ∼Q ) = log(card Λ).
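Expression (2) makes the entropy directly computable once a measure is fixed. The following sketch contrasts the counting measure of Example 3, which recovers the Shannon/Rokhlin behaviour, with the length-proportional measure of Example 2, for which a constant Q still yields nonzero entropy. The landmark values are made up for illustration.

```python
import math

def entropy(labels, mu):
    """H(Λ/~Q) = -Σ_{E in Q(Λ)} µ(E) log2 µ(E), summed over the distinct
    labels in the image of the qualitativization function Q (equation (2))."""
    return -sum(mu[e] * math.log2(mu[e]) for e in set(labels))

def counting_measure(labels):
    """Measure of Example 3: µ(E) = card(Q^{-1}(E)) / card(Λ)."""
    n = len(labels)
    return {e: labels.count(e) / n for e in labels}

# Constant Q with the counting measure: one class of measure 1, so H = 0
# (the classical Shannon/Rokhlin behaviour).
q1 = ['E'] * 8
assert entropy(q1, counting_measure(q1)) == 0.0

# Injective Q: the maximum entropy log2(card Λ).
q2 = ['E1', 'E2', 'E3', 'E4']
assert entropy(q2, counting_measure(q2)) == 2.0

# Measure of Example 2 with hypothetical landmarks a = (0, 1, 4):
# µ(B1) = (a2 - a1)/(a3 - a1) = 1/4. A constant Q now gives H != 0,
# the intrinsic information carried by the label's own length.
mu_len = {'B1': 0.25}
assert entropy(['B1'] * 5, mu_len) == 0.5   # -0.25 * log2(0.25)
```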
Example 4 This last example is presented to show why the mathematical formalism developed in this paper is necessary in some practical problems. Actually, the best way to describe the evolution of a function is by means of the derivative, and Measure Theory provides the mathematical framework to do so: under certain conditions (these conditions are not explained in this paper but can be found in [5]), a function ν defined on a measure space (X, σ, µ) can be differentiated with respect to the measure µ, i.e. there exists a function f such that f = dν/dµ. The case below falls within the development frame of the AURA research project, which sets out to adapt soft-computing techniques to the study of financial rating tendencies by using qualitative reasoning. The main goal of the project is to use these techniques to extract knowledge and allow prognosis. The rating is an attempt to measure the financial risk of a given company's bond issues. The specialized rating agencies, such as Standard & Poor's, classify firms according to their level of risk, using both quantitative and qualitative information to assign ratings to issues. Learning the tendency of the rating of a firm therefore requires knowledge of the ratios and values that indicate the firm's situation and also a deep understanding of the relationships between them and the main factors that can modify these values. The processes employed by these agencies are highly complex and are not based on purely numeric models. Experts use the information given by the financial data, as well as some qualitative variables, such as the industry and the country or countries where the firm operates; at the same time, they forecast the possibilities of the firm's growth and its competitive position. Finally, they use an abstract global evaluation based on their own expertise to determine the rating. Standard & Poor's ratings are labelled AAA, AA, A, BBB, BB, B, CCC, CC, C and D. From left to right, these rankings go from high to low credit quality, i.e., from high to low capacity of the firm to repay debt.

The problem of classifying firms by using their descriptive variables has already been tackled by several authors [1]. In [16] and [17] the variables that influence variations in ratings, and how this influence is expressed, are analyzed, but not the speed of rating tendencies (i.e., how "fast" or "slow" ratings change) by using orders of magnitude descriptions. The particular evolution of the rating of a given firm, and its prediction from the previous rating and the values of its present financial ratios, is currently being studied. The mathematics involved uses measure theory and the entropy concept. In order to simplify the notation and to give a glimpse of the problem, the one-dimensional case is considered: let B1 = D, B2 = C, B3 = CC, . . . , B10 = AAA, and let S10 be the absolute orders of magnitude space considered for describing the rating, depending only on one variable x ∈ R. As mentioned before and in the references [16] and [17], the rating is a subjective valuation of several experts, so the best way to describe it is by means of a function R : R → S10 . Therefore the rating tendency is a derivative of a function defined in a non-euclidean space. Moreover, the rating evolution sometimes neither increases nor decreases: for example, at some x1 ∈ R one has R(x1 ) = [B2 , B4 ) and R(x1 + ε) = [B1 , B5 ), where ε > 0. What happens between x1 and x1 + ε is that precision on the rating is lost, or in other words, the entropy of R between x1 and x1 + ε has increased: if H(R(x)) = − ∫_{Q(Λ)} log µ dµ, then dH(R(x))/dµ = − log µ. This example is a sketch of an application of the theory of dynamics in S∗g spaces to financial problems. Further development is needed and we are working on it.

5 CONCLUSION AND FUTURE WORK

This paper introduces the concept of entropy by means of absolute orders of magnitude qualitative spaces. This entropy measures the amount of information of a system when orders of magnitude descriptions are used to represent it. In order to define the concept of entropy within the Qualitative Reasoning framework, this paper adapts the basic principles of Measure Theory to give the space of absolute orders of magnitude the necessary structure. With the presented structure, we obtain a family of qualitative spaces forming a continuum from the sign algebra to the classical interval algebra. From a theoretical point of view, future research could focus on two lines. On the one hand, it could focus on the comparison of the given entropy with the macroscopic concept of Carathéodory entropy. On the other hand, the adaptation of Measure Theory provides the theoretical framework in which to develop a rigorous analytical study of functions between orders of magnitude spaces. The continuity and differentiability of these functions will allow the dynamical study of qualitatively described processes. Within the framework of applications, this work and its related methodology will be orientated towards the modelling and resolution of financial and marketing problems. Regarding financial problems, the concept of entropy will facilitate the study of the evolution and variation of financial ratings. On the other hand, entropy as a measurement of coherence and reliability is useful in group decision-making problems arising from retail marketing applications.

Moreover, the introduced entropy will allow defining a conditional entropy in this framework, which in turn will allow considering the Rokhlin distance to be used in decision-making problems of ranking and selection of alternatives.
REFERENCES

[1] J.M. Ammer and N. Clinton, 'Good news is no news? The impact of credit rating changes on the pricing of asset-backed securities', Technical report, Federal Reserve Board, (July 2004).
[2] Thomas M. Cover and Joy A. Thomas, Elements of Information Theory, Wiley Series in Telecommunications, 1991.
[3] P. Dague, 'Numeric reasoning with relative orders of magnitude'. AAAI Conference, Washington, (1993).
[4] P. Dague, 'Symbolic reasoning with relative orders of magnitude'. 13th IJCAI, Chambéry, (1993).
[5] G.B. Folland, Real Analysis: Modern Techniques and Their Applications, Pure and Applied Mathematics: A Wiley-Interscience Series of Texts, Monographs, and Tracts, John Wiley & Sons, Inc., 1999.
[6] K. Forbus, Qualitative Reasoning, CRC Handbook of Computer Science and Engineering, CRC Press, 1996.
[7] Kenneth Forbus, 'Qualitative process theory', Artificial Intelligence, 24, 85–158, (1984).
[8] Paul R. Halmos, Measure Theory, Springer-Verlag, 1974.
[9] 'The Orders of Magnitude Models as Qualitative Algebras', 11th IJCAI, 1989.
[10] J. Kalagnanam, H.A. Simon, and Y. Iwasaki, 'The mathematical bases for qualitative reasoning', IEEE Expert, (1991).
[11] B. Kuipers, 'Making sense of common sense knowledge', Ubiquity, 4(45), (January 2004).
[12] M.L. Mavrovouniotis and G. Stephanopoulos, 'Reasoning with orders of magnitude and approximate relations'. AAAI Conference, Seattle, (1987).
[13] A. Missier, N. Piera, and L. Travé, 'Order of magnitude algebras: a survey', Revue d'Intelligence Artificielle, 3(4), 95–109, (1989).
[14] O. Raiman, 'Order of magnitude reasoning', Artificial Intelligence, (24), 11–38, (1986).
[15] V.A. Rokhlin, 'Lectures on the entropy theory of measure-preserving transformations', Russian Math. Surveys, 22, 1–52, (1967).
[16] Llorenç Roselló, Núria Agell, Mònica Sánchez, and Francesc Prats, 'Qualitative induction trees applied to the study of financial rating', in Artificial Intelligence Research and Development, eds., Monique Polit, Thierry Talbert, Beatriz López, and Joaquim Meléndez, pp. 47–54, IOS Press, (2006).
[17] Llorenç Roselló, Núria Agell, Mònica Sánchez, and Francesc Prats, 'Learning financial rating tendencies with qualitative trees', in 21st International Workshop on Qualitative Reasoning, ed., Chris Price, pp. 142–146, (2007).
[18] Claude E. Shannon, 'A mathematical theory of communication', The Bell System Technical Journal, 27, 379–423, (1948).
[19] P. Struss, 'Mathematical aspects of qualitative reasoning', AI in Engineering, 3(3), 156–169, (1988).
[20] L. Travé-Massuyès, F. Prats, M. Sánchez, and N. Agell, 'Consistent relative and absolute order-of-magnitude models'. 16th International Workshop on Qualitative Reasoning, (2002).
[21] L. Travé-Massuyès et al., Le raisonnement qualitatif pour les sciences de l'ingénieur, Ed. Hermès, 1997.
[22] Louise Travé-Massuyès and Philippe Dague, eds., Modèles et raisonnements qualitatifs, Hermes Science, Paris, 2003.
Model-based Testing using Quantified CSPs: A Map Martin Sachenbacher and Stefan Schwoon1 Abstract. Testing is the process of stimulating a system with inputs in order to reveal hidden parts of the system state. In this paper, we consider finding input patterns to discriminate between different, possibly non-deterministic models of a technical system, a problem that was put forward in the model-based diagnosis literature. We analyze this problem for different types of models and tests with different discriminating strength. We show how the variants can be uniformly formalized and solved using quantified CSPs, a game-theoretic extension of CSPs. The results of the paper are (1) a map of the complexity of different variants of the testing problem, (2) a way to compute discriminating tests using standard algorithms instead of ad-hoc methods, and (3) a starting point to extend testing to a richer class of applications, where tests consist of stimulation strategies instead of simple input patterns.
1 Introduction

As the complexity of technical devices is growing, methods and tools to automatically check such systems for the absence or presence of faults become increasingly important. Diagnosability asks whether a certain fault can ever go undetected in a system due to limited observability. It has been shown how this question can be framed and solved as a satisfiability problem [7, 13]. Testing instead asks whether there exist inputs (test patterns) to stimulate a system, such that a given fault will always lead to observable differences at the outputs. For the domain of digital circuits with deterministic outputs, it has also been shown how this question can be framed and solved as a satisfiability problem [10, 6].

In this paper, we consider constraint-based testing for a broader class of systems, where the models need not be deterministic. There are several sources of non-determinism in model-based testing of technical systems: in order to reduce the size of a model – for example, to fit it into an embedded controller [16, 14] – it is common to aggregate the domain of continuous system variables into discrete, qualitative values such as 'low', 'medium', 'high', etc. A side-effect of this abstraction is that the resulting models can no longer be assumed to be deterministic functions, even if the underlying system behavior was deterministic. Another source is the test situation itself: even in a rigid environment such as an automotive test-bed, there are inevitably some variables or parameters that cannot be completely controlled while testing the device.

Notions of testing for non-deterministic models have been introduced in various areas. In the field of model-based reasoning with logical (constraint-based) system descriptions, Struss [15] introduced the problem of finding so-called definitely discriminating tests (DDTs), which asks whether there exist inputs that can unambiguously reveal or exclude the presence of a certain fault in a system, even if there might be several possible outputs for a given input. [15] provided a characterization of this problem in terms of relational (constraint-based) models, together with an ad-hoc algorithm to compute DDTs. Later work [9] extended the idea to systems modeled as automata, which, using a fixed bound of time steps, are unfolded into constraint networks such that the former algorithm can be applied. Generating DDTs is a problem of considerable practical importance; the framework was applied to real-world scenarios from the domain of railway control and automotive systems [9]. In the field of automata theory, [1, 5] have studied the analogous problem of generating distinguishing sequences, which asks whether there exists an input sequence for a non-deterministic finite state machine such that, based on the generated outputs, one can unambiguously determine the internal state of the machine.

In this paper, we give an overview and establish connections between these different notions of the testing problem, with a generalized form of constraint models serving as the glue. We show how the different variants can be conveniently formulated using quantified CSPs (QCSPs), an extension of CSPs to multi-agent (adversarial) scenarios. This leads to three contributions: first, we provide an overview of the complexity landscape of model-based testing for different combinations of discriminating strength and model types. For example, we observe that the problems of finding possibly discriminating tests and finding definitely discriminating tests for logical models [15] have the same worst-case complexity, which is however less than that of finding distinguishing sequences for automata models. Second, we map the various test generation problems to QCSP formulas, which, instead of devising ad-hoc algorithms as in [15, 9], makes it possible to use off-the-shelf solvers to effectively compute tests. Third, we show that our QCSP (adversarial planning) formulation of testing can be straightforwardly extended to problems that require complex test strategies instead of simple input patterns, and thus go beyond the framework in [15, 9].

2 Quantified CSPs (QCSPs)
In a constraint satisfaction problem (CSP), all variables are (implicitly) existentially quantified; we wish to find an assignment for each of the variables that satisfies all constraints simultaneously. Quantified CSPs (QCSPs) are a generalization of CSPs that allow a subset of the variables to be universally quantified:

Definition 1 (Quantified CSP) A QCSP φ = ⟨Q, X, D, C⟩ has the form Q1 x1 . . . Qm xm . C(x1 , . . . , xn ), where m ≤ n and C is a set of constraints over the variables X = {x1 , . . . , xn } with domains D = {d1 , . . . , dn }, and Q is a sequence of quantifiers where each Qi , 1 ≤ i ≤ m, is either an existential (∃) or a universal (∀) quantifier.
Technische Universit¨at M¨unchen, Institut f¨ur Informatik, Boltzmannstraße 3, 85748 Garching, Germany, email: {sachenba,schwoon}@in.tum.de
Definition 2 (Satisfiability of QCSP) The satisfiability of a QCSP φ = ⟨Q, X, D, C⟩ is recursively defined as follows. If Q is empty then φ is satisfiable iff the CSP ⟨X, D, C⟩ is satisfiable. If φ is of the form ∃x1 Q2 x2 . . . Qn xn . C then φ is satisfiable iff there exists a value a ∈ d1 such that Q2 x2 . . . Qn xn . C ∧ (x1 = a) is satisfiable. If φ is of the form ∀x1 Q2 x2 . . . Qn xn . C then φ is satisfiable iff for every value a ∈ d1 , Q2 x2 . . . Qn xn . C ∧ (x1 = a) is satisfiable.
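The recursion of Definition 2 can be sketched directly for finite domains. The following minimal illustration (not one of the optimized solvers discussed below) represents the quantifier prefix as a string of 'E'/'A' markers and the constraint set C as a single predicate over a complete assignment:

```python
def qcsp_sat(quants, domains, constraint, assignment=()):
    """Decide Q1 x1 ... Qn xn . C by the recursion of Definition 2:
    strip the leading quantifier, branch on its variable's domain,
    and combine the branches with 'exists' (E) or 'for all' (A)."""
    if not quants:                       # no quantifiers left: evaluate C
        return constraint(assignment)
    branches = (qcsp_sat(quants[1:], domains[1:], constraint, assignment + (v,))
                for v in domains[0])
    return any(branches) if quants[0] == 'E' else all(branches)

B = [False, True]
# ∃x ∀y . x ∧ (x ∨ y) is satisfiable (choose x = True):
assert qcsp_sat('EA', [B, B], lambda a: a[0] and (a[0] or a[1]))
# ∀x ∃y . (x ≠ y) is satisfiable, but ∀x ∀y . (x ≠ y) is not:
assert qcsp_sat('AE', [B, B], lambda a: a[0] != a[1])
assert not qcsp_sat('AA', [B, B], lambda a: a[0] != a[1])
```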
Compared to the classical CSP framework, QCSPs have more expressive power to model particular aspects of real-world problems, such as uncertainty or other forms of uncontrollability in the environment. For example, in game playing, they can be used to find a winning strategy for all possible moves of the opponent. There exist a number of solvers for quantified formulas, most of which use variants of search and local propagation, the dominating algorithmic approach for SAT/CSP problems. While such solvers are easy to implement because they build on existing technology, their performance often turns out not to be competitive with the alternative approach of expanding the problem into a classical instance (SAT/CSP) and using a SAT/CSP solver. However, it has recently been shown [4] that more advanced algorithmic preprocessing and inference techniques, which usually do not pay off for classical problems, often work well for quantified problems, and can make QBF/QCSP approaches several orders of magnitude faster than classical approaches. It is therefore expected that QBF/QCSP solvers will see significant performance improvements in the future, similar to those that SAT/CSP solvers have undergone in the past.

3 Discriminating Tests for Logical Models

We briefly review the theory of constraint-based testing of physical systems as introduced in [15]. Testing attempts to discriminate between hypotheses about a system – for example, about different kinds of faults – by stimulating the system in such a way that the hypotheses become observationally distinguishable. Formally, let M = ∪i Mi be a set of different models (hypotheses) for a system, where each Mi is a set of constraints over variables V . Let I = {i1 , . . . , in } ⊆ V be the subset of input (controllable) variables, O = {o1 , . . . , om } ⊆ V the subset of observable variables, and U = {u1 , . . . , uk } = V − (I ∪ O) the remaining (uncontrollable and unobservable) variables. The goal of testing is then to find assignments to I (input patterns) that will cause different assignments to O (output patterns) for the different models Mi :

Definition 3 (Discriminating Tests) An assignment tI to I is a possibly discriminating test (PDT), if for all Mi there exists an assignment tO to O such that tI ∧ Mi ∧ tO is consistent and for all Mj , j ≠ i, tI ∧ Mj ∧ tO is inconsistent. The assignment tI is a definitely discriminating test (DDT), if for all Mi and all assignments tO to O, if tI ∧ Mi ∧ tO is consistent then for all Mj , j ≠ i, it follows that tI ∧ Mj ∧ tO is inconsistent.

For example, consider the (simplified) system in Fig. 1. It consists of five variables x, y, z, u, v, where x, y, z are input variables and v is an output variable, and two components that compare signals (x and y) and add signals (u and z). The signals have been abstracted into qualitative values 'low' (L) and 'high' (H); thus, for instance, values L and H can add up to the value L or H, and so on. Assume we have two hypotheses about the system that we want to distinguish from each other: the first hypothesis is that the system is functioning normally, which is modeled by the constraint set M1 = {fdiff , fadd }.

fdiff (comparator):
  x y | u
  L L | L
  L H | H
  H L | H
  H H | L

fadd (adder, non-deterministic):
  u z | v
  L L | L
  L H | L
  L H | H
  H L | L
  H L | H
  H H | H

fadd-stuck (adder stuck-at-L):
  u z | v
  L L | L
  L H | L
  H L | L
  H H | L

Figure 1. Circuit with a possibly faulty adder.

The second hypothesis is that the adder is stuck-at-L, which is modeled by M2 = {fdiff , fadd-stuck }. Then, for example, the assignment x ← L, y ← H, z ← L is a PDT for M (it leads to the observation v = L or v = H for M1 , and v = L for M2 ), while the assignment x ← L, y ← H, z ← H is a DDT for M (it leads to the observation v = H for M1 , and v = L for M2 ). In the following, we restrict ourselves to the case where there are only two possible hypotheses, for example corresponding to normal and faulty behavior of the system. Note that DDTs are then symmetric: if tI is a DDT to discriminate M1 from M2 , then it is also a DDT to discriminate M2 from M1 .

3.1 Characterizing PDTs and DDTs
∃i1 . . . in ∃o1 . . . om ∀u1 . . . uk . M1 → ¬M2    (1)
In analogy to (1), we can capture the second (stronger) form of testing, finding DDTs, by the following QCSP formula:

∃i1 . . . in ∀o1 . . . om ∀u1 . . . uk . M1 → ¬M2    (2)
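For the small two-hypothesis example of Fig. 1, discriminating tests can also be found by brute force over the possible observations, using a set-based reading of Definition 3 (the PDT check below is one-directional, mirroring formula (1)). This is an illustrative sketch rather than the paper's algorithm; the relations are transcribed from the tables of Fig. 1.

```python
L, H = 'L', 'H'
xor = {(L, L): L, (L, H): H, (H, L): H, (H, H): L}   # fdiff from Fig. 1

# Adder behaviours from Fig. 1, as sets of allowed (u, z, v) tuples.
f_add = {(L, L, L), (L, H, L), (L, H, H), (H, L, L), (H, L, H), (H, H, H)}
f_add_stuck = {(u, z, L) for u in (L, H) for z in (L, H)}   # always outputs L

def outputs(adder, x, y, z):
    """Possible observations v under the model {fdiff, adder} for inputs x,y,z."""
    u = xor[(x, y)]                      # fdiff is deterministic
    return {v for (u2, z2, v) in adder if (u2, z2) == (u, z)}

def is_pdt(x, y, z):
    # Some observation is possible under M1 but impossible under M2.
    return bool(outputs(f_add, x, y, z) - outputs(f_add_stuck, x, y, z))

def is_ddt(x, y, z):
    # No observation is consistent with both hypotheses.
    return not (outputs(f_add, x, y, z) & outputs(f_add_stuck, x, y, z))

assert is_pdt(L, H, L) and not is_ddt(L, H, L)   # matches the paper's PDT example
assert is_ddt(L, H, H)                           # x=L, y=H, z=H is a DDT
```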
Note that for the case where M1 and M2 comprise only deterministic functions, the quantification over the output variables ranges over a single possible assignment, and thus (1) and (2) will have the same solutions (PDTs and DDTs become equivalent). The two problems of finding a PDT and a DDT can be embedded into the polynomial time hierarchy:

Proposition 1 (Complexity of PDTs and DDTs) The problem of finding PDTs and DDTs is ΣP2-complete.

Because the complexity class ΣP2 is believed to lie between NP and PSpace, this means that the problem of finding tests for logical models is more complex than solving CSPs, but less complex than the problem of finding tests for automata models (see Sec. 4). The QCSP formulation allows us to use standard QCSP/QBF solvers in order to actually compute tests (see Sec. 6), as opposed to devising special algorithms as in [15, 9].

4 Discriminating Tests for Automata Models

In this section, we extend the notion of hypotheses (models) to be discriminated from the case of logical (state-less) models to the more general case of dynamic models whose state can change over time, as for instance used in NASA's Livingstone [17] or MIT's Titan model-based system [16]. This means that we are no longer searching for a single assignment to input variables, but rather for a sequence of inputs over different time steps. The following two definitions are adapted from [7]:

Definition 4 (Plant Model) A (partially observable) plant is a tuple P = ⟨x0 , X, I, δ, O, λ⟩, where X, I, O are finite sets, called the state space, input space, and output space, respectively, x0 ∈ X is the start state, δ ⊆ X × I × X is the transition relation, and λ ⊆ X × O is the observation relation.

For technical convenience, we henceforth assume that in all our plants δ and λ are complete, that is, for every x ∈ X and i ∈ I there exists at least one x′ such that (x, i, x′ ) ∈ δ, and at least one o ∈ O such that (x, o) ∈ λ. The intuitive meaning of a plant is as follows: X is the set of states that the plant can assume, and the state is not revealed to the observer. When the plant is in state x, input i will cause the state to change from x to x′ provided that (x, i, x′ ) ∈ δ. Moreover, it can emit the observable output o provided that (x, o) ∈ λ. We write δ(x, i, x′ ) for (x, i, x′ ) ∈ δ, and λ(x, o) for (x, o) ∈ λ. Note that a plant need not be deterministic, that is, the state after a transition may not be uniquely determined by the state before the transition and the input. Likewise, a plant state may be associated with several possible observations.

Definition 5 (Feasible Trace) Let P = ⟨x0 , X, I, δ, O, λ⟩ be a plant, and let σ = i1 , i2 , . . . , ik ∈ I∗ be a sequence of k inputs and ρ = o0 , o1 , . . . , ok ∈ O∗ be a sequence of k + 1 outputs. Then (σ, ρ) is a feasible trace of P iff there exists a sequence x0 , x1 , . . . , xk of states such that δ(xj−1 , ij , xj ) for all 1 ≤ j ≤ k and λ(xj , oj ) for all 0 ≤ j ≤ k.

A plant represents a hypothesis about the actual behavior of the system under test. Given two such hypotheses P1 , P2 we are interested in determining which of the hypotheses is true. To this end, our aim is to stimulate the system under test using a sequence of inputs and observe the output sequence; if we find that the observed output can be generated by one plant but not by the other, we know which hypothesis is correct. In this sense we can extend the notion of discriminating tests (Def. 3) from static systems to dynamic systems (plants):

Definition 6 (Discriminating Test Sequences) Given two plants P1 = ⟨x0 , X, I, δ, O, λ⟩ and P2 = ⟨y0 , Y, I, η, O, µ⟩, a sequence of inputs σ ∈ I∗ is a possibly discriminating test sequence (PDTS), if there exists a sequence of outputs ρ ∈ O∗ such that (σ, ρ) is a feasible trace of P1 but not of P2 . The sequence σ is a definitely discriminating test sequence (DDTS) for P1 and P2 , iff for all sequences of outputs ρ, it holds that if (σ, ρ) is a feasible trace of P1 then it is not a feasible trace of P2 .

Notice that, due to our assumptions about completeness, for every input sequence σ there exist sequences ρ, τ such that (σ, ρ) is a feasible trace of P1 and (σ, τ ) is a feasible trace of P2 . PDTSs and DDTSs are equivalent to the notion of weak and strong tests as defined in [5]: like PDTs and DDTs, a PDTS is a sequence that may reveal a difference between two hypotheses, whereas a DDTS is a sequence that will necessarily do so. In the case of deterministic plants, PDTSs and DDTSs coincide. Again, DDTSs are symmetric: a DDTS to discriminate P1 from P2 is also a DDTS to discriminate P2 from P1 . For example, Fig. 2 shows two plants P1 and P2 with I = {L, H} and O = {0, 1}. The input sequence σ = L, L is a PDTS for P2 , P1 , because, for example, 0, 1, 0 is a possible output sequence of P2 but not of P1 . The sequence σ′ = H, H is a DDTS for P2 , P1 , because the only possible output sequence 0, 0, 0 of P2 cannot be produced by P1 .

Figure 2. Two plants P1 (left) and P2 (right).

4.1 Characterizing PDTSs and DDTSs

We give QCSP formulas that encode the problem of finding PDTSs and DDTSs with a length less than or equal to k. Using QCSPs, feasible traces of length k of a plant can be captured as follows: a sequence of inputs and outputs is feasible iff there exists a sequence of states such that for any two consecutive states x, x′ along the sequence, the respective input i and output o are consistent with the transition relation δ and the observation relation λ:

φ(i1 , . . . , ik , o0 , . . . , ok , X, δ, λ) ≡ ∃x0 , . . . , xk ∀x, x′ , i, o .
([∨_{j=0}^{k−1} (x = xj ) ∧ (x′ = xj+1 ) ∧ (i = ij+1 )] → δ(x, i, x′ ))
∧ ([∨_{j=0}^{k} (x = xj ) ∧ (o = oj )] → λ(x, o))    (3)
From (3), we can construct a QCSP formula that encodes the problem of finding a PDTS with a maximum path length of k: ∃i1 , . . . , ik ∃o0 , . . . , ok . φ(i1 , . . . , ik , o0 , . . . , ok , X, δ, λ) ∧ ¬φ(i1 , . . . , ik , o0 , . . . , ok , Y, η, µ)
39
(4)
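For small plants and short horizons, the existence questions behind (4) and its DDTS variant can be decided by brute-force enumeration of input and output sequences, which is useful as a reference oracle when experimenting with QCSP encodings. The sketch below is such an enumerator, not the paper's QCSP method; the two plants are hypothetical stand-ins, not the plants of Figure 2:

```python
# Brute-force reference check for PDTS/DDTS existence at horizon k.
# A plant is (initial_state, delta, lam) with delta and lam as sets of tuples.
from itertools import product

def feasible(plant, sigma, rho):
    """True iff (sigma, rho) is a feasible trace (Def. 5) of the plant."""
    x0, delta, lam = plant
    frontier = {x0} if (x0, rho[0]) in lam else set()
    for i, o in zip(sigma, rho[1:]):
        frontier = {x2 for x in frontier for (s, inp, x2) in delta
                    if s == x and inp == i and (x2, o) in lam}
    return bool(frontier)

def pdts_exists(p1, p2, inputs, outputs, k):
    # Mirrors (4): exist inputs and outputs feasible in P1 but not in P2.
    return any(feasible(p1, s, r) and not feasible(p2, s, r)
               for s in product(inputs, repeat=k)
               for r in product(outputs, repeat=k + 1))

def ddts_exists(p1, p2, inputs, outputs, k):
    # Stronger variant: exist inputs such that EVERY output sequence
    # feasible in P1 is infeasible in P2.
    return any(all(not feasible(p2, s, r)
                   for r in product(outputs, repeat=k + 1)
                   if feasible(p1, s, r))
               for s in product(inputs, repeat=k))

# Two hypothetical plants; on input 1, P1 moves to a state observed as H,
# while P2 can only ever be observed as L.
P1 = ("a",
      {("a", 1, "b"), ("a", 0, "a"), ("b", 0, "b"), ("b", 1, "b")},
      {("a", "L"), ("b", "H")})
P2 = ("a",
      {("a", 1, "a"), ("a", 0, "a")},
      {("a", "L")})
```

Here input sequence [1] already separates the two hypotheses: P1 can produce the outputs (L, H), which P2 cannot, so both a PDTS and a DDTS of length 1 exist. The enumeration is exponential in k, which is exactly why the compact QCSP encoding is of interest.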
Extending (4), the following QCSP formula captures DDTSs with a maximum path length of k:

∃i1, . . . , ik ∀o0, . . . , ok . φ(i1, . . . , ik, o0, . . . , ok, X, δ, λ) → ¬φ(i1, . . . , ik, o0, . . . , ok, Y, η, µ)   (5)

We compare this to the approach in [9], which is based on unrolling automata into a constraint network using k copies of the transition relation and the observation relation, and then applying the test methods for logical models discussed in Sec. 3. The advantage of the QCSP-based encoding (4, 5) is that for any k it requires only a single copy of the transition relation and the observation relation, which are the biggest components in most automata model specifications. Thus, the size of the formula grows much more moderately with the number of time steps k than the constraint network in [9]. However, it is still open to what extent current QCSP/QBF solvers can exploit this more compact encoding of the test generation problem and turn it into actual performance improvements (see also Sec. 6). As for the complexity, for non-deterministic finite-state machines it has been shown that the problem of uniquely identifying the initial state from the input and output behavior is PSpace-complete [1]. This problem is equivalent to the problem of designing a sequence of inputs that unambiguously distinguishes between two non-deterministic finite-state machines (with known initial states), and therefore equivalent to the problem of finding DDTSs:

Proposition 2 The problem of finding DDTSs is PSpace-complete.

To our knowledge, the complexity of finding PDTSs is still open, but it is likely that this problem is also PSpace-complete.

5 Adaptive Testing

As discussed above, the QCSP (game-theoretic) framework is useful to compactly express, analyze, and solve different variants of known model-based testing problems. In addition, however, it can also serve as a starting point to tackle new classes of problems that are closer to the practice of testing. Recall that in Def. 3, tests are assumed to consist of (complete) assignments to the controllable variables I. Looking closely, there are two assumptions underlying this definition: i) testing is performed as a two-step process where one first sets the inputs and then observes the outputs, and ii) the controllable variables characterize all relevant causal inputs to the system. In the following, we seek to relax these two assumptions.

Relaxing the first assumption means extending testing from the problem of finding input assignments to the problem of finding adaptive tests, where input variables can be set depending on the values of observed output variables. Such an adaptive sequence is in fact a strategy that describes which values the input variables must be given in response to the values of observed variables (represented, for example, as a decision tree). Generating such adaptive strategies goes beyond the theory in [15], which assumed that tests consist of assignments (patterns) for the input variables, but it is possible in the QCSP framework. For logical models, adaptive tests can be captured using the following modified QCSP formula (assuming, without loss of generality, that the number of input variables equals the number of output variables):

∃i1 ∀o1 . . . ∃in ∀on ∀u1 . . . uk . M1 → ¬M2   (6)

While the non-adaptive version of DDTs (Sec. 3.1) is Σ^p_2-complete, the adaptive version (6) is harder to compute (PSpace-complete). For the case of (non-deterministic) automata models, we get a similar picture: it has been shown [1] that finding such adaptive distinguishing sequences is in ExpTime, and therefore even harder than the problem of finding DDTSs. Surprisingly, for deterministic automata models the problem is polynomial, and therefore easier than the DDTS problem [11]. For model-based testing, this leads to two interesting insights: first, since the class of adaptive tests (observation-dependent inputs) generalizes the class of non-adaptive tests (observation-independent inputs), it follows from the two classes being different that, both for logical (constraint-based) models and for automata models, adaptive tests are strictly more powerful, in the sense that an adaptive test might exist even when a non-adaptive test does not. Second, the more general form of adaptive (observation-dependent) testing is not just more powerful; for deterministic (or nearly deterministic) models it is even computationally preferable to non-adaptive testing.

As already noted in the introduction, relaxing the second assumption (controllable variables characterize all relevant causal inputs to the system) is often a practical necessity: during testing, even in a highly controlled environment such as an automotive test-bed, there might be variables or parameters that influence the system's behavior but whose values cannot be completely controlled. For logical models, this scenario of testing under limited controllability can be captured using a modification of (2). Let I be partitioned into input variables Ic = {i1 . . . is} that can be controlled (set during testing), and input variables Inc = {is+1 . . . in} that can be observed but not controlled. Then a definitely discriminating test exists iff the following formula is satisfiable:

∀is+1 . . . in ∃i1 . . . is ∀o1 . . . om ∀u1 . . . uk . M1 → ¬M2   (7)

Again, this problem is strictly harder than the DDT problem (Sec. 3.1). Also note again that while solutions to (1) and (2) are simply assignments to the input variables, solutions to (7) are in general more complex and correspond to a strategy or policy that states how the values of the controllable variables Ic must be set depending on the values of the non-controllable variables Inc. To illustrate this, consider again the example in Fig. 1, but assume that variable x cannot be controlled. According to Def. 3, no DDT exists in this case, as the possible observations for v will always overlap for the two hypotheses M1 and M2. However, there exists a test strategy to distinguish M1 from M2, which consists of setting y depending on the value of x: choose input y ← H, z ← H if x = L, and choose input y ← L, z ← H if x = H. Again, generating tests for such systems with limited controllability goes beyond the theory in [15], but it is possible in the QCSP framework. We are currently working on merging the two sources of non-determinism in testing (non-deterministic behavior of the system and limited controllability of the system) into one common framework for QCSP-based adaptive testing.

6 Prototypic Implementation of QCSP-based Testing

We have conducted preliminary experiments on QCSP-based test generation with the solvers Qecode [3] and sKizzo [2] (since the present version of Qecode does not allow one to extract solutions from satisfiable instances, we transform the instance into QBF and use sKizzo to extract solutions). So far, we have implemented several examples of non-adaptive and adaptive test generation for logical models (Sec. 3), and a small example of non-adaptive test generation for automata models (Sec. 4). However, at the moment these examples are still too small for a meaningful performance comparison of our approach (non-adaptive case only) with the approach in [15, 9].

Figure 3 shows solutions generated from (2) and (7) for the example in Fig. 1. The solutions are represented in the form of BDDs with complemented arcs (see [2]), where ¬x stands for x ← L, x stands for x ← H, etc. The left-hand side of the figure shows the strategy (in this case, a simple set of assignments) that is generated if the variables x, y, z are specified as controllable (input) variables, whereas the right-hand side shows the strategy when only y, z are controllable (in this case, y must be set depending on the value of x). No solution (definitely discriminating test strategy for the fault) is found if only z is assumed to be controllable.

Figure 3. Test strategies generated for the example in Fig. 1.

We are also currently working on larger, more realistic examples to evaluate our QCSP-based testing approach. In particular, in the future we seek to complement passive verification tools [7] for embedded autonomous controllers [16] with a capability to generate test strategies that can actively reveal faults.

7 Discussion and Future Work

We reviewed an existing theory [15] of testing for physical systems, which defines a weaker (PDTs) and a stronger (DDTs) form of test inputs, and showed how it can be framed as QCSP solving. For the first time, we give precise results on the complexity of this problem (between NP and PSpace). Furthermore, we showed how assumptions in this theory about the complete controllability of system inputs can be relaxed, leading to a strictly more powerful class of tests where inputs are intelligently set in reaction to observed values. Such test strategies go beyond the test pattern approach of the existing theory, but they can be captured in the QCSP framework. We also extended the QCSP-based formulation of testing to the case of plants modeled as non-deterministic automata. While there exist approaches that solve non-deterministic testing problems using classic constraint solvers [9] and model checkers [5], we believe that the QCSP-based representation can be advantageous for several reasons:

• First, as noted in Sec. 4, the QCSP encoding is quite compact. While it is not yet clear whether this theoretical advantage can indeed be exploited by current solver technology, there are at least hints [4] that it can lead to performance improvements as more sophisticated techniques are added to these solvers.

• Second, because QCSPs are a natural generalization of CSPs, it is not too difficult to lift extensions of CSPs, such as soft constraints and optimization, to QCSPs. In fact, the next release of the QCSP solver we used for our experiments (Qecode) contains optimization extensions. Thus, using the QCSP-based formulation, it will be relatively easy to extend model-based testing to generate, for instance, cost-optimal test strategies or probabilistic test strategies that most likely discriminate between fault hypotheses.

• Third, existing methods to combat search space complexity by automated abstraction of constraints can be straightforwardly extended from CSPs to QCSPs, and thus be adapted to the context of model-based testing with limited effort. Based on our previous work in this direction [14, 12] and related work in [8], we plan to devise an abstraction-refinement method for constraint-based testing of hybrid systems.

ACKNOWLEDGEMENTS

The authors would like to thank Michael Esser, Paul Maier, and Peter Struss for useful comments.

REFERENCES

[1] Rajeev Alur, Costas Courcoubetis, and Mihalis Yannakakis, 'Distinguishing tests for nondeterministic and probabilistic machines', in Proceedings ACM Symposium on Theory of Computing, pp. 363–371, (1995).
[2] Marco Benedetti, 'sKizzo: a suite to evaluate and certify QBFs', in Proceedings CADE-05, (2005).
[3] Marco Benedetti, Arnaud Lallouet, and Jérémie Vautard, 'QCSP made practical by virtue of restricted quantification', in Proceedings IJCAI-07, pp. 38–43, (2007).
[4] Marco Benedetti and Hratch Mangassarian, 'Experience and perspectives in QBF-based formal verification', Journal on Satisfiability, Boolean Modeling and Computation (JSAT), (2008). To appear.
[5] Sergiy Boroday, Alexandre Petrenko, and Roland Groz, 'Can a model checker generate tests for non-deterministic systems?', Electronic Notes in Theoretical Computer Science, 190(2), 3–19, (2007).
[6] Sebastian Brand, 'Sequential automatic test pattern generation by constraint programming', in Proceedings CP-01 Workshop on Modelling and Problem Formulation, (2001).
[7] Alessandro Cimatti, Charles Pecheur, and Roberto Cavada, 'Formal verification of diagnosability via symbolic model checking', in Proceedings IJCAI-03, pp. 363–369, (2003).
[8] E. Clarke, A. Fehnker, Z. Han, B. H. Krogh, J. Ouaknine, O. Stursberg, and M. Theobald, 'Abstraction and counterexample-guided refinement in model checking of hybrid systems', International Journal of Foundations of Computer Science, 14(4), 583–604, (2003).
[9] Michael Esser and Peter Struss, 'Fault-model-based test generation for embedded software', in Proceedings IJCAI-07, pp. 342–347, (2007).
[10] Tracy Larrabee, 'Test pattern generation using Boolean satisfiability', IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 11(1), 4–15, (1992).
[11] David Lee and Mihalis Yannakakis, 'Testing finite-state machines: state identification and verification', IEEE Transactions on Computers, 43(3), 306–320, (1994).
[12] Paul Maier and Martin Sachenbacher, 'Constraint optimization and abstraction for embedded intelligent systems', in Proceedings CPAIOR-08, (2008). To appear.
[13] J. Rintanen and A. Grastien, 'Diagnosability testing with satisfiability algorithms', in Proceedings IJCAI-07, pp. 532–537, (2007).
[14] Martin Sachenbacher and Peter Struss, 'Task-dependent qualitative domain abstraction', Artificial Intelligence, 162(1–2), 121–143, (2005).
[15] Peter Struss, 'Testing physical systems', in Proceedings AAAI-94, pp. 251–256, (1994).
[16] B. C. Williams, M. Ingham, S. Chung, and P. Elliott, 'Model-based programming of intelligent embedded systems and robotic space explorers', Proceedings of the IEEE, Special Issue on Modeling and Design of Embedded Software, 91(1), 212–237, (2003).
[17] B. C. Williams and P. Nayak, 'A model-based approach to reactive self-configuring systems', in Proceedings AAAI-96, pp. 971–978, (1996).