A tableau decision algorithm for modalized ALC with constant domains Carsten Lutz1 Holger Sturm2 Frank Wolter2 and Michael Zakharyaschev3 ;
;
;
LuFG Theoretical Computer Science, RWTH Aachen, Ahornstrae 55, 52074 Aachen, Germany 2 Institut fur Informatik, Universitat Leipzig, Augustus-Platz 10-11, 04109 Leipzig, Germany 3 Department of Computer Science, King's College Strand, London WC2R 2LS, U.K. (emails:
[email protected],
[email protected],
[email protected],
[email protected]) 1
Abstract The aim of this paper is to construct a tableau decision algorithm for the modal description logic KALC with constant domains. More pre-
cisely, we present a tableau procedure that is capable of deciding, given an ALC -formula ' with extra modal operators (which are applied only to concepts and TBox axioms, but not to roles), whether ' is satis able in a model with constant domains and arbitrary accessibility relations. Tableau-based algorithms have been shown to be `practical' even for logics of rather high complexity. This gives us grounds to believe that, although the satis ability problem for KALC is known to be NEXPTIMEcomplete, by providing a tableau decision algorithm we demonstrate that highly expressive description logics with modal operators have a chance to be implementable. The paper gives a solution to an open problem of Baader and Laux [5].
1 Introduction Description logics are logical formalisms intended for representing knowledge about static application domains in a structured way (see e.g. [11] and references therein). In the basic description logic ALC [30], for example, we can use the concept Person u 9lives in:Bavaria u 8drinks:Beer to describe the persons living in Bavaria and drinking only beer. (Here Person, Bavaria, and Beer are concept names (unary predicates), and lives in and drinks
1
are role names (binary predicates).) The knowledge that all people living in Bavaria drink only beer and, conversely, that all people drinking only beer live in Bavaria can be represented then by the terminological axiom Person u 9lives in:Bavaria = Person u 8drinks:Beer:
(1)
Modal logics are formal systems intended for reasoning with necessity-like and possibility-like operators, in particular, for reasoning about agents' knowledge of their knowledge bases and about the evolution of these knowledge bases in time (see e.g. [12] and references therein). Using the modal operators [Tom believes] and [always], we can relativize axiom (1) to represent the knowledge of an agent `Tom' and also add to it some temporal avor:
[Tom believes] (Person u 9lives in:Bavaria = Person u [always]8drinks:Beer) (2) i.e., Tom believes that those people who live in Bavaria and nobody else drink only beer at any given moment of time. As just demonstrated, modalized description logics provide means for representing distributed knowledge about dynamic application domains. Such logics were constructed e.g. in [10, 29, 5, 6, 34, 36, 37, 38]. Both description logics and modal logics are featured as `a compromise between expressiveness and eectiveness', which led to the implementation of many of them as representation and reasoning systems [15, 20, 27]. Description logics with modal operators are not so well-behaved. The interaction between description logic constructs and modalities can substantially increase the complexity of reasoning tasks and even make them undecidable [6, 36, 26]. The series of papers [34, 36, 37, 38, 35] identi ed the following syntactical and semantical `limits' within which interactions between the modal and the description parts of the combined logics usually preserve decidability: Modal operators can be applied to both concepts and terminological axioms (as in the example above) but not to roles. In the worlds of a Kripke model interpreting the modal operators, each object and concept name of the description logic can be interpreted both globally (i.e., by the same object or, respectively, set of objects in all worlds) and locally (i.e., by arbitrary objects or sets of objects).1 Role names can be interpreted only locally. The domains of the worlds (on which we interpret the description logic component) can be assumed to be constant; the expressive power of the combined language makes it possible to simulate the varying or expanding domain assumptions which presuppose less interaction. If we represent a person, say Tom, by the object name tom then we most likely assume that this name is global, while the object name president of the USA, representing the current president of the United States, should clearly be local. 1
2
By allowing modalized or/and global roles, one can easily construct undecidable description logics with epistemic and temporal operators [35]. The proofs of decidability in [34, 37, 38] are based on a semantical approach. They do not give any `practical' decision procedures having a potential to be implemented in reasoning systems which are reasonably fast on reasonably large sets of problems. The aim of this paper is to construct a tableau decision algorithm for the modal description logic KALC with constant domains. More precisely, we present a tableau procedure which is capable of deciding, given a formula ' of the form (2) (in which modal operators are applied only to concepts and TBox axioms, but not to roles), whether ' is satis able in a model with constant domains. Tableau-based algorithms have been shown to be `practical' even for logics of rather high complexity [14, 19, 21]. This gives us grounds to believe that, although the satis ability problem for KALC is known to be NEXPTIMEcomplete [26], by providing a tableau decision algorithm we demonstrate that highly expressive description logics with modal operators have a chance to be implementable. The logic KALC with expanding domains was rst considered by Baader and Laux [5] who proved its decidability by designing a tableau satis ability-checking algorithm. However, they failed to construct such an algorithm working under the constant domain assumption, having left this as an open problem and explained some principle diculties. Somewhat simpli ed, the idea of constructing tableaux for KALC with expanding domains is as follows. Given a formula ', we rst apply to it the rules of a tableau system for ALC , thus constructing an ALC -model for the non-modal part of ' in the initial world w0 . Then we apply the rules of a tableau system for K to the modalized concepts and formulas in this model and construct a number of new worlds wi `populated' by the same objects as in w0 . After that we use the ALC -rules in the worlds wi and possibly extend their domains. And so forth. This straightforward approach does not work in the case of models with constant domains. Having expanded the domain of wi with a new object a, we have to add a to the domain of w0 which, in turn, may force us to expand this domain, and so the domains of the wi as well, with some new objects, and so on (for more details consult [5] and Section 3 below). The main technical contribution of the paper is that it shows how to design a machinery for constructing tableaux with constant domains. The fundamental dierence from the approach of [5] is that our tableau algorithm constructs not a model itself but its representation in the form of a quasimodel, the worlds in which are `populated' by types of objects rather than real objects. Quasimodels were rst introduced in [34] and used in [31] for designing a tableau procedure for temporal description logics with expanding domains. In fact, the transference of the notion of quasimodels from model-theory in [34] to proof-theory in [31] and the present paper provides a nice example of how an `abstract' decidability proof can serve as an important basis for an implementable decidability proof. We have chosen the basic modal description logic KALC for treating in this paper only in order to make the ideas of the tableaux and the employed techniques as clear as possible. The developed methods can be extended to more 3
sophisticated logics, say, S4 or temporal logics based on ALC . Moreover, the approach developed in this paper can be generalized to the monodic fragments of rst-order modal and temporal logics [16, 39] by combining tableau procedures for their rst-order and modal components in a modular way; for details visit http://www.dcs.kcl.ac.uk/staff/mz. It is worth also noting that KALC is closely related to the Cartesian product K S5 (cf. [13, 33]). Thus we obtain for free a tableau-based decision procedure for K S5 as well. The paper is organized in the following way. In the next section we introduce the syntax and the semantics of KALC with constant domains. Section 3 discusses diculties in designing tableau procedures for modalized description logics under the constant domain assumption and gives an overview of the tableau algorithm developed in this paper. In Section 4 we de ne constraint systems for KALC -formulas, and Section 5 shows how to encode KALC -models in the form of quasimodels. The tableau decision algorithm is presented in Section 6 and its termination and correctness is proved in Section 7. We conclude the paper with a discussion of obtained results and related problems.
2 Preliminaries In this section, we de ne the syntax and the semantics of the modal description logic KALC . De nition 1 (syntax). Let NC , NR, and NO be countably in nite sets of concept names, role names, and object names, respectively. The set of KALC concepts is de ned inductively as follows: 1. All concept names as well as the logical constant > are concepts. 2. If C and D are concepts, R is a role name, and i < !, then the following expressions are concepts:
:C; C u D; C t D; 9R:C; 8R:C; 3iC; 2i C:
KALC -formulas are also de ned inductively:
3. If C and D are concepts and a is an object name, then C = D and a : C are (atomic) formulas. 4. If ' and are formulas and i < !, then the following expressions are formulas:
:'; ' ^ ; ' _ ; 3i'; 2i ':
Throughout the paper, we denote concept names by A, role names by R, and object names by a. Arbitrary concepts are denoted by C , D, and E , and formulas by ', , and #. The semantics of KALC is a natural hybrid of the possible world semantics for K and the standard set-theoretic semantics for ALC . 4
De nition 2 (semantics). An ALC -interpretation is a pair I = (I ; I ), where I is a non-empty set called the domain of I and I maps each concept name A to a subset AI of I , each role name R to a binary relation RI on I , and each object name a to an element aI in I . A KALC -model is a triple of the form M = (W; ; I ), where W is a non-empty set (of worlds ), is a map associating with each i < ! a binary relation i on W , and I associates with each w 2 W an ALC -interpretation I (w) = (w ; I (w) ) such that, for any w; w0 2 W , we have { w = w0 and { aI (w) = aI (w0) for all a 2 NO . We now de ne the value C I (w) of a concept C in a world w in M by taking
>I w = w ; (:C )I w = w n C I w ; (C u D)I w = C I w \ DI w ; (C t D)I w = C I w [ DI w ; (9R:C )I w = fd 2 w j 9d0 2 C I w (dRI w d0 )g; (8R:C )I w = fd 2 w j 8d0 2 w (dRI w d0 ) d0 2 C I w )g; 0 (3iC )I w = fd 2 w j 9w0 2 W (w i w0 & d 2 C I w )g; 0 (2i C )I w = fd 2 w j 8w0 2 W (w i w0 ) d 2 C I w )g: The truth-relation (M; w) j= ' for a KALC -formula ' is de ned as follows: (M; w) j= C = D i C I w = DI w : (M; w) j= a : C i aI w 2 C I w : (M; w) j= :' i (M; w) 6j= ': (M; w) j= ' ^ i (M; w) j= ' and (M; w) j= : (M; w) j= ' _ i (M; w) j= ' or (M; w) j= : (M; w) j= 3i ' i there is w0 2 W such that w i w0 and (M; w0 ) j= ': (M; w) j= 2i ' i (M; w0 ) j= ' for all w0 2 W such that w i w0 : ( )
( )
( )
( )
( )
( )
( )
( )
( )
( )
( )
( )
( )
( )
( )
( )
(
( )
(
( )
( )
)
)
( )
( )
Remark 3. We do not make the unique name assumption (UNA): for two distinct object names a; b 2 NO , we may have aI (w) = bI (w) for some w 2 W . However, with minor changes, all the results in this paper can be proved for models satisfying UNA. Note that the object names are assumed to be global, i.e., for all a 2 NO and w; w0 2 W , we have aI (w) = aI (w0 ) . On the contrary, the values of concept names are de ned locally. Neither of these assumptions is essential; in particular, global concepts can be de ned via local ones.
5
By requiring w = w0 for all w; w0 2 W , we choose models with constant domains. In models with expanding domains we would only have w w0 for any w; w0 2 W such that w i w0 for some i < !. Models with varying domains impose no restrictions on the domains at all. As noted in the introduction, both varying domains and expanding domains can be simulated using constant domains [36]. We are interested in the following inference problems. De nition 4 (inference). A formula ' is satis able if there exist a model M = (W; ; I ) and a world w 2 W such that (M; w) j= '. A concept C is satis able if there exist M = (W; ; I ) and w 2 W such that C I (w) 6= ;. A concept C is subsumed by a concept D if C I (w) DI (w) for all models M = (W; ; I ) and all w 2 W . Remark 5. Note that concept satis ability and concept subsumption can be reduced to the satis ability of formulas. A concept C is satis able i the formula a : C is satis able and a concept C subsumes a concept D i the formula a : (D u:C ) is unsatis able. So in what follows we only consider the satis ability of formulas. Say that a formula ' is equivalent to a formula when (M; w) j= ' i (M; w) j= for every model M and every world w in it. Similarly, a concept C is equivalent to a concept D if C I (w) = DI (w) for all models M = (W; ; I ) and all w 2 W . The formula C = D is clearly equivalent to (:C t D) u (:D t C ) = >. So without loss of generality we can assume that in every atomic formula of the form C = D the concept D is >. Furthermore, we generally assume formulas and concepts to be in negation normal form. De nition 6 (negation normal form). A concept C is said to be in negation normal form (NNF) if negation occurs in C only in front of concept names. A formula ' is in negation normal form if negation occurs in ' only in front of concept names and atomic formulas of the form C = D. Each concept C can be transformed into an equivalent concept in NNF by pushing negation inwards with the help of de Morgan's law, the duality between 9 and 8, and between 3 and 2. The NNF of :C will be denoted by C . Similarly, each formula can be transformed into an equivalent one in NNF by employing de Morgan's law, the duality between 3 and 2, and the fact that :(a : C ) is equivalent to a : :C .
3 Tableau algorithms and constant domains In this section, we introduce tableau algorithms in general, discuss diculties in designing such algorithms for modalized description logics under the constant domain assumption, and give an overview of the tableau procedure developed in this paper. 6
To decide whether a given formula # is satis able, a tableau algorithm tries to construct a model for # by repeatedly applying completion rules to an appropriate data structure. In the case of ALC , this data structure is usually a constraint system [4, 5, 17], consisting of constraints which are formulas, expressions of the form x : C , or expressions of the form (x; x0 ) : R, where x and x0 are variables or object names from NO . For now, it is convenient to think of variables and object names as representing domain objects. In the case of KALC , a more complex data structure is needed. In this paper, it is a completion tree whose edges represent accessibility relations and whose nodes are labeled with constraint systems representing ALC interpretations. The tableau algorithm starts with a completion tree containing only a single node labeled with a constraint system containing only # (and possibly some additional constraints required for some technical reasons). The completion rules are applied until (i) a contradictory completion tree is obtained or (ii) a contradiction-free completion tree is found which is complete in the sense that no more rule is applicable to it. Here, a completion tree contains a contradiction if, for example, one of its nodes is labeled with a constraint system containing both x : A and x : :A. To illustrate how completion rules look like, we sketch some standard rules which can be found in most tableau algorithms for description logics (see e.g., [4, 5, 17, 23]). 1. If a constraint system S contains the formula C = > and a variable or an object name x, then we add x : C to S . 2. If a constraint system S contains x : C t D, then we add to S either x : C or x : D. 3. If a constraint system S contains x : 9R:C , then we add to S two constraints v : C and (x; v) : R, where v is a fresh variable that was not used in S before. 4. If the label S of a node g in a completion tree T contains the constraint x : 3iC , then we add to T a new node g0 as a successor of g and label it with the constraint system containing x : C and x : D for every x : 2i D in S . The second rule above is nondeterministic: its application yields more than one possible outcome. In the presence of nondeterministic rules, a tableau algorithm terminates successfully if the completion rules can be applied in such a way that the result is a complete and contradiction-free completion tree. To illustrate some diculties in designing a tableau algorithm for KALC with constant domains, we consider here an example from [5]. Suppose that we have a completion tree T with one node g labeled with the constraint system
L(g) = fv : >; (3i 9R:C ) = >g: An application of the rst rule above yields an additional constraint v : 3i 9R:C . By applying the fourth rule, we construct a new node g0 in T with the label 7
L(g0 ) = fv : 9R:C g, which is then extended to L(g0 ) = fv : 9R:C; (v; v0 ) : R; v0 : C g by an application of the third rule. Since we assume constant domains and since the variables represent domain objects, the presence of v0 in L(g0 ) forces us to add v0 to L(g). This can be done by extending L(g) with the constraint v0 : >, which triggers the rst rule again: now it adds v0 : 3i9R:C to L(g). This constraint and the fourth rule give a new node g00 with label L(g00 ) = fv0 : 9R:C g which is then extended by the third rule with (v0 ; v00 ) : R and v00 : C . Thus we obtain a new variable v00 which has to be added to L(g) and L(g0 ). As these steps are to be repeated in nitely many times, the algorithm does not terminate. A standard way to overcome termination problems of this kind is to use blocking (see e.g., [5, 22, 23]). For a constraint system S and a variable or an object name v, let tS (v) denote the set of all concepts C such that S contains v : C . A variable v is said to be blocked in S if there is another variable or an object name x such that tS (v) tS (x) and x was introduced earlier than v. For example, in the completion tree constructed above, we could regard the variable v0 as blocked in L(g) because after the introduction of v0 to L(g) we have tL(g) (v0 ) tL(g) (v). No completion rule is applied to blocked variables, which yields termination. By using blocking, the tableau algorithm constructs a representation of a model rather than the model itself. In the case of KALC , it turned out to be dicult to nd a proper blocking strategy. Baader and Laux [5] experimented with several such strategies, but failed to identify a suitable one that ensures termination, soundness and completeness of the resulting algorithm. The interested reader is referred to [5] for a thorough discussion and examples. Instead of using blocking in the classical sense, the algorithm presented in this paper pursues a more general approach. In our algorithm, variables represent partial types of domain objects rather than domain objects themselves. Given an ALC -interpretation (I ; I ), by the type of a domain object d 2 I we mean the set of all concepts C such that d 2 C I . Since this set is in nite, we restrict attention only to those concepts that are `relevant' to constructing a model for the input formula (for more details see the next section). Blocking then becomes much easier. The completion rules have to ensure that no (partial) type t is introduced into a constraint system already containing a supertype of t. More precisely, if an application of a rule leads to an introduction of a new variable v to a completion system L(g) such that in the resulting completion system S 0 we have tS0 (v) tL(g) (x), for some variable or object name x, then the variable v is not introduced to L(g). Dealing only with types, the algorithm constructs not a model satisfying the input formula, but its representation known as a quasimodel [34]. We illustrate the use of quasimodels by the following example. Suppose that T is a completion tree consisting of a single node g labeled with
L(g) = f2i D = >; v : 3i9R:C; v : 2i :C; v : 2i Dg; 8
An application of the fourth rule above generates a successor node g0 of g with the label fv : 9R:C; v : :C; v : Dg, which is then extended by the third rule to
L(g0 ) = fv : 9R:C; v : :C; v : D; v0 : C g: Note that in quasimodels we do not need to keep trace of how roles connect objects|in our case (v; v0 ) : R|this information can be recovered in a canonical way later. The constructed completion tree represents models with a set of worlds W = fw; w0 g such that i = fhw; w0 ig, in the interpretation I (w), there are domain objects `of type v', and in the interpretation I (w0 ), there are domain objects of type v and of type v0 . Let us see now how the algorithm copes with constant domains. Fix a model described by the completion tree and let d be an object in I (w0 ) of type v0 . As we make the constant domain assumption, d is also an element of the domain of I (w). However, in I (w) this element cannot be of type v because otherwise d would satisfy :C in I (w0 ), which is impossible, since it also satis es C . A straightforward approach to attack this problem would be to introduce a new type to L(g) (thus overruling blocking). But then again we would face the problem of termination. We take a dierent way. Our solution is to generate a set of minimal partial types in each constraint system L(g) so that every domain object in the corresponding ALC interpretation I (w) be of exactly one of the types in the set. To this end we distinguish between two kinds of variables. A variable may be marked in a constraint system, which indicates that it represents a minimal (partial) type, or it may be unmarked, which means that the variable represents an `ordinary' type. We illustrate the dierence between marked and unmarked variables as well as the role of minimal partial types by reconsidering the example above. According to the minimal type strategy, we have to introduce to L(g) a marked variable vm together with the constraint vm : > before generating the node g0 . For nearly all completion rules (a notable exception is the second rule above), marked variables are treated like unmarked ones. An application of the rst rule adds vm : 2i D to L(g). This constraint means that every domain object in the ALC -interpretation I (w) is in (2i D)I (w) . After that we construct the node g0 and the variable v0 as above. In models described by the resulting completion tree, domain objects may be of types v and v0 in I (w0 ) and of types v and vm in I (w). Again, we face the problem of nding a `predecessor type' for v0 , i.e., a type for objects in I (w) which are of type v0 in I (w0 ). According to the minimal type strategy, we must choose this predecessor among the marked variables in L(g), in our case this can only be vm . However, since the constraint vm : 2i D is in L(g) and vm was chosen as the predecessor type for v0 , we must add v0 : D to L(g0 ). Figure 1 shows the resulting completion tree. Note that 9
g
g0
2i D = > v : 3i 9R:C v : 2i :C v : 2i D vm : > vm : 2i D
minimal type
i
v : 9R:C v : :C v:D v0 : C v0 : D
predecessor type for
Figure 1: The fully expanded completion tree (see text). with the minimal type strategy, there is no need to reconsider constraint systems that have already been treated, which helps to avoid termination problems. To conclude this section, we give a brief overview of how the set of minimal types is generated. Consider a completion tree consisting of a node g labeled with L(g) = fA = >; B t C = >; v : C g Again we start by introducing a single marked variable vm together with the constraint vm : >. Applications of the rst rules above add both vm : A and vm : B t C . According to the second rule, we must now decide where to put vm : to B or to C . However, it may be the case that neither of these two choices is the correct one: that all domain objects in interpretations corresponding to L(g) satisfy B t C does not imply that all of them satisfy B or that all of them satisfy C . So for marked variables disjunction must be treated in a special way. Namely, rst we introduce a new marked variable vm0 which is a `copy' of vm , i.e., we have vm0 : A and vm0 : B t C in L(g). And then we add constraints vm : B and vm0 : C saying that each object is either of type vm , and so belongs to B , or of type vm0 , and so belongs to C . To be more precise, we need a nondeterministic rule. In one case, we explore both disjuncts as has been just described; in the two additional cases, we explore only one of the disjuncts (which is necessary to deal with disjuncts that lead to a contradiction). Similar modi cations are required for all nondeterministic rules dealing with marked variables.
4 Constraint systems Let us now de ne formally what we mean by a constraint system for a given KALC -formula #. Denote by ob(#) the set of all object names occurring in #; 10
con(#) the set of all concepts occurring in #; for(#) the set of all subformulas of #. The fragment (of KALC ) induced by # is de ned as the set Fg(#) = ob(#) [ for(#) [ con(#) [ fC j C 2 con(#)g [ f>g: Fix a countably in nite set V of (individual) variables so that V \ NO = ;. The
variables in V and the object names in ob(#) will be called terms for #. We will assume that NO [ V is well-ordered by an order ) 2 S , a term x occurs in S , but (x : C ) 2= S , then set S := S [ fx : C g. Local generating rules R6 If :(C = >) 2 S and there is no term x in S such that (x : C ) 2 S , then choose a fresh variable v for S and set S := S [ fv : C g with unmarked v. R9 If (x : 9R:C ) 2 S and there is no term y in S such that fC g [ fD j (x : 8R:D) 2 S g fD j (y : D) 2 S g, then choose a fresh variable v for S and set S := S [ fv : C g [ fv : D j (x : 8R:D) 2 S g with unmarked v. =
=
Figure 2: Local completion rules for KALC .
I described by S belongs to (C t D)I . The addition of v : C (or v : D) to S , i.e., option (i) in Rt0 , would mean then that each object of I belongs to C I (or, respectively, to DI ). This excludes from consideration models I in which
some objects are in C I and some in DI . That is why we need option (ii) which creates a marked copy v0 of v and then adds to S both v : C and v0 : D to say that every domain object in models described by S 0 = fv : C t D; v : C; v0 : Dg is either of type v or of type v0 . The majority of tableau algorithms for description logics contain two separate rules for dealing with the 9 and 8 constructors (see, e.g., [4, 5, 17, 23]). Our (local) set of rules contains only a single rule R9 which deals with both constructors. The reason for this is that we don't keep track of role relationships among objects. The following de nition introduces complete constraint systems and speci es what it means for a constraint system to have a clash. 12
De nition 10 (clash). Say that a constraint system S contains a clash if either of the two conditions holds: 1. fx : A; x : :Ag S for some term x and some concept name A, 2. x : :> is in S for some term x. Otherwise we say that S is clash-free. A constraint system S is complete if no completion rule is applicable to S .
5 Quasimodels
In this section we show how KALC -models can be represented in the form of quasimodels. Unlike [34], here we characterize quasimodels syntactically (cf. [31]). De nition 11 (quasiworld). Let # be a KALC -formula. A quasiworld for # is a complete clash-free constraint system for # all variables in which are unmarked. A frame F = hW; i whose worlds are (labeled with) quasiworlds for # will be called a #-frame. More precisely, a #-frame is a triple F = hW; ; i, where W 6= ;, gives a binary relation i on W for every i < !, and is a map from W into the set of quasiworlds for #. De nition 12 (run). Let F = hW; ; i be a #-frame. A run r in F is a function associating with every w 2 W a term r(w) occurring in the quasiworld (w) in such a way that if (r(w) : 3iC ) 2 (w) then there exists a w0 2 W such that w i w0 and (r(w0 ) : C ) 2 (w0 ); if (r(w) : 2i C ) 2 (w) and w i w0 , then (r(w0 ) : C ) 2 (w0 ). De nition 13 (quasimodel). A #-frame F = hW; ; i is called a quasimodel for # if the following conditions hold: 1. for every object name a 2 ob(#), the function ra de ned by ra (w) = a, for w 2 W , is a run in F; 2. for every w 2 W and every variable v in (w), there exists a run r in F such that r(w) = v; 3. for every w 2 W and every 3i ' 2 (w), there exists a w0 2 W such that w i w0 and ' 2 (w0 ); 4. for every w 2 W and every 2i ' 2 (w), whenever w i w0 then ' 2 (w0 ). We say that # is quasi-satis able if there is a quasimodel F = hW; ; i for # such that # 2 (w) for some w 2 W . 13
Theorem 14. A KALC -formula # in NNF is satis able i it is quasi-satis able. Proof ()) Suppose # is satis able. Then there is a model M = hW; ; I i such that (M; w# ) j= # for some w# 2 W . Let be the domain of M. For all w 2 W and d 2 we put I w (d) = fC 2 Fg(#) j d 2 C I w g: ( )
Let
( )
Tw = f I (w) (d) j d 2 g:
For each t = I (w) (d), take an individual variable vt and de ne a constraint system (w) as the union of the following sets: f' 2 for(#) j (M; w) j= 'g, fa : C j a 2 ob(#); C 2 Fg(#); aI (w) 2 C I (w) g, fvt : C j C 2 tg, for t 2 Tw . All variables are unmarked in (w). We show that F = hW; ; i is a quasimodel for #. It should be clear that the (w) are quasiworlds for # and that F satis es conditions (1), (3), and (4) in De nition 13. Let us check (2). Suppose that w 2 W and vt is a variable from (w). Take a d 2 with I (w) (d) = t and de ne a function r with domain W by putting r(u) = v I u (d), for each u 2 W . It is easy to see that r is a run in F coming through vt . That # is quasi-satis able follows from # 2 (w# ). (() Suppose that # is quasi-satis ed in a world w# 2 W of a quasimodel hW; ; i, i.e., # 2 (w# ).
De ne a KALC -model M = hW; ; I i with I (w) = ; I (w) as follows: is the set of all runs in hW; ; i; aI (w) = ra , for all a 2 ob(#); AI (w) = fr 2 j (r(w) : A) 2 (w)g, for all concept names A in Fg(#); for every pair r1 ; r2 2 and every role name R, we have r1 RI (w)r2 i ( )
fC 2 con(#) j (r (w) : 8R:C ) 2 (w)g fC 2 con(#) j (r (w) : C ) 2 (w)g: 1
2
We claim that # is satis ed in M. Let us observe rst that (I) for all w 2 W , C 2 Fg(#), and r 2 , if (r(w) : C ) 2 (w) then r 2 C I (w) .
14
The proof is by induction on the construction of C . The only non-trivial steps are C = 9R:D, C = 3iD, and C = 2i D. Suppose C = 9R:D and (r(w) : 9R:D) 2 (w). As (w) is closed under the rule R9, we can nd a term x such that (x : D) 2 (w) and
fE j (r(w) : 8R:E ) 2 (w)g fE j (x : E ) 2 (w)g: Moreover, there must be a run r0 such that r0 (w) = x. But then rRI (w) r0 and, by the induction hypothesis, r0 2 DI (w). Therefore, r 2 (9R:D)I (w) . Now suppose that C = 3i D and (r(w) : 3i D) 2 (w). By the rst clause in De nition 12, there exists w0 2 W such that w i w0 and (r(w0 ) : D) 2 (w0 ). I (w 0 ) So, by the induction hypothesis, r 2 D , from which r 2 (3iD)I (w) . Finally, let C = 2i D. By the second clause of De nition 12, we then have (r(w0 ) : D) 2 (w0 ), and so r 2 DI (w0 ) , for all w0 2 W such that w i w0 . It follows that r 2 (2i D)I (w) . Our second observation is that (II) for every w 2 W and every ' 2 for(#), if ' 2 (w) then (M; w) j= '. This is also proved by induction. Let ' be atomic and ' 2 (w). Consider two cases. First, suppose ' = (a : C ). By the rst clause of De nition 13, we have (ra (w) : C ) 2 (w). Hence, by (I), ra 2 C I (w) . Recall that aI (w) was de ned as ra . So (M; w) j= a : C . Second, assume ' = (C = >). Let r 2 . As (w) is closed under R= , we then have (r(w) : C ) 2 (w). It follows from (I) that r 2 C I (w) . Next, let ' = : with atomic . Since # is in NNF, has the form (C = >). As (w) is closed under R6= , we have (x : C ) 2 (w) for some x. By the second clause in De nition 13, there exists a run r such that r(w) = x. Moreover, it follows from (I) that r 2 (C )I (w) . So there is a d 2 such that d 2 (C )I (w) , from which (M; w) j= :(C = >). The induction step is rather straightforward (it is based on (3) and (4) in De nition 13); we leave the details to the reader. It follows from (II) that (M; w# ) j= #. 2
6 The algorithm We are in a position now to de ne completion trees, the global completion rules, and the tableau algorithm itself. Fix a countably in nite set N of nodes. De nition 15 (completion tree). A completion tree for a KALC -formula # is a tree T whose nodes g 2 N are labeled with constraint systems L(g) for # and whose edges (g; g0 ) are labeled with natural numbers i such that 3i or 2i occurs in #. If (g; g0) is labeled with i then we say that g0 is an i-successor of g in T . The global completion rules operate on completion trees. To introduce the rules we require the following de nitions. Given a constraint system S and 15
i < !, de ne an equivalence relation iS on the set of variables (not terms) occurring in S by taking v iS v0 i fC j (v : 2i C ) 2 S g = fC j (v0 : 2i C ) 2 S g: Denote by [v]iS the equivalence class (with respect to iS ) generated by a variable v and put [ Si = fmin([v]iS )g: v occurs in S
The global completion rules intended for constructing a completion tree for a formula # are shown in Fig. 3. Note that there we have two versions of the R# rule: one for unmarked variables and one for marked ones. This can be explained analogously to the double Rt rule discussed in Section 4. Indeed, the two versions of the rule are needed since R# is nondeterministic. As for Rt , it is not sucient to explore each nondeterministic choice separately, but, additionally, we must explore all possible combinations of nondeterministic choices simultaneously. The interested reader may check that, e.g., the satis able formula (> = 2i 2i C t 2i 2i D) ^ (a : 3i 3i(9R:C u 9R::C )) is judged unsatis able i the R# rule is used for marked variables instead of the R#0 rule. We say that a completion tree T contains a clash if there exists a node g in T such that L(g) contains a clash; otherwise T is called clash-free. T is said to be complete if no completion rule is applicable to T . To decide whether a given formula # in negation normal form is satis able, we form the initial completion tree T # consisting of a single node g0 labeled with the initial constraint system
S# = f#g [ fa : > j a 2 ob(#)g [ fv : >g; where v is a marked individual variable. After that we repeatedly apply both local and global completion rules in such a way that the R3f and R3c rules are applied only if no other rule is applicable. The tableau algorithm is shown in Fig. 4 in a pseudocode notation. Note that, if the rules R9 , R6=, Rt0 , and R#0 are applied with higher precedence than R3f and R3c but with lower precedence than all the remaining rules, the introduction of duplicate types is prevented. This is, however, not crucial for termination and correctness of the algorithm.
7 Termination and correctness In this section we show that the algorithm described above terminates and that it is sound and complete. By the length j'j of a formula ' we mean the number of symbols used to construct '. The modal depth md(') of ' is the length of the longest chain of 16
Global generating rules R3f If 3i ' 2 L(g) and ' 2= L(g0 ), for all i-successors g0 of g, then construct a new i-successor g0 of g and set L(g0 ) to the union of the following sets:
f'g f j 2i 2 L(g)g fa : > j a 2 ob(#)g fv :[>g fa : C j (a : 2i C ) 2 L(g)g fu : C j (u : 2i C ) 2 L(g)g u2(L(g))i
where v is the only marked variable in L(g0 ) and v 2= (L(g))i . R3c If (x : 3i C ) 2 L(g) and for all i-successors g0 of g and terms y, fC g [ fE j (x : 2i E ) 2 L(g)g 6 fE j (y : E ) 2 L(g0 )g then construct a new i-successor g0 of g and set L(g0 ) to the union of the following sets: fv0 : C g f j 2i 2 L(g)g fv0 : D j (x : 2i D) 2 L(g)g fa : > j a 2 ob(#)g fv :[>g fa : C j (a : 2i C ) 2 L(g)g fu : C j (u : 2i C ) 2 L(g)g u2(L(g))i
where v is the only marked variable in L(g0 ), v 6= v0 , and v; v0 2= (L(g))i .
Global non-generating rules R#
R#0
If g0 is an i-successor of g, v an unmarked variable in L(g0 ), and for no term x in L(g) do we have fC j (x : 2i C ) 2 L(g)g f C j (v : C ) 2 L(g0 )g; then nondeterministically choose a marked variable v0 in L(g) and set L(g0 ) := L(g0 ) [ fv : C j (v0 : 2i C ) 2 L(g)g: If g0 is an i-successor of g, v a marked variable in L(g0 ), for no term x in L(g) do we have fC j (x : 2i C ) 2 L(g)g f C j (v : C ) 2 L(g0 )g; and X is the set of marked variables occurring in L(g) then nondeterministically choose a non-empty subset Y = fv1 ; : : : ; vk g of X and set S? 1 := L(g0 ) [ fv : D j (v1 : 2i D) 2 L(g)g, Sj := Sj?1 + fD j (vj : 2i D) 2 L(g)g [ fE j (v : E ) 2 L(g0 )g , for all 1 < j k, and L(g0 ) := Sk . Figure 3: Global completion rules for KALC .
17
de ne procedure sat(T ) if T contains a clash then return unsatis able if a completion rule r dierent from R3f and R3c is applicable to T then apply r to T return sat(T ) if a completion rule r which is either R3f or R3c is applicable to T then apply r to T return sat(T ) return satis able Figure 4: The satis ability-checking algorithm for KALC . nested modal operators in ' (both in subformulas and subconcepts); the modal depth md(x : C ) of a constraint x : C is de ned analogously. The modal depth md(S ) of a constraint system S is the maximal modal depth of constraints in S . The depth of a tree is the number of edges in its longest branch; the outdegree is the maximal number of immediate successors of nodes in the tree. Lemma 16. Let T be a completion tree for a formula # constructed by the algorithm and let g be a node in T . Then the number of constraints of the form x : C in L(g) is bounded by 2cj#j , where c is a constant. 2
Proof Let us rst determine an upper bound for the number of distinct terms
per node label. By the de nition of completion trees and constraint systems, all object names occurring in node labels are from ob(#). So the number of distinct object names in a label does not exceed j#j. At the moment of its generation, the node g (its label, to be more precise) contains not more than 22j#j distinct unmarked variables and a single marked one (see the de nitions of the R3f and R3c rules and that of Fg(#)). Consider now the rules that can introduce new variables in L(g). First for the unmarked variables: R6= can add at most j#j new variables and R9 at most 22j#j (i.e., not more than the number of distinct subsets of concepts in Fg(#)). We now consider marked variables, which are introduced by the Rt0 and R#0 rules. De ne a tree T whose nodes are the marked variables in L(g) and whose edges are labeled with either Rt0 or R#0 as follows: The root node is the initial marked variable in L(g). If a completion rule r 2 fRt0 ; R#0 g is applied to a marked variable v generating new marked variables v1 ; : : : ; vk , then vi is successor of v in T and the edge between v and vi is labelled with r for 1 i k. Using the de nition of Rt0 , R#0 , and of Fg(#), it is not hard to see that the depth of T is bounded by 2 j#j. Moreover, each node has at most 2 j#j + 22j#j successors: at most 2 j#j outgoing edges labelled with Rt0 and at most 22j#j outgoing edges labelled with R#0 . Hence, the number of nodes in the tree is 18
bounded by (2 j#j +22j#j )2j#j+1 ? 1 28j#j which is hence the maximum number of marked variables in L(g). To sum up, we can have at most 2 j#j + 2 22j#j + 28j#j distinct terms in L(g), and so at most 2
2
2 j#j (2 j#j + 2 22j#j + 28j#j ) 2
distinct constraints of the form x : C . 2 Lemma 17. Let T be a completion tree for # constructed by the algorithm. Then the depth of T is bounded by j#j and the outdegree of T does not exceed 2dj#j, where d is a constant. Proof If g0 is a successor of g in T , then clearly md(L(g0 )) md(L(g)): So the depth of T is at most md(#) j#j. Now we compute the outdegree. Let g be a node in T . Each successor of g in T is generated by an application of the R3f rule to some formula 3i' or by an application of the R3c rule to some constraint x : 3iC in L(g). The number of applications of the R3f rule is obviously bounded by the number of distinct formulas in L(g), i.e., by j#j. Moreover, by de nition of the R3c rule, the number of applications of this rule is bounded by 22j#j (i.e., the number of distinct subsets of concepts in Fg(#)). Thus, the outdegree of T does not exceed j#j + 22j#j . 2 We are in a position now to prove termination. Theorem 18. Having started on the initial completion tree Td #, the (nondeterministic) completion algorithm terminates after at most 2j#j steps, where d is a constant. Proof In view of Lemma 17, there is a constant f such that the number off nodes in each completion tree constructed by the algorithm is at most 2j#j . As every global generating rule adds a new node, the number of applications of such rules is bounded by the same number. Now let us compute the number of applications of local rules on formulas. Since jfor(#)j j#j, formulas are never deleted from node labels, and since each local rule on formulas introduces a new formula to a node label, there may be at most j#j applications of rules of this type per node. So the total number of applications of local rules on formulas is bounded by 2j#jf j#j. Finally, each of the local non-generating rules on concepts, local generating rules and global non-generating rules adds a new constraint of the form x : C to a constraint system L(g). By Lemma 16, the number of such constraints per node is bounded by 2cj#j for some constant c. Thus, the number of applications of these rules per node is atf most 2cj#j . The total number of such rule applications is then bounded by 2j#j 2cj#j . 2 2
2
2
19
Theorem 18 states that the nondeterministic tableau algorithm terminates after exponentially many steps (in the length of the input formula). Together with the soundness and completeness yet to be established, this provides us with a NEXPTIME satis ability-checking procedure for KALC -formulas. Since the satis ability problem for KALC -formulas is NEXPTIME-complete [26], our algorithm is optimal with respect to the worst case complexity. Let us now turn to soundness. De nition 19 (local correctness). Say that a #-frame F = hW; ; i is locally correct if it satis es the following conditions: (i) if (3i ') 2 (w) for some w 2 W , then there exists a w0 2 W such that w i w0 and ' 2 (w0 ); (ii) if (a : 3i C ) 2 (w) for some w 2 W and a 2 ob(#), then there exists a w0 2 W such that w i w0 and (a : C ) 2 (w0 ); (iii) if (v : 3iC ) 2 (w) for some w 2 W and = fE j (v : 2i E ) 2 (w)g, then there exist a world w0 2 W and a term x such that w i w0 and [ fC g fE j (x : E ) 2 (w0 )g; (iv) if w i w0 then: (a) (2i ') 2 (w) implies ' 2 (w0 ), (b) fE j (a : 2i E ) 2 (w)g fE j (a : E ) 2 (w0 )g for all a 2 ob(#), (c) for each variable v in (w), there exists a term x in (w0 ) such that fE j (v : 2i E ) 2 (w)g fE j (x : E ) 2 (w0 )g, (d) for each variable v in (w0 ), there exists a term x in (w) such that fE j (x : 2i E ) 2 (w)g fE j (v : E ) 2 (w0 )g. It is not hard to see that each quasimodel for # is also a locally correct #frame. However, the converse does not hold. Consider, for example, the #-frame F = hW; ; i with W = fw; w0 g, i = fhw; w0 ig, (w) = fv : 3iC; v : 3iDg and (w0 ) = fv : C; v0 : Dg (it does not matter how # actually looks like). F is obviously locally correct, but it cannot be a quasimodel for #: there exists no run r with r(w) = v, because whatever choice r(w0 ) = v or r(w0 ) = v0 we make, the rst property in the de nition of runs does not hold. However, it is not dicult to modify F and convert it in a quasimodel. Indeed, construct a new #-frame F0 by duplicating the world w0 in F, i.e., by adding a new world w00 to F in such a way that w i w00 and (w00 ) = (w0 ). It should be clear that F0 is both a locally correct #-frame and a quasimodel for #. The next lemma generalizes this observation. 20
Lemma 20. A KALC -formula # is quasi-satis able i there exists a locally correct #-frame F = hW; ; i with a world w# 2 W such that # 2 (w# ). Proof The implication ()) is trivial. Let us prove ((). As in the example
above, we construct a quasimodel F0 satisfying # by duplicating worlds in the given locally correct #-frame F. Let W be a set containing 2 jcon(#)j + 1 `copies' w(1) ; : : : ; w(2jcon(#)j+1) of each world w 2 W , i.e.,
W = fw(i) j w 2 W and 1 i 2 jcon(#)j + 1g: By bw(i) c we denote the `parent' w 2 W of w(i) . De ne a thread p in F as a pair of sequences
x1 ; : : : ; xk and i1 ; : : : ; ik?1 such that x1 = w#(1) and bxi c ij bxi+1 c in F, for 1 j < k. Denote by tail(p) the last world in p (i.e., xk in our case) and, whenever btail(p)c ik bxk+1 c, let pik xk+1 be the thread
x1 ; : : : ; xk ; xk+1 and i1 ; : : : ; ik?1 ; ik : Now we de ne a #-frame F0 = hW 0 ; 0 ; 0 i by taking W 0 to be the set of all threads in F, p 0i p0 i p0 = pi x for some x 2 W , 0 (p) = (btail(p)c). It is readily seen that 0i \ 0j = ; whenever i 6= j and that the structure
0 S W ; i
fE j (v : 2i E ) 2 0 (w)g [ fC g fE j (xj : E ) 2 0 (wj )g:
We now show that F0 is a quasimodel for #, i.e., that it satis es conditions 1{4 from De nition 13. Conditions 1, 3, and 4 follow immediately from (ii), (i), and (iv) in De nition 19. Let us prove condition 2 claiming that, for every variable v in every 0 (w0 ), w0 2 W 0 , there is a run r coming through v. We construct r by induction. To begin with, we put r(w0 ) = v. Now two cases are possible. 21
Case #: Suppose that r(w0 ) has been already de ned and w 0i w0 with unde ned r(w). By (iv.d) in De nition 19, there is a term x such that fE j (x : 2i E ) 2 0 (w)g fE j (r(w0 ) : E ) 2 0 (w0 )g: Then we put r(w) = x. We proceed with Case # till (in nitely many steps) we reach the root of F0 . After that we switch to Case ": Suppose that r(w) has already been de ned, but there is w0 0i w with unde ned r(w0 ). Let 3i C1 ; : : : ; 3ik Ck be all distinct concepts in Fg(#) of the form 3iC such that (r(w) : 3ij Cj ) 2 0 (w), 1 j k .By de nition of Fg, we have k 2 jcon(#)j. For every such 3ij Cj we choose by (iii') a world wj with unde ned r(wj ) and a term xj such that fE j (v : 2ij E ) 2 0 (w)g [ fCj g fE j (xj : E ) 2 0 (wj )g and wj 6= wl whenever j 6= l. Put r(wj ) = xj . If we still have a world w0 0i w with unde ned r(w0 ), then we use condition (iv.c) in De nition 19, according to which there is a term x such that fE j (r(w) : 2i E ) 2 0 (w)g fE j (x : E ) 2 0 (w0 )g: Then we set r(w0 ) = x. It should be clear from the de nition that r is a run in F0 coming through v in 0 (w). Thus F0 is a quasimodel for #. That # is satis ed in F0 follows from # 2 (w# ). 2 Theorem 21 (soundness). If there is a complete and clash-free completion tree for a KACL -formula #, then # is satis able. Proof Let T be a complete and clash-free completion tree for #. De ne a structure F = hW; ; i by taking W to be the set of nodes in T , w i w0 i w0 is an i-successor of w in T , (w) = unmark(L(w)), where unmark(L(w)) is the constraint system obtained by `unmarking' the marked variables in L(w). By Lemma 20 and Theorem 14, it is sucient to show that F is a locally correct #-frame. Clearly, every (w), for w 2 W , is a saturated clash-free constraint system, i.e., a quasiworld for #. So F is a #-frame. Let us see why it is locally correct. Conditions (i){(iii) are satis ed simply because the rules R3f and R3c are not applicable to T in view of its completeness. Let (2i ') 2 L(w) and w i w0 . Then w0 has been generated by an application of a global generating rule (either R3f or R3c). As these rules are applied only when no other rule is applicable, 2i ' was already in L(w) by the moment of the application of that rule, and so ' 2 L(w0 ). This proves (iv.a). Conditions (iv.b) and (iv.c) are proved analogously, and (iv.d) follows from that the rules 2 R# and R#0 are not applicable to T . 1
22
Theorem 22 (completeness). If a KALC -formula # is satis able then, having started from T # , the satis ability-checking algorithm for KALC will construct a complete and clash-free completion tree for #. Proof If # is satis able then it is quasi-satis able, and so by Lemma 20, there are a locally correct #-frame F = hW; ; i and a world w# 2 W such that # 2 w# . We use F as a `guide' for applications of the non-deterministic rules to construct a complete and clash-free completion tree for #. We assume that, for every w 2 W , (x : >) 2 (w) whenever x is a variable which occurs in (w). (This is achieved by adding (x : >) to (w) whenever necessary. The resulting structure is still a locally correct #-frame.) Say that a completion tree T for # is F-compatible if the following holds: 1. there is a map from the set of nodes in T to W such that if g0 is an i-successor of g in T , then (g) i (g0 ) and if ' 2 L(g) then ' 2 ((g)), for every ' 2 for(#); 2. for each node g in T , there is a total surjective function g from the set of terms in ((g)) to the set of marked variables in L(g) such that if (v : C ) 2 L(g) and g (x) = v then (x : C ) 2 ((g)), and 3. for each node g in T , there is a total function g from the set of unmarked terms in L(g) to the set of terms in ((g)) such that if (x : C ) 2 L(g) then (g (x) : C ) 2 ((g)). Claim. If a completion tree T for # is F-compatible and T 0 is the result of an application of a rule R to T , then T 0 is F-compatible as well. Proof. Let T be an F-compatible completion tree, g a node in T , and let , g , and g be the functions supplied by the de nition of F-compatibility. Consider all possible cases for R. Suppose that the R^ rule is applicable to a formula ' ^ in L(g). Since T is F-compatible, (' ^ ) 2 ((g)) and since F is a #-frame, ((g)) is saturated, which means, in particular, that f'; g ((g)). The application of R^ to L(g) adds ' and to L(g). Then the0 very same functions , g , and g ensure that the resulting completion tree T is F-compatible. Suppose that the R_ rule is applicable to a formula ' _ in L(g). Then, as we know, (' _ ) 2 ((g)), and so either ' or is in ((g)). By applying the R_ rule to L(g) accordingly, we clearly obtain an F-compatible completion tree. Suppose that the Ru rule is applicable to a constraint x : C u D in L(g). Let y be a term in ((g)) such that either g (x) = y (x is unmarked in L(g)) or g (y) = x (x is marked in L(g)). In both cases we have (y : C u D) 2 ((g)). The non-applicability of Ru to ((g)) means that fy : C; y : Dg ((g)). The application of the Ru rule adds x : C and x : D to L(g). Hence, the functions , g , and g are as required for the resulting completion tree T 0 . Suppose the Rt rule is applicable to a constraint x : C t D in L(g). Then x is unmarked in L(g). Clearly, either g (x) : C or g (x) : D is in ((g)). By
23
applying the Rt rule to L(g) accordingly, we see that the functions , g , and g are as required (since x is unmarked in L(g), there is no term y in ((g)) such that g (y) = x). Suppose that the Rt0 rule is applicable to v : C t D in L(g). Then v is marked in L(g). Let Y be the set of terms y in ((g)) for which g (y) = v. Since g is surjective, Y is non-empty. The non-applicability of the Rt rule to ((g)) (recall that all variables in ((g)) are unmarked) means that fy : C; y : Dg \ ((g)) 6= ; for each y 2 Y . Put YC = fy 2 Y j (y : C ) 2 ((g))g; YD = Y n Y C : The application of Rt0 adds either (i) v : C or (ii) v : D to L(g), or (iii) it creates `a marked copy' v0 of v, for which, additionally, (v0 : D) 2 L(g) holds, and then adds v : C to L(g). If YC = ;, apply the rule in such a way that v : D is added. If YD = ;, apply the rule so that v : C is added. Otherwise we apply the rule in the third possible way. In the rst two cases, , g , and g are as required for the resulting completion tree T 0 . In the third case, de ne 0 if y 2 YD g0 (y) = v (y) otherwise g
and h0 = h for all h 6= g. The functions , g , and g0 ensure that T 0 is F-compatible. Suppose that the R= rule is applicable to a formula C = > and a term x in L(g). Then (C = >) 2 ((g)). Let y be a term in ((g)) such that either g (x) = y (x is unmarked in L(g)) or g (y) = x (x is marked in L(g)). Nonapplicability of R= to ((g)) means that (y : C ) 2 ((g)). Hence, after the application of R= (which adds x : C to L(g)), the functions , g , and g will be as required for the resulting completion tree T 0 . Suppose R6= is applicable to :(C = >) in L(g). Then :(C = >) 2 ((g)) and there is a term y in ((g)) such that (y : C ) 2 ((g)). By applying R6= to L(g), we introduce a new (unmarked) variable x. De ne g0 as the extension of g to x with g0 (x) = y, and put h0 = h for all h 6= g. The functions , g , and g0 are then as required for the resulting completion tree T 0 . Let R9 be applicable to x : 9R:C in L(g). Let y be a term in ((g)) such that either g (x) = y (x is unmarked in L(g)) or g (y) = x (x is marked in L(g)). In both cases, we have (y : 9R:C ) 2 ((g)) and can proceed as in the R6= case (note that the newly generated variable is unmarked in any case). Now we come to the global rules and suppose that R3f is applicable to 3i ' in L(g). Let (g) = w. Then (3i') 2 (w). The rule application generates an i-successor g0 of g. In view of (i) in De nition 19, there is a w0 2 W such that w i w0 and ' 2 (w0 ). Set (g0 ) = w0 . It remains to de ne g0 and g0 . The terms occurring in L(g0 ) are the object names in ob(#), one marked variable v, and a set of unmarked variables v1 ; : : : ; vk . Set 1. g0 (a) = a for every a 2 ob(#), 24
2. g0 (vj ) = min x j fE j (vj : E ) 2 L(g0 )g fE j (x : E ) 2 (w0 )g for 1 j k, and 3. g0 (x) = v for every term x in (w). The function g0 is well-de ned for all unmarked variables v1 ; : : : ; vk in L(g). Indeed, x a j 2 f1; : : : ; kg. By the de nition of the R3f rule, there is a variable v such that
fE j (v : 2i E ) 2 L(g)g = fE j (vj : E ) 2 L(g0 )g: By the de nition of F-compatibility and (iv.b), (iv.c) in De nition 19, it follows that there is a term x such that
fE j (vj : E ) 2 L(g0 )g fE j (x : E ) 2 (w0 )g: It is easy to see that the de ned functions , g , and g are as required. The case of R3c is considered analogously. Suppose that R# is applicable to a variable v in a L(g0 ) and (g0 ) = w0 . Then v is unmarked in L(g0 ) and there is a node g such that g0 is i-successor of g in T . Let g0 (v) = x and (g) = w. By the de nition of F-compatibility, we have w i w0 , and by (iv.d) in De nition 19, there is a term y such that
fE j (y : 2E ) 2 (w)g fE j (x : E ) 2 (w0 )g: The rule application nondeterministically chooses a marked variable v0 in L(g) and augments L(g0 ) by fv : D j (v0 : 2i D) 2 L(g)g. Take v0 = g (y) (which exists, since g is total). The functions , g , and g are as required for the resulting completion tree T 0 . Indeed, let (v0 : 2i D) 2 L(g). Since v0 = g (y), we have (y : 2i D) 2 (w) by the de nition of F-compatibility. By the choice of y, (x : D) 2 (w0 ) and, since g0 (v) = x, we can safely add v : D to L(g0 ). Finally, assume that the R#0 rule is applicable to a marked variable v1 in L(g0 ) and that (g0 ) = w0 . Then there is a node g such that g0 is an i-successor of g. Let X be the set of terms x in ((g0 )) for which g0 (x) = v1 and let (g) = w. As g0 is surjective, X 6= ;. For each x 2 X , x a term yx such that
fE j (yx : 2E ) 2 (w)g fE j (x : E ) 2 (w0 )g (such terms exist by (iv.d) in De nition 19). The rule application chooses a non-empty subset Y of the marked variables in L(g). Let
Y = fv0 j 9x 2 X : g (yx ) = v0 g: Since g is total, Y is non-empty. Let v10 ; : : : ; vk0 be all its elements. The application of R#0 does the following: it generates k ? 1 `marked copies' v2 ; : : : ; vk of v1 and then augments L(g0 ) with fvj : D j (vj0 : 2i D) 2 L(g)g for 1 j k. 25
Put
vj if x 2 X and g (yx ) = vj0 g (x) otherwise and h0 = h if h 6= g. It is obvious that the functions and g are as required for the resulting completion tree T 0 (note that g (vj ) is unde ned for 1 j k). We show that g0 is also as required. Assume that the rule application added a constraint (vj : D) to L(g0 ) and x a term x such that g0 (x) = vj . Then (vj0 : 2i D) 2 L(g). By the de nition of X and g0 , we have g (yx ) = vj0 , which yields (yx : 2i D) 2 (w). By the choice of yx , we then obtain (x : D) 2 (w0 ). 0 (x) = g
The claim is proved. Now, returning to the proof of the completeness theorem, we show that it follows from the claim above. Let T # be the initial completion tree for #, g the node in T # , and v the marked variable in L(g). Set (g) = w# (recall that we have # 2 w# ), g (x) = v for all x in (w), and g (a) = a for all a 2 ob(#). It is readily checked that these functions ensure that T # is F-compatible. By the claim above, the completion rules can be applied in such a way that the resulting completion trees are F-compatible. According to Theorem 18, we then eventually construct a complete F-compatible completion tree T . It remains to show that T is clash-free. Suppose otherwise. Let , g , and g (g a node in T ) be the functions supplied by the de nition of F-compatibility. Two cases are possible. Case 1: there is a node g in T such that fx : A; x : :Ag L(g) for some term x and concept name A. Suppose rst that x is unmarked. Then we have fg (x) : A; g (x) : :Ag ((g)), contrary to F being a #-frame. Now assume that x is marked. Since g is surjective, there is a term y in ((g)) such that g (y) = x. But then fy : A; y : :Ag ((g)), which is again a contradiction. Case 2: Clashes of the form :> 2 L(g) or (x : :>) 2 L(g) are considered analogously. 2
8 Conclusion In this paper, we developed a tableau algorithm that is capable of reasoning with a modalized description logic under the constant domain assumption. The presented work can be extended in at least two interesting directions. First, the considered logic KALC is still too weak for many application areas. For adequate temporal reasoning or reasoning about knowledge and belief, we are interested in extending our results to modalized DLs with a dierent modal component, such as, e.g., S4, KD45, or Since/Until Logic. We conjecture that the ideas underlying our algorithm can be applied to many modalized DLs. As the next step, we plan to combine the results given in this paper with the ones given in [31], i.e., to develop a tableau algorithm for Until Logic based on ALC with constant domains. To do this, it is necessary to combine the ideas of marked variables and duplication of nondeterministic rules presented in this paper with blocking in the modal component as performed in [31]. 26
Second, it would be interesting to investigate our claim of the presented tableau algorithm to be \practicable" in more detail. A straightforward implementation of the algorithm as described in this paper cannot be expected to have an acceptable run-time behavior. However, since the algorithm in this paper diers in several aspects from standard DL tableau algorithms, it is not clear if and how the known optimizations for tableau algorithms (see, e.g., [18, 19]) can be applied. Moreover, because of some of its unusual aspects, the presented algorithm may be amenable to new optimization techniques. For example, the marked variables could be used for early clash detection: Assume that constraints involving marked variables are expanded before constraints containing unmarked terms are expanded. If, during the expansion of constraints with unmarked variables, a constraint v : C for some concept C is obtained and we have v0 : :C for all marked variables v0 in the constraint system, then a clash can be reported immediately. An alternative strategy could be to delay the expansion of constraints with marked variables as long as possible. This means generating the minimal types `on demand', i.e., only if the R# rule needs to be applied. The application of this rule itself also calls for optimization, i.e., it seems possible to develop heuristics for guiding the nondeterministic choice of variables in this rule.
Acknowledgments The work in this paper was supported by the DFG Projects Ba1122/3-1 and Wo583/3-1 on \Combinations of Modal and Description Logics".
References [1] A. Artale and E. Franconi. A temporal description logic for reasoning about actions and plans. Journal of Arti cial Intelligence Research (JAIR), (9), 1998. [2] A. Artale and C. Lutz. A correspondance between temporal description logics. In P. Lambrix, A. Borgida, M. Lenzerini, R. Moller, and P. PatelSchneider, editors, Proceedings of the International Workshop on Description Logics (DL'99), number 22 in CEUR-WS, pages 145{149, Linkoeping, Sweden, July 30 { August 1 1999. [3] F. Baader, M. Buchheit, and B. Hollunder. Cardinality restrictions on concepts. Arti cial Intelligence, 88(1{2):195{213, 1996. [4] F. Baader and P. Hanschke. A scheme for integrating concrete domains into concept languages. In J. Mylopoulos and R. Reiter, editors, Proceedings of the Twelfth International Joint Conference on Arti cial Intelligence IJCAI-91, pages 452{457, Sydney, Australia, August 24{30, 1991. Morgan Kaufmann Publ. Inc., San Mateo, CA, 1991. 27
[5] F. Baader and A. Laux. Terminological logics with modal operators. In C. Mellish, editor, Proceedings of the 14th International Joint Conference on Arti cial Intelligence, pages 808{814, Montreal, Canada, 1995. Morgan Kaufmann. [6] F. Baader and H.-J. Ohlbach. A multi-dimensional terminological knowledge representation language. J. Applied Non-Classical Logics, 5:153{197, 1995. [7] D. Calvanese, G. De Giacomo, and M. Lenzerini. Reasoning in expressive description logics with xpoints based on automata on in nite trees. In Proc. of the 16th Int. Joint. Conf. on Arti cial Intelligence (IJCAI'99), 1999. [8] G. De Giacomo and M. Lenzerini. TBox and ABox reasoning in expressive description logics. In Proceedings of the Fifth International Conference on the Principles of Knowledge Representation and Reasoning (KR'96), pages 316{327. Morgan Kaufmann Publishers, 1996. [9] P. T. Devanbu and D. J. Litman. Taxonomic plan reasoning. Arti cial Intelligence, 84:1{35, 1996. [10] F. M. Donini, M. Lenzerini, D. Nardi, and A. Schaerf. Adding epistemic - operators to concept languages. In W. Nebel, Bernhard; Rich, Charles; Swartout, editor, Proceedings of the 3rd International Conference on Principles of Knowledge Representation and Reasoning, pages 342{356, Cambridge, MA, Oct. 1992. Morgan Kaufmann. [11] F. M. Donini, M. Lenzerini, D. Nardi, and A. Schaerf. Reasoning in description logics. In G. Brewka, editor, Foundation of Knowledge Representation, pages 191{236. CSLI-Publications, 1996. [12] R. Fagin, J. Halpern, Y. Moses, and M. Vardi. Reasoning about Knowledge. MIT Press, 1995. [13] D. Gabbay and V. Shehtman. Products of modal logics, part 1. Journal of the IGPL, 6:73{146, 1998. [14] V. Haarslev and R. Moller. An empirical evaluation of optimization strategies for ABox reasoning in expressive description logics. In P. Lambrix, A. Borgida, M. Lenzerini, R. Moller, and P. Patel-Schneider, editors, Proceedings of the International Workshop on Description Logics (DL'99), number 22 in CEUR-WS, pages 115{119, Linkoeping, Sweden, July 30 { August 1 1999. [15] V. Haarslev and R. Moller. RACE system description. In P. Lambrix, A. Borgida, M. Lenzerini, R. lf Moller, and P. Patel-Schneider, editors, Proceedings of the International Workshop on Description Logics, number 22 in CEUR-WS, pages 140{141, Linkoeping, Sweden, July 30 { August 1 1999. 28
[16] I. Hodkinson, F. Wolter, and M. Zakharyaschev. Decidable fragments of rst-order temporal logics. Annals of Pure and Applied Logic, 2000. [17] B. Hollunder and W. Nutt. Subsumption algorithms for concept languages. DFKI Research Report RR-90-04, Deutsches Forschungszentrum fur Kunstliche Intelligenz, Kaiserslautern, 1990. [18] I. Horrocks. Optimising Tableaux Decision Procedures for Description Logics. Phd thesis, University of Manchester, 1997. [19] I. Horrocks. Using an expressive description logic: Fact or ction? In A. Cohn, L. Schubert, and S.C.Shapiro, editors, Principles of Knowledge Representation and Reasoning { Proc. of the Sixth International Conference KR'98, pages 636{647, Trento, Italy, June 2{5, 1998. Morgan Kaufmann Publ. Inc., San Francicso, CA, 1998. [20] I. Horrocks. FaCT and iFaCT. In P. Lambrix, A. Borgida, M. Lenzerini, R. lf Moller, and P. Patel-Schneider, editors, Proceedings of the International Workshop on Description Logics, number 22 in CEUR-WS, pages 133{135, Linkoeping, Sweden, July 30 { August 1 1999. [21] I. Horrocks, P. Patel-Schneider, and R. Sebastiani. An analysis of empirical testing for modal decision procedures. Logic Journal of the IGPL, 8(3):293{ 323, 2000. [22] I. Horrocks and U. Sattler. A description logic with transitive and inverse roles and role hierarchies. Journal of Logic and Computation, 9(3), 1999. [23] I. Horrocks, U. Sattler, and S. Tobies. Practical reasoning for expressive description logics. In H. Ganzinger, D. McAllester, and A. Voronkov, editors, Proceedings of the 6th International Conference on Logic for Programming and Automated Reasoning (LPAR'99), number 1705 in Lecture Notes in Arti cial Intelligence, pages 161{180. Springer-Verlag, Sept. 1999. [24] G. Hughes and M. Cresswell. A New Introduction to Modal Logic. Methuen, London, 1996. [25] U. Hustadt and R. A. Schmidt. Issues of decidability for description logics in the framework of resolution. In R. Caterra and G. Salzer, editors, Automated Deduction in classical and non-classical logic, volume 1761 of Lecture Notes in Arti cial Intelligence, pages 191{205. Springer-Verlag, 1996. [26] M. Mosurovic and M. Zakharyaschev. On the complexity of description logics with modal operators. In P. Kolaitos and G. Koletos, editors, Proceedings of the 2nd Panhellenic Logic Symposion, pages 166{171, Delphi, Greece, 1999. [27] P. Patel-Schneider. DLP. In P. Lambrix, A. Borgida, M. Lenzerini, R. lf Moller, and P. Patel-Schneider, editors, Proceedings of the International Workshop on Description Logics, number 22 in CEUR-WS, pages 140{141, Linkoeping, Sweden, July 30 { August 1 1999. 29
[28] K. D. Schild. A correspondence theory for terminological logics: Preliminary report. In Proc. of the 13 th IJCAI, pages 466{471, Sidney, Australia, 1991. [29] K. D. Schild. Combining terminological logics with tense logic. In M. Filgueiras and L. Damas, editors, Progress in Arti cial Intelligence { 6th Portuguese Conference on Arti cial Intelligence, EPIA'93, Lecture Notes in Arti cial Intelligence, pages 105{120, Porto, Portugal, Oct. 1993. Springer-Verlag. [30] M. Schmidt-Schau and G. Smolka. Attributive concept descriptions with complements. Arti cial Intelligence, 48(1):1{26, 1991. [31] H. Sturm and F. Wolter. A tableau calculus for temporal description logic: The expanding domain case. to appear, 2000. [32] J. van Benthem. Temporal logic. In D. Gabbay, C. Hogger, and J. Robinson, editors, Handbook of Logic in Arti cial Intelligence and Logic Programming, Volume 4, pages 241{350. Oxford Scienti c Publishers, 1996. [33] F. Wolter. The product of converse PDL and polymodal K. Journal of Logic anf Computation, 10:223{251, 2000. [34] F. Wolter and M. Zakharyaschev. Satis ability problem in description logics with modal operators. In A. G. Cohn, L. Schubert, and S. C. Shapiro, editors, KR'98: Principles of Knowledge Representation and Reasoning, pages 512{523. Morgan Kaufmann, San Francisco, California, 1998. [35] F. Wolter and M. Zakharyaschev. Modal description logics: modalizing roles. Fundamenta Informaticae, 39:411{438, 1999. [36] F. Wolter and M. Zakharyaschev. Multi-dimensional description logics. In D. Thomas, editor, Proceedings of the 16th International Joint Conference on Arti cial Intelligence (IJCAI-99-Vol1), pages 104{109, S.F., July 31{ Aug. 6 1999. Morgan Kaufmann Publishers. [37] F. Wolter and M. Zakharyaschev. Temporalizing description logic. In D. Gabbay and M. de Rijke, editors, Frontiers of Combining Systems, pages 379{402. Studies Press/Wiley, 1999. [38] F. Wolter and M. Zakharyaschev. Dynamic description logic. In K. Segerberg, M. de Rijke, H. Wansing, and M. Zakharyaschev, editors, Advances in Modal Logic, Volume 2. CSLI Publications, 2000. [39] F. Wolter and M. Zakharyaschev. Decidable fragments of rst-order modal logics. Journal of Symbolic Logic, 2001.
30