Query Processing for a Knowledge-Base Using DOT Algebra
Masahiko TSUKAMOTO, Shojiro NISHIO, Mitsuhiko FUJIO
Information System Research and Development Center, SHARP Corporation, Tenri, Nara 632, JAPAN
Department of Information and Computer Sciences, Osaka University, Toyonaka, Osaka 560, JAPAN
Abstract
In our previous papers[13, 14], we proposed a new extended term representation, DOT (Deductive and Object-oriented Term representation), aiming at providing a capable framework for realizing the DOOD. Our knowledge representation scheme DOT has the following characteristics:
In this paper, we consider the query processing problem for a class of knowledge-based systems consisting of object names and labels. Each knowledge-based system in this class represents an IS-A relation and inheritance, and its two important features are (1) no distinction between type and entity, and (2) the capability to represent virtual objects. In order to discuss the query processing problem for such a system, a new algebra called DOT algebra is introduced. We show that a query concerning the IS-A relation reduces to the problem of finding the set of upper and/or lower bounds of a certain element of the DOT algebra. The answer to a query can be expressed by a regular expression, which is obtained by constructing an automaton from the query. We also show an implementation plan for our proposed knowledge-based system in a distributed system environment.
1
1. No distinction between type and entity.
2. Virtual objects represented by DOT expressions.
3. Consideration of an IS-A relation among DOT expressions.
In those papers, the object in our DOOD model was formally defined, and the IS-A relation was then considered on the power set of DOT expressions. Though such a formalization gave us powerful concepts for knowledge representation, it is also possible to consider DOT on the plain domain (not the power set), which is sufficient and effective in practical application areas such as dialogue understanding systems. In this paper, we discuss a primitive aspect of the foundation of knowledge representation and consider the DOT query processing problem, using a new algebra called DOT algebra, for a simplified version of DOT on the plain domain. We do not discuss an extended term representation in this paper, but represent a knowledge-based system with an IS-A relation and a type of inheritance using the DOT algebra. It is shown that a query concerning the IS-A relation reduces to the problem of finding the set of upper and/or lower bounds of a certain element of the DOT algebra. We then demonstrate that the answer to a query can be expressed by a regular expression[6], which is obtained by constructing an automaton[6] from the query. We discuss the reasons for making no distinction between type and entity in the next section. There, we also explain our approach to the representation of attributes, in contrast to the usual ones based on binary relations, as in semantic networks or frame theory. We show in section 3 that our representation can be regarded as a problem of deductive databases with function symbols. We give the formalization of DOT algebra in section 4, and define a query concerning the IS-A relation in section 5. In section 6, we discuss query processing and show that the answer to a query can be expressed by a regular expression in automata theory.
This query answering capability is very important because effective query evaluation methods for deductive databases with general function symbols remain an interesting open problem. Section 7 gives a brief description of our implementation plan for DOT in distributed environments. Several remarks related to this paper are given in section 8.
Introduction
Deductive databases[15] and object-oriented databases[1] are at the forefront of research in next generation intelligent database systems. The deductive database has the capability of rule-based deduction based on first order logic; however, the insufficiency of its modeling ability has been pointed out. On the other hand, the object-oriented database has a strong capability for data modeling, such as the complex structure of objects and the inheritance of attribute values among objects. Recently, aiming at providing a powerful single framework for intelligent database systems, research on integrating the object-oriented paradigm and rule-based deduction has been very active[2, 3, 5, 7, 8, 9, 11, 12, 13, 14, 17, 18]. Such a database model is called a deductive and object-oriented database (DOOD)[10]. Among them, F-logic[8, 9] gives a prominent knowledge representation based on frame theory, and it provides very rich concepts required for the DOOD. However, it is important to consider simpler DOOD models which are sufficient for, and efficiently applicable to, practical systems.
TH0372-3/91/0000/0046$01.00 © 1991 IEEE
Masayuki MIYAMOTO
2
Knowledge Representation in DOT Scheme
In this section, we first give our reason for treating type and entity without discrimination. Then, in comparison with the conventional representation of binary attributes, we present the motivation for introducing the new notion of virtual (or abstract) objects, i.e., DOT expressions.
2.1
Unified View of Type and Entity
A type is usually used for representing a set of entities. In F-logic[9], type and entity are regarded as essentially different notions, though it is allowed to employ the same name occasionally for a type and an entity. However, DOT does not discriminate type from entity at all and treats attributes of both type and entity in the same manner. One of our motivations for the unified view of type and entity is the following observation regarding the real world. Let us consider a student named ‘joe’. Usually ‘joe’ is regarded as an entity, and ‘student’ as well as ‘person’ is regarded as a type. We can describe the constraints on attribute values of entities which belong to a certain type (such constraints are called domain constraints). For instance, a ‘parent’ of a ‘person’ is a ‘person’. In the case of a database classifying the species of animals, it is reasonable to regard ‘person’ as an entity of the type ‘mammal’, where the ‘number of eyes’, ‘legs’ or ‘ecology’ are considered to be the attributes of ‘person’. On the other hand, it is possible to treat ‘joe’ as a type with constraints such that his ‘first name’ is ‘Joseph’, his ‘sex’ is ‘male’, ‘jim’ is his ‘parent’, and so on. As instances of ‘joe’, we can consider ‘joe on April 10’ or ‘joe at 2:00 p.m. on June 8’. In these instances, the values of the attributes, e.g., ‘age’, ‘hobby’, or ‘family name’, may be different. Consequently, the distinction between type and entity fully depends on the viewpoint (or context) in a given situation. Although it is sufficient to consider only one viewpoint for the conventional applications of databases, database systems will be required to support various viewpoints in advanced applications which handle knowledge, for instance, dialogue understanding systems. Our representation DOT stands on the assumption that viewpoints should be taken into account not in the step of knowledge representation but in that of knowledge processing.
Thus we do not distinguish between type and entity. Another reason for not making the distinction is our intention to provide a simple mathematical foundation for the inheritance relationship. For a more detailed discussion, see [14].
2.2
DOT Expression
Usually, in order to represent properties of objects, we use binary relations on the data domain consisting of objects. These relations correspond to attributes. For example, the situation that an attribute ‘parent’ of ‘joe’ takes a value ‘jim’ is described as $(joe, jim) \in parent$. In DOT, this relationship is represented by using an IS-A relation on the data domain, extending the data using DOT notation:
$(jim, joe.parent) \in$ IS-A.
Among the relations corresponding to many attributes, only an IS-A relation is treated in DOT, while usual approaches employ others as well as IS-A relations. One of the advantages of this simple approach lies in expressing the inheritance rule. For example, suppose that the following knowledge is given:
A ‘parent’ of a ‘person’ is a ‘person’.
‘joe’ is a ‘person’.
‘jim’ is a ‘parent’ of ‘joe’.
In usual approaches, the first and the last statements are expressed by the ‘parent’ relation and the second one by an IS-A relation. We can derive ‘jim’ IS-A ‘person’ by using inheritance. Generally, the inheritance is a rule given as follows:
Usual Inheritance: If the p-attribute value of an object a is an object b, the p-attribute value of an object c is an object d, and c IS-A a, then d IS-A b holds.
Now, to see that such an inheritance rule does not work well, we add the following knowledge to the above:
‘mary’ is a ‘parent’ of ‘joe’.
‘thisPerson’ is a ‘joe’.
‘thatPerson’ is a ‘parent’ of ‘thisPerson’.
Then we can derive the following sentences by using the inheritance rule (see Figure 1):
‘thatPerson’ is a ‘person’.
‘thatPerson’ is a ‘jim’.
‘thatPerson’ is a ‘mary’.
Figure 1: Usual Knowledge Representation
Although the first one is consistent with our intuition, the second and the third ones are not, because we cannot make sure whether ‘thatPerson’ is ‘jim’ or ‘mary’ from such information alone. In some approaches, to avoid such problems, relations are classified into several cases, for each of which inheritance rules are prepared. In other approaches, attributes are classified into single-valued ones, set-valued ones, and others. For example, F-logic[9] employs the four classes of attributes, ‘4’ ‘+’, ) ‘*’
The DOT representation can thus be reduced to a type of deductive database, as will be shown below. DOT expressions are represented by using unary function symbols; that is, a DOT expression $a.l_1.\cdots.l_k$ is represented by the term $l_k(\cdots(l_1(a))\cdots)$. The IS-A relation is expressed by a binary predicate isa, and satisfies:
Figure 2: DOT Representation
Reflexive law:
isa(X, X).
Transitive law:
isa(X, Y) :- isa(X, Z), isa(Z, Y).
Inheritance law (for each function symbol f):
isa(f(X), f(Y)) :- isa(X, Y).
For instance, the example in the previous section can be expressed by
% Fact
isa(parent(person), person).
isa(joe, person).
isa(jim, parent(joe)).
isa(mary, parent(joe)).
isa(thisPerson, joe).
isa(thatPerson, parent(thisPerson)).
% General Rule
isa(X, X).
isa(X, Y) :- isa(X, Z), isa(Z, Y).
isa(parent(X), parent(Y)) :- isa(X, Y).
and I=#’. In these cases, inheritance rules become more complicated, and these models seem less general as models of the real world. Furthermore, their computation models become more complicated, and it becomes very difficult to construct mathematical foundations. In DOT, an abstract object called a DOT expression is used for representing such knowledge. The original three statements given above are expressed as the following IS-A statements:
person.parent IS-A person (1)
joe IS-A person (2)
jim IS-A joe.parent. (3)
For deductive databases with general function symbols, the development of effective query processing methods is one of the most interesting open problems. Our work is a proposal of query processing for the class of deductive databases with the above constraints.
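The correspondence between DOT expressions and first-order terms described in this section can be illustrated with a small Python sketch. The helper names `dot_to_term`, `term_to_dot`, and `isa_fact` are hypothetical and do not come from the paper; the sketch only assumes that object names and labels contain no dots or parentheses.

```python
def dot_to_term(expr: str) -> str:
    """Rewrite a DOT expression a.l1.....lk as the nested term lk(...(l1(a))...)."""
    name, *labels = expr.split(".")
    term = name
    for label in labels:          # the first label becomes the innermost symbol
        term = f"{label}({term})"
    return term


def term_to_dot(term: str) -> str:
    """Inverse of dot_to_term for terms built from unary function symbols."""
    labels = []
    while term.endswith(")"):
        label, _, rest = term.partition("(")
        labels.append(label)      # the outermost symbol is the last label
        term = rest[:-1]          # strip the matching closing parenthesis
    return ".".join([term] + labels[::-1])


def isa_fact(lower: str, upper: str) -> str:
    """Render 'lower IS-A upper' as a fact for the binary predicate isa."""
    return f"isa({dot_to_term(lower)}, {dot_to_term(upper)})."


print(isa_fact("thatPerson", "thisPerson.parent"))
```

Representing a label as a unary function symbol keeps the inheritance law a single rule per label, which is why the logic program needs only one clause of the form isa(f(X), f(Y)) :- isa(X, Y) for each f.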
where ‘person.parent’ can be considered as an object representing the value of attribute ‘parent’ of ‘person’ in the abstract, and ‘joe.parent’ can be considered in the same way. Using DOT expression, the inheritance rule can be simply expressed as follows:
DOT Inheritance: If a IS-A b then a.p IS-A b.p.
In this example, we can obtain
joe.parent IS-A person.parent
by (2), and
jim IS-A person
by (1), (3) and the transitivity of the inheritance. The supplementary knowledge is expressed by the IS-A statements:
mary IS-A joe.parent.
thisPerson IS-A joe.
thatPerson IS-A thisPerson.parent.
Then it follows that ‘thatPerson’ is a ‘person’, but no IS-A relationship between ‘thatPerson’ and ‘jim’ or ‘mary’ can be derived (see Figure 2). As shown above, properties of objects in DOT are expressed by the IS-A relation among DOT expressions.
4
DOT Algebra
Hereafter, we assume that a finite set $A$ of object names and a finite set $L$ of labels are given and fixed. A product of $k$ elements in $L$ ($k \geq 0$)
$l_1.\cdots.l_k$
is called a label expression. Here we denote the product by ‘.’ (dot!). The non-negative integer $k$ is called the length of the label expression. The unique label expression whose length is 0 is called the empty label expression and denoted by $\epsilon$. The set of all label expressions generated by $L$ is called the label expression set and denoted by $L^*$. The product
$a.l_1.\cdots.l_k$
of an element $a$ of $A$ and an element $l_1.\cdots.l_k$ of $L^*$ is called a DOT expression. The set of all DOT expressions generated by $A$ and $L^*$ is called the DOT expression set and denoted by $AL^*$. For $\lambda_1 = l_1.\cdots.l_k$, $\lambda_2 = l'_1.\cdots.l'_m \in L^*$, the expression $\lambda_1.\lambda_2$ denotes the label expression $l_1.\cdots.l_k.l'_1.\cdots.l'_m$. Then, for arbitrary $\lambda_1, \lambda_2, \lambda_3 \in L^*$,
$(\lambda_1.\lambda_2).\lambda_3 = \lambda_1.(\lambda_2.\lambda_3)$,
$\epsilon.\lambda_1 = \lambda_1.\epsilon = \lambda_1$
hold. Hence $L^*$ forms a free monoid generated by $L$. Moreover, for $d = a.l_1.\cdots.l_k \in AL^*$ and $\lambda = l'_1.\cdots.l'_m \in L^*$, the expression $d.\lambda$ is the DOT expression $a.l_1.\cdots.l_k.l'_1.\cdots.l'_m$. Since, for arbitrary $d \in AL^*$, $\lambda_1, \lambda_2 \in L^*$,
$(d.\lambda_1).\lambda_2 = d.(\lambda_1.\lambda_2)$,
$d.\epsilon = d$,
(if $d.\lambda_1 = d$, then $\lambda_1 = \epsilon$)
hold, the monoid $L^*$ acts on $AL^*$. In particular, the last equation means that the action of $L^*$ on $AL^*$ is free.
Example 1. Let us consider $A = \{joe, jim, person\}$ and $L = \{parent, ancestor\}$ as the set of object names and the set of labels. Then the label expression set and the DOT expression set are given as follows:
$L^* = \{\epsilon, parent, ancestor, parent.parent, parent.ancestor, ancestor.parent, ancestor.ancestor, \ldots\}$.
$AL^* = \{joe, jim, person, joe.parent, jim.parent, person.parent, joe.ancestor, jim.ancestor, person.ancestor, joe.parent.parent, jim.parent.parent, person.parent.parent, joe.parent.ancestor, jim.parent.ancestor, person.parent.ancestor, \ldots\}$. □
A binary relation $B$ on $AL^*$ is called an IS-A relation if the following conditions are satisfied for $x, y, z \in AL^*$ and $\lambda \in L^*$:
1. (Reflexive Law) $(x, x) \in B$.
2. (Transitive Law) If $(x, y), (y, z) \in B$, then $(x, z) \in B$.
3. (Inheritance Law) If $(x, y) \in B$, then $(x.\lambda, y.\lambda) \in B$.
In general, a relation which satisfies the reflexive law and the transitive law is called a quasi-ordering relation. Condition 3 shows that the action of $L^*$ is monotone with respect to the quasi-ordering. Therefore, an IS-A relation is exactly a quasi-ordering relation on $AL^*$ such that the right action of $L^*$ is monotone with respect to the ordering. It may make sense to refer to such a relation as a right invariant quasi-ordering relation. Here the following lemma can be easily shown:
Lemma 1. For an arbitrary binary relation $R$ on $AL^*$, there exists the minimum IS-A relation $\mathcal{I}$ with respect to inclusion such that $R \subset \mathcal{I}$. □
The IS-A relation $\mathcal{I}$ given by the above lemma is denoted by $M(R)$. Consider the binary relation
$R = \{(person.parent, person), (joe, person), (jim, joe.parent)\}$.
By applying the inheritance law to $(joe, person) \in M(R)$, we obtain
$(joe.parent, person.parent) \in M(R)$.
From this and
$(person.parent, person) \in M(R)$, $(jim, joe.parent) \in M(R)$,
we have
$(jim, person) \in M(R)$
by virtue of the transitive law. In DOT theory, we infer the fact that ‘jim’ is a ‘person’ from the relation $R$ as the knowledge-base. Hence $M(R)$ is interpreted as the whole set of the IS-A relation represented by the knowledge-base $R$.
3
Comparison: DOT and First Order Logic
The relation among objects using DOT expressions can be represented in a framework of first order predicate logic.
5
DOT Query
For a given binary relation $R$ on $AL^*$, regarded as a knowledge-base, a query is expressed by either $?(d, X)$ or $?(X, d)$, where $d \in AL^*$ and $X$ is a variable. The intuitive meaning of the query $?(d, X)$ is "what is $X$ that satisfies ‘$d$ is-a $X$’ in $M(R)$?", and that of $?(X, d)$ is "what is $X$ that satisfies ‘$X$ is-a $d$’?" For example, the query $?(jim, X)$ asks what jim is. The answer is "jim is joe's parent" and "jim is person." The query $?(X, joe.parent)$ asks what (who) is joe's parent. The answer is "jim is joe's parent." The answers to the queries $?(d, X)$ and $?(X, d)$ are respectively $U_R(d)$ and $L_R(d)$ given by
$U_R(d) = \{x \in AL^* \mid (d, x) \in M(R)\}$,
$L_R(d) = \{x \in AL^* \mid (x, d) \in M(R)\}$,
where $U_R(d)$ and $L_R(d)$ are respectively the sets of upper and lower bounds of $d$ with respect to the quasi-ordering relation $M(R)$. We prove that if the knowledge-base $R$ is finite then $U_R(d)$ and $L_R(d)$ can be expressed by regular expressions in automata theory. Namely, we can construct a nondeterministic finite automaton with $\epsilon$-moves which accepts exactly $U_R(d)$ (or $L_R(d)$). In our example, these sets are given by
$U_R(joe.parent) = joe.parent + person.parent + person$,
$L_R(joe.parent) = joe.parent + jim$.
If we consider another relation
$R' = \{(joe.parent, joe.ancestor), (joe.ancestor.parent, joe.ancestor)\}$,
the answer to a query $?(X, joe.ancestor)$ is
$L_{R'}(joe.ancestor) = joe.(parent + ancestor).parent^*$.
The obtained answer is represented by a regular expression and is generally a set consisting of infinitely many elements, which is unusual in conventional database systems. As one of the advantages of DOT, virtual objects are defined as the elements of the DOT expression set; they are not contained in the set of object names and may be elements of the answer set. Due to this fact, the answer to a query may have infinitely many elements, though the object name set is finite.
6
Query Processing
In this section, let us assume that the knowledge-base $R$ is finite and given. We show that the answer to a query can be expressed by a regular expression, and give a method for constructing an automaton that accepts exactly the same set as the answer.
Definition 1. Let $B$ be an arbitrary binary relation on $AL^*$. A pair $(x, y) \in AL^* \times AL^*$ is directly derivable from $B$ if either of the following conditions holds:
1. $\exists z \in AL^*$, $(x, z), (z, y) \in B$.
2. $\exists (x', y') \in B$, $\exists \lambda \in L^*$, $x = x'.\lambda$, $y = y'.\lambda$.
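Definition 1 can be read as a one-step derivation test. The following Python sketch applies that test repeatedly inside a finite, prefix-closed set of DOT expressions and reports the stage at which (jim, person) first appears for the knowledge-base of section 4. DOT expressions are modelled as tuples of strings; the function names and the choice of the finite domain `D` are assumptions made for illustration.

```python
def directly_derivable(x, y, B):
    """Definition 1: is the pair (x, y) directly derivable from the relation B?"""
    # Condition 1: some z with (x, z) and (z, y) both in B.
    if any((x, z) in B and (z, y) in B for (_, z) in B):
        return True
    # Condition 2: x = x'.lam and y = y'.lam for some (x', y') in B, where lam
    # is a common suffix of k labels (k = 0 corresponds to the empty label expression).
    for k in range(min(len(x), len(y))):
        if x[len(x) - k:] == y[len(y) - k:] and (x[:len(x) - k], y[:len(y) - k]) in B:
            return True
    return False


def derivation_stages(D, R):
    """Iterate direct derivation inside the finite domain D until nothing changes."""
    current = {(x, y) for (x, y) in R if x in D and y in D} | {(x, x) for x in D}
    stages = [current]
    while True:
        nxt = {(x, y) for x in D for y in D if directly_derivable(x, y, current)}
        if nxt == current:
            return stages
        stages.append(nxt)
        current = nxt


def dot(s):
    """Parse 'a.l1.l2' into the tuple ('a', 'l1', 'l2')."""
    return tuple(s.split("."))


R = {(dot("person.parent"), dot("person")),
     (dot("joe"), dot("person")),
     (dot("jim"), dot("joe.parent"))}
D = {dot(s) for s in ["person", "person.parent", "joe", "joe.parent", "jim"]}
stages = derivation_stages(D, R)
first = next(i for i, s in enumerate(stages) if (dot("jim"), dot("person")) in s)
print(first)  # stage index at which (jim, person) is first derived
```

The loop terminates because each stage contains the previous one (take the empty label expression in condition 2) and the domain is finite.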
Definition 2. For an arbitrary set $D \subset AL^*$, relations $\mathcal{A}(D, R)$ and $\mathcal{A}_i(D, R)$ ($i \geq 0$) are defined as follows:
$\mathcal{A}_0(D, R) = (R \cap (D \times D)) \cup \{(x, x) \mid x \in D\}$,
$\mathcal{A}_{i+1}(D, R) = \{(x, y) \in D \times D \mid (x, y) \text{ is directly derivable from } \mathcal{A}_i(D, R)\}$ ($i \geq 0$),
$\mathcal{A}(D, R) = \bigcup_{i=0}^{\infty} \mathcal{A}_i(D, R)$.
We denote $\mathcal{A}(AL^*, R)$ by $\mathcal{A}(R)$ and $\mathcal{A}_i(AL^*, R)$ by $\mathcal{A}_i(R)$.
Definition 3. Let $X \subset AL^*$. We define a mapping $T$ of $2^{AL^*}$ to $2^{AL^*}$ by
$T(X) = \{y \mid \exists \lambda \in L^*\ \exists x \in X,\ y.\lambda = x\}$.
For our knowledge-base $R$ and $d \in AL^*$, consider the set $U_R(d)$. Then construct the automaton $A^U_R(d)$ according to the following procedure:
1. Put $D_R(d) = T(\{x \in AL^* \mid \exists y \in AL^*, (x, y) \in R \text{ or } (y, x) \in R\} \cup \{d\})$.
2. Calculate $\mathcal{A}(D_R(d), R)$.
3. Construct the nondeterministic finite automaton $A^U_R(d) = (Q, \Sigma, \delta, q_0, F)$ with $\epsilon$-moves:
Set of States: $Q = D_R(d) \cup \{q_0\}$ ($q_0 \notin AL^*$).
Input Alphabet: $\Sigma = A \cup L \cup \{\epsilon\}$.
Transition Function:
• For $q_0$ and $s \in \Sigma$,
$\delta(q_0, s) = \begin{cases} \{s\} & (s \in A \cap D_R(d)) \\ \emptyset & (\text{otherwise}) \end{cases}$
• For $q \in D_R(d)$ and $s \in \Sigma$,
$\delta(q, s) = \begin{cases} \{x \mid (x, q) \in \mathcal{A}(D_R(d), R)\} & (s = \epsilon) \\ \{q.s\} & (s \in L \text{ and } q.s \in D_R(d)) \\ \emptyset & (\text{otherwise}) \end{cases}$
Initial State: $q_0$.
Set of Final States: $F = \{d\}$.
The set $D_R(d)$ in procedure 1 is finite because $R$ is finite. Then the set $\mathcal{A}(D_R(d), R)$ in procedure 2 is also finite, and this procedure always terminates in finitely many steps. Therefore the set of states of the automaton is finite. The following is our main theorem.
Main Theorem. The set $U_R(d)$ is equal to the set accepted by the finite automaton $A^U_R(d)$. □
In the rest of this section, our main concern is to prove the theorem. The following proposition is straightforward:
Proposition 1. $M(R) = \mathcal{A}(R)$. □
In what follows, we will identify $M(R)$ and $\mathcal{A}(R)$.
Definition 4. For $d, d' \in AL^*$, a finite sequence $\{d_i\}_{i=1}^{n}$ of $AL^*$ which satisfies the following conditions is called a proof sequence for $(d, d')$, or simply a proof sequence:
1. $d_1 = d$, $d_n = d'$,
2. $\forall i$ ($1 \leq i < n$), $\exists (r_i, r'_i) \in R$, $\exists \lambda_i \in L^*$, $d_i = r_i.\lambda_i$ and $d_{i+1} = r'_i.\lambda_i$.
The integer $n$ is called the length of the proof sequence.
Proposition 2. A necessary and sufficient condition for $(d, d') \in \mathcal{A}(R)$ is that there exists a proof sequence for $(d, d')$.
Proof. If there exists a proof sequence $\{d_i\}_{i=1}^{n}$ for $(d, d')$, then it follows immediately from $(d_i, d_{i+1}) \in \mathcal{A}(R)$ ($1 \leq i < n$) that $(d, d') \in \mathcal{A}(R)$. Conversely, we show that if $(d, d') \in \mathcal{A}(R)$ then there exists a proof sequence for $(d, d')$. Suppose $(d, d') \in \mathcal{A}(R)$. Since $\mathcal{A}(R) = \bigcup_{i=0}^{\infty} \mathcal{A}_i(R)$, there exists an integer $m \geq 0$ such that $(d, d') \in \mathcal{A}_m(R)$. We show the proposition by induction on $m$.
i) For $m = 0$, by definition of $\mathcal{A}_0(R)$, either of the following holds:
1. $(d, d') \in R$.
2. $d = d'$.
In case 1, the sequence $d, d'$ is a proof sequence for $(d, d')$. In case 2, the trivial sequence $d$ is a proof sequence.
ii) Assume that the proposition holds for $m = k$. Consider the case $m = k + 1$. Let $(d, d') \in \mathcal{A}_{k+1}(R)$. By definition of $\mathcal{A}_{k+1}(R)$, one of the following holds:
1. There exists $z \in AL^*$ such that $(d, z), (z, d') \in \mathcal{A}_k(R)$.
2. There exist $(x, y) \in \mathcal{A}_k(R)$, $x, y \in AL^*$ and $\lambda \in L^*$ such that $d = x.\lambda$ and $d' = y.\lambda$.
In case 1, by the assumption of induction, there exist proof sequences $d = u_1, \ldots, u_p = z$ and $z = v_1, \ldots, v_q = d'$ for $(d, z)$ and $(z, d')$, respectively. Then the sequence
$u_1, \ldots, u_p = v_1, \ldots, v_q$
is a proof sequence for $(d, d')$. In case 2, since $(x, y) \in \mathcal{A}_k(R)$, there exists a proof sequence $x = u_1, \ldots, u_p = y$ for $(x, y)$ by the assumption of induction. Then
$d = u_1.\lambda, \ldots, u_p.\lambda = d'$
is a proof sequence for $(d, d')$. □
Lemma 2. Let $\{d_i\}_{i=1}^{n}$ ($n \geq 1$) be a proof sequence. Then there exists a proof sequence $\{e_i\}_{i=1}^{n}$ such that
1. $\exists \lambda \in L^*$, $d_i = e_i.\lambda$ ($1 \leq i \leq n$),
2. $\exists p$, $1 \leq p < n$, $(e_p, e_{p+1}) \in R$.
Proof. Since $\{d_i\}_{i=1}^{n}$ is a proof sequence, for all $i$ ($1 \leq i < n$), there exist $r_i, r'_i, \lambda_i$ such that $d_i = r_i.\lambda_i$, $d_{i+1} = r'_i.\lambda_i$. Let $\lambda_p$ be one of the label expressions that have the shortest label length in $\{\lambda_1, \ldots, \lambda_{n-1}\}$. Then for each $i$ ($1 < i < n$), the element $d_i$ can be expressed in two ways:
$d_i = r_i.\lambda_i = r'_{i-1}.\lambda_{i-1}$.
As for the two $\lambda$-factors of the last equality, the one with the shorter label length divides the other. (Consider the refined expressions for the $\lambda$'s with elements in $L$.) From this fact, it is clear that the element $\lambda_p$, which has the shortest label length among the $\lambda$'s, divides the others:
$\lambda_i = \lambda'_i.\lambda_p$ ($1 \leq i < n$),
where $\lambda'_p = \epsilon$. Then the sequence $\{e_i\}_{i=1}^{n}$:
$e_i = r_i.\lambda'_i$ ($1 \leq i < n$), $e_n = r'_{n-1}.\lambda'_{n-1}$
is a proof sequence and
$d_i = e_i.\lambda_p$ ($1 \leq i \leq n$),
$(e_p, e_{p+1}) \in R$
hold. This establishes the lemma. □
Proposition 3. Let $D \subset AL^*$. If $D$ satisfies $R \subset D \times D$ and $T(D) \subset D$, then
$\mathcal{A}(D, R) = \mathcal{A}(R) \cap (D \times D)$.
Proof. Since $D \subset AL^*$, $\mathcal{A}_i(D, R) \subset \mathcal{A}_i(R)$ ($i \geq 0$) holds, which means $\mathcal{A}(D, R) \subset \mathcal{A}(R) \cap (D \times D)$. Conversely, if $(x, y) \in \mathcal{A}(R) \cap (D \times D)$, then, by Proposition 2, there exists a proof sequence $\{d_i\}_{i=1}^{n}$ for $(x, y)$. We fix the proof sequence and show that $(d_1, d_n) \in \mathcal{A}(D, R)$ by induction on the length $n$.
i) In case $n = 1$, $(d_1, d_1) \in \mathcal{A}_0(D, R) \subset \mathcal{A}(D, R)$.
ii) Assume that the statement is true for $n \leq k$ for some positive integer $k$ and consider the case $n = k + 1$. From Lemma 2, there exist a proof sequence $\{e_i\}_{i=1}^{n}$, $\lambda \in L^*$, and $p$ ($1 \leq p < n$), such that $d_i = e_i.\lambda$ ($1 \leq i \leq n$) and $(e_p, e_{p+1}) \in R$. Since $d_1 = x \in D$ and $T$ preserves $D$, we have $e_1 \in T(\{d_1\}) \subset T(D) \subset D$. Thus $e_1 \in D$. Similarly, it can be shown that $e_n \in D$. On the other hand, since $(e_p, e_{p+1}) \in R \subset D \times D$, both $e_p$ and $e_{p+1}$ are in $D$. Hence the sequences $\{e_i\}_{i=1}^{p}$ and $\{e_i\}_{i=p+1}^{n}$ are proof sequences satisfying $(e_1, e_p), (e_{p+1}, e_n) \in D \times D$, and both of their lengths are less than $n$. Moreover, we see that $(e_1, e_p)$ and $(e_{p+1}, e_n)$ are contained in $\mathcal{A}(R)$ by Proposition 2. Thus we have $(e_1, e_p), (e_{p+1}, e_n) \in \mathcal{A}(D, R)$ by the assumption of induction. By combining these facts with $(e_p, e_{p+1}) \in R \subset \mathcal{A}_0(D, R) \subset \mathcal{A}(D, R)$, we obtain that $(e_1, e_n) \in \mathcal{A}(D, R)$. Finally, since the pair $(d_1, d_n) = (e_1.\lambda, e_n.\lambda)$ is directly derivable from $(e_1, e_n)$, we can conclude that $(d_1, d_n) \in \mathcal{A}(D, R)$. □
Remark that $D_R(d)$ satisfies the assumptions for $D$ in Proposition 3, that is, $R \subset D_R(d) \times D_R(d)$ and $T(D_R(d)) \subset D_R(d)$.
Definition 5. Let $R = \{(d_1, d'_1), \ldots, (d_n, d'_n)\}$ ($d_1, d'_1, \ldots, d_n, d'_n \in AL^*$) and $d \in AL^*$ be given. Consider the automaton $A^U_R(d) = (Q, \Sigma, \delta, q_0, F)$. Let $q$ be a state. We use $E(q)$ to denote the set of all states that can be reached from $q$ using $\epsilon$-moves only.¹ Let $P$ be a set of states and we define $E(P) = \bigcup_{p \in P} E(p)$. For $q \in Q$, $l \in \Sigma$, we define
$\hat{\delta}(q, l) = E(\delta(E(q), l))$.
¹ In [6], $E(q)$ is denoted by $\epsilon$-CLOSURE($q$).
Lemma 3. In $A^U_R(d)$, the function $\hat{\delta}$ satisfies
$\hat{\delta}(q_0, a) = \{x \mid (x, a) \in \mathcal{A}(D_R(d), R)\}$,
$\hat{\delta}(q, l) = \{x \mid \exists y, (y, q), (x, y.l) \in \mathcal{A}(D_R(d), R)\}$,
where $a \in A$, $q \in D_R(d)$, and $l \in L$.
Proof. By using the transitivity of $\mathcal{A}(D_R(d), R)$, we have $E(q) = \delta(q, \epsilon)$ for $q \in D_R(d)$, from which the statement follows. □
Consider the nondeterministic finite automaton $\hat{A}^U_R(d) = (Q, \Sigma - \{\epsilon\}, \hat{\delta}, q_0, F)$. The automaton $\hat{A}^U_R(d)$ is an automaton equivalent to $A^U_R(d)$, that is, the set accepted by $\hat{A}^U_R(d)$ is equal to the one accepted by $A^U_R(d)$. (See [6].)
Lemma 4. In $\hat{A}^U_R(d)$, if $x \in \hat{\delta}(q_0, y)$, then $(x, y) \in \mathcal{A}(R)$.
Proof. Let $y = a.l_1.\cdots.l_n$ ($n \geq 0$). We show the lemma by induction on $n$.
i) In case $n = 0$, that is, $y = a$, if $x \in \hat{\delta}(q_0, a)$, then from Lemma 3 $(x, a) \in \mathcal{A}(D_R(d), R) \subset \mathcal{A}(R)$.
ii) Assume the lemma holds for $n = k$. Consider the case $n = k + 1$, that is, $y = a.l_1.\cdots.l_{k+1}$. Let $y' = a.l_1.\cdots.l_k$. If $\hat{\delta}(q_0, y') = \emptyset$, then
$\hat{\delta}(q_0, y) = \hat{\delta}(q_0, y'.l_{k+1}) = \hat{\delta}(\hat{\delta}(q_0, y'), l_{k+1}) = \emptyset$.
In this case, the statement is trivial. We may assume that $\hat{\delta}(q_0, y') \neq \emptyset$. Let $\hat{\delta}(q_0, y') = \{a_1, \ldots, a_s\}$ ($s \geq 1$). Then
$\hat{\delta}(q_0, y) = \hat{\delta}(\hat{\delta}(q_0, y'), l_{k+1}) = \hat{\delta}(\{a_1, \ldots, a_s\}, l_{k+1}) = \bigcup_{j=1}^{s} \hat{\delta}(a_j, l_{k+1})$.
On the other hand, by the assumption of induction, each $a_j$ ($1 \leq j \leq s$) satisfies $(a_j, y') \in \mathcal{A}(R)$. By Lemma 3,
$\hat{\delta}(a_j, l_{k+1}) = \{x \mid \exists w, (w, a_j), (x, w.l_{k+1}) \in \mathcal{A}(D_R(d), R)\}$.
If $x \in \hat{\delta}(a_j, l_{k+1})$, then there exists $w$ such that
$(w, a_j) \in \mathcal{A}(D_R(d), R) \subset \mathcal{A}(R)$, (1)
$(x, w.l_{k+1}) \in \mathcal{A}(D_R(d), R) \subset \mathcal{A}(R)$. (2)
By (1), $(w.l_{k+1}, a_j.l_{k+1}) \in \mathcal{A}(R)$. Combining this with (2),
$(x, a_j.l_{k+1}) \in \mathcal{A}(R)$.
If $x \in \hat{\delta}(q_0, y)$, then there exists $j$ ($1 \leq j \leq s$) such that $x \in \hat{\delta}(a_j, l_{k+1})$, and hence
$(x, a_j.l_{k+1}) \in \mathcal{A}(R)$.
Moreover, by $(a_j, y') \in \mathcal{A}(R)$,
$(a_j.l_{k+1}, y) = (a_j.l_{k+1}, y'.l_{k+1}) \in \mathcal{A}(R)$.
By the above two expressions, we have
$(x, y) \in \mathcal{A}(R)$. □
Now we can show our main theorem.
Proof of Main Theorem. Since the sets accepted by $A^U_R(d)$ and $\hat{A}^U_R(d)$ coincide, we prove the theorem for $\hat{A}^U_R(d)$ instead of $A^U_R(d)$.
i) If $\hat{A}^U_R(d)$ accepts $x$, then $d \in \hat{\delta}(q_0, x)$. By Lemma 4, we have $(d, x) \in \mathcal{A}(R)$, which means $x \in U_R(d)$.
ii) Conversely, if $(d, x) \in \mathcal{A}(R)$, then there exists a proof sequence $\{d_i\}_{i=1}^{n}$ for $(d, x)$ by Proposition 2. We fix the sequence and show that $\hat{A}^U_R(d)$ accepts $d_n$ by induction on $n$.
1. In case $n = 1$, that is, $x = d$, it is clear by the construction of $\hat{A}^U_R(d)$ that $\hat{A}^U_R(d)$ accepts $d$.
2. Assume that $\hat{A}^U_R(d)$ accepts $d_k$ for any proof sequence $\{d_i\}_{i=1}^{k}$ for some positive integer $k$. Consider a proof sequence $\{d_i\}_{i=1}^{k+1}$. There exist $r_k, r'_k \in D_R(d)$ and $\lambda_k \in L^*$ such that
$d_k = r_k.\lambda_k$, $d_{k+1} = r'_k.\lambda_k$, $(r_k, r'_k) \in R$.
It is shown that $\hat{\delta}(q_0, r_k) \subset \hat{\delta}(q_0, r'_k)$ as follows: If $q \in \hat{\delta}(q_0, r_k)$, then $(q, r_k) \in \mathcal{A}(R)$ by Lemma 4. From the fact that $(r_k, r'_k) \in R$, $(q, r'_k) \in \mathcal{A}(R)$. Moreover, since $q, r'_k \in D_R(d)$, we have $(q, r'_k) \in \mathcal{A}(D_R(d), R)$ by Proposition 3. Hence it follows that $q \in \hat{\delta}(q_0, r'_k)$ by Lemma 4.
Thus
$\hat{\delta}(q_0, d_k) = \hat{\delta}(q_0, r_k.\lambda_k) = \hat{\delta}(\hat{\delta}(q_0, r_k), \lambda_k) \subset \hat{\delta}(\hat{\delta}(q_0, r'_k), \lambda_k) = \hat{\delta}(q_0, r'_k.\lambda_k) = \hat{\delta}(q_0, d_{k+1})$.
From the assumption of induction, $\hat{\delta}(q_0, d_k) \cap F \neq \emptyset$. Therefore we obtain $\hat{\delta}(q_0, d_{k+1}) \cap F \neq \emptyset$, which means $d_{k+1}$ is accepted. □
7 Implementation of DOT in a Distributed Environment
Aiming at sharing the knowledge on the IS-A relation among various applications, we are now considering the implementation of DOT in a distributed environment. More precisely, we are considering the following two system protocols.
1. The communication protocol between a DOT server (a DOT database manager) and a DOT client for transmitting the following two types of messages:
(a) Commands for updating (i.e., appending and deleting) the knowledge.
(b) Queries and their answers.
2. DOT specific version of the OSI-RDA protocol[19].
The distributed system architecture of DOT provides us the following advantages:
• DOT is a rather primitive knowledge representation and various kinds of applications may share the knowledge stored in distributed DOT databases.
Figure 3: The Automaton $A^U_R(d)$ of Example 2
Example 2. Let
$R = \{(a, b.p.p), (b, c), (c, d), (d.p.p, e.q), (e, e.q)\}$
and consider the query $?(a.q.q.p, X)$. The automaton $A^U_R(a.q.q.p)$ is illustrated in Figure 3 and the answer is
$(a + (b + c + d).p.p + e.q.(q)^*).q.q.p$. □
We can easily construct an automaton $A^L_R(d)$ for $L_R(d)$ in the same manner as that of $A^U_R(d)$ except for $\delta(q, \epsilon)$. $\delta(q, \epsilon)$ of $A^L_R(d)$ is given by
$\delta(q, \epsilon) = \{x \mid (q, x) \in \mathcal{A}(D_R(d), R)\}$.
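The construction of section 6 can be sketched in Python as a direct simulation of the automaton rather than as an explicit state table. DOT expressions are modelled as tuples of strings, and the function names `prefixes`, `closure`, and `accepts` are hypothetical. The sketch assumes that the move from the initial state on an object name a leads to the state a whenever a belongs to the state set, and it restricts the closure to the prefix-closed set of expressions occurring in the knowledge-base and the query.

```python
def dot(s):
    """Parse 'a.l1.l2' into the tuple ('a', 'l1', 'l2')."""
    return tuple(s.split("."))


def prefixes(exprs):
    """The mapping T: all DOT expressions that are prefixes of elements of exprs."""
    return {e[:i] for e in exprs for i in range(1, len(e) + 1)}


def closure(D, R):
    """Close R under the reflexive, transitive and inheritance laws inside D."""
    labels = {l for e in D for l in e[1:]}
    rel = {(x, y) for (x, y) in R if x in D and y in D} | {(x, x) for x in D}
    while True:
        new = {(x, z) for (x, y) in rel for (y2, z) in rel if y == y2}
        new |= {(x + (l,), y + (l,)) for (x, y) in rel for l in labels
                if x + (l,) in D and y + (l,) in D}
        if new <= rel:
            return rel
        rel |= new


def accepts(R, d, word, upper=True):
    """Simulate the automaton for U_R(d) (upper=True) or L_R(d) (upper=False)."""
    D = prefixes({e for pair in R for e in pair} | {d})   # procedure 1
    rel = closure(D, R)                                   # procedure 2

    def eps(states):
        # epsilon-moves: q -> x with (x, q) in rel (upper) or (q, x) in rel (lower);
        # one step suffices because rel is reflexive and transitive on D.
        if upper:
            return {x for (x, q) in rel if q in states}
        return {x for (q, x) in rel if q in states}

    states = eps({word[:1]} & D)          # assumed move from q0 on the object name
    for label in word[1:]:
        states = eps({q + (label,) for q in states if q + (label,) in D})
    return d in states                    # the only final state is d


R2 = {(dot("a"), dot("b.p.p")), (dot("b"), dot("c")), (dot("c"), dot("d")),
      (dot("d.p.p"), dot("e.q")), (dot("e"), dot("e.q"))}
print(accepts(R2, dot("a.q.q.p"), dot("e.q.q.q.p")))
```

Simulating the automaton on one word at a time avoids converting it to a regular expression; enumerating the accepted set as a regular expression, as in Example 2, would need an additional state-elimination step that is not shown here.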
Figure 4: Implementation of DOT on OSI-RDA Protocol
• Query and answer have simple forms.
• We can distribute the large CPU power required for query processing of DOT (i.e., load sharing).
8
Conclusion
In this paper, for a simple version of our proposed knowledge representation DOT, we have considered the DOT query processing problem by using DOT algebra, and we have shown that the answer to a query is expressed by a regular expression. We also provided an algorithm to obtain a nondeterministic automaton which gives the regular expression. As we focussed only on the IS-A relation and the inheritance rule, the proposed knowledge representation is very primitive, and we can use it in various areas, for example, class management in object-oriented database systems, a part of the knowledge processing engine for dialogue understanding systems, and embedding DOT in existing programming languages. Actually, the result of this paper is applicable not only to the kernel of the DOT query processor, but also to the sub-engine for database systems such as Juan[18] or QUIXOTE[12, 17] which employ the DOT ordering, to the type processing engine for object-oriented databases, and to the sub-engine for designing the schema of relational databases. Moreover, few studies have employed regular expressions as answers to queries of knowledge-based systems. Though some applications of automata theory were proposed for the verification of the properties of database logic programs[16], our application is different from such studies in the sense that we use regular expressions in knowledge representation. We believe that our new approach is promising for knowledge processing because it can deal with a certain class of infinite sets of knowledge. The work in this paper suggests some possible future directions for research.
• Efficient algorithm for implementing DOT servers.
• Development of methods for updating knowledge represented by DOT.
• A framework of the upper layer concepts for knowledge-bases using DOT including the viewpoint addressed in section 2.
² The DOT-operator ‘.’ and the IS-A relation of the DOT can be seen as the limited usage of has, is-connection of LAURA[4].
Acknowledgements
The authors would like to express their gratitude to the members of ETR-SWG of ICOT for invaluable comments on this work, and to Dr. T. Kawada and Dr. T. Chiba of SHARP Corporation for their persistent encouragement.
References
[1] Atkinson, M., Bancilhon, F., DeWitt, D., Dittrich, K., Maier, D., and Zdonik, S., “The Object-Oriented Database System Manifesto,” Proc. of the 1st Int’l Conf. on Deductive and Object-Oriented Databases, pp. 40-57, 1989.
[2] Ait-Kaci, H., “An Algebraic Semantics Approach to the Effective Resolution of Type Equations,” Journal of Theoretical Computer Science, Vol. 45, pp. 293-351, 1986.
[3] Beeri, C., Nasr, R., and Tsur, S., “Embedding ψ-terms in a Horn-clause Logic Language,” Proc. of the 3rd Int’l Conf. on Data and Knowledge Bases, pp. 347-359, 1988.
[4] Brown, R. and Parker, D. S., “LAURA: A Formal Data Model and her Logical Design Methodology,” Proc. of the 9th Int’l Conf. on Very Large Database Systems, pp. 206-217, 1983.
[5] Chen, W., Kifer, M., and Warren, D. S., “Hilog as a Platform for Database Languages (or why predicate calculus is not enough),” Proc. of the 2nd Int’l Workshop on Database Programming Language, pp. 121-135, 1989.
[6] Hopcroft, J. E. and Ullman, J. D., Introduction to Automata Theory, Languages, and Computation, Addison-Wesley Publishing Co., Inc., Reading, Mass., 1979.
[7] Kifer, M. and Wu, J., “A Logic for Object-Oriented Logic Programming (Maier’s O-Logic Revisited),” Proc. of the 8th ACM SIGACT-SIGMOD Symp. on Principles of Database Systems, pp. 379-393, 1989.
[8] Kifer, M. and Lausen, G., “F-Logic: A Higher-Order Language for Reasoning About Objects, Inheritance, and Scheme,” Proc. of the 1989 ACM SIGMOD Int’l Conf. on the Management of Data, pp. 134-146, 1989.
[9] Kifer, M., Lausen, G., and Wu, J., “Logical Foundation of Object-Oriented and Frame-Based Languages,” Technical Report 90/14 (revised), Dept. of Computer Science, University of New York at Stony Brook, 1990.
[10] Kim, W., Nicolas, J.-N., and Nishio, S. (Eds.), Deductive and Object-Oriented Databases, North-Holland, 1990.
[11] Maier, D., “A Logic for Objects,” Proc. of Workshop on Foundation of Deductive Databases and Logic Programming, pp. 6-26, 1986.
[12] Morita, Y., Haniuda, H., and Yokota, K., “Object Identity in QUIXOTE,” Report of SIGDBS, IPSJ, 90-DBS-80, pp. 109-118, 1990.
[13] Tsukamoto, M., Nishio, S., and Hasegawa, T., “DOT: A Term Representation for Deductive and Object-Oriented Databases,” Proc. of Advanced Database System Symposium, IPSJ, pp. 231-240, 1989. (in Japanese)
[14] Tsukamoto, M., Nishio, S., and Hasegawa, T., “Deductive and Object-Oriented Database Model using a Term Representation based on DOT Notation and IS-A Relationship,” Submitted for Publication, 1990. (in Japanese)
[15] Ullman, J. D., Database and Knowledge-Base Systems, Volume I, Computer Science Press, 1988.
[16] Vardi, M. Y., “Automata Theory for Database Theoreticians,” Proc. of the 8th ACM SIGACT-SIGMOD-SIGART Symp. on Principles of Database Systems, pp. 83-92, 1989.
[17] Yasukawa, H. and Yokota, K., “Labeled Graphs as Semantics of Objects,” Report of SIGDBS, IPSJ, 90-DBS-80, pp. 119-127, 1990.
[18] Yokota, K., “Outline of a Deductive and Object-Oriented Database Language Juan (Extended Abstract),” Report of SIGDBS, IPSJ, 90-DBS-78, pp. 149-157, 1990.
[19] ISO/IEC JTC1/SC21 WG 3, “Information Processing Systems - Open Systems Interconnection - Remote Database Access - Part 1: Generic Model, Service, and Protocol,” 1990.