Annals of Mathematics and Artificial Intelligence 3 (1991) 211-258
An Object-Oriented Deductive Language
Yves Caseau
Bellcore, room 2M337, 445 South Street, Morristown, NJ 07960-1910
[email protected]
Abstract:
We propose a logic for objects that captures the knowledge represented with the LAURE object-oriented language. The work is oriented toward efficient implementation and compilation of queries. A data model for object-oriented databases is presented, with a declarative logic language used to perform queries and positive updates on the database. The expressiveness of this language is reduced, compared to other propositions in the same field, by the use of purely Horn clauses. An equivalent relational algebra is given, from which a formal technique for performing positive updates, called differentiation, is obtained. Two algorithms are proposed that achieve a sound and complete resolution, either for a bottom-up evaluation or a top-down resolution. An efficient implementation of constraint resolution is presented in this framework.
1. INTRODUCTION
This paper describes the theoretical foundations and the practical implementation of an object-oriented deductive language, LAURE. LAURE is a programming language that incorporates logic and objects. As a programming language, LAURE is reflective [Ca89a], based on sets and relations, and includes a type system aimed at efficient compilation. As a deductive system, LAURE provides a logic language with a declarative semantics based on logic, and a set of algorithms that achieves efficient and complete resolution. The purpose of the paper is to present the underlying logic for objects that was developed for this system. The programming language is described in other documents [Lau89]. The goal here is similar to that of relational deductive databases, but an object-oriented database is used instead of a usual relational database. A deductive language for databases, such as LDL [NT89], consists of a logic language with declarative semantics that manipulates knowledge put in a relational database (see Figure 1). Here the knowledge is expressed with objects that can be described with binary relations [Gal86]. The LAURE logic language (L3) is presented, which is used to perform queries and positive updates on the database (Section 4).
[Figure 1: Logic Languages for Databases. A logic language operates on a relational database; L3 (queries, updates) operates on an object-oriented database made of binary relations.]
The specific data model (Section 3) for object-oriented programming brings some issues that must be addressed by a proposition for object logic. Though the database may be seen as a set of binary relations, the access to information is based on set-at-a-time answers (the database delivers the set of objects bound to a given object by a binary relation). The use of a logic that requires a tuple-at-a-time access may lead to the so-called “impedance mismatch” problem. Another issue that is more important with an object-oriented model is object invention [HS89], which is needed because some deduced information must be represented with new objects (Section 2).
Many other propositions have been made in the field of logic for objects, such as LDM [Ku85], O-Logic [Mai86] or COL [AG87]. Here the emphasis is put on tractability, rather than expressiveness. Some operations, like negation, which would have led to languages with no obvious efficient implementation strategies, have not been incorporated. As it stands, the LAURE logic is a subset of revisited O-Logic [KW89], where negation is not allowed, and where object construction functions (for invention) are more carefully restricted. The advantage taken from these restrictions is the existence of an efficient strategy for top-down resolution. A more detailed comparison with other logics for objects is given in Section 8.
A relational algebra has been developed, whose expressive power is equivalent to the logic language L3 (Section 5). Like the LDM algebra [Ku85], this algebra is designed for efficient implementation and compilation of queries. It supports a formal operation, called differentiation, which allows positive updates to be performed efficiently by concentrating only on the part of a rule that changes when the update is performed. The idea is similar to magic sets [BMSU86] applied to the propagation of updates.
This is not a solution to the view update problem, since derived relations in the object model may be updated directly as any other relation. The goal of tractability is achieved by a set of algorithms, presented in Section 7, which describes propagation of updates and its use for bottom-up computation, and a top-down resolution algorithm. The bottom-up computation is done by applying differentiation to the “semi-naive” technique [BR86]. The resolution algorithm is a query/subquery approach [Vi86], improved by the use of differentiation to maintain recursive goals. Efficiency in the LAURE system is also obtained with a type system and some compilation techniques that are presented in Section 6. We also propose an efficient algorithm for constraint resolution, which uses an abstract interpretation of the relational algebra.
The LAURE system is the result of a five-year development effort, which started as a programming language designed for AI applications, and evolved into a deductive system. It is implemented in C, runs on Unix-based machines and is made of approximately 50 000 lines of code. The practical efficiency on a set of small problems is one order of magnitude higher than compiled Quintus Prolog. The LAURE system has been used in many projects, especially for simulation and expert systems.
2. INFORMAL DESCRIPTION OF THE LAURE SYSTEM
2.1 The LAURE Data Model
a. Objects
We use a set of objects (a finite set O) and a set of properties, which are represented by binary relations on O. Here an object is any member of the set O, and corresponds to what is usually called an object identifier (oid) [A&al89], which means that the object itself in a LAURE database has no dimension; it is a single node in a semantic network [Ca87]. The set of objects O is divided into many subsets, which are called classes, and which are placed in an inclusion lattice. In Figure 2, the set of objects contains four classes: PERSON, its two subsets MAN and WOMAN, and the INTEGER set. O is the union of these four sets. A property is a binary relation on O (i.e., a subset of O × O), which is either defined as mono-valued (for each object x, there is at most one object y such that P(x,y)) or not. We use the notation P# to represent the set of mono-valued properties. For instance, father, mother or age in Figure 2 are mono-valued properties. We write P(x) = y if P is mono-valued and if y is the unique object such that P(x,y). For instance, in Figure 2 we have:
age(Luke) = 70, father(Peter) = Luke.
When there is no such object y, we write P(x) = ⊥ where ⊥ is a distinguished value outside of O (which stands for unknown, as in [KW89]). For example:
age(Lucy) = ⊥, father(Mary) = ⊥, etc.
[Figure 2: Some Objects in a LAURE Database. Classes: PERSON, with subsets MAN (Luke, Paul, John, Peter) and WOMAN (Mary, Lucy), and INTEGER. Properties shown: father, mother, age, sons, daughters, with the following values.]
daughters: Lucy → {Mary}
mother: Mary → Lucy, Paul → Mary, John → Mary
sons: Mary → {Paul, John}, Peter → {Paul, John}, Luke → {Peter}
father: Peter → Luke, Paul → Peter, John → Peter
age: Peter → 40, Luke → 70, John → 10
When a property is not mono-valued, it is assumed that it has a class-oriented representation. For any binary relation R, the class of an object x according to R is the set {y | R(x,y)}. What is meant here is that a distinctive feature of object-oriented databases is to store the class of an object directly using some set mechanism. Efficiency in logic programming relies on using this feature instead of a more classical “tuple-at-a-time” approach. The notation P* is used for the multi-valued properties and we write P(x) = s, if s is the class of x according to P. For instance, in the example:
sons(Mary) = {John, Paul}, sons(Luke) = {Peter}, sons(Paul) = {} ...
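The class-oriented representation can be pictured with a minimal Python sketch over the Figure 2 data. The dictionary layout, the `UNKNOWN` marker standing in for ⊥, and the helper names `mono` and `multi` are assumptions made for illustration; they are not part of LAURE.

```python
# Hypothetical encoding of the Figure 2 data; the dict layout is an assumption.
UNKNOWN = None  # stands in for the distinguished value ⊥

# Mono-valued properties (P#) map an object to at most one object.
father = {"Peter": "Luke", "Paul": "Peter", "John": "Peter"}
age = {"Peter": 40, "Luke": 70, "John": 10}

# Multi-valued properties (P*) store the whole class {y | P(x,y)} of each object.
sons = {"Mary": {"Paul", "John"}, "Peter": {"Paul", "John"}, "Luke": {"Peter"}}


def mono(prop, x):
    """Return P(x) for a mono-valued property, or UNKNOWN when no y exists."""
    return prop.get(x, UNKNOWN)


def multi(prop, x):
    """Return the class of x according to P as one set (set-at-a-time access)."""
    return prop.get(x, set())
```

With this layout, `multi(sons, "Mary")` delivers the whole set in one lookup, which is the set-at-a-time access the text contrasts with tuple-at-a-time access.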
In this paper, the existence of an object set O is assumed, with some associated properties. The way such objects can be built [Ca87] is beyond the scope of this paper. We shall simply mention it in Section 3.3. Here the object system (O,P) is considered as a given representation of the database, and the focus is on the feasibility of logic programming with this database.
b. Relations as Variables
From the logic programming point of view, the objects of the database and their properties are constants. For instance, the father of a person, or the mass of a physical device are not supposed to vary, or to be deduced from other knowledge. On the other hand, it is necessary to represent information that may vary, according to logical rules. Therefore, other binary relations are added to the database that are called variable relations. We call R the set of such relations; in the example (see Figure 3) R = {PARENTS, KNOWS, FRIENDS, ANCESTORS, GRAND_FATHER}. Variable relations will be used to define rules; they correspond to derived relations in other systems. However, they may also be updated directly, as any other database relation. The set of variable relations is partitioned into R* and R#, as previously.
[Figure 3: Some Variable Relations. The relations parents, grand_father, ancestors, knows and friends are drawn over the classes PERSON, MAN and WOMAN.]
This distinction is similar to the one made in IQL [AK89], except that these relations are restricted to be binary relations, as are the properties (attributes in IQL). Mono-valued relations from R# are used to capture some disjunctive information. For each such relation r and each object x, if r(x) is unknown, a set of possible values can be given. We write r(x) = {y1, y2, ..., yn}, which means that r(x) = y1 ∨ r(x) = y2 ∨ ... ∨ r(x) = yn. By definition, r(x) = y is equivalent to r(x) = {y}. It can be said that we associate a multi-valued relation (of possible values) to each mono-valued relation.
c. Functions on Objects
Representing extensional knowledge with only binary relations and a subset organization is not a problem. Each n-ary relation is usually replaced by a class in the object-oriented system, and as many binary relations as fields in the original relation are created. A well-known example is the COURSE(topic,teacher,student,time) relation that is transformed into a set (COURSE) and four binary relations: topic, teacher, student and time. By doing so, each tuple of the COURSE relation is transformed into an object. This is illustrated in Figure 4.
[Figure 4: Object-Oriented vs. Relational Database. Relational database approach: the table COURSE(topic, teacher, student, time) with the tuples (Maths, MrWho, John, 10am) and (Maths, MrWho, Paul, 10am). Object-oriented database approach: a COURSE object c0 linked by the binary relations topic, teacher, student and time to objects of the classes TOPICS, PERSON and TIMES.]
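The transformation of an n-ary relation into a class plus binary relations can be sketched as follows. This is only an illustration in Python: the oid generator producing `c0`, `c1`, ... and the dictionary layout are assumptions, not the LAURE mechanism.

```python
from itertools import count

# Tuples of the relational COURSE(topic, teacher, student, time) relation.
course_tuples = [
    ("Maths", "MrWho", "John", "10am"),
    ("Maths", "MrWho", "Paul", "10am"),
]
fields = ("topic", "teacher", "student", "time")

COURSE = set()                     # the class of course objects
binary = {f: {} for f in fields}   # one mono-valued binary relation per field
fresh = count()                    # hypothetical oid generator: c0, c1, ...

# Each tuple becomes one object; each field becomes one binary fact about it.
for tup in course_tuples:
    oid = f"c{next(fresh)}"
    COURSE.add(oid)
    for field, value in zip(fields, tup):
        binary[field][oid] = value
```

After the loop, `binary["student"]["c0"]` is `"John"`, so the original tuple can be read back field by field from the object that replaced it.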
However, some aspects of knowledge cannot fit in the “purely” binary relation scheme. Arithmetic, for instance, uses functions that are ternary relations. Therefore, it is necessary to consider a set of external functions on O, to represent some intensional knowledge. If too general a set of functions is allowed, it will be impossible to solve equations or handle queries that use these functions. The first kind of function to consider is the composition operations, such as addition or multiplication, that are commutative, associative and may have an inverse function. We call Fo the set of such functions (operations) on O. Since a finite representation has been chosen, the possibility of an error (an overflow) must be taken into account. We write ƒ(x,y) = T if the computation of ƒ on the object pair {x,y} produces an error. If ƒ is not defined for {x,y}, we write ƒ(x,y) = ⊥ as previously. Therefore, a function from Fo is a function from O × O to O ∪ {T, ⊥}.
There is a second implicit drawback in the object-oriented transformation made in Figure 4; if a deductive rule in the relational model creates a new tuple, it means creating a new object in the object-oriented model. There is consequently a need to “create new objects” during the resolution process, which is often referred to as invention. The problem arises when the “objectification” of the tuples is not natural. Consider the following example (Figure 5):
[Figure 5: The MAP Example. A map showing the points A, B and C and the line (A B); the figure also carries the labels 0 0, 2 1 and 4 2.]
O = {set of points in a map}. We want to define the relation ALINE(x,y,z) = “x,y,z are on the same line.” A simple associated rule is:
ALINE(x,y,z), ALINE(y,z,t) ⇒ ALINE(x,y,t)
Applying the aforementioned method consists of creating a class of alignment objects, with three binary relations p1, p2 and p3, which correspond to the three fields (for an alignment X, p1(X) is the first point, p2(X) is the second, etc.). If a set of “alignment objects” is created, the rule should become:
(∃ a1, a2, [p2(a1) = p1(a2)] ∧ [p3(a1) = p2(a2)]) ⇒ ∃ a3, [p1(a3) = p1(a1)] ∧ [p2(a3) = p2(a1)] ∧ [p3(a3) = p3(a2)].
Invention is necessary, but is difficult to handle in backward resolution of logic programming (bottom-up evaluation is much easier). Our proposition is to limit invention by the use of parametrization: the invented object must be parameterized by one or two existing objects. Parametrization by one object can be modeled with a binary relation from P (a property). Parametrization by two objects cannot be represented in the object-oriented model. A set of parametrization functions Fp must be introduced, each of which defines a bijection between a subset of O × O and another subset of O. A bijection is necessary to get the two original objects, x and y, from the invented object ƒ(x,y). This is needed to get a tractable query language. In the previous example, a parametrization function LINE could be used that associates an object representing the line (A B) to any pair {A,B} of distinct points in the map. The fact that another point belongs to an abstract line can be represented by a binary relation in. The predicate ALINE(x,y,z) is replaced by in(z,LINE(x,y)). Here is a possible representation:
O = (P = {set of points} ∪ L = {set of all possible lines})
LINE is a bijection from (P × P - {(x,x), x ∈ P}) to L.
in is a binary relation on P × L.
p1 and p2 are two “inverse” functions from L to P, such that ∀ x,y ∈ P, (x,y) = (p1(LINE(x,y)), p2(LINE(x,y))).
The previous rule becomes: ∃ z, [in(z,l) ∧ in(t,LINE(p2(l),z))] ⇒ in(t,l).
There are many other uses of parametrization functions to escape from the limitation of a binary-oriented model. If some information is to be added to each pair (x,y) of a relation R to describe the status of the information “R(x,y)”, it is difficult to create an object to represent each item, for the same reasons.
For instance, if LOVE is a relation on PERSON × PERSON, such that a probability value is associated to each pair of persons {a,b}, and some reasoning about these figures is needed, it is necessary to “invent” new objects to represent new information. A better solution is therefore to use a parameterized function that associates to each pair of persons an abstract object LOVE(a,b), representing the statement “a LOVES b”, together with some binary relations (such as a probability value) that describe the “value” of the statement. This distinction between the abstract tuple and its “value” is a powerful idea (for instance see BOOJUM [Gon88] or ENCORE [HZ87]).
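A parametrization function can be pictured as a memoized bijection that remembers how to invert itself. The Python class below is a hypothetical sketch, not the LAURE implementation: the class name, the string used as the invented object, and `None` standing in for ⊥ are all assumptions.

```python
class Parametrization:
    """Hypothetical helper: a bijection from pairs of objects to invented objects."""

    def __init__(self, name):
        self.name = name
        self._forward = {}   # (x, y) -> invented object
        self._backward = {}  # invented object -> (x, y)

    def __call__(self, x, y):
        if x == y:
            return None  # stands in for ⊥: undefined on pairs (x, x)
        if (x, y) not in self._forward:
            obj = f"{self.name}({x},{y})"  # invented object, created once per pair
            self._forward[(x, y)] = obj
            self._backward[obj] = (x, y)
        return self._forward[(x, y)]

    def p1(self, obj):
        """First parameter of an invented object."""
        return self._backward[obj][0]

    def p2(self, obj):
        """Second parameter of an invented object."""
        return self._backward[obj][1]


LINE = Parametrization("LINE")
line_ab = LINE("A", "B")
```

Because the mapping is a bijection on ordered pairs, `LINE("A", "B")` always returns the same object and `(LINE.p1(line_ab), LINE.p2(line_ab))` recovers `("A", "B")`, which is the property the text requires of p1 and p2.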
2.2 Logic Programming in LAURE
a. Logical Expressions
Using the previous relations and functions, designated objects from a subset N of O can be combined into object expressions. A designated object is an object that can be referred to in the query language (e.g., integers and named objects are designated). As a convention, brackets are used in an object-oriented style for functions and prefix functional notation with parentheses for relations (properties or variable relations). For instance, here are some object expressions and their associated values (an object from O):
John → John
age(Luke) → 70
[1 + 2] → 3
[4 - 5] → -1   {[a - b] = [a + -b]}
[A Line B] → (A B)
If a set of variables V is introduced, a set of object expressions with variables can be deduced, to which a value for each variable assignment from V to O can be associated. Assertions are obtained by comparing two expressions with a binary relation. Here are some simple examples:
father(John Luke)
ancestor(father(John) Luke)
divide([4 * [x + 2]] [y - 1])
However, there are some useful binary order relations that do not fit in the model. For instance, the < relation on the set of integers cannot be represented by its class, since {y | y < x} is potentially infinite. Moreover, there are some other useful orders to use, but the only operation that may be performed efficiently is to test if a given pair belongs to the relation. Therefore, a last kind of object functions has to be introduced. We call Fc a set of functions from O × O to {true, false}, which represents binary relations (i.e., (1,2) ∈ θ iff θ(1,2) = true). For instance =, <, >, ≠, ≥ and ≤ are some usual functions in Fc that can be used as follows:
[father(John) = Luke]
[age(X) > age(Y)]
[[age(Paul) + 1] ≠ [X - 3]]
b. Rules and Constraints
Existential quantification is introduced, as well as conjunction and disjunction to obtain a first-order logic language. All variables that are not explicitly existentially quantified are assumed to be universally quantified. The notion of clause may now be derived from the assertion language. A clause is a formula of the form “r(X,Y) :- a(X,Y)”, where a is an assertion from the logic language that represents a condition on X and Y, where X and Y are the two free variables of a and where r is a variable relation of R. The intuitive meaning of the clause is that if any pair of objects satisfies the condition, it should be added to the relation r. The logic language is an extension of binary DATALOG with object functions [GS87]. Here are some classical examples of clauses:
;; some family stuff
parents(X,Y) :- [father(X,Y) ∨ mother(X,Y)]
ancestor(X,Y) :- parents(X,Y)
ancestor(X,Y) :- [∃ Z parents(X,Z) ∧ ancestor(Z,Y)]
;; the Fibonacci sequence
fib(X,Y) :- [[X > 1] ∧ [Y = [fib([X - 1]) + fib([X - 2])]]]
;; Line example, with the previous notations
in(L1,Y) :- [∃ X in([p2(L1) LINE X],Y) ∧ in(L1,X)]
Because the variable relations are all binary, assertions will only be used with at most two free variables (universally quantified). This is an interesting property, because it simplifies the general problem of resolution, as will be seen later. Another way to look at a clause is to say that “a(x,y) ⇒ r(x,y)” is satisfied if, and only if, each pair of objects that satisfies the condition is in the value of the relation r in the database.
Rules are explicit deduction formulae. To manage the disjunctive information about mono-valued relations, we need implicit deduction formulae, called constraints. A relational constraint is a formula with the converse form of a clause: “r(x,y) ⇒ a(x,y)”, where r is a relation of R# and a is a logic assertion. A relational constraint is satisfied if all the object pairs (x,y) in the relation r satisfy the condition represented by a. A constraint does not tell explicitly how to deduce a value (it is not a sufficient condition), it tells if a value is allowed to be chosen (a necessary condition). Usually relational constraints like:
pressure(x,p) ⇒ [[p * volume(x)] = [n(x) * [R * temperature(x)]]]
are derived from object constraints like:
[[pressure(x) * volume(x)] = [n(x) * [R * temperature(x)]]]
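To illustrate how a constraint acts as a necessary condition rather than a deduction rule, the Python sketch below filters a set of possible pressure values with the relational constraint on pressure. The numeric value of R, the tolerance, and the function names are assumptions introduced for the example; the paper leaves R symbolic and does not describe constraint checking this way.

```python
R = 8.314  # assumed value of the constant R; the paper leaves it symbolic


def allowed(p, volume, n, temperature, tol=1e-9):
    """Test pressure(x,p) ⇒ [p * volume(x)] = [n(x) * [R * temperature(x)]]."""
    lhs = p * volume
    rhs = n * (R * temperature)
    # Compare with a relative tolerance because the values are floats.
    return abs(lhs - rhs) <= tol * max(1.0, abs(lhs))


def filter_possible(possible, volume, n, temperature):
    """Keep only the candidate pressures that the constraint allows."""
    return {p for p in possible if allowed(p, volume, n, temperature)}


candidates = {1000.0, 1247.1, 1500.0}
remaining = filter_possible(candidates, volume=2.0, n=1.0, temperature=300.0)
```

The constraint never produces a pressure by itself; it only removes candidates from the set of possible values, which matches the reading of a constraint as a condition on which values may be chosen.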
Object constraints are logic assertions associated with objects of a given set. A certain number of subterms called goals of the form r(x) in an object constraint a(x), where r is a relation of R#, are identified, and associated relational constraints are generated for each such relation r.
2.3 Operations on the Deductive Database
a. Deterministic Resolution
The purpose of a resolution system may now be defined informally. Starting from the initial database value, represented by the value of the relation variables from R, each clause allows new facts to be “deduced”. Each fact R(a,b) that is deduced from other known or deduced facts is entailed by the deductive database. Because the object set O is finite, there is a limit to this process, which is a database value such that each rule is satisfied. This database value is the goal of resolution (Figure 6), and is also defined as the smallest database value that satisfies all the clauses and contains the initial database value.
[Figure 6: Computing a Minimal Solution. Resolution maps the initial database value D to the unique solution S(D), made of the initial and deduced facts.]
Example: Here are the relation values that are obtained by resolution, with the “family” set of rules:
[parents(John) solve] → {Peter, Mary}
[ancestor(John) solve] → {Peter, Mary, Luke, Lucy}
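The minimal solution for the “family” clauses can be reproduced with a naive bottom-up iteration over the Figure 2 data. This Python sketch only illustrates the least-fixpoint idea; it is not one of the LAURE algorithms, and the function names are assumptions.

```python
father = {"Peter": "Luke", "Paul": "Peter", "John": "Peter"}
mother = {"Mary": "Lucy", "Paul": "Mary", "John": "Mary"}


def solve():
    """Naive bottom-up iteration: apply every clause until nothing new is deduced."""
    parents, ancestor = set(), set()
    changed = True
    while changed:
        # parents(X,Y) :- [father(X,Y) ∨ mother(X,Y)]
        new_parents = set(father.items()) | set(mother.items())
        # ancestor(X,Y) :- parents(X,Y)
        # ancestor(X,Y) :- [∃ Z parents(X,Z) ∧ ancestor(Z,Y)]
        new_ancestor = set(parents) | {
            (x, y) for (x, z) in parents for (z2, y) in ancestor if z == z2
        }
        changed = not (new_parents <= parents and new_ancestor <= ancestor)
        parents |= new_parents
        ancestor |= new_ancestor
    return parents, ancestor


def image(relation, x):
    """Set of objects bound to x by the relation."""
    return {y for (a, y) in relation if a == x}


parents, ancestor = solve()
```

Here `image(parents, "John")` and `image(ancestor, "John")` give the two sets of the example. The loop terminates because the object set is finite and the relations only grow.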
This deterministic computation may be performed in two ways, which are usually called top-down and bottom-up, or backward and forward chaining. In forward chaining, resolution starts from the database value and computes the unique solution as a complete database value. On the other hand, backward resolution is directed by a query (the goal of resolution), and only computes the part of the solution necessary to provide a correct answer to the query. The advantage of forward chaining is to be able to give efficient answers to as many queries as wished once the solution is computed. On the other hand, the resolution is more expensive than with backward chaining. Our experience is that the mode of resolution depends on the clauses and the kind of knowledge they manipulate. Clauses that apply to knowledge that varies are better computed with backward chaining, which allows us to modify the initial database value at no cost. Clauses that apply to knowledge that is stable (whatever is put in the database will remain true) are usually better computed with forward chaining. Therefore, LAURE makes a distinction between two kinds of clauses: the rules and the axioms. Rules must be computed in backward chaining, whereas axioms must be used in forward chaining (which implies that they are always satisfied). The model must support the two kinds of resolution.
Two difficulties exist: maintaining a solution when the database varies monotonically, and computing recursive rules in backward chaining with cyclic data. Even though it was assumed that knowledge used by axioms is stable, new facts may be added to the database. Instead of re-computing the solution from the initial database value, it is necessary to compute it directly from its previous value, which is called propagation (Figure 7). The second difficulty is that rules may be recursive (as in the transitive closure example) and apply to recursive data.
[Figure 7: Propagation. Resolution maps D to its unique solution S(D); after the update D' = D + R(a,b), propagation computes the unique solution S(D') from S(D) instead of running resolution again on D'.]
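Propagation can be illustrated on the ancestor clauses alone. The Python sketch below uses a hand-derived incremental rule for a transitive closure; it is not the differentiation technique of the paper, and the function name and the object Adam are made up for the example. It assumes that `ancestor` already holds the solution for the current `parents` relation.

```python
def propagate(parents, ancestor, a, b):
    """Add the fact parents(a, b) and extend ancestor in place, without recomputing it."""
    parents.add((a, b))
    # Every new ancestor pair must go through the new fact: x ... a -> b ... y.
    below = {a} | {x for (x, y) in ancestor if y == a}  # a and its descendants
    above = {b} | {y for (x, y) in ancestor if x == b}  # b and its ancestors
    delta = {(x, y) for x in below for y in above} - ancestor
    ancestor |= delta
    return delta


# Solution S(D) for a two-fact database, then the update D' = D + parents(Luke, Adam).
parents = {("John", "Peter"), ("Peter", "Luke")}
ancestor = {("John", "Peter"), ("Peter", "Luke"), ("John", "Luke")}
delta = propagate(parents, ancestor, "Luke", "Adam")  # "Adam" is a made-up object
```

Only the pairs in `delta` are computed, so the cost depends on the part of the solution touched by the update rather than on the size of the database.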
A last important issue is the compilation of rules [Ca89b]. To perform a good compilation, a powerful type system is needed. The role of a type system is mostly to predict the type of the value of an expression, so that optimization techniques may be applied at compile time. The more powerful the type system is, the better [A&al89]. Section 6 provides such a complete type system, which is taken from the LAURE language.
b. Non-Deterministic Operations
Constraint resolution is a non-deterministic operation. A solution to a set of constraints is a database value that contains the initial value and satisfies each constraint. The problem is that for a given set of constraints there may be no solutions or many non-comparable solutions. Because the goal of resolution is more complex, more strategies may apply. We might look for one possible solution (non-deterministic choice) or for an optimal solution according to some external criteria. We may also want to examine all the solutions. Finally, we may want to perform constraint resolution restricted by some additional information (hypothetical reasoning). This implies that we can make a copy of the database, add some information and later return to the original state. LAURE permits these different operations, either for a simple goal (e.g., pressure(x)) or for a set of not necessarily related goals.
[Figure 8: Constraint Resolution. From the initial database value D, possibly extended with some hint (D + some hint), resolution may lead to several solutions: solution 1, solution 2, ..., solution n.]
3. THE LAURE DATA MODEL
3.1 Object Models and Object Systems
In this section we describe the data model upon which LAURE is based. The first notion that we present here is the object system and its associated model. The LAURE implementation (Section 3.3) provides such an object system. The data set that we use is made of a finite set of objects O and two distinguished values ⊥ (unknown) and T (error). Using a finite set is justified by the closeness with the actual implementation, where the set of objects is necessarily finite. The logical consequence is the introduction of the “error” value, which we use for instance to represent overflow in our finite representation of integers:
MAXINT + 1 = T, where MAXINT is the largest known integer in the system.
Each object is described with its properties from P, which are binary relations among objects. The set of properties is divided into two subsets P* and P#, which respectively contain multi-valued and mono-valued properties¹. The object set O is divided into many subsets called classes. Classes are reified [KL89], which means that the set of classes C is included in O. For each member c of C, the set of objects that belongs to the set represented by c is written s(c). As a part of reification, the hierarchy among classes is represented by a distinguished property subset ∈ P*. The property subset is expected to represent an inclusion lattice. For each object o, the set of classes c such that o ∈ s(c) has a minimal element, which will be represented by the mono-valued property owner ∈ P#. Non-binary relations among objects are represented by object functions from a set F. Object functions are classified according to their mathematical properties. F contains three distinguished subsets: Fo (for object operation), Fc (for comparison) and Fp (for parametrization). We limit functions to these three categories in order to perform equation solving (Section 5.2).
¹ A multi-valued relation is any binary relation. A mono-valued relation R is a functional relation, where there is at most one object y for each object x such that R(x,y).
Definition: An object model is a triple M = (O,P,F) where:
- O is a finite set of objects.
- P is a finite set of object properties. P is the disjoint union of P* and P#.
- F is a finite set of object functions. F is the disjoint union of Fo, Fc and Fp.
Definition: An object system of the model M is a function I such that²:
- ∀ p ∈ P#, I(p) ∈ (O → (O ∪ {⊥,T}))
- ∀ p ∈ P*, I(p) ∈ (O → Powerset(O))
- ∀ ƒ ∈ Fo, I(ƒ) is a commutative and associative operation from O × O to O ∪ {⊥,T}
- ∀ θ ∈ Fc, I(θ) ∈ ((O × O) → {true,false,T})
- ∀ ƒ ∈ Fp, I(ƒ) ∈ ((A × B) → C), where A, B, C are three subsets of O and I(ƒ) is bijective. By extension, if (a,b) ∉ A × B, I(ƒ)(a,b) = ⊥.
- I(subset) is a partial order relation, which induces a lattice structure.
- ∀ c1, c2 ∈ C, c1 ∈ I(subset)(c2) ⇔ s(c1) ⊆ s(c2)
- ∀ c ∈ C, o ∈ O, o ∈ s(c) ⇒ c ∈ I(subset)(I(owner)(o)).
The restriction on I(subset) is a key point for a clean resolution of multiple inheritance conflicts or for the completeness of the type system (Section 6). For any function ƒ, we call domain(ƒ) the set {x, ƒ(x) ≠ ⊥}³. This model represents each binary relation as a function (set-valued for multi-valued relations) instead of a traditional set of ordered pairs to emphasize the actual implementation of an object-oriented system, as explained in Section 3.3.
3.2 Logic Relations
On top of the object system we define a deductive database, made of extensional binary relations and deductive rules. Throughout this paper, the object system O is supposed to be given and fixed. The object system can be thought of as the universe of our logic world.
² In this document, the set of functions from a set A to a set B will be represented by (A → B).
³ I(p) and I(f) are extended for p ∈ P# and f ∈ F, to apply them to ⊥ and T:
I(p)(T) = T, I(p)(⊥) = ⊥.
∀ x ∈ O ∪ {⊥}, I(f)(x,T) = I(f)(T,x) = I(f)(T,T) = T.
If θ ∈ Fc, ∀ x ∈ O, I(θ)(x,⊥) = I(θ)(⊥,x) = I(θ)(⊥,⊥) = false.
If ƒ ∈ Fp ∪ Fo, ∀ x ∈ O, I(ƒ)(x,⊥) = I(ƒ)(⊥,x) = I(ƒ)(⊥,⊥) = ⊥.
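The extension of object functions to ⊥ and T, together with the overflow rule MAXINT + 1 = T, can be sketched in Python as follows. The string markers, the value chosen for MAXINT, and the `lift_*` helper names are assumptions made for the illustration; the paper does not give the value of MAXINT.

```python
import operator

BOTTOM = "⊥"        # unknown
TOP = "T"           # error
MAXINT = 2**31 - 1  # assumed bound; the paper does not give the value of MAXINT


def lift_operation(op):
    """Extend an operation of Fo to O ∪ {⊥, T}, following the footnote on ⊥ and T."""
    def lifted(x, y):
        if x == TOP or y == TOP:
            return TOP      # an error argument always gives an error
        if x == BOTTOM or y == BOTTOM:
            return BOTTOM   # an unknown argument gives an unknown result
        result = op(x, y)
        return TOP if abs(result) > MAXINT else result  # overflow is an error
    return lifted


def lift_comparison(op):
    """Extend a comparison of Fc: an unknown argument makes the test false."""
    def lifted(x, y):
        if x == TOP or y == TOP:
            return TOP
        if x == BOTTOM or y == BOTTOM:
            return False
        return op(x, y)
    return lifted


add = lift_operation(operator.add)
less = lift_comparison(operator.lt)
```

With these definitions `add(MAXINT, 1)` yields the error marker, while `add(BOTTOM, 1)` stays unknown and `less(1, BOTTOM)` is false.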
Definition: A database scheme is a pair (S, R), where S is an object system and R a finite set of variables {R1, ..., Rn}.
The set of relation variables R is partitioned into R# and R*, as previously. We use two levels of information representation, called database instances and database values. Each variable from R* represents a binary relation, denoted by its class function. A variable from R# represents a mono-valued relation. If the value of the relation for an object is not known, we want to represent a set of possible values instead. Therefore, we define a database instance of a database model as an assignment from R to (O → Powerset(O)).
Definition: A database instance is a function d of D = (R → (O → Powerset(O))).
This representation captures both multi-valued relations among objects (∀ x ∈ O, if Ri ∈ R*, d(Ri)(x) represents the class of x according to the relation denoted by Ri) and mono-valued relations (if Ri ∈ R#, d(Ri)(x) represents the set of possible values for Ri(x)). Representing the mono-valued relation “like a multi-valued one” has many advantages as will be shown later. It captures both the knowledge of disjunctive information and the case of no possible values (Ri(x) = ∅). There is an information containment order:
∀ d1, d2 ∈ D, d1 < d2 ⇔ (∀ Ri ∈ R*, ∀ x ∈ O, d1(Ri)(x) ⊆ d2(Ri)(x)) ∧ (∀ Ri ∈ R#, ∀ x ∈ O, d2(Ri)(x) ⊆ d1(Ri)(x)).
The intuitive meaning of this order is that if d2 > d1, then d2 contains the knowledge in d1 plus some additional information. The inclusion is reversed for a mono-valued relation since more knowledge about possible value sets actually means a smaller set. This order induces a lattice structure on D. The greatest lower bound operation is defined by:
∀ Ri ∈ R*, ∀ x ∈ O, glb(d1,d2)(Ri)(x) = d2(Ri)(x) ∩ d1(Ri)(x)
∀ Ri ∈ R#, ∀ x ∈ O, glb(d1,d2)(Ri)(x) = d2(Ri)(x) ∪ d1(Ri)(x).
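The containment order and the greatest lower bound can be checked on a tiny instance. In the Python sketch below, a database instance is a nested dictionary from relation names to objects to sets; the relation names, the sample data and the helper names `leq` and `glb` are assumptions made for the illustration, and the order is read as non-strict.

```python
MULTI = {"sons"}  # relations of R* (hypothetical example)
MONO = {"age"}    # relations of R# (hypothetical example)


def leq(d1, d2, objects):
    """d1 ≤ d2: d2 holds at least the knowledge of d1."""
    return all(d1[r][x] <= d2[r][x] for r in MULTI for x in objects) and all(
        d2[r][x] <= d1[r][x] for r in MONO for x in objects  # reversed inclusion
    )


def glb(d1, d2, objects):
    """Greatest lower bound: intersect classes, merge possible-value sets."""
    out = {}
    for r in MULTI:
        out[r] = {x: d1[r][x] & d2[r][x] for x in objects}
    for r in MONO:
        out[r] = {x: d1[r][x] | d2[r][x] for x in objects}
    return out


objects = ["Mary"]
d1 = {"sons": {"Mary": {"Paul"}}, "age": {"Mary": {30, 31, 32}}}
d2 = {"sons": {"Mary": {"Paul", "John"}}, "age": {"Mary": {31}}}
d3 = {"sons": {"Mary": {"John"}}, "age": {"Mary": {30}}}
g = glb(d2, d3, objects)
```

Here d2 knows one more son than d1 and has narrowed the possible ages, so d1 ≤ d2 holds; d2 and d3 are not comparable, and their greatest lower bound keeps only the knowledge they share.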
A database instance d usually contains some incomplete information through the value of relations from R#. A complete database instance d is such that |d(Ri)(x)| ≤ 1 for each Ri in R#. A complete database instance contains no disjunctive information. To each database instance d, we associate a database value q, which represents the certain information in the database. For each multi-valued relation Ri, q(Ri) = d(Ri) since there is no disjunctive information about such relations. For a mono-valued relation Ri, and for any object x, q(Ri) is the mono-valued relation that associates T to x if there are no possible values, y to x if y is the unique possible value, and nothing to x if there are many possible values:
- if d(Ri)(x) = ∅ then q(Ri)(x) = T, which means that Ri(x) cannot be defined,
- if d(Ri)(x) = {y} then q(Ri)(x) = y,
- else q(Ri)(x) = ⊥, which means that Ri(x) is unknown.
Definition: A database value is a function q of Q = (R → (O → Powerset(O) ∪ O))⁴.
The database value is the projection of the database instance in the “certain” world. The semantics of a deterministic program is defined with respect to the certain database value, so deterministic resolution will be performed on values. The same order and lattice structure can be deduced on Q. We also say that a database value is complete if q(Ri)(x) ≠ ⊥ for all Ri and x. From now on, we shall represent a database with its database function q; the database instance representation will be used in Section 7 to describe constraint resolution.
3.3 Implementation
This data model is very similar to those in [Mai86], [KW89], except for the object functions, which are strongly restricted here. Objects are simple nodes in a semantic net. Precisely, they are what are usually called oids (object identifiers). The actual implementation of the extensional component of the object system is a fast semantic network, using a record structure associated with each node to store the classes according to the various properties. A list system has been implemented to represent the powerset of O. Therefore, the class of each object according to each multi-valued property is represented by a list. The two statistically important operations on such lists are membership and iteration, which suggests a contiguous memory representation, as opposed to a cell representation such as that of a LISP system. Hashing techniques for membership are not used since they would lower the efficiency of iteration. LAURE uses a record structure to store objects, with as many slots as there are extensional properties for the object. Lists are represented by contiguous chunks of memory (of size 2^i for fast allocation/reuse). The principal feature of this implementation is to guarantee access to the class of a given object (oid) in one machine instruction. Building the actual object system from the semantic network capability is done through the reflection of the detailed LAURE model [Ca88]. Each aspect of the model (class, methods, attributes, instructions of the programming language ...) is represented with objects of the system, which are nodes of this semantic network. The properties owner and subset are derived from the object/class declaration (subset is the transitive closure of the class/subclass relationship and owner is given by the instantiation scheme). A complex algorithm [Ca87] is used to transform the class hierarchy into an equivalent class lattice, so as to obtain the intersection closure property, illustrated in Section 6. The associated programming language is a very classical object-oriented language, with a message notion, which can be compiled easily into low-level code. The whole LAURE system is described in the LAURE language, so as to improve portability and maintainability. The type system described in Section 6.2 is used to provide efficient compilation of this self-description.
⁴ q(Ri) is a function representing a binary relation, and will be used as such throughout this paper.
The extensional relational database is built on top of the object system (see Figure 9). Because LAURE is a reflective object system [Ca88], the integration is straightforward. However, another commercial object system could be substituted. Each binary relation is stored through a hash table, which associates a unit of information called a bucket to each object in the graph of the relation. The LAURE system actually implements a stack of ordered database instances d1, d2, ... di. Instead of representing only one database instance di, LAURE is able to memorize some intermediate levels. The monotonicity of this sequence is taken into account so that memory allocation can be shared. Creating or deleting a level of database instance is a very fast operation, due to a complex sharing of memory structures. Each level in LAURE is called a world.
Figure 9: Implementation of the “Relational” Database (figure labels: object system; relations R1 ... Rn, e.g. FRIEND, each with a hash-code table and if_read/if_written demons; worlds d1, d2, ..., di-1, di)
The bucket contains the class of the object according to the relation if it belongs to R* or the set of possible values if it belongs to R# . It also contains status and history information. The status of a bucket is either unseen, seeing or seen, and will be used by the resolution algorithm presented in Section 7. Each bucket also knows the size of the list representing q(R i )(x), for the concerned relation R i and the object x. Two different buckets representing the same information in two different database instances may thus share the same physical list structure.
The history mechanism consists of storing all the buckets in the same stack, where a world is defined as a piece of this stack. Each bucket knows its predecessor (another bucket deeper in the stack that represents the same value q(Ri)(x)). Creating a new world is done by moving a pointer that defines the current top of the stack; adding new information will create a new bucket in this new part of the stack. When LAURE returns to a previous state, each bucket in the newer part replaces itself by its predecessor in the hash table through the use of a direct index. Each relation of R is represented as an object from O. In addition to the previous hash-code table, each relation has a certain number of properties such as its domain or its range (which are LAURE sets). Multi-valued relations have two properties, if_read and if_written, which contain demons. An if_read demon is a small procedural function ƒ(x), which is called each time the value of r(x) is asked for (the class of x according to r). An if_written demon ƒ(x,y) is called each time an object y is added to the class of x according to r. Such procedural attachment is used as a target for rule compiling, because of its efficiency (Section 6.3). Mono-valued relations have two additional properties, if_needed and if_change, which also contain demons and are used for constraint resolution.
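The world mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and method names (Bucket, WorldStack, push_world, pop_world) are hypothetical, a Python dictionary stands in for the hash-code table, and values are copied into tuples, so the sharing of physical list structures between worlds is not modeled.

```python
class Bucket:
    """One unit of information for a (relation, object) pair."""

    def __init__(self, key, values, predecessor):
        self.key = key                  # (relation name, object)
        self.values = values            # class of the object, as a tuple
        self.predecessor = predecessor  # older bucket for the same key, or None


class WorldStack:
    """Hypothetical sketch of a stack of worlds sharing one bucket stack."""

    def __init__(self):
        self.table = {}       # stands in for the hash-code table
        self.stack = []       # every bucket, oldest first
        self.marks = []       # stack height at the start of each world
        self.if_written = {}  # relation name -> list of demons f(x, y)

    def push_world(self):
        # Creating a world only records the current top of the stack.
        self.marks.append(len(self.stack))

    def add(self, relation, x, y):
        key = (relation, x)
        old = self.table.get(key)
        if old is not None and y in old.values:
            return
        values = (old.values if old else ()) + (y,)
        bucket = Bucket(key, values, old)
        self.stack.append(bucket)
        self.table[key] = bucket
        for demon in self.if_written.get(relation, []):
            demon(x, y)

    def get(self, relation, x):
        bucket = self.table.get((relation, x))
        return list(bucket.values) if bucket else []

    def pop_world(self):
        # Each bucket of the newest world is replaced by its predecessor.
        mark = self.marks.pop()
        while len(self.stack) > mark:
            bucket = self.stack.pop()
            if bucket.predecessor is None:
                del self.table[bucket.key]
            else:
                self.table[bucket.key] = bucket.predecessor


ws = WorldStack()
written = []
ws.if_written["FRIEND"] = [lambda x, y: written.append((x, y))]
ws.add("FRIEND", "Peter", "Paul")
ws.push_world()
ws.add("FRIEND", "Peter", "Mary")
in_new_world = ws.get("FRIEND", "Peter")
ws.pop_world()
after_pop = ws.get("FRIEND", "Peter")
```

Popping a world costs time proportional to the number of buckets created in that world, which is the property the stack layout is meant to provide.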
4. THE LAURE LOGIC LANGUAGE
4.1 A Language for Expressing Conditions on Objects
The logic language associated with the data model (O,P,R,F) has the following alphabet: O, P, R, F, a set of variables V (it is convenient to suppose that R ⊂ V) and some usual symbols {(, ), [, ], ∧, ∨, ∃}. The Laure Logic Language (L3) is an extension of binary Datalog to object functions. Binary Datalog is the restriction of Datalog to binary predicates, which is adequate for the object model. We extend it by allowing the use of object functions from Fo ∪ Fc ∪ Fp. An assertion with free variables in the set F is described by the following grammar⁵:
:: ( [ ] | [ ∧ ] | [ ∨ ] | [∃ *] ) |
:: F | O | () | [ ] | [ ]
:: R* | P*
:: R# | P# | Fp1
⁵ This syntax is made similar to the LAURE programming language to facilitate integration.
This language is implemented as an extension of the LAURE programming language. This is easy since the LAURE reader is extensible and L3 follows the common language pattern. In the LAURE system the number of possible free variables in an assertion is limited to two. This is sufficient for the needs of our resolution system, as we shall see in the next sections.
The semantics of this language is defined in a classical manner. The only relevant part of the database instance is its “certain” component, represented by the database value q. Let If be the set of functions from V to O. For each interpretation f ∈ If and each database value q ∈ Q, [ ]q,f, which is a function from expressions to O ∪ {⊥,T} and from assertions to {true, false, T}, is defined by (e, e1, e2 range over expressions and a, a1, a2 over assertions):
∀ o ∈ O, [o]q,f = o;
∀ x ∈ V, [x]q,f = f(x);
∀ p ∈ P# ∪ Fp1, ∀ e, [p(e)]q,f = I(p)([e]q,f)
∀ ƒ ∈ Fo ∪ Fp2, ∀ e1,e2, [[e1 ƒ e2]]q,f = I(ƒ)([e1]q,f, [e2]q,f)
∀ p ∈ P*, ∀ e1,e2, [p(e1 e2)]q,f = ([e1]q,f, [e2]q,f) ∈ I(p)
∀ Ri ∈ R#, ∀ e, [Ri(e)]q,f = q(Ri)([e]q,f)
∀ Ri ∈ R*, ∀ e1,e2, [Ri(e1 e2)]q,f = ([e1]q,f, [e2]q,f) ∈ q(Ri)
∀ θ ∈ Fc, ∀ e1,e2, [[e1 θ e2]]q,f = I(θ)([e1]q,f, [e2]q,f)
∀ a1,a2, [[a1 ∧ a2]]q,f = [a1]q,f ∧ [a2]q,f
∀ a1,a2, [[a1 ∨ a2]]q,f = [a1]q,f ∨ [a2]q,f
∀ z ∈ V, ∀ a, [[∃ z a]]q,f = true iff there exists another interpretation f', which differs from f only on {z}, such that [a]q,f' = true.
A query is simply defined as an object-assertion with at most two free variables. It is said that an object o satisfies a one-free-variable assertion a(x) if [a]q,f = true for all f such that f(x) = o. It is said that a pair of objects (o1,o2) satisfies a two-free-variable assertion a(x,y) if [a]q,f = true for all f such that f(x) = o1 and f(y) = o2. The answer to the query a(F)? is written [a]q:
- if F = ∅, [a]q = [a]q,f, since it is independent of f;
- if F = {x}: if ∃ o ∈ O, ∃ f such that (f(x) = o) ∧ ([a]q,f = T), then [a]q = T; otherwise, [a]q = {o | o satisfies a};
- if F = {x,y}: if ∃ o1,o2 ∈ O, ∃ f such that (f(x) = o1) ∧ (f(y) = o2) ∧ ([a]q,f = T), then [a]q = T; otherwise, [a]q is the binary relation {(o1,o2) | (o1,o2) satisfies a}.
Examples: With the previous database value:
- [father(Paul) = Peter] → [[father(Paul) = Peter]]q = true
- [[father(x) = Peter] ∨ [mother(x) = Mary]] → {Paul, John}
- [∃ z [father(x z) ∧ father(z y)]] → {(Paul,Luke), (John,Luke)}
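The three example queries can be replayed with a small Python evaluator. This is a sketch under several assumptions: the database value Q below is invented so that it is compatible with the answers listed above, assertions are encoded as nested tuples (a hypothetical encoding, not the LAURE syntax), an unknown value makes an atomic assertion false, and the error value T is not modeled.

```python
OBJECTS = ["Paul", "John", "Peter", "Mary", "Luke"]

# Assumed database value: mono-valued relations stored as dictionaries.
Q = {
    "father": {"Paul": "Peter", "John": "Peter", "Peter": "Luke"},
    "mother": {"Paul": "Mary"},
}


def value(e, f):
    """Value of an expression: ('obj', o), ('var', x) or ('app', r, e)."""
    kind = e[0]
    if kind == "obj":
        return e[1]
    if kind == "var":
        return f[e[1]]
    if kind == "app":
        return Q[e[1]].get(value(e[2], f))  # None stands for "unknown"
    raise ValueError(kind)


def holds(a, f):
    """Truth value of an assertion under the interpretation f."""
    kind = a[0]
    if kind == "eq":                       # [e1 = e2]
        left, right = value(a[1], f), value(a[2], f)
        return left is not None and left == right
    if kind == "rel":                      # r(e1 e2)
        left, right = value(a[2], f), value(a[3], f)
        return right is not None and Q[a[1]].get(left) == right
    if kind == "and":
        return holds(a[1], f) and holds(a[2], f)
    if kind == "or":
        return holds(a[1], f) or holds(a[2], f)
    if kind == "exists":                   # [∃ z a]
        return any(holds(a[2], {**f, a[1]: o}) for o in OBJECTS)
    raise ValueError(kind)


def answer(a, free=()):
    """Answer to the query a(F)? for zero, one or two free variables."""
    if not free:
        return holds(a, {})
    if len(free) == 1:
        return {o for o in OBJECTS if holds(a, {free[0]: o})}
    x, y = free
    return {(o1, o2) for o1 in OBJECTS for o2 in OBJECTS
            if holds(a, {x: o1, y: o2})}


X, Y, Z = ("var", "x"), ("var", "y"), ("var", "z")
first = answer(("eq", ("app", "father", ("obj", "Paul")), ("obj", "Peter")))
second = answer(("or", ("eq", ("app", "father", X), ("obj", "Peter")),
                 ("eq", ("app", "mother", X), ("obj", "Mary"))), ("x",))
third = answer(("exists", "z", ("and", ("rel", "father", X, Z),
                                ("rel", "father", Z, Y))), ("x", "y"))
```

The evaluator enumerates O for every quantifier and free variable, which is only workable because O is finite; it says nothing about how LAURE actually compiles queries.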
Lemma: This logic language is monotonic with respect to the database values: for each assertion a,
∀ q1,q2 ∈ Q, ∀ f, q1 < q2 ⇒ [ {[a]q1,f = true} ⇒ {([a]q2,f = true) ∨ ([a]q2,f = T)} ]
Proof: (by induction on the structure of the assertion a).
4.2 Rules in LAURE
The query notion may be used to perform logic programming in this model, by defining relations through queries. Precisely, a clause is defined as a pair (Ri, condition) where Ri is a relation variable from R, and condition is an object-assertion from L3. Such a clause is usually written: {condition(x y) ⇒ Ri(x,y)}
Example: [∃ z [y = father(z)] ∧ [father(x z) ∨ mother(x z)]] ⇒ GRAND_FATHER(x y)
Clauses are implemented in LAURE with two kinds of objects, called rules and axioms. They both represent a clause, with a condition attribute that is an assertion from L3 and a conclusion attribute that is a relation from R. The only difference is in the resolution mode that we want to apply to the clause. A rule will be used in top-down resolution, whereas an axiom will be used in bottom-up evaluation, thus being always satisfied (Section 7). Satisfaction of a clause is defined with respect to the database value. If Ri is a relation from R, it is said that a database value q satisfies a clause {a ⇒ Ri} if the value of the query "a?" is not T and is included in the value of the binary relation Ri: [a]q ≠ T ∧ {∀ x ∈ O, ∀ y ∈ O, (x,y) ∈ [a]q ⇒ (x,y) ∈ q(Ri)}. If the relation Ri is mono-valued, and if more than one value can be deduced from rules for Ri(x), this is considered as an error, represented by q(Ri)(x) = T. It follows logically that satisfiability is stable under the lattice operation: for each clause C = {a ⇒ Ri}, for all database values q1, q2 such that q1 and q2 satisfy C, glb(q1,q2) satisfies C.
Definition: A deductive program is a pair (q0, {Ci}), where q0 is the initial database value, and {Ci} a finite set of clauses. An admissible solution of a program is a database value q such that:
- q0 < q
- q satisfies each clause of {Ci}.
Theorem: For each deductive program (q0, {Ci}), either
- there exists one unique smallest admissible solution, which is called the solution of the program, or
- there is no admissible solution, and the solution of the program is T.
Proof [Ta55]: If there is an admissible solution, the greatest lower bound of all admissible solutions is the unique smallest solution.
A program whose solution is T is a program where the evaluation of a clause yields an error. Here we do not address the problem of error prediction (such as overflow), which is equivalent to the safety problem in relational databases. As mentioned previously, a rule in LAURE is a clause that will be used in top-down resolution and an axiom is a clause that will be used in bottom-up evaluation. An extension of the axiom notion is offered in LAURE, which allows the user to bind any action to an L3 assertion:
axiom :: {assertion(x,y) ⇒ action(x,y)}
The action is defined with the LAURE programming language in its full generality. The semantics of such an axiom is that action(x,y) must be evaluated for each new pair (x,y) that happens to satisfy the condition represented by assertion(x,y). This is the LAURE way of defining production rules, which are very useful when building a large reactive simulation system. Section 7 will illustrate how these production rules can be managed efficiently through their algebraic representation. A special case of production rules is the integrity constraint, where the action is another condition, whose evaluation produces an error if it is not satisfied. LAURE supports integrity constraints like:
[integrity_constraint new for_all (s1 s2 integer) if [e exists [salary(e) = s1] [salary(manager(e)) = s2]] check [s1 < s2]]
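The reading of axioms as bottom-up clauses, production rules and integrity constraints can be sketched in Python as a naive fixpoint loop. This is not the algorithm of Section 7: the function bottom_up, the encoding of relations as sets of pairs, the sample data, and the adaptation of the salary constraint to pairs (employee, manager) are all assumptions made for the illustration.

```python
def bottom_up(q0, axioms, objects):
    """Naive bottom-up evaluation (a sketch, not the LAURE algorithm).

    axioms is a list of (condition, conclusion) pairs. condition(q, x, y)
    returns a boolean. conclusion is either a relation name, in which case
    (x, y) is added to that relation, or a callable action(x, y), which is
    run once for each new pair that satisfies the condition.
    """
    q = {name: set(pairs) for name, pairs in q0.items()}
    fired = set()  # (axiom index, x, y) triples already handled
    changed = True
    while changed:
        changed = False
        for index, (condition, conclusion) in enumerate(axioms):
            for x in objects:
                for y in objects:
                    if (index, x, y) in fired or not condition(q, x, y):
                        continue
                    fired.add((index, x, y))
                    changed = True
                    if callable(conclusion):
                        conclusion(x, y)
                    else:
                        q[conclusion].add((x, y))
    return q


OBJECTS = ["Luke", "Mary", "Paul", "Peter", "Tom"]
q0 = {  # assumed sample data
    "father": {("Paul", "Peter"), ("Peter", "Luke"), ("Mary", "Tom")},
    "mother": {("Paul", "Mary")},
    "grand_father": set(),
}


def grand_father_condition(q, x, y):
    # [∃ z [y = father(z)] ∧ [father(x z) ∨ mother(x z)]]
    parents = q["father"] | q["mother"]
    return any((x, z) in parents and (z, y) in q["father"] for z in OBJECTS)


log = []
axioms = [
    (grand_father_condition, "grand_father"),           # clause
    (lambda q, x, y: (x, y) in q["grand_father"],       # production rule
     lambda x, y: log.append((x, y))),
]
solution = bottom_up(q0, axioms, OBJECTS)

SALARY = {"Ann": 50, "Bob": 40}  # assumed data: Bob manages Ann, earns less


def check_salary(employee, manager):
    # Integrity constraint as an action that raises when the check fails.
    if not SALARY[employee] < SALARY[manager]:
        raise ValueError(f"{employee} earns at least as much as {manager}")


try:
    bottom_up({"manager": {("Ann", "Bob")}},
              [(lambda q, x, y: (x, y) in q["manager"], check_salary)],
              ["Ann", "Bob"])
    violation = None
except ValueError as error:
    violation = str(error)
```

The loop terminates because each axiom can fire at most once per pair of objects, and the set of objects is finite.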
The implementation of integrity constraints is a subcase of production rule implementation, as discussed in Section 7.
4.3 Constraints in LAURE
In order to take full advantage of the disjunctive information, we need to introduce logic formulae that define mono-valued relations implicitly. This is the goal of constraints, which can be thought of as choice rules. This model supports relational constraints and negative constraints. Relational constraints have the converse form of a clause: (Ri(x,y) ⇒ condition(x,y)), where Ri is a relation from R# and condition is an assertion from L3. Satisfaction of a constraint (Ri ⇒ a) is defined symmetrically to clause satisfaction. If Ri is a relation from R#, it is said that a database value q satisfies a constraint (Ri ⇒ a) if and only if: [a]q ≠ T ∧ {∀ x ∈ O, ∀ y ∈ O, (x,y) ∈ q(Ri) ⇒ (x,y) ∈ [a]q}. A relational constraint is not a sufficient condition to put something in the database; it is a necessary condition. Finding a solution from a set of constraints is a different process that is non-deterministic and involves backtracking. Relational constraints are derived automatically (Section 5) from object constraints, which are assertions of L3 holding on one object variable. For instance the relational constraint
pressure(x y) ⇒ [[y * volume(x)] = [n(x) * [R * temperature(x)]]]
is derived from the following object constraint: [[pressure(x) * volume(x)] = [n(x) * [R * temperature(x)]]].
A negative constraint is a clause with a negative conclusion: (condition(x,y) ⇒ ¬Ri), where Ri is a relation from R# and condition is an assertion from L3. If Ri is a relation from R# and a an assertion from L3, it is said that a database value q satisfies a negative constraint (a ⇒ ¬Ri) if and only if: [a]q ≠ T ∧ ([a]q ∩ q(Ri) = ∅). It follows logically that satisfiability for constraints is also stable by intersection: For each constraint C, for all database values q1, q2 that satisfy C, glb(q1,q2) [ {(I,q1) satisfies C} ⇒ {(I,q2) satisfies C} ∨ [a]q2 = T)} . A database program is a pair (q0, {Ci}), where q0 is the initial database value, and {Ci} a finite set of constraints. An admissible solution of a program is a database value q such that:
- q0 < q
- q satisfies each clause of {Ci}.
The existence of a unique smallest admissible solution, unless an error occurs, can also be obtained here. However, constraint resolution is aimed towards completion of incomplete information.
Definition: A solution of a logic program (q0, {Ci}) is a minimal complete admissible solution. There is no equivalent result about the uniqueness of a solution for each program. Because of the non-determinism, there may be one, many or no solutions to a given program. The next section will introduce some useful tools that LAURE uses for rule and constraint resolution.
5. THE OBJECT RELATIONAL ALGEBRA
5.1 A Relational Algebra on Binary Relations
Here an algebra of terms representing binary relations on O is defined, using R as a set of variables. This algebra, written A(R), will be used to define the algebraic query language. It is built from a set of operations on the set of binary relations on O. This algebra is implemented in LAURE with objects, where each operation of the algebra is a class, and each node in a tree representing an algebraic term is represented by an object. This algebra is the most important tool with which LAURE achieves efficiency, as will be illustrated in Section 7.
The first simple operation is to make a relation from two sets as a cartesian product: if S1 and S2 are two subsets of O, S1 × S2 is a binary relation on O: (x,y) ∈ S1 × S2 ⇔ (x ∈ S1) ∧ (y ∈ S2). The composition of two binary relations may be seen as a simple join in a relational database: if r1 and r2 are two binary relations, (r1 o r2) is a binary relation on O: (x,y) ∈ (r1 o r2) ⇔ ∃ z ∈ O, ((x,z) ∈ r2) ∧ ((z,y) ∈ r1). For instance, (father o parents) represents the “grand_father” relation.
The union and the intersection of two relations are straightforward to define: if r1 and r2 are two binary relations, (x,y) ∈ (r1 ∪ r2) ⇔ ((x,y) ∈ r1) ∨ ((x,y) ∈ r2), (x,y) ∈ (r1 ∩ r2) ⇔ ((x,y) ∈ r1) ∧ ((x,y) ∈ r2). For instance, (father ∪ mother) represents the “parent” relation, and (friend ∩ ancestor) represents the “preferred ancestors” relation.
The inverse of a binary relation is defined as: (x,y) ∈ r⁻¹ ⇔ (y,x) ∈ r. For instance, (parent)⁻¹ represents the “children” relation.
A selection operation on O × O is introduced, which uses functions from F c . It corresponds to the selection operation in relational databases. If r1 and r2 are two binary relations, θ a function from O × O → {true,false}, then (r 1 θ r 2 ) is a binary relation defined by: (x,y) ∈ (r1 θ r2 ) ⇔ (x = y) ∧ ∃ z1 , z2 ∈ O , ((x,z1 ) ∈ r1 ) ∧ ((x,z2 ) ∈ r2 ) ∧ (θ (z 1 ,z 2 ) = true) For instance, (age = PERSON × {10}) represents the “10 years old” relation.
There is one operation that permits the introduction of functions from Fo and Fp. If r1 and r2 are two binary relations, ƒ a function from O × O → O, then ψ(r1,ƒ,r2) is a binary relation defined by: (x,y) ∈ ψ(r1,ƒ,r2) ⇔ ∃ z1, z2 ∈ O, ((x,z1) ∈ r1) ∧ ((x,z2) ∈ r2) ∧ (ƒ(z1,z2) = y)
For instance, ψ (Id, +, INTEGER × {1}) represents the “successor” relation.
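The operations introduced so far can be written directly over Python sets of pairs. This is a minimal sketch following the definitions above; the function names and the sample relations (father, mother, age) are assumptions, and union and intersection are simply the set operators | and &.

```python
import operator


def product(s1, s2):
    """S1 × S2."""
    return {(x, y) for x in s1 for y in s2}


def compose(r1, r2):
    """(r1 o r2): (x,y) such that (x,z) ∈ r2 and (z,y) ∈ r1 for some z."""
    return {(x, y) for (x, z) in r2 for (z2, y) in r1 if z == z2}


def inverse(r):
    """r⁻¹."""
    return {(y, x) for (x, y) in r}


def select(r1, theta, r2):
    """(r1 θ r2): pairs (x,x) such that θ(z1,z2) holds for some
    (x,z1) ∈ r1 and (x,z2) ∈ r2."""
    return {(x, x) for (x, z1) in r1 for (x2, z2) in r2
            if x == x2 and theta(z1, z2)}


def psi(r1, f, r2):
    """ψ(r1,ƒ,r2): pairs (x, ƒ(z1,z2)) for (x,z1) ∈ r1 and (x,z2) ∈ r2."""
    return {(x, f(z1, z2)) for (x, z1) in r1 for (x2, z2) in r2 if x == x2}


# Assumed sample relations.
father = {("Paul", "Peter"), ("Peter", "Luke")}
mother = {("Paul", "Mary")}
age = {("Paul", 10), ("Peter", 40)}
PERSON = {"Paul", "Peter"}
INTEGER = set(range(5))  # a finite stand-in for the integers

parent = father | mother
grand_father = compose(father, parent)
children = inverse(parent)
ten_years_old = select(age, operator.eq, product(PERSON, {10}))
identity = {(n, n) for n in INTEGER}
successor = psi(identity, operator.add, product(INTEGER, {1}))
```

Note the argument order of compose: the right-hand relation is applied first, as in the definition of (r1 o r2).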
It is also necessary to “create new objects” inside the query language, which is often referred to as invention. Our proposition is to limit invention by the use of parametrization (the invented object must be parameterized by one or two existing objects). A famous example of such an object constructor is the cons function, which builds a new list from an object and another list. Its two inverse functions are respectively car and cdr. This representation of object invention fits easily in the algebra since the ψ operation may be used. For instance, ψ(car, cons, cdr) represents the “Identity” relation, by definition.
We introduce a subterm naming ability as an operation from our algebra: If x ∈ V - R, if t1 (R 1 ,...,R n ,x) is a term of A (R ∪ {x}) and t2 (R 1 ,..,R n ) is a term of A(R): - (λ x.t 1 (r 1 ,...,r n ,x))t 2 (r 1 ,r 2 ,..,r n ) is a term of A (R), - For each database value q, [( λ x.t 1 (r 1 ,...,r n ,x))t 2 (r 1 ,r 2 ,..,r n )] q = [t1 (r 1 ,...,r n ,x)] g , with g(r) = q(r) for r ∈ R, and g(x) = [t2 (r 1 ,r 2 ,..,r n )] q . For each term t, V(t) may be defined as the set of variables z used in a subterm of t, with a construction ( λ z.t 1 )t 2 . For instance, ( λ Z.{( ψ ((age o Z), +, age) = age) o (size -1 o size o Z)}) friend represents the following relation: x is in relation with y if he has a friend z of the same size as y such that the sum of their ages (he (x) and his friend (z)) is equal to the age of y.
If t is a term of A(R), if we replace each variable of R by its value d(Ri), we obtain a relational expression whose value is a binary relation on O × O (written d(t) by extension). Since d(Ri) contains the possible values for Ri, we will refer to d(t) as a computation of possible values. On the other hand, if we replace each variable by the relation represented by q(Ri), we get another relation written q(t), which represents the “certain” value of the term in the database value q. This algebra is similar to a relational database algebra restricted to binary relations. However, an important difference with relational databases is the class-oriented access to the relations. Therefore, the value of each term in a database value q is actually represented with its class-access function, which is represented by extending the database value function q to A(R). The object relational algebra ([Bac78], [McL81]) is now complete. Here is the definition: The object algebra A(R) of relational terms on R is recursively defined by:
- if r ∈ R ∪ P, r ∈ A(R),
- if S1, S2 are two subsets of O, S1 × S2 ∈ A(R),
- if t, t1, t2 ∈ A(R), θ ∈ Fc, ƒ ∈ Fo ∪ Fp, then t⁻¹ ∈ A(R), (t1 o t2) ∈ A(R), (t1 ∪ t2) ∈ A(R), (t1 ∩ t2) ∈ A(R), (t1 θ t2) ∈ A(R), ψ(t1,ƒ,t2) ∈ A(R),
- if x ∈ V-R, t1 ∈ A(R ∪ {x}), t2 ∈ A(R), then (λx.t1)t2 ∈ A(R).
The binary relation that is associated with a term t of A(R) w.r.t. a database value q is represented by a class-access function written q(t). All the relational operations are monotonic and compatible with the lattice structure:
∀ q1, q2 ∈ Q, ∀ t ∈ A(R), ∀ o ∈ O,
(q1 < q2) ⇒ (q1(t)(o) ⊆ q2(t)(o)) ∨ (q2(t)(o) = T),
(q1(t)(o) ≠ T) ∧ (q2(t)(o) ≠ T) ⇒ glb(q1,q2)(t)(o) = q1(t)(o) ∩ q2(t)(o).
An algebraic query may be defined in a similar way. An algebraic query t? is made from a term t of A(R). The answer to the query is simply q(t).
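An algebraic query can be pictured as a tree of objects, one class per operation, evaluated against a database value. The Python sketch below covers only a few operations and is not the LAURE representation: the class names, the evaluate and class_of functions, and the sample database value are assumptions, and relations are plain sets of pairs with no error value T.

```python
from dataclasses import dataclass


class Term:
    """Base class for nodes of an algebraic term."""


@dataclass(frozen=True)
class Var(Term):
    name: str


@dataclass(frozen=True)
class Compose(Term):
    left: Term
    right: Term


@dataclass(frozen=True)
class Union(Term):
    left: Term
    right: Term


@dataclass(frozen=True)
class Inverse(Term):
    arg: Term


@dataclass(frozen=True)
class Lam(Term):
    """(λ var. body) arg: name a subterm and reuse it inside body."""
    var: str
    body: Term
    arg: Term


def evaluate(t, q):
    """Binary relation denoted by term t in the database value q."""
    if isinstance(t, Var):
        return q[t.name]
    if isinstance(t, Compose):
        r1, r2 = evaluate(t.left, q), evaluate(t.right, q)
        return {(x, y) for (x, z) in r2 for (z2, y) in r1 if z == z2}
    if isinstance(t, Union):
        return evaluate(t.left, q) | evaluate(t.right, q)
    if isinstance(t, Inverse):
        return {(y, x) for (x, y) in evaluate(t.arg, q)}
    if isinstance(t, Lam):
        g = dict(q)                      # g(r) = q(r) for r ∈ R
        g[t.var] = evaluate(t.arg, q)    # g(x) = value of the named subterm
        return evaluate(t.body, g)
    raise TypeError(type(t).__name__)


def class_of(t, q, o):
    """Class-access function q(t)(o): objects related to o by t."""
    return {y for (x, y) in evaluate(t, q) if x == o}


# Assumed database value.
q = {"father": {("Paul", "Peter"), ("Peter", "Luke")},
     "mother": {("Paul", "Mary")}}

parent = Union(Var("father"), Var("mother"))
grand_father = Lam("parent", Compose(Var("father"), Var("parent")), parent)
paul_grand_fathers = class_of(grand_father, q, "Paul")
peter_children = class_of(Inverse(parent), q, "Peter")
```

Evaluating the whole relation before selecting one object is wasteful; it is done here only to keep the sketch short.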
5.2 Translation from L3 to the Relational Algebra
The following result states that the expressive power of the algebra is exactly the same as that of the LAURE logic language:
Theorem: For each term t of A(R), there exists an assertion a of L3 such that: ∀ q ∈ Q, q(t) = [a]q. Conversely, for each assertion a of L3 there exists a term t of A(R) such that: ∀ q ∈ Q, q(t) = [a]q.
The proof is a structural induction, which is very long and relies heavily on the finiteness of O. The actual translation of an L3 query into an algebraic query is more interesting because among many possible algebraic translations, there is usually one optimal solution. Translation into the algebraic form is based on rewriting and involves a lot of knowledge about object functions. This aspect is detailed in [Ca90]. The principle is to solve the equation assertion(x,y), while considering that x is known and that y is sought. The result of the resolution is a relational algorithm that explains how to get y from x and is represented as a term in our relational algebra. Introducing object functions such as operations and comparisons in the condition language L3 leads to “object equations”. For instance, such an equation for a gas object x may be: [[pressure(x) * volume(x)] = [[R * n(x)] * T]].
Any equation that can be solved by rewriting should be transformed. Since the object functions are intensional, it is not possible to find object values that satisfy a given equation unless the equation is solved, which means rewritten into an evaluable form. For instance, if T is needed, the previous equation should be rewritten as: T = (pressure(x) * volume(x)) / (R * n(x)).
Not all L3 equations are solvable, but the system recognizes each plausible equation that can be solved; otherwise, logic resolution would be far less efficient than ad hoc programming (when the user solves the equation himself). Therefore, domain-specific knowledge about object functions has been included in the logic system, in the form of rewriting rules. We have restricted ourselves to functions that have some interesting mathematical properties. Here are some examples of rewriting rules that LAURE uses (with some ad hoc ordering to obtain termination). Group operations may be used; if + induces an abelian group structure then we get:
X + 0 → X, X + (Y + Z) → (X + Y) + Z, X + Y → Y + X, X + Y = Z → X = Z - Y.
Similarly, invertible unary functions have an interesting rewriting rule:
X = f(Y) → Y = f⁻¹(X).
Monoid operations and quotient operations are also very common and described by a nice set of rewriting rules. Multiplication on integers for instance fits in the monoid category; associativity and commutativity rewriting rules can be used. There are more sophisticated rewriting rules involving the quotient operation, such as: X * Y = Z → ( Y | Z ) ∧ (X = Z / Y) .
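How such rules isolate a sought value can be sketched in Python. This is an illustration only, not the LAURE rewriting system described in [Ca90]: expressions are encoded as nested tuples, the tables INVERSE and UNARY_INVERSE and the function names succ and pred are hypothetical, and * is treated like a group operation, so the divisibility condition (Y | Z) needed on integers is ignored.

```python
# X op Y = Z  ->  X = Z inverse(op) Y  (assumes op behaves like a
# commutative group operation; divisibility on integers is not checked).
INVERSE = {"+": "-", "*": "/"}
# X = f(Y)  ->  Y = f⁻¹(X); the names succ/pred are hypothetical.
UNARY_INVERSE = {"succ": "pred", "pred": "succ"}


def contains(expr, target):
    """True if target occurs in expr (operator names are not searched)."""
    if expr == target:
        return True
    return isinstance(expr, tuple) and any(contains(a, target) for a in expr[1:])


def solve(lhs, rhs, target):
    """Rewrite lhs = rhs into target = expression, or return None."""
    if contains(lhs, target) == contains(rhs, target):
        return None  # target is absent, or occurs on both sides
    if contains(rhs, target):
        lhs, rhs = rhs, lhs
    while lhs != target:
        if not isinstance(lhs, tuple):
            return None
        if len(lhs) == 2 and lhs[0] in UNARY_INVERSE:
            rhs = (UNARY_INVERSE[lhs[0]], rhs)
            lhs = lhs[1]
            continue
        if len(lhs) != 3 or lhs[0] not in INVERSE:
            return None
        op, a, b = lhs
        if contains(a, target) and contains(b, target):
            return None  # would need collecting terms, not handled here
        if contains(b, target):
            a, b = b, a  # commutativity: X op Y -> Y op X
        rhs = (INVERSE[op], rhs, b)
        lhs = a
    return rhs


pressure, volume, n = ("pressure", "x"), ("volume", "x"), ("n", "x")
gas = solve(("*", pressure, volume), ("*", ("*", "R", n), "T"), "T")
simple = solve(("+", "X", "Y"), "Z", "X")
unary = solve("X", ("succ", "Y"), "Y")
```

On the gas equation, the result is the tuple form of T = (pressure(x) * volume(x)) / (R * n(x)).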
LAURE also holds some knowledge about compatibility between comparisons and operations, with rules like: X + Y < Z ∧ compatible(+,