Unifying Schema and Instance Levels of Object-Oriented Databases

Iztok Savnik, Tomaz Mohoric
University of Ljubljana, Faculty of Computer and Information Science, Slovenia

Zahir Tari
Royal Melbourne Institute of Technology, Department of Computer Science, Australia

April, 1997

Abstract

In this paper we present the consequences of unifying the representation of the schema and the instance levels of an object-oriented (OO) database for the formal representation of the OO database model. The uniform representation of the schema and the instance levels of OO databases is achieved, as in the frame-based knowledge representation languages [13], by representing them using a uniform set of modelling constructs. We show that, using such an approach, the structural part of the OO database model can be described in a clear manner, providing a simple means for the description of the main constructs of the structural model and the relationships among them. Further, we study the consequences of releasing the boundary between the schema and the instance levels of an OO database by allowing the definition of objects which include data from both levels. We show that few changes are needed in order to augment the previously presented formal definition of the structural part of an OO database to represent the extended database model.

Keywords: database semantics, object-oriented databases, database modelling.

1 Introduction

In spite of considerable research effort directed to the problems of the formalisation of object-oriented (OO) database models [16, 3, 6, 10, 14, 23] in the last decade, there is still a lack of a strong theoretical background for OO databases. The main reason for this lies in the rich set of sophisticated data modelling constructs provided by OO database models [6]. In this paper we focus on one aspect of the OO database models: the structural model. We consider that the main obstacle that prevents a clear formalisation of the structural part of the OO data model is the unnecessarily strict distinction between the schema and the instance levels of the database. The presented formalisation unifies them by treating classes as objects, in a similar way as frames [13] are used to represent abstract concepts. We show that the uniform treatment of the extensional and the intensional parts of an OO database [17] allows a clear and simple definition of the formal representation of the OO database model.

In the first part of the paper, we present a formalisation of the OO structural database model which conforms with the main features of the OO database models suggested in [4]. Similar modelling constructs are used for the representation of the extensional and the intensional parts of an OO database. However, we retain a strict separation between the conceptual schema and the instance parts of a database. In the second part of the paper, we investigate the consequences of releasing the boundary between these two parts of a database. Basically, merging the intensional and the extensional levels of an OO database is achieved by allowing objects to include both individual objects and classes as their components. The features of the extended OO database model are studied by considering the constructs needed to extend the previously defined formalisation of the ordinary OO data model so that it can express all aspects of the extended model.
Apart from serving as the guideline for our formal presentation of the OO database model, the uniform treatment of the extensional and intensional parts of a database creates a basis that allows for a simple means of manipulation of the intensional part of a database [20]. Moreover, the intensional database can be manipulated by the constructs which are also used for manipulation of the extensional part of the database. The study of algebraic operations that provide the means for querying the intensional and extensional parts of an OO database is presented in [19]. Further, the uniform treatment of an OO database is of particular importance in distributed database environments, which can include rich conceptual schemata and, most often, require an expressive language for the representation of the meaning of stored data through some form of reflection [12]. Initial studies on using some constructs of the model proposed in this paper in distributed environments are presented in [24].

The rest of the paper is organised as follows. We start the formal presentation by describing identifiers and their structural properties in Section 2. In Section 3 we define values, relate them to the previously presented identifiers and describe their properties. Further, in Section 4 the objects are defined and their properties are described. In most cases, the properties of objects are derived from the properties of values and identifiers. In Section 5 we remove the distinction between the schema and the instance levels of OO databases and study the consequences of this in the framework of the previously presented formalisation. Related work is presented in Section 6. We review the formalisations and the description languages which had a significant influence on the development of the proposed formalisation. Finally, concluding remarks, some aspects of implementation of the presented database model, and the main directions of our future work are given in Section 7.

2 Identifiers and their properties

Let us first define some basic terminology used in the paper. We assume the existence of a predefined infinite set of identifiers $O$. An identifier is a unique symbol which represents an abstract or concrete entity from the real world. Identifiers are denoted by terms written in lower-case letters. For example, the identifier tom serves as the unique identification of a person whose name is "Tom", and the identifier student stands for the unique identification of the abstract representation of a student. As suggested by these two examples, the set $O$ is further divided into the set of individual identifiers $O_D$, representing concrete entities such as persons, and the set of class identifiers $O_C$, representing abstract concepts, which usually stand for a group of individual entities. In some cases we will refer to the individual and class identifiers simply as individuals and classes.

2.1 Interpretation

The most significant difference between class identifiers and individual identifiers is in their interpretations. While the interpretation of an individual identifier is the individual itself, the interpretation of a class identifier is a set of individuals. The interpretation of class identifiers is defined as follows.

Definition 1 Let $c \in O_C$. The interpretation of $c$, denoted $I(c)$, has the following properties:
1. $I(c) \subseteq O_D$, and
2. $\forall p\,(p \in O_C \wedge p \neq c \Rightarrow I(c) \cap I(p) = \emptyset)$.

As can be seen from the above definition, we use the common engineering intuition, as stated in [3], by treating individuals as members of the interpretations of single class identifiers. This design decision leads to disjoint sets of individuals that represent the interpretations of class identifiers. Therefore, an individual identifier is an element of the interpretation of exactly one class identifier.

The class interpretation specifies the membership relationship among individual and class identifiers. Let $id_1 \in O_D$ and $id_2 \in O_C$. The identifier $id_1$ is a member of the identifier $id_2$ if $id_1 \in I(id_2)$. The membership relationship should not be confused with the instantiation relationship, which is defined shortly.

2.2 Partially ordered set of identifiers

A binary relation among class identifiers, denoted as ($id_1$ subclass $id_2$) where $id_1, id_2 \in O_C$, is used to represent the inheritance hierarchy of classes. We assume that this relation is given by the definition of the conceptual schema of an object-oriented database. Using the subclass relationship, we define a relationship $\preceq_i$.

Definition 2 Let $id_1, id_2 \in O$. Then $id_1 \preceq_i id_2$ if one of the following holds:
- $id_1 = id_2$,
- $id_1, id_2 \in O_C \Longrightarrow \exists id_3\,(id_3 \in O_C \wedge (id_1 \text{ subclass } id_3) \wedge id_3 \preceq_i id_2)$, or
- $id_1 \in O_D \wedge id_2 \in O_C \Longrightarrow \exists id_3\,(id_3 \in O_C \wedge id_1 \in I(id_3) \wedge id_3 \preceq_i id_2)$.

The $\preceq_i$ relationship is called the more specific relationship or, read in the opposite direction, the more general relationship. An example of a set of identifiers ordered by the relationship $\preceq_i$ is defined by the following terms: student $\preceq_i$ person, employee $\preceq_i$ person, instructor $\preceq_i$ person, ta $\preceq_i$ student, ta $\preceq_i$ instructor, jim $\preceq_i$ instructor, jane $\preceq_i$ student, john $\preceq_i$ ta.

It can be easily seen that the relationship $\preceq_i$ organises identifiers into a partially ordered set (abbr. poset). It is reflexive, that is, $id \preceq_i id$ for all $id \in O$. It is antisymmetric, since $id_1, id_2 \in O \wedge id_1 \preceq_i id_2 \wedge id_2 \preceq_i id_1$ implies $id_1 = id_2$. It is also transitive, since $id_1 \preceq_i id_2 \wedge id_2 \preceq_i id_3$ implies $id_1 \preceq_i id_3$ for $id_1, id_2, id_3 \in O$.

Lemma 1 The set $O$ is partially ordered by the relationship $\preceq_i$.

2.3 Inherited interpretation

The ordinary class interpretation maps a class identifier to a set of individual identifiers called the members of the class. By taking into account the previously defined partial ordering among the class and individual identifiers, another interpretation is introduced. The inherited interpretation [3] of the class identifier $c$ includes the members of the class $c$ and the members of the subclasses of $c$.

Definition 3 Let $c \in O_C$. The inherited interpretation of $c$, denoted $I^*(c)$, is defined as:

$$I^*(c) = \bigcup_{p \in O_C \,\wedge\, p \preceq_i c} I(p).$$

Using the above definition of the inherited interpretation, we define the instantiation relationship commonly used to represent the associations between individual and class concepts. Let $id_1 \in O_D$ and $id_2 \in O_C$. The identifier $id_1$ is an instance of $id_2$ if $id_1 \in I^*(id_2)$.
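The constructs of this section can be illustrated with a minimal Python sketch. It encodes the example hierarchy of Section 2.2 as plain dictionaries; the names `SUBCLASS`, `MEMBERS`, `more_specific`, `inherited_interpretation` and `is_instance` are illustrative assumptions and not part of the formal model.

```python
# Direct (id1 subclass id2) pairs between class identifiers (example of Section 2.2).
SUBCLASS = {
    "student": {"person"},
    "employee": {"person"},
    "instructor": {"person"},
    "ta": {"student", "instructor"},
}
CLASSES = {"person", "student", "employee", "instructor", "ta"}

# Ordinary interpretation I(c): pairwise disjoint sets of individuals (Definition 1).
MEMBERS = {
    "person": set(),
    "student": {"jane"},
    "employee": set(),
    "instructor": {"jim"},
    "ta": {"john"},
}


def more_specific(id1, id2):
    """Return True if id1 is more specific than or equal to id2 (Definition 2)."""
    if id1 == id2:
        return True
    if id2 not in CLASSES:
        return False
    if id1 in CLASSES:
        # Follow the subclass relation upwards.
        return any(more_specific(p, id2) for p in SUBCLASS.get(id1, ()))
    # id1 is an individual: go through the class it is a member of.
    return any(id1 in MEMBERS[c] and more_specific(c, id2) for c in CLASSES)


def inherited_interpretation(c):
    """I*(c): union of I(p) over all classes p more specific than c (Definition 3)."""
    result = set()
    for p in CLASSES:
        if more_specific(p, c):
            result |= MEMBERS[p]
    return result


def is_instance(individual, c):
    """Instantiation: the individual belongs to the inherited interpretation of c."""
    return individual in inherited_interpretation(c)
```

In this sketch john is a member of ta only, but an instance of ta, student, instructor and person, which is the distinction between membership and instantiation drawn above.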

3 Values and their properties

So far we have presented identifiers and their structural properties. In this section, we extend the concept of identifier to the notion of value. We distinguish between two basic types of value: identifiers and structured values. The set and tuple constructors are used to build the structured values.

Let us first present the basic terminology used in this section. We assume the existence of an infinite set of values $V$. Identifiers are, as defined in the previous section, elements of the set $O$, which can be further divided into the set of individual identifiers $O_D$ and the set of class identifiers $O_C$. Structured values are divided into the set of ground values $V_D$ and the set of values $V_T$ that represent types. Further, we assume the existence of a set of attribute names $A$.

3.1 Values

Definition 4 A value is one of the following:
- $id \in O$,
- $\{o_1, \ldots, o_n\}$, where $o_i \in V$, or
- $[A_1 : o_1, \ldots, A_n : o_n]$, where $o_i \in V$ and $A_i \in A$.

An example of a value is [age:50, kids:{ana, jim}, lives_at:"Brisbane", work_as:"teacher"], representing the properties of a person. The component {ana, jim} is the set of identifiers which represent kids. The strings "Brisbane" and "teacher" have the role of primitive identifiers which denote the address and profession of a person.

3.2 Types

Values which include only individual identifiers are called ground values. When values are composed solely of class identifiers, we refer to them as types¹. Analogously to our perception of class identifiers, types stand for the abstract representation of a set of values. Formally, a type is defined as follows.

Definition 5 The value $t$ is a type, that is, $t \in V_T$, if one of the following holds:
- $t \in O_C$,
- $t = \{s\}$, where $s \in V_T$, or
- $t = [A_1 : t_1, \ldots, A_n : t_n]$, where $t_i \in V_T$ and $A_i \in A$.

Let us present an example of a type. The tuple [name:string, age:int, works:organisation, lives_at:address] can represent the properties of an employee. The type of the attribute age is the primitive type int. Next, the class identifier organisation has the role of a reference type [23].

¹ Usually, the term type is used to represent the static structure and the behaviour of a set of values. For the purpose of this presentation we ignore object behaviour.
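The distinction between ground values and types can be sketched in Python as follows. The encoding is an assumption made for illustration only: identifiers are strings, set-structured values are Python lists, tuple-structured values are dictionaries, and every identifier not listed in the hypothetical set `CLASS_IDS` is treated as an individual identifier.

```python
# Hypothetical set of class identifiers O_C; all other strings count as individuals.
CLASS_IDS = {"string", "int", "organisation", "address", "person"}


def is_type(v):
    """Definition 5: a class identifier, {s} with s a type, or a tuple of types."""
    if isinstance(v, str):
        return v in CLASS_IDS
    if isinstance(v, list):  # set-structured value
        return len(v) == 1 and is_type(v[0])
    if isinstance(v, dict):  # tuple-structured value
        return all(is_type(component) for component in v.values())
    return False


def is_ground(v):
    """A ground value is built only from individual identifiers."""
    if isinstance(v, str):
        return v not in CLASS_IDS
    if isinstance(v, list):
        return all(is_ground(component) for component in v)
    if isinstance(v, dict):
        return all(is_ground(component) for component in v.values())
    return False


employee_type = {"name": "string", "age": "int",
                 "works": "organisation", "lives_at": "address"}
person_value = {"age": "50", "kids": ["ana", "jim"],
                "lives_at": "Brisbane", "work_as": "teacher"}
```

Under these assumptions `employee_type` is a type and `person_value` is a ground value; a tuple that mixes class and individual identifiers is neither, which is exactly the separation that Section 5 later removes.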

3.3 Interpretation

Close integration of the concepts of class identifier and type provides a clear method for the definition of type interpretation, which can now be defined as a straightforward extension of the class interpretation. The type interpretation is defined as follows. Note that in the following definition $I_c$ denotes the inherited interpretation of class identifiers.

Definition 6 Let $t \in V_T$. Depending on the structure of the type $t$, its interpretation, denoted $I(t)$, is:
- $t \in O_C \Longrightarrow I(t) = I_c(t)$,
- $t = \{s\} \Longrightarrow I(t) = \{o \,;\, o \subseteq I(s)\}$,
- $t = [A_1 : t_1, \ldots, A_n : t_n] \Longrightarrow I(t) = \{[A_1 : v_1, \ldots, A_n : v_n] \,;\, v_i \in I(t_i)\}$.

This definition of the type interpretation specifies the membership relationship between values and types. Let $v \in V_D$ and $t \in V_T$. The value $v$ is a member of $t$ if $v \in I(t)$. Therefore, the members of a particular type $t$ are the elements of the interpretation of $t$.
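A membership test following the three cases of Definition 6 could look like the sketch below. The encoding (strings for identifiers, lists for set values, dictionaries for tuple values) and the table `CLASS_MEMBERS`, which stands in for the interpretation $I_c$ of class identifiers, are assumptions of this example.

```python
# Assumed interpretation I_c of a few class identifiers.
CLASS_MEMBERS = {
    "string": {"Tom", "Ana"},
    "int": {"19", "50"},
    "course": {"math", "history", "gym"},
}


def is_member(v, t):
    """Return True if the ground value v belongs to I(t) (Definition 6)."""
    if isinstance(t, str):
        # t is a class identifier: look v up in its interpretation.
        return isinstance(v, str) and v in CLASS_MEMBERS.get(t, set())
    if isinstance(t, list) and len(t) == 1:
        # t = {s}: v must be a set whose elements all belong to I(s).
        return isinstance(v, list) and all(is_member(o, t[0]) for o in v)
    if isinstance(t, dict):
        # t is a tuple type: v must have exactly the same attributes.
        return (isinstance(v, dict)
                and v.keys() == t.keys()
                and all(is_member(v[a], t[a]) for a in t))
    return False


student_type = {"name": "string", "age": "int", "courses": ["course"]}
tom = {"name": "Tom", "age": "19", "courses": ["math", "gym"]}
```

The tuple case requires identical attribute sets, so a value with an additional attribute is not a member of `student_type` in this sketch; the inherited type interpretation of Section 3.5 relaxes this.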

3.4 Partially ordered set of values

The relationship $\preceq_i$ defined on identifiers is extended to relate values. It is denoted as $\preceq_v$. As with the relationship $\preceq_i$, we call the relationship $\preceq_v$ the more specific relationship. Intuitively, values that are more specific, or "below" in the ordering defined by the relationship $\preceq_v$, refine the more general values that are "higher" in the set of values $V$ with regard to the relationship $\preceq_v$.

Definition 7 Let $v_1, v_2 \in V$ be values. The value $v_1$ is more specific than the value $v_2$, denoted by $v_1 \preceq_v v_2$, if one of the following holds:
- $v_1, v_2 \in O \Longrightarrow v_1 \preceq_i v_2$,
- $v_1, v_2 \in V_T \wedge v_1 = \{s\} \wedge v_2 = \{t\} \Longrightarrow s \preceq_v t$,
- $v_1, v_2 \in V_T \wedge v_1 = [A_1 : a_1, \ldots, A_n : a_n] \wedge v_2 = [B_1 : b_1, \ldots, B_k : b_k] \Longrightarrow n \geq k \wedge \forall i \in [1..k]\,(A_i = B_i \wedge a_i \preceq_v b_i)$, or
- $v_1 \in V_D \wedge v_2 \in V_T \Longrightarrow v_1 \in I(v_2)$.

Just as the relationship $\preceq_i$ organises identifiers into a partially ordered set, the relationship $\preceq_v$ forms a partial ordering of values. It can be easily seen that it is reflexive, antisymmetric and transitive. Therefore, the following lemma can be written.

Lemma 2 The set $V$ is partially ordered by the relationship $\preceq_v$.

The previous definition of the value poset captures the notion of partial ordering of types as defined by Cardelli in [10], or Vandenberg in [23]. It can be obtained by restricting the set of all values $V$ to types $V_T$. Similarly, the value poset includes the membership relationship between types and ground values.
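Restricted to types, the relationship $\preceq_v$ behaves like a structural subtyping check. The following Python sketch illustrates this under stated assumptions: types are encoded as strings, one-element lists and dictionaries, the hierarchy in `SUBCLASS` is the example of Section 2.2, and tuple attributes are matched by name rather than by position, which is a simplification of Definition 7.

```python
# Direct subclass links between class identifiers (example of Section 2.2).
SUBCLASS = {
    "student": {"person"},
    "instructor": {"person"},
    "ta": {"student", "instructor"},
}


def id_leq(c1, c2):
    """The relationship on class identifiers: reflexive-transitive closure of subclass."""
    if c1 == c2:
        return True
    return any(id_leq(parent, c2) for parent in SUBCLASS.get(c1, ()))


def type_leq(t1, t2):
    """Return True if type t1 is more specific than type t2."""
    if isinstance(t1, str) and isinstance(t2, str):
        return id_leq(t1, t2)
    if isinstance(t1, list) and isinstance(t2, list):
        # Set types {s} and {t}: compare the element types.
        return len(t1) == 1 and len(t2) == 1 and type_leq(t1[0], t2[0])
    if isinstance(t1, dict) and isinstance(t2, dict):
        # Tuple types: t1 must refine every attribute of t2 and may add more.
        return all(a in t1 and type_leq(t1[a], t2[a]) for a in t2)
    return False


person_t = {"name": "string", "age": "int"}
student_t = {"name": "string", "age": "int", "degree": "int"}
```

Here `student_t` is more specific than `person_t` because it repeats both attributes and adds one, while the converse does not hold.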

3.5 Inherited type interpretation

The type interpretation defined in the previous sub-section maps a type $t$ to the set of its members whose structure is strictly the same as the structure of the given type. We remove this constraint by defining the inherited interpretation of a type to be the union of the interpretation of the given type and the interpretations of all types which are more specific than the given type.

Definition 8 Let $t \in V_T$. The inherited interpretation of $t$, denoted $I^*(t)$, is defined as:

$$I^*(t) = \bigcup_{s \in V_T \,\wedge\, s \preceq_v t} I(s).$$

The above definition captures the notion of the instantiation relationship between ground values and types. Formally, the instantiation relationship can be defined as follows. Let $v \in V_D$ and $t \in V_T$. The value $v$ is an instance of type $t$ if $v \in I^*(t)$.

Definition 7 gives a syntactical means for checking the relationship $\preceq_v$ between values. The following theorem shows the correspondence between the syntactical definition of $\preceq_v$ and the inherited interpretation function $I^*$.

Theorem 1 Let $t_1, t_2 \in V_T$. The following relation between the relationship $\preceq_v$ and the interpretation $I^*$ holds:

$$t_1 \preceq_v t_2 \iff I^*(t_1) \subseteq I^*(t_2).$$

Proof. The first part of the proof is to show that the syntactical definition implies the subsumption of the corresponding inherited interpretations. Definition 8, which presents the inherited interpretation of types, uses the relationship $\preceq_v$ to identify more specific values. If $t_1$ is more specific than $t_2$ then, according to Definition 8, the set $I^*(t_1)$ has to be included in the set $I^*(t_2)$. The reverse direction can be proved in a similar manner. Suppose the relationship $I^*(t_1) \subseteq I^*(t_2)$ holds. Definition 8 states that $I^*(t_2)$ includes all inherited interpretations of more specific types from $V_T$. Therefore, if $I^*(t_1)$ is included in $I^*(t_2)$, then $t_1$ has to be in the set of types which are more specific than $t_2$, that is, $t_1 \preceq_v t_2$. □

4 Objects and their properties

The proposed data model distinguishes between two aspects of objects. First, every object has an identity, also called object identity (oid), which is realised by an identifier that distinguishes it from all other objects in the database. Second, every object has a value which describes its state. The two basic object aspects are connected by means of a value assignment function that maps identifiers to corresponding values.

4.1 Object definition

We distinguish between primitive and defined objects. The identity of a primitive object is the same as its value. The value of a defined object can be any of the previously defined values.

Definition 9 An object is a pair $o = (id, v)$, where $id \in O$ and $v \in V$. The object can be in the following two forms:
- primitive object: $(id, id)$, where $id \in O_D$, or
- defined object: $(id, v)$, where $id \in (O \setminus O_D) \wedge v \in V$.

Let us now present some examples of objects. Firstly, an example of a primitive object is (1, 1), which is a formal representation of the integer number "1". The identity of "1" is the same as its value. Similarly, the object (int, int) stands for the formal representation of the integer number. As presented in Section 2, the term int represents an identifier.

Further, the following are some examples of defined objects. An example of a tuple-structured object is (tom, [name:"Tom", age:19, courses:{math, history, gym}]), where the term tom is an identifier which uniquely identifies an object. This example presents an individual object; in the following example, we give a description of an abstract entity represented by a class object. The object (person, [name:string, age:int, lives_at:address]) stands for an abstract representation of the person.

In the above examples, we made a distinction between individual and class objects. Since this distinction is important for the further presentation, we define these two types of objects explicitly. Firstly, the individual object represents a single concrete entity from a modelling environment. Its value includes solely individual identifiers. Secondly, the class object represents an abstract entity which stands for the representation of an abstract concept or, from the other point of view, an abstract representation of a set of individual objects. The value of a class object includes solely class identifiers.

4.2 Relations between identifiers, values and objects

In this sub-section, we present in more detail some relationships among the basic elements of the formal presentation: identifiers, values and objects. First, we present the value assignment function. Next, the inheritance of properties in the context of the presented formal view of an object-oriented database is discussed. Finally, we present the relations between the partially ordered sets defined in previous sections and the partially ordered set of objects.

On the value of object

An object is described by a set of properties which are represented by attributes. We assume that the attributes are defined for a particular object at the time of its creation. The individual objects inherit attributes from their parent class objects. The attributes of class objects are defined when the conceptual schema of a database is defined². Due to the partial ordering relationship $\preceq_i$ among object identifiers and the use of the inheritance principle, the attributes that correspond to an object consist of attributes that are directly associated with the object, and inherited attributes. For this purpose, two assignment functions are defined. First, given an object $o$, the value assignment function $\nu$ returns the properties of object $o$ that are directly associated with $o$. Second, the inherited value assignment function $\nu^*$ returns all properties of the object $o$.

For example, the value assignment function applied to the class identifier student is defined as $\nu$(student) = [degree:int, courses:{course}]. The result of the application of the inherited value assignment function to the same object identifier is $\nu^*$(student) = [name:string, age:int, degree:int, courses:{course}]. The latter type includes the properties that pertain to the classes student and person. Formally, the inherited value assignment function is defined as follows. Note that the function $C$ maps a tuple to the set of its components.

Definition 10 Let $id \in O$. The inherited value assignment is defined as:

$$\nu^*(id) = [A_1 : v_1, \ldots, A_k : v_k], \quad \forall A_i\,(A_i \in Atr \wedge i \in [1..k] \wedge \exists p\,(id \preceq_i p \wedge A_i \in C(\nu(p)))).$$

² We do not consider here schema modification facilities.

Let us now present examples of using the value assignment and the inherited value assignment functions on an individual identifier. The value assignment function applied to the identifier peter, which is a member of the class identifier person, returns, for instance, the tuple [name:"Peter", age:40, address:"Ljubljana"]. Next, the result of the application of the inherited assignment function $\nu^*$ to the same identifier is the tuple [name:"Peter", name:string, age:40, age:int, address:"Ljubljana", address:string]. The above tuple includes pairs of components with equal attribute names: the type of the attribute, and the actual value of the attribute. The properties of inheritance are further discussed in the following sub-section.
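The two assignment functions can be sketched in Python as follows. The tables `PARENTS` and `NU` and the function names are assumptions of this example; `PARENTS` records the directly more general identifiers (subclass links for classes, membership for individuals), and the result of the inherited assignment is kept as a list of (defining identifier, attribute, value) triples so that attributes with equal names remain distinguishable.

```python
# Directly more general identifiers: subclass links and class membership.
PARENTS = {"student": ["person"], "peter": ["person"]}

# Value assignment: the attributes directly associated with each identifier.
NU = {
    "person": {"name": "string", "age": "int"},
    "student": {"degree": "int", "courses": ["course"]},
    "peter": {"name": "Peter", "age": "40", "address": "Ljubljana"},
}


def more_general(ident):
    """Return ident followed by every identifier that is more general than it."""
    found = [ident]
    for parent in PARENTS.get(ident, []):
        for p in more_general(parent):
            if p not in found:
                found.append(p)
    return found


def inherited_value(ident):
    """Inherited value assignment as (defining identifier, attribute, value) triples."""
    triples = []
    for p in reversed(more_general(ident)):  # most general identifiers first
        for attr, val in NU.get(p, {}).items():
            triples.append((p, attr, val))
    return triples


def as_tuple(triples):
    """Flatten triples into one dictionary; later (closer) definitions win."""
    return {attr: val for _, attr, val in triples}
```

For student the flattened result combines the attributes of person and student. For peter the triples contain two entries named name, one carrying the type from person and one carrying the actual value, as in the example above.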

Structural inheritance

The inherited value assignment function $\nu^*$ (Definition 10) can return a type which includes two attributes with the same name. Therefore, a problem arises when we would like to access the value of such an attribute. There are two types of conflicts. In the first case, inherited attributes with the same name are defined for objects related by the relationship $\preceq_i$. In the second case, the cause of the conflict is multiple inheritance. In this situation, an object inherits two or more attributes with the same name from more general objects which are not related by the relationship $\preceq_i$.

The first type of conflict is resolved using the overriding principle: the attribute which is closest with respect to the poset of identifiers is chosen. Still, according to Definition 10, both attributes are defined for the particular object. The value of the overridden attribute can be accessed by explicitly stating the object of its definition. An additional property of the overriding principle is required in the proposed data model to establish the conditions for the partial ordering of objects. The value of an attribute $A$ which overrides an attribute $A'$ must be more specific than the value of attribute $A'$. This property is specified by the following definition.

Definition 11 Let $id_1, id_2 \in O$, $A \in C(\nu(id_1))$ and $A \in C(\nu(id_2))$. The following implication must hold:

$$id_1 \preceq_i id_2 \Rightarrow \nu^*(id_1).A \preceq_o \nu^*(id_2).A.$$

Note that the value assignment function $\nu^*$ is used to obtain the value of an identifier. The dot operator is then used to select the value of the attribute $A$. The property expressed by the above definition is necessary for the definition of the partial ordering relationship $\preceq_o$ and for the definition of the static type checking algorithm [20].

Finally, if the name conflict arises as a consequence of multiple inheritance, the user has to state explicitly the class where the attribute is defined. Therefore, in this case conflict resolution is left to the user.
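The overriding condition of Definition 11 can be checked mechanically over a schema. The sketch below is a simplification under stated assumptions: attribute values are restricted to class identifiers, so that the comparison of attribute values reduces to the ordering of identifiers, and the tables `SUBCLASS` and `NU` are hypothetical data in which phd_student deliberately violates the condition.

```python
# Hypothetical schema: subclass links and directly assigned attributes.
SUBCLASS = {
    "student": {"person"},
    "phd_student": {"student"},
    "university": {"organisation"},
}
NU = {
    "person": {"name": "string", "works": "organisation"},
    "student": {"works": "university"},   # valid override: university is an organisation
    "phd_student": {"works": "person"},   # invalid override, kept for demonstration
}


def id_leq(a, b):
    """Reflexive-transitive closure of the subclass relation."""
    if a == b:
        return True
    return any(id_leq(parent, b) for parent in SUBCLASS.get(a, ()))


def overriding_violations():
    """List (id1, id2, attribute) triples that break the overriding condition."""
    bad = []
    for id1, attrs1 in NU.items():
        for id2, attrs2 in NU.items():
            if id1 != id2 and id_leq(id1, id2):
                for attr in attrs1.keys() & attrs2.keys():
                    if not id_leq(attrs1[attr], attrs2[attr]):
                        bad.append((id1, id2, attr))
    return sorted(bad)
```

With this data the override in student passes, while the attribute works of phd_student is reported against both student and person.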

The structures of objects

The partially ordered set of values can be seen as the following two posets. First, the relationship $\preceq_i$ organises the identifiers into a partially ordered set. Second, values that correspond to each of the identifiers are partially ordered by the relationship $\preceq_v$. In other words, if the relationship $\preceq_i$ holds between identifiers, then the relationship $\preceq_v$ holds among the corresponding values. This is expressed by the following lemma. Note that $id_1$ and $id_2$ can be individual or class identifiers.

Lemma 3 Let $id_1, id_2 \in O$ such that $id_1 \preceq_i id_2$. Then $\nu^*(id_1) \preceq_v \nu^*(id_2)$.

Proof. When the inherited value assignment $\nu^*$ is applied to an identifier $id$, it returns the union of the attributes that pertain to the object referenced by $id$ and all its more general objects. Since $id_1 \preceq_i id_2$ and since Definition 11 requires that an attribute is always overridden by a more specific attribute, we can conclude that $\nu^*(id_1) \preceq_v \nu^*(id_2)$. □

As a consequence of the above lemma, we can define a relationship among objects which integrates the relationships $\preceq_i$ and $\preceq_v$. Analogously to the relationships $\preceq_i$ and $\preceq_v$, we denote the relationship $\preceq_o$ and we call it the more specific relationship defined on objects.

Definition 12 Let $o_1 = (id_1, v_1)$ and $o_2 = (id_2, v_2)$ be objects. The object $o_1$ is more specific than $o_2$, or $o_1 \preceq_o o_2$, if $id_1 \preceq_i id_2$.

Since the object identifiers uniquely identify objects and since, by the above lemma, the partial ordering relationship $\preceq_v$ among object values is determined by the relationship $\preceq_i$ among object identifiers, the complete set of objects is also partially ordered by the relationship $\preceq_o$.

Lemma 4 The set of objects $\{(id, v) \,;\, id \in O \wedge v = \nu^*(id)\}$ is partially ordered by the relationship $\preceq_o$.

In a similar way to the above definition of the relationship $\preceq_o$ and the partial ordering of objects, the ordinary interpretation and the inherited interpretation of class identifiers can be extended to class objects. They are denoted by $I$ and $I^*$, respectively. We give here only the definition of the ordinary class interpretation. The inherited interpretation of class objects can then be defined in the same manner as the inherited interpretation of the class identifiers (see Definition 3).

Definition 13 Let $o_c = (id_c, v_c)$, where $id_c \in O_C$ and $v_c \in V_T$, be a class object. The interpretation of $o_c$, denoted $I(o_c)$, is:

$$I(o_c) = \{(id, v) \,;\, id \in I(id_c) \wedge v = \nu^*(id)\}.$$

Again, as a consequence of the unique identification of objects by means of object identifiers, the membership and the instantiation relationships between the class objects and the individual objects are defined in the same manner as the membership and instantiation relationships among the class and individual identifiers.

5 Releasing the boundary between schema and instance levels

In the previous sections we have retained a strict boundary between the schema and the instance levels of a database; this was achieved by making a strict distinction between values which represent types and individual values. The components of the former are solely class identifiers, while the components of the latter are solely individual identifiers. In this section we study the consequences of allowing values to include both individual and class identifiers. The structure and the properties of the identifiers are not affected by the change in the definition of values. Therefore, we study the consequences of mixing the schema and instance levels of a database by revising the properties of values and, further, the properties of objects.

5.1 Values

We start with the definition of values given by Definition 4. The definition does not restrict the structure and the contents of values in any way. Therefore, the set and tuple structured values can include individual and class identifiers as leaf components. As an example, the value [name:string, age:int, works_at:csd] is a type describing the structure of values representing the employees of the Computer Systems Department, represented by an identifier csd. Note that string and int are class identifiers while csd is an individual identifier. To differentiate between values which include class and individual identifiers, and values which include only individual identifiers, we refer to them as types and individual values, respectively.

The interpretation of types can now be defined in a similar way to the previous definition of type interpretation (Definition 6). The definition has to be augmented by adding to it a new item stating that the interpretation of an individual identifier is the identifier itself:
- $t \in O_D \Rightarrow I(t) = t$.

Partially ordered set of values

The partial ordering relationship $\preceq_v$ defined in Section 3 has to be redefined to relate values which are composed of individual and class identifiers. Intuitively, as with the values defined in Section 3, the structured value $v_1$ is more specific than the value $v_2$ if every component of $v_2$ is replaced by a more specific or equal component. Formally, the more specific relationship among values, referred to as $\preceq_v$, is defined as follows.

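Definition 14 Let $v_1, v_2 \in V$. The value $v_1$ is more specific than the value $v_2$, denoted as $v_1 \preceq_v v_2$, if one of the following holds:
- $v_1 \in O \wedge v_2 \in O \Longrightarrow v_1 \preceq_i v_2$,
- $v_1 = \{s_1, \ldots, s_n\} \wedge v_2 = \{t_1, \ldots, t_k\} \Longrightarrow \forall t_i\,(t_i \in v_2 \wedge \exists S_j\,(S_j \subseteq v_1 \wedge \forall s_l\,(s_l \in S_j \wedge s_l \preceq_v t_i)) \wedge \bigcup_{1 \leq j \leq k} S_j = v_1)$, or
- $v_1 = [A_1 : a_1, \ldots, A_n : a_n] \wedge v_2 = [B_1 : b_1, \ldots, B_k : b_k] \Longrightarrow \forall b_i\,(b_i \in C(v_2) \wedge \exists S_j\,(S_j \subseteq C(v_1) \wedge \forall a_l\,(a_l \in S_j \wedge A_l = B_i \wedge a_l \preceq_o b_i)))$, where $C(v)$ denotes the set of components of tuple $v$.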

Let us present some examples of pairs of values for which the relationship $\preceq_v$ holds. Suppose a database includes: a set of class identifiers student, phd_student, etc.; a set of individual identifiers s1, s2, s3, etc. which are the members of class student; a set of identifiers e1, e2, etc. representing employees; etc.

1. s1 $\preceq_v$ student
2. {s1, s2, s3} $\preceq_v$ {student}
3. {phd_student, e1, e2} $\preceq_v$ {student, employee}
4. [name:string, age:int, lives:address] $\preceq_v$ [name:string, age:int]
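The four examples above can be checked with a small Python sketch of Definition 14. Several choices here are assumptions of the sketch rather than part of the definition: identifiers are strings, set values are Python sets of identifiers, tuple values are dictionaries matched by attribute name, each subset $S_j$ is taken to be the largest candidate and is required to be non-empty, and phd_student is assumed to be a subclass of student.

```python
# Assumed schema for the examples: one subclass link and class membership.
SUBCLASS = {"phd_student": {"student"}}
MEMBER_OF = {"s1": "student", "s2": "student", "s3": "student",
             "e1": "employee", "e2": "employee"}


def id_leq(a, b):
    """Ordering on identifiers, covering both individuals and classes."""
    if a == b:
        return True
    if a in MEMBER_OF:
        return id_leq(MEMBER_OF[a], b)
    return any(id_leq(parent, b) for parent in SUBCLASS.get(a, ()))


def value_leq(v1, v2):
    """Return True if v1 is more specific than v2 (sketch of Definition 14)."""
    if isinstance(v1, str) and isinstance(v2, str):
        return id_leq(v1, v2)
    if isinstance(v1, (set, frozenset)) and isinstance(v2, (set, frozenset)):
        # For each t in v2 collect the elements of v1 that specialise it.
        covers = [{s for s in v1 if value_leq(s, t)} for t in v2]
        # Every t needs a non-empty cover, and the covers must exhaust v1.
        return all(covers) and set().union(*covers) == set(v1)
    if isinstance(v1, dict) and isinstance(v2, dict):
        return all(a in v1 and value_leq(v1[a], v2[a]) for a in v2)
    return False
```

Because Python sets cannot contain dictionaries, this sketch does not cover sets of tuple-structured values; a hashable tuple encoding would be needed for that case.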

As for the values presented in Section 3, the set of values which can include both individual and class identifiers is partially ordered.

Lemma 5 The set of values $V$ is partially ordered by the relationship $\preceq_v$.

Proof. It can be easily seen from the definition of $\preceq_v$ (Definition 14) that the reflexivity, antisymmetry and transitivity properties hold for the relationship $\preceq_v$. We consider here only the transitivity. Let $v_1$, $v_2$ and $v_3$ be sets such that $v_1 \preceq_v v_2$ and $v_2 \preceq_v v_3$. If for each element of $v_3$ there exists a set of more specific elements from $v_2$ for which, in turn, there exists a set of elements of $v_1$, then it is also true that for each element of $v_3$ there exists a set of more specific elements of $v_1$. Hence, $v_1 \preceq_v v_3$ holds. The case when $v_1$, $v_2$ and $v_3$ are tuples can be proved in a similar manner. □

The inherited interpretation of values $I^*$ can now be defined in the same manner as in Definition 8. Further, using the previously defined partial ordering relationship $\preceq_v$, we define another type of interpretation which is useful for the definition of variables in a database programming language and query algebra [20, 18]. This type of interpretation, referred to as natural interpretation, allows variables to range over individual values and types, providing the basis for queries which can manipulate the instance and schema levels of a database. It is defined as follows.

Definition 15 Let $T$ be a type. The natural interpretation of $T$ is:

$$I^{\natural}(T) = \{o \,;\, o \preceq_o T\}.$$
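To illustrate how the natural interpretation ranges over both levels of a database, the following Python sketch computes it for class identifiers only, over a finite universe of identifiers. The restriction to identifiers, the finite universe and all names in the code are assumptions of this example.

```python
# Assumed schema: one subclass link and the class membership of individuals.
SUBCLASS = {"phd_student": {"student"}}
MEMBER_OF = {"s1": "student", "s2": "student",
             "p1": "phd_student", "e1": "employee"}
UNIVERSE = {"student", "phd_student", "employee"} | set(MEMBER_OF)


def id_leq(a, b):
    """Ordering on identifiers, covering both individuals and classes."""
    if a == b:
        return True
    if a in MEMBER_OF:
        return id_leq(MEMBER_OF[a], b)
    return any(id_leq(parent, b) for parent in SUBCLASS.get(a, ()))


def natural_interpretation(t):
    """All identifiers of the universe that are more specific than or equal to t."""
    return {o for o in UNIVERSE if id_leq(o, t)}
```

For student the result contains the classes student and phd_student together with the individuals s1, s2 and p1, so a variable ranging over it can be bound to schema elements as well as to instances.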

5.2 Objects

The structure and the relationships among objects are largely determined by the properties of values. In this sub-section, we revise the properties of objects presented in Section 4, and point out the differences due to the above presented differences in the definition and the properties of values.

The definition of objects (Definition 9) can readily withstand the change in the definition of values. To recapitulate, each object has a unique identifier and a value which can now include individual and class identifiers. As with ordinary objects, we differentiate between individual objects, whose values are composed solely of individual identifiers, and class objects, whose values include at least one class identifier.

Let us now consider some of the relationships among the newly defined objects in more detail. Firstly, since a value can now include individual and class identifiers, the structural inheritance of properties has to be revised. Secondly, for the same reasons, the procedure used for the derivation of the value of a given class identifier presented by Definition 10, and Lemma 3, which presents the relationships between identifiers and values of objects, require some changes.

Inheritance

As in F-Logic [14, 15], two types of attribute that describe the state of a class need to be defined in order to express both the properties owned exclusively by a particular class and those inherited by its subclasses and instances. Firstly, the ordinary attributes, which are inherited by all more specific objects, are called inheritable attributes. Secondly, the non-inheritable attributes of a class represent the properties that are not inherited by its subclasses and instances. Non-inheritable attributes are useful for describing class properties that represent general information about the concept described by a given class. For instance, the number of members of the class phd student, or the average grade for the class student, can be represented by non-inheritable attributes.

Other properties of objects

As for the ordinary classes presented in Section 4, the value of an object can be obtained using the inherited value assignment function (Definition 10). However, only the values of the inheritable attributes defined with objects more general than an object o are among the properties of o. Since a class can now include an attribute that describes only that class and is not inherited by its subclasses, the partial ordering relationship $\preceq_v$ and the relationship $\preceq_o$ have to be augmented accordingly. This can be achieved by treating the values of non-inheritable attributes as fixed (or non-evolving) attributes that need not be specialised in the context of a more specific value.
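The effect of the two kinds of attribute on the derived value of an object can be illustrated by a minimal Python sketch. It is not the inherited value assignment function of Definition 10; it only assumes that each object records its more general objects and keeps inheritable and non-inheritable attributes apart. The class `Obj`, the function `inherited_value` and all attribute names and values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    name: str
    parents: list = field(default_factory=list)      # directly more general objects
    inheritable: dict = field(default_factory=dict)  # passed on to more specific objects
    local: dict = field(default_factory=dict)        # non-inheritable, owned by this object only


def inherited_value(obj):
    """Collect the inheritable attributes of all more general objects,
    then the object's own inheritable and non-inheritable attributes."""
    value = {}

    def collect(o):
        for parent in o.parents:
            collect(parent)
        # Applied after the parents, so a more specific definition overrides.
        value.update(o.inheritable)

    collect(obj)
    value.update(obj.local)  # local attributes of ancestors are never visited
    return value


student = Obj(
    "student",
    inheritable={"name": "string", "grade": "float"},
    local={"average_grade": 8.1},
)
phd_student = Obj(
    "phd_student",
    parents=[student],
    inheritable={"advisor": "professor"},
    local={"member_count": 12},
)

print(inherited_value(phd_student))
```

Here `average_grade` describes the class `student` as a whole and therefore does not appear in the value derived for `phd_student`, whereas `name` and `grade` do.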

Therefore, tuple $t_1$ is more specific than tuple $t_2$ if $t_1$ includes attributes whose values are more specific than the values of the inheritable attributes of $t_2$. Definition 14 above, which specifies the partial ordering relationship $\preceq_v$, needs to be redefined by replacing its last item with the following one.

$v_1 = [A_1{:}a_1, \ldots, A_n{:}a_n] \wedge v_2 = [B_1{:}b_1, \ldots, B_k{:}b_k] \Longrightarrow \forall b_i\,(b_i \in C(v_2) \wedge inh(B_i) \wedge \exists S_j\,(S_j \subseteq C(v_1) \wedge \forall a_l\,(a_l \in S_j \wedge A_l = B_i \wedge a_l \preceq_o b_i)))$,

where $C(v)$ denotes the set of components of tuple $v$ and the predicate $inh(A)$ returns true if the attribute $A$ is inheritable.

The same holds for the partial ordering relationship $\preceq_o$: non-inheritable properties are treated as properties that are local to class objects and need not be specialised in the frame of more specific objects. As a consequence of the redefinition of the relationship $\preceq_o$, Theorem 1, which states that $id_1 \preceq_i id_2$, where $id_1, id_2 \in O$, implies that the inherited value of $id_1$ is related by $\preceq_v$ to the inherited value of $id_2$, holds for the extended definition of values and objects. Similarly, the ordinary and the inherited interpretations of objects presented in Sub-section 4.2 do not need to be changed in order to present the possible interpretations of objects as defined in this section. In addition, the natural interpretation of values (Definition 15) can easily be extended to objects.

Definition 16 Let $o = (id, v)$, where $id \in O$ and $v \in V$, be an object. The natural interpretation of o is the set $\{(id_s, v_s) ;\ id_s \preceq_i id \wedge v_s \preceq_v v\}$.
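The revised comparison of tuples can be illustrated by the following Python sketch, which simplifies the formal definition. It assumes that a tuple is a dictionary with one component per attribute name, so the set of matching components reduces to a single component, and that component values are identifiers ordered by direct links to more general identifiers. The functions `leq_id` and `tuple_leq` and all example names are hypothetical.

```python
def leq_id(a, b, parents):
    """True if identifier `a` is equal to or more specific than `b`."""
    if a == b:
        return True
    return any(leq_id(p, b, parents) for p in parents.get(a, ()))


def tuple_leq(t1, t2, inheritable, parents):
    """True if tuple `t1` is at least as specific as tuple `t2`,
    checking only the inheritable attributes of `t2`."""
    for attr, b in t2.items():
        if attr not in inheritable:
            continue  # non-inheritable: fixed, need not be specialised
        if attr not in t1 or not leq_id(t1[attr], b, parents):
            return False
    return True


# Hypothetical identifier hierarchy and attribute classification.
PARENTS = {
    "phd_student": ["student"],
    "student": ["person"],
    "professor": ["person"],
}
INHERITABLE = {"advisor", "member"}

department = {"advisor": "person", "member": "student", "member_count": "int"}
research_group = {"advisor": "professor", "member": "phd_student"}

print(tuple_leq(research_group, department, INHERITABLE, PARENTS))
print(tuple_leq(department, research_group, INHERITABLE, PARENTS))
```

In this sketch `research_group` is accepted as more specific than `department` although it has no `member_count` component, because that attribute is treated as non-inheritable; if `member_count` were added to `INHERITABLE`, the first comparison would fail.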

6 Related work

In this section, we overview the existing formalisations of database models and the representation languages that are related to, or have influenced the design of, the proposed formal treatment of the object-oriented database model. First of all, the presented formal treatment of objects bears a close resemblance to frame-based languages [13]. The formal view of the OO database model is based on ideas introduced by Frame Logic (abbr. F-Logic) [14]. F-Logic is a declarative language that integrates predicate calculus and the features of the OO database model. It treats classes and instances uniformly as objects. Consequently, there is no distinction between class and individual objects at one level. In comparison to F-Logic, the presented formalisation proposes a view of the OO database model that is closer to recent implementations of OO database management systems. We define a semantics of the OO database model that is close to the model proposed in [4], show the consequences of treating classes in the same manner as individual objects in the framework of the formalisation, and afterwards release the barrier between schema and individual objects by allowing values to include individual and class identifiers.

The proposed formalisation uses many ideas from existing formal representations of the OO database model. First, the formalisation is closely related to the formalisation of the O2 database model proposed by Lecluse et al. in [16], and to the formal presentation of the database model of IQL [3]. The paper of Lecluse et al. clearly presents the main features of the OO database model, including inheritance, methods and types, in a denotational style using the standard notions of interpretations and models. In a similar way, Abiteboul and Kanellakis define the structural part of the database model of the logic-based declarative language IQL. Important contributions of this formalisation, which had a significant influence on our work, are: the formal distinction between types and classes; the definition of the inherited interpretation of types; and the study of the relations between structural inheritance and interpretations in the framework of *-interpretation. Second, our work is related to the formalisation of the database model EXTRA presented by Vandenberg in [23]. Among the important features of the formalisation of EXTRA are: the interpretation which, similarly to the inherited interpretation of types [3], takes into account the type hierarchy; and the interpretation of so-called reference types, which correspond to classes in the terminology of IQL and ours.

The presented formalisation defines a data model that is related to the family of languages popularly called description logics (abbr. DL) [8]. Description logics evolved from KL-ONE [5], a frame-based language that uses concepts and roles for describing the modelled domain. Concepts are usually divided into simple concepts and defined concepts, which can be constructed using the composition and intersection of concepts, providing a kind of inheritance relationship and the generalisation type constructor [2].
Roles can be seen as a generalisation of the single-valued and multi-valued attributes of database terminology. In addition, roles can be qualified by value restrictions, co-reference constraints (e.g. equality of two roles) and cardinality constraints, which can be related to database constraints. Although DL is designed to serve as a knowledge-representation formalism providing reasoning facilities such as testing the subsumption between descriptions, DL and the proposed data model have many properties in common. Besides the similarities in the structural properties of DL concepts and objects in our formalisation, they also share some common operations: for instance, the subsumption test [8] is related to testing the validity of the relationship $\preceq_v$ between values [19], and computing the Least Common Subsumer [9] corresponds to the model-based operations lub-set and glb-set [19].

7 Concluding Remarks

In this paper, we studied structural aspects of the object-oriented database model [4] through a formalisation that unifies the instance and schema levels of an OO database. It is shown that the structural part of OO databases can be seen as three partially ordered sets: the partially ordered sets of identifiers, values and objects. The relationships between the elements of these sets are presented by defining the semantics of the basic constructs of the OO database model, such as classes, types, inheritance, and instantiation. The formal view of the structural part of the OO database model [4, 7] is presented by a formalisation that defines classes in a similar manner to Frames: classes are objects which represent abstract concepts. However, we still had to treat the instance and schema levels of the OO database as separate entities in the first part of the paper. In the second part of the paper, we removed this constraint by allowing objects to include individual and class identifiers. We studied the consequences by augmenting the basic definitions of the previously introduced formalisation, and we showed that only a few changes and additional constructs need to be introduced in order to represent the formal view of the extended OO database model.

The presented database model has been implemented in the framework of the query algebra QAL [18, 20]. In brief, QAL is an object algebra that includes, in addition to the facilities of recent object algebras (see e.g. [23, 21, 1]), operations which allow for: querying conceptual schemata and the relationships between the schema and instance levels of object-oriented databases [19]; and efficient manipulation of complex composite objects [20]. QAL is implemented as a query language interface to the extended relational database system Postgres [22]. Particular attention has been directed to the design of a set of operations, called model-based operations, which are used for inquiring about the properties of class objects and the relationships between classes and individual objects.
A detailed description of model-based operations is given in [19], where we also study the queries with which one can inquire about the conceptual schema and/or the relationships between the conceptual schema and the instances. In brief, the model-based operations provide the means for: computing the interpretations of class objects; relating individual objects to their parent class objects; using the more-specific relationships, that is, $\preceq_i$, $\preceq_v$ and $\preceq_o$, in queries; computing the least common more general and more specific values of a set of values; and expressing two different types of equality among objects.

Our future work related to the formalisation of the OO database model will be as follows. Firstly, the presented formalisation will be used in the implementation of the query algebra QAL [18] as an extension of the database programming language E [11]. In particular, the formalisation will serve as the basis for the design of an extended type system of the C++-based DBPL. One of the main aims of the prototype is to study the means for the manipulation of schema in the framework of DBPL. Secondly, the presented formalisation of the OO database model will be used as the basis for the definition and implementation of the type checking algorithm for queries expressed in the query algebra QAL [18]. Initial results on both of the above issues are presented in [20].

References

[1] S. Abiteboul, C. Beeri, On the Power of the Languages For the Manipulation of Complex Objects, Verso Report No.4, INRIA, France, Dec. 1993
[2] S. Abiteboul, R. Hull, IFO: A Formal Semantic Database Model, ACM Trans. Database Systems, Vol.12, No.4, 1987
[3] S. Abiteboul, P.C. Kanellakis, Object Identity as Query Language Primitive, ACM SIGMOD 1988
[4] M. Atkinson et al., The Object-Oriented Database System Manifesto, Proc. First Int'l Conf. Deductive and Object-Oriented Databases, Elsevier Science Publisher B. V., Amsterdam, 1989, pp. 40-57
[5] R.J. Brachman, J.G. Schmolze, An Overview of the KL-ONE Knowledge Representation System, Cognitive Science, Vol.9, No.2, 1985
[6] C. Beeri, A Formal Approach to Object-Oriented Databases, Data & Knowledge Engineering, No.5, 1990
[7] E. Bertino et al., Object-Oriented Query Languages: The Notion and Issues, IEEE TKDE, Vol.4, No.3, June 1992
[8] A. Borgida, Description Logics in Data Management, IEEE TKDE, Vol.7, No.5, October 1995
[9] W.W. Cohen, A. Borgida, H. Hirsh, Computing Least Common Subsumers in Description Logics, Proc. AAAI Conference, 1992
[10] L. Cardelli, A Semantics of Multiple Inheritance, Information and Computation, 76, 138-164, 1988
[11] An Introduction to GNU E, The E Reference Manual and The Design of the E Programming Language, Exodus Project Documents, University of Wisconsin-Madison
[12] D. Edmond, M. Papazoglou, Z. Tari, An overview of reflection and its use in cooperation, Int. Journal of Intelligent and Cooperative Information Systems, Vol.4, No.1, 1995
[13] R. Fikes, T. Kehler, The Role of Frame-Based Representation in Reasoning, Comm. of ACM, Vol.28, No.9, Sept. 1985
[14] M. Kifer, G. Lausen, F-Logic: A Higher-Order Language for Reasoning about Objects, Inheritance, and Scheme, ACM SIGMOD 1989
[15] M. Kifer, G. Lausen, J. Wu, Logical Foundations of Object-Oriented and Frame-Based Languages, Technical Report 93/06, Dept. of Computer Science, SUNY at Stony Brook
[16] C. Lecluse, P. Richard, F. Velez, O2, an Object-Oriented Data Model, ACM SIGMOD 1988
[17] M.P. Papazoglou, Unraveling the Semantics of Conceptual Schemas, Comm. of ACM, Sept. 1995
[18] I. Savnik, Z. Tari, T. Mohoric, A Query Algebra for Objects, Technical Report, Jozef Stefan Institute, IJS-DP 7285, February 1996
[19] I. Savnik, Z. Tari, Querying Conceptual Schemata of Object-Oriented Databases, Proc. of DEXA'96 Workshop, IEEE Comp. Soc. Press, Zurich, September 1996
[20] I. Savnik, A Query Language for Complex Database Objects, Ph.D. thesis, University of Ljubljana, CSD Technical Report, Jozef Stefan Institute, CSDTR-95-6, June 1995
[21] G.M. Shaw, S.B. Zdonik, A Query Algebra for Object-Oriented Databases, Proc. of Data Eng., IEEE, 1990
[22] M. Stonebraker, L.A. Rowe, M. Hirohama, The Implementation of Postgres, IEEE Transactions on Knowledge and Data Engineering, Vol.2, No.1, March 1990, pp. 125-142
[23] S.L. Vandenberg, Algebras for Object-Oriented Query Languages, Ph.D. thesis, Technical Report #1161, University of Wisconsin-Madison, July 1993
[24] Z. Tari, W. Cheng, K. Yetongnon, I. Savnik, Towards Cooperative Databases: The Distributed Object Kernel Approach, Proc. of Parallel and Distributed Computing Systems Conf., Dijon, France, 1996
