The TM Manual

version 2.0 revision e

René Bal, Herman Balsters, Rolf A. de By, Alexander Bosschaart, Jan Flokstra, Maurice van Keulen, Jacek Skowronek, Bart Termorshuizen

03/06/95

Contents

I Introduction to TM: language overview, its usage and its tools

1 Introduction
  1.1 History of TM
  1.2 Differences with prior versions
  1.3 The 'jizz' of TM
  1.4 How to use this manual

2 Language overview
  2.1 Introduction
  2.2 Description and characterization
  2.3 Modules
  2.4 Persistence
  2.5 Object sharing
  2.6 Method inheritance
  2.7 Transactions and queries
    2.7.1 Transactions
    2.7.2 Queries in TM-QL

3 Methodology and example specification
  3.1 How to use TM
  3.2 A diagram language for TM
  3.3 The four-step methodology of arriving at a specification
  3.4 An example specification
    3.4.1 The architecture
    3.4.2 The Action module
    3.4.3 The User Interface module
    3.4.4 The RMS module
  3.5 Application interface
    3.5.1 Introduction
    3.5.2 The TMcore library
    3.5.3 How to use the resulting SPOKE program?

4 The TM tools and how to use them
  4.1 How the tools work together
  4.2 The Graphical TM Interface (GTI)
    4.2.1 The graph viewer
    4.2.2 Mouse and key bindings
    4.2.3 The browsers
  4.3 The type checker
  4.4 The prototyping environment
    4.4.1 The expression list window
    4.4.2 The expression window
    4.4.3 The edit window

II The TM language definition: syntax, typing rules and theoretical issues

5 Syntax diagrams
  5.1 Preliminaries
  5.2 Conceptual schema
  5.3 Classes and sorts
  5.4 Constraints
  5.5 Methods
  5.6 Expressions
  5.7 Operator priorities
  5.8 Naming conventions

6 Typing rules
  6.1 Introduction
  6.2 Typing rules
    6.2.1 Preliminaries
    6.2.2 Modules
    6.2.3 Typing rules for the schema part of TM
    6.2.4 How to read typing rules
    6.2.5 Typing rules of TM
    6.2.6 Additional comments

7 Method inheritance and related theoretical issues
  7.1 Methods
  7.2 Illustrations of methods and their inheritance
  7.3 Incorporation of method inheritance in TM

8 Open issues

References

Index

Part I

Introduction to TM: language overview, its usage and its tools


Chapter 1

Introduction

This is the TM language manual, describing version 2.0 of the database specification language TM. This document is a working document, which means that not all (although many) of our ideas concerning the construction of the language have been thoroughly evaluated. Most importantly, we lack a fine set of application case studies, which would undoubtedly result in further enhancements and tuning of the language. This is obviously a cyclic problem, because how can we obtain good case studies if there is no manual? Thus, the main goal of this document is to get some people out in the field of database specification to use TM and report on their findings.

We use TM as a high-level language for the design and specification of object-oriented database schemas in an efficient and effective manner. The TM language and its accompanying design tools enable users to perform complex semantical analyses of schemas, thus paving the way to a complete debugging of the conceptual design. As a design language, TM is equipped with powerful structuring primitives which enable a user to arrive at natural and intuitively correct designs. These structuring primitives are characterized by the following features:

• Encapsulation (The concepts of Module and Class)
• Multiple inheritance
• Object-oriented specialization (Objects are not only specialized by adding attributes; already existing attributes can also be subject to specialization)
• Complex objects (Records, lists, sets, variants; all arbitrarily nested)
• Methods and method inheritance
• Static constraints of different granularity (Object, Class, and Module level, described by a full first-order typed logic)
• Composition links (Direct references to other objects as values of attributes)
• Static type checkability (The language has a complete formal basis)

We note that the TM language has a complete formal semantics [BaBZ93], and it is this property of having a formal semantics that actually creates the possibility of having an integrated tool set, as TM does. Having a formal semantics entails that all expressions in the language have a precise and unique meaning; without such an unambiguous meaning for all constructions occurring in the language, it is impossible to build a reliable toolset supporting the language features involved. A designer using TM does not necessarily need knowledge of TM's underlying formal basis to achieve correct specifications of TM schemas, but it should be reassuring that the TM toolset is for a large part the result of careful research depending on TM's well-established mathematical semantics.

1.1 History of TM

The TM language is a specification language that is formally founded in the language FM. FM, in turn, is a language that is based on the ideas of Luca Cardelli [Card84,Card88]. It can be seen as a strongly typed lambda calculus that allows for subtyping and multiple inheritance. Over the past four years the theory of FM has been developed by Balsters, Fokkinga, and de Vreeze [BaFo91,BaVr91,Vree89] to exploit the ideas of Cardelli and to augment the theory to make it suitable for object-oriented database specification. The language TM should be understood as a (rather heavily) syntactically sugared, not to say syruped, version of FM that gives a database modeller the full dictionary of object-oriented data models.

First ideas on this language originated in discussions during our visit to Milano in September 1990 [BaBZ91]. This is why the language is called TM: it stands for Twente–Milano. Most of the present syntactical constructs of the language were developed in the early weeks of 1991. A redesign of the language took place in the first half of 1993.

1.2 Differences with prior versions

TM is, like many other languages, still evolving. Version 2.0 is the result of another round of new ideas, discussions about these ideas and formal definition of the final results. The language has changed with respect to the following aspects:

• The concept of a module is introduced. Each module defines a number of classes and a module section. It is possible to construct complex module hierarchies. The concept of a database section has been replaced by the module section of the top level module. Which module serves as the top level module can be chosen freely. The complex module hierarchy 'under' this top level module can be considered a subsystem, which can be specified, type checked, etc. in isolation.

• The manual has been extended with a chapter giving an exact description of the TM typing rules. These typing rules existed previously as a separate document. The document has been revised to make it easily readable for a more general public.

• A formal basis has been defined for method inheritance. Consequently, the syntax has been changed to enable its use.

• Persistence and transaction specification have been studied in depth. This manual has been extended with a section that explains the problems and pitfalls.

• Several small syntax changes have been carried out:
  – Enumerated types based on variant types were added.
  – Notations for the empty list and empty set were introduced.
  – Several operations on sets, lists and strings were added.
  – The distinction between sort expressions (SE) and class expressions (E) has been removed.

• Of course, additional comments have been added to the manual and some pieces of text have been re-formulated.

1.3 The 'jizz' of TM

The jizz¹ of TM is mainly determined by its formal theory. This means that TM is a functional language, with a strict typing paradigm. As a consequence, any expression is assigned a type, so even an if ... then ... else ... endif expression has a type. TM is not merely functional, it also has a general set notion that makes it especially suited for database specifications. Like any other expression, set expressions are assigned a type. Moreover, there is no preset order of evaluation of expressions other than that, to fully evaluate an expression, its subexpressions have to be evaluated.² To the user who is mainly trained in imperative languages like Pascal, Modula, or C, this may seem a bit strange at first. We feel that the constructs offered by the language will soon give him/her the hang of it.

The notion of type is a rather important one in TM. A full understanding of the type system will improve the quality of the specifications written, and we would therefore like to devote a few words to it here. In general, the possibly complex, nested types of the FM language are either

• basic types, such as bool, int, real, string, but possibly also others for the sake of a full support of specific application areas like GIS (Geographic Information Systems) (think of photo, colour, et cetera),
• record types with named fields, which are essential for a complete database specification,
• variant types with named fields, which could also be called choice types and which are akin to Pascal's variants,
• list types of some underlying type, to allow for lists of expressions of that underlying type, and
• set types of some underlying type, to allow for sets of expressions of that underlying type. Set types are sometimes also called power types.

In addition to the above types of FM, the language TM also has so-called class types. An (object-oriented) specification consists of a list of class specifications.³ A class specification identifies a type, which is usually called the class' underlying type, and constraints and methods on the objects of the class. The class itself, having been specified, can be used as a new type in other parts of the specification. Thus if we have defined a class Railroad, we are allowed to use a type L Railroad. If an expression e has this type as its type, we can think of e as a list of railroads.

Another important issue in modelling in general, and in the use of TM in particular, is that of subtyping and the inheritance that follows from subtyping. Again, there is a single general concept for this in FM, and we have extended this concept to TM to be able to deal with class types as well.

¹ Jizz is an expression commonly used amongst naturalists, plane spotters and other fanatical hobbyists. It is a noun that stands for G.I.S., which means general impression of shape. This abbreviation allegedly has a military background.
² And if we were to use the technique of lazy evaluation this isn't fully correct either!
³ In order to turn such a list into a full conceptual database specification, there will also be a separate 'module section' in the list. This is not important for the present discussion.

In FM, informally speaking, we have the following situation:

• there is no subtyping relation amongst basic types,
• a record type is a subtype of another record type if and only if it has at least the same fields but possibly more and the types of the common fields are in the subtyping relation,
• a variant type is a subtype of another variant type if and only if it has at most the fields of that type and the types associated with those fields are in the subtyping relation,
• a list (or set) type is a subtype of another list (set) type if and only if the underlying types are in the subtyping relation.
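The four structural rules above can be read as a recursive check on the shape of two types. The following Python sketch is only an illustration of that reading and not part of TM or its tools: the class names (Basic, Record, Variant, ListOf, SetOf), the helpers record and variant, and the function is_subtype are hypothetical, and we assume that every type counts as a subtype of itself, which the informal rules leave implicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Basic:
    name: str            # e.g. "int", "string"


@dataclass(frozen=True)
class Record:
    fields: tuple        # tuple of (label, type) pairs


@dataclass(frozen=True)
class Variant:
    fields: tuple        # tuple of (label, type) pairs


@dataclass(frozen=True)
class ListOf:
    elem: object         # underlying type


@dataclass(frozen=True)
class SetOf:
    elem: object         # underlying type


def record(**fields):
    """Hypothetical helper: build a record type from labelled field types."""
    return Record(tuple(sorted(fields.items())))


def variant(**fields):
    """Hypothetical helper: build a variant type from labelled field types."""
    return Variant(tuple(sorted(fields.items())))


def is_subtype(s, t):
    """Return True if s is a structural subtype of t under the rules above."""
    if isinstance(s, Basic) and isinstance(t, Basic):
        # No subtyping among distinct basic types (reflexivity is our assumption).
        return s == t
    if isinstance(s, Record) and isinstance(t, Record):
        # s needs at least the fields of t, each with a subtype of t's field type.
        sf, tf = dict(s.fields), dict(t.fields)
        return all(lbl in sf and is_subtype(sf[lbl], ty) for lbl, ty in tf.items())
    if isinstance(s, Variant) and isinstance(t, Variant):
        # s may have at most the fields of t, each with a subtype of t's field type.
        sf, tf = dict(s.fields), dict(t.fields)
        return all(lbl in tf and is_subtype(ty, tf[lbl]) for lbl, ty in sf.items())
    if isinstance(s, ListOf) and isinstance(t, ListOf):
        return is_subtype(s.elem, t.elem)
    if isinstance(s, SetOf) and isinstance(t, SetOf):
        return is_subtype(s.elem, t.elem)
    # Types from different families are never related.
    return False


INT, STRING = Basic("int"), Basic("string")
person = record(name=STRING)
employee = record(name=STRING, salary=INT)
print(is_subtype(employee, person))                   # True
print(is_subtype(ListOf(employee), ListOf(person)))   # True
print(is_subtype(person, employee))                   # False
```

Note the asymmetry between records and variants: a record subtype may add fields, whereas a variant subtype may only drop alternatives.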

We remark that the subtyping relation is thus one that is strictly separated between families of types (e.g., the record types). Also, FM's subtyping is a relation that can be deduced from the structure of the types. In addition, TM allows for a declared form of subtyping, i.e., a form in which the user should explicitly state that one type is a subtype of another. This possibility is restricted to classes and sorts only, and it imposes the requirement that the underlying types of the two classes (sorts) are in the subtype relationship themselves. The explicit statement that a class (sort) is a subtype of another class (sort) is performed through the ISA-clause.

A final issue that we would like to raise in this introduction is the kind of expressions that are allowed in TM. As we have already discussed, each expression is assigned a type in TM, be it a record type, list type or any other type. The expressive power of TM is sufficient to model many database applications. To the untrained user, it is the functional and object-oriented nature of the language that will make its use somewhat 'unnatural' at first. Soon, however, the advantages of a strong typing discipline, the use of inheritance, and a clear modelling paradigm will become apparent.

If we were to take a closer look at the forms of expressions, we could place them in several categories. The following list is not exhaustive but illustrates some interesting forms of expression.



• There are so-called explicit expressions, like for instance

    ⟨name = "John Doe", age = 25⟩,

  which denotes a two-field record, or true, which denotes the boolean truth, or even [1, 3, 2, 89], which denotes a list of four integer numbers.

• A second class of expressions are selection expressions, like for instance record field selection in

    ⟨name = "John Doe", age = 25⟩·name,

  which evaluates to the string "John Doe", or list selection, like in head([1, 3, 2, 89]), which evaluates to the integer constant 1.

• Another class is formed by the predicative expressions, which allow one to describe a set of elements by characterizing those elements, like in

    {x : ⟨a:int, b:int⟩ | x·a = x·b and (x·b = 1 or x·b = 2)},

  which is a clumsy way of writing the enumerated set of records

    {⟨a = 1, b = 1⟩, ⟨a = 2, b = 2⟩}.

  By the way,

    x·a = x·b and (x·b = 1 or x·b = 2)

  is in itself also a legal TM expression (of type bool). The reason is that part of the expressions of the language is formed by a full typed first-order logic, including quantifiers.

• There is actually a whole range of useful expressions, which we would not like to discuss in full in this introduction. The reader is referred to Section 3, where we will give hopefully illuminating examples of their use.

• A last class of expressions are iterative expressions, which take a set or list and a function as arguments and evaluate that function for each element of the set or list. The result obtained is again a set or list. Hence, the expression

    collect ⟨a = x div 2⟩ for x in [1, 3, 2, 89]

  renders a new list:

    [⟨a = 0⟩, ⟨a = 1⟩, ⟨a = 1⟩, ⟨a = 44⟩].
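For readers coming from other languages, the expression forms above have rough analogues in Python. The sketch below is only an analogy and says nothing about how TM evaluates expressions: the record type AB and all variable names are hypothetical, records are modelled as dictionaries or named tuples, and the predicative set is computed over a small finite range of candidate integers because Python cannot enumerate the whole type int.

```python
from typing import NamedTuple


class AB(NamedTuple):
    """Stand-in for the TM record type with integer fields a and b."""
    a: int
    b: int


# Explicit expressions: a two-field record and a list of four integers.
john = {"name": "John Doe", "age": 25}
numbers = [1, 3, 2, 89]

# Selection expressions: record field selection and head of a list.
selected_name = john["name"]
first = numbers[0]

# Predicative expression: all records x with x.a = x.b and (x.b = 1 or x.b = 2).
# Assumption: candidates are restricted to a finite range for illustration.
candidates = range(-3, 4)
predicative = {
    AB(a, b)
    for a in candidates
    for b in candidates
    if a == b and (b == 1 or b == 2)
}

# Iterative expression: collect a record with field a = x div 2 for each x.
collected = [{"a": x // 2} for x in numbers]

print(selected_name)   # John Doe
print(first)           # 1
print(collected)       # [{'a': 0}, {'a': 1}, {'a': 1}, {'a': 44}]
```

The set comprehension yields exactly the two records of the enumerated set in the text, and the list comprehension mirrors the collect expression element by element.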

1.4 How to use this manual

The TM manual is divided into seven separate chapters. The current chapter gives a limited, but general overview of the syntax of the language, which can be used to gain a quick understanding of what the language is about, and how its constructions are organized. It refrains from giving the precise syntax of all expression forms and is thus best used for obtaining some quick, though not necessarily detailed, knowledge of TM. Chapter 2 explains the concepts that were introduced in this version of TM.

To further study the language one should take a look at Chapter 3. Here, some small examples of TM specifications are given that should be enough to get the first-time TM user started. We do remark that this report does not discuss the methodological topic of how to obtain fine database specifications in TM, or any other appropriate language for that matter. Rather, we provide the user with building blocks that (s)he can put to his/her own use. Chapter 3 does, however, address the methodological topic a bit, to make the example and its construction more clear. Chapter 4 is also useful for a starting user. It describes how he/she should use the tools provided for TM.

Once the reader has reached the hands-rubbing stage, Chapters 5 and 6 may become useful, since in those chapters the full syntax and typing rules are provided. We have chosen to present the syntax in the form of diagrams for ease of reference. In Chapter 7, some theoretical issues (e.g. method inheritance) are addressed to give a clear understanding of TM's subtleties.

Chapter 2

Language overview

2.1 Introduction

Before giving a more elaborate overview of the TM language we provide the reader with a list of answers to questions posed by [ADGM90]. We repeat the questions.

1. Nature of the type system

(a) Does the language have a type system?
Yes, TM is based on the Cardelli typing discipline, and it has in addition a notion of class and one of sort. We stress that TM was meant to be a specification language, and not a database programming language, although the differences are slight.

(b) What is the purpose of the type system?
The motivation for the type system is type checking a specification at compile-time, and also data modelling, i.e. by identifying natural classes. To some extent, the type system provides a hint towards implementation, but this hasn't been elaborated upon completely.

(c) Is the language strongly typed? Is type checking static or dynamic?
The language is both strictly and statically typed; there is no notion of late binding.

(d) How is type checking used in data definition and for operations?
Type checking is used for both: in data definition to make sure that all types and classes exist and are consistent, and in operations to make sure that expressions are constituted by correctly typed subexpressions.

(e) If the previous questions do not fully cover the nature of the type system, what is missing from the description of the system?
Taking the more or less standard Cardelli type system [Card84], we have defined four basic extensions for database purposes:
  i. addition of collection types (i.e., sets, lists and bags¹) and their expressions,
  ii. an additional layer of typed logic,
  iii. an additional layer of user-defined types that we call classes or sorts, and
  iv. an approach of class-based, inheritable methods with covariance characteristics.

¹ The current manual doesn't treat bag types and bag expressions, but they will eventually be a part of the language.
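To make the answers under (c) and (d) concrete, the following Python sketch shows what it means to check statically that an expression is built from correctly typed subexpressions. It is a toy illustration only and does not reproduce the TM type checker or the typing rules of Chapter 6: the tuple encoding of expressions, the function type_of and the rule that both branches of a conditional must have the same type are assumptions made for this example.

```python
def type_of(expr, env):
    """Return the type name of a small expression tree, or raise TypeError.

    Expressions are nested tuples (a hypothetical encoding):
    ("lit", value), ("var", name), ("add", e1, e2), ("eq", e1, e2),
    ("if", cond, then_expr, else_expr).
    """
    kind = expr[0]
    if kind == "lit":
        value = expr[1]
        # bool must be tested before int, because bool is a subclass of int in Python.
        if isinstance(value, bool):
            return "bool"
        if isinstance(value, int):
            return "int"
        if isinstance(value, str):
            return "string"
        raise TypeError(f"unsupported literal: {value!r}")
    if kind == "var":
        if expr[1] not in env:
            raise TypeError(f"unknown variable: {expr[1]}")
        return env[expr[1]]
    if kind == "add":
        left, right = type_of(expr[1], env), type_of(expr[2], env)
        if left != "int" or right != "int":
            raise TypeError("'+' expects two int operands")
        return "int"
    if kind == "eq":
        left, right = type_of(expr[1], env), type_of(expr[2], env)
        if left != right:
            raise TypeError("'=' expects operands of the same type")
        return "bool"
    if kind == "if":
        if type_of(expr[1], env) != "bool":
            raise TypeError("condition must be of type bool")
        then_type, else_type = type_of(expr[2], env), type_of(expr[3], env)
        if then_type != else_type:
            raise TypeError("both branches must have the same type")
        return then_type
    raise TypeError(f"unknown expression form: {kind}")


env = {"x": "int"}
well_typed = ("if", ("eq", ("var", "x"), ("lit", 1)), ("lit", "one"), ("lit", "other"))
print(type_of(well_typed, env))   # string
```

The check never evaluates the expression: the type of the conditional is derived from the types of its parts alone, which is what makes the typing static.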


2. Expressiveness

(a) What primitive types and type constructors are available in the language?
The following primitive types are allowed: bool, int, real, string, char, nil, error and oid. There is, however, no limitation on the number of primitive types, and one can also have basic multi-media types and their operators in the language. The following type constructors are available: record (labeled cross-product), variant (labeled disjoint sum), list, set, bag, method, reference, abstract data type. References are modelled as oid-values with referential integrity constraints being defined over them. Likewise classes and sorts can be viewed as abstract data types in some sense: they have an underlying 'implementation type' that can be restricted through additional constraints. The constraints are not part of the type system in the sense that they are not evaluated by the type checker. The language is a functional language and therefore doesn't have the notion of mutable values. However, eventually what results from a database specification is the database object and its operations. The semantics of queries and transactions treat the database object as a mutable value.

(b) Are there restrictions on the combination of type constructors or the form of recursion?
Recursive classes are the only recursive structures that are allowed. Otherwise, methods are dynamic operators that cannot be combined with the static type operators: one cannot have a set of methods. Each method is associated with a class. All static type operators are orthogonally combinable.

(c) What kinds of polymorphism are supported by the type system?
There exist simple ad hoc polymorphic operators for the primitive types in the system: arithmetic operators for instance. Also, it has been a design decision to let set operators be used as much as possible also as list operators. For instance, the collect-expression takes both sets and lists in its in-clause. The language has no facility for defining ad hoc polymorphic operators. Methods show a restricted form of parametric polymorphism in the sense that their implicit self argument is implicitly typed by the type parameter selftype, which is restricted to take values (i.e., classes or sorts) that are subclasses (subsorts) of the class (sort) where the method is defined. Furthermore, additional explicit arguments and the result type of a method can be typed by a type expression that has selftype as its only type parameter.

3. Types and values

(a) What are the properties that are possessed by values of all types? Is there a concept of first class values? What are the properties that define first class values? Do all types have these properties?
First class values coincide with the values for types that have been built from the primitive types and static type operators only. Such values can all be assigned to attribute labels or variables, can be stored in the database object, and can also be passed as actual arguments to methods, or be the result of them.

(b) Are there values that can be typed that have special properties not shared by all values? What are they and how are they special? Among these might be functions, mutable values and types themselves.
As indicated above, methods behave differently from expressions of other types: they cannot, for instance, be passed as actual parameters of other methods. The same holds for types.

(c) Does the type system allow the construction of objects?
Yes it does, but the formal treatment is somewhat awkward at the present stage.

(d) Is every value an object, or do both object and non-object values exist? Are objects first class values? What is the exact distinction (if any) between objects and first class values in the language? How do types of objects fit into the type system?
The language has object and non-object values: the distinction can be made by looking at the type of a value. If its type is a class, the value is an object, otherwise a non-object. Only objects have identities. Objects are first class values, but not the only ones: each value of a primitive type is a first class value too, for instance. The type of an object is always a class, or many classes, and all classes constitute a subfamily of types.

(e) What is the semantics of the equality relation?
Non-object values are equal if, in the formal semantics, their constituents are pairwise equal. This bottoms out in an equality relation that has been postulated over the primitive types. For object values there is a notion of equality, meaning that identities and attribute values should be equal, and a notion of sameness, meaning that only identities should be equal. There is no user-defined equality.

(f) Is there a concept of null value? Are there different kinds of null values? How do they interact with the equality relation?
There is a primitive (trivial, one-element) type called nil, which can be used to construct possibly null valued expressions by giving them a binary variant type with a nil possibility. This is the only way of having null values. There is no problem with equality.

(g) Can a value have more than one type simultaneously? Can a value change its type? What are the restrictions on such mutations?
Values have a so-called minimal type; this holds for both object and non-object values. Any type that is a supertype of the minimal type is also a type of that value. Values can under certain circumstances change their minimal type, but strictly speaking the result of such a change renders another value. In fact, there are just two constructs that do this: one is the construct "e as σ", which casts an expression e to one of its supertypes σ, or in other words, the result of this expression is like e but it has minimal type σ. The other form is when a new-operator, of which there is one for each class and sort, is applied to an expression of the underlying type of the class or sort. The result is a new object or a new sort value.

4. Relationships among types

(a) What is the nature of type equivalence?
With classes and sorts name equivalence is used; with all other types we use structural equivalence.

(b) Does the language offer abstract, concrete and unnamed types?
The language offers abstract types in the form of classes, which have objects as instances, and sorts, which have sort values as instances. Typically, however, the representation type, or underlying type as we call it, is not as hidden as in programming languages: attributes of objects, for instance, can be addressed in query expressions. Hence, it is better to speak about semi-abstract types in this case. We feel that this is more natural in a specification language.


So far, the language doesn’t allow type abbreviations or concrete types, but it could easily be extended with such a feature. Unnamed types are supported throughout. (c) How are abstract types related to object types? Object types, or classes, are a specific case of abstract types. The other case are sorts. (d) Does the language offer a module mechanism and, if so, what are its features? Are modules related to abstract data types? Are modules realised through a type constructor or are they unrelated to the type system? The language offers a simple module mechanism that has no formal counterpart. Specifications are organized in modules and each module can import other modules. This is only used for textual separation of distinct parts of the specification. (e) Is there a concept of abstract type in which the term ‘abstract’ refers to the stipulation that the type is not meant to have instances, but only to allow for the more convenient definition of other types? No, such a notion is not present, although the specifier could use a class to that end in the sense that there need not be any persistent objects of the class. 5. Types and subtypes (a) Is the subtype relation defined implicitly or explicitly? In contrast to classes and sorts, ordinary types have an implicitly defined subtype relation, known as Cardelli’s subtyping rule extended to collection types. There are no subtype relations among primitive types. (b) What subtyping rules are adopted in the type system? Is there any notion of closure in subtyping? If so, are subtypes restricted to closed subtypes, or can closure be specified and enforced or verified? For the reader: B is called a closed subtype of A if any operation that accepts values of type A and produces values of type A will produce a value of type B if applied to a B type value. See our discussion above on methods. Methods are intended to give closure, but their bodies may not be well-typed in the subtype for closure characteristics. 
Closure cannot be enforced other than by changing the method body. The type checker is capable of verifying whether closure holds for a subtype.

(c) Can redefinitions be made going down the subtype hierarchy? If so, what may be redefined and under what control? For the time being the language does not allow a method to be redefined. The prime reason for this is that we are hesitant to run into disambiguation problems. It may prove that such problems are not too difficult after all, and then redefinition could be allowed.

(d) Is the subtype graph single or multiple? How are name conflicts resolved? Multiple hierarchies are allowed. Name conflicts in attributes, if they lead to type conflicts of those attributes, are not allowed: we assume a specification error.

6. Classes and subclasses

(a) Does the language have a concept of class? If so, what is it? Our language has two separate notions of class, namely class and sort. A class (sort) is a definition of a representation type, constraints over its values, and methods that apply to its values. The representation type of a class is always a record type with one special attribute id of type oid. Representation types of sorts are free, and hence sort values do not have an identity. There is no direct notion of a class (sort) ‘container,’ but see our discussion of persistence, and that of transactions in Section 2.7.

(b) How are a class and the type of its elements related? An object (sort value) has a class (sort) as its type as the typing discipline goes. Each also has a representation type as discussed above. Since there is no direct notion of class (sort) container, any object (sort value) built in a program should be inserted explicitly in a proper variable.

(c) Does the language have a distinction between subtypes and subclasses? The type, class and sort hierarchies are independent of each other, but all three are used for typing purposes. As we have no direct containers there is no implicit inclusion dependency between such containers. See again the discussion of persistence below.

(d) Is the subclass relation defined implicitly or explicitly? Both for classes and sorts it is explicitly defined by the specifier through an ISA-clause.

(e) What subclass rules are adopted in the type system? See our discussion under the same question for subtype rules.

(f) Can redefinitions be made going down the subclass hierarchy? If so, what may be redefined and under what control? No, see above.

(g) Is the subtype graph single or multiple? How are name conflicts resolved? It is multiple, and name conflicts in attributes, if they lead to type conflicts of those attributes, are not allowed: we assume a specification error.

(h) Does the language have single or multiple superclasses? Is it the case that the set of all superclasses of a class must have a maximum element? The language allows multiple superclasses, and there need not be a maximum element.

(i) How are subclasses populated? There is no direct population of classes.
There is a provision of object constructors, but these are as yet incomplete for the full capabilities required for object migration.

(j) Can objects be removed from classes and subclasses? Can objects be deleted? See above. Inclusion dependencies can be explicitly defined over persistent values but there is no automatic propagation mechanism for insertions or deletions.

(k) Are subclasses of the same superclass disjoint? Is the union of the sets of the elements of the subclasses equal to the set of the elements of the superclass? Persistent values can be defined to be disjoint if so needed.

7. Database issues

(a) Is persistence orthogonal to the type system? Yes, it is: any value, object or set, list or bag thereof can be defined as being persistent simply by defining it to be a database attribute.

(b) What kinds of database schema evolution are supported? Currently, none at run-time. The prototyping environment (see Section 4.4) is capable of handling situations where schema changes make existing data invalid. A user is offered a facility to adapt the existing data to the changed schema without having to construct the database state from scratch.

(c) Are there issues of transaction processing, concurrency control, resilience, reliability or recovery that are addressed in the type system? Transaction specification is a topic that we are currently studying. There is a separate discussion of this topic in Section 2.7. None of the other issues have been addressed yet, and some will not be either.

8. Other issues

(a) To what extent is type inference employed in the language? Except for the trivial cases of inferring the type of primitive values, there is no advanced type inference mechanism.

(b) Is there any theory supporting the type system? The language has a formal, denotational semantics. As of now, there is no complete proof theory, although work in this direction is under way. We want to obtain such a theory so that proofs over expressions in the language can be generated constructively.

(c) Are there implementation factors that are essential to the understanding of the system? We apply a special technique, at least formally, for dealing with persistence and object sharing. Besides that, implementation factors play no role.

(d) Are there issues that are not covered in the previous questions that are important to understanding the type system? After such a list? To the best of our knowledge: no.

(e) What is the status of the implementation of the system? We have developed a simple prototype generator in the logic programming language LIFE [Aït-K91, AïNa]. We are currently investigating mapping specifications in the language to a proper object platform GEODE [BHPT93] and the object-oriented programming language SPOKE [ISR91].
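The implicit subtype relation mentioned under question 5(a) can be illustrated with a small sketch. It is written in Python rather than TM, and the encoding of types (strings for primitive types, tuples for collection types, dictionaries for record types) is an assumption made for this illustration only. Treating collection types covariantly in their element type is our reading of "extended to collection types", not a statement of the formal TM rules.

```python
# Assumed encoding, for illustration only:
#   "int", "string", ...      primitive types
#   ("set", T), ("list", T)   collection types
#   {"attr": T, ...}          record types

def is_subtype(sub, sup):
    """Return True if `sub` is a structural subtype of `sup`."""
    # Primitive types: no subtype relations among them, only equality.
    if isinstance(sub, str) or isinstance(sup, str):
        return sub == sup
    # Collection types: same constructor, element type may be specialized.
    if isinstance(sub, tuple) and isinstance(sup, tuple):
        return sub[0] == sup[0] and is_subtype(sub[1], sup[1])
    # Record types: every attribute of the supertype must occur in the
    # subtype, with a subtype as its attribute type; extra attributes
    # in the subtype are allowed.
    if isinstance(sub, dict) and isinstance(sup, dict):
        return all(a in sub and is_subtype(sub[a], t) for a, t in sup.items())
    return False


address = {"street": "string", "city": "string"}
manager_address = {"street": "string", "city": "string", "phone": "string"}
```

With these hypothetical record types, `is_subtype(manager_address, address)` holds, and so does the same check on sets of these records, whereas `is_subtype("int", "real")` does not.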

2.2 Description and characterization

TM is an object-oriented and a functional specification language for database applications. It is object-oriented in the sense that it

• allows the definition of classes that incorporate attributes, methods and constraints,
• has a full type theory that encompasses subtyping,
• makes use of object identity,
• supports ISA-hierarchies amongst classes,
• supports inheritance of attributes, and
• supports a restricted form of inheritance of constraints and methods.

The language is functional in the sense that it

• is based on a lambda calculus-like language in which expressions have a denotational semantics, and
• thus expressions do not have side-effects.

Finally, the language is a specification language for database applications in the sense that

• it has no support for encapsulation principles, as we believe that the separation between specification and implementation (for which the notion of encapsulation was designed) should not be made in a language like this; in other words, a specification carries no implementational details,
• it has no support for late binding as this issue seems to be too poorly understood to address it in a formal context of software engineering,
• it has specific support for the definition of database applications like the issues of module, persistence, object sharing and transactions.

In the general literature the reader may by now find many treatises on object-oriented concepts, like [BaDK92,Meye88], and functional concepts, like [BiWa88,Davi92]. The combination of object-oriented and functional concepts is already a rarer feat, and its application in the area of database specification is even more so. Some approaches consider a type-theoretic, functional language for this applications area [Beer90,Duzi91,HuKi86,OhBB89,Ship81,StSh84]. Some specific topics in functional database specification need further mentioning and elaboration. The topics that we discuss in the upcoming sections are: persistence, object sharing, method inheritance and transactions.

2.3 Modules

A module is a view of a TM specification. It groups a number of class and sort definitions together for easier division of the specification. “On top” of those sort and class definitions, in the so-called module section, the designer can define the constraints and methods using those sorts and classes. The methods grouped within the module section are the interface to the application. The end-user of the application will use those methods to create new instances, and to manipulate and retrieve them. Only the methods defined in the module section can be used from the “outside” (i.e. by an application program using the database specified with TM). This does not preclude using class or object methods, but only within module methods, not from “the outside”. In this way, the designer can group classes and/or methods into modules in such a way that the end-user will “see” the whole application according to his/her needs.

We will explain the module concept by way of example (see Figure 2.1). Supposing we specify a database describing a company, we could distinguish different divisions of the company, and associate modules with them. In this way, we would have modules Personnel, Financial, Research, ... We would also probably have a module called Base, where classes common to the whole company would be defined. Each of the modules would contain classes, sorts and methods specific to that department, and will include other modules when other classes/sorts or methods are needed. If we, for example, define a class Person in the module Base, then the Person class will be visible in the Research module if we include the Base module in the Research module definition.

The module section defines the module attributes, which are the “own” module attributes, as well as module attributes of the included modules. In this way we are able to access the attribute names from

module Base
  class Person
    ...                       // specification of class Person
  end Person
  module section
    attributes
      PERSONS : P Person      // this attribute will hold all Persons defined
                              // in the database
end Base

module Research
  includes Base
  class Researcher ISA Person
    // specification of class Researcher
  end Researcher
  module section
    attributes
      RESEARCHERS : P Researcher
      researchStaffCount : integer
    module constraints
      m1 : researchStaffCount = count RESEARCHERS
    module methods
      ...
end Research

module WholeBusiness
  includes Base, Research
  module section
    attributes
      totalStaffCount : integer
    module constraints
      m2 : totalStaffCount = count PERSONS      // visible from Base module
      m3 : totalStaffCount = researchStaffCount + personnelStaffCount
                              // supposing we also defined module Personnel somewhere
    module methods
      ...
end WholeBusiness

Figure 2.1: Example specification with modules.


those modules (for example researchStaffCount in the WholeBusiness module). This requires then that the names (of attributes, classes and sorts) introduced by different modules must be unique. From the software management point of view, modules can be placed in different source files. It is assumed that the file containing a module definition has the same name as the module itself, only with extension “.tm”. The search for module definitions in an includes clause is therefore only a search for the corresponding file.
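The visibility and uniqueness rules for module attributes, and the file-name convention, can be sketched as follows. This is a Python illustration, not part of the TM tools; the dictionary layout and the function names `visible_attributes` and `module_file` are hypothetical. We assume here that attributes of indirectly included modules are visible as well, which the text above does not state explicitly.

```python
# Module structure taken from Figure 2.1, encoded as plain dictionaries.
modules = {
    "Base": {"includes": [], "attributes": ["PERSONS"]},
    "Research": {
        "includes": ["Base"],
        "attributes": ["RESEARCHERS", "researchStaffCount"],
    },
    "WholeBusiness": {
        "includes": ["Base", "Research"],
        "attributes": ["totalStaffCount"],
    },
}


def module_file(name):
    """File in which the definition of module `name` is looked up."""
    return name + ".tm"


def visible_attributes(name, modules):
    """Map each attribute visible in module `name` to its defining module."""
    owner = {}
    visited = set()

    def visit(mod):
        if mod in visited:
            return
        visited.add(mod)
        for attr in modules[mod]["attributes"]:
            # Names introduced by different modules must be unique.
            if attr in owner and owner[attr] != mod:
                raise ValueError(f"attribute {attr!r} defined twice")
            owner[attr] = mod
        for included in modules[mod]["includes"]:
            visit(included)

    visit(name)
    return owner


visible = visible_attributes("WholeBusiness", modules)
```

In this sketch `visible["researchStaffCount"]` is `"Research"`, mirroring the use of researchStaffCount in constraint m3 of the WholeBusiness module.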

2.4 Persistence

Persistence is the ability of objects to continue to exist across user sessions in spite of operations on them or hard- or software failures. Or in apt terminology for this user manual: persistent objects are the objects that form the database data. Earlier object-oriented data models treat persistence as an issue that is coupled to the class hierarchy, typically by having a with extension-clause connected to each class specified, like in [LéRV88], and also in the earlier versions of the TM language.

We no longer consider the issue of persistence to be one so closely linked to the class hierarchy. Rather, we want to allow an arbitrary number of persistent objects to be defined, giving the specifier all possible freedom, for instance, in having two tables (i.e., sets) of employee objects, or by having no tables at all, but other complex and persistent objects. Thus, the notion of persistence becomes an orthogonal notion with respect to the class hierarchy. This should be a clear statement to the would-be specifier. A persistent manager object is still an employee object because of typing rules, provided, of course, that Manager is a subclass of Employee. That manager object, however, need not be a persistent employee object: it could be, but it need not be. Persistence is orthogonal to the typing discipline.

A precise notion of persistence can be unambiguously put into words in the following way. As we have seen when we discussed modules, a TM-specification will in general be made up of several modules, each of which will identify some persistent objects. To be precise, each module will identify the names of the persistent objects that are relevant to that module. In this sense, persistence can be viewed as a ‘compile-time notion’: when a module is specified, we know the names of the relevant persistent objects. This first form of persistence will be called persistence by name.
There is, however, also a ‘run-time notion’ of persistence, namely those objects that are referred to by other persistent objects. A persistent manager object, for instance, could have an attribute that takes the manager’s department as value. Because the manager object is persistent, this department object is also persistent. This second notion of persistence we call persistence by reference. Each module has a general section that describes relationships between parts of the eventual database. Typically it also contains a list of attributes that will form the persistent objects of that module.
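The two notions of persistence can be pictured as a reachability computation: the objects named in the module sections are the starting points, and everything they refer to, directly or indirectly, is persistent as well. The following Python sketch illustrates this; the object names and the `references` table are hypothetical, and the function is an illustration of the idea rather than the mechanism used by the TM tools.

```python
def persistent_objects(named_roots, references):
    """Objects persistent by name, plus those persistent by reference."""
    persistent = set()
    pending = list(named_roots)          # persistence by name
    while pending:
        obj = pending.pop()
        if obj in persistent:
            continue
        persistent.add(obj)
        # Persistence by reference: follow the object's attributes.
        pending.extend(references.get(obj, ()))
    return persistent


# Which objects each object refers to through its attributes.
references = {
    "manager1": ["dept_sales"],
    "dept_sales": [],
    "employee7": ["dept_research"],
}

result = persistent_objects(["manager1"], references)
```

Here only manager1 is persistent by name; dept_sales becomes persistent by reference, while employee7 and dept_research are not persistent at all.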

2.5 Object sharing

A database typically is not just a collection of isolated data items, but rather one of data items sharing common properties and obeying common rules. Hence, the data in the database, i.e. the objects, should be able to refer to other data, possibly many times to the same item. This is known as object sharing. To give a small example, consider a manager object that identifies the department (another object) that the manager is leading. At the same time there may be many employee objects that identify the same department object as their primary department. We say that the department object is shared amongst the manager object and employee objects.

Object sharing is a natural requirement for conceptual data modelling as it allows the specifier to identify that data items may participate in relationships amongst each other. There are, however, certain serious drawbacks from a formal point of view with object sharing that we illustrate in what follows. Let us first of all concentrate on the fact that the TM language is a functional language that allows certain forms of parallelism in its expressions. These forms are not at all specific to our language, and we believe that many object-oriented languages that allow set expressions do indeed suffer from the problem. The problem has already been studied by a number of authors [ChHa80, DCBM90, HuSu89, HuYo91, LaSc93].

We give two illustrations of the problem in TM syntax. First of all consider the following expression:

  collect m[x] for x in S iff p(x),

which calculates the set of m[x] expressions for each x in S for which p(x) holds. Now consider the fact that the method m, as a side-effect, updates some global variable g in some way. It may well be that the order in which we take the values for x from S matters to the eventual result of this variable g. There is an intrinsic (and unwanted!) form of non-determinism in the expression. Now the reader may argue that TM is a functional language in which it is impossible to update a global variable as a side-effect. The reader is right up to the point where we allow object sharing, as this brings a non-functional characteristic into the language.

Another example uses the more explicit form of parallelism of our language, namely the except-construct. Consider an object e that has two attributes a and b. The expression

  e except (a = e1, b = e2)

stands for the same object e but with its attributes a and b having obtained new values in parallel. Now suppose that the original value of both a and b was an object that they were sharing, and that the e1 and e2 expressions in themselves are just local changes to that shared object; then we have another form of a (potentially) non-deterministic expression.

A full solution to the problem explicated above, one that remains within the realm of a functional specification language, is not provided here nor anywhere else in the literature. We will provide a pragmatic way of dealing with it in Section 2.7.
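The order dependence in the collect expression can be reproduced in any language with side effects. The Python sketch below is an illustration only: the function `run`, the update rule for `g` and the predicate are made up for the example. It evaluates the same comprehension over the same elements in two different orders; the collected set is the same, but the final value of the "global" variable is not.

```python
def run(order):
    """Evaluate `collect m[x] for x in S iff p(x)` taking S in `order`."""
    g = {"value": 0}                     # stands for the global variable g

    def m(x):
        # Side effect: the new value of g depends on its old value.
        g["value"] = g["value"] * 2 + x
        return x * x

    def p(x):
        return x > 0

    collected = {m(x) for x in order if p(x)}
    return collected, g["value"]


first = run([1, 2, 3])
second = run([3, 2, 1])
```

Both runs collect the set {1, 4, 9}, but the first leaves g at 11 and the second at 17, which is exactly the unwanted non-determinism described above when no evaluation order on S is prescribed.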

2.6 Method inheritance

In TM we have the possibility to inherit methods within the framework of our ISA-hierarchy; i.e. we can apply methods defined in some class C also within any subclass C’ of C. This is, however, not a trivial task, and TM is one of the few object-oriented languages that takes care of matters related to method inheritance in a (mathematically) sound manner. We first sketch why method inheritance poses some serious problems; we will then offer a solution to these problems in TM. A more general approach to problems concerning method inheritance is discussed in section 7, where we also treat in some detail the mathematical background for a better understanding of the problems related to inheritance of methods.

Some example methods

Consider the following module

Module Example

  Sort Address
    type
  end Address

  Sort ManagerAddress ISA Address
    type
  end ManagerAddress

  Class Employee
    attributes
      first_address : Address,
      second_address : Address
    object update methods
      Move1 (in na:Address) =
  end Employee

  Class Manager ISA Employee
    attributes
      first_address : ManagerAddress
  end Manager

end Example

In this module we have defined a method, called Move1, which updates an employee object by changing its former first address into a new address whose value is given by the parameter na. Since class Manager is in the ISA-relation with class Employee, this method Move1 is also applicable to manager objects. Note that manager objects have an extra field, namely the field holiday_address, in comparison to employee objects; this means that if we strictly apply the method Move1 to some manager object it results in a new employee object and not in a new manager object, because the component with field holiday_address is no longer available in the resulting record. In other words, by applying method Move1 to some manager we do not get a manager object in return, but rather just an employee object. Surely this is not the expected result of inheriting method Move1 from class Employee to class Manager. The way to remedy this situation is to rewrite method Move1 in the following manner:

  Move1 (in na:Address) = self except (first_address = na)

The except-construction used above should be read as follows: the application of Move1 to some object substituted for “self” results in a new object created from self by solely changing the value of the attribute “first_address” to the value given by “na”, and leaving the rest of the field values of the self object intact. This means that applying Move1 to some manager object now returns the same manager object but with an altered value for the field first_address. Hence, in this manner we have achieved that inheritance of method Move1 from class Employee to class Manager indeed results in the updating of manager objects.

There is still a slight problem, though, in the presented solution. The attentive reader might have noticed that there is another difference between class Employee and class Manager next to the additional field “holiday_address”: the field value for “first_address” of manager objects is a specialized version of the corresponding employee value, since it is required that for manager objects this field value is of type ManagerAddress instead of just Address. This means that strictly applying the (though improved) method Move1 to a manager object results in a new object that is not of type Manager, because the type of the first_address field is not precisely correct anymore. What we would like is a means to indicate that certain fields in the declaration of Move1 in the original class Employee are to vary in accordance with the subclass hierarchy. In our case, we would like possible specializations of the first_address field of Employee to be taken into account. This can be done by rewriting method Move1, a second time, in the following manner:

  Move1 (in na:selftype.first_address) = self except (first_address = na)

Here, selftype is meant to be a variable ranging over subclasses of Employee, and the construction selftype.first_address denotes the type corresponding to the field first_address in the specific (sub-)class substituted for selftype. Hence, we will now apply the method Move1 first to some specific subclass of Employee, for example Manager, secondly to some object of class Manager, and thirdly to some suitable value of the parameter “na” in accordance with the firstly substituted subclass; in the case of our example na is substituted by some value of type ManagerAddress (instead of just Address).

To illustrate the differences between employing and not employing selftype, consider an altered version of the class Employee below, where we have listed various methods differing slightly, but crucially, in their use of selftype:

Class Employee
  attributes
    first_address : Address,
    second_address : Address
  object update methods
    Move1 (in na:Address) =
      self except (first_address = na)
    Move2 (in na:selftype.first_address) =
      self except (first_address = na)
    DoubleMove1 (in fna:Address, sna:Address) =
      self except (first_address = fna, second_address = sna)
    DoubleMove2 (in fna:selftype.first_address, sna:Address) =
      self except (first_address = fna, second_address = sna)
    DoubleMove3 (in fna:selftype.first_address, sna:selftype.second_address) =
      self except (first_address = fna, second_address = sna)
    DoNothing (in fna:Address, sna:Address) =
      if (fna=sna) then self else self endif
end Employee

and again we will assume that we have the following definition of the class Manager:

Class Manager ISA Employee
  attributes
    first_address : ManagerAddress
end Manager

From this example one can easily see that the methods Move1 and DoubleMove1 cannot be inherited from class Employee to class Manager, whereas all the other methods can: in Manager the field first_address requires a ManagerAddress, so a parameter that is merely of type Address is no longer acceptable there. This example nicely illustrates the differences between inheritable and non-inheritable methods. We mention here that the use of the selftype construct is not limited to object methods alone; this construct can also be used to construct inheritable class methods, as well as inheritable retrieval methods. We also mention that we can do more than just perform record projections on selftype. For all type constructs available in TM we also have a corresponding version of selftype; i.e. we have selftype constructions associated with sets, lists, and variants as well. We refer the reader to section 7 for more details on matters related to the subject of method inheritance in TM.
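The three stages of Move1 discussed in this section can be mimicked in Python. The sketch below is an analogy under stated assumptions, not TM semantics: `dataclasses.replace` plays the role of the except-construct, the run-time check in `move_selftype` stands in for what the TM type checker would verify statically, and the field `phone`, the explicit `holiday_address` field and all method names are hypothetical.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class Address:
    street: str


@dataclass(frozen=True)
class ManagerAddress(Address):
    phone: str


@dataclass(frozen=True)
class Employee:
    first_address: Address
    second_address: Address

    def move_strict(self, na):
        # Builds a plain Employee record: subclass fields are dropped.
        return Employee(first_address=na, second_address=self.second_address)

    def move_except(self, na):
        # Analogue of `self except (first_address = na)`: copy self,
        # change one attribute, keep the class of self.
        return replace(self, first_address=na)

    def move_selftype(self, na):
        # Analogue of typing `na` by the first_address field of selftype:
        # the required type is looked up in the class of self.
        declared = {f.name: f.type for f in fields(self)}["first_address"]
        if not isinstance(na, declared):
            raise TypeError(f"first_address requires {declared.__name__}")
        return replace(self, first_address=na)


@dataclass(frozen=True)
class Manager(Employee):
    first_address: ManagerAddress      # specialized field type
    holiday_address: Address           # extra field of managers


home = Address("Main St 1")
mgr = Manager(ManagerAddress("Park Ave 2", "555-0100"), home, home)
new_address = ManagerAddress("Hill Rd 3", "555-0101")
```

Applied to `mgr`, `move_strict` yields an Employee, `move_except` yields a Manager but also accepts a plain Address (leaving an ill-typed manager), and `move_selftype` yields a Manager and rejects a plain Address.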


2.7 Transactions and queries

2.7.1 Transactions

The very fact that our language has a functional nature and that it allows object sharing is a heavy burden for transaction specification possibilities. The situation becomes somewhat more complex because of the two earlier identified ways of having intrinsic parallelism in expressions. The present section proposes a pragmatic way of dealing with these problems. The general problem, as far as we know, is unresolved.

As we discussed before, we assume the class hierarchy and the set of persistent variables constituting the database to be two orthogonal notions. In other words, there is no mandatory extension to each class. Hence, the choice of persistent variables is completely free to the designer. The persistent variables are interpreted as the attributes of the database, which is formally viewed as a record expression.

Now consider the general case where we know the database record type that has been defined in terms of classes, sorts and other types. Let the persistent variables, i.e. the database attributes, be identified by the names $a_1, \ldots, a_n$. The types of these attributes will typically use classes in their complex structure; some classes may not be used, while others are used many times. To model object sharing (and, as we will see, to characterize valid parallel updates) we will, for each class $C$ used in the database record type, augment that database type with an attribute $C\_ext$ of type $\mathbb{P}\,C$, to which the end-user or specifier has no direct access. This is closely akin to the explicit extensions as used in [BaBZ93]. The $C\_ext$ attributes can be seen as a formal implementation of object sharing, because in effect it will be these added database attributes that hold the actual object data, whereas the original database attributes will only hold object references in the form of oid values, besides other values. Thus, possibly shared objects rest in the $C\_ext$ database attribute for their class.

Let us illustrate this with a small example. Assume that we have two classes, $C_1$ and $C_2$, and $C_1$ is the only one with a class-typed attribute $a : C_2$. Suppose our database record type has been specified as

$$\langle\, b_1 : C_1,\; b_2 : \mathbb{P}\,C_1,\; b_3 : \mathbb{L}\,C_2 \,\rangle.$$

The specifier and end-users will always see the database just as a record of this form. However, in our formal treatment, we will be dealing with records of the form

$$\langle\, b_1 : \mathit{oid},\; b_2 : \mathbb{P}\,\mathit{oid},\; b_3 : \mathbb{L}\,\mathit{oid},\; C_1\_ext : \mathbb{P}\,\mathit{type}(C_1),\; C_2\_ext : \mathbb{P}\,\mathit{type}(C_2) \,\rangle,$$

where $\mathit{type}(C)$ stands for the underlying type of class $C$ in which references to class types have been replaced by the type $\mathit{oid}$. The intention for the $C_1\_ext$ attribute is to hold the representation of all persistent objects of class $C_1$, hence this collection consists of the $b_1$ object and all objects in $b_2$. Likewise, the $C_2\_ext$ attribute holds the representations of all elements of the list $b_3$ but also all $C_2$ objects that are referred to by objects in $C_1\_ext$. It is this set-up that ‘implements’ object sharing in a way that gives a single entry point in the database state for any arbitrary object, and thus allows us to have a strictly functional (and denotational) approach to transactions.

The pragmatic approach that we currently use for formal transaction semantics unfolds as follows. A specified transaction is translated to an operation on the extended database record form. (We speak only of database methods here: these we call transactions.) If, and only if, this translation is possible in a deterministic manner, which means that there are no concurrent updates on the same class extension, the transaction is considered a valid one. The last condition can in fact be slackened if we provide a theory that allows us to prove that all possible deterministic interpretations of a non-deterministically specified transaction have the same semantics.
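The translation from the user-level database record to the extended record can be sketched in Python. This is an illustration under assumptions, not the formal construction: the class `Obj`, the function `extend` and the attribute `name` are hypothetical, sets and lists stand for the set and list constructors, and each class extension is stored as a dictionary keyed by oid rather than as a set of records carrying an id attribute.

```python
class Obj:
    """A class-typed object: an oid, a class name and attribute values."""

    def __init__(self, oid, cls, **attrs):
        self.oid, self.cls, self.attrs = oid, cls, attrs


def extend(database):
    """Translate a user-level database record into the extended form."""
    extents = {}

    def strip(value):
        # Replace every object by its oid; its data moves to the extent.
        if isinstance(value, Obj):
            ext = extents.setdefault(value.cls + "_ext", {})
            if value.oid not in ext:
                ext[value.oid] = None    # placeholder, guards against cycles
                ext[value.oid] = {k: strip(v) for k, v in value.attrs.items()}
            return value.oid
        if isinstance(value, (set, frozenset)):
            return {strip(v) for v in value}
        if isinstance(value, list):
            return [strip(v) for v in value]
        return value

    extended = {name: strip(value) for name, value in database.items()}
    extended.update(extents)
    return extended


d = Obj("d1", "C2", name="Sales")        # a C2 object shared by x and y
x = Obj("o1", "C1", a=d)
y = Obj("o2", "C1", a=d)
extended = extend({"b1": x, "b2": {x, y}, "b3": [d]})
```

In the result, b1, b2 and b3 hold only oid values, and the shared object d1 is stored exactly once, in C2_ext, which is the single entry point referred to above.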


To determine the semantics of valid transactions, we use two object spaces, one being the database seen as a set of objects, the other initially being the database but being changed by inline object changes. The reader is referred to an upcoming technical report. To conclude, the TM user may wonder what should be the lesson of this section. It is twofold: first of all, the design toolbox will eventually have a tool that checks for non-determinism in transactions. Such specifications will not be accepted by the system. The second lesson is that the TM user can circumvent these problems by not using any intrinsically non-deterministic specification. If the illustrated forms above are not used, no problems should arise. Obviously, all these remarks are only valid for updates to the database; query evaluation will never be a problem.
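The validity condition for transactions (no concurrent updates on the same class extension) lends itself to a simple check. The Python sketch below is only a guess at the shape such a check could take; it is not the tool announced above. We assume that each branch of a parallel construct, such as the components of an except-expression, has already been summarized as the set of class extensions it updates, and the extension names used are hypothetical.

```python
def is_deterministic(parallel_branches):
    """Check that no class extension is updated by two parallel branches.

    `parallel_branches` is a list of sets; each set names the class
    extensions that one branch of a parallel construct updates.
    """
    updated = set()
    for touched in parallel_branches:
        if updated & touched:
            return False                 # two branches update one extension
        updated |= touched
    return True


# e except (a = e1, b = e2), where e1 and e2 both change a shared department:
shared_update = [{"Department_ext"}, {"Department_ext"}]
# The same construct where the branches change objects of different classes:
separate_update = [{"Department_ext"}, {"Employee_ext"}]
```

Under this summary `shared_update` is rejected and `separate_update` is accepted. The check is conservative: two branches that update different objects of the same class are rejected too, in line with the condition stated in the text.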

2.7.2 Queries in TM-QL

A database management system usually also supports ad hoc querying of the database. Since TM is a database specification language, it is typically used when there is not yet a database, and not when there is. We do, however, contemplate building a simple but direct ad hoc query facility that uses the TM language as query language. The prime reason for this is that most work that is needed for such a facility has already been done: a query is just a database retrieval expression, and we know how to type check these, and also how to translate such an expression to, for instance, a P-SPOKE environment. The main principles of use of a database query facility are thus:

• a query is just a TM-expression in which the self parameter stands for the database record,
• a query may invoke predefined methods from the database schema, as long as they are type-correct, and
• a query should be type-correct and safe, i.e., give a finite result.
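The first principle, that a query is an expression over the database record, can be pictured with a small Python analogy. The attribute PERSONS is taken from Figure 2.1; the fields `name` and `age`, the sample data and the function `older_than` are made up for the illustration and say nothing about the eventual TM-QL syntax.

```python
# The database record: one attribute, PERSONS, holding person records.
database = {
    "PERSONS": [
        {"name": "Ann", "age": 52},
        {"name": "Bob", "age": 31},
    ],
}


def older_than(self, limit):
    """A retrieval expression in which `self` stands for the database record."""
    # No side effects and a finite result: the query only reads `self`.
    return {p["name"] for p in self["PERSONS"] if p["age"] > limit}


answer = older_than(database, 40)
```

The query only reads the record it is given, so none of the update problems of Section 2.7.1 arise.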

There are some technical issues to be resolved later when we actually build the query facility. The most obvious problem to be resolved is that if we want to accommodate general recursive queries that go beyond the use of recursion in predefined methods, then the facility should in fact provide a recursive query definition mechanism. It may well be, that in such cases we want to resort to a logical query language like DTL that is currently under development in our group [BaBa].

Chapter 3

Methodology and example specification

3.1 How to use TM

Though the eventual goal when working within a TM design environment (= language + tools) is to arrive at a complete and correct specification of a database schema in terms of the TM language, it will not always be clear from the outset just how to start working towards such a specification. To help the designer get started we first of all provide the aid of an accompanying TM diagram language and a set of design heuristics (a so-called methodology). This combination of a diagrammatical representation technique with a specific methodology (the so-called 4-phase methodology) will get the designer started off in the right direction, thus enabling him to get a clear top-level view of the design he is aiming at. By using the other TM tools (such as the TM-Emacs syntax-directed editor (SDE) for entering method and constraint specifications, the TM type checker and the TM prototyping environment) the designer can further enhance his specifications with more detail and debug his design on the way. The TM diagram language is described in section 3.2, the TM tools in section 4, and the 4-phase methodology in section 3.3.

3.2 A diagram language for TM

To ease the process of writing TM specifications, a diagram language has been developed. There exists a one-to-one mapping between TM diagrams and TM specifications (structural part only, of course). Therefore, all structural elements and type constructors are present. In this section, we will explain the diagram language topic by topic.

Very similar to ER-diagrams, TM classes are represented by boxes filled with the class name, and attributes are represented by arrows going from one box to another. In TM, we also have sorts, which are represented by ovals. Subtyping between classes or between sorts is represented by double-line arrows going from the subtype to the supertype. See Figure 3.1.

To avoid messy diagrams in which many attribute arrows intersect, there is a special notation (dotted oval/box) for a sort/class you would only like to reference. There must always be exactly one normal oval/box specifying the sort/class. In this way, you can reference a sort/class directly, or via a dotted oval/box. For an example diagram, see Figure 3.2 in which you can see two equivalent TM diagrams depicting a Person/Employee/Manager hierarchy where the Manager class has one attribute ‘subordinates’ which takes a set of Employees as its accompanying value. Remember that there is always exactly one definition of each sort/class and there may be multiple references either directly (arrow points to a normal oval/box) or indirectly (arrow points to a dotted oval/box). Of course, if there is a dotted oval/box, there must always be a normal box for that sort/class.

Figure 3.1: TM diagram symbols for sorts, classes, inheritance and basic sorts.

Of course, you can also reference the basic sorts int, real, char, string, bool, nil and error. Basic sorts are represented by circles, and they can appear many times in a diagram. Therefore, there exist no dotted circles. When you need to reference a basic sort, just draw the circle once again. See Figure 3.1.

In TM, we can construct arbitrarily complex types using record, variant, list and set constructors. The last two constructors are represented by different arrow ends. A list valued attribute is represented by an arrow ending in a filled square and a set valued attribute is represented by an arrow ending in a filled circle. These arrow endings can be concatenated, constructing types like a set of lists of integers. See Figure 3.3. Record and Variant types are represented by empty circles from which attribute arrows originate. Variant types contain an arc through those attribute arrows. See Figure 3.4.

A complete overview of the diagram language can be found in Figure 3.5. In Section 3.4, you can find an example specification for which the TM diagrams are also given. The tool with which you can draw these diagrams and specify constraints and methods is described in Section 4.2.

3.3 The four-step methodology of arriving at a specification

In this section we assume that an inventory has already been made of the data relevant to the specification of our schema; i.e. we assume that we know, to a certain extent, what we want to model. This inventory (or “scratch pad”) phase, though very important, is not part of our methodology. As far as we are concerned, any scratch pad will do as long as it does the job; i.e. it should provide us with an informal (though not necessarily complete) list of data relevant to our modelling task. (Examples of such scratch pads are the initial phases of ISAC, IE, JSD, SDM, SADT, etc.) What we want to do is organize all the information contained in such an informal description of data into a TM schema, with the ultimate goal of describing the information as precisely as possible and in a form suitable for querying and further transaction processing.

Roughly, the way we work consists of the following four phases:

1. Identify the overall architecture of your specification; i.e. identify your modules

Figure 3.2: TM diagram example of direct and indirect referencing. [The figure shows two equivalent diagrams of the Person/Employee/Manager hierarchy, in which the Manager attribute subordinates references Employee once directly and once via a dotted box.]

Figure 3.3: TM diagram symbols for attribute arrows. [The figure shows the arrows for single-valued, list-valued and set-valued attributes, and an example of concatenated list/set constructors: P L int.]

Figure 3.4: TM diagram symbols for records and variants.

Figure 3.5: Overview of the TM diagram language.

2. For each module, identify your classes
3. For each class, identify your objects
4. Complete your system module with relevant constraints and methods

Each of these phases is discussed briefly below; an application of the 4-phase methodology is given in Section 3.4, where we treat an example specification. As a starting point, we assume that the scratch pad phase has resulted in the informal descriptions that can be found at the beginning of Sections 3.4.2 and 3.4.3.

1. Identify your modules

This phase consists of designing your system module and determining which separate modules will play a role inside the system module. The system module plays the role of the encompassing database where all relevant information of the organization can be found. Dividing the system module into separate modules emphasizes the top-down approach of our design methodology. In our example specification (cf. Section 3.4), we identified a system module consisting of two separate modules: an Action module (containing a database specification) and a User Interface module (containing a user-interface specification). These two modules are combined in an overall specification in which the interdependencies between the two modules are described (this is done in the system module, called the Relations Management System).

2. Identify your classes

This phase consists of identifying the classes that play a role inside each separate module, and of organizing these classes into a suitable ISA-hierarchy. In our example specification, we discerned six classes in the Action module: Action, User, Person, Company, Customer and Supplier. Furthermore, we can place these classes in the following hierarchy:

  User ISA Person
  Customer, Supplier ISA Company

Once the object types for each of these classes have been established (see phase 3), we can come back to this phase and specify relevant class constraints and class methods.

3. Identify your objects

Actually, this phase should be called “identify your object types”, since its aim is to identify, for each separate class, what the common structure of the objects inside that class looks like. This means that for each object we should identify

• its attributes
• for each attribute, its associated domaintype
Such a domaintype can have a complex structure; it can even be another class or some sort defined elsewhere in the specification. In our example specification, we have identified the following relevant attributes and associated domaintypes for the objects belonging to class Action:

  relation : Company, contact : Person, description : string, entered_by : User, action : string, sent_to : User, action_done : bool, follow_up_of : [| null, value:Action |]

Note that the majority of these attributes have other classes as their domaintype. Once these type structures have been determined, we can move on to the specification of constraints and methods that are relevant to the objects belonging to the particular class.

4. Complete your system module

After having completed phases 1–3, we can return to further enhancement of the system module. In the system module, we can now provide the link between the separately described constituent modules, possibly by adding extra attributes, constraints and methods. In our example specification, we have identified an extra attribute for the system module, namely an attribute whose value corresponds to the one company form existing in this environment. We have also described a method that black-lists a customer in both the user interface and the database.

It should be remarked that even though our methodology is called a 4-phase methodology, there is no fixed order in which the different phases are to be applied. We started from a more or less top-down perspective, beginning with the system module and gradually working our way down, but this does not necessarily have to be the case. For example, it is perfectly valid to start by fully specifying classes separately, and then to add them to the modules you think will play a role in the specification. In fact there are hardly any rules to be applied at all; all specification activities can be done in parallel, interactively, and with various stages of feedback. All our methodology offers, and no methodology can claim to do more, is a set of heuristic guidelines that help the designer find his way in a mass of raw informal data.
The actual job remains to be done by the designer, and such a job is, by nature, a creative one.

3.4 An example specification

In this section, a sample TM specification for a relations management system is given. (In the example, a relation is a human-human relation, not a database relation.) In this system, an enterprise’s relations are stored, together with the actions which users undertake with regard to these relations. The specification particularly shows how a system may be divided up into modules, and how the dependencies amongst modules may be specified. The description below consists of three modules: (1) a database specification (Action module), (2) a user interface specification (User Interface module), and (3) an overall specification in which the former two are tied together (RMS module, or Relations Management System). This architecture is shown in Figure 3.6.

The specification given in the following sections is not an attempt at completely modelling a real-life RMS. The example is merely intended to give the reader an idea of how the constructs presented in the previous chapters may be used.

3.4.1 The architecture

The Action module contains a user-interface independent specification of the RMS system. The User Interface module contains a specification independent of the Action module. The integration is performed by the RMS module of the specification, where links are specified between the first two modules.

Figure 3.6: The architecture of the RMS system. [The figure shows the Action Module and the User_Interface Module as parts of the RMS System.]

For each of the modules, a short description and a graphic representation of the structure will be given.

3.4.2 The Action module

The goal of the RMS is to professionalize the communication of the users with customers and suppliers; a tool for coordinating this communication is a necessary sub-goal.

The Action module describes the outside world of the enterprise as companies, each of which may be classified as a customer or a supplier. Suppliers have a discount rate as an extra characteristic, whereas customers have a credit limit and may be black-listed, in which case they are not to be given credit. All companies may have one or more contact persons. Internally, the enterprise consists of people, some of whom are users. All users have user accounts with passwords at least 5 characters long.

The internal world (the users) is linked to the outside world (the relations) via actions. These actions register contacts or planned contacts amongst users or between users and relations. Each action is entered by a user for a certain user, and requires a description (a context) and the action to be undertaken with regard to the specified relation.

An important problem in the enterprise is that of users giving credit to black-listed customers. This should be impossible. The Action module describes, at the object (Customer) level and at the module level, how a customer is to be black-listed. This method will be used by the RMS module.

Figure 3.7: Graphical TM specification of the Action module. [The figure shows the classes Action, Person, User, UserAccount, Company, Customer and Supplier with their attributes.]

module Action_Module

  Class Action
    attributes
      relation     : Company,
      contact      : Person,
      description  : string,
      entered_by   : User,
      action       : string,
      sent_to      : User,
      action_done  : bool,
      follow_up_of : [| null, value:Action |]
    object constraints
      act1 : contact in relation.contacts
    object update methods
      conclude_action (in action : Action) =
        if action_done = false
        then self except (action_done = true)
        else self
        endif
    class update methods
      create_action (in co : Company, con : Person, desc : string,
                     e_user : User, a_user : User, act : string, lastid : oid) =
        self union {Action (inc(lastid),
                     < relation = co, contact = con, description = desc,
                       entered_by = e_user, action = act, sent_to = a_user,
                       action_done = false >)}
      remove_action (in act : Action) =
        if not act.action_done
        then self minus act
        else self
        endif
    class retrieval methods
      what_still_to_do (in user : User, out P Action) =
        collect x for x in self
        iff (not x.action_done and x.sent_to = user)
  end Action

  Class Person
    attributes
      name       : string,
      address    : string,
      tel_number : string
    class constraints
      keyname1 : self key (name)
  end Person

  Class Company
    attributes
      name             : string,
      mail_address     : string,
      visiting_address : string,
      tel_number       : string,
      contacts         : P Person
    object constraints
      com1 : forall c in contacts | c.address = visiting_address
      com2 : contacts ≠ emptyset(Person)
    class constraints
      keyname2 : self key (name)
      keytel1  : self key (tel_number)
    object update methods
      change_tel_number (in new_number : string) =
        self except (tel_number = new_number)
    class retrieval methods
      company_with_contact (in contact : Person, out P Company) =
        collect x for x in self iff (contact in x.contacts)
  end Company

  Class Customer ISA Company
    attributes
      credit_limit : int,
      black_listed : bool
    object constraints
      cus1 : black_listed implies credit_limit = 0
    object update methods
      Cus_put_on_black_list (in cus : Customer) =
        self except (black_listed = true, credit_limit = 0)
  end Customer

  Class Supplier ISA Company
    attributes
      discount_rate : real
    object constraints
      sup1 : 0 [...] 5
  end UserAccount

  Class User ISA Person
    attributes
      account : UserAccount
  end User

  module section
    attributes
      ACTION   : P Action,
      CUSTOMER : P Customer,
      SUPPLIER : P Supplier
    module update methods
      AM_put_on_black_list (in cus : Customer) =
        self except (CUSTOMER = replace Cus_put_on_black_list[CUSTOMER](cus)
                                for cus in CUSTOMER)

end Action_Module
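To make the roles of constraints and methods in the Action module more concrete, the following Python sketch mimics a few of them. It is an illustration under stated assumptions, not a translation produced by any TM tool. The constraint cus1, the key constraint keyname2 and the retrieval method what_still_to_do follow the specification. The function am_put_on_black_list updates only the targeted customer, which is our reading of the intent of AM_put_on_black_list. Representing sent_to as a plain string is a simplification.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet


@dataclass(frozen=True)
class Customer:
    name: str
    credit_limit: int
    black_listed: bool = False

    def satisfies_cus1(self) -> bool:
        # cus1 : black_listed implies credit_limit = 0
        return (not self.black_listed) or self.credit_limit == 0

    def put_on_black_list(self) -> "Customer":
        # self except (black_listed = true, credit_limit = 0)
        return replace(self, black_listed=True, credit_limit=0)


@dataclass(frozen=True)
class Action:
    description: str
    sent_to: str  # simplification: a user name instead of a User object
    action_done: bool = False


def key_holds(customers: FrozenSet[Customer]) -> bool:
    # keyname2 : self key (name), i.e. no two objects share a name.
    names = [c.name for c in customers]
    return len(names) == len(set(names))


def what_still_to_do(actions: FrozenSet[Action], user: str) -> FrozenSet[Action]:
    # collect x for x in self iff (not x.action_done and x.sent_to = user)
    return frozenset(a for a in actions if not a.action_done and a.sent_to == user)


def am_put_on_black_list(customers: FrozenSet[Customer],
                         target: Customer) -> FrozenSet[Customer]:
    # Assumed reading: only the targeted customer is replaced.
    return frozenset(c.put_on_black_list() if c == target else c
                     for c in customers)
```

Because the objects are immutable, every update returns a new value, which mirrors the way the TM `except` expression describes the new state in terms of the old one.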

3.4.3 The User Interface module

The user interface is described in terms of generic elements such as may be found in a graphic environment. The elements used in this specification are data fields, which have a string content; list boxes, which have a set of strings as their contents; check boxes, which are either checked or unchecked; and forms, which have a state and a status message. The state is either insert (for entering new data), select (for entering a search criterion), or update (for changing data).

The specification below describes a form for entering, searching for and updating companies. These companies may be suppliers or customers. The parts of the form which are customer- or supplier-specific are enabled or disabled, depending on the kind of company. Again, a method has been specified at object (CompanyForm) and module level for black-listing a customer. This method will be used in linking the user interface with the Action module.

Figure 3.8: Graphical TM representation of the User Interface module. [The figure shows the sorts GraphicalItem, Datafield, Listbox and Checkbox and the classes Form and Company Form with their attributes.]

module User_Interface

  Sort GraphicalItem
    type < enabled : bool >
    update methods
      enable  = self except (enabled = true)
      disable = self except (enabled = false)
  end GraphicalItem

  Sort DataField ISA GraphicalItem
    type < contents : string >
  end DataField

  Sort ListBox ISA GraphicalItem
    type < contents : P string >
  end ListBox

  Sort CheckBox ISA GraphicalItem
    type < checked : bool >
  end CheckBox

  Class Form
    attributes
      form_state : [| insert, select, update |],
      message    : string
    object update methods
      set_insert_mode = self except (form_state = [| insert |])
      set_select_mode = self except (form_state = [| select |])
      set_update_mode = self except (form_state = [| update |])
  end Form

  Class CompanyForm ISA Form
    attributes
      name             : DataField,
      mail_address     : DataField,
      visiting_address : DataField,
      tel_number       : DataField,
      contacts         : ListBox,
      supplier         : CheckBox,
      credit_limit     : DataField,
      discount_rate    : DataField,
      black_listed     : CheckBox
    object constraints
      formstate_query : (form_state is select) implies (not contacts.enabled)
      supplier : supplier.checked implies
                 (discount_rate.enabled and not credit_limit.enabled
                  and not black_listed.enabled)
      customer : not supplier.checked implies
                 (credit_limit.enabled and black_listed.enabled
                  and not discount_rate.enabled)
    object update methods
      CF_put_on_black_list (in cf : CompanyForm) =
        if form_state is update and not supplier.checked
        then self except (credit_limit.contents = '0',
                          black_listed.checked = true,
                          message = 'Customer blacklisted')
        else self except (message = 'This function is not available')
        endif
  end CompanyForm

  module section
    module update methods
      UI_put_on_black_list (in cf : CompanyForm) =
        self except (COMPANYFORM = replace CF_put_on_black_list[x]
                                   for x in COMPANYFORM iff x = cf)

end User_Interface
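The three object constraints of CompanyForm can be read as simple implications over the state of the form. The Python sketch below is an illustration only: it flattens the nested sorts into boolean fields (the field names are assumptions derived from the attribute names) and reports which of the constraints formstate_query, supplier and customer a given form state violates.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FormState:
    # Flattened, hypothetical view of a CompanyForm object.
    form_state: str              # "insert", "select" or "update"
    supplier_checked: bool       # supplier.checked
    contacts_enabled: bool       # contacts.enabled
    credit_limit_enabled: bool   # credit_limit.enabled
    discount_rate_enabled: bool  # discount_rate.enabled
    black_listed_enabled: bool   # black_listed.enabled


def violated_constraints(f: FormState) -> List[str]:
    violated = []
    # formstate_query : (form_state is select) implies (not contacts.enabled)
    if f.form_state == "select" and f.contacts_enabled:
        violated.append("formstate_query")
    # supplier : supplier.checked implies (discount_rate.enabled
    #            and not credit_limit.enabled and not black_listed.enabled)
    if f.supplier_checked and not (
            f.discount_rate_enabled
            and not f.credit_limit_enabled
            and not f.black_listed_enabled):
        violated.append("supplier")
    # customer : not supplier.checked implies (credit_limit.enabled
    #            and black_listed.enabled and not discount_rate.enabled)
    if not f.supplier_checked and not (
            f.credit_limit_enabled
            and f.black_listed_enabled
            and not f.discount_rate_enabled):
        violated.append("customer")
    return violated
```

A prototype could run such a check after every update method to confirm that the method leaves the constraints invariant.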

3.4.4 The RMS module

The RMS module, which describes the link between the User Interface and Action modules, has only one attribute and one method. It includes the other two modules, thereby obtaining the attributes, classes, sorts and methods of both those modules. The one company form which exists in this environment is specified as an attribute of the system module. In the method specified here, the customer currently in the company form attribute is black-listed in both the user interface and the database.

module RMS
  includes Action_Module, User_Interface

  module section
    attributes
      cf : CompanyForm
    module update methods
      put_on_black_list =
        UI_put_on_black_list [
          AM_put_on_black_list [ self ]
            ( unique collect x for x in CUSTOMER
              iff x.name = cf.name.contents ) ] (cf)

end RMS
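The composition performed by put_on_black_list can be sketched in Python as follows. This is a simplified illustration, not generated code. It assumes that `unique collect` requires exactly one qualifying element, it omits the form_state test made by CF_put_on_black_list, and the classes carry only the attributes needed for the example.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Customer:
    name: str
    credit_limit: int
    black_listed: bool = False


@dataclass(frozen=True)
class CompanyForm:
    name: str            # stands for cf.name.contents
    credit_limit: str    # stands for cf.credit_limit.contents
    black_listed: bool = False
    message: str = ""


@dataclass(frozen=True)
class RMS:
    customers: FrozenSet[Customer]  # the CUSTOMER attribute
    cf: CompanyForm                 # the cf attribute


def unique(matches: Iterable[Customer]) -> Customer:
    # Assumed semantics of `unique collect`: exactly one element qualifies.
    items = list(matches)
    if len(items) != 1:
        raise ValueError("expected exactly one match, got %d" % len(items))
    return items[0]


def put_on_black_list(rms: RMS) -> RMS:
    target = unique(c for c in rms.customers if c.name == rms.cf.name)
    # Database side (AM_put_on_black_list in the specification).
    customers = frozenset(
        replace(c, black_listed=True, credit_limit=0) if c == target else c
        for c in rms.customers)
    # User interface side (UI_put_on_black_list), simplified.
    cf = replace(rms.cf, credit_limit="0", black_listed=True,
                 message="Customer blacklisted")
    return RMS(customers, cf)
```

The point of the sketch is that the system module is the only place that sees both the database state and the user interface state, so it is the natural place to keep the two consistent.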

3.5 Application interface

3.5.1 Introduction

When the TM compiler is invoked with option -gen spoke, it generates a text file which is the translation of the TM specification to SPOKE (see Section 4.3). (The TM-to-SPOKE code generator still operates on TM version 1.3, so some issues may be addressed that no longer exist in current TM, e.g. the notion of an extension.) The generated SPOKE program consists of:

• the prologue, where preliminary statements are placed.
• declarations of names for classes, sorts, and record, variant, set, list and extension types. These declarations are usually ISA statements. There is also a collection of other names, which may come in handy and cannot be directly classified into the above kinds. Some TM concepts are translated in a special way. For example, every variant type is translated into one class representing the variant type and a number of classes representing the types of the individual variant values. A TM class is always translated into two SPOKE classes, one representing the class itself and the other representing the underlying record type. Sets and lists are translated with an additional SPOKE class representing the element type. The inheritance links between classes and sorts are also made here.
• declarations of attributes for the above classes. The class and sort attributes are placed in the underlying types.
• declarations of constraints and methods, which are placed in appropriate locations. For example, class methods are placed in the extension type.
• other SPOKE statements. Here, various initialization statements are placed in the appropriate SPOKE objects.
• the epilogue, where for example instantiations are placed.

To run the resulting SPOKE file in the SPOKE interpreter, one has to use the TMcore library, which contains classes which represent TM concepts in SPOKE. For example, TMcore contains a class TMclass, from which every SPOKE class translated from a TM class inherits.

3.5.2 The TMcore library

The TMcore library is written in SPOKE and contains the elements that are used in the translated program. For example, if the generated SPOKE program contains the statement

  [Person ISA TMclassDynamic inherits {TMclass}]

then TMclassDynamic is a SPOKE class defined in the TMcore library. The classes defined in the TMcore library are independent of a specific TM source file. They contain, for example, methods to regenerate the source TM specification from the translated SPOKE program. The library also defines the SPOKE classes which are the roots of the various TM objects. These include TMclass, TMsort and TMset.

The methods representing TM expressions such as collect are defined in the TMcore library, in the classes they concern. For example, set methods such as union and intersect are defined in the TMset class, so every set inheriting from TMset can use them. For example, if the specification contains an anonymous set type P <a : int>, then the following classes will be defined in SPOKE:

  [TMrecord0 ISA TMrecordDynamic inherits {TMrecord}]
  [TMrecord0 HAS attribute a range TMInt]
  [TMset1 ISA TMsetDynamic inherits {TMset}]
  [TMset1.type := TMrecord0]

TMset1 inherits from TMset, where set methods are defined, so we can now call:

  [aSet1 ISA TMset1]
  [aSet1 count_over a]

The method count_over is the equivalent of the count-over expression and is defined in the TMcore library as a method in the TMset class. In general, the TMcore library contains the following elements:

• metaclasses used in the translation back to TM. Here, attributes are defined which are relevant for all classes/sorts/records, for example the attribute type, which defines the underlying record type of the class.
• classes representing TM type kinds, with appropriate attributes and methods. These include aggregate types with methods like count and sum, as well as the TMclass class with an oid attribute. All types except record types have an attribute value, which represents the underlying value. For example, TMint has this attribute with type integer.
• different kinds of methods: retrieval/update, object/class/database. In this way, a method is defined as: [myMethod ISA TMobjectRetrievalMethod range TMint form ....]
• instantiations of the above types.
• methods used in the translation back to TM. Typically, they are defined at the metaclass level. These methods generate the TM text representation from the SPOKE representation.
• methods representing different forms of TM expressions, such as arithmetic methods in TMint representing addition, but also union and sum methods in TMset and TMlist.

3.5.3 How to use the resulting SPOKE program?

To use the resulting SPOKE program, one has to:

• “use” the TMcore library ([use TMcore])
• “include” the resulting SPOKE file ([include ...])
• write SPOKE programs using the classes defined in the translated file.

For example, suppose we had defined class Person in TM, in file Person.tm, as follows:

  Class Person
    pers_name : string
  end Person

The translator invocation

  extm -gen spoke Person.tm > Person.sp

will produce the following SPOKE code in Person.sp:

  [TMrecord0 ISA TMrecordDynamic inherits {TMrecord}]
  [TMrecord0 HAS attribute pers_name range TMstring]
  [TMrecord0 instanciable]
  [Person ISA TMclassDynamic inherits {TMclass}]
  [Person.type := TMrecord0]
  [Person instanciable]

Now, we can create an object of type Person by writing:

  [use TMcore]
  [include "Person.sp"]
  ; here starts code NOT written by the compiler
  [aString0 ISA TMstring value "Jacek"]
  [aRecord0 ISA TMrecord0 pers_name aString0]
  [anOid0 ISA TMoid value 1]
  [aPerson0 ISA Person oid anOid0 value aRecord0]

Suppose we also defined a method in Person:

  object retrieval methods
    name_method (out string) = pers_name

We can now write (in SPOKE):

  [aPerson0 name_method]

and name_method will be evaluated. In this way, we can write SPOKE programs using translated classes and methods.

Chapter 4

The TM tools and how to use them

4.1 How the tools work together

All tools for TM are integrated into one environment called the Database Design Tool (DDT). Currently, there are two versions of the DDT: a public domain C++ version containing the GTI and the type checker, and a SPOKE version containing all components. In the sequel, the C++ version is explained where the GTI is concerned, as this is the newest and most widely used version, and the SPOKE version is explained for the other components. Currently, the DDT contains the following components:

• The Graphical TM Interface (GTI, see Section 4.2). With this, you can view and edit your specification using the TM diagram language described in Section 3.2 for the structural part, and a Syntax Directed Editor (SDE) for the constraints and methods.
• The Type Checker. This is an analysis tool with which you can check your specification for typing errors. Because TM is a strongly typed language, this will find almost all specification mistakes. The type checker is also used as a front-end to all the code generation (e.g. TM-to-SPOKE) that is done for TM. The type checker is explained in Section 4.3.
• The Prototyping Environment (PE, see Section 4.4). This environment generates a quick-and-dirty prototype of your specification, allows you to fill the database with some test data, and allows you to perform constraint/method evaluations and analyze the results. This tool helps you to find the more obscure mistakes that slipped through the type checker. The architecture of this tool is set up in such a way as to allow prototyping in different languages. This requires a back-end for the type checker for each language, responsible for translating the specification to the target language. Currently, only LIFE and SPOKE are supported.

In the future, we are planning to offer the following as well:

• A documentation tool.
• A safeness detector (for analyzing a specification for occurrences of inherent infinity).
• A proof tool (for analyzing which constraints methods leave invariant).

When the DDT is started (the C++ version by running ddt tool, and the SPOKE version by running run in your DDT directory), the main menu pops up (see Figure 4.1). The buttons have the following meanings:

Figure 4.1: The main menu of the DDT.

• Options: Specify preferences and system-specific parameters.
• Generate: Generate TM source from the current session. For each module, a file is generated with filename modulename.tm.
• Typecheck: Typecheck the current session.
• Help: Pop up the help pages of the tool. There are three different ways to access help (use Options to specify your choice):
  – External help with global pages. Help is displayed using a WWW browser like Mosaic or Netscape, and the pages are accessed through the Internet from the university. The advantage of this is that the most up-to-date versions of the pages are always displayed without having to update anything.
  – External help with local pages. Help is also displayed using a WWW browser, but the pages are accessed locally as they were distributed in the package.
  – Internal help. Help is displayed using a built-in browser, which is rather limited in its capabilities. The use of this browser is not recommended unless there is no WWW browser available.
• Quit: Quit the DDT.
• New: Start a new session.
• Load: Load a session from a session file. A session file is the recommended way to load/save your specifications while you are working on them, as opposed to generating TM source, because the latter has the disadvantage of requiring the specification to be type correct. A session file has extension .ddt and can be generated from TM source by using the type checker with the -gen ddt option, for example:

    tm -gen ddt Example.tm > Example_v1.ddt

• Save: Save the current session to a session file. The filename of the session file can be specified using the Options button.
• View: Start a Graph Viewer (or Class Editor, GTI, see Section 4.2) on the current session.
• Undo: Undo the previous update action. Pressing the Undo button n times will undo the previous n update actions.

4.2 The Graphical TM Interface (GTI)

4.2.1 The graph viewer

When you start a new graph viewer, a window like the one in Figure 4.2 appears. This window is used for viewing and editing TM diagrams. You can also start Class Browsers for a particular class in which you can edit constraints and methods.

Figure 4.2: A class editor (viewer) window.

The graph viewer window is composed of a menu bar and button bar at the top and a large scrollable area which contains the TM diagram. There is also an input area between the button bar and the diagram. The input area can be used to enter the names of modules, sorts, classes and attributes. At the bottom is a message area in which (error) messages are displayed. The buttons and menus in the menu and button bar have the following meanings:

• Select: Start select mode. In select mode, you can select (multiple) objects, not necessarily of the same kind (e.g. selecting a class and an attribute arrow is possible). Subsequent actions are performed on the current selection, if appropriate. If the select button is pressed and there is an existing selection, the selection is cancelled.
• Browse: If there is no selection, browse mode is entered. In browse mode the mouse cursor changes to a question mark with an arrow at the bottom, and a browser is started for everything you click, if appropriate. In this way, you can start a class browser for a sort or class and an attribute browser for an attribute (see Section 4.2.3 for a description of the browsers). If there is a current selection, browsers are started for all selected objects, if appropriate.
• Delete: If there is a current selection, all objects in that selection are deleted. If there is no current selection, delete mode is entered. In delete mode, the mouse cursor changes into a skull, and all objects you click will be deleted.
• Current Module: Each sort or class that is added is added to the module specified in this menu. You can use the menu to change to another current module.
• Add: In the add menu, you can select what you would like to add, i.e. a module, a sort or a class, and, if there is an appropriate selection, also an attribute. The name of the thing that is added has to be entered into the input area in advance. When adding an attribute, a type must be supplied subsequently by clicking on an existing sort, class or complex attribute.
• Inherit: If some sorts or classes are selected, the inherit menu can be chosen from the menu bar. From the inherit menu, you can choose whether to make the currently selected sorts or classes supertypes or subtypes of a sort or class that you select next.
• Show: From the show menu, you can choose to hide or show specific parts of the diagram. For example, you can hide all attributes, leaving the inheritance tree (see Figure 4.3). Some show operations are sensitive to a selection.

Figure 4.3: An inheritance tree.

• Recalc: In the recalc menu, you can choose to recalculate the positioning of all objects in the TM diagram. You can also choose to make this recalculation automatic, i.e. a recalculation is performed after every edit action. You can also change the orientation of the diagram.
• Focus: With focus, you can search for a particular sort, class or module. You select the sort, class or module from a selection list, after which it is selected and brought into the displayed portion of the TM diagram. There is also a quick focus on an indirect reference: when you click with the right mouse button on a dotted oval or box, the definition of this sort or class is focussed.
• Zoom: From the zoom menu, you can choose the zooming factor (1.0 is the default). See Figure 4.4 for an example of zoom factor 0.5.

Figure 4.4: An overview of a TM diagram using zooming.



• Modules: From the modules menu you can hide or show each module individually.
• Clone: The clone button creates a new graph viewer window.
• Help: The help button pops up the help pages of the tool.
• Close: The close button closes the graph viewer window.

4.2.2 Mouse and key bindings

The different mouse buttons have different meanings, and those meanings can change if specific keys are pressed. This section explains the current bindings:

• The left mouse button is used for selecting and for pressing pushbuttons.
• The middle mouse button is used for dragging portions of a TM diagram to different locations. The effect of dragging depends on which of the following keys are pressed:
  – No key: attributes are dragged along with their owners, but sub/supertypes are left in their place.
  – Shift: attributes are not dragged along.
  – Ctrl: attributes and sub/supertypes are dragged along.
  – Ctrl-Shift: if both Ctrl and Shift are pressed, attributes are not dragged along, but sub/supertypes are.
• When the right mouse button is pressed on a dotted oval/box, the corresponding sort/class definition is focussed. Pressing Shift and the right mouse button jumps back to the position before the last focus operation.

4.2.3 The browsers

The class browser window gives an overview of all information on a class (i.e. name, module, supertypes, subtypes, attributes, preambles, constraints and methods). See Figure 4.5 for an example of a class browser window. From the class browser, you can start an attribute browser by double-clicking on an attribute. By clicking on a preamble, constraint or method, you can view its definition at the bottom of the browser window. By double-clicking on it, you can edit it there. The save button makes the change permanent and the cancel button revokes the changes. In the SPOKE version of the DDT, you can start the SDE by clicking on a constraint or method.

The browser also contains a menu with which you can add supertypes, subtypes, attributes, constraints and methods. The same row of buttons is available in the browser, with the same meanings as in the graph viewer. The edit button has the same meaning as double-clicking on a preamble, constraint or method. Of course, the help and close buttons are also available.

For sorts there is a similar browser. It differs only in that sorts do not have attributes but a sort type, and that a sort does not have class constraints and class methods. With the change menu you can change the sort type using a sort type browser.

The sort type browser can be used to view and edit the underlying type of a sort. It is very similar to the attribute browser. They differ only in that a sort type browser does not have a name.

An attribute browser gives an overview of all relevant information for that attribute. You can see the name of the owner of the attribute, and you can see and edit the name and the type of the attribute. With the type constructor buttons, you can construct any TM type for the attribute. It is also possible to enter a complex type by hand by pressing the edit button first. The clear button erases the type of the attribute.

Figure 4.5: A class browser window.


4.3 The type checker

The TM type checker is implemented on a variety of hardware/software platforms; the most important platforms at the moment are SunOS and MS-DOS. The type checker consists of a single executable that accepts the filename of the TM program as its argument. For instance,

    tm example.tm

type checks the TM program in the file ‘example.tm’.

The type checker is implemented both as a flexible type checking tool and as a software tool for simple translation of TM into another language. To accomplish this, the type checker is strictly divided into a front-end and a back-end. The front-end consists of a number of passes that check the syntax of the input and build a data structure from it. After this, a number of passes check whether the data structure represents a valid TM program. At the moment 3 passes are implemented, but in the future a number of passes will be added to check the hairy parts of TM semantics. The number of passes that are executed can be influenced with the ‘-maxpass’ option of the type checker, so if you only want to check the syntax of the file example.tm, you type:

    tm -maxpass 1 example.tm

The passes must be executed in their numerical order, so pass 3 can only be executed if passes 1 and 2 are executed first.

The back-end can be viewed as a kind of switch for generating different output formats. The value of the switch can be set with the ‘-gen’ option. The default is to generate nothing, which is equal to:

    tm -gen nothing example.tm

The type checker supports the generation of a session file of the input specification. For example,

    tm -maxpass 1 -gen ddt Example.tm > Example.ddt

generates a session file from the input TM file without checking the semantics of the TM specification. The ‘-gen tm’ option regenerates the TM source from the generated data structure; for users this option is not very useful, but it is used for automating compiler testing. Recently, the ‘-gen spoke’ and the ‘-gen proto’ options have been added. The first one generates code for use in a SPOKE application (see Section 3.5). The second one generates code for the PE, which is only meant to be used internally by the DDT. Because there will be several more ‘-gen’ options in the future, the ‘-gen’ option without argument lists the generation possibilities of the implementation in use. The ‘-info’ option supplies more information about the current implementation of the type checker, so use this option if you want to know more about the latest bugs and features.

From the DDT, the type checker is run whenever a session is loaded, saved, or prototyped, and the appropriate ‘-gen’ option is selected. This all happens internally, so the database designer won’t notice a thing.
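The option handling described above can be illustrated with a small Python sketch that assembles a command line for the type checker. The helper name build_tm_command and the set KNOWN_GENERATORS are hypothetical and not part of the TM tools; only the ‘-maxpass’ and ‘-gen’ options and the generator names are taken from this section, and the sketch does not run the type checker itself.

```python
from typing import List, Optional

# Generator names mentioned in this section. A real installation may support
# others; '-gen' without an argument lists what is actually available.
KNOWN_GENERATORS = {"nothing", "ddt", "tm", "spoke", "proto"}


def build_tm_command(filename: str,
                     maxpass: Optional[int] = None,
                     gen: Optional[str] = None) -> List[str]:
    """Assemble an argument list for the 'tm' type checker (hypothetical helper)."""
    args = ["tm"]
    if maxpass is not None:
        # Passes run in numerical order, so maxpass names the last pass to run.
        if maxpass < 1:
            raise ValueError("maxpass must be at least 1")
        args += ["-maxpass", str(maxpass)]
    if gen is not None:
        if gen not in KNOWN_GENERATORS:
            raise ValueError(f"unknown generator: {gen}")
        args += ["-gen", gen]
    args.append(filename)
    return args


if __name__ == "__main__":
    # Syntax check only, then generate a session file description.
    print(" ".join(build_tm_command("Example.tm", maxpass=1, gen="ddt")))
```

The returned list could be handed to a process launcher with its output redirected to a file such as Example.ddt, mirroring the shell redirection shown above.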

4.4 The prototyping environment

When you press the DDT’s test button in the SPOKE version, a prototype of the current session is automatically generated and the prototyping environment is started. The PE initially consists of two windows: an expression list window and an expression window.

The expression list window contains separate sections for values (test databases, individual objects or any TM value), query results and subexpressions (partial query results). The boxes shown in this window are called expression boxes, because what they contain can be any TM expression with its evaluation result. Exactly one expression box is always the current one. See Section 4.4.1.

In the expression window, the current expression is shown. At the bottom of the expression window, two scrollable representations are shown: the top one shows the actual expression and the bottom one shows the evaluation result of the expression. Expressions and TM objects are represented hierarchically by nested boxes. The top part of the expression window is reserved for query construction. See Section 4.4.2.

There is also a third window type, called the edit window. With an edit window, you can construct a TM test value of any type in the current specification. The value in this window is also represented by nested boxes. See Section 4.4.3.

The recommended way of working with the PE is as follows. If you think the TM specification you specified with the GTI is correct, you can prototype it by pressing the test button in the main menu of the DDT. You then probably need to construct a test database or some test instances using the edit window. Next, you can construct queries, evaluate them and analyze the results. If you discover a wrong result and have found out what was wrong, you need to quit the PE and go back to the GTI (start a new graph viewer or use one that is still on your screen). There, you can make the appropriate changes to the specification. Then you have to start the PE again [1], where you will probably want to load your saved data from a previous PE session [2]. You can then evaluate the method you found wrong and see whether it behaves correctly now. The test cycle can be continued this way until you have a fair amount of confidence that the specification is correct.

4.4.1 The expression list window

An example of an expression list window can be found in Figure 4.6. The rows of expressions have already been explained in Section 4.4. An expression list window also contains filter buttons [3] and a button bar. The button bar contains a quit button, load and save buttons for loading and saving TM values, delete and clear buttons for managing expressions, an edit button for editing existing TM values in an edit window, and a New value menu for creating a new TM value of a specific type (the menu contains all available types).

4.4.2 The expression window

An example of an expression window can be found in Figure 4.7. As explained earlier, an expression window consists of a button bar, a query construction section, an expression section and an object section.

Let us start with the query construction section. To construct a query (a method call), you have to supply a method name, a self object for the method and perhaps some parameters. For the self object and the parameters, you can use any of the expressions in the expression list window. The evaluation result or value of the selected expression will be used as self object or parameter. It is good practice to start with the self object, because once it is set, the option menu for the method name contains only the methods applicable to the self object. Once a query is constructed, you can press the evaluate button to evaluate the method call. The result of the evaluation is added to the query list in the expression list window and is made the current expression.

Figure 4.6: An expression list window.

[1] This is absolutely necessary, because a new prototype has to be generated and this must happen before the prototyping environment is started.
[2] The implementation has been prepared to deal with schema changes that make your test data invalid, but the analysis of what mismatches exist between the current schema and the test data has not been implemented yet. This will be part of a future release.
[3] Not implemented yet. They will offer facilities for filtering long lists of expressions.

The expression section contains the currently selected expression. It is represented with a box-in-box method. Each box represents one (sub)expression. Each box can be displayed in two ways:

 

• Expanded The (sub)expression is displayed fully with all its subexpressions.
• Collapsed The (sub)expression is displayed as a small box with a brief description. No subexpression of this (sub)expression is displayed.

The boxes in the expression section can be selected with the left mouse button. The expand and collapse buttons toggle the selected box between expanded and collapsed display. There is also a short-cut: when you press the right mouse button on an expression box, that expression box is toggled between expanded and collapsed. With this expand/collapse technique, you can examine an expression as deeply as you want.

The object section contains the value of the currently selected expression box or, if no expression box is selected, the value of the current expression. It also uses the box-in-box method of displaying the object, and you can use the expand and collapse operations on it as well. The add subexpr and add subobject buttons allow you to add a selected subexpression or subobject to the subexpr or value list, respectively, in the expression list window.

4.4.3 The edit window

An example of an edit window can be found in Figure 4.8. In the edit window, you can edit partially filled templates of TM values. A template already contains the correct type structure of the TM value, so you only need to fill in the details.


Figure 4.7: An expression window.

An edit window consists of three sections: a menu bar, a fill-in section and an object section. The object section displays the TM value you are currently constructing or editing in the usual box-in-box fashion. Some of the boxes contain a string of the form ?type?, which represents a value of type type that has not been filled in yet. Any box can be selected, and appropriate fill-in methods are offered in the fill-in section (e.g. an option menu with all matching instances for a class type, or a text field widget for a string). If a change has been made in the fill-in section, the apply button makes the change permanent in the TM value. The buttons in the button bar are only visible when they are appropriate for the current selection. The buttons are:

• Cancel This button closes the edit window and throws away any changes you may have made.

• Add&Exit This button only appears when the value you are editing is completely filled in. The button closes the edit window and adds the constructed or edited value to the value expression list or, in case you were editing an instance of a database record, applies the changes to the instance in the database record.

• Edit instance This button only appears when you are editing a database record and a database instance is currently selected. The button opens a new edit window in which you can edit the instance.

• Delete instance This button also appears only if you are editing a database record and have selected a database instance. It deletes the selected instance.

• Add instance This menu only appears if you are currently editing a database record. The menu contains all class types of which you can create database instances. If you select one from the menu, an empty template structure of the instance is added to the database instances.

• Import instance This menu only appears if you are editing a database record. The menu contains all instances to be found anywhere in the PE at that moment. If you select one, the instance is added to the database instances.

• Delete element This button only appears when you have selected an element of a list or set. The button deletes the selected element from the list or set.

• Add element This button only appears when you have selected a list or set. It adds an empty template structure of the element type to the elements of the list or set.

Figure 4.8: An edit window.

Part II

The TM language definition: syntax, typing rules and theoretical issues

Chapter 5

Syntax diagrams

5.1 Preliminaries

In this section of the manual, syntax diagrams are given for the language TM. The following conventions have been used for construction of the diagrams.

• Each syntax rule has a name, given in italics to the left of the diagram.
• Terminal symbols are depicted in ovals, non-terminal symbols are depicted in boxes.
• Keywords in terminal symbols are printed in boldface.
• Non-keyword terminal symbols are informally described ‘between quotes’.
• Correct syntax is obtained by following the lines and curves, as if it were a railroad; direction is implied by the curves.
• The symbol CS is the start symbol of the syntax.

5.2 Conceptual schema

[Syntax diagrams: CS, ModuleList, ModuleSpec, IncludeList, ModuleSection, MConsMethList, MConsList, MMethList, ModuleName, ClassList, PreAmble. Terminal symbols appearing in these diagrams: module, end, includes, module section, constraints, update, retrieval, methods, let, ‘=’ and ‘,’. ModuleName is ‘any module name identifier’.]

Additional comments: A TM specification consists of a number of module definitions. A module can be seen as a view on the specification; it enables division of the specification into manageable parts. A module can include other modules. The underlying specification is then the union of all included modules (also those included by included modules). Each module must be specified in a separate file.

A module specification consists of a list of sort and class definitions and a module section. The module can contain a preamble, which describes abbreviations used in class/sort definitions and in the module section. The includes clause lists the module names (which presumably have to be defined elsewhere). The class, sort and variable names from the included module are automatically visible in the including module (they can of course be used in expressions or in defining other classes). This implies that those names have to be unique for the whole specification.

The module section contains several module attributes denoting the persistent objects of the module. They can serve many purposes, typically involving general database state information, like last change date. Module constraints have a syntax similar to that of class constraints within ordinary classes or sorts. The role of a module constraint is to state what constitutes an allowed database state. Typical examples of such constraints are forms of referential integrity.

Module attributes together form a module record, just as the class attributes of a class form a class record. The inclusion of other modules in a specific module can be compared to the ISA of classes, i.e. the actual module record will be the GLB (Greatest Lower Bound) of the module records of the included modules and the module record constructed from the module attributes. As a consequence, the self in the body of a module constraint or module method represents the module record. There is one additional restriction: the module attributes must have unique names, i.e. it is not possible to specialize on module attributes as is possible with class attributes. Abbreviations given in the ModuleSpec-preamble are local to the module, i.e. defined in the classes and sorts of the module and in the module section.
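The two rules stated above, that the underlying specification is the union of all directly and indirectly included modules and that names must be unique across it, can be sketched in Python as follows. The function names, the dictionary representation of the includes clauses and the module names in the example are assumptions made for illustration; nothing here reflects how the type checker is actually implemented.

```python
from typing import Dict, List, Set


def included_modules(module: str, includes: Dict[str, List[str]]) -> Set[str]:
    """Return the module together with everything it includes, directly or indirectly."""
    seen: Set[str] = set()
    stack = [module]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also guards against cyclic includes
        seen.add(current)
        stack.extend(includes.get(current, []))
    return seen


def duplicate_names(modules: Set[str], names: Dict[str, List[str]]) -> Set[str]:
    """Return the names that are declared by more than one module in the given set."""
    owners: Dict[str, int] = {}
    for module in modules:
        for name in set(names.get(module, [])):
            owners[name] = owners.get(name, 0) + 1
    return {name for name, count in owners.items() if count > 1}


if __name__ == "__main__":
    # Hypothetical includes clauses and declared class/sort names.
    includes = {"Main": ["Action", "UI"], "Action": ["RMS"], "UI": ["RMS"]}
    names = {"Action": ["Task"], "UI": ["Window", "Task"], "RMS": ["Resource"]}
    spec = included_modules("Main", includes)
    print(sorted(spec))
    print(sorted(duplicate_names(spec, names)))
```

In this example the name Task is declared in two modules of the same specification, which the uniqueness requirement forbids.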

5.3 Classes and sorts

[Syntax diagrams: Cl, Class, Sort, Clbegin, Sortbegin, ISAList, AttList, SortT, Clnm. Terminal symbols appearing in these diagrams: Class, Sort, ISA, attributes, type and ‘,’. Clnm is ‘any class name identifier’.]

Additional comments: The Clnm that is mentioned in Clbegin and Sortbegin has to be the same as the one in Clend. The AttList and SortT may only be omitted if one or more superclasses are specified. Sorts have the additional restriction that the Domain in the type-clause may not be a single sort name, i.e. a sort cannot have another sort as its underlying type. As a consequence, a single basic type cannot be used as the underlying type of a sort, because basic types are considered to be predefined sorts. Abbreviations given in the preamble of a class or sort specification are local to that class or sort specification.

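[Syntax diagrams: BasType, BasDom, Domain, AttDomList, VarAttDomList. BasType is one of int, real, char, string, bool, error, nil. BasDom is a BasType or a Clnm. Domain is a BasDom, a record type ⟨AttDomList⟩, a variant type [|VarAttDomList|], a list type ℒ Domain or a set type 𝒫 Domain. AttDomList and VarAttDomList are comma-separated lists of ‘L : Domain’ entries.]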

Additional comments: A domain is a basic type (seven are given; others can be added to support the specific application area), a class or sort, a record type or variant type constructed of other domains, or a list or set type of a domain. A variant expression is somewhat like a record with a single field, denoting the status of the expression. Its type, in general, has several fields, and can thus be seen as a choice type. If the domain of a label of a variant type is omitted, the domain nil is used implicitly. This makes the use of enumerated types and NULL-values based on variant types easier (e.g. [|married:Person, notmarried|]). The special error type allows one to make partial functions robust. The basic type string is actually shorthand for ℒ char. That means that operations like head, tail, at and concat also apply to strings. Note: The ASCII forms of 𝒫 and ℒ are P and L, respectively.

[Syntax diagram: TypeExpr. Terminal symbols appearing in this diagram, besides the type constructors 𝒫 and ℒ: selftype, elmt, on, the dot-operator and parentheses.]

Additional comments: Polymorphic types are needed in the declaration of the parameters of a method that needs to be inherited by subclasses of the class currently being defined. With them, you can specify how the types of the parameters vary with the type of the class during inheritance. Type expressions are much like domains, with a few extra facilities. selftype stands for the type of self, which is the class type for object methods and the set type of the class type for class methods. The dot-operator stands for the type of the attribute L of the record TypeExpr (the left operand must be a record type). Finally, the elmt (element-type) of a type (TypeExpr) is the type of the elements of TypeExpr. Note: The ASCII forms of 𝒫 and ℒ are P and L, respectively.

5.4 Constraints

[Syntax diagrams: ClConsMethList, SortConsMethList, OConsList, OMethList, CConsList, CMethList, Constraints, Clend. Terminal symbols appearing in these diagrams: object, class, constraints, update, retrieval, methods, end, ‘:’. A constraint has the form ‘L : X’, and Clend has the form ‘end Clnm’.]

Additional comments: Both constraint and method specifications are optional. In constraints for classes a distinction is made between so-called object constraints and class constraints. The first are necessarily true statements about each individual object in the database that is an instance of the class; the second are statements about the set of all objects in the database that are instances of the class. A similar distinction is made for methods. An object method is a method applicable to a single object that is an instance of that class, and a class method is applicable to a set of objects of that class. Consequently, the self object in the body of an object constraint/method will be a single object of the sort/class type. The self object in the body of a class constraint/method will be a set of single objects of the class type. The keywords object/class constraints/methods can be seen as a header that announces a section of constraints/methods of that kind. These sections may be placed in any order, apart from the fact that all constraints must precede all methods.
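The difference in what self denotes can be made concrete with a short Python sketch. It models a class extension as a list of dictionaries and constraints as predicates; the function check, the constraint names and the example data are all hypothetical and only illustrate the distinction, not the TM semantics in full.

```python
from typing import Callable, Dict, List

Obj = Dict[str, object]


def check(extension: List[Obj],
          object_constraints: Dict[str, Callable[[Obj], bool]],
          class_constraints: Dict[str, Callable[[List[Obj]], bool]]) -> List[str]:
    """Return the names of the constraints that are violated by the extension."""
    violated: List[str] = []
    for name, holds in object_constraints.items():
        # In an object constraint, self is one object at a time.
        if not all(holds(obj) for obj in extension):
            violated.append(name)
    for name, holds in class_constraints.items():
        # In a class constraint, self is the whole collection of objects.
        if not holds(extension):
            violated.append(name)
    return violated


if __name__ == "__main__":
    employees = [{"name": "ann", "age": 30}, {"name": "bob", "age": -1}]
    object_constraints = {"age_nonneg": lambda self: self["age"] >= 0}
    class_constraints = {
        "unique_names": lambda self: len({o["name"] for o in self}) == len(self)
    }
    print(check(employees, object_constraints, class_constraints))
```

A list is used instead of a Python set only because dictionaries are not hashable; the order of the objects plays no role.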

5.5 Methods

[Syntax diagrams: OUpMethList, CUpMethList, RetMethList, OUpMeth, CUpMeth, RetMeth, MVarDomList, MethArgs, Methnm. Terminal symbols appearing in these diagrams: in, out, ‘(’, ‘)’, ‘=’ and ‘,’. Methnm is ‘any method identifier’.]

Additional comments: The methods of a class are update or retrieval methods. An object update method changes the object at hand; an object retrieval method retrieves information from that object. A class update method changes a set of objects of the class at hand; a class retrieval method retrieves information from a set of objects. Typing rules: 3, 4.

[Syntax diagrams: OUpMethBody (forms a to g), OCaseList, ODefltCase, OUpExcept, OUpAbbr, OUpChoice. Terminal symbols appearing in these diagrams: self, case, of, endcase, else, except, where, if, then, endif, ‘(’, ‘)’, ‘[’, ‘]’, ‘=’, ‘:’ and ‘,’.]

Additional comments: An important property of a correct retrieval method is that the type of its result, specified in the out-clause, matches the type of the method body. The body of an object update method can take one of several forms (a–g). The basic idea of this specific diagram is to allow all sorts of changes to the object, except for its object identity. Thus, eventually the diagram bottoms out in self, possibly and usually changed on its ordinary attributes. Module methods are similar to object methods for classes as the database (state) is regarded as a record expression. In this expression the field names (attributes) are those that have been introduced as module attributes of the current module or attributes of the included modules.

5.6 Expressions

[Syntax diagram: X is one of self, Cons, Var, ( X ), Record, Variant, Choice, Abbreviation, Set, List, Iterate, Arithmetic, Aggregate, Pred, Call, Oid.]

Additional comments: The expression self is a rather special variable denoting the ‘current’ object. This means that in an object constraint it stands for an object of the class at hand, while in a class constraint it denotes the class extension. In an object method, self stands for the object of the class at hand on which the method is invoked, whereas in a class method it denotes the set of objects on which that method is invoked. The other expression forms are constants (Cons), variables (Var, typing rule: 2), parenthesized expressions to deal with operator priorities (typing rule: 6), expressions specific for records (Record) and variants (Variant), choice expressions (Choice), abbreviational expressions (Abbreviation), expressions dealing with lists (List), sets (Set), arithmetic values (Arithmetic) and aggregate functions (Aggregate), predicative expressions (boolean expressions) (Pred), and method calls (Call). Oid expressions are a peculiar kind of expression to be discussed below. All of these will be expanded in the following.

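[Syntax diagrams: Cons, intCons, realCons, charCons, stringCons, boolCons, digit. Cons is one of intCons, realCons, charCons, stringCons, boolCons, error, nil. digit is one of 0 1 2 3 4 5 6 7 8 9. intCons is a sequence of digits, optionally preceded by ‘-’. realCons is an intCons followed by ‘.’ and digits. charCons is ‘any character except single quote’. stringCons is ‘any character string without double quote’, enclosed in double quotes. boolCons is false or true.]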

Additional comments: The constants are fairly straightforward. Our error type has a single constant error. There are three conversion methods defined on int and real:

• method int on real. The method returns the integer part of a real (e.g. the result of int[1.6] is 1).
• method real on int. The method returns the value as a real (e.g. the result of real[1] is 1.0).
• method round on real. The method returns the nearest integer (e.g. the result of round[1.6] is 2).

Typing rule: 1.
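As an illustration only, the three conversion methods can be mimicked in Python. The function names tm_int, tm_real and tm_round are hypothetical. The behaviour for negative arguments and for exact halves is not stated in this manual, so truncation towards zero and rounding halves away from zero are assumptions of this sketch.

```python
import math


def tm_int(x: float) -> int:
    """Integer part of a real; truncation towards zero is an assumption for negatives."""
    return math.trunc(x)


def tm_real(n: int) -> float:
    """The integer value as a real."""
    return float(n)


def tm_round(x: float) -> int:
    """Nearest integer; halves are rounded away from zero (an assumption)."""
    magnitude = math.floor(abs(x) + 0.5)
    return magnitude if x >= 0 else -magnitude


if __name__ == "__main__":
    print(tm_int(1.6), tm_real(1), tm_round(1.6))
```

Python's built-in round is deliberately avoided here because it rounds halves to the nearest even integer, which may or may not match the TM definition.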

[Syntax diagram: Var is ‘any string’.]

Additional comments: Variables have to be ‘in scope’, i.e. they must have been introduced as variables in a preamble or where-expression. In an object constraint or method, the attributes of the class can also be used as variables that have been implicitly declared as self.attribute. The same holds for module attributes in module constraints and methods.

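[Syntax diagram: Record has three forms: (a) an explicit record ⟨L = X, ...⟩, (b) a field selection X.L, and (c) a record overwriting X except (L = X, ...).]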

Additional comments: An expression based on records is either an explicit record expression (a, typing rule: 7), a field selection, in which case X is a record expression and L is a label of X (b, typing rule: 8), or a record overwriting, in which case the first expression X is a record expression with at least the labels L used in the except-clause (c, typing rule: 10). In the latter case, the resulting expression is a record with the named fields altered. There is no side-effect implied here. An example of an expression that uses all these forms is

    (⟨a = 1, b = true⟩ except (a = 2)).a

which has as result 2.
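The side-effect-free character of record overwriting can be shown with a short Python analogue, in which a record is modelled as a dictionary. The helper record_except is hypothetical; it only mirrors the requirement stated above that every overwritten label must already be a label of the record.

```python
from typing import Any, Dict

Record = Dict[str, Any]


def record_except(record: Record, **updates: Any) -> Record:
    """Return a copy of the record with the named fields altered."""
    missing = set(updates) - set(record)
    if missing:
        raise KeyError(f"labels not in record: {sorted(missing)}")
    # A new dictionary is built, so the original record is left untouched.
    return {**record, **updates}


if __name__ == "__main__":
    r = {"a": 1, "b": True}
    s = record_except(r, a=2)
    print(s["a"])  # field selection on the overwritten record
    print(r["a"])  # the original record still holds its old value
```

Selecting field a from the overwritten record corresponds to the example expression above, while the original record keeps the value 1.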

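[Syntax diagrams: Variant, CaseList, DefltCase. Variant has three forms: (a) an explicit variant [|L = X|], where ‘= X’ may be omitted, (b) case X of CaseList endcase, and (c) X is L or X on L. CaseList is a list of ‘L = Var : X’ entries, in which ‘= Var’ may be omitted, optionally followed by a DefltCase. DefltCase has the form else X.]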

Additional comments: An expression based on a variant is either an explicit variant (similar to variant types, X may be omitted, in which case nil is implicitly used) (a, typing rule: 11), or a case statement based on the possible labels L of the first expression X, which is a variant (b, typing rules: 13, 14). In the ‘L = Var : X’ list, the labels L are labels occurring in the (variant) type of the first expression X. One may wonder how an expression of a variant type may come into being. The way to obtain such expressions is through if ... then ... else ... endif expressions, for which see Choice.

The case ... endcase expression functions as follows. The variant argument is matched against the labels (L) in the list, and the variable (Var) that is associated with the matching label is instantiated to the value in the variant argument, provided that the variable is mentioned; otherwise the value of the following expression is independent of the value associated with the case label. If the variable is present, the expression associated with the matching label is evaluated with the variable bound to the value associated with the case label, and its result is returned. There is also an optional default clause that is used if there is no matching label. The typing rule for case is a bit complicated. The main point is that the expressions in CaseList do not have to have the same type. There must, however, exist a LUB (least upper bound), which is the resulting type of the case-expression. The following expression is therefore perfectly legal and of type [|age:int, noage:nil|]:

    case [|age = 1|] of age = v:[|age = v+1|] else [|noage|] endcase

The expression X is L (c, typing rule: 56) is true if the label of X matches L; otherwise it is false. For example, the result of [|a = 1|] is b is false. The expression X on L (c, typing rule: 12) represents the value of the variant argument of X if the label of X matches L; otherwise it is undefined (i.e. an error is raised). For example, the result of [|a = 1|] on a is 1. The expressions (c) are short-hand [1] for

    case X of L:true else false endcase

and

    case X of L=v:v endcase
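The behaviour of case, is and on described above can be approximated in Python. The class Variant and the functions is_, on and case are hypothetical names; nil is modelled as None, and the sketch deliberately ignores typing, so it says nothing about the LUB requirement.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass(frozen=True)
class Variant:
    label: str
    value: Any = None  # nil is used when the value is omitted


def is_(v: Variant, label: str) -> bool:
    """True if the label of the variant matches."""
    return v.label == label


def on(v: Variant, label: str) -> Any:
    """Value of the variant if the label matches; otherwise an error is raised."""
    if v.label != label:
        raise ValueError(f"variant has label {v.label!r}, not {label!r}")
    return v.value


def case(v: Variant,
         branches: Dict[str, Callable[[Any], Any]],
         default: Optional[Callable[[], Any]] = None) -> Any:
    """Evaluate the branch for the matching label, binding the variant's value."""
    if v.label in branches:
        return branches[v.label](v.value)
    if default is None:
        raise ValueError(f"no branch for label {v.label!r}")
    return default()


if __name__ == "__main__":
    result = case(Variant("age", 1),
                  {"age": lambda v: Variant("age", v + 1)},
                  default=lambda: Variant("noage"))
    print(result)
    print(is_(Variant("a", 1), "b"), on(Variant("a", 1), "a"))
```

The first call mirrors the case-expression given above; the branch for age receives the value 1 and builds a new variant from it.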

[1] The case-equivalent of the on-operator is actually not a correct TM-expression. Therefore, the on-operator is a necessary operator, while the is-operator is only syntactic sugar.

[Syntax diagram: Choice has the form if X then X else X endif.]

Additional comments: The first expression is of type bool; the second and third expressions have equal types, such that the overall expression has a unique type. We remark here that, just as with case-expressions, the second expression may be of the form [|l1 = e1|] while the third expression is of the form [|l2 = e2|], such that the overall expression is of type [|l1 : τ1, l2 : τ2|], whenever τ1 and τ2 are the respective types of the expressions e1 and e2. For example, the result of

    if [|a = 1|] is a then [|a = 2|] else [|b = true|] endif

is [|a = 2|] :: [|a:int, b:bool|]. We remark, finally, that nested if ... then ... else ... endif expressions may even yield ‘bigger’ variant types. Typing rule: 15.

[Syntax diagram: Abbreviation has the form X where ( Var = X, ... ).]

Additional comments: The ‘X where (...)’ expression forms a convenient shorthand for an expression that is obtained by substituting the expressions associated with the named variables for those variables in the first expression. The variables are abbreviations of (usually complex) expressions. Substitution of the used abbreviations is performed simultaneously. This form of expression allows a much more local abbreviational facility than that of a preamble. An example of a where-expression is if a [...]

    [...] 2} = {3,4}
    {x:int | x [...] 2) = {1,2,4,5}
    (collect x+1 for x in {1,2,3,4} iff x>2) = {4,5}
    (nest {1,2,3,4} over x by (x>2)) = {{1,2},{3,4}}
    unnest({{1,2},{3,4}}) = {1,2,3,4}
    unique in ({1}) = 1
    (unique for x in {⟨a = 1⟩, ⟨a = 2⟩} iff x.a=2) = {⟨a = 2⟩}
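The set examples for collect, nest, unnest and unique in can be mirrored with Python sets. The reading of each operator is inferred from the examples alone, since their definitions are not part of this section; the function names nest, unnest and unique are therefore hypothetical analogues, not definitions of the TM operators.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, Set, TypeVar

T = TypeVar("T", bound=Hashable)


def nest(s: Iterable[T], key: Callable[[T], Hashable]) -> Set[FrozenSet[T]]:
    """Group the elements by the value of key; each group becomes an inner set."""
    groups: dict = {}
    for x in s:
        groups.setdefault(key(x), set()).add(x)
    return {frozenset(g) for g in groups.values()}


def unnest(sets: Iterable[Iterable[T]]) -> Set[T]:
    """Flatten a set of sets into one set."""
    return {x for inner in sets for x in inner}


def unique(s: Set[T]) -> T:
    """Return the single element of a one-element set; fail otherwise."""
    if len(s) != 1:
        raise ValueError("set does not contain exactly one element")
    (x,) = s
    return x


if __name__ == "__main__":
    # collect x+1 for x in {1,2,3,4} iff x>2 corresponds to a comprehension.
    print({x + 1 for x in {1, 2, 3, 4} if x > 2})
    print(nest({1, 2, 3, 4}, lambda x: x > 2))
    print(unnest({frozenset({1, 2}), frozenset({3, 4})}))
    print(unique({1}))
```

Inner sets are frozensets because Python sets can only contain hashable elements.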

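[Syntax diagrams: Arithmetic, SArgArith, DArgArith. Arithmetic is a SArgArith or a DArgArith. SArgArith applies one of abs, sqrt, sin, cos, tan, asin, acos, atan to a parenthesized expression ( X ). DArgArith has the form X op X, where op is one of +, -, *, /, div, mod, ^.]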

Additional comments: Arithmetic expressions are made up of expressions of type int or real. The result of ‘abs’, ‘+’, ‘-’, ‘*’ and ‘^’ is of type int only when both operands are of type int; otherwise it is of type real. The result of sqrt, sin, cos, tan, asin, acos, atan and ‘/’ is always of type real. The operands and the result of div and mod are always of type int. Typing rules: 33, 34, 35, 36, 37, 38, 39
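The result-type rules just stated fit in a small lookup function. The following Python sketch encodes them; the function result_type and the three operator sets are hypothetical names, and type names are represented as plain strings.

```python
INT_PRESERVING = {"abs", "+", "-", "*", "^"}
ALWAYS_REAL = {"sqrt", "sin", "cos", "tan", "asin", "acos", "atan", "/"}
INT_ONLY = {"div", "mod"}


def result_type(op: str, *operand_types: str) -> str:
    """Return 'int' or 'real' for an arithmetic operator applied to typed operands."""
    if not operand_types or any(t not in ("int", "real") for t in operand_types):
        raise TypeError("operands must be of type int or real")
    if op in INT_ONLY:
        # div and mod accept and produce int only.
        if any(t != "int" for t in operand_types):
            raise TypeError(f"{op} requires int operands")
        return "int"
    if op in ALWAYS_REAL:
        return "real"
    if op in INT_PRESERVING:
        # int only when every operand is int; otherwise real.
        return "int" if all(t == "int" for t in operand_types) else "real"
    raise ValueError(f"unknown operator: {op}")


if __name__ == "__main__":
    print(result_type("+", "int", "int"))
    print(result_type("+", "int", "real"))
    print(result_type("/", "int", "int"))
    print(result_type("mod", "int", "int"))
```

The sketch does not check that the number of operands fits the operator.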

Aggregate

[Syntax diagram: Aggregate ::= count X | ( sum | avg | max | min | sd ) X [ over L { . L } ]]

Additional comments: Aggregate expressions have a set or list and, optionally (except for 'count'), a sequence of labels as their input. The count operator simply counts the number of elements of the set or list. If the over-clause is present, the other operators expect a set or list of records that enable the nested field selection specified by L. ... .L. Furthermore, the associated values should be arithmetic, i.e., they should be of type real or int, such that applying an aggregate function is sensible. If the over-clause is not present, the set or list should contain simple arithmetic values. An expression 'count X' is always of type int. The result type of sum, max, and min expressions is determined by their operand. The expressions avg and sd (standard deviation) always yield real results. Examples of some forms of Aggregate expressions are:

count {1,2,3} = 3
sum [<a = 1>, <a = 2>] over a = 3
max {<a = <b = 1>>, <a = <b = 0>>} over a.b = 1

Typing rules: 40, 41, 42, 43, 44, 45, 46
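As a rough illustration of how the over-clause drives nested field selection, the following Python sketch evaluates the aggregate forms on lists of dictionaries. The encoding of TM records as dicts, the dotted-path string for the label sequence, and the use of the population form of the standard deviation are all assumptions of this sketch; the manual does not say which form of sd TM uses.

```python
import math


def select(record, labels):
    """Follow a nested field selection such as a.b through nested dicts."""
    value = record
    for label in labels:
        value = value[label]
    return value


def aggregate(op, collection, over=None):
    """Evaluate count/sum/max/min/avg/sd on a collection.

    `over` is an optional dotted label path such as "a.b" (hypothetical encoding).
    """
    items = list(collection)
    if op == "count":
        return len(items)
    if over is not None:
        labels = over.split(".")
        items = [select(item, labels) for item in items]
    if op == "sum":
        return sum(items)
    if op == "max":
        return max(items)
    if op == "min":
        return min(items)
    if op == "avg":
        return sum(items) / len(items)
    if op == "sd":
        mean = sum(items) / len(items)
        # population standard deviation (an assumption, see lead-in)
        return math.sqrt(sum((x - mean) ** 2 for x in items) / len(items))
    raise ValueError(f"unknown aggregate operator: {op}")
```

Because Python dicts are not hashable, sets of records are passed as lists here, e.g. `aggregate("max", [{"a": {"b": 1}}, {"a": {"b": 0}}], over="a.b")`.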

Litt

[Syntax diagram: Litt ::= X ( partition | spartition ) ( X { , X } ) | X key ( L { , L } ) | X RelOp X
 RelOp ::= = | ≠ | ≃ | ≄ | < | ≤ | > | ≥ | isa | in | sin | subset | ssubset | sublist | ssublist]

Additional comments: In object-oriented literature, '=' is usually called 'deep equality', and '≃' is called 'shallow equality'. This distinction only makes sense for true objects, i.e., those that have an oid value. The isa-predicate is used to compare objects of distinct types, such that one is a subtype of the other. Several operators are available in a regular and an 's'-form. The former is based on = while the latter is based on isa. For example, the subset operator determines whether every element of the set on the left-hand side is equal w.r.t. '=' to an element of the set on the right-hand side. The ssubset operator determines whether every element of the set on the left-hand side is a specialization (i.e., equal w.r.t. isa) of an element of the set on the right-hand side. As a consequence, the types of the expressions on both sides of the subset operator must be the same, while the typing rule for the ssubset operator states that the type of the expression on the left-hand side may be a subtype of that of the expression on the right-hand side. The operators have the following meanings (and the 's'-forms have the isa-variants of those meanings):

• $e \text{ partition } (e_1, e_2, \ldots, e_n) \iff (\forall i, j \bullet 1 \le i \le n \wedge 1 \le j \le n \wedge i \ne j \Rightarrow e_i \cap e_j = \emptyset) \wedge \bigcup_{i=1}^{n} e_i = e$

• $e \text{ key } (l_1, l_2, \ldots, l_n) \iff \forall x \text{ in } e\ \forall y \text{ in } e \bullet x.l_1 = y.l_1 \wedge \cdots \wedge x.l_n = y.l_n \Rightarrow x = y$

• $e_1 \text{ subset } e_2 \iff \forall x \text{ in } e_1\ \exists y \text{ in } e_2 \bullet x = y$

• $e_1 \text{ sublist } e_2 \iff \exists k\ \forall i \bullet 0 \le k \wedge 1 \le i \le length(e_1) \le length(e_2) - k \wedge (e_1 \text{ at } i) = (e_2 \text{ at } i + k)$

Note: The ASCII form of ≠ is <>, of ≃ is ~=, of ≄ is ~<>, of ≤ is <=, and of ≥ is >=.

Typing rules: 47, 48, 49, 50, 51, 52, 53, 54, 55, 57, 58, 59

Pred

[Syntax diagram: Pred ::= Litt | X | not X | X Connec X | Quant VarDomList '|' X | Quant Var in X '|' X
 Connec ::= and | or | implies | equiv
 Quant ::= forall | exists
 VarDomList ::= Var : Domain { , Var : Domain }]

Additional comments: For X to be a Pred it has to be of type bool. Predicate expressions are made up in the usual first-order way. Examples:

forall x in {1,2,3} | (x>1 or x … 1)
forall x:Pint | exists n:int | (n=length(x))

Typing rules: 60, 61, 62, 63
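The first-order reading of predicates maps naturally onto Python's `all` and `any`. As a hedged illustration, the sketch below applies that reading to the quantified definitions of partition, key, subset and sublist given for Litt earlier in this section. Records are encoded as dicts and lists are 0-based; the function names are hypothetical, and `is_sublist` reads the sublist formula as "there is an offset k at which e1 occurs contiguously in e2", which is an interpretation rather than a definitive semantics.

```python
def is_partition(e, parts):
    """e partition (e1, ..., en): the parts are pairwise disjoint and their union is e."""
    parts = [set(p) for p in parts]
    pairwise_disjoint = all(
        not (parts[i] & parts[j])
        for i in range(len(parts))
        for j in range(len(parts))
        if i != j
    )
    return pairwise_disjoint and set().union(*parts) == set(e)


def is_key(e, labels):
    """e key (l1, ..., ln): records that agree on all labels are equal."""
    items = list(e)
    return all(
        x == y
        for x in items
        for y in items
        if all(x[label] == y[label] for label in labels)
    )


def is_subset(e1, e2):
    """e1 subset e2: every x in e1 equals some y in e2."""
    return all(any(x == y for y in e2) for x in e1)


def is_sublist(e1, e2):
    """e1 sublist e2: e1 occurs contiguously in e2 at some offset k >= 0."""
    n1, n2 = len(e1), len(e2)
    return any(
        all(e1[i] == e2[i + k] for i in range(n1))
        for k in range(n2 - n1 + 1)
    )
```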

   

Call

[Syntax diagram: Call ::= Methnm [ X ] MethArgs]

Additional comments: The first argument, which is mandatory, is the actual parameter for the formal self parameter of the method. See section 2.6 for an overview of the typing issues concerning the self-object and the parameters in a method call, and chapter 7 for a more in-depth explanation of these issues.

Typing rule: 5

Oid

[Syntax diagram: Oid ::= OidProper | Oidish
 OidProper ::= lastid | X . id | inc ( OidProper )
 Oidish ::= Clnm ( OidProper , X ) | Clnm ( X ) | X as Domain]

Additional comments: Oid expressions are expressions in which object identities play a prominent role. Basically, an OidProper is an expression of type oid which can be either the special database attribute lastid (typing rule: 71) or the selected id attribute of a proper object. One may apply the oid-increment function inc (typing rule: 70) to these. An Oidish expression is comparable to new-constructs in some programming languages. To create a new object of a certain class C, one should typically use an expression like C(inc(lastid), ...) (typing rule: 65), in which ... stands for the normal record of the underlying type of C that will represent the new object. An expression of this kind can only be issued at module level, as lastid is a module attribute (that gives you the last oid issued). The use of inc(lastid) should therefore coincide with an update of this lastid attribute in the database. The first argument of the C(...) object generator need not be inc(lastid), as one can also use it to turn already existing objects into objects of class C. In that case, given an object x, the first argument will be x.id (typing rule: 9). New sort expressions are generated in a similar fashion but then, obviously, without oids playing a role (typing rule: 64). The as-constructor, as used in e as $\tau$ (typing rule: 66), allows one to 'cast' an expression e to (one of) its supertype(s) $\tau$. Obviously, the minimal type of e should be a subtype of $\tau$. Example:

self except (PERSONS = PERSONS union {p})
where (p = Person(inc(lastid), <name="John", birthdate=Date(<day=1, month=2, year=1963>)>))


5.7 Operator priorities

As in any formal language with multiple operators and mixed syntactical constructs, we have to define the way (i.e., the order) in which expressions are evaluated by the TM-machine. Below, we give an overview of the priority levels of all of TM's operators.

priority   expression form
14         all closed operators
13         ., is, on, except (...), where (...)
12         intersect
11         union, minus, concat
10         as, at, partition, spartition
9          count
8          ^
7          *, /, div, mod
6          +, -
5          =, ≠, ≃, ≄, isa, in, sin, subset, ssubset, sublist, ssublist
4          not
3          and
2          or
1          implies
0          equiv

5.8 Naming conventions

Specifications become more easily readable if the different kinds of identifiers are formed in a uniform way. It is good practice to conform to these naming conventions when writing TM specifications. The naming conventions are specified using regular expression syntax.

• Module, sort and class names : [A-Z][a-z,A-Z,0-9,_]* (e.g. Book, Library, Member, Person).
• Class and record attributes : [a-z][a-z,A-Z,0-9,_]* (e.g. salary, booksInPossession).
• Constraint labels : [a-z][a-z,0-9,_]* (e.g. unique_number, max_in_possession).
• Method names : [A-Z][a-z,A-Z,0-9,_]* (e.g. Move, ReturnItem, CalculateFee).
• Module attributes : [A-Z][A-Z,0-9,_]* (e.g. MEMBERS, BOOKS).

Note that identifiers always begin with a letter.
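The conventions can be checked mechanically. The Python sketch below is one possible rendering, under two assumptions: the commas inside the character classes of the manual are separators rather than allowed characters, and the underscore is an allowed character. The dictionary keys and the function name `follows_convention` are made up for this illustration.

```python
import re

# One compiled pattern per kind of identifier (key names are hypothetical).
NAMING_CONVENTIONS = {
    "module_sort_class_name": re.compile(r"[A-Z][a-zA-Z0-9_]*"),
    "attribute": re.compile(r"[a-z][a-zA-Z0-9_]*"),
    "constraint_label": re.compile(r"[a-z][a-z0-9_]*"),
    "method_name": re.compile(r"[A-Z][a-zA-Z0-9_]*"),
    "module_attribute": re.compile(r"[A-Z][A-Z0-9_]*"),
}


def follows_convention(kind, identifier):
    """Return True if `identifier` matches the convention for `kind` in full."""
    return NAMING_CONVENTIONS[kind].fullmatch(identifier) is not None
```

For example, `follows_convention("module_attribute", "MEMBERS")` holds, whereas `follows_convention("module_attribute", "Members")` does not.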

Chapter 6

Typing rules

6.1 Introduction

TM is a high-level specification language for object-oriented database schemas. Originally, TM was conceived of as a sugared layer on top of its formal counterpart, the language FM. The FM language consists of a typed lambda calculus employing a notion of subtyping and is based on the well-known ideas of Cardelli-Wegner [CaWe85] type theory. The typing rules as defined in FM [BaFo91,BaVr91] could be used to type check a TM specification [BaBZ91,BaBV92]. However, this would be a rather rough test, because apart from the types we also have a user-defined ISA-hierarchy in TM. In this ISA-hierarchy it is, for instance, defined which methods are defined for a certain class or sort; hence, instead of using the FM-types, classes and sorts should be considered as types. In this chapter the typing rules of TM are defined. The type rules are defined for a TM schema specification containing classes and sorts which can be used for the typing of expressions. The type rules are defined directly for TM-expressions; hence, all typing rules defined for FM are redefined for TM-expressions. If you have trouble reading all the formal material, please take a look at Section 6.2.4, which explains how you should read typing rules. If you are only roughly interested in how to apply the typing rules, it is sufficient to read from that section to the end of the chapter.

6.2 Typing rules

In this section the typing rules of TM are defined. In section 6.2.1 a few essential definitions are introduced. Then in section 6.2.2 the module concept is introduced. After that, in section 6.2.3, the typing rules for the schema part of the TM specification are defined. Then in section 6.2.5 the typing rules of TM are defined, and finally in section 6.2.6 some additional comments are offered on the typing rules.

6.2.1 Preliminaries

As already explained in the introduction, the typing rules of TM are defined on the level of TM, that is to say that we consider classes and sorts as 'types'. These classes and sorts are only known in the context of a TM specification.

Remark 6.2.1.1 In the following we assume that the TM specification consists of just one module; later on, in section 6.2.2, we explain how we deal with multiple modules.

Definition 6.2.1.2 A TM specification (or context) $\Gamma$ is a tuple $\langle \mathcal{C}, \mathcal{S}, type, ISA \rangle$, where $\mathcal{C}$ denotes the set of defined classes, $\mathcal{S}$ denotes the set of defined sorts, $type$ is a function mapping each class and sort in $\mathcal{C} \cup \mathcal{S}$ to its underlying domain, and $ISA$ denotes the closure of a user-defined ISA relation. The ISA relation is a binary relation on $(\mathcal{S} \times \mathcal{S}) \cup (\mathcal{C} \times \mathcal{C})$. We let $C$ vary over $\mathcal{C}$, and $S$ vary over $\mathcal{S}$.

For a context $\Gamma$, a set of basic sorts is distinguished. This set of basic sorts is a subset of the set of sorts $\mathcal{S}$.

Definition 6.2.1.3 For a context $\Gamma$, let $BS$, $BS \subseteq \mathcal{S}$, be a set of basic sorts. This set includes the sorts: int, real, string, bool, char, and oid.

In order to abstract from sorts and classes, a set of basic domains is introduced. This set, $BD$, contains all classes and sorts known within a context $\Gamma$.

Definition 6.2.1.4 The set of basic domains $BD$ is defined as $BD = \mathcal{C} \cup \mathcal{S}$. We let $\beta$ vary over $BD$.

Notational Convention 6.2.1.5 We abbreviate "$\langle a_1 : \sigma_1, \ldots, a_m : \sigma_m \rangle$" to "$\langle a_i : \sigma_i\ (i \in m) \rangle$". That is to say, "$(i \in m)$" is a postfix qualification, meaning "for all $i$ from 1 to $m$". The predicate "$i$ is some value between, and including, 1 and $m$" is not abbreviated to $(i \in m)$ but to $1 \le i \le m$. The abbreviation is also used in other contexts.

Now we are able to define the set of TM-types which can be used in the context $\Gamma$. These TM-types will be called Domains.

Definition 6.2.1.6 The set $D$ (of Domains) is inductively defined as follows

1. $\beta \in D$, whenever $\beta \in BD$
2. $\langle a_i : \sigma_i\ (i \in m) \rangle \in D$, whenever $a_i \in L$ and $\sigma_i \in D$ $(i \in m)$
3. $[|\, a_i : \sigma_i\ (i \in m) \,|] \in D$, whenever $a_i \in L$ and $\sigma_i \in D$ $(i \in m)$
4. $\mathbb{P}\sigma \in D$, whenever $\sigma \in D$
5. $\mathbb{L}\sigma \in D$, whenever $\sigma \in D$

We let $\sigma, \tau$ vary over $D$.

The relation between the different sets of types, in a given context $\Gamma$, is graphically displayed in figure 6.1.
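To make the inductive definition of Domains concrete, the following Python sketch encodes domains as nested tuples and checks membership in D. The tuple encoding, the function name `is_domain`, and the extra requirement that the labels of a record or variant be distinct are assumptions of this sketch and are not stated in the definition.

```python
# Domains as nested tuples (a representation assumed for this sketch):
#   "int", "Circle"                      basic domain (class or sort name)
#   ("rec", (("a", d1), ("b", d2)))      record domain   <a : d1, b : d2>
#   ("var", (("a", d1), ("b", d2)))      variant domain  [| a : d1, b : d2 |]
#   ("P", d)                             set domain   P d
#   ("L", d)                             list domain  L d

BASIC_SORTS = {"int", "real", "string", "bool", "char", "oid"}


def is_domain(d, classes, sorts):
    """Check membership of `d` in the set of Domains for the given classes and sorts."""
    basic_domains = set(classes) | set(sorts) | BASIC_SORTS
    if isinstance(d, str):
        return d in basic_domains                      # clause 1
    if not isinstance(d, tuple) or len(d) != 2:
        return False
    tag, body = d
    if tag in ("rec", "var"):                          # clauses 2 and 3
        labels = [label for label, _ in body]
        return len(labels) == len(set(labels)) and all(
            is_domain(component, classes, sorts) for _, component in body
        )
    if tag in ("P", "L"):                              # clauses 4 and 5
        return is_domain(body, classes, sorts)
    return False
```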

In definition 6.2.1.2, the context $\Gamma$ is defined; the type function, however, was not treated in any detail. Now that we have defined the Domains, we are able to discuss the type function in more detail. Informally, for a class $C \in \mathcal{C}$, $type(C)$ returns a record consisting of the attributes and their corresponding types specified in the TM specification. For a sort $S \in \mathcal{S}$, $type(S)$ returns the type specified in the TM specification.

[Figure 6.1: The types. The diagram relates the sets TM-types, Basic Sorts, Sorts, Classes, Basic Domains and Domains.]

Example 6.2.1.7

module Example
  Sort Coordinate ISA int
  end Coordinate
  Sort Circle
    type <center_x:Coordinate, center_y:Coordinate, radius:real>
  end Circle
  Sort Circles
    type PCircle
  end Circles
  Class Persistent_Circle
    attributes
      center_x : Coordinate
      center_y : Coordinate
      radius : real
  end Persistent_Circle
end Example

The context $\Gamma = \langle \mathcal{C}, \mathcal{S}, type, ISA \rangle$,
the set of classes $\mathcal{C}$ = {Persistent_Circle},
the set of sorts $\mathcal{S}$ = {Coordinate, Circle, Circles} $\cup\ BS$,
type(Coordinate) = int,
type(Circle) = <center_x:Coordinate, center_y:Coordinate, radius:real>,
type(Circles) = PCircle,
type(Persistent_Circle) = <center_x:Coordinate, center_y:Coordinate, radius:real>

In the example above, the sort Circle is different from the domain <center_x:Coordinate, center_y:Coordinate, radius:real>. Some operations defined on the underlying domain, however, are also legal operations on the sort. For instance, an expression e :: Circle can be projected on the radius attribute if the underlying domain of the sort Circle is a record having a radius attribute. In this way the type operation is used in the typing rules of TM. Other important operations are the class and sort operations. These operations can be used to convert an expression of domain $\sigma$, which is the underlying domain of a basic domain $\beta$, to the basic domain $\beta$. For instance, the expression e :: <center_x:Coordinate, center_y:Coordinate, radius:real> can be converted to a Circle by means of the Circle operation, as follows: Circle(e) :: Circle. For classes this operation is more complicated, because the class operation requires an additional argument of type oid. More formally, the axioms of the type, sort, and class operations are defined as follows.

Definition 6.2.1.8 Axioms of the type, S and C operation in context $\Gamma$

1. $type(\beta) = \beta$, whenever $\beta \in BS \setminus \{\text{string}\}$
2. $type(\text{string}) = \mathbb{L}\text{char}$
3. $\dfrac{type(S) = \sigma \quad e :: \sigma}{S(e) :: S}$, $S \in \mathcal{S}$
4. $\dfrac{type(C) = \sigma \quad w :: \text{oid} \quad e :: \sigma}{C(w, e) :: C}$, $C \in \mathcal{C}$
5. $type(\sigma) = \sigma$, whenever $\sigma \in D \setminus BD$

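In order to check the correctness of sorts (see section 6.2.3), a set $compS$ of complex sorts, ($compS \subseteq D$), is inductively defined by taking only sorts into consideration.

Definition 6.2.1.9 Let $compS \subseteq D$ denote the set of domains in context $\Gamma$ containing only sorts. The set $compS$ is inductively defined as follows

1. $\beta \in compS$, whenever $\beta \in (\mathcal{S} \cup BS) \setminus \{\text{oid}\}$
2. $\langle a_i : \sigma_i\ (i \in m) \rangle \in compS$, whenever $a_i \in L$ and $\sigma_i \in compS$ $(i \in m)$
3. $[|\, a_i : \sigma_i\ (i \in m) \,|] \in compS$, whenever $a_i \in L$ and $\sigma_i \in compS$ $(i \in m)$
4. $\mathbb{P}\sigma \in compS$, whenever $\sigma \in compS$
5. $\mathbb{L}\sigma \in compS$, whenever $\sigma \in compS$

Notational Convention 6.2.1.10 We abbreviate "a not necessarily contiguous sub-sequence" to a nncss.

In $\Gamma$, a partial ordering of basic domains is defined by means of an ISA-relation. We can now define a $\le$ relation as an extension of the ISA relation to $D \times D$. This means that the $\le$ relation takes only the sort names and class names into consideration and does not take the attributes of the corresponding basic domains into account.

Definition 6.2.1.11 The $\le$ relation on $D \times D$ is defined as follows

1. $\beta \le \beta'$, whenever $\beta\ ISA\ \beta'$
2. $\langle a_i : \sigma_i\ (i \in m) \rangle \le \langle a_{j_i} : \tau_{j_i}\ (i \in n) \rangle$, whenever $j_1, \ldots, j_n$ is a nncss of $1, \ldots, m$ and $\sigma_{j_i} \le \tau_{j_i}$ $(i \in n)$
3. $[|\, a_{j_i} : \sigma_{j_i}\ (i \in n) \,|] \le [|\, a_i : \tau_i\ (i \in m) \,|]$, whenever $j_1, \ldots, j_n$ is a nncss of $1, \ldots, m$ and $\sigma_{j_i} \le \tau_{j_i}$ $(i \in n)$
4. $\mathbb{P}\sigma \le \mathbb{P}\tau$, whenever $\sigma \le \tau$
5. $\mathbb{L}\sigma \le \mathbb{L}\tau$, whenever $\sigma \le \tau$

Having the $\le$ relation, a $\trianglelefteq$ relation can be defined, which takes the attributes of the basic domains into account.

Definition 6.2.1.12 The $\trianglelefteq$ relation on $D \times D$ is defined as follows

1. $\beta \trianglelefteq \beta'$, whenever $\beta\ ISA\ \beta'$ and $type(\beta) \le type(\beta')$
2. $\langle a_i : \sigma_i\ (i \in m) \rangle \trianglelefteq \langle a_{j_i} : \tau_{j_i}\ (i \in n) \rangle$, whenever $j_1, \ldots, j_n$ is a nncss of $1, \ldots, m$ and $\sigma_{j_i} \trianglelefteq \tau_{j_i}$ $(i \in n)$
3. $[|\, a_{j_i} : \sigma_{j_i}\ (i \in n) \,|] \trianglelefteq [|\, a_i : \tau_i\ (i \in m) \,|]$, whenever $j_1, \ldots, j_n$ is a nncss of $1, \ldots, m$ and $\sigma_{j_i} \trianglelefteq \tau_{j_i}$ $(i \in n)$
4. $\mathbb{P}\sigma \trianglelefteq \mathbb{P}\tau$, whenever $\sigma \trianglelefteq \tau$
5. $\mathbb{L}\sigma \trianglelefteq \mathbb{L}\tau$, whenever $\sigma \trianglelefteq \tau$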
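The two subtyping relations defined above can be sketched as one recursive Python function. The sketch rests on several assumptions: domains are encoded as nested tuples (a name for a basic domain, `("rec", fields)`, `("var", fields)`, `("P", d)`, `("L", d)`), the ISA closure is taken to be reflexive and transitive, and the type function is given as a dict that maps a basic domain to its underlying domain and defaults to the domain itself. All function names are hypothetical.

```python
def isa_closure(pairs, names):
    """Reflexive-transitive closure of the user-declared ISA pairs (assumed reading)."""
    closure = {(n, n) for n in names} | set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def match_subsequence(small, big):
    """Pair the fields of `small` with equally labelled fields of `big`, in order.

    Returns a list of (domain_in_small, domain_in_big), or None if the labels
    of `small` are not a (not necessarily contiguous) sub-sequence of `big`.
    """
    pairs, pos = [], 0
    for label, dom in small:
        while pos < len(big) and big[pos][0] != label:
            pos += 1
        if pos == len(big):
            return None
        pairs.append((dom, big[pos][1]))
        pos += 1
    return pairs


def leq(s, t, isa, type_of=None):
    """The name-based relation when type_of is None, the attribute-aware one otherwise."""
    if isinstance(s, str) and isinstance(t, str):
        if (s, t) not in isa:
            return False
        if type_of is None:
            return True
        # attribute-aware clause 1: additionally compare the underlying domains
        return leq(type_of.get(s, s), type_of.get(t, t), isa)
    if isinstance(s, str) or isinstance(t, str) or s[0] != t[0]:
        return False
    tag = s[0]
    if tag in ("P", "L"):
        return leq(s[1], t[1], isa, type_of)
    if tag == "rec":   # the supertype keeps a sub-sequence of the subtype's fields
        pairs = match_subsequence(t[1], s[1])
        return pairs is not None and all(leq(ds, dt, isa, type_of) for dt, ds in pairs)
    if tag == "var":   # the subtype keeps a sub-sequence of the supertype's fields
        pairs = match_subsequence(s[1], t[1])
        return pairs is not None and all(leq(ds, dt, isa, type_of) for ds, dt in pairs)
    return False
```

Note the opposite direction for records and variants: a record subtype may have more fields, a variant subtype fewer.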

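Analogous to the LUB (Least Upper Bound) and GLB (Greatest Lower Bound) operations defined in FM [BaFo91], the GLB and LUB operations are defined for the Domains.

Definition 6.2.1.13 The TM LUB ($\bigtriangleup$) operation and the TM GLB ($\bigtriangledown$) operation, $\bigtriangleup, \bigtriangledown \in D \times D \hookrightarrow D$, are defined as follows

1. For $\beta_1, \beta_2$ we define:

• $\beta_1 \bigtriangleup \beta_2 = \beta_3$ whenever $\beta_1 \trianglelefteq \beta_3$ and $\beta_2 \trianglelefteq \beta_3$ and $\forall \beta' \in BD\ ((\beta_1 \trianglelefteq \beta' \wedge \beta_2 \trianglelefteq \beta') \Rightarrow \beta_3 \trianglelefteq \beta')$, for $\beta_1, \beta_2, \beta_3 \in BD$. Otherwise $\beta_1 \bigtriangleup \beta_2$ does not exist.

• $\beta_1 \bigtriangledown \beta_2 = \beta_3$ whenever $\beta_3 \trianglelefteq \beta_1$ and $\beta_3 \trianglelefteq \beta_2$ and $\forall \beta' \in BD\ ((\beta' \trianglelefteq \beta_1 \wedge \beta' \trianglelefteq \beta_2) \Rightarrow \beta' \trianglelefteq \beta_3)$, for $\beta_1, \beta_2, \beta_3 \in BD$. Otherwise $\beta_1 \bigtriangledown \beta_2$ does not exist.

2. For $\sigma = \langle a_i : \sigma_i\ (i \in m) \rangle$, $\tau = \langle b_j : \tau_j\ (j \in n) \rangle$ we define $\sigma \bigtriangleup \tau$, $\sigma \bigtriangledown \tau$ as follows. Let $c_1, \ldots, c_p$ be the ordered sequence of labels (of minimal length) containing exactly all $a_i$ $(i \in m)$ and $b_j$ $(j \in n)$. Furthermore, let $d_1, \ldots, d_q$ be the nncss (of maximal length) of $c_1, \ldots, c_p$ that is a subsequence of both $a_1, \ldots, a_m$ and $b_1, \ldots, b_n$. Then

$\sigma \bigtriangleup \tau = \langle d_l : \rho'_l\ (l \in q) \rangle$, where $\rho'_l = \sigma_i \bigtriangleup \tau_j$, with $i, j$ such that $a_i = d_l = b_j$

$\sigma \bigtriangledown \tau = \langle c_k : \rho'_k\ (k \in p) \rangle$, where
$$\rho'_k = \begin{cases} \sigma_i & \text{if } c_k = a_i \notin \{b_1, \ldots, b_n\} \\ \sigma_i \bigtriangledown \tau_j & \text{if } a_i = c_k = b_j \\ \tau_j & \text{if } c_k = b_j \notin \{a_1, \ldots, a_m\} \end{cases}$$

where it is assumed that all $\sigma_i \bigtriangleup \tau_j$ and $\sigma_i \bigtriangledown \tau_j$ occurring in the formulas above exist.

3. For $\sigma = [|\, a_i : \sigma_i\ (i \in m) \,|]$, $\tau = [|\, b_j : \tau_j\ (j \in n) \,|]$ we define $\sigma \bigtriangleup \tau$, $\sigma \bigtriangledown \tau$ as follows. Let $c_1, \ldots, c_p$ and $d_1, \ldots, d_q$ be the ordered sequences of labels as constructed above, then

$\sigma \bigtriangleup \tau = [|\, c_k : \rho'_k\ (k \in p) \,|]$, where
$$\rho'_k = \begin{cases} \sigma_i & \text{if } c_k = a_i \notin \{b_1, \ldots, b_n\} \\ \sigma_i \bigtriangleup \tau_j & \text{if } a_i = c_k = b_j \\ \tau_j & \text{if } c_k = b_j \notin \{a_1, \ldots, a_m\} \end{cases}$$

$\sigma \bigtriangledown \tau = [|\, d_l : \rho'_l\ (l \in q) \,|]$, where $\rho'_l = \sigma_i \bigtriangledown \tau_j$, with $i, j$ such that $a_i = d_l = b_j$

where it is assumed that all $\sigma_i \bigtriangleup \tau_j$ and $\sigma_i \bigtriangledown \tau_j$ occurring in the formulas above exist.

4. For $\mathbb{P}\sigma_1, \mathbb{P}\sigma_2$ we define:

• $\mathbb{P}\sigma_1 \bigtriangleup \mathbb{P}\sigma_2 = \mathbb{P}\tau$ whenever $\sigma_1 \bigtriangleup \sigma_2$ does exist and $\sigma_1 \bigtriangleup \sigma_2 = \tau$. Otherwise $\mathbb{P}\sigma_1 \bigtriangleup \mathbb{P}\sigma_2$ does not exist.

• $\mathbb{P}\sigma_1 \bigtriangledown \mathbb{P}\sigma_2 = \mathbb{P}\tau$ whenever $\sigma_1 \bigtriangledown \sigma_2$ does exist and $\sigma_1 \bigtriangledown \sigma_2 = \tau$. Otherwise $\mathbb{P}\sigma_1 \bigtriangledown \mathbb{P}\sigma_2$ does not exist.

5. For $\mathbb{L}\sigma_1, \mathbb{L}\sigma_2$ we define:

• $\mathbb{L}\sigma_1 \bigtriangleup \mathbb{L}\sigma_2 = \mathbb{L}\tau$ whenever $\sigma_1 \bigtriangleup \sigma_2$ does exist and $\sigma_1 \bigtriangleup \sigma_2 = \tau$. Otherwise $\mathbb{L}\sigma_1 \bigtriangleup \mathbb{L}\sigma_2$ does not exist.

• $\mathbb{L}\sigma_1 \bigtriangledown \mathbb{L}\sigma_2 = \mathbb{L}\tau$ whenever $\sigma_1 \bigtriangledown \sigma_2$ does exist and $\sigma_1 \bigtriangledown \sigma_2 = \tau$. Otherwise $\mathbb{L}\sigma_1 \bigtriangledown \mathbb{L}\sigma_2$ does not exist.

Remark 6.2.1.14 Analogous to the $\sqcap$ and the $\sqcup$ operation in FM, the $\bigtriangleup$ and the $\bigtriangledown$ operation are associative if they exist.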
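A simplified Python sketch of the LUB operation may help in reading the definition; the GLB is dual. The sketch makes several assumptions that go beyond the text: domains are nested tuples (a name, `("rec", fields)`, `("var", fields)`, `("P", d)`, `("L", d)`), `isa` is a reflexive-transitive set of pairs, basic domains are compared through ISA only (the underlying types are not inspected), common record labels are assumed to occur in the same relative order in both operands, and the merged label sequence of a variant lists the labels of the first operand followed by the new labels of the second.

```python
def lub(s, t, isa):
    """Least upper bound of two domains, or None if it does not exist."""
    if isinstance(s, str) and isinstance(t, str):
        uppers = {c for (a, c) in isa if a == s} & {c for (a, c) in isa if a == t}
        least = [u for u in uppers if all((u, v) in isa for v in uppers)]
        return least[0] if len(least) == 1 else None
    if isinstance(s, str) or isinstance(t, str) or s[0] != t[0]:
        return None
    tag = s[0]
    if tag in ("P", "L"):
        inner = lub(s[1], t[1], isa)
        return None if inner is None else (tag, inner)
    fs, ft = dict(s[1]), dict(t[1])
    if tag == "rec":
        # keep only the labels common to both records
        labels = [label for label, _ in s[1] if label in ft]
    elif tag == "var":
        # merge the labels of both variants
        labels = [label for label, _ in s[1]] + [label for label, _ in t[1] if label not in fs]
    else:
        return None
    fields = []
    for label in labels:
        if label in fs and label in ft:
            component = lub(fs[label], ft[label], isa)
            if component is None:
                return None
        else:
            component = fs.get(label, ft.get(label))
        fields.append((label, component))
    return (tag, tuple(fields))
```

With this sketch, the branch types [| a : int |] and [| b : bool |] of an if-then-else expression combine to [| a : int, b : bool |], in line with the 'bigger' variant types mentioned for if-expressions in chapter 5.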

Finally, a projection operation on domains of a record structure is defined. This operation is required for the typing rules concerning aggregate operations over sets of records (see 41, 42, 43).

Definition 6.2.1.15 For each domain $\sigma \in D$, where $type(\sigma) = \langle a_i : \sigma_i\ (i \in m) \rangle$, a projection operation $\sigma.$ is inductively defined as

1. $\sigma.a_j = \sigma_j$, whenever $type(\sigma) = \langle a_i : \sigma_i\ (i \in m) \rangle$ and $1 \le j \le m$
2. $\sigma.a_1.\ldots.a_n = \tau_k$, whenever $type(\sigma.a_1.\ldots.a_{n-1}) = \langle b_i : \tau_i\ (i \in m) \rangle$, $b_k = a_n$ and $1 \le k \le m$

6.2.2 Modules

In the previous section it was assumed that the whole specification was specified in one module. In TM, however, it is possible to specify multiple modules. This means that for each module $M$ we have a context $\Gamma$, and all definitions are related to this context.

Definition 6.2.2.1 The context of a module $M$ is denoted by $\Gamma_M$. $\Gamma_M$ is a tuple $\langle \mathcal{C}, \mathcal{S}, type, ISA \rangle$, where $\mathcal{C}$ denotes the set of classes defined in $M$ (or defined in the modules used by $M$), $\mathcal{S}$ denotes the set of sorts defined in $M$ (or defined in the modules used by $M$), $type$ is a function mapping each class and sort in $\mathcal{C} \cup \mathcal{S}$ to its underlying domain, and $ISA$ denotes the closure of a user-defined ISA-relation. The ISA relation is a binary relation on $(\mathcal{S} \times \mathcal{S}) \cup (\mathcal{C} \times \mathcal{C})$.

Remark 6.2.2.2 It should be noticed that it is essential that all names occurring in a context $\Gamma$ are unique.

6.2.3 Typing rules for the schema part of TM

Before checking the expressions, the schema definition should be checked first. Firstly, for all classes defined in context $\Gamma_M$ and belonging to a module $M$, the domains used in the attribute definition should belong to $D_M$. Secondly, for each sort defined in context $\Gamma_M$, the sorts used in the type definition should belong to $BS \cup (compS_M \setminus \mathcal{S})$, since we shall not support recursive sorts. Finally, for the ISA-relation it should hold that its arguments are in a $\trianglelefteq$ relation.

Definition 6.2.3.1 Typing rules for the TM schema definition. For every module $M$ having context $\Gamma$, the following rules should hold

1. $\forall C \in \mathcal{C} : type(C) = \langle a_i : \sigma_i\ (i \in m) \rangle \Rightarrow \sigma_i \in D\ (i \in m)$
2. $\forall S \in \mathcal{S} : type(S) \in BS \cup (compS \setminus \mathcal{S})$
3. $\forall \beta, \beta' \in BD : \beta\ ISA\ \beta' \Rightarrow type(\beta) \trianglelefteq type(\beta')$

6.2.4 How to read typing rules

Because of the formal nature of the TM typing rules, many readers may have problems reading them. This section explains how to read the typing rules. Typing rules always have a standard form:

$$\frac{A \vdash \text{'type assertions'}}{A \vdash \text{'derived type assertion'}}$$

This means that if all 'type assertions' are valid, then we can derive that 'derived type assertion' is valid. The assertions are frequently based on a basis $A$. This basis is nothing more than a list of variables with their types (e.g. $[x :: \text{int}, y :: \mathbb{L}\text{real}]$). The basis is used to derive the type of expressions containing variables (e.g. head($y$); because $y :: \mathbb{L}\text{real}$, we can derive that head($y$) :: real). Let's look at a simple typing rule, for example typing rule 61:

$$\frac{A \vdash Pred :: \text{bool}}{A \vdash \text{not } Pred :: \text{bool}}$$

The meaning of this in plain English is: If you can derive, keeping in mind the types of the variables in basis $A$, that Pred is of type bool, then we can derive that not Pred is also of type bool. Or, formulated in a more usage-oriented way: If Pred is of type bool, then not Pred is also of type bool. Rules like this one, deriving the type of a 'larger' expression from the types of its components, are called introduction rules. Rules doing the opposite, deriving the type of a 'smaller' expression from the type of a 'larger' expression, are called elimination rules. More complex rules can be read in the same way. Take for example rule 26:

$$\frac{A, x :: \sigma \vdash e_1 :: \sigma \quad A \vdash e_2 :: \tau \quad (type(\tau) = \mathbb{P}\sigma \vee type(\tau) = \mathbb{L}\sigma) \quad A, x :: \sigma \vdash Pred :: \text{bool}}{A \vdash \text{replace } e_1 \text{ for } x \text{ in } e_2 \text{ iff } Pred :: \tau}$$

Again, in plain English: If $e_2$ is a list or set of some type $\sigma$, $e_1$ is of type $\sigma$ and Pred is of type bool, then the replace-expression is of the same type as $e_2$ (a list or set of $\sigma$).

Or, more loosely spoken: The variable $x$ will be of the element type of $e_2$, $e_1$ must be of the same type as $x$, Pred must be of type bool, and the result will be of the same type as $e_2$.

To make this all even more clear, let's look at a concrete example:

head(L) where (L = [1])

The easiest way to obtain the types of all (sub)expressions is to work inside out. So, according to typing rule 1, 1 is a constant of type int, and therefore the subexpression 1 is of type int (1 :: int for short). According to typing rule 21, $[1] :: \mathbb{L}\text{int}$, because all $e_i :: \text{int}$ and thus $\sigma = \text{int}$. Typing rule 16 for where states that we have to derive the type of $e$ from the basis $A, x_i :: \sigma_i$. $A$ was empty and there was only one $x_i$, called L, which had type $\mathbb{L}\text{int}$. The rule further states that if we derive the type of $e$ (head(L)), then the where-expression is of the same type. The rule for head is 22. It states that if $e$ (here L) is of type $\sigma$, which must be of the form $\mathbb{L}\tau$, then head($e$) is of type $\tau$. In our situation, $L :: \mathbb{L}\text{int}$, so $\sigma = \mathbb{L}\text{int}$ and $\tau = \text{int}$. Therefore, the type of head(L) is int. Recapitulating, we can say that:

• 1 :: int
• [1] :: $\mathbb{L}$int
• L :: $\mathbb{L}$int
• head(L) :: int
• head(L) where (L=[1]) :: int

6.2.5 Typing rules of TM

Postulation 6.2.5.1 For each $\sigma \in D$, let $C_\sigma$ be a (possibly empty) set (of constants), mutually disjoint. We let $c_\sigma$ vary over $C_\sigma$.

Definition 6.2.5.2 A basis $A$ is a set of statements $x :: \sigma$, where $x$ is a variable and $\sigma \in D$.

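Remark 6.2.5.3 Some typing rules will be preceded by a •. These typing rules form the essential part of the typing rules for the TM-language.

1. • $A \vdash c_\sigma :: \sigma$, if $c_\sigma \in C_\sigma$
2. • $A \vdash x :: \sigma$, if $x :: \sigma$ is in $A$ ($x :: \sigma$ is a statement)
3. • $\dfrac{A, self :: \sigma, x_i :: \sigma_i\ (i \in m) \vdash e :: \tau}{A \vdash m(\text{in } x_1 :: \sigma_1, \ldots, x_m :: \sigma_m \text{ out } \tau) = e :: \sigma \to \sigma_1 \to \ldots \to \sigma_m \to \tau}$
4. $\dfrac{A, self :: \sigma, x_i :: \sigma_i\ (i \in m) \vdash e :: \sigma}{A \vdash m(\text{in } x_1 :: \sigma_1, \ldots, x_m :: \sigma_m) = e :: \sigma \to \sigma_1 \to \ldots \to \sigma_m \to \sigma}$
5. • $\dfrac{A \vdash m :: \sigma \to \sigma_1 \to \ldots \to \sigma_m \to \tau \quad A \vdash e :: \sigma \quad A \vdash e_i :: \sigma_i\ (i \in m)}{A \vdash m[e](e_i\ (i \in m)) :: \tau}$
6. $\dfrac{A \vdash e :: \sigma}{A \vdash (e) :: \sigma}$
7. • $\dfrac{A \vdash e_i :: \sigma_i\ (i \in m)}{A \vdash \langle a_i = e_i\ (i \in m) \rangle :: \langle a_i : \sigma_i\ (i \in m) \rangle}$
8. • $\dfrac{A \vdash e :: \sigma \quad type(\sigma) = \langle a_i : \sigma_i\ (i \in m) \rangle}{A \vdash e.a_j :: \sigma_j}$ $(1 \le j \le m)$
9. $\dfrac{A \vdash e :: C}{A \vdash e.id :: \text{oid}}$
10. • $\dfrac{A \vdash e :: \sigma \quad type(\sigma) = \langle a_i : \sigma_i\ (i \in m) \rangle \quad A \vdash e_{j_i} :: \sigma_{j_i}\ (i \in n)}{A \vdash e \text{ except } (a_{j_i} = e_{j_i}\ (i \in n)) :: \sigma}$, $j_1, \ldots, j_n$ is a nncss of $1, \ldots, m$
11. • $\dfrac{A \vdash e :: \sigma}{A \vdash [|\, a = e \,|] :: [|\, a : \sigma \,|]}$
12. • $\dfrac{A \vdash e :: \sigma \quad type(\sigma) = [|\, a_i : \sigma_i\ (i \in m) \,|]}{A \vdash e \text{ on } a_j :: \sigma_j}$ $(1 \le j \le m)$
13. • $\dfrac{A \vdash e :: \sigma \quad type(\sigma) \trianglelefteq [|\, a_i : \sigma_i\ (i \in m) \,|] \quad A, x_i :: \sigma_i \vdash e_i :: \tau_i\ (i \in m) \quad \tau = \tau_1 \bigtriangleup \ldots \bigtriangleup \tau_m \text{ (and exists)}}{A \vdash \text{case } e \text{ of } a_i = x_i : e_i\ (i \in m) \text{ endcase} :: \tau}$
14. $\dfrac{A \vdash e :: \sigma \quad type(\sigma) \trianglelefteq [|\, a_i : \sigma_i\ (i \in m) \,|] \quad A, x_i :: \sigma_{j_i} \vdash e_i :: \tau_i\ (i \in n) \quad A \vdash e' :: \tau' \quad \tau = \tau_1 \bigtriangleup \ldots \bigtriangleup \tau_n \bigtriangleup \tau' \text{ (and exists)}}{A \vdash \text{case } e \text{ of } a_{j_i} = x_i : e_i\ (i \in n) \text{ else } e' \text{ endcase} :: \tau}$, $j_1, \ldots, j_n$ is a nncss of $1, \ldots, m$
15. • $\dfrac{A \vdash e :: \sigma \quad type(\sigma) = \text{bool} \quad A \vdash e_1 :: \tau_1 \quad A \vdash e_2 :: \tau_2 \quad \tau = \tau_1 \bigtriangleup \tau_2 \text{ (and exists)}}{A \vdash \text{if } e \text{ then } e_1 \text{ else } e_2 \text{ endif} :: \tau}$
16. • $\dfrac{A, x_i :: \sigma_i\ (i \in m) \vdash e :: \tau \quad A \vdash e_i :: \sigma_i\ (i \in m)}{A \vdash e \text{ where } (x_i = e_i\ (i \in m)) :: \tau}$
17. • $\dfrac{A \vdash e_i :: \sigma\ (i \in m)}{A \vdash \{e_i\ (i \in m)\} :: \mathbb{P}\sigma}$
18. • $\dfrac{A, x :: \sigma \vdash Pred :: \text{bool}}{A \vdash \{x : \sigma \mid Pred\} :: \mathbb{P}\sigma}$
19. $\dfrac{A \vdash e :: \tau \quad type(\tau) = \mathbb{P}\sigma \quad A, x :: \sigma \vdash Pred :: \text{bool}}{A \vdash \{x \text{ in } e \mid Pred\} :: \tau}$
20. • $\dfrac{A \vdash e_1 :: \sigma \quad A \vdash e_2 :: \sigma \quad type(\sigma) = \mathbb{P}\tau}{A \vdash e_1 \text{ set operator } e_2 :: \sigma}$, set operator $\in$ {union, intersect, minus}
21. • $\dfrac{A \vdash e_i :: \sigma\ (i \in m)}{A \vdash [e_i\ (i \in m)] :: \mathbb{L}\sigma}$
22. • $\dfrac{A \vdash e :: \sigma \quad type(\sigma) = \mathbb{L}\tau}{A \vdash \text{head}(e) :: \tau}$
23. • $\dfrac{A \vdash e :: \sigma \quad type(\sigma) = \mathbb{L}\tau}{A \vdash \text{tail}(e) :: \sigma}$
24. • $\dfrac{A \vdash e_1 :: \sigma \quad A \vdash e_2 :: \sigma \quad type(\sigma) = \mathbb{L}\tau}{A \vdash e_1 \text{ concat } e_2 :: \sigma}$
25. $\dfrac{A \vdash e_1 :: \sigma_1 \quad type(\sigma_1) = \mathbb{L}\tau \quad A \vdash e_2 :: \sigma_2 \quad \sigma_2 \trianglelefteq \text{int}}{A \vdash (e_1 \text{ at } e_2) :: \tau}$
26. $\dfrac{A, x :: \sigma \vdash e_1 :: \sigma \quad A \vdash e_2 :: \tau \quad (type(\tau) = \mathbb{P}\sigma \vee type(\tau) = \mathbb{L}\sigma) \quad A, x :: \sigma \vdash Pred :: \text{bool}}{A \vdash \text{replace } e_1 \text{ for } x \text{ in } e_2 \text{ iff } Pred :: \tau}$
27. $\dfrac{A, x :: \sigma' \vdash e_1 :: \sigma \quad A \vdash e_2 :: \tau \quad type(\tau) = \mathbb{P}\sigma' \quad A, x :: \sigma' \vdash Pred :: \text{bool}}{A \vdash \text{collect } e_1 \text{ for } x \text{ in } e_2 \text{ iff } Pred :: \mathbb{P}\sigma}$
28. $\dfrac{A, x :: \sigma' \vdash e_1 :: \sigma \quad A \vdash e_2 :: \tau \quad type(\tau) = \mathbb{L}\sigma' \quad A, x :: \sigma' \vdash Pred :: \text{bool}}{A \vdash \text{collect } e_1 \text{ for } x \text{ in } e_2 \text{ iff } Pred :: \mathbb{L}\sigma}$
29. $\dfrac{A \vdash e_1 :: \sigma \quad type(\sigma) = \mathbb{P}\sigma' \quad A, x :: \sigma' \vdash e_2 :: \tau}{A \vdash \text{nest } e_1 \text{ over } x \text{ by } (e_2) :: \mathbb{P}\sigma}$
30. $\dfrac{A \vdash e :: \sigma \quad type(\sigma) = \mathbb{P}\sigma' \quad type(\sigma') = \mathbb{P}\sigma''}{A \vdash \text{unnest}(e) :: \sigma'}$
31. $\dfrac{A \vdash e :: \tau \quad (type(\tau) = \mathbb{P}\sigma \vee type(\tau) = \mathbb{L}\sigma)}{A \vdash \text{unique in } e :: \sigma}$
32. $\dfrac{A \vdash e :: \tau \quad (type(\tau) = \mathbb{P}\sigma \vee type(\tau) = \mathbb{L}\sigma) \quad A, x :: \sigma \vdash Pred :: \text{bool}}{A \vdash \text{unique for } x \text{ in } e \text{ iff } Pred :: \sigma}$
33. $\dfrac{A \vdash e_1 :: \sigma \quad A \vdash e_2 :: \sigma \quad (\sigma \trianglelefteq \text{int} \vee \sigma \trianglelefteq \text{real})}{A \vdash e_1\ op\ e_2 :: \sigma}$, $op \in \{+, -, *, \hat{\ }\}$
34. $\dfrac{A \vdash e_1 :: \sigma \quad A \vdash e_2 :: \sigma \quad \sigma \trianglelefteq \text{real}}{A \vdash e_1 / e_2 :: \sigma}$
35. $\dfrac{A \vdash e_1 :: \sigma \quad A \vdash e_2 :: \sigma \quad \sigma \trianglelefteq \text{int}}{A \vdash e_1 / e_2 :: \text{real}}$
36. $\dfrac{A \vdash e_1 :: \sigma \quad A \vdash e_2 :: \sigma \quad \sigma \trianglelefteq \text{int}}{A \vdash e_1\ op\ e_2 :: \sigma}$, $op \in$ {div, mod}
37. $\dfrac{A \vdash e :: \sigma \quad (\sigma \trianglelefteq \text{int} \vee \sigma \trianglelefteq \text{real})}{A \vdash \text{abs}(e) :: \sigma}$
38. $\dfrac{A \vdash e :: \sigma \quad \sigma \trianglelefteq \text{real}}{A \vdash op(e) :: \sigma}$, $op \in$ {sqrt, sin, cos, tan, asin, acos, atan}
39. $\dfrac{A \vdash e :: \sigma \quad \sigma \trianglelefteq \text{int}}{A \vdash op(e) :: \text{real}}$, $op \in$ {sqrt, sin, cos, tan, asin, acos, atan}
40. • $\dfrac{A \vdash e :: \tau \quad (type(\tau) = \mathbb{P}\sigma \vee type(\tau) = \mathbb{L}\sigma)}{A \vdash \text{count } e :: \text{int}}$
41. $\dfrac{A \vdash e :: \sigma \quad (type(\sigma) = \mathbb{P}\sigma' \vee type(\sigma) = \mathbb{L}\sigma') \quad type(\sigma') = \langle a_i : \sigma_i\ (i \in m) \rangle \quad \sigma'.a'_1.\ldots.a'_n = \tau \quad (\tau \trianglelefteq \text{int} \vee \tau \trianglelefteq \text{real})}{A \vdash op\ e \text{ over } a'_1.\ldots.a'_n :: \tau}$, $op \in$ {sum, max, min}
42. $\dfrac{A \vdash e :: \sigma \quad (type(\sigma) = \mathbb{P}\sigma' \vee type(\sigma) = \mathbb{L}\sigma') \quad type(\sigma') = \langle a_i : \sigma_i\ (i \in m) \rangle \quad \sigma'.a'_1.\ldots.a'_n = \tau \quad \tau \trianglelefteq \text{real}}{A \vdash op\ e \text{ over } a'_1.\ldots.a'_n :: \tau}$, $op \in$ {avg, sd}
43. $\dfrac{A \vdash e :: \sigma \quad (type(\sigma) = \mathbb{P}\sigma' \vee type(\sigma) = \mathbb{L}\sigma') \quad type(\sigma') = \langle a_i : \sigma_i\ (i \in m) \rangle \quad \sigma'.a'_1.\ldots.a'_n = \tau \quad \tau \trianglelefteq \text{int}}{A \vdash op\ e \text{ over } a'_1.\ldots.a'_n :: \text{real}}$, $op \in$ {avg, sd}
44. $\dfrac{A \vdash e :: \tau \quad (type(\tau) = \mathbb{P}\sigma \vee type(\tau) = \mathbb{L}\sigma) \quad (\sigma \trianglelefteq \text{int} \vee \sigma \trianglelefteq \text{real})}{A \vdash op\ e :: \sigma}$, $op \in$ {sum, max, min}
45. $\dfrac{A \vdash e :: \tau \quad (type(\tau) = \mathbb{P}\sigma \vee type(\tau) = \mathbb{L}\sigma) \quad \sigma \trianglelefteq \text{real}}{A \vdash op\ e :: \sigma}$, $op \in$ {avg, sd}
46. $\dfrac{A \vdash e :: \tau \quad (type(\tau) = \mathbb{P}\sigma \vee type(\tau) = \mathbb{L}\sigma) \quad \sigma \trianglelefteq \text{int}}{A \vdash op\ e :: \text{real}}$, $op \in$ {avg, sd}
47. • $\dfrac{A \vdash e_1 :: \sigma \quad A \vdash e_2 :: \sigma}{A \vdash e_1 \text{ operator } e_2 :: \text{bool}}$, operator $\in \{=, \ne\}$
48. $\dfrac{A \vdash e_1 :: \sigma \quad A \vdash e_2 :: \sigma \quad type(\sigma) \in \{\text{real}, \text{int}, \text{string}, \text{char}\}}{A \vdash e_1 \text{ operator } e_2 :: \text{bool}}$, operator $\in \{\le, <, \ge, >\}$
49. • $\dfrac{A \vdash e_1 :: \sigma_1 \quad A \vdash e_2 :: \sigma_2 \quad \sigma_1 \trianglelefteq \sigma_2}{A \vdash e_1 \text{ isa } e_2 :: \text{bool}}$
50. • $\dfrac{A \vdash e_1 :: \sigma \quad A \vdash e_2 :: \tau \quad (type(\tau) = \mathbb{P}\sigma \vee type(\tau) = \mathbb{L}\sigma)}{A \vdash e_1 \text{ in } e_2 :: \text{bool}}$
51. • $\dfrac{A \vdash e_1 :: \sigma_1 \quad A \vdash e_2 :: \sigma_2 \quad (type(\sigma_2) = \mathbb{P}\sigma \vee type(\sigma_2) = \mathbb{L}\sigma) \quad \sigma_1 \trianglelefteq \sigma}{A \vdash e_1 \text{ sin } e_2 :: \text{bool}}$
52. • $\dfrac{A \vdash e_1 :: \sigma \quad A \vdash e_2 :: \sigma \quad type(\sigma) = \mathbb{P}\tau}{A \vdash e_1 \text{ subset } e_2 :: \text{bool}}$
53. • $\dfrac{A \vdash e_1 :: \sigma_1 \quad A \vdash e_2 :: \sigma_2 \quad type(\sigma_1) = \mathbb{P}\tau_1 \quad type(\sigma_2) = \mathbb{P}\tau_2 \quad \tau_1 \trianglelefteq \tau_2}{A \vdash e_1 \text{ ssubset } e_2 :: \text{bool}}$
54. $\dfrac{A \vdash e_1 :: \sigma \quad A \vdash e_2 :: \sigma \quad type(\sigma) = \mathbb{L}\tau}{A \vdash e_1 \text{ sublist } e_2 :: \text{bool}}$
55. $\dfrac{A \vdash e_1 :: \sigma_1 \quad A \vdash e_2 :: \sigma_2 \quad type(\sigma_1) = \mathbb{L}\tau_1 \quad type(\sigma_2) = \mathbb{L}\tau_2 \quad \tau_1 \trianglelefteq \tau_2}{A \vdash e_1 \text{ ssublist } e_2 :: \text{bool}}$
56. $\dfrac{A \vdash e :: \sigma \quad type(\sigma) = [|\, a_i : \sigma_i\ (i \in m) \,|]}{A \vdash e \text{ is } a_j :: \text{bool}}$ $(1 \le j \le m)$
57. $\dfrac{A \vdash e_i :: \sigma\ (i \in m) \quad A \vdash e :: \sigma \quad type(\sigma) = \mathbb{P}\tau}{A \vdash e_i\ (i \in m) \text{ partition } e :: \text{bool}}$
58. $\dfrac{A \vdash e_i :: \sigma_i\ (i \in m) \quad A \vdash e :: \sigma \quad type(\sigma_i) = \mathbb{P}\tau_i\ (i \in m) \quad type(\sigma) = \mathbb{P}\tau \quad \tau_i \trianglelefteq \tau\ (i \in m)}{A \vdash e_i\ (i \in m) \text{ spartition } e :: \text{bool}}$
59. $\dfrac{A \vdash e_1 :: C \quad A \vdash e_2 :: C}{A \vdash e_1 \text{ operator } e_2 :: \text{bool}}$, operator $\in \{\simeq, \not\simeq\}$
60. • $\dfrac{A \vdash Pred_1 :: \text{bool} \quad A \vdash Pred_2 :: \text{bool}}{A \vdash Pred_1 \text{ operator } Pred_2 :: \text{bool}}$, operator $\in$ {and, or, implies, equiv}
61. $\dfrac{A \vdash Pred :: \text{bool}}{A \vdash \text{not } Pred :: \text{bool}}$
62. • $\dfrac{A, x_i :: \sigma_i\ (i \in m) \vdash Pred :: \text{bool}}{A \vdash quant\ x_i : \sigma_i\ (i \in m) \mid Pred :: \text{bool}}$, $quant \in$ {forall, exists}
63. $\dfrac{A, x :: \sigma \vdash Pred :: \text{bool} \quad A \vdash e :: \tau \quad (type(\tau) = \mathbb{P}\sigma \vee type(\tau) = \mathbb{L}\sigma)}{A \vdash quant\ x \text{ in } e \mid Pred :: \text{bool}}$, $quant \in$ {forall, exists}
64. • $\dfrac{A \vdash e :: \sigma \quad type(S) = \sigma}{A \vdash S(e) :: S}$
65. • $\dfrac{A \vdash e :: \sigma \quad A \vdash w :: \text{oid} \quad type(C) = \sigma}{A \vdash C(w, e) :: C}$
66. • $\dfrac{A \vdash e :: \sigma \quad \sigma \trianglelefteq \tau}{A \vdash e \text{ as } \tau :: \tau}$
67. • $A \vdash \text{emptyset}(\sigma) :: \mathbb{P}\sigma$
68. • $A \vdash \text{emptylist}(\sigma) :: \mathbb{L}\sigma$
69. $\dfrac{A \vdash e :: \sigma \quad type(\sigma) = \mathbb{L}\tau}{A \vdash \text{unlist}(e) :: \mathbb{P}\tau}$
70. $\dfrac{A \vdash e :: \text{oid}}{A \vdash \text{inc}(e) :: \text{oid}}$
71. $A \vdash \text{lastid} :: \text{oid}$

6.2.6 Additional comments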

Basically, all typing rules are a straightforward extension of the FM typing rules. The only complication lies in the fact that basic domains and domains should be integrated in a uniform framework. This is solved by having type, class, and sort operations. Below, we offer additional comments on some of the typing rules of TM, introducing some keywords and/or further explanation.

3. Retrieval method-introduction.
4. Update method-introduction.
5. Method-elimination. Here we have adopted a strict typing rule, as one would expect in a context without subtyping. In section 2.6 some additional comments are offered on method elimination and method inheritance, where subtyping is indeed taken into account.
7. Record-introduction.
8. Record-elimination.
10. A record override. The values of the attributes $a_{j_i}$ $(i \in n)$ are overwritten by $e_{j_i}$ $(i \in n)$.
11. Variant-introduction.
12. Variant-elimination.
13-14. Case-introduction. Instead of the strict requirement that the component types $\tau_1, \ldots, \tau_n$ are all equal, we have a more liberal approach by requiring that there need only exist a common upper bound.
15. If-then-else-introduction. Here, analogous to the Case-construction, the more liberal approach is followed by requiring that there need only exist a common upper bound of $\tau_1$ and $\tau_2$.
16. Where-introduction.
17. Enumerated set-introduction.
18-19. Set comprehension-introduction.
21-25. These rules concern list expressions. It should be noticed that for an expression $e :: \sigma$, where $type(\sigma) = \mathbb{L}\tau$, the expression head($e$) concat tail($e$) is incorrectly typed.
26-32. These rules concern iterate expressions. Rule 29 pertains to the nest-introduction, comparable to the group by in SQL.
33-39. Arithmetic operations.
40-46. Aggregates.
41. The projection operation on types is used to formulate a typing rule for a sum operation over nested labels in a record structure. If the domain of this label is an int or a real, then it is possible to use a sum over operation. For instance:

sum {<name="Mary", sal=<ms=3000, re=500>>, <name="Paul", sal=<ms=2500, re=900>>} over sal.ms

47-59. Literals.
47. If two terms are equal, then they should be equal in all aspects; i.e., they should also have exactly the same typing possibilities in the context of subtyping. An isa predicate is defined to allow for a more liberal comparison of specialized expressions with more general ones (see 49).
60-63. Logic.
64. Sort operator-introduction.
65. Class operator-introduction.
66. The as-operator can be used to generalize expressions to higher domains.
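The inside-out reading of typing rules used for the worked example head(L) where (L = [1]) in Section 6.2.4 can be mimicked by a very small type checker. The Python sketch below covers only constants, variables, enumerated lists, head and where. The tuple encoding of expressions and the function name `type_of_expr` are assumptions of this illustration, and the sketch ignores the type function on sorts as well as subtyping.

```python
# Expressions as nested tuples (an encoding assumed for this sketch):
#   ("const", 1)             constant
#   ("var", "L")             variable, looked up in the basis
#   ("list", [e1, ..., en])  enumerated list
#   ("head", e)              head of a list
#   ("where", e, {"x": e1})  e where (x = e1)


def type_of_expr(expr, basis=None):
    """Derive the type of `expr` under `basis` (a dict from variables to types)."""
    basis = basis or {}
    tag = expr[0]
    if tag == "const":
        value = expr[1]
        if isinstance(value, bool):
            return "bool"
        if isinstance(value, int):
            return "int"
        if isinstance(value, float):
            return "real"
        raise TypeError(f"unsupported constant: {value!r}")
    if tag == "var":
        return basis[expr[1]]
    if tag == "list":
        # list enumeration: all elements must have the same type
        element_types = {type_of_expr(e, basis) for e in expr[1]}
        if len(element_types) != 1:
            raise TypeError("all list elements must have the same type")
        return ("L", element_types.pop())
    if tag == "head":
        # head: the argument must be a list; the result is its element type
        t = type_of_expr(expr[1], basis)
        if not (isinstance(t, tuple) and t[0] == "L"):
            raise TypeError("head expects a list")
        return t[1]
    if tag == "where":
        # where: type the body in the basis extended with the abbreviations
        _, body, bindings = expr
        extended = dict(basis)
        for name, bound in bindings.items():
            extended[name] = type_of_expr(bound, basis)
        return type_of_expr(body, extended)
    raise ValueError(f"unknown expression form: {tag}")
```

Typing `("where", ("head", ("var", "L")), {"L": ("list", [("const", 1)])})` follows the same steps as the recapitulation in Section 6.2.4 and yields "int".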

Chapter 7

Method inheritance and related theoretical issues

7.1 Methods

The chosen framework for the treatment of (inheritance of) methods is a type system that has originated from the well-known type system of Cardelli, which supports multiple inheritance and static type checking ([Card84,Card88]). Some familiarity with this system (i.e., the syntax, not the semantics) on the part of the reader is helpful. Balsters and Fokkinga succeeded in providing this system with a simple set-theoretic semantics ([BaFo91]). Next, a general set notion (sets defined by comprehension, together with logical formulas) was added to the theory ([BaBZ93], [BaVr91]). In this section we will explain that, by introducing a type variable, the resulting theory allows for a clean adaptation to incorporate methods and their inheritance.

Before turning to the problem of inheritance of methods from one class to another, let us have a closer look at the methods of a class. If C is a class, then there are two levels at which methods of class C can be defined; we have

- (update and retrieval) object methods of class C, and
- (update and retrieval) class methods of class C.

Object methods are methods on individual objects of a class. The data retrieved or updated by calls (i.e., applications) of these methods concern one or more attributes of a single object of a class at each call. For instance, provided our database contains a class Employee, we could have an update method for the promotion of an employee (which might amount to the update of attributes such as department, salary, job, et cetera), and a retrieval method for the age of an employee (if date of birth is an attribute).

Second, we have class methods, which are update and retrieval methods on sets of (allowed) objects of that class. For example, for the class above, we could have an update method to insert (“create”) a new Employee object (it updates the class extension, i.e., the set of persistent Employee objects available in the current database state), an update method to delete (“kill”) an Employee object (which also updates the class extension), and a retrieval method to determine the average age of the employees of a given department. More complicated retrieval methods on a class are views, each pertaining to one class only.

It is possible for a class to be a specialization, or subclass, of another class. If A is a subclass of B, then the underlying type of A is required to be a subtype of the underlying type of B. Specialization amounts to the addition of new attributes for the subclass, the addition of new static constraints, the addition of new methods, or a combination ([BaBZ93]). Only the new attributes or constraints need be specified, thus reducing redundancy. Besides contributing to efficiency, specialization is a powerful design concept. Yet this can only be true if not only attributes and constraints are inherited, but the specified methods as well. Within the boundaries of type theory it has been clearly shown why this poses a problem ([DaTo88]). In this manual we will show what can be done about it.

Let us now hint at the representation of a TM database specification in FM. In this framework we have expressions (or terms), and their types. Terms in FM may represent objects, constraints, methods, and so on, but whether a term represents an object, a constraint, a method, or something else remains implicit in an FM specification. Remember that FM is a type theory, not a data model. Types are defined by induction, such that basic types (e.g., int, bool and oid, the type of object identifiers) are types, and there are type constructors for record types, set types, function types, list types and variant types. Terms are constants, variables, $\lambda$-abstractions, function applications, records, sets (not merely enumerated ones!), lists, and so on. A record is in fact nothing but a function from a finite set of field names to a set of values, whether representing an object, a database state, or for example just an object attribute. FM has sufficient expressive power to represent, for example,

- complex objects (as records, denoted by $\langle a_1 : \sigma_1; \ldots; a_m : \sigma_m \rangle$),
- class extensions (as sets of equally typed records),
- allowed class extensions, taking static constraints into account (as sets of sets of equally typed records),
- database states (as records with class names as fields and class extensions as their values),
- operations (as functions of the form $\lambda x_1 : \sigma_1.\ \cdots\ \lambda x_m : \sigma_m.\ E$), etc.
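The representations listed above can be illustrated with a small sketch. It is not FM: Python dicts and frozensets are assumed as stand-ins for records and sets, and the names `freeze` and `insert_employee` are hypothetical.

```python
# Complex object: a record, i.e. a finite map from field names to values.
mary = {"id": 1, "name": "Mary", "dept": "Sales"}
paul = {"id": 2, "name": "Paul", "dept": "R&D"}


def freeze(record):
    """Make a flat record hashable so that it can be a member of a set."""
    return frozenset(record.items())


# Class extension: a set of equally typed records.
employee_extension = {freeze(mary), freeze(paul)}

# Database state: a record with class names as fields
# and class extensions as their values.
state = {"Employee": employee_extension}


def insert_employee(db, record):
    """Operation: a function from a state to a new state (no assignment)."""
    return {**db, "Employee": db["Employee"] | {freeze(record)}}


new_state = insert_employee(state, {"id": 3, "name": "Ann", "dept": "Sales"})
print(len(state["Employee"]), len(new_state["Employee"]))  # 2 3
```

As in FM, everything here is just a value: the old state is untouched and the operation returns a new one.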

Specifications in FM are abstract in the sense that everything is a term, and that we thus seem to forget the organization of the database: the classes, the objects, the constraints, the operations (e.g., methods), etc., are all treated alike, namely as terms. FM offers the right level of abstraction to gain some basic understanding of the topic of this manual. Specialization in TM reduces to the (abstract) concept of subtyping in FM, as far as the underlying types are concerned. Inheritance of methods, whether object or class methods, or update or retrieval methods, is known to be essentially a general (sub)typing problem ([DaTo88]). Hence the use of FM as a general framework to tackle this problem. However, the inheritance of update methods is of a somewhat different nature than the inheritance of retrieval methods (w.r.t. the inheritance of record structures), as we will see in the next section. Therefore we will explain the problem by giving some examples. From now on, only FM notation will be used, unless class names are mentioned.

Consider the following straightforward example of an update method, as represented in FM, and how its inheritance poses a problem. Let employee be the record type

\[ \langle \text{id} : \text{oid};\ \text{name} : \text{string};\ \text{job} : \text{string};\ \text{dept} : \text{string} \rangle . \]

Let the update method ChangeDepartment be represented as follows:

\[ \lambda \text{self} : \text{employee}.\ \lambda d : \text{string}.\ \langle \text{id} = \text{self} \cdot \text{id};\ \text{name} = \text{self} \cdot \text{name};\ \text{job} = \text{self} \cdot \text{job};\ \text{dept} = d \rangle \]

Let type manager be type employee with the additional field rank of type int. In Cardelli theory, manager is a subtype of employee, because manager has all fields of employee (and more), and the corresponding field types obey the subtyping relation (in this case they are even equal). Moreover, the application (call) of the above method to any record (tuple) of type manager is allowed, since every record of type manager also has type employee, due to the fact that manager is a subtype of employee. Yet, such an application renders an employee record rather than a manager record, and therefore does not describe an update of manager objects.

It is not difficult to solve this specific problem in a rather ad-hoc manner. Most methods are far more complex than this trivial example, however. We therefore aim at a sufficiently general formulation and solution to the problem.

The remainder of this chapter is organized as follows. Below, we present a theory of inheritable methods in an informal manner, illustrated by a few examples of methods that are richer than the example given above. We first show what we want as far as inheritance of methods is concerned, and why this is a problem. The theory that solves this problem is then outlined informally.
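The ChangeDepartment problem described above is easy to reproduce. The sketch below assumes Python dicts as a stand-in for FM records; the sample values are invented for illustration.

```python
def change_department(self_rec, d):
    """Mirror of the FM term: builds a fresh employee record field by field."""
    return {
        "id": self_rec["id"],
        "name": self_rec["name"],
        "job": self_rec["job"],
        "dept": d,
    }


# A manager record: an employee record plus the extra field rank.
manager = {"id": 7, "name": "Ann", "job": "lead", "dept": "Sales", "rank": 3}

moved = change_department(manager, "R&D")
print(moved["dept"])    # R&D
print("rank" in moved)  # False
```

The call is accepted because the manager record has every field the method reads, but the result has lost `rank` and is therefore only an employee record.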

7.2 Illustrations of methods and their inheritance

Let us consider an imaginary database with two classes, namely the class Employee and the class Manager, which is a subclass of Employee. We will define a few methods of the class Employee, and show that the inheritance of these methods to the class Manager poses a problem. We will also show how to solve this problem, by specifying each of the methods in such a form that their inheritance immediately becomes apparent. Suppose that for each employee we keep track of the following data:

- personalia, namely name, address, and date of birth
- rank
- allocated jobs
- current job

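It is assumed that the rank and the personalia taken together determine the salary of each employee. Moreover, we assume that each employee is allocated a number of jobs, each corresponding to a job at a certain department in the organization. Every employee performs only one job for a few weeks, and then switches to another job at another department, provided the new job belongs to the set of jobs allocated to that employee. Hence, in this simplified case we have for the class Employee the following underlying type employee:

\[ \langle \text{id} : \text{oid};\ \text{personalia} : \text{nab};\ \text{rank} : \text{int};\ \text{allocatedjobs} : \mathbb{P}\text{job};\ \text{currentjob} : \text{job} \rangle \]

where

\[
\begin{array}{lcl}
\text{nab} & = & \langle \text{name} : \text{string};\ \text{address} : \text{addr};\ \text{birthdate} : \text{date} \rangle \\
\text{addr} & = & \langle \text{street} : \text{string};\ \text{number} : \text{int};\ \text{city} : \text{string} \rangle \\
\text{date} & = & \langle \text{day} : \text{int};\ \text{month} : \text{int};\ \text{year} : \text{int} \rangle \\
\text{job} & = & \langle \text{jobname} : \text{string};\ \text{department} : \text{string} \rangle
\end{array}
\]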

For managers the data is somewhat refined:

- the departments run by the manager are added as an attribute, and
- jobs for managers are refined by adding a short description of the specific job.

Hence, the class Manager has the following underlying type manager for its objects:

\[ \langle \text{id} : \text{oid};\ \text{personalia} : \text{nab};\ \text{rank} : \text{int};\ \text{allocatedjobs} : \mathbb{P}\text{managerjob};\ \text{currentjob} : \text{managerjob};\ \text{deptsrun} : \mathbb{P}\text{string} \rangle \]

where

\[ \text{managerjob} = \langle \text{jobname} : \text{string};\ \text{department} : \text{string};\ \text{jobdescription} : \text{string} \rangle \]

Apart from function types, our theory follows the subtyping relation as defined by induction in the type theory of Cardelli ([Card84,Card88]), extended with the rule

\[ \sigma \le \tau \;\Rightarrow\; \mathbb{P}\sigma \le \mathbb{P}\tau \]

where $\mathbb{P}\sigma$ denotes the set type of $\sigma$. The (Cardelli) subtyping rule for record types is

\[ \sigma_1 \le \sigma_1', \ldots, \sigma_m \le \sigma_m' \;\Rightarrow\; \langle a_1 : \sigma_1; \ldots; a_m : \sigma_m; b_1 : \tau_1; \ldots; b_p : \tau_p \rangle \le \langle a_1 : \sigma_1'; \ldots; a_m : \sigma_m' \rangle \]

(Remember that the order of fields (attributes) of a record (type) is irrelevant.) Since $\text{managerjob} \le \text{job}$, we have $\mathbb{P}\text{managerjob} \le \mathbb{P}\text{job}$. And thus, knowing that manager has all fields of employee (plus one), with the corresponding field types obeying the subtyping relation, we have

\[ \text{manager} \le \text{employee} \]

In order for the class Manager to be a subclass of the class Employee, it is required that the type manager be a subtype of the type employee, but there are also some additional demands, e.g., those pertaining to (the inheritance of) constraints ([BaBZ93]).

Typical update operations on Employee objects include operations involving the allocation of a new job, or a shift from one job to another. In many cases, only the department of an allocated job should be updated, not the name of the job. The corresponding method ChangeAssignedDepartment can be specified as follows:

\[
\begin{array}{l}
\lambda \text{self} : \text{employee}.\ \lambda \text{olddep} : \text{string}.\ \lambda \text{newdep} : \text{string}.\ \lambda \text{job} : \text{string}. \\
\quad \langle \text{id} = \text{self} \cdot \text{id};\ \text{personalia} = \text{self} \cdot \text{personalia};\ \text{rank} = \text{self} \cdot \text{rank}; \\
\quad\ \ \text{allocatedjobs} = \text{replace } \langle \text{jobname} = jb \cdot \text{jobname};\ \text{department} = \text{newdep} \rangle \\
\qquad\quad \text{for } jb \text{ in self} \cdot \text{allocatedjobs} \\
\qquad\quad \text{iff } (jb \cdot \text{jobname} = \text{job}) \text{ and } (jb \cdot \text{department} = \text{olddep}); \\
\quad\ \ \text{currentjob} = \text{self} \cdot \text{currentjob} \rangle
\end{array}
\]

where = and in denote equality and set membership, respectively. In the replace notation it is easy to recognize the use of the replacement scheme for defining sets, namely $\{ f(x) \mid x \in A \}$, where $f$ is a function and $A$ is a set.

The method above is supposed to update employee objects. If we apply this method to a manager object, the result will be an employee object rather than an updated manager object. For instance, such an application does not render an object having a deptsrun attribute, and therefore the resulting object cannot be a manager object. Hence, as an update method (written in this form) it is not inheritable from Employee to subclasses, such as Manager. What we would like to have is the following expression:


\[
\begin{array}{l}
\lambda \text{self} : \text{employee}.\ \lambda \text{olddep} : \text{string}.\ \lambda \text{newdep} : \text{string}.\ \lambda \text{job} : \text{string}. \\
\quad \text{self except } (\text{allocatedjobs} = \text{replace } \langle \text{jobname} = jb \cdot \text{jobname};\ \text{department} = \text{newdep} \rangle \\
\qquad\quad \text{for } jb \text{ in self} \cdot \text{allocatedjobs} \\
\qquad\quad \text{iff } (jb \cdot \text{jobname} = \text{job}) \text{ and } (jb \cdot \text{department} = \text{olddep}))
\end{array}
\]

The except expression (term) denotes the (record) value which equals the value denoted by self, except for the attribute allocatedjobs. The general format is

\[ E \text{ except } (a_1 = e_1; \ldots; a_m = e_m) \]

It does not mean that the “contents” of self is updated, because we have no notion of assignment in the functional (!) language FM. Intuitively, the specification above is inheritable, because it shows that

- only the attribute allocatedjobs should be updated, such that
- for the element(s) of allocatedjobs having the given value job for attribute jobname of that element, only the attribute department must be updated.
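The behaviour of except can be sketched as a non-destructive record update. The code below is an illustration under assumptions, not TM or FM: dicts stand in for records, a list stands in for the set allocatedjobs, the helper `except_` is hypothetical, and replace is assumed to leave non-matching elements unchanged.

```python
def except_(record, **updates):
    """Return a copy of `record` with the named fields overwritten.

    All other fields survive, including ones this code never mentions.
    """
    return {**record, **updates}


def change_assigned_department(self_rec, olddep, newdep, job):
    new_jobs = [
        # Matching jobs are rebuilt by explicit record construction.
        {"jobname": jb["jobname"], "department": newdep}
        if jb["jobname"] == job and jb["department"] == olddep
        else jb
        for jb in self_rec["allocatedjobs"]
    ]
    return except_(self_rec, allocatedjobs=new_jobs)


manager = {
    "id": 7,
    "rank": 3,
    "deptsrun": ["Sales"],
    "allocatedjobs": [
        {"jobname": "audit", "department": "Sales",
         "jobdescription": "yearly audit"},
    ],
}

updated = change_assigned_department(manager, "Sales", "R&D", "audit")
print(updated["deptsrun"])          # ['Sales']
print(updated["allocatedjobs"][0])  # {'jobname': 'audit', 'department': 'R&D'}
print(manager["allocatedjobs"][0]["department"])  # Sales
```

The outer record keeps `deptsrun` and the original record is unchanged, but the rebuilt job record has lost `jobdescription`: the inner explicit record construction still yields a plain job, not a managerjob.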

But now we are facing a serious typing problem whenever we apply this method (specification) to a manager object. The attribute allocatedjobs is of type $\mathbb{P}\text{job}$ in employee, while it is of type $\mathbb{P}\text{managerjob}$ in manager. Hence the replacement expression, having type $\mathbb{P}\text{job}$, should be typed $\mathbb{P}\text{managerjob}$ instead, if we substitute manager for employee.

In order to solve this typing problem, we will write a polymorphic specification of the method ChangeAssignedDepartment. To that end we will first introduce a type variable $\alpha$. The idea is that $\alpha$ can be instantiated with employee, but it can also be instantiated with any of its subtypes, such as manager. The types of the attributes of employee can be denoted by $\alpha \cdot \text{id}$, $\alpha \cdot \text{personalia}$, $\alpha \cdot \text{rank}$, and so on. Note that these polymorphic types also make sense for all (Cardelli) subtypes of employee, and for manager in particular. Obviously, the instantiation $(\alpha \cdot \text{allocatedjobs})[\text{employee}]$ renders $\mathbb{P}\text{job}$, whereas $(\alpha \cdot \text{allocatedjobs})[\text{manager}]$ renders $\mathbb{P}\text{managerjob}$. The type of which the set type is denoted by $\alpha \cdot \text{allocatedjobs}$ will be denoted by $\text{elmt}(\alpha \cdot \text{allocatedjobs})$, where elmt stands for “element type”, or the “removal of $\mathbb{P}$ from a set type”. We therefore have

\[
\begin{array}{lcl}
(\text{elmt}(\alpha \cdot \text{allocatedjobs}))[\text{employee}] & = & \text{job} \\
(\text{elmt}(\alpha \cdot \text{allocatedjobs}))[\text{manager}] & = & \text{managerjob}
\end{array}
\]

Descending one level in the type structures of employee and manager, we find

\[
\begin{array}{lcl}
((\text{elmt}(\alpha \cdot \text{allocatedjobs})) \cdot \text{jobname})[\text{employee}] & = & \text{string} \\
((\text{elmt}(\alpha \cdot \text{allocatedjobs})) \cdot \text{department})[\text{employee}] & = & \text{string}
\end{array}
\]

The instantiations with manager render type string as well. Types such as employee and manager are called environments, because they incorporate their own local “type system”, as we will see. Polymorphic types like $\alpha$ and $\text{elmt}(\alpha \cdot \text{allocatedjobs})$ are called component type schemes within environment employee, or manager, for that matter. We will see in the theory that component type schemes within an environment are inherited by each subenvironment (i.e., subtype of that environment).

We will now rewrite our example of a method, by making use of the polymorphic types above. The method ChangeAssignedDepartment becomes:

\[
\begin{array}{l}
\lambda \text{self} : \alpha. \\
\lambda \text{olddep} : \text{elmt}(\alpha \cdot \text{allocatedjobs}) \cdot \text{department}. \\
\lambda \text{newdep} : \text{elmt}(\alpha \cdot \text{allocatedjobs}) \cdot \text{department}. \\
\lambda \text{job} : \text{elmt}(\alpha \cdot \text{allocatedjobs}) \cdot \text{jobname}. \\
\quad \text{self except } (\text{allocatedjobs} = \text{replace } (jb \text{ except } (\text{department} = \text{newdep})) \\
\qquad\quad \text{for } jb \text{ in self} \cdot \text{allocatedjobs} \\
\qquad\quad \text{iff } (jb \cdot \text{jobname} = \text{job}) \text{ and } (jb \cdot \text{department} = \text{olddep}))
\end{array}
\]

The method is typed

\[ \alpha \to \text{elmt}(\alpha \cdot \text{allocatedjobs}) \cdot \text{department} \to \text{elmt}(\alpha \cdot \text{allocatedjobs}) \cdot \text{department} \to \text{elmt}(\alpha \cdot \text{allocatedjobs}) \cdot \text{jobname} \to \alpha \]

This type scheme is constructed from component type schemes. More generally, we can construct type schemes inductively (“bottom up”) from basic types and component type schemes (the latter being a “top down” notion), by using the constructs for set types, record types, list types, variant types, and function types. For example, $\mathbb{P}\,\text{elmt}(\alpha \cdot \text{allocatedjobs})$ is a type scheme within the environment employee: instantiation of $\alpha$ with employee renders type $\mathbb{P}\text{job}$. It is also a type scheme within every subtype of employee, such as manager. Note that $\alpha \cdot \text{allocatedjobs}$ has the same instantiation.

Analogously, the above specification of ChangeAssignedDepartment is a term scheme within environment employee and within all its subtypes, since the corresponding instantiations are well-typed. In short, ChangeAssignedDepartment, being defined for employee objects, is inherited by all subtypes of employee. In this solution, inheritance is implicit in the type variable $\alpha$.

We will now give an example of an inheritable retrieval method in our database, defined for type employee. Suppose we are interested in a method that renders, for any given department and for every job associated with that department, the minimum and maximum ranks of employees to whom the job is allocated, or in a method that renders, for any given department and for every rank, the jobs associated with that department that are allocated to employees of that rank. In both cases it makes sense to first define a method that renders all relevant (job, rank)-pairs. The method JobRank, specified as an inheritable term scheme, reads like this:


\[
\begin{array}{l}
\lambda \text{self} : \mathbb{P}\alpha.\ \lambda \text{dep} : \text{elmt}(\alpha \cdot \text{allocatedjobs}) \cdot \text{department}. \\
\quad \text{unnest collect collect } \langle \text{job} = j;\ \text{rank} = p \cdot \text{rank} \rangle \\
\qquad\qquad\qquad \text{for } j \text{ in } p \cdot \text{allocatedjobs iff } j \cdot \text{department} = \text{dep} \\
\qquad\qquad \text{for } p \text{ in self}
\end{array}
\]

For $\alpha$ we could read employee, and self could in that case denote the current extension of the class Employee, but could in fact denote any set of employee objects. The method specified above is therefore a class method.

In the retrieval method JobRank records are created by ordinary record construction, while in the update method ChangeAssignedDepartment records are overwritten by the except construct. The reason for this difference lies in the structures that we want to inherit in either case. Calls of the update method should result in records of the same type as the type of self, while calls of the retrieval method should render sets of type scheme

\[ \mathbb{P}\langle \text{job} : \text{elmt}(\alpha \cdot \text{allocatedjobs});\ \text{rank} : \alpha \cdot \text{rank} \rangle \]

In this case the “outer” record structure of the elements of the result of the call is “fixed”. The “inner” record structure of the job attribute varies, however, if the method is inherited from employee to manager, since the job attribute is typed job for employees, but managerjob for managers. If we want to establish a fixed record structure, it is easy to adapt the retrieval method, but that is not the point made here. In short, record overwriting and (explicit) record construction are both inheritable in themselves, but the latter is not useful for the inheritance of updates.
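The JobRank retrieval method can be paraphrased as a nested comprehension followed by a flattening step. This is only a sketch: lists of dicts are assumed in place of sets of records, and the function name `job_rank` and the sample data are invented for illustration.

```python
def job_rank(self_set, dep):
    """Collect (job, rank) pairs for all jobs located at department `dep`."""
    nested = [
        [
            {"job": j, "rank": p["rank"]}          # ordinary record construction
            for j in p["allocatedjobs"]
            if j["department"] == dep              # the iff-condition
        ]
        for p in self_set                          # outer collect
    ]
    return [pair for group in nested for pair in group]  # unnest


staff = [
    {"rank": 2, "allocatedjobs": [
        {"jobname": "audit", "department": "Sales"},
        {"jobname": "dev", "department": "R&D"},
    ]},
    {"rank": 5, "deptsrun": ["Sales"], "allocatedjobs": [
        {"jobname": "plan", "department": "Sales",
         "jobdescription": "budget planning"},
    ]},
]

pairs = job_rank(staff, "Sales")
print([pair["rank"] for pair in pairs])      # [2, 5]
print("jobdescription" in pairs[1]["job"])   # True
```

The outer shape of each result record is fixed (job, rank), while the inner job record keeps whatever fields the object's jobs had, which is the variation the text attributes to inheriting the method from employee to manager.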

7.3 Incorporation of method inheritance in TM

We have seen in the previous section how, by employing a polymorphic language construct, we can realize a sophisticated kind of method inheritance for both update and retrieval methods. The explanation that we offered, however, was mainly in the setting of FM, the formal counterpart of the TM language. We will now describe the syntax to be used in TM specifications involving method inheritance. We will do so not by offering precise definitions of the type system, syntax, typing rules and semantics pertaining to polymorphic type constructs, but rather by offering some simple (but convincing) examples that set out the guidelines for the use of polymorphic types in TM. This does not mean, however, that such precise rules and semantics regarding polymorphic types in TM do not exist. Such a theoretical section is omitted mainly for the sake of readability; the material involved goes beyond the current scope of this manual. The reader interested in the more formal aspects of polymorphic types in TM is referred to [Vree90].

In TM we do not use the $\alpha$-notation to denote a polymorphic type. Instead we will use the notion of selftype; i.e., where we used $\alpha$ in the section above, we will now use selftype. An example of an inheritable method is the method called ChangeAssignedDepartment, which in FM syntax was defined by

\[
\begin{array}{l}
\lambda \text{self} : \alpha. \\
\lambda \text{olddep} : \text{elmt}(\alpha \cdot \text{allocatedjobs}) \cdot \text{department}. \\
\lambda \text{newdep} : \text{elmt}(\alpha \cdot \text{allocatedjobs}) \cdot \text{department}. \\
\lambda \text{job} : \text{elmt}(\alpha \cdot \text{allocatedjobs}) \cdot \text{jobname}. \\
\quad \text{self except } (\text{allocatedjobs} = \text{replace } (jb \text{ except } (\text{department} = \text{newdep})) \\
\qquad\quad \text{for } jb \text{ in self} \cdot \text{allocatedjobs} \\
\qquad\quad \text{iff } (jb \cdot \text{jobname} = \text{job}) \text{ and } (jb \cdot \text{department} = \text{olddep}))
\end{array}
\]

In TM this method is written as

    object update method

    ChangeAssignedDepartment(
        in olddep; newdep : elmt(selftype·allocatedjobs)·department;
           job : elmt(selftype·allocatedjobs)·jobname) =
      (self except (allocatedjobs =
          replace (jb except (department = newdep))
          for jb in self·allocatedjobs
          iff (jb·jobname = job) and (jb·department = olddep)))

The type of this (object update) method, in TM, is

    selftype → elmt(selftype·allocatedjobs)·department
             → elmt(selftype·allocatedjobs)·department
             → elmt(selftype·allocatedjobs)·jobname
             → selftype

As we can see, there is not much difference, apart from some sugaring, between the original FM notation and the TM version of the method ChangeAssignedDepartment. There are three things to keep in mind when writing TM methods employing the polymorphic selftype:

- the only place where selftype-constructions play a role is the declaration part of the method involved; the body of the method is always written in ordinary TM syntax (i.e., not employing selftype-constructs);
- the elmt- and ·-operations are used to show which components of the involved instantiation of selftype are to vary along with the intended inheritance of the method at hand;
- selftype can be instantiated by any (non-polymorphic) TM type that is in the ISA-relation to the original Class (or Sort) in which the method at hand was introduced.
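Whether a candidate type may instantiate selftype depends, as far as the underlying types go, on the subtyping rules for records and sets given in section 7.2. The checker below is a minimal sketch under assumptions: basic types are strings, record types are dicts, a set type is the tuple `("P", element_type)`, `"nab"` is treated as an opaque basic type, and function types and constraints are left out.

```python
def is_subtype(s, t):
    """Structural subtyping for basic, set and record types (sketch only)."""
    if isinstance(s, str) or isinstance(t, str):
        return s == t                       # basic types: equality only
    if isinstance(s, tuple) and isinstance(t, tuple):
        return is_subtype(s[1], t[1])       # set rule: element types related
    if isinstance(s, dict) and isinstance(t, dict):
        # Record rule: s has every field of t, with a subtype at each field.
        return all(a in s and is_subtype(s[a], t[a]) for a in t)
    return False


job = {"jobname": "string", "department": "string"}
managerjob = {**job, "jobdescription": "string"}

employee = {
    "id": "oid", "personalia": "nab", "rank": "int",
    "allocatedjobs": ("P", job), "currentjob": job,
}
manager = {
    "id": "oid", "personalia": "nab", "rank": "int",
    "allocatedjobs": ("P", managerjob), "currentjob": managerjob,
    "deptsrun": ("P", "string"),
}

print(is_subtype(manager, employee))  # True
print(is_subtype(employee, manager))  # False
```

The ISA-relation between classes demands more than this check (for example the inheritance of constraints), so a positive answer here is necessary but not sufficient.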

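Below, we offer the TM counterpart of the (class retrieval) method of the previous section called JobRank, which in FM style was written as

\[
\begin{array}{l}
\lambda \text{self} : \mathbb{P}\alpha.\ \lambda \text{dep} : \text{elmt}(\alpha \cdot \text{allocatedjobs}) \cdot \text{department}. \\
\quad \text{unnest collect collect } \langle \text{job} = j;\ \text{rank} = p \cdot \text{rank} \rangle \\
\qquad\qquad\qquad \text{for } j \text{ in } p \cdot \text{allocatedjobs iff } j \cdot \text{department} = \text{dep} \\
\qquad\qquad \text{for } p \text{ in self}
\end{array}
\]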


In TM, this method is written as

    class retrieval method
    JobRank(
        in dep : elmt((elmt(selftype))·allocatedjobs)·department
        out P⟨job : elmt((elmt(selftype))·allocatedjobs); rank : (elmt(selftype))·rank⟩) =
      (unnest collect collect ⟨job = j; rank = p·rank⟩
                      for j in p·allocatedjobs iff j·department = dep
              for p in self)

Notice that, in contrast to the FM specification, selftype in the TM specification is now associated with $\mathbb{P}\alpha$ instead of $\alpha$, since, analogous to the use of self, we also intend selftype to refer to a set of objects instead of individual objects alone. To make this analogy strict, one can say that self : selftype always holds, both on the object level and on the class level. We conclude by offering the abstract syntax of polymorphic types in TM, and mention that the only place where they play a role in TM specifications is the parameter declaration part of inheritable methods.

Abstract syntax of polymorphic types

The abstract syntax of polymorphic types pertaining to the examples discussed above is given by the following definition:

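\[ \mathit{PTy} ::= \text{selftype} \mid (\mathit{PTy}) \mid \mathit{PTy} \cdot L \mid \text{elmt}\ \mathit{PTy} \mid \mathbb{P}\,\mathit{PTy} \mid \mathit{PTy} \to \mathit{PTy} \]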

The full syntax of polymorphic types can be found in the syntax rule TypeExpr (page 59). For a detailed and formal treatment of polymorphic types in TM, we again refer to [Vree90].
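The abstract syntax above, together with the instantiation of type schemes from section 7.2, can be modelled in a few lines. This is a sketch, not the TM type checker: the class names and `instantiate` are hypothetical, concrete types use an assumed encoding (strings for basic types, dicts for record types, `("P", t)` for set types, `("->", s, t)` for function types), and parenthesised types need no node of their own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelfType:            # selftype
    pass


@dataclass(frozen=True)
class Select:              # PTy · L
    base: object
    label: str


@dataclass(frozen=True)
class Elmt:                # elmt PTy
    arg: object


@dataclass(frozen=True)
class SetOf:               # P PTy
    arg: object


@dataclass(frozen=True)
class Arrow:               # PTy -> PTy
    src: object
    dst: object


def instantiate(pty, env):
    """Replace selftype by the concrete type `env` and simplify."""
    if isinstance(pty, SelfType):
        return env
    if isinstance(pty, Select):
        return instantiate(pty.base, env)[pty.label]
    if isinstance(pty, Elmt):
        inner = instantiate(pty.arg, env)
        if not (isinstance(inner, tuple) and inner[0] == "P"):
            raise TypeError("elmt needs a set type")
        return inner[1]
    if isinstance(pty, SetOf):
        return ("P", instantiate(pty.arg, env))
    if isinstance(pty, Arrow):
        return ("->", instantiate(pty.src, env), instantiate(pty.dst, env))
    raise TypeError("not a polymorphic type")


job = {"jobname": "string", "department": "string"}
managerjob = {**job, "jobdescription": "string"}
employee = {"rank": "int", "allocatedjobs": ("P", job)}
manager = {"rank": "int", "allocatedjobs": ("P", managerjob)}

element = Elmt(Select(SelfType(), "allocatedjobs"))
print(instantiate(element, employee) == job)         # True
print(instantiate(element, manager) == managerjob)   # True
print(instantiate(Select(element, "department"), manager))  # string
```

The same scheme yields job in the environment employee and managerjob in the environment manager, which is the sense in which component type schemes are inherited by subtypes.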

Chapter 8

Open issues

- An extensive theory of method inheritance, including a simple denotational semantics.
- A complete theory of transactions.
- A more elaborate and specific TM query language.
- A complete and formal treatment of safety issues and constructive aspects of TM specifications.
- A complete and formal treatment of constraint analysis.

References

[Aït-K91] H. Aït-Kaci, “An overview of LIFE,” in Next Generation Information System Technology, J. W. Schmidt & A. A. Stogny, eds., Proceedings of the First International East/West Data Base Workshop, Kiev, USSR, October 1990, Springer-Verlag, New York–Heidelberg–Berlin, 1991, 42–58, Lecture Notes in Computer Science #504.
[AïNa] H. Aït-Kaci & R. Nasr, “LOGIN: A Logic Programming Language with Built-in Inheritance,” Journal of Logic Programming 1986, 185–215.
[ADGM90] A. Albano, A. Dearle, G. Ghelli, C. Marlin, R. Morrison, R. Orsini & D. Stemple, “A framework for comparing type systems for database programming languages,” in Proceedings Second International Workshop on Database Programming Languages, Gleneden Beach, OR, June 4–8, 1989, R. Hull, R. Morrison & D. Stemple, eds., Morgan Kaufmann Publishers, San Mateo, CA, 1990, 170–178.
[BaBa] René Bal & H. Balsters, “A Deductive and Typed Object-Oriented Language,” in Third International Conference on Deductive and Object-Oriented Databases, December 6–8, 1993, Scottsdale, Arizona, USA.
[BaBZ93] H. Balsters, R. A. de By & R. Zicari, “Typed sets as a basis for object-oriented database schemas,” in Proceedings Seventh European Conference on Object-Oriented Programming, July 26–30, 1993, Kaiserslautern, Germany, 1993, ??–??.
[BaBZ91] H. Balsters, R. A. de By & R. Zicari, “Typed sets as a basis for object-oriented database schemas,” in Proceedings Computer Science in The Netherlands (CSN-SION), Utrecht, 7–8 November, 1991, 1991, 62–77.
[BaBV92] H. Balsters, R. A. de By & C. C. de Vreeze, “The TM Manual,” University of Twente, technical report INF92-81, Enschede, 1992.
[BaFo91] H. Balsters & M. M. Fokkinga, “Subtyping can have a simple semantics,” Theoretical Computer Science 87 (September, 1991), 81–96.
[BaVr91] H. Balsters & C. C. de Vreeze, “A semantics of object-oriented sets,” in The Third International Workshop on Database Programming Languages: Bulk Types & Persistent Data (DBPL–3), August 27–30, 1991, Nafplion, Greece, P. Kanellakis & J. W. Schmidt, eds., Morgan Kaufmann Publishers, San Mateo, CA, 1991, 201–217.
[BaDK92] F. Bancilhon, C. Delobel & P. Kanellakis, Building an Object-oriented Database System: The story of O2, Morgan Kaufmann Publishers, San Mateo, CA, 1992.
[Beer90] C. Beeri, “A formal approach to object-oriented databases,” Data & Knowledge Engineering 5 (1990), 353–382.
[BHPT93] J. Besancenot, L. Hammami, P. Pucheral & J. Thévenin, “GEODE User’s manual version 2,” IMPRESS/IN6-Internal report-W1, 10/08/93.
[BiWa88] R. Bird & P. Wadler, Introduction to functional programming, Prentice Hall International series in computer science, Prentice-Hall International, London, England, 1988.
[CaWe85] L. Cardelli & P. Wegner, “On understanding types, data abstraction, and polymorphism,” Computing Surveys 17 (1985), 471–522.
[Card84] L. Cardelli, “A semantics of multiple inheritance,” in Semantics of Data Types, G. Kahn, D. B. Macqueen & G. Plotkin, eds., Lecture Notes in Computer Science #173, Springer-Verlag, New York–Heidelberg–Berlin, 1984, 51–67.
[Card88] L. Cardelli, “A semantics of multiple inheritance,” Information and Computation 76 (1988), 138–164.
[ChHa80] A. K. Chandra & D. Harel, “Structure and complexity of relational queries,” in Proceedings 21st Symposium on Foundations of Computer Science, Syracuse, NY, October, 1980, 333–347.
[DaTo88] S. Danforth & C. Tomlinson, “Type Theories and Object-Oriented Programming,” Computing Surveys 20 (Mar. 1988), 29–72.
[Davi92] A. J. T. Davie, An introduction to functional programming systems using Haskell, Cambridge computer science texts #27, Cambridge University Press, Cambridge, UK, 1992.
[DCBM90] A. Dearle, R. Connor, F. Brown & R. Morrison, “Napier88—A database programming language?,” in Proceedings Second International Workshop on Database Programming Languages, Gleneden Beach, OR, June 4–8, 1989, R. Hull, R. Morrison & D. Stemple, eds., Morgan Kaufmann Publishers, San Mateo, CA, 1990, 179–195.
[Duzi91] M. Duzi, “Functional approach to the specification of distributed database systems,” Database Technology 4 (1991), 69–76.
[HuKi86] S. E. Hudson & R. King, “CACTIS: A Database System for Specifying Functionally-Defined Data,” in Proceedings 1986 International Workshop on Object-Oriented Database Systems, K. R. Dittrich & U. Dayal, eds., Pacific Grove, CA, September, 1986, 26–37.
[HuSu89] R. Hull & J. Su, “On accessing object-oriented databases: expressive power, complexity, and restrictions,” in Proceedings of ACM-SIGMOD 1989 International Conference on Management of Data, Portland, OR, May 31–June 2, 1989, J. Clifford, B. Lindsay & D. Maier, eds., ACM Press, New York, NY, 1989, 147–158, (also appeared as SIGMOD RECORD, 18, 2, June, 1989).
[HuYo91] R. Hull & M. Yoshikawa, “On the equivalence of database restructurings involving object identifiers,” in Proceedings of Tenth ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, Denver, CO, May 29–31, 1991, ACM Press, New York, NY, 1991, 328–340.
[ISR91] Alcatel. ISR, “SPOKE 3.1.2,” 1991.
[LaSc93] C. Laasch & M. H. Scholl, “Deterministic semantics of set-oriented update sequences,” in Proceedings Ninth International Conference on Data Engineering, Vienna, Austria, April 19–23, 1993, IEEE Computer Society Press, Washington, DC, 1993, 4–13.
[LéRV88] C. Lécluse, P. Richard & F. Velez, “O2, an object-oriented data model,” in Proceedings of ACM-SIGMOD 1988 International Conference on Management of Data, Chicago, IL, June 1–3, 1988, H. Boral & P-A. Larson, eds., ACM Press, New York, NY, 1988, 424–433, (also appeared as SIGMOD RECORD, 17, 3, September, 1988).
[Meye88] B. Meyer, Object-oriented Software Construction, Prentice-Hall International Series in Computer Science, Prentice-Hall International, London, England, 1988.
[OhBB89] A. Ohori, P. Buneman & V. Breazu-Tannen, “Database programming in Machiavelli—a polymorphic language with static type inference,” in Proceedings of ACM-SIGMOD 1989 International Conference on Management of Data, Portland, OR, May 31–June 2, 1989, J. Clifford, B. Lindsay & D. Maier, eds., ACM Press, New York, NY, 1989, 46–57, (also appeared as SIGMOD RECORD, 18, 2, June, 1989).
[Ship81] D. W. Shipman, “The functional data model and the data language DAPLEX,” ACM Transactions on Database Systems 6 (March, 1981), 140–173.
[StSh84] D. Stemple & T. Sheard, “Specification and Verification of Abstract Database Types,” in Proceedings 3rd ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, Waterloo, Canada, April, 1984, 248–257.
[Vree89] C. C. de Vreeze, “Extending the Semantics of Subtyping, accommodating Database Maintenance Operations,” Universiteit Twente, Enschede, The Netherlands, August, 1989, Doctoraal verslag.
[Vree90] C. C. de Vreeze, “Formalization of inheritance of methods in an object-oriented data model,” University of Twente, Technical Report, INF 90–76, Enschede, December, 1990.

Index *, 70, 85 +, 70, 85 -, 70, 85 /, 70, 85 , 72, 86 , 72, 86 ˆ, 70, 85

atan, 70 cos, 70 div, 70 mod, 70 sin, 70 sqrt, 70 tan, 70 as, 74, 87 asin, 70, 85 at, 68, 85 atan, 70, 85 AttDomList, 58 AttList, 57 attribute module, 57 attribute browser, 42, 45 attributes, 57 avg, 85 avg, 71, 86

Abbreviation, 67 abs, 70, 85 acos, 70, 85 add element button, 51 add instance menu, 51 add menu, 43, 45 add subexpr button, 49 add subobject button, 49 add&exit button, 50 Aggregate, 71 and, 73, 86 apply button, 50 Arithmetic, 70 arithmetic operator, 70–71 *, 70 +, 70 -, 70 /, 70 ˆ, 70 abs, 70 acos, 70 asin, 70

basic domains (BD), 77 basic sorts (BS ), 77 basic type, 7, 58 basis, 83 BasType, 58 bool, 58 boolCons, 64 browse button, 42, 45 browse mode, 42 browser attribute, 43, 45 class, 42, 45 sort type, 45 button add element, 51 add subexpr, 49 add subobject, 49 add&exit, 50 102

INDEX apply, 50 browse, 42, 45 cancel, 50 clear, 45, 48 clone, 44 close, 44, 45 collapse, 49 delete, 43, 45, 48 delete element, 51 delete instance, 50 edit, 45, 48 edit instance, 50 evaluate, 49 expand, 49 filter, 48 generate, 41 help, 41, 44, 45 load, 41, 48 mouse, 45 new, 41 options, 41 quit, 41, 48 save, 41, 48 select, 42, 45 type constructor, 45 typecheck, 41 view, 41 by, 68

Call, 74 cancel button, 50 Cardelli, 10, 13 case, 62, 65, 84 cast type —, 74 change menu, 45 char, 58 charCons, 64 Cl, 57 Class, 57 Class, 57 class, 7, 10, 13 recursive —, 11 class, 60 class as container, 14 class browser, 42, 45

103 class editor, 41 class extension, 89 class operation, 79 class operator, 87 class type, 7 ClConsMethList, 60 clear button, 45, 48 Clend, 60 Clnm, 58 clone button, 44 close button, 44, 45 collapse button, 49 collect, 68, 85 complex sorts (compS ), 79 concat, 68, 84 conceptual schema, 55 Cons, 64 constant, 64 constants, 83 constraint, 11 class —, 60 key —, 72 module —, 55, 56 object —, 60 Constraints, 60 constraints, 56, 60 context ?, 77 context ?M , 81 cos, 70, 85 count, 85 covariance, 10 CS, 55 CUpMeth, 61 current module menu, 43

, 93 database design tool, 40 DDT, 40 versions, 40 deep equality, 72 delete button, 43, 45, 48 delete element button, 51 delete instance button, 50 delete mode, 43 diagram orientation, 44 div, 70, 85


104 documentation tool, 40 Domain, 58 domains (D), 77 edit button, 45, 48 edit instance button, 50 edit window, 48 elmt, 59, 93 else, 62, 63, 66 emptylist, 68, 87 emptyset, 67, 87 encapsulation, 16 endcase, 62, 65 endif, 63 enumerated list, 68 enumerated set, 67, 84 enumerated type, 6, 59 equality, 12 =, 72 ', 72 deep, 72 isa, 72 shallow, 72 equiv, 73, 86 error, 58, 64 evaluate button, 49 except, 62, 65, 84 exists, 73, 87 expand button, 49 expression, 63 explicit —, 8 iterative —, 9 predicative —, 8 selection —, 8 expression list window, 47 expression window, 48 extension, 89 false, 64 field selection, 65 filter button, 48 FM, 6, 7, 90 focus menu, 44 for, 69 forall, 73, 87 generate button, 41

GEODE TM-to-SPOKE translation, 15 geometric functions, 85 GLB, 57, 80 GLB 5, 80 graph viewer, 41, 42 change orientation, 44 input area, 42, 43 message area, 42 graphical TM interface, 40 Greatest Lower Bound, 57, 80 GTI, 40 head, 68, 84 help button, 41, 44, 45 id, 74 if, 84 if, 63 iff, 68, 69 implies, 73, 86 import instance menu, 51 in, 61, 67–69, 72, 86 inc, 74, 87 includes, 56 inherit menu, 43 inheritance, 7, 89 method —, 6 multiple —, 6, 13, 89 input area, 42 int, 58 intCons, 64 intersect, 67, 84 is, 65, 86 ISA, 57 isa, 72, 86 Iterate, 68 iterative expression, 69–70 jizz, 7 key, 72

L, 58, 59 lambda calculus, 6 lastid, 74, 87 late binding, 10, 16

Least Upper Bound, 80 let, 56 LIFE, 40 List, 68 list, 84 empty —, 6 enumerated —, 68 list operator, 68 at, 68 concat, 68 head, 68 tail, 68 unlist, 68 list type, 7 load button, 41, 48 logic typed —, 10 LUB, 80 LUB 4, 80 max, 85 max, 71, 86 menu add, 43, 45 add instance, 51 change, 45 current module, 43 focus, 44 import instance, 51 inherit, 43 modules, 44 new value, 48 recalc, 44 show, 43 zoom, 44 message area, 42 MethArgs, 62 Methnm, 62 method, 10 class —, 60 module —, 55, 56 object —, 60 method application, 83 method call, 74 method redefinition, 13 methods, 56, 60

min, 85 min, 71, 86 minus, 67, 84 mod, 70, 85 mode browse, 42 delete, 43 select, 42 module, 6, 13, 16–18, 56 module, 55, 56 module attribute, 57 module constraints, 55, 56 module methods, 55, 56 module section, 56 ModuleName, 56 modules, 81 modules menu, 44 ModuleSection, 56 ModuleSpec, 55 mouse button, 45 multiple inheritance, 89 MVarDomList, 62 nest, 68, 85 new button, 41 new value menu, 48 nil, 58, 64 not, 73, 86 null value, 12, 59 object, 12 object, 60 object creation, 74 object identity, 63 object sharing, 15–16, 18–19 of, 62, 65 Oid, 74 on, 59, 65, 84 options button, 41 or, 73, 86 OUpMeth, 61 out, 61 over, 68, 71

P, 58, 59 partition, 72, 86 PE, 40, 47


persistence, 6, 14–16, 18, 22 polymorphic type, 59, 93 polymorphism ad hoc —, 11 parametric —, 11 PreAmble, 56 Pred, 73 predicative set, 67, 84 projection on record types, 81, 85 proof theory, 15 proof tool, 40 prototyping environment, 40, 47 quantifier exists, 73 forall, 73 query, 48 quit button, 41, 48 real, 58 realCons, 64 recalc menu, 44 Record, 65 record projection, 84 record type, 7 records, 84 replace, 68, 85 RetMeth, 61 retrieval, 56, 60 retrieval methods, 83 safeness detector, 40 sameness, 12 save button, 41, 48 schema conceptual —, 55 sd, 85 sd, 71, 86 SDE, 45 select button, 42, 45 select mode, 42 selection, 69 self, 11, 62–63 selftype, 11, 59, 96 semantics, 15 session, 41 session file, 41, 47

generate from TM source, 41 Set, 67 set empty —, 6 enumerated —, 67 set operator, 68 intersect, 67 minus, 67 union, 67 set type, 7 shallow equality, 72 show menu, 43 sin, 70, 72, 85, 86 Sort, 57 Sort, 57 sort, 10, 13 sort operation, 79 sort operator, 87 sort type browser, 45 SortConsMethList, 60 spartition, 72, 86 SPOKE, 40 TM-to-SPOKE translation, 15 sqrt, 70, 85 ssublist, 72 sublist, 86 ssubset, 72 subset, 86 static type checking, 89 string, 58 stringCons, 64 subclass, 14 sublist, 72, 86 subset, 72, 86 substitution, 67 subtype, 13, 14 closed —, 13 subtyping, 6, 7, 13 sum, 85 sum, 71, 86 syntax directed editor, 40, 45 tail, 68, 84 tan, 70, 85 template, 49 then, 63

TM-to-SPOKE translation, 15 transaction, 6, 15, 16 true, 64 type, 7, 58 abstract —, 12 basic —, 7, 58 collection —, 10 concrete —, 12 enumerated —, 6, 59 list —, 7 object —, see class polymorphic, 59 polymorphic —, 93 power, see type, set primitive, see type, basic record —, 7 representation —, see type, underlying semi-abstract —, 13 set —, 7 underlying —, 7, 11, 14 unnamed —, 12 variant —, 7 type, 58 type function, 77 type cast, 74 type checker, 40 type checking, 10 static —, 10, 89 type constructor, 11 type constructor button, 45 type equivalence, 12 type expression, 59 type inference mechanism, 15 type rule, 6 type system, 10 typecheck button, 41 TypeExpr, 59 typing rules, 83 typing rules for schema part, 82 underlying type, 7 undo test, 41 undo button, 41 union, 67, 84 unique for, 68, 85

unique in, 68, 85 unlist, 68, 87 unnest, 68, 85 update, 56, 60 update methods, 83

Var, 65 VarDomList, 73 variables, 83 Variant, 65 variant type, 7 variants, 84 view button, 41 where, 63, 67, 84 window edit, 48 expression, 48 windows expression list, 47

X, 63 zoom menu, 44