Design Issues and First Experience with a Visual Database Editor for the Extended NF2 Data Model

Klaus Küspert, IBM Heidelberg Scientific Center, West Germany
Jukka Teuhola, University of Turku, Finland
Lutz Wegner, University of Kassel, West Germany

Abstract. The extended NF2 (eNF2) data model permits structured data values like (sub-)relations, lists and tuples to be included as attributes in relations. While it is a very powerful data model, it is not easy to handle and even harder to implement. In this paper we describe a visual database editor for the eNF2 data model for simple tasks like browsing in the database, data entry and editing. We believe its interface, based on the paradigm of a two-dimensional cursor, called a finger, is intuitive enough to be used even by casual users. At the same time, the system, named ESCHER, serves as a research vehicle for self-referencing methods in the implementation of database systems. One aspect, a fixpoint "scheme-of-schemes", is discussed in detail and its visual representation, as produced by the running prototype, is shown. Other topics which are treated include the semantics of the cursor movement and new additions to the eNF2 data model, like scripts and links. Finally, we report on first encouraging experiences with ESCHER which also point out where more work needs to be done.

1. Introduction

1.1 The extended NF2 Data Model

The extended NF2 (eNF2) data model [4, 5, 6, 12] permits various kinds of structured data values, such as (sub-)relations, lists and tuples, to be included as attributes in relations. The resulting tables are thus not in First Normal Form (NF2: Non First Normal Form). It is called the extended NF2 data model in contrast to the ’pure’ NF2 data model [1, 9, 13, 18], which only permits nested tables (relations with relation-valued attributes). The eNF2 data model forms the basis for a DBMS development - called the Advanced Information Management Prototype (AIM-P) - at the IBM Heidelberg Scientific Center.

This data model also provides the framework for the research project on a visually oriented database editor, called ESCHER. ESCHER even includes some attribute types not provided in the AIM-P data model, as can be seen from Table I.

direct correspondence to: Dr. Klaus Küspert, IBM Heidelberg Scientific Center, Tiergartenstr. 15, D-6900 Heidelberg, West Germany. BITNET: KUESPERT at DHDIBM1


TABLE I: Scope of proposed ESCHER Types

Category     ESCHER Type Name     Property                                                               Remarks
atomic       Integer              signed, based on PASCAL 16 bit type
             DoubleInt (LongInt)  signed, based on PASCAL 32 bit type
             CharString (Text)    max. 255 characters
             Real                 based on PASCAL 6 byte type
             Boolean              based on PASCAL type
             Character            based on PASCAL type
complex      Set                  homogeneous, no duplicates
             Bag (Multiset)       homogeneous, duplicates permitted
             List                 homogeneous, ordered
             (Multilist)          homogeneous, ordered                                                   1, 2
             Tuple                inhomogeneous
referential  ExternalLink         pair of "pointers" to object and its scheme                            2, 3, 4
             GenericLink          untyped pointer                                                        2, 3
executable   Script               interpretable ESCHER instructions                                      2, 3
             Program              compiled and linked code                                               2, 3, 5
graphics     Image                generic name for any type of pixel data including digitized video      2, 3, 6
             Vector               generic name for structured graphics data, e.g. PostScript files       2, 3, 6

Remarks:
(1) Syntactically differentiated from list, but no semantic restrictions; included for historical reasons
(2) Not standard eNF2, not supported in AIM-P
(3) Not yet implemented
(4) Possibly extended to include remote host and carrier information
(5) Similar to ADT-extension in AIM-P
(6) Requires graphic mode (not yet supported)

In the following, we shall use the terms atomic and complex data value to denote a single attribute entry and an aggregate collection of attribute entries, respectively. When we refer to (atomic or complex) objects, we mean both the value and its describing type (catalogue) and display (format) information, although we do not consistently enforce this distinction between an object and its value.

ESCHER also uses the notions table and scheme to refer to an object with a named value (the table, not necessarily a set) and its named scheme. Tables and schemes are collected into a special table, the so-called Pool, from where they can be selected under their logical name. For loading and storing of tables and schemes, a filename is associated with each entry. For a discussion of the link concept, which resembles the design strategies in [SJ77], and of the executable attribute values, which is a concept well known from Postgres [Stonebraker], we refer the reader to Section 4.

Fig. 1: A Table of Books and its Scheme

Scheme: {BOOKS} with the attributes AUTHORS (list of L_NAME, INITIAL), TITLE, #YEAR and {KEYWORDS} (set of ABREV_KW).

AUTHORS (L_NAME INITIAL)               TITLE                                          #YEAR   KEYWORDS (ABREV_KW)
Knuth D.E.                             The Art of Computer Programming, Vol. 3        1973    algorithms, sorting, searching, analysis of alg.
Aho A.V., Sethi R., Ullman J.D.        Compilers: Principles, Techniques, and Tools   1986    compiler constr., formal languages, automata theory
< >                                    The XYZ Manual                                 1989    UNIX, computer operat.

Fig. 1 shows a scheme and table for a set of books. In this simplified example, a book is a tuple with four attributes. The first is of type list with arity 2 since authors of a book are given in an ordered sequence. TITLE and YEAR receive atomic (text and integer) values, KEYWORDS is a set-valued attribute. We shall use this example throughout the paper to explain ESCHER features.
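To make the nesting concrete, the BOOKS example can be written down as ordinary nested values. The following is only a sketch in Python (ESCHER itself is written in PASCAL); the class names `Author` and `Book`, the helper `first_authors`, and the mapping of eNF2 lists to Python tuples and eNF2 sets to frozensets are our own assumptions for illustration.

```python
from typing import FrozenSet, NamedTuple, Tuple


class Author(NamedTuple):
    l_name: str
    initial: str


class Book(NamedTuple):
    authors: Tuple[Author, ...]   # eNF2 list: ordered sequence of author tuples
    title: str                    # atomic text value
    year: int                     # atomic integer value
    keywords: FrozenSet[str]      # eNF2 set: unordered, no duplicates


# The BOOKS table is itself a set of book tuples.
BOOKS = frozenset({
    Book((Author("Knuth", "D.E."),),
         "The Art of Computer Programming, Vol. 3", 1973,
         frozenset({"algorithms", "sorting", "searching", "analysis of alg."})),
    Book((Author("Aho", "A.V."), Author("Sethi", "R."), Author("Ullman", "J.D.")),
         "Compilers: Principles, Techniques, and Tools", 1986,
         frozenset({"compiler constr.", "formal languages", "automata theory"})),
    # Empty author list; the keyword split is our reading of the figure.
    Book((), "The XYZ Manual", 1989, frozenset({"UNIX", "computer operat."})),
})


def first_authors(books):
    """Return the first author's last name per title (None for an empty list)."""
    return {b.title: (b.authors[0].l_name if b.authors else None) for b in books}
```

Because the author list is ordered, "first author" is a meaningful question here, whereas the keyword set offers no such notion of position.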

1.2 Purpose of a visual DB-Editor

The AIM-P DBMS provides an accompanying data definition and data manipulation language, called HDBL (Heidelberg Database Language), which is an upward-compatible extension of SQL [Chamberlin 81]. While the expressive power of HDBL is very strong, it is at the same time a somewhat difficult language, at least for the casual user.

For simple tasks, like browsing in the database, data entry and editing, etc., a visual interface for the eNF2 data model was developed at the University of Kassel. The working prototype of this system, called ESCHER, shows that, under certain restrictions, the computational demands are light enough to have the eNF2 editor run on a PC-type workstation with color graphics. All eNF2 objects (with all kinds of nesting between (sub-)relations, lists, tuples, and - at the bottom - atomic data) are shown in tabular form within windows. Two-dimensional cursors, called fingers, are used to manipulate objects, drawing on analogies from cursor manipulations in text editors. Even though the interface is presently only based on text mode (using the IBM ASCII character set for mosaic-graphics and a commercial text mode windowing system), it is visually appealing and fairly easy to master, although not without pitfalls.

Apart from applications in the office environment (cf. Figures 12 and 13 further below for an example from the banking environment), we believe and can demonstrate that it will be quite feasible in the near future to provide workers on the shop-floor with a low-cost data entry station for reviewing and manipulating eNF2 objects. This would imply that, e.g., robot arm movements could be adjusted and tested on-site and then played back to the central eNF2 DBMS. Fig. 2 gives an impression of how the eNF2 editor interface looks for some ROBOTS data (example taken from [5]).

Fig. 2: Editor Display for a simplified ROBOTS Example

1.3 Project History and Design Goals of ESCHER


Research on this project originated with the task of devising sorting and duplicate elimination strategies for AIM-P [KSW89] when the second and third author were visiting guest scientists at the IBM Heidelberg Scientific Center in 1986. AIM-P is mostly written in PASCAL and runs on an IBM 3090 mainframe. To test the sorting and duplicate elimination algorithms in Kassel - which has no access to the Heidelberg facilities - it was decided to design a ’little’ framework. Secondly, the intention is to use the editor in a workstation-server environment, provided a suitable exchange format for objects can be devised. PASCAL as the implementation language was mandatory and since many modern dialects provide windowing and menu tools, it was decided to take a visual approach to object manipulations and display. Furthermore, it was quite clear that with limited manpower and within a university environment, nothing could be produced that would even come close to a production-quality, complete DBMS, nor was it ever considered. Thus, our approach to object manipulation, storage, and filing resembles in many ways the design criteria of a spread-sheet. Meanwhile, with many implemented features proving extremely useful, the appetite for including classical DBMS features is growing. This is evidenced when comparing the status report from about a year ago [Weg89] with the current status reported here.

By restricting the functionality and with the freedom (the duty) of writing the system from scratch, the intention was to make the system a test vehicle for new ideas. One idea was to use self-referencing (recursive) methods to implement parts of the system. The principle is not entirely new [15] and is used e.g. to bootstrap compilers written in a subset of their own source language. Fairly trivial uses of this technique are in treating schemes as tuples of a scheme table and in storing all system parameters and messages (error windows, drop-down menus, etc.) as NF2 tables, which facilitates porting the system to a foreign language environment.

More demanding uses are in mapping internal data structures into NF2 tables. Examples are fingers (structured cursors), window contents, and index structures. The idea to include interpretable operation scripts (procedures) as attribute values resulted from the conviction that they would provide a bootstrapping method. Although not used in the early stages of the implementation (because of lack of courage), they now become a challenging perspective.

In the next section we shall present the idea of the "Bootscheme" ("metascheme"), i.e. the "scheme of all schemes", as a very practical example of this recursive nature of our system. We hope it also demonstrates why we believe that it is legitimate to name the editor after the famous Dutch graphical artist M.C. Escher (1898-1972), whose work depicts these self-reflecting, "impossible" worlds [7].

2. Node Structure and the Bootscheme

2.1 Principles of Object Storage

NF2 tables and schemes are nested structures stored as trees where each inner node represents a complex value and atomic values are stored in the leaves. Whether a particular type of value is stored inside a node or forms another node with a pointer (link) to it varies from implementation to implementation.

In the case of ESCHER, we need an efficient internal representation since we must traverse the table and scheme trees fast in order to "draw" the object on the screen in "real time", say as a result of sliding the window over a table. Therefore, as we open a table or scheme for editing, we try to preload as many objects into core as possible, starting at the root of the tree. If core becomes a bottleneck, priority is given to schemes. Certain vital tables can be saved from swapping by having a sticky bit set. Nodes loaded from disk are transferred to the heap area of the run-time system and links represented by invariant record identifiers (RIDs) are translated into PASCAL-pointers. At the cut-off line between nodes already in core and nodes not yet loaded, a pointer value SENTINEL is inserted pointing to a dummy node. Detecting this value in a tree traversal signals that further disk accesses are needed using the RIDs in the node.
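The cut-off mechanism can be illustrated with a small sketch. It is written in Python rather than PASCAL, and the names `Node`, `DISK`, `load` and `child` as well as the dictionary standing in for the disk are hypothetical; only the idea of a SENTINEL pointer that triggers a load via the stored RID is taken from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    value: str
    rids: List[int] = field(default_factory=list)        # invariant record identifiers
    children: List["Node"] = field(default_factory=list)  # in-core pointers


# Dummy node marking the boundary between loaded and unloaded parts of the tree.
SENTINEL = Node("<not loaded>")

# Hypothetical stand-in for the disk: RID -> (value, RIDs of the children).
DISK: Dict[int, Tuple[str, List[int]]] = {
    1: ("BOOKS", [2, 3]),
    2: ("Knuth", []),
    3: ("Aho", []),
}
disk_accesses: List[int] = []   # log of RIDs fetched, for illustration only


def load(rid: int) -> Node:
    """Bring one node into core; its children start out as SENTINEL pointers."""
    value, rids = DISK[rid]
    disk_accesses.append(rid)
    return Node(value, list(rids), [SENTINEL] * len(rids))


def child(node: Node, i: int) -> Node:
    """Follow the i-th link, going to disk only if the SENTINEL is found."""
    if node.children[i] is SENTINEL:
        node.children[i] = load(node.rids[i])
    return node.children[i]
```

A traversal that calls `child` twice for the same link touches the disk only once, because the SENTINEL is replaced by a real pointer after the first access.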


Figure 3 below shows the basic node layout. Nodes have a small fixed size and can hold NodeSize values directly. Settings for NodeSize between 8 and 16 seem sufficient and ensure that when the node is used in a scheme tree, the scheme information of a particular (sub-)object can be accessed without having to resort to the variable length More-extension. Similarly, administrators of eNF2 tables tend to group their attributes such that the arity of a particular attribute for complex values is rather small (usually less than 5). As for the "vertical dimension" (height) of a table, say when the node has to hold pointers to 100 000 tuples in a set, the structure depicted in Figure 3 serves only as an entry point to the object and an elaborate More-extension comes into play, which has a tree structure with one node roughly corresponding to a page on disk. How objects on disk map to objects in core, questions of partial loading of large objects, index structures, etc. will be the topic of an upcoming paper and are skipped here.

Fig. 3: Basic Node Structure. A node consists of header fields, beginning with ObjIs, an array A with the entries 1 to NodeSize, each holding an ElType, a RID and a Value, and the More-extension.

Atomic values with a storage representation of up to 4 Bytes (integers, characters, Booleans) go directly into the node and have RID 0. Text values, which are considered atomic, and Reals are stored separately in the heap. For the node itself and for each value a type descriptor is stored in the ObjIs, resp. A.ElType[i] field. Naturally, this information is redundant since type descriptions must come from the scheme tree and therefore these fields are mostly used for consistency checks and certain "short-cuts" in the display routines.

Next, each node contains size (arity) and sorting status which is used to keep sets free of duplicates. Finally, the Lines field of a node reports the present "vertical extension" of the node in unit measure which is needed in the display of a node and which is updated following insertion and deletion on the path from the altered node up to the root (see Subsection 3.1 below for a more detailed discussion of the use of this field).

When the node holds scheme information, the values (from left to right) represent

• ANAME: the attribute name (pointer to text)

• BT: base type - type of the object described by this scheme node (integer: 0 = undefined, set = 1, multiset = 2, list = 3, tuple = 5, integer = 10, string = 11, ...)

• CT: component type - type of the object elements, valid only if BT is a homogeneous type (set, list), recorded as integer

• DEG: degree (arity) of attribute - 0 if BT is an atomic type, recorded as integer

• EXT: extension - unit measure for horizontal extension (column width) of an attribute (integer)

• FORM: display format - display style information (presently not used)

• SUB_ATTRIBUTES: set of subattributes - a pointer to a node which again contains DEG-many pointers to scheme description nodes for each subattribute; points to a node describing the empty set if DEG = 0.
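The field list above maps naturally onto a record type. The following Python sketch mirrors it under several assumptions of ours: the names `BT`, `SchemeNode` and `attribute_names` are hypothetical, DEG is derived from the subattribute list instead of being stored, a Python list stands in for the set node, and all EXT and CT values other than those of BOOKS, AUTHORS and TITLE are illustrative guesses.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class BT(IntEnum):
    """Base type codes as listed in the text (further codes exist)."""
    UNDEFINED = 0
    SET = 1
    MULTISET = 2
    LIST = 3
    TUPLE = 5
    INTEGER = 10
    STRING = 11


@dataclass
class SchemeNode:
    aname: str                      # ANAME: attribute name
    bt: BT                          # BT: base type
    ct: int = 0                     # CT: component type, 0 if not applicable
    ext: int = 0                    # EXT: column width
    form: int = 0                   # FORM: display format (presently unused)
    sub_attributes: List["SchemeNode"] = field(default_factory=list)

    @property
    def deg(self) -> int:
        """DEG: degree, derived here from the subattribute list (assumption)."""
        return len(self.sub_attributes)


def attribute_names(node: SchemeNode) -> List[str]:
    """Attribute names of a scheme tree in preorder."""
    names = [node.aname]
    for sub in node.sub_attributes:
        names.extend(attribute_names(sub))
    return names


# Widths of L_NAME, INITIAL, YEAR, KEYWORDS and ABREV_KW are assumed values.
books = SchemeNode("BOOKS", BT.SET, ct=BT.TUPLE, ext=94, sub_attributes=[
    SchemeNode("AUTHORS", BT.LIST, ct=BT.TUPLE, ext=31, sub_attributes=[
        SchemeNode("L_NAME", BT.STRING, ext=20),
        SchemeNode("INITIAL", BT.STRING, ext=10),
    ]),
    SchemeNode("TITLE", BT.STRING, ext=32),
    SchemeNode("YEAR", BT.INTEGER, ext=6),
    SchemeNode("KEYWORDS", BT.SET, ct=BT.STRING, ext=22, sub_attributes=[
        SchemeNode("ABREV_KW", BT.STRING, ext=20),
    ]),
])
```

Deriving DEG avoids a stored value that could drift out of step with the subattribute set; a stored field, as in the node layout described above, trades that safety for direct access.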

Fig. 4: Node Tree for Scheme BOOKS. Each attribute description is a tuple node (ObjIs = Tuple, s = 6) with the fields ANAME, BT, CT, DEG, EXT and SUB_ATTR; subattribute descriptions hang below set nodes. The entries include "BOOKS" (BT 1, CT 5, DEG 4, EXT 94), "AUTHORS" (BT 3, CT 5, DEG 2, EXT 31) and "TITLE" (BT 11, CT 0, DEG 0, EXT 32), with further nodes for YEAR, KEYWORDS, "L_NAME" and "INITIAL".



Fig. 5: Node Tree for a BOOKS Table. The root leads to tuple nodes of size 4, one per book. The node for the first book holds the year 1973 directly and points to the title "The Art of Computer ...", to a list node (s = 1) with the author tuple ("Knuth", "D.E.") and to a set node with keywords such as "algorithms" and "analysis of alg.".


With this short description, the reader should be able to decipher the scheme and table trees in Figures 4 and 5 for the BOOKS-object from the example in Figure 1. Note how in the scheme tree the object types (ObjIs) alternate between tuple and set: the attribute description is an ordered sequence of inhomogeneous values, whereas the subattribute descriptions form a set since attributes are unordered in relational theory and duplicate attributes on the same level are not permitted. Codings for the BT and CT fields can be taken from the list above.

Fig. 6: The BOOKS-example with explicit and "Pseudo-" Tuples. The left half shows the explicit form, in which the tuple attributes [BOOK] and [AUTHOR] are named in the scheme; the right half shows the same table with implicit (pseudo) tuples.

2.2 The "Pseudo-Tuple Problem"

Any object with degree (arity) > 1 is a tuple. In the BOOKS-example from Figure 1, the authors form tuples of degree 2 and each book entry is a tuple of degree 4. However, the eNF2 data model also offers the choice of an attribute type tuple (for any degree ≥ 1). This seems to be an infinite source of confusion, and not only in our design, as we have been told.

The difference between explicit and implicit (named and pseudo) tuples is depicted in Figure 6 using again the BOOKS-example. It is our firm belief that the form on the right is more natural than the explicit form on the left and that the naming of a tuple attribute should be left to the user if he/she so desires.

Similar confusion arises from the scheme tree. A set node is inserted for any attribute with complex type regardless of the degree of the object components. Thus, even a set of integers has a set-node inserted between the tuple for the set and the tuple for the integer attribute. In terms of efficiency, this does not matter because scheme trees are of small height (about twice the level of nesting) and furthermore the method of tree traversal becomes uniform. On the other hand, there is no constant which translates scheme tree depth into table tree depth: in the table, atomic values reside basically within the node of the enclosing object, complex objects are stored one level below their enclosing object if the degree is one and two levels down if the degree is greater than one. The scheme description of a value, however, is always two levels below the description of the enclosing object, etc. Since scheme tree and table tree are always traversed in parallel in the process of displaying or manipulating a table, this is a serious cause of headache for the implementation team.

Unfortunately, the list of pitfalls could be almost endlessly extended. As a last example, consider scheme nodes for atomic attributes. Should there be no pointer to a subattribute node, i.e. a nil valued pointer, or should there be a pointer to an empty set of attributes (we feel the second alternative is theoretically more sound)? In summary, these problems seem inherent to the eNF2 data model. Some appear again as ambiguities in the user interface (see Section 5), others can be successfully hidden from the user.

Fig. 7: The BOOKS-Scheme as a Complex Object


2.3 Schemes as Tuples of a Scheme Table and the Bootscheme

Any scheme can be considered as an eNF2 object provided we can supply a "scheme of the scheme". Naturally, we don’t want to design a new "scheme of scheme" for every variant of a scheme which the user might come up with. Thus we looked for a scheme to fit all schemes, which is not trivial since there is no limit on the nesting of attributes, i.e. the scheme of schemes, called the Bootscheme, has to be infinite, which is beyond the eNF2 model 1). Figure 7 shows our BOOKS-scheme as a tuple in a table of schemes which is governed by the Bootscheme.

This approach has considerable implications. In the initial phase of the implementation, when we had no editing routines for schemes, we would manipulate schemes as tuples, which was awkward (the editing person had to know how types translate into integers, etc.) but quite feasible and in any case better than "hard-wiring" schemes into the execution code. Similarly, storage and retrieval routines can handle schemes and tables the same way. Of course, display routines must differentiate between the two since there are different styles for representing them.

If the Bootscheme is the ’scheme of schemes’, does it also cover itself? The answer is yes, i.e. the BootScheme acts as a fixpoint and it is the only scheme included into the code. Any other scheme, including the scheme for the table of objects (Poolscheme), schemes for tables which act as menus, error and help windows, etc. can be interactively created and edited by the implementation team, stored on disk and read into the system upon system start. Figure 8 shows the Bootscheme simultaneously as scheme and as table.

Fig. 8: The Bootscheme as Tuple under its own Scheme

1) Note that this is different from the question of potentially infinite recursive queries as discussed e.g. in [Linnemann 87].


The Bootscheme is an infinite scheme. This is achieved by including what we call the "magic loop", i.e. a pointer back from the SUB_ATTRIBUTES link to its own set-node (cf. Figure 9). This creates a number of interesting problems of a very practical nature.

Fig. 9: The Magic Loop

An almost trivial point is that this object must be protected against manipulations and against writing to disk (a sure way to fill your disk very quickly). Readers will also notice that the attribute names are in ascending order. This was done after applying the sorting option (out of pure curiosity) to this object. The sort succeeded, even without going into an infinite loop, because the second time around the algorithm noticed that the subobject was already sorted and backed up. Afterwards, however, ESCHER was not able to display anything anymore because attributes for scheme definitions were in the wrong order. Since sorting rearranges elements of sets but not fields of tuples and since the attribute name is the first field of a tuple, it suffices to bring attribute names into ascending order. By now we are contemplating switching the degree attribute to the front (the term arity comes in very handy). This would imply that - after sorting schemes - attributes for atomic values appear on the left end, followed by complex objects of degree one, followed by attributes of degree two, etc. Among attributes of equal arity, attribute names (or type?) would automatically serve as second ordering criterion. This would lead to a certain type of Normal Form (cf. also [?]) which seems quite important since it speeds up processing of tables, e.g. in duplicate deletion.

Another interesting problem arises in the context of displaying the Bootscheme. Because it is infinite, certain routines would infinitely recurse, e.g. the recursive computation of total column width (the EXT value) and of the object height (the LINES value in the node). The actual display, on the other hand, causes no problem: node drawing stops anyway when a node is completely outside a window (see discussion below). Since the Bootscheme is "hard-wired" into the system, setting the column and line values according to one’s taste (the display in Figure 8 was produced by ESCHER with a setting of EXT = 100) was the only measure required.
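Why a preset width is enough to tame the magic loop can be seen in a small sketch. It is written in Python, not in ESCHER's PASCAL, and the names `SchemeNode`, `total_width` and `make_bootscheme`, the width of 10 for the atomic attributes, and the decision to attach the preset to the SUB_ATTRIBUTES attribute are all our assumptions. The explicit loop check replaces the infinite recursion that would otherwise occur.

```python
class SchemeNode:
    def __init__(self, name, ext=None):
        self.name = name
        self.ext = ext              # preset column width; None means "compute"
        self.sub_attributes = []    # plays the role of the set node


def total_width(node, active=None):
    """Sum of the subattribute widths plus two bordering lines.

    A preset ext value ends the recursion; re-entering a node that is
    still being computed means the scheme is infinite.
    """
    active = set() if active is None else active
    if node.ext is not None:
        return node.ext
    if id(node) in active:
        raise ValueError("scheme is infinite at " + node.name)
    active.add(id(node))
    try:
        return sum(total_width(s, active) for s in node.sub_attributes) + 2
    finally:
        active.discard(id(node))


def make_bootscheme():
    boot = SchemeNode("BootScheme")
    for name in ("ANAME", "BT", "CT", "DEG", "EXT", "FORM"):
        boot.sub_attributes.append(SchemeNode(name, ext=10))   # assumed widths
    sub = SchemeNode("SUB_ATTRIBUTES")
    sub.sub_attributes = boot.sub_attributes   # the magic loop: same set node
    boot.sub_attributes.append(sub)
    return boot, sub
```

With `sub.ext` left unset, `total_width(boot)` raises the error; after setting `sub.ext = 100` the computation terminates, which corresponds to the hard-wired setting described above.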


Before the actual display would function, one final trick was needed. For each object currently shown on screen, two windows are opened: one for its scheme and one for the table. For that purpose, a Window Control Block (for coordinates, colors, etc.) is attached to each of the two entries in the Pool table (table of all objects). But with one and the same object serving as scheme and as tuple simultaneously, we could not link two Control Blocks to it. The cheapest solution was to add another predefined object name to the Pool and to let this name point to the root node of the Bootscheme. Since ESCHER has case sensitive identifier recognition, BOOTSCHEME is now the scheme of BootScheme and BootScheme believes its scheme is BOOTSCHEME - a straightforward case of deceiving your own DBMS!

3. Window Management through Fingers

3.1 The Internal Finger Concept

To manipulate an object, we use a so-called "finger" which "points" to a particular node in a table or scheme tree. Several fingers can co-exist within a tree. This is needed e.g. for the BootScheme display above but also for many other tasks, like indicating the cursor (external finger) location while redrawing the window after a window slide operation with a second finger and recalculating dimensions with a third.

Fingers thus represent a path into the object using the relative position (index) of an object within a surrounding object. They are implemented by stacks and are manipulated by small primitive operations, e.g. go to i’th component in present object, go down one level (push) or up one level (pop). Whenever two or more fingers point into the same object, two or more paths may touch the same nodes (the root by necessity). Thus, a pointer to a node appears on several stacks, but always on the same level. As (sub-)objects may be inserted and deleted, the path recorded on the stack may become invalid. However, this is easily detected by chaining identical stack entries on each level. Synchronization of fingers then requires walking down the stack modifying other stacks found in the chain as needed without ever having to backtrack.
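A finger as a stack of component indices can be sketched in a few lines. The Python class below is a simplified illustration under our own assumptions: the class and method names are hypothetical, nested Python lists and tuples stand in for the node tree, and the chaining of stack entries used for synchronization is left out.

```python
class Finger:
    """A path into a nested object, kept as a stack of component indices."""

    def __init__(self, root, path=()):
        self.root = root
        self.path = list(path)      # root first, innermost index last

    def current(self):
        """Return the object the finger presently points to."""
        obj = self.root
        for i in self.path:
            obj = obj[i]
        return obj

    def enter(self):
        """Go down one level (push) to the first sub-object."""
        obj = self.current()
        if not isinstance(obj, (list, tuple)) or len(obj) == 0:
            raise ValueError("nothing to enter")
        self.path.append(0)

    def escape(self):
        """Go up one level (pop) to the surrounding object."""
        if not self.path:
            raise ValueError("already on the outermost object")
        self.path.pop()

    def goto(self, i):
        """Go to the i-th component of the surrounding object."""
        if not self.path:
            raise ValueError("the outermost object has no siblings")
        parent = Finger(self.root, self.path[:-1]).current()
        if not 0 <= i < len(parent):
            raise IndexError("no component %d" % i)
        self.path[-1] = i

    def fork(self):
        """Create a second finger sitting on the same object."""
        return Finger(self.root, self.path)


books = [
    ([("Knuth", "D.E.")], "The Art of Computer Programming, Vol. 3", 1973,
     ["algorithms", "sorting"]),
    ([], "The XYZ Manual", 1989, ["UNIX"]),
]
```

Forked fingers share the root but own their stacks, so moving one leaves the other in place; keeping them valid under insertions and deletions is exactly what the chaining described above is for.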


3.2 Drawing an Object

Drawing of a table then starts by calling a recursive procedure DrawTableNode(FingerId, WindowControlBlock, X1, Y1) while the finger sits on the outermost object. Adding the height and width of the node under consideration to the passed coordinates (X1, Y1), DrawTableNode compares the resulting values with the Window Control Block coordinates to find out whether any part of this node stretches into the window. If this is not the case, then DrawTableNode determines whether the node is to the right of or below the current window. If so, it exits the current level of recursion; otherwise it considers the next node (next attribute to the right or next element down in a set or list).


If DrawTableNode detects that all or part of the node’s tabular representation stretches into the window, it draws top and left bordering lines (with clipping) and then recursively calls DrawTableNode for its enclosed subobjects (first to last attribute, first to last element of a set, etc.). Atomic values cause the depth-first descent to terminate and are displayed centered, resp. flush right, within the area prescribed by the EXT- and LINES-value of this node. Putting the right type of line character (from the IBM extended character set with double lines for the scheme and single line characters for the table) on the screen is not trivial because the proper choice depends on line drawings already done. It is solved by reading the video buffer whenever top and left lines and corners are drawn (cf. Figure 8 where this method must fail because of our arbitrary manipulations!).
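The clipping logic of the recursion can be sketched as follows. This is a strongly simplified Python illustration, not the PASCAL procedure itself: the finger and the Window Control Block are replaced by plain arguments, bordering lines are ignored, "drawing" only records a label, and the names `TableNode`, `Window` and `draw_table_node` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TableNode:
    label: str
    ext: int                      # width in character columns
    lines: int                    # height in screen lines
    children: List["TableNode"] = field(default_factory=list)
    horizontal: bool = False      # True: children are attributes side by side


@dataclass
class Window:
    x1: int
    y1: int
    x2: int                       # right edge, exclusive
    y2: int                       # bottom edge, exclusive


def draw_table_node(node, win, x, y, drawn, visited):
    """Return False if the node lies right of or below the window."""
    visited.append(node.label)
    if x >= win.x2 or y >= win.y2:
        return False              # the caller leaves this level of recursion
    if x + node.ext > win.x1 and y + node.lines > win.y1:
        drawn.append(node.label)  # some part of the node is visible
        cx, cy = x, y
        for child in node.children:
            if not draw_table_node(child, win, cx, cy, drawn, visited):
                break
            if node.horizontal:
                cx += child.ext   # next attribute to the right
            else:
                cy += child.lines # next element further down
    return True                   # left of or above: try the next sibling


# A set of 100 one-line tuples with two atomic attributes each.
rows = [TableNode("row%d" % i, 10, 1,
                  [TableNode("a%d" % i, 5, 1), TableNode("b%d" % i, 5, 1)],
                  horizontal=True)
        for i in range(100)]
table = TableNode("BOOKS", 10, 100, rows)
```

For a window covering lines 10 to 14, the traversal skips the rows above the window without descending into them and stops at the first row below it, so the remaining rows are never visited.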


This recursive method solves even tricky cases without having to care about exceptions. An example is the empty AUTHORS-list in Figure 1 where we don’t want the vertical line to stretch through the AUTHORS-field.

3.3 Styles

At present, there is only one style for table and scheme display with the following rules:

• The width of a tuple is the sum of the widths of each attribute plus both bordering lines.

• The height of a tuple is the maximum of the height of any field plus 1 for a horizontal line unless all attributes are atomic.

• Elements of sets and lists are separated by a line unless all attributes have atomic type. Accordingly, the height is the sum of the element heights (plus 1 for a horizontal line).

• Empty set and empty list have height 2 (1 for { }, resp. < >, plus one for a vertical line).

• Attributes are always separated by a vertical line unless the complex value is an empty set or list.
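The height rules can be turned into a short recursive function. The Python sketch below gives one possible reading of them, with the following assumptions of ours: a Python tuple stands for an eNF2 tuple, a Python list for a set or list, anything else for an atomic value of height 1, and the separating line between elements is taken to be the "plus 1" already contained in the height of each non-atomic element tuple.

```python
def height(value):
    """Height in lines of a value under one reading of the style rules."""
    if isinstance(value, tuple):
        # Maximum field height, plus 1 for a horizontal line unless all
        # attributes are atomic (tuples are assumed to have degree >= 1).
        all_atomic = not any(isinstance(f, (tuple, list)) for f in value)
        return max(height(f) for f in value) + (0 if all_atomic else 1)
    if isinstance(value, list):
        if not value:
            return 2              # empty set or list
        return sum(height(e) for e in value)
    return 1                      # atomic value


knuth = ([("Knuth", "D.E.")], "The Art of Computer Programming, Vol. 3", 1973,
         ["algorithms", "sorting", "searching", "analysis of alg."])
aho = ([("Aho", "A.V."), ("Sethi", "R."), ("Ullman", "J.D.")],
       "Compilers: Principles, Techniques, and Tools", 1986,
       ["compiler constr.", "formal languages", "automata theory"])
xyz = ([], "The XYZ Manual", 1989, ["UNIX", "computer operat."])
books = [knuth, aho, xyz]
```

Under this reading the first book occupies 5 lines (4 keyword lines plus the separator), the second 4 and the third 3, giving 12 lines for the whole table; a value of this kind is what the Lines field of a node caches.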

We have come to call this "Vanilla Style" since it seems to fit everybody’s tastes. Presently we are discussing the inclusion of a FORM-attribute into the Bootscheme which would allow user specification of different styles. Style elements might specify leading zeroes for integer values, list elements displayed horizontally without lines between them, list elements displayed vertically without lines between them (even for complex elements), or character values shown with Hidden-Character-Flag set to ON. With these style indicators assigned to a scheme, we might try a bold attack on the computing power of our workstation: using an eNF2 database editor as a word processor! Figure 11 shows the possible set-up. Using it without the line numbers requires only a simple projection, resp. calling the word processor under the proper view. While in honesty we think that this application would put the system to an extreme test with severe performance problems, it brings us to the next topic: user interaction.

Fig. 11: An Unusual Application for the DB-Editor. A table with the attributes #LINE and CHARACTER holds the lines 0001 to 0007 of a letter ("June 23rd, 1989 To the Dean of Science ¶ Dear Mrs ...: we are pleased to inform you, that our project ESCHER is up and running. Due to its unforeseen success, however, the group allowance for travel and communication ...").

4. Interface Design for the DB-Editor

4.1 The DB-Cursor

The main purpose of a DB-Editor is to allow the casual user to review, change, insert, delete, and move around objects in an eNF2 table. A secondary task (harder to achieve) is to initiate relational operations on one or more tables, like selection, projection, join, etc. In general, visual graphics-based interfaces to database systems are quite common these days (cf. [2, 3, 8, 10, 11, 14, 16, 19, 20, 21, 22]). At present, design evaluations for visual database systems are being undertaken (cf. [17]). To our knowledge, however, no other prototype has been developed for the extended NF2 data model. The closest similar approaches are the FORMAL development at IBM’s Los Angeles Scientific Center [19] and the work by Kitagawa, Kunii and others on the form-based user interface of NF2 relations and its prototype development [10, 11].

The governing principle in ESCHER is to move a two dimensional cursor to the object(s) which the user wants to manipulate. This cursor is implemented as an (internal) finger, i.e. a stack. Because the cursor is used to point to an object, it is also called an (external) finger. The interface is thus based on what is probably the oldest paradigm in human communication: if you don’t speak your host’s language, point to the item which you want or indicate your intentions through a gesture.

In ESCHER a cursor can stretch over one or more adjacent objects of the same type. The location is indicated by coloring the object(s) which the cursor points to. As mentioned before, several cursors (fingers) can coexist in a table or scheme. Each cursor is assigned a color and a key, namely one of the function keys F1 to F10 which are present on most keyboards. Which fingers presently exist can be seen from the so-called finger line at the bottom of the screen (cf. Figures 12 and 13).


When a table is opened, a "colorless" finger F0 covers all of the table. New fingers are created by forking off a new finger from an existing one. The two fingers then sit on top of each other (color white) and either one can be moved to another object. At any time, there is one active finger indicated by flashing the Fi icon in the finger line at the bottom. Another finger is activated by simply pressing the corresponding ALT+Fj key combination.

Windows are automatically adjusted to have the active finger object centered, resp. have the upper left corner of the object inside the window for objects with a large dimension. When a particular object is pointed to, the corresponding attribute in the scheme above is highlighted in the same color. This comes in very handy in scheme and table data entry, where as a result of pressing the insert (INS) key, a new empty object is inserted following the active finger. Atomic values are shown with "?" and the finger keeps moving to the next atomic value field in preorder for which a value is to be entered. Which value is presently entered can also be seen from the flashing attribute in the scheme above.

Fig. 12: Selecting a Tuple with a Bit-mapped Image

Fig. 13: Moving F1 to a Bit-mapped Image and Zooming

Finger movement is restricted in a syntactical/semantical way which takes some getting used to. A finger can move up or down, left or right within an object, and it can move into an object to the first subobject or out to the surrounding object. Even though it seems as if there are three dimensions, there are in reality only two: Enter/Escape and Go-to-Predecessor/Successor.
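To make the two movement dimensions concrete, the following Python fragment is a minimal sketch of a finger kept as a stack of child positions. The class name `Finger`, its method names and the representation of a table as nested Python lists are assumptions made for this illustration only; they are not taken from the ESCHER implementation.

```python
from typing import Any, List


class Finger:
    """Hypothetical finger: a stack of child positions leading from the root."""

    def __init__(self, root: Any) -> None:
        self.root = root
        self.path: List[int] = []  # empty stack: finger covers the whole table

    def _node(self, path: List[int]) -> Any:
        node = self.root
        for i in path:
            node = node[i]
        return node

    def current(self) -> Any:
        """Object the finger currently points to."""
        return self._node(self.path)

    def enter(self) -> bool:
        """Move into the current object, onto its first subobject."""
        node = self.current()
        if isinstance(node, (list, tuple)) and len(node) > 0:
            self.path.append(0)
            return True
        return False  # atomic value or empty collection: nothing to enter

    def escape(self) -> bool:
        """Move out to the surrounding object."""
        if self.path:
            self.path.pop()
            return True
        return False  # already on the outermost level

    def successor(self) -> bool:
        """Move to the next sibling inside the surrounding object."""
        if not self.path:
            return False
        parent = self._node(self.path[:-1])
        if self.path[-1] + 1 < len(parent):
            self.path[-1] += 1
            return True
        return False

    def predecessor(self) -> bool:
        """Move to the previous sibling inside the surrounding object."""
        if self.path and self.path[-1] > 0:
            self.path[-1] -= 1
            return True
        return False


# Invented sample data: each book is [AUTHORS-list, TITLE, KEYWORDS-set].
books = [[["Smith", "Jones"], "Databases", ["nf2"]], [[], "Untitled", []]]
finger = Finger(books)
finger.enter()      # onto the first book
finger.enter()      # onto its AUTHORS-list
finger.successor()  # onto the title "Databases"
```

In this sketch, up/down and left/right both map onto `predecessor` and `successor`; which arrow key applies would depend on the orientation in which the surrounding object is displayed.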

Note that finger movements are different from window slide operations: fingers are moved from object to object and the window slides along to show the object as close to the middle as possible. A window slide operation slides the window left, right, up or down, e.g. if the user wants to review the neighborhood of an object to which the finger points. The unit of movement there is the window height (minus 2 lines) for up/down and either column width or finger width for left/right. Table II lists our present choices for the corresponding key and mouse actions.

As an example, consider the simulated screens from Figures 12 and 13. The application is placed in a banking environment where a clerk has to verify a signature on a withdrawal of a large sum. After activating a finger F1, a select operation is initiated (ALT+S) and the #ACCNO-attribute is clicked in the scheme to indicate the selection attribute. The desired account number is entered in a separate window and F1 moves over the selected tuple. Pressing ENTER and then the right arrow key twice moves the finger over the AUTHORIZED-field. Pressing ENTER again and "→" once moves the cursor over the first signature, which then might be zoomed for better resolution. All this could easily be done on a low-cost PC-type data entry station with modest graphics capabilities (EGA would most likely suffice).

4.2 Ambiguities

Table II: Key Strokes and Mouse Action for Cursor Movement

Since eNF2 tables have a tendency to become very large, a mouse and scroll-bars for the windows can considerably speed up visual selection. Without a mouse, the arrow keys do double duty: they are used for the movement of the active finger and they control the sliding of the table under the window (CTRL+arrow key). Note, however, that clicking a certain field with the mouse cannot resolve the ambiguity whether the finger is supposed to move over a particular atomic value or over any of its surrounding complex values.

Because we keep changing the environment, i.e. we alter key assignments, pick lists for operation initiation, etc., we have little experience with user acceptance of this interface. Certain problems, however, are evident. It is practically impossible to enter even the smallest table without making errors. The reason is that data entry progresses both to the right and down with "leaps" in between. The leaps are needed to signal the end of a set or list, for which we use the escape (ESC) key. On the other hand, pressing ESC is not needed when filling a tuple with atomic values. After an atomic value has been typed and sent off (ENTER), ESCHER progresses to the next attribute and, having reached the last attribute, inserts a new empty tuple. Data entry continues with the first attribute of this tuple, at which time there is an option to press ESC (shown in the entry window, cf. Figure 14). If ESC is not used, the user is forced to fill in the other attribute values because partially filled tuples are not permitted (a Null-value option is to be added later).

While experienced typists can easily handle flat tables, they will very quickly give up on eNF2 tables, in particular if set- or list-valued attributes have a varying number of atomic elements like the AUTHORS-list and the KEYWORDS-set in our BOOKS-example. Things become very tricky when such an attribute receives an empty value, i.e. if a book has no AUTHORS as in Figure 1 or if no KEYWORDS are known for a book. These values are perfectly permissible and are accepted by ESCHER, but ESCHER must ask the user to resolve the following ambiguity.

Fig. 14: Ambiguous Data Entry Must be Resolved by User

Pressing ESC on the first KEYWORD signals an empty set of KEYWORDS and ESCHER continues with the first L_NAME value of the next book’s AUTHORS-list. No user prompting is required. Now, if the user presses ESC again, ESCHER does not know whether this means an empty list of AUTHORS or whether it signals that the last book has been entered and the BOOKS-set is to be closed. This causes the prompt in Figure 14.

If books were classified according to topics, the ambiguity of ESC stretches over two levels. ESC might signal the last author of a book, an empty list of authors, the last book in a topic, an empty topic, the last topic in the set of topics or an empty library, all depending on the context and the user reply. This is also a convincing argument for not having a complex valued attribute in the first column (cf. our Normal Form hint in Subsection 2.3 above).
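The way the possible readings of ESC multiply with the nesting depth can be illustrated by a small Python sketch. It rests on a simplified model that is our assumption, not ESCHER's actual logic: each enclosing set or list is described by its name, the number of elements already entered, and a flag telling whether it is the first attribute of the tuple it belongs to. The names `Level`, `esc_meanings` and `needs_prompt` are hypothetical.

```python
from typing import List, NamedTuple


class Level(NamedTuple):
    name: str           # name of the set or list, e.g. "AUTHORS"
    entered: int        # elements already completed in this collection
    first_column: bool  # is it the first attribute of its surrounding tuple?


def esc_meanings(chain: List[Level]) -> List[str]:
    """Possible readings of ESC; chain is ordered innermost to outermost."""
    meanings = []
    for level in chain:
        if level.entered == 0:
            meanings.append(f"empty {level.name}")
        else:
            meanings.append(f"close {level.name}")
        # ESC can only refer to the next outer collection while nothing has
        # been typed for the pending element of that outer collection.
        if level.entered > 0 or not level.first_column:
            break
    return meanings


def needs_prompt(chain: List[Level]) -> bool:
    """A prompt is required whenever ESC has more than one reading."""
    return len(esc_meanings(chain)) > 1


# ESC on the first KEYWORD: unambiguous, an empty KEYWORDS-set.
keywords_case = [Level("KEYWORDS", 0, False), Level("BOOKS", 1, False)]
# ESC on the first L_NAME of the next book: empty AUTHORS or end of BOOKS.
authors_case = [Level("AUTHORS", 0, True), Level("BOOKS", 1, False)]
```

Under this model, `esc_meanings(authors_case)` yields two readings, which corresponds to the situation in which a prompt like the one in Figure 14 is needed, whereas `keywords_case` yields only one. Adding a further level whose collection also sits in the first column lengthens the list of readings accordingly.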

The cursor movement becomes more difficult (and more confusing for the user) if style attributes in the scheme call for omission of lines and horizontal orientation of sets and/or lists. Consider again the extreme application of the eNF2 editor as a word processor. Pressing either Arrow-Up or Arrow-Left would move the cursor to the predecessor object, i.e. the character to the left, or would sound the bell if it were the leftmost character in a line. Simply jumping up one line or wrapping around to the previous line is not feasible within the present semantic cursor control of ESCHER. On the positive side, standard pull-down menus could be implemented as special cases of eNF2-tables with a style-attribute value "erase if no finger on subobject" for each of the submenus.

Very little has been implemented concerning the script- and link-type attributes. Script values will be represented by the script’s name. Moving a finger onto a script value in a table and pressing ENTER causes the script to execute, which matches exactly the semantics of common menus or pick-lists.

For link-values we might have either the name of the object pointed to or special symbols like "■" and "❏" for links to unnamed objects with the white square signaling a NULL-link. Named links would find their application in user views where the Pool offers a choice of views for a particular table which is selected by moving the finger over the link name and pressing the ENTER-key. As a result, the corresponding projected and/or nested/re-grouped table is shown in a window which is stacked on top of existing windows. The window is closed by pressing ESC when the finger is on the outermost level. This agrees both with finger semantics and user intuition concerning opening and closing of windows.

Unnamed links come into play in connection with (secondary) key indexes. An eNF2-index is of course a normal eNF2 table with two attributes: key and set of links. Visually reviewing an index would indicate the number of matching entries for each (secondary) key value (number of black squares). Looking at any particular indexed tuple calls again for moving a finger on the link field (black square) and pressing ENTER (double click of left button for mouse operation).
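The idea of an index as an ordinary two-attribute table can be sketched in Python as follows. The function names `build_index` and `render_index`, the use of tuple positions as links, and the sample data are assumptions made for this illustration; how ESCHER represents links internally is not specified here.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

Index = List[Tuple[Any, List[int]]]  # rows of (key, set of links)


def build_index(tuples: List[Dict[str, Any]], attr: str) -> Index:
    """Build a (key, links) table; a link is the position of a tuple."""
    links = defaultdict(list)
    for position, row in enumerate(tuples):
        links[row[attr]].append(position)
    return [(key, links[key]) for key in sorted(links)]


def render_index(index: Index) -> List[str]:
    """Show one black square per unnamed link, as in the visual review."""
    return [f"{key}: {'■' * len(entries)}" for key, entries in index]


# Invented sample table with a secondary key attribute BRANCH.
accounts = [
    {"ACCNO": 1, "BRANCH": "Kassel"},
    {"ACCNO": 2, "BRANCH": "Turku"},
    {"ACCNO": 3, "BRANCH": "Kassel"},
]
branch_index = build_index(accounts, "BRANCH")
lines = render_index(branch_index)  # ["Kassel: ■■", "Turku: ■"]
```

Following a link then amounts to looking up `accounts[position]`, which corresponds to pressing ENTER on a black square.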

5. Summary

The extended NF2 data model is a very powerful tool for structuring information in office automation, scientific and engineering applications. Combined with a data definition and manipulation language, like the SQL-extension HDBL for AIM-P, it provides great flexibility, improves performance by avoiding joins over large flat tables and helps the user by grouping logically connected data into neighbouring fields.

However, NF2 database systems and their accompanying languages are hard to understand and even harder to implement. Thus, a visual database editor, called ESCHER, was devised which shows tables and schemes in tabular form. The idea is to give the casual user a tool for reviewing, entering and modifying data. At the same time, self-referencing methods, e.g. eNF2 tables to hold implementation data for the eNF2-editor, were employed to facilitate the implementation process.

The principal idea for manipulating eNF2-objects is to use two-dimensional cursors which point to objects in the tables and schemes. ESCHER uses color to identify the position of the cursor. Several cursors can co-exist in a table, even with one cursor nested inside another cursor. Cursors move in a syntactically/semantically restricted way but otherwise behave and can be used just like a cursor in a text processing system.

Cursors, also called fingers because they are used to point to objects, are internally implemented by stacks. They serve as implementation tools as well, e.g. for displaying an object in one of the windows. This is done by letting two fingers run in parallel, one through the table tree and one through the scheme tree. Initial fears that this approach would be too slow for object redrawing in "real-time" proved groundless.
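The parallel traversal of scheme and table can be sketched as follows. This is a simplified recursive stand-in, not the stack-based finger code of ESCHER: the tagged-tuple encoding of schemes, the function name `walk` and the textual output format are all assumptions made for this illustration.

```python
from typing import Any, List, Optional

# Assumed scheme encoding:
#   ("atom", name)
#   ("tuple", [field_scheme, ...])
#   ("set", name, element_scheme) or ("list", name, element_scheme)


def walk(scheme: tuple, value: Any, depth: int = 0,
         out: Optional[List[str]] = None) -> List[str]:
    """Step through scheme and data in lockstep, emitting one line per node."""
    if out is None:
        out = []
    indent = "  " * depth
    kind = scheme[0]
    if kind == "atom":
        out.append(f"{indent}{scheme[1]} = {value}")
    elif kind == "tuple":
        # Each field scheme is paired with the field value at the same position.
        for field_scheme, field_value in zip(scheme[1], value):
            walk(field_scheme, field_value, depth, out)
    elif kind in ("set", "list"):
        out.append(f"{indent}{scheme[1]} ({len(value)} elements)")
        # The single element scheme is reused for every element of the data.
        for element in value:
            walk(scheme[2], element, depth + 1, out)
    else:
        raise ValueError(f"unknown scheme node: {kind}")
    return out


# Invented sample: a BOOKS-set of tuples with an AUTHORS-list and a TITLE.
books_scheme = ("set", "BOOKS", ("tuple", [
    ("list", "AUTHORS", ("atom", "L_NAME")),
    ("atom", "TITLE"),
]))
books_value = [(["Smith", "Jones"], "Databases"), ([], "Untitled")]
display = walk(books_scheme, books_value)
```

The sketch also shows why scheme and data trees do not map onto each other uniformly: one element scheme in the scheme tree corresponds to zero, one or many nodes in the data tree.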

Similarly, the techniques of using eNF2 structures to implement parts of ESCHER and of introducing an infinite "scheme of schemes", called the BootScheme, which also has the fixpoint property, were at first considered risky design decisions. Soon, however, the ability to visually inspect one’s own implementation turned out to be a valuable test aid. By now we are fully convinced that having a metascheme is the only proper way to handle schemes.

While bootstrapping the system on the data structure side is solved, we feel something must be done on the algorithms side. One solution which we indicated is the so-called ESCHER-Scripts. They are a variant of Postgres program-valued attributes [Stonebraker]. Another feature we consider indispensable is links, which we need e.g. for views and indexes in the form of eNF2 tables.

First experiences with ESCHER show that the eNF2-editor is a very practical teaching tool for understanding the semantics of this data model. At present, we use it extensively to study the effects of various sorting and duplicate elimination strategies. However, despite its appealing visual form, data entry in NF2 tables is extremely error-prone even for experienced users. Most of this is due to inherent ambiguities in the translation between user intentions and resulting table structure, e.g. in signaling empty sets and empty lists. This also holds for non-uniform mappings between scheme and data trees, referred to in this paper as the "pseudo-tuple problem". Unless complex attributes of arity 1 have an extra tuple inserted for each attribute value and even atomic values become a node of their own, both of which are unacceptable in terms of performance, this non-uniformity cannot be solved and will keep harassing users and designers of NF2 databases.

Much remains to be done. In particular, the integration of images and vector graphics could turn eNF2 databases into a type of hypermedia vehicle [ reference to hypertext?]. Given the computing power of today's workstations, an eNF2-editor could be used as a word processor, in designing user interfaces with pull-down menus, in forms management and in a CAD/CAM environment, all using a set of uniform syntax/semantics rules borrowed from the eNF2 data model.

Looking into the future, it seems feasible to hook up a fast random access digital video recorder, select a certain video clip from an eNF2 table and then edit this film in an ESCHER window using the cursor mechanism described in this paper. What can be done and what shouldn’t be done with object-oriented database systems will only become clear when users start experimenting with their NF2-systems, visually reviewing and modifying their designs. Unless repeated routine queries are to be performed, they will need visual aids like ESCHER. After all, if a picture is worth a thousand words, then an eNF2 window is certainly worth a thousand nested queries.

References

[1] S. Abiteboul, N. Bidoit: Non First Normal Form Relations: An Algebra Allowing Data Restructuring, Rapport de Recherche No 347, Institut de Recherche en Informatique et en Automatique, Rocquencourt, France, Nov. 1984

[2] S. Bing Yao, A.R. Hevner, Z. Shi, and D. Luo: FORMANAGER: An Office Forms Management System, ACM TOIS, Vol. 2, No. 3, July 1984, pp. 235-262

[3] D. Bryce, R. Hull: SNAP: A Graphics-based Schema Manager, Proc. IEEE Intl. Conf. on Data Engineering, Feb. 1986, pp. 151-164

[4] P. Dadam, K. Küspert, F. Andersen, H. Blanken, R. Erbe, J. Günauer, V. Lum, P. Pistor, G. Walch: A DBMS Prototype to Support Extended NF2 Relations: An Integrated View on Flat Tables and Hierarchies, Proc. ACM SIGMOD Conf., Washington D.C., May 1986, pp. 356-367

[5] P. Dadam, K. Küspert, N. Südkamp, R. Erbe, V. Linnemann, P. Pistor, G. Walch: Managing Complex Objects in R2D2, IBM Heidelberg Scientific Center, TR 88.03.004 (March 1988)

[6] U. Deppisch, J. Günauer, G. Walch: Storage Structures and Addressing Concepts for Complex Objects of the NF2 Relational Model (in German), Proc. GI Conference on "Datenbanksysteme für Büro, Technik und Wissenschaft", Karlsruhe, March 1985, pp. 441-459

[7] B. Ernst: Der Zauberspiegel des Maurits Cornelis Escher, TACO Verlagsgesellschaft und Agentur mbH, 112 pp., Berlin, 1986

[8] D. Fogg: Lessons from a "Living in a Database" graphical query interface, Proc. ACM SIGMOD Int. Conf. on the Management of Data, 1984, pp. 100-106

[9] G. Jaeschke, H.J. Schek: Remarks on the Algebra of Non First Normal Form Relations, Proc. ACM SIGACT-SIGMOD Symp. on Principles of Data Base Systems, Los Angeles, Cal., March 1982, pp. 124-138

[10] Kitagawa, et al.: Form Document Management System SPECDOQ, Proc. ACM SIGOA Conf. on Office Information Systems, Toronto, Ont., Canada, June 25-27, 1984 (published as SIGOA Newsletter, Vol. 5, Nos. 1-2), 1984, pp. 132-142

[11] Kitagawa, et al.: Formgraphics: A Form-Based Architecture Providing a Database Workbench, IEEE Computer Graphics and Applications, Vol. 4, No. 6, 1984, pp. 38-56

[KSW89] K. Küspert, G. Saake, and L. Wegner: Duplicate Detection and Deletion in the Extended NF2 Data Model, Proc. of the Third Int. Conf. on Foundations of Data Organization and Algorithms, Paris, June 20-23, 1989, Springer LNCS, 1989, pp. ,

[12] V. Linnemann, K. Küspert, P. Dadam, P. Pistor, R. Erbe, A. Kemper, N. Südkamp, G. Walch and M. Wallrath: Design and Implementation of an Extensible Database Management System Supporting User Defined Data Types and Functions, Technical Report TR 87.12.011, IBM Heidelberg Scientific Center, Dec. 1987

[13] R.A. Lorie, W. Plouffe: Complex Objects and Their Use in Design Transactions, Proc. Annual Meeting - Database Week: Engineering Design Applications (IEEE), San Jose, Cal., May 1983, pp. 115-121

[14] D. Luo and S. Bing Yao: FORM Operation by Example - a Language for Information Processing, Proc. SIGMOD Conf., June 1981, pp. 213-223

[15] L. Mark and N. Roussopoulos: Metadata Management, IEEE Computer, Vol. 19, No. 12, Dec. 1986, pp. 26-36

[16] R. Purvy, J. Farrel, and P. Klose: The Design of Star’s Record Processing: Data Processing for the Noncomputer Professional, ACM TOIS, Vol. 1, No. 1, Jan. 1983, pp. 3-24

[17] G. Rohr: Graphical User Languages for Querying Information: Where to Look for Criteria, Proc. IEEE Workshop on Visual Languages, Pittsburgh, PA, Oct. 10-12, 1988

[18] H.J. Schek, M. Scholl: The Relational Model with Relation-Valued Attributes, Information Systems, Vol. 11, No. 2, 1986, pp. 137-147

[19] N. Shu: FORMAL: A Forms-Oriented, Visual-Directed Application Development System, IEEE Computer, Vol. 18, No. 8, Aug. 1985, pp. 38-49

[20] M. Stonebraker, J.Kalash: TIMBER: A sophisticated relation browser, Proc. 8th Int. Conf. Very Large Databases, 1982, pp. 1-10

[21] M. Stonebraker, H. Stettner, N. Lynn, J. Kalash, and A. Guttman: Document Processing in a Relational Database System, ACM TOIS, Vol. 1, No. 2, April 1983, pp. 143-158

[22] M. Zloof: Query-by-Example: A data base language, IBM Systems Journal 16 (1977), 324-343

[23] M. Stonebraker, L.A. Rowe: The Design of Postgres, Proc. ACM SIGMOD, Washington, D.C., 1986, pp. 340-355

[24] M. Stonebraker: Inclusion of New Types in Relational Database Systems, Proc. 2nd Int. Conf. on Data Engineering, Los Angeles, Feb. 1986, pp. 262-269

[25] G.-J. Houben and J. Paredaens: A Graphical Interface Formalism: Specifying Nested Relational Databases, Proc. of the IFIP TC2 Working Conference on Visual Database Systems, Tokyo, 3.-7. April 1989, Elsevier Science Publ., Amsterdam, 1989, pp. 257-276

[26] L. Wegner: ESCHER - Interactive, Visual Handling of Complex Objects in the Extended NF2-Database Model, Proc. of the IFIP TC2 Working Conference on Visual Database Systems, Tokyo, 3.-7. April 1989, Elsevier Science Publ., Amsterdam, 1989, pp. 277-297
