LANCASTER UNIVERSITY Computing Department
Database Systems: Challenges and Opportunities for Graphical HCI
Pete Sawyer
Cooperative Systems Engineering Group Technical Report : CSEG/14/1995
CSEG, Computing Department, Lancaster University, LANCASTER, LA1 4YR, UK. Phone: +44-524-593041; Fax: +44-524-593608; E-Mail:
[email protected]
Abstract
Databases and database applications form one of the most important classes of computer systems yet they have received relatively little attention from the HCI community. They have nevertheless spawned some notably innovative user interfaces and it is interesting to examine these in the light of contemporary HCI issues. This paper addresses the relationship between HCI and database systems, reviews some of the major themes running through existing database user interfaces and postulates some issues which are likely to be important to database usability in the future. The central argument underlying this paper is that databases are sufficiently different from other classes of application to necessitate a raft of user interface techniques specifically for the needs of database users which would reward increased attention by the HCI community.
Keywords: graphical user interfaces, querying, browsers, data visualisation, databases, data models.
1. Introduction
User interfaces to databases have always been something of a Cinderella, outside the mainstream of both HCI and database management systems (DBMS) research. However, the importance of the topic has recently been recognised by the database research community. At an informal workshop held at Laguna Beach in February 1989, attended by 16 senior DBMS researchers from the USA and Germany, End User Interfaces were identified as the most important topic for future research. In April 1993, a similar gathering of 5 researchers in Vienna reiterated their interest in interfaces but "they lamented that progress in this area continues to be done by industry, and the research community has very little impact on this important topic" [Stonebraker 93a].
While database systems have been slow to attract the attention of mainstream HCI researchers, the development of database system technology has nevertheless spawned several innovative user interfaces. For example, relational DBMSs provide, through their declarative query languages, a flexibility unrivalled by other command-line-driven applications. Similarly, QBE [Zloof 75] and other associated example-based query languages [Ozsoyoglu 93] exhibited many of the principles ascribed to direct manipulation user interfaces at a time when word processing was still performed with line editors. It is true, therefore, that the practice and theory of HCI have been influenced by the work of Zloof and a few others operating in the field of database systems research. However, the database and HCI research communities have been largely content to go their independent ways. That this situation is slowly changing is evinced by dialogues such as those at Laguna Beach and Vienna, the IFIP 2.6 Visual Database Systems (VDB) conference series and the User Interfaces to Database Systems (IDS) workshop series.
One reason why database user interface design has been a backwater of HCI research is that database systems have quite specific characteristics. On one level this may have fostered the notion that database systems and applications were uninteresting. This is a mistaken view. Quite apart from the economic importance of database systems, these same distinctive characteristics should render databases of particular interest to HCI researchers. For example: users often need to interact with very large volumes of data; database systems are fundamentally cooperative in that data is shared by multiple users; and there is a quite distinct division of roles for users of database systems.
A particular challenge is provided by uncertainty about the applicability of principles of HCI derived from other domains. For example, one of the key features of direct manipulation systems as defined by Shneiderman [Shneiderman 82] is that all the items of data of interest to the user are continuously displayed for the duration of the user's interest. As observed by Draper [Draper 1992], the sheer volume of data which a database user may need to manipulate poses particular difficulties for meeting this requirement. This implies that databases may challenge some of the accepted tenets of HCI. Of course, databases are themselves a dynamic area of research, which means that database systems are moving targets for user interface designers.
This paper seeks to promote awareness of the importance of HCI to database systems and of some of the challenges and opportunities posed by database systems to user interface designers. Specifically we will discuss the following:
• What are the distinctive features of interacting with large volumes of persistent data in contrast to smaller volumes of transient data manipulated by other interactive applications and how do they affect the design of the user interface?
• How do different database technologies influence users' interaction styles?
• How are new database models imposing additional requirements on users' interaction mechanisms and how might modern user interface technology support these?
The rest of the paper is structured as follows. The factors contributing to the increased importance of HCI to DBMSs are presented in section 2. Section 3 introduces those principal features of database systems of relevance to the design of their user interfaces. Section 4 discusses the use of graphics to provide end-user interfaces characterised by the two dominant paradigms of querying and browsing. Visualisation issues are examined in section 5. Section 6 examines user interfaces for database designers in terms of data definition and tools for designing tailored database user interfaces. Section 7 concludes the paper by identifying some future research directions.
2. Factors contributing to the increased importance of HCI to DBMSs
Before reviewing the generic characteristics of database systems which impose particular requirements on user interfaces, it is worth briefly considering some of the major trends which are currently acting to stimulate the need for HCI research in the area.
User expectation: The most popular personal computer applications have converged on a number of windowing systems and associated interaction techniques, invariably employing direct manipulation [Shneiderman 82] user interfaces. The success of these applications has become such that users naturally expect the same levels of usability from their database applications. At one level, this observation refers to issues of "look-and-feel". At a more fundamental level, the success of these personal computer applications is due to the care with which they have been engineered to support particular user tasks. It follows therefore that database tools must closely match the quality (as perceived by the user) of other applications in the user's toolkit.
The explosion in popularity of internet use is another factor which is shaping users' information access expectations. This is a very recent phenomenon which was both unpredicted and unpredictable. While for the purposes of this paper we will adopt a fairly narrow meaning of "database" which excludes "unmanaged" collections of data such as the world-wide-web [Berners-Lee 94], such systems nevertheless exert a strong influence. Perhaps the key lesson of the 'web is that it is unreasonable to expect users to be concerned with the niceties of the underlying data management or to tolerate the constraints which it places on their access to the data. Whether the data is stored in relational tables or simple ASCII files; is text or video; or is stored locally or 8000 miles away, an enormous community of users has become accustomed to being able to access it through an ever-increasing variety of graphical tools.
The challenge to HCI is in reconciling the specific characteristics of database systems which have dictated the design of the normal means of interacting with a database (e.g. query languages, by-example form-filling, schema browsers, etc.), with users' expectations shaped from different domains. What unites databases with other application domains is the user's view of an application as a task-supporting tool. Wide variance in usability and interaction styles is therefore becoming decreasingly acceptable to users.
New application domains: Application domains are emerging which require the services of DBMSs but which exhibit characteristics very different from those traditionally addressed by database technology. In these new domains data may be distributed, may include very different data and media types, and may change dynamically. Some applications, such as planetary remote sensing, generate data in immense volumes yet interpretation of the data may require it to be displayed en masse. Data modelling in many application domains may pose such difficulties that it cannot be performed entirely by a priori analysis but must be performed incrementally. This may necessitate frequent changes to the database schema with the consequent requirements that the user interface must be able to cope with unnormalised data and to adapt seamlessly to changes in the schema. Other domains may present cultural impediments to the easy adoption of database systems. For example, Geographical Information Systems (GISs) manipulate spatial concepts such as contours, roads, watercourses, etc. and a lexicon of GIS application terms (e.g. overlay, buffer, etc.) has evolved. The "map algebras" used to model these spatial relations do not map easily onto standard DBMS query languages. Clearly, new database application domains will continue to impose novel requirements on their user interface design.
New database technologies: New storage media, data models, languages, algorithms, etc. are being developed to address the increasingly diverse application domains [Lockemann 90]. Many are also motivated by
the changing requirements of existing domains where issues such as distribution and interoperability are becoming increasingly significant. These new underpinning technologies exert an influence on database user interfaces. They may impose new user interface requirements or expose deficiencies in previously acceptable user interfaces. Conversely, however, they may actually remove technological constraints and enable the effective use of previously inappropriate interaction and visualisation techniques.
A factor which is not directly technological but which is also exerting a demand for a new class of database tools and associated user interfaces is the changing role of the database administrator. Traditionally, there has been a model of a single database administrator (DBA), responsible for many aspects of the day-to-day running and maintenance of a database system. This role is coming under a number of challenges. Economic pressures are denying hard-pressed IT departments the resources to formulate and run database queries, reports, etc. for their clients. Users therefore have to be empowered to perform these functions themselves. This factor is of course closely related to that of user expectations. Many users of personal computer based DBMSs such as FileMaker are gaining experience and expertise with performing these functions through the use of form-based and other user interfaces. Even in the situation where the DBA will eventually perform the user's query, a growing number of end-users are capable of posing the query for themselves. Such users expect DBMSs on any platform to provide a useful and understandable user interface which allows the formulation and execution of an ad-hoc query.
The notion of groupwork, as espoused in groupware and in CSCW (Computer Supported Cooperative Work), also challenges the single DBA position. In a more organic working situation, it may be that the DBA role changes from day-to-day.
It may even be that there is a team working as the DBA or that any user of the system is allowed to perform a DBA function. This organic aspect is particularly relevant to the recently emerged (but still evolving) object-oriented database model [Kim 90]. Here, the schema, far from being a rigid artefact created at day zero of the database's lifetime, is allowed to evolve. Such evolution comes under the auspices of the DBA but in a groupworking environment it may become a group responsibility. Appropriate support in the form of group evolution interfaces is required. Groupwork poses a number of interesting challenges, not only for the HCI community but for database technology itself. However, this is beyond the scope of the current paper and is not addressed further.
New user interface technologies: Powerful graphical workstations, associated peripheral devices and windowing system software are now the norm in most workplaces. These have spawned many languages and user interface development environments which are available to a range of users from software development professionals to end-users. Even if the latter have no interest in programming they may nevertheless find themselves empowered to do so because of the availability of tools for (for example) tailoring applications. Similarly, as processor speeds increase and storage technology improves, new renderings and visualisation techniques are becoming available to system designers. These include integrated digital sound and video. The associated data types are quite different from the traditional data types and present challenges for the database technology. In addition, however, any user interfaces to multimedia databases must also accept the challenge of displaying data objects composed of both traditional and multimedia data items.
Aside from technological aspects, HCI has matured as a discipline to the point where there is now a large and growing body of empirical data about what makes a good user interface. Similarly, a raft of methods for user requirements modelling and interactive system design now exists which, if not foolproof, makes poor user interface design less excusable.
3. Characteristics of Database Systems
Having examined the factors which are increasing the relevance of HCI to database systems, we now examine particular features of database systems which have a significant impact on the design of user interfaces.
User functions
There are broadly two functions provided by database user interfaces and these highlight two different classes of database user:
• To define the schema. Traditionally, this is the role of the database designer and subsequent modification is the role of the database administrator who may or may not be the same person.
• To provide access to the data stored within the schema. This function is provided for end-users.
Data definition is a specialised task which traditionally follows a phase of requirements analysis. This results in a model of what needs to be stored in the database and the schema designer must translate this, often through several levels of abstraction and involving normalisation of the model, into a schema design which can be directly encoded using a data definition language (DDL). In addition to textual DDLs, graphical DDLs and tools to support the higher levels of data modelling are becoming more common.
There are broadly two techniques for accessing data in a database. The first is where the user formulates a logical expression over the types and relationships in the database to define the subset of data which is of interest. The second is where the user defines a path to navigate the relationships linking the types and the sets of instances of types in order to isolate the data of interest. Where the user knows the organisation of the schema and uses this knowledge to access data, they can query the database. Queries can be formulated interactively or embedded in application programs. An alternative which has gained importance recently is browsing. Here the user needs no a priori knowledge of the schema and is able to explore the database by following connections between data.
Performance Issues
In normal use, there are two components in any database system: the server, which can be thought of as the DBMS, and the client, a program executing on behalf of the user which communicates with the DBMS. Such client programs, which may for example use embedded SQL, will "check out" data items which are then stored in program variables for manipulation. Eventually, if required, the data items are "checked in" to the DBMS and any value changes are recorded. This process of check out and check in can be encapsulated in a transaction, the unit of atomicity and consistency for databases. Such transactions are normally short (in terms of time). Systems which check out data which will not be returned to the database for some time (resulting in so-called long transactions) present a number of problems. Such systems include CAD/CAM, where the design is held in a database. A component of the design may be checked out and presented to the user for manipulation until such time as the changes are made persistent and the component returned to the database. The point here is that the end-user is manipulating a representation of the data; any changes to the representation are not made permanent (and thereafter available to all other users) until the end of the long transaction.
This implies that some HCI techniques popular in other application areas may be impractical or may fail to support the semantics the user expects from these other areas. For example, consider the use of sliders. A slider is an interaction technique which frequently forms a component of direct manipulation user interfaces and as such, relies on rapid semantic feedback so that the slider tracks the user's hand movements as the user "drags" the mouse cursor.
Such sliders may appear in a "control panel" style of interface, allowing the user to set some numeric value (for example, a temperature at which the thermostat of a heating system should switch on the boiler). The use of a slider in a naive way to set, say, a numerical value in a database record (for example, an employee's current salary), where moving the slider generated frequent updates to the value, each encapsulated by a transaction, would be most inappropriate. The resultant latency in providing confirmation of the success of users' actions poses serious problems to the application of the principles of direct manipulation to databases. The example is somewhat nonsensical, but the point remains valid: an acceptable use of a direct manipulation component becomes problematic when applied to a database application. Having given the slider as an example, however, we must note that sliders have been successfully employed as part of the Starfield visualisation system [Jog 95].
Data modelling
Another requirement of any database is that the data must be stored in an appropriate and definable format. This is one of the key differences between a DBMS and a simple file store; the database uses a high-level model of the real-world data. How that data is physically organised in files is transparent to users and application programs. A DBMS will provide a number of primitive types (integer, character, date, etc.) and mechanisms for composing higher level composite types. The set of high-level types and their relationships within a database constitute the database schema or meta-data. The semantics of the schema vary according to the data model which the DBMS implements. Data models vary from those providing little abstraction over the underlying storage medium and optimised for efficiency, to more abstract models allowing real-world data and application semantics to be modelled closely. A number of important data models have evolved and new ones continue to evolve. This is a constant theme of database research and has an important effect on user interface issues.
The schema defines the actual implementation of how the real-world data is to be modelled by the database and this is of course highly data model-specific. Modelling notations have been developed for designing the schema at a more abstract, data model independent level. The best known of these is Chen's entity-relationship (E-R) model [Chen 76]. For a database to adopt the E-R model implies nothing about the schema, which may adhere to any one of a number of data models.
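The slider problem raised under Performance Issues above can be made concrete with a short sketch. It is only an illustration under stated assumptions: the employee table, the SalarySlider class and its drag/release methods are hypothetical names introduced here, and SQLite (via Python's standard sqlite3 module) stands in for a client-server DBMS. The idea is that dragging updates only the client-side representation, giving rapid feedback, while a single short transaction "checks in" the final value when the user releases the slider.

```python
import sqlite3


class SalarySlider:
    """Hypothetical slider model: tracks drags locally, commits once on release."""

    def __init__(self, conn, employee_id):
        self.conn = conn
        self.employee_id = employee_id
        row = conn.execute(
            "SELECT salary FROM employee WHERE id = ?", (employee_id,)
        ).fetchone()
        self.value = row[0]   # client-side copy of the data item ("checked out")
        self.commits = 0      # number of transactions actually issued

    def drag(self, new_value):
        # Rapid feedback: only the in-memory representation changes.
        self.value = new_value

    def release(self):
        # One short transaction records the final value ("check in").
        with self.conn:  # commits on success, rolls back on an exception
            self.conn.execute(
                "UPDATE employee SET salary = ? WHERE id = ?",
                (self.value, self.employee_id),
            )
        self.commits += 1


def demo():
    # Made-up schema and data, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, salary INTEGER)")
    conn.execute("INSERT INTO employee VALUES (1, 20000)")
    conn.commit()

    slider = SalarySlider(conn, 1)
    for value in range(20000, 25001, 500):  # eleven intermediate drag positions
        slider.drag(value)
    slider.release()

    stored = conn.execute("SELECT salary FROM employee WHERE id = 1").fetchone()[0]
    return stored, slider.commits
```

Calling demo() issues one update transaction however many intermediate positions the slider passes through. The cost of this design is that other users do not see the intermediate values, which is exactly the gap between representation and stored data described above.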
4. End-user interfaces
The data model adopted by the underlying database system forms the background to most database user interfaces. In very broad terms, browsers appear best suited to meet the requirements of interaction with object-oriented systems while graphical query systems seem most effective as visual analogues of relational query languages. This is not a rigid distinction, however, as both browsers and query systems frequently coexist within the user's database toolkit. In addition, the above are only two of the many common data models. The decision of whether to adopt browsing or querying may be less clear-cut for other models.
In this section we examine graphical user interfaces to databases, which we characterise as querying or browsing systems. We are principally concerned with end-user interfaces in this section; systems which permit the retrieval and (usually) input of data rather than those which permit definition of or changes to the schema. However, many of the systems described also allow this facility and hence double as database design/administration tools. An additional characteristic is that all the systems are generic; they may be specific to a particular DBMS but they automatically generate interfaces for any database which is an instance of that DBMS.
Graphical Querying
Formulating and running queries on a database is a complex, incremental and error-prone task. Users have to understand the syntax and semantics of the query language and have a mental model of the schema. It has been clear for many years that graphics have the potential to help users query a database more effectively, reducing the cognitive load by visualising the data and the schema and reducing the syntactic knowledge required to be retained. Graphical querying refers to a range of database user interface styles. At one end of the spectrum are the well established example-based user interfaces and at the opposite extreme are systems which exploit sophisticated graphical visualisations of data and allow users to manipulate these spatially to formulate queries.
Example-based systems
The progenitor of all example-based query user interfaces was QBE [Zloof 75]. This was an early example of a user interface which exhibited characteristics of direct manipulation; users interact directly with tabular representations of the entities (relations/tuples) they are interested in instead of doing so through the medium of linguistic abstractions (such as SQL). In addition, QBE is characterised by Myers [Myers 90] as a visual programming language and this description applies equally well to all graphical database user interfaces, as selecting and manipulating data in a large database with a complex schema is fundamentally a programming task. Indeed, the original rationale for QBE was to provide an alternative to the formulation of complex queries using a query language.
QBE represents the tabular data model of relational systems directly on the user's screen as tables; two-dimensional visualisations of relations whose columns represent relation fields and whose rows represent tuples. This is a conceptually simple but effective means of visualising the data within a relational database and requires only a character-based terminal. To express a query, the user chooses the name of the relation(s) of interest and is presented with a table representing the relation with each field labelled with its name. The user fills in the table cells to constrain the set of tuples and values of interest and these are then converted into an equivalent relational algebra expression and executed as a normal query. Users do not need to remember the names of the relation fields and need only learn the syntax of the QBE operators and expressions.
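The conversion step just described can be sketched in a few lines. This is a minimal illustration and not Zloof's algorithm: the function name qbe_to_sql, the representation of a filled-in template as a dictionary from field name to cell text, the sample rows and the choice of SQL as the target (rather than relational algebra) are all assumptions made here. The relation and field names are those of the SE30 example that follows in the text.

```python
import sqlite3

# Comparison operators recognised in a cell; two-character forms are tried first.
OPERATORS = (">=", "<=", "<>", ">", "<", "=")


def qbe_to_sql(relation, template):
    """Translate a filled-in template (field -> cell text) into SQL and parameters.

    Field and relation names are interpolated directly into the SQL text, so a
    real tool would first validate them against the schema.
    """
    printed, conditions, params = [], [], []
    for field, cell in template.items():
        cell = cell.strip()
        if cell.startswith("P."):        # "P." marks a field to be printed
            printed.append(field)
            cell = cell[2:].strip()
        if not cell:                     # blank cell: no constraint on this field
            continue
        for op in OPERATORS:
            if cell.startswith(op):
                value = cell[len(op):].strip()
                break
        else:                            # a bare example value means equality
            op, value = "=", cell
        conditions.append(f"{field} {op} ?")
        params.append(int(value) if value.isdigit() else value)
    sql = f"SELECT {', '.join(printed) or '*'} FROM {relation}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE SE30 (owner TEXT, cpu TEXT, internalHD INTEGER, "
        "memory INTEGER, floppy TEXT, network TEXT, peripheral TEXT)"
    )
    # Made-up rows, for illustration only.
    conn.executemany(
        "INSERT INTO SE30 VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            ("John", "68030", 40, 8, "yes", "ethernet", "A4monitor"),
            ("Pete", "68030", 80, 16, "yes", "ethernet", "A4monitor"),
            ("Wolfgang", "68030", 40, 8, "yes", "ethernet", "A4monitor"),
            ("Alice", "68030", 40, 4, "yes", "ethernet", "A4monitor"),
            ("Bob", "68030", 80, 16, "yes", "ethernet", "none"),
        ],
    )
    template = {
        "owner": "P.", "cpu": "", "internalHD": "", "memory": ">=8",
        "floppy": "", "network": "", "peripheral": "A4monitor",
    }
    sql, params = qbe_to_sql("SE30", template)
    owners = sorted(row[0] for row in conn.execute(sql, params))
    return sql, params, owners
```

The sketch handles only a single relation with conjoined conditions. Joins, which QBE expresses through shared example elements across templates, are deliberately left out.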
As a simple example, consider the relation SE30 containing the fields owner, cpu, internalHD, memory, floppy, network and peripheral defined in a database modelling an organisation's computing resources. To print the value of the owner field (indicated by the P.) for all instances of SE30 with at least 8Mb of memory and an A4 monitor, the QBE query in figure 1 could be formulated (the 8 and A4monitor values are the "examples"):
SE30 | owner | cpu | internalHD | memory | floppy | network | peripheral
     | P.    |     |            | >=8    |        |         | A4monitor
a. Formulating a query with QBE

SE30 | owner
     | John
     | Pete
     | Wolfgang
b. Result of the query

select owner from SE30 where memory >= 8 and peripheral = 'A4monitor'
c. Equivalent query expressed in SQL

Figure 1. An example QBE query and SQL equivalent

Practical limitations to the effectiveness of QBE exist. Relations with many attributes are difficult to represent because of limitations on screen size. Queries employing more than one relation (e.g. joins) are difficult to represent in a meaningful way. These limitations suggest that QBE is most useful for simple queries on small relations but becomes unwieldy when more complex situations pertain.
Comparative tests by Yen and Scamell [Yen 93] suggest that within limits QBE has some advantages over SQL. While users made roughly the same number of errors with both SQL and QBE, coding and debugging times were reduced for users of QBE. Yen and Scamell hypothesise that a likely explanation for this result is that the template tables make QBE's syntax much smaller than SQL's. The tests also found that users who learned QBE before SQL were subsequently able to use SQL better than if they learned SQL first. This suggests that the graphical display of relations might provide a more intuitive introduction to relational concepts than a textual query language.
Many variants on QBE have been developed which attempt to address its perceived shortcomings with respect to complex queries and to extend the paradigm to other domains and towards application generation. They provide greater expressiveness through facilities such as decision statements and sub-query forms (analogous to subprograms with parameter passing). A comprehensive review of by-example systems appears in [Ozsoyoglu 93].
Most example-based systems are implemented on top of relational DBMSs and require the database to be normalised to at least first normal form (i.e. fields have atomic values). This has the effect of reinforcing the division between end-user and designer because normalisation of a large relational database is a highly complex task. Unusually, a system called Generalised-Query-by-Example [Jacobs 83] also works for unnormalised relational databases (as well as Hierarchic and Network systems), raising the possibility of non-expert database design (presumably for small non-critical applications) while preserving the efficacy of the user interface tools. This gives a pointer to future developments where the by-example query style has been adapted to recently developed data models.
Iconic query tools
Since the mid 1980s graphical displays, windowing systems and pointing devices have become increasingly commonplace and researchers have sought to exploit their facilities for a more visually sophisticated kind of query language. One recurring theme of many of these systems is the use of stylised graphical representations of database entities (relations, tuples, etc.) which can be directly manipulated to express queries. For convenience we characterise these as iconic query systems although not all employ icons in the accepted sense of atomic graphical objects as their central unit of interaction.
An underlying theme is that expressions of a set-based non-procedural formalism such as relational algebra could be visualised as graphical representations of database entities arranged spatially. Consider the relations staff and postgrad in figure 2.
staff    | name | room       | telnum
postgrad | name | supervisor | room
a. staff and postgrad relations

[Icons labelled staff and postgrad]
b. icons visualising staff and postgrad

[The staff and postgrad icons positioned so that they share a common border]
c. graphical expression of a simple join

Figure 2. staff and postgrad relations

Staff and postgrad contain the attributes illustrated in figure 2a and might each be represented iconically as shown in figure 2b. Figure 2c illustrates how, using this visualisation scheme, a simple join between staff and postgrad could be expressed by the user selecting and dragging one or both of the icons until they shared a common border (if the icons overlapped, the semantics of the resulting query might be different). The purpose of this query is to find all staff who are also registered for postgraduate degrees: in other words, the join of those tuples of the staff and postgrad relations whose values for the common name and room attributes are identical, the semantics of a shared border being taken to mean that common attributes are shared.
This is the approach taken by IconicBrowser [Tsuda 89] (a query tool despite its name). Here, the relation fields/object attributes are represented as icons and the juxtapositions of icons (adjacent, overlapping, etc.) form query expressions processed by a graphical interpreter. PICASSO [Kim 88] adopts a similar strategy but instead of icons representing relations in an opaque manner as shown above, relations are represented as polygons containing textual objects - the relations' attributes. Users can hence manipulate individual attributes by selection and annotation with operators to define queries. This contrasts with purely iconic representations, where the range of queries which can be expressed simply by manipulating the icons representing relations is severely limited and must be augmented with other dialogue mechanisms in order to allow users to formulate queries.
PICASSO gains much of its strength from the particular data model for which it was developed - the universal relational model. This is a variant of the relational model in which the whole schema is represented by a single top-level (universal) relation. Hence users of PICASSO need never worry about the details of the schema - by simply displaying this universal relation, every attribute in the database is represented and PICASSO exploits this to allow users to select any of these attributes and apply operators to them in order to formulate a query.
Iconic user interfaces illustrate attempts to trade ease of use off against expressiveness. We have already seen how purely icon-based query languages have to be augmented with other means of eliciting users' intentions. Other common problems include ambiguity, inability to formulate queries incrementally, difficulty in expressing set operations, and difficulty in integrating input and hence allowing update as well as retrieval operations. Iconic query languages tend to be less complete than the more sophisticated example-based languages which appear, however, to be less easy to use.
Queries on visual and spatial data
A consequence of the emergence of new complex data types is that traditional query languages are incapable of expressing the user's retrieval criteria. This is well seen in databases of visual data where the retrieval of particular items of visual data (graphic images, digitised photographs, video images) needs to be formulated in terms of spatial attributes such as shapes and geometrical relationships between shapes.
This is sometimes called query by content and a raft of, mostly experimental, techniques have been developed to formulate queries. One approach is to employ the concept of the symbolic projection where the spatial relationships of objects on a 2-dimensional plane are characterised in terms of, for example, overlaps, abutments, containments, etc. Many of these concepts are described in [Arndt 95].
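To give a flavour of how spatial relationships of this kind can be derived mechanically, the following sketch classifies the relationship between two objects from the projections of their bounding boxes onto the x and y axes. It is a deliberately simplified illustration, not the formalism described in [Arndt 95]: the Box type, the function names and the five relation labels are assumptions introduced here, and real objects would rarely be axis-aligned rectangles.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box with x1 <= x2 and y1 <= y2 (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float


def interval_relation(a1, a2, b1, b2):
    """Relation of the interval [a1, a2] to [b1, b2] along a single axis."""
    if a2 < b1 or b2 < a1:
        return "disjoint"
    if a2 == b1 or b2 == a1:
        return "abuts"
    if b1 <= a1 and a2 <= b2:
        return "inside"      # also returned when the two intervals are identical
    if a1 <= b1 and b2 <= a2:
        return "contains"
    return "overlaps"


def spatial_relation(a, b):
    """Combine the per-axis projections into one relation of box a to box b."""
    rx = interval_relation(a.x1, a.x2, b.x1, b.x2)
    ry = interval_relation(a.y1, a.y2, b.y1, b.y2)
    if "disjoint" in (rx, ry):
        return "disjoint"    # separated along at least one axis
    if "abuts" in (rx, ry):
        return "abuts"       # touching along an edge or at a corner
    if rx == ry and rx in ("inside", "contains"):
        return rx            # containment must hold on both axes
    return "overlaps"
```

For instance, spatial_relation(Box(0, 0, 10, 10), Box(2, 2, 4, 4)) yields "contains", while two boxes sharing only an edge yield "abuts". A query-by-content interface could compare relations computed from a user's sketch with those stored for images in the database, although how any particular system does so is not specified here.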
Most research has concentrated on the mathematical underpinnings of content based querying, although some prototype systems (e.g. [Cody 95]) are experimenting with graphical tools permitting users to draw shapes and trying to match these to similarly shaped images in the database. In Sketch [Meyer 92], for example, the user employs a graphical editor to draw data objects and the system generates a query from these objects' relative positions. For example, to find a river which flows through two countries the user can draw two disjoint polygons representing countries connected by a sinuous line representing the river. As with graphical user interfaces, such techniques are of particular relevance to users of visual databases because instead of needing to know about their database's data model and the syntax of the underlying query language, they can express queries using their own real-world (if highly stylised) representations of the data. Geographical Information Systems (GIS) impose their own requirements on DBMSs and their user interfaces. Traditional query languages and concepts are inappropriate for users wishing to manipulate spatial concepts, and specialised query languages based on map algebras have evolved (e.g. ARC/Info [ESRI 90]) to permit queries to be expressed in terms of geographical spatial concepts. As in other application domains, graphical user interfaces have been developed to enhance GISs' usability. Several (e.g. [Brossier-Wansek 95]) adopt variants on iconic query systems, where icons represent geographical entities such as towns, woods, etc. which can be directly manipulated to express (for example) adjacency, or connected by arcs representing (for example) road links. Egenhofer [Egenhofer 95] describes an interesting approach to the use of user interface metaphors for enhancing the usability of an application in a particular application domain. A graphical map algebra based on a geographer's desktop metaphor is used.
Here, a query is built up by placing iconic representations of queries on top of a virtual light table. The idea is that each query isolates some set of data from the database (e.g. vegetation, roads, etc.) and represents an analogue of the physical, transparent map overlays which geographers use to build up composite maps of various features. Hence, by positioning these virtual overlays over the virtual light table a composite query can be built up and executed by the underlying
query engine. This spatial manipulation of queries maps closely onto the existing manual working practices with which users of geographical data are usually familiar. Egenhofer's work is representative of a wider concern in database user interfaces in that it is strongly oriented towards supporting the user's task, in this case a single, well-defined task with a clear set of long-established concepts. Other query systems are being developed which also provide this level of concentration on the application semantics rather than requiring the user to become concerned with constraints imposed by the underlying data model. For example, Bags and Viewers [Inder 94] and GUIDANCE [Haw 94] are prototype systems which permit mappings between the data model and a model world of the user's domain. This effectively permits users to graphically query the database where their view of the database is filtered through an appropriate model world.
Browsing
Browsing allows exploration of a database by following connections between data entities and inspecting the contents of entities, for example by following owner-member relationships in a hierarchical database, or references to component objects from a composite object in an object-oriented database. Browsing is most appropriate where the data model has a complex structure or where the user is uncertain of what they are seeking in the database and where it is to be found. The browsing approach is embodied in the nature of hypertext and hypermedia systems [Conklin 87]. A hypertext can be thought of as a network consisting of nodes and links. A user typically navigates through the system by examining the information contained within a node and by following (embedded) links to other nodes.
Hypertext is used to good effect in the World Wide Web, where nodes exist in a worldwide distribution of sites; if it were not for the latency in retrieving information from across the globe, the end-user need not be aware of the distributed nature of the information. In many hypermedia systems, information within nodes can be unstructured and untyped (as in the KMS system [Akscyn 88]). The structure of the information is thus at a coarser level of granularity than we would expect in a database
system and hence care must be taken to distinguish between true hypermedia systems where arbitrary links may be defined and what we will refer to as a browser. In the latter case, the connectivity of the database entities is determined by the underlying data model and may be thought of as a highly constrained form of hypermedia system. It is clear that the nature of databases, across most of the data models, involves the concept of nodes and links, where nodes can be thought of as records (or tuples) and links are relationships between tuples. Thus, in the network model there are owner and member record types. An owner record might then permit links to be followed to the members. In the hierarchic model there are parent-child links to follow. In the relational model, there are symbolic links in the form of foreign keys. In the object model there are object identities and references. In principle, therefore, browsing can be applied to all database models and systems. Browsers and hypermedia systems depend upon an effective visual representation of data elements and navigation paths. Database browsing is constrained by the database schema. The schema of a database can be described at the level of the data model supported; this is the (logical) conceptual level. It is possible to describe the schema at an even higher level, which is independent of the data model supported by the DBMS. Here, formalisms such as the E-R model are used. Note that, as at all levels of database technology, these conceptual design models are subject to research and new models frequently emerge, such as the extended E-R model [Elmasri 94], object-oriented E-R models, etc. Browsing of a database is typically performed at either the conceptual or the logical conceptual schema level. A conceptual schema browser will present a graphical display of the schema according to the conventions of a high level model notation (e.g.
the E-R model) and this will map down onto the logical conceptual schema as determined by the underlying database model. A logical conceptual schema browser presents this lower level directly. Hence, conceptual schema browsers may be portable between different DBMSs and data models.
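The view of a relational database as nodes (tuples) and links (foreign keys) described above can be sketched as follows. This is a minimal illustration only: the DATABASE and FOREIGN_KEYS structures, the table contents and the functions follow and referrers are all hypothetical, and an in-memory dictionary stands in for a real DBMS.

```python
# Hypothetical toy database: table name -> {primary key -> tuple}.
DATABASE = {
    "department": {
        "D1": {"name": "Computing"},
    },
    "employee": {
        "E1": {"name": "Ann", "dept": "D1"},
        "E2": {"name": "Bob", "dept": "D1"},
    },
}

# Foreign keys recorded in the schema: (table, attribute) -> referenced table.
FOREIGN_KEYS = {("employee", "dept"): "department"}


def follow(table, key, attribute):
    """Follow a foreign-key link from one tuple to the tuple it references."""
    target_table = FOREIGN_KEYS[(table, attribute)]
    target_key = DATABASE[table][key][attribute]
    return target_table, target_key, DATABASE[target_table][target_key]


def referrers(table, key):
    """Find the tuples that link to the given tuple (the reverse direction)."""
    found = []
    for (src_table, attribute), dst_table in FOREIGN_KEYS.items():
        if dst_table != table:
            continue
        for src_key, row in DATABASE[src_table].items():
            if row[attribute] == key:
                found.append((src_table, src_key))
    return sorted(found)


print(follow("employee", "E1", "dept"))
print(referrers("department", "D1"))
```

Because the only links available are those declared in the schema, a browser built this way is the constrained form of hypermedia discussed above: the user can navigate, but cannot create arbitrary connections.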
The notations employed to represent a conceptual level model are carefully designed to compactly represent all the different entity types, attributes, and relationships in a single diagrammatic form. This can be reproduced by a browser using graphical primitives composed to form interactive schema diagrams. Different levels of abstraction may be supported so that, for example, an address entity may be represented as an atomic entity by a single symbol or expanded to separate house, street, town, and country attributes. Such schema diagrams can be represented as graphical user interfaces with each symbol implemented to have its own associated interaction semantics. Hence, the higher abstractions provided by a high level schema notation permit browsers to visualise the database schema compactly, completely and in an implementation-independent manner. QBD [Angelaccio 89] and [Gulla 92] are examples of high level schema browsers for E-R databases. Unfortunately, existing conceptual models do not map onto some logical conceptual models very well. In the case of the object model in particular, the multiplicity of concepts (different kinds of relationships, methods, etc.) is simply unsupported by, say, the E-R model. As a consequence, logical conceptual schema browsers are predominant in OODBMSs. While new high level models have been proposed and browsers developed for these (e.g. OOdini [Halper 92]), these have yet to gain widespread acceptance. Figures 3 and 4 illustrate a logical conceptual schema browser in Moggetto [Sawyer 95], a user interface framework to the Oggetto OODBMS [Mariani 92a]. The designer of such a browser has to compensate for the lack of a single, all-encompassing notation to represent all the features of the data model by determining what the important concepts are for users, and how best to present them.
In the case of Moggetto and most OODBMSs the user has to be aware of two lattices; the type lattice (specialisation/generalisation relationships) and the structural lattice (component_of relationships). Users are presented with a graphical representation of the type lattice where nodes represent types. This provides an overview of all the types and their specialisation/generalisation relationships extant in the database and provides navigation mechanisms such as scrolling when the type lattice grows too large to view within the window area. Selection of a node allows instances (objects) of that type, or a
template for a new instance to be displayed as editable forms. Figure 3 illustrates the user selecting instances of the type SE30 which comprises part of the schema for a network management application.
Figure 3: Browsing the type lattice with Moggetto
Components of objects which are themselves instances of objects (rather than primitive types) can be viewed by following references from the object of which they form a part. Figure 4 shows the user following a component_of relationship from an instance of SE30 (PetesMac) to an instance of RadiusMonitor (PetesRadius) forming the value of one of PetesMac's attributes.
Figure 4: Browsing a complex object
As stated above we distinguish between hypermedia systems, allowing arbitrary connections between entities, and browsers where the underlying data model constrains the connectivity of nodes. This distinction, while convenient for the purpose of the foregoing discussion of database browsers, is actually less clear-cut than might at first appear as some systems permit . Several recent systems such as Tioga [Stonebraker 93b] support hypermedia browsers on top of DBMSs (Postgres [Stonebraker 91] in the case of Tioga). Such systems are an example of database HCI work responding to emergent application domains; in Tioga's case these are large-scale scientific applications, such as biological survey data, manipulating a mix of spatial, video and other data. Here, users require browsers where their entry point to the data is not constrained by the underlying data model but is determined at a level appropriate to the task. Hence, users can build and tailor browsers to their data in a hypermedia fashion where any entities of interest can be connected by browsing links; those entities being related not by type or schema relations, but by domain-relevance. Clearly, such
systems extend the usability of database browsers to the point where they are now addressing users' tasks rather than imposing a data modeller's view of the database upon the user.
Combined approaches
While querying and browsing have different strengths in different circumstances, they should be viewed as complementary rather than mutually exclusive. For example, a browser may provide the means to discover an entity type of interest but if that same entity type has many instances, the user may find it more appropriate to select the appropriate subset of instances with a query. Just because a requirement for browsing tools has been identified by the newly emergent DBMSs addressing new applications, it does not mean that browsing will supersede querying. Browsers are simply another tool in the database user's toolkit which should be integrated with query tools to enable the user to switch between appropriate tools.
5. Data Visualisation
The discussion so far has concentrated on mechanisms for interacting with the database. In this section we briefly consider visualisation; how database entities can be represented graphically in a meaningful manner. With the current attention on the use of virtual reality, visualisation has become increasingly important to the provision of modern database interfaces. Visualisation techniques offer the ability to, for example, display large volumes of data on the screen at one time. This may be a requirement if the interpretation of the data is assisted by mapping data values onto spatial coordinates. Clearly, the display space is limited to two or three dimensions. In order to represent multidimensional data, we have to produce mappings from the n-D data space onto the 2/3-D display space. By following a scatter graph approach, we can choose 2 or 3 domains of the data space to be represented in the 2 or 3-D display space.
The Starfield work of Shneiderman and others is a good example of a 2D visualisation [Jog 95]. This is essentially a scatter graph of a single relation (although the literature insists on calling it a database; this may eventually be a drawback in some applications). Within the starfield context, a number of classic graphical interaction techniques are employed, such as zooming and the use of increasing detail. Queries can be dynamically expressed through the use of alpha sliders. Each attribute that the user is allowed to query on is associated with a slider. By narrowing the range of values selected on a slider, it is possible to thin down the number of tuples represented in the field. The effects of slider movement are displayed dynamically, giving the user an excellent feel for the effects of narrowing and widening the query. This seems to strike at the heart of the "all the database at once" technique; the user is actually getting some sense of the overall content of the database in a way that could not be achieved through the use of querying or even, to some extent, browsing. The Starfield approach seems to merge both of these activities. Data displayed by the Tioga browser [Woodruff 95] shows a two-dimensional viewing space that appears to share some of the Starfield features. Such systems, which present the entire database, allow us to explore the topography of the data as a whole. This means we can get an overall feel for the data. It may be, therefore, that such systems make it harder to find individual objects within the database. In addition, because there is a slider for every attribute the user is allowed to query on, a multiattribute relation may have many more attributes than can comfortably be represented by sliders in limited screen real estate. Q-PIT [Benford 94] is, in some ways, a 3D scatter graph and thus shares some similarities with the Starfield approach.
Like Starfield, it essentially views a single relation, although we believe it can be applied to object-oriented databases where the three attributes mapping onto the x-y-z coordinates are taken from as high up in the object type lattice as possible.
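The slider-driven dynamic queries of the Starfield approach described above amount to re-filtering a single relation every time a slider range changes. The following is a minimal sketch under that reading; the function dynamic_query, the sliders dictionary and the film tuples are hypothetical illustrations, not taken from [Jog 95].

```python
def dynamic_query(tuples, sliders):
    """Return the tuples whose attributes fall within every slider's range.

    sliders maps an attribute name to an inclusive (low, high) range;
    attributes without a slider are left unconstrained.
    """
    return [
        t for t in tuples
        if all(low <= t[attr] <= high for attr, (low, high) in sliders.items())
    ]


# Hypothetical single relation; each surviving tuple would be one point
# in the scatter graph (e.g. x = year, y = rating).
films = [
    {"title": "A", "year": 1960, "length": 95, "rating": 6.5},
    {"title": "B", "year": 1975, "length": 120, "rating": 8.1},
    {"title": "C", "year": 1988, "length": 140, "rating": 7.2},
    {"title": "D", "year": 1993, "length": 105, "rating": 5.9},
]

# Moving one slider thins the field; adding a second narrows it further.
wide = dynamic_query(films, {"year": (1970, 1995)})
narrow = dynamic_query(films, {"year": (1970, 1995), "length": (100, 125)})
print([t["title"] for t in wide])
print([t["title"] for t in narrow])
```

An interactive implementation would call such a filter on every slider movement and redraw the points, which is what gives the user immediate feedback on narrowing and widening the query.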
Figure 5: a Q-PIT terrain
In an attempt to represent those attributes not mapped onto the 3-D positional coordinates, we follow the work of Michael Benedikt on the structure of Cyberspace [Benedikt 91]. Benedikt argues that the attributes of an object may be mapped onto extrinsic and intrinsic spatial dimensions. Extrinsic dimensions specify location in space (e.g. x, y, z co-ordinates). Intrinsic dimensions determine characteristics of the resulting point in space such as colour, shape, size, spin, texture, vibration and sound quality. Using Benedikt's approach, we might extend database schema notations to specify which attributes map to which intrinsic and extrinsic dimensions. These notions have been implemented in the Q-PIT prototype. Another approach can be found in statistical methods, which have been used to analyse collections of data (often documents) in an attempt to group objects together according to some measure of semantic "closeness" (i.e. do they logically belong together). The resulting proximity measures are typically scaled and returned as numerical values that are then used to cluster the objects in a data space. Systems adopting this and similar approaches include VIBE [Olsen 93] and BEAD [Chalmers 92]. The 2D approach of VIBE has been extended into a 3D version, VR-VIBE [Mariani 94].
The VIBE system was developed at the University of Pittsburgh to support document retrieval from a large collection of documents with the emphasis on being able to provide users with an overview of documents they are interested in with respect to the rest of the document space. A set of queries is specified by defining a number of “Points of Interest” (POIs) which contain keywords. A full text search then takes place on the document space and the keywords are compared to individual documents, producing a relevance score for each document and point of interest. This determines how the space will be laid out; a vector is defined giving the position of each document in space in relation to the POIs. An example VIBE visualisation can be seen below.
Figure 6: A simple VIBE display (POIs labelled "scientific visualization", "document retrieval" and "virtual reality")
In this trivial example three POIs have been specified (the dots) and there are four documents in the document space (the squares). One document is relevant to all three of the POIs and so is placed centrally; the other documents have no relevance to document retrieval but are relevant to scientific visualisation and virtual reality and so appear to the right of the display. Whilst this gives an overview of the relevance of documents to the POIs for small numbers of documents, the display can become very cluttered if a large document space is
being examined. VR-VIBE approaches this problem through the use of the third dimension and through the use of filtering mechanisms.
Figure 7: A query with five POIs in three dimensions - using the third dimension
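The way relevance scores determine a document's position relative to the POIs can be sketched as below. The text states only that a position vector is derived from the scores; the relevance-weighted average of POI positions used here is an assumption made for illustration, and the function place_document, the POI coordinates and the scores are all hypothetical.

```python
def place_document(scores, poi_positions):
    """Place a document at the relevance-weighted average of the POI positions.

    scores maps a POI name to the document's relevance score for that POI;
    poi_positions maps a POI name to its (x, y) position on the display.
    Returns None for a document that is relevant to no POI at all.
    """
    total = sum(scores.values())
    if total == 0:
        return None
    x = sum(scores[p] * poi_positions[p][0] for p in scores) / total
    y = sum(scores[p] * poi_positions[p][1] for p in scores) / total
    return (x, y)


# Hypothetical layout: one POI on the left, two on the right.
POIS = {
    "document retrieval": (0.0, 0.0),
    "scientific visualisation": (4.0, 2.0),
    "virtual reality": (4.0, -2.0),
}

# A document relevant to all three POIs is pulled towards the centre.
central = place_document(
    {"document retrieval": 1, "scientific visualisation": 1, "virtual reality": 1}, POIS
)
# A document with no relevance to document retrieval sits to the right.
right = place_document(
    {"document retrieval": 0, "scientific visualisation": 1, "virtual reality": 1}, POIS
)
print(central, right)
```

Under this assumed layout rule, documents with identical score ratios coincide on the display, which is one way the clutter noted above arises for large document spaces.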
The VIBE approach also forms part of LyberWorld [Hemmje 94], where points of interest are restricted to appearing on the surface of a sphere. There has been a great deal of work on the topic of visualisation at Xerox PARC; this has been most recently published as [Rao 95]. They have produced a rich set of systems within the Information Visualisation project, including the well-known Perspective Wall, Cone Tree and Document Lens. These use 2- and 3-D graphics to promote the identification of understandable patterns within the data. They have produced visualisations for hierarchic, linear, document, temporal and tabular structures. All of these visualisations are focus and context techniques which display some of the data at a greater level of detail while still displaying all or much of the context. This can best be seen in the Wall and the Lens. In the Lens, the focus is on a portion of one page within the document, but the rest of the document is still visible, albeit much less sharply focused, providing context for the focus page within the rest of the work. Similarly, with the Wall, one portion of the wall is directly displayed in front of the user but the rest of the Wall fades on either side into the distance.
Experimentation with various kinds of metaphors is reported in [Rapley 94] and [Boyle 95]. The latter work, based on a system called Amaze, is a general purpose interface to an object-oriented DBMS. Its major use has been to support the query specification and retrieval of biochemical information. Queries are specified by manipulating a visualisation of the schema. This seems contrary to current trends in such systems where the emphasis is on the data instances rather than the schema. Amaze has proven successful, however. This may be partly because the user group at which it is aimed is familiar with large structures and the manipulation thereof.
6. User Interfaces for Database Design
The discussion so far has been concerned with end-user interfaces. This section concentrates on user interfaces for database designers and those with a requirement to manipulate the schema. The reasons for separating this function from manipulation of the data are that schema change is dangerous and because the schema in many data models is static, necessitating recompiling of the database. More recently, notably with the development of OODBMSs, dynamic data models have been produced where data and metadata are treated orthogonally. This has been necessitated by such systems' need to support highly evolutionary applications. Another dimension has been added to manipulation of the meta-data by recent data models' support for behavioural semantics; methods in OODBMSs. Here, schema manipulation is no longer only about defining composite types and their relationships, but may also involve defining operations which may be invoked on these types. Hence, the task has become one of programming, perhaps involving the writing of C++ functions to be invoked by users, other database objects or external applications. The association of behaviour with database entities has also raised the possibility of defining their user interfaces. In the rest of this section we consider user interfaces for "traditional" schema manipulation (data definition) and mechanisms for defining database user interfaces separately.
User Interfaces for Data Definition
In this section we concentrate on OODBMSs as their requirement for dynamic schema evolution has been the most significant recent driving force in the development of data definition languages and tools. Although manipulation of the schema is supported explicitly in many OODBMSs, this does not imply that it is a trivial matter. Indeed, it is particularly awkward if the inheritance hierarchy is changed by the insertion of a new type, or an existing type has a component changed. Mechanisms (e.g. versioning [Skarra 86, Monk 92]) and policies (e.g. new types may only be created as leaves of the type lattice) to control schema evolution in OODBMSs are still very active research areas. What concerns us here are the tools to allow the schema to be manipulated and the extent to which these tools are intended for end-users and for database administrators. It has been argued that because OODBMSs model data in a manner which is closer to the structure and semantics of the corresponding real-world data, the domain experts (rather than database experts) who are the users of the database are able to understand and should therefore be able to modify the schema as and when they identify the need for such modifications. In practice it is doubtful if this is realistic in a shared database environment but it has encouraged the development of tools for directly manipulating the physical schema, often integrated with end-user tools such as a browser. Tools in this category include GS Designer [Almarode 91], the type editor in Moggetto [Sawyer 95], CLOSQL [Monk 94] and the type/class editors provided by many commercial OODBMSs. The user interface of such tools typically consists of a graphical representation of the type lattice with which the database designer can not only access the nodes (types) but also insert new nodes. Figure 3 illustrates this; the user could define a new subtype of SE30 by selecting Derive from in the Type menu.
Definition of the "internals" of the types themselves is either performed using a text editor or by directly manipulating "template" properties.
Tools for manipulating the conceptual schema also exist and include ISIS [Goldman 85] and DEED [Radermacher 92]. Such tools typically provide a fixed set of symbols corresponding to the notation supported (e.g. ER or a variant) which can be used to define a schema by direct manipulation which is then automatically processed (subject to validation stages) to produce the physical schema. Graphical user interfaces to database schema ease the database designer/maintainer's job by visualising the schema, the types composing the schema and by structuring the user's interaction with these types. Certainly, being able to have an at-a-glance overview of the database schema and physically (i.e. by direct manipulation of a graphical entity) isolate a type or relationship comprising an element in the schema is likely to make schema manipulation more accessible than by using a DML. More fundamentally, for databases where the application cannot be rigorously analysed and the schema designed a-priori, graphical tools raise the possibility of a prototyping approach to database design.
Tools for Database User Interface Design
In the traditional database models, users either interact with the database through a standard user interface where common display and interaction mechanisms are applied to every entity (e.g. forms), or a user interacts with an application which sits between the database and the user and presents its own user interface. The drive for lower development costs spawned high-level development tools such as application generators and 4GLs [Misra 88] with which users could develop database applications and which included elements of user interface design (mainly just screen design). With some applications, however, the data has a complex structure, including composite entities and perhaps bitmaps, video, and other media types. This means that a single user interface mechanism is not always adequate as different entity types have their own display requirements.
Hence, for example, definition of a new entity type in an object-oriented database may include definition of how instances of the type are to be displayed. What is really happening here is that the activities of database design and application programming
have become intermingled. King [King 93] argues that because OODBMSs encapsulate procedural semantics in the form of methods, and because these are stored in the database as data, database applications should now reside in the database instead of being external to it. Hence, the distinction between schema manipulation and application programming is blurred.
UIMS and DBMSs
User interface management systems (UIMSs) [Hix 90] offer mechanisms for the run-time management of user interfaces. Integrated with tools for defining user interfaces, a UIMS provides an environment for the rapid prototyping of applications' interactive components and for the execution of these. Of particular importance to the design of a UIMS are the partitioning of application and user interface functionality, the mode of binding of user interfaces to applications, and the protocols for communication between them. It is interesting to consider user interfaces to database systems in relation to the Seeheim UIMS model [Green 83] (figure 8) consisting of presentation, dialogue control and application interface components. The Seeheim model was one of the earliest explicit UIMS architectures and while its suitability as a run-time architecture has been questioned, it still serves as a useful conceptual model. Broadly, the responsibilities of the Seeheim model's components are as follows:
• Presentation refers to the display of the entities with which the user is interacting and the trapping of user input.
• Dialogue control manages the user interface-specific functionality (as opposed to that of the underlying application functionality). For example, triggering the display of a pull-down menu in response to a particular input event.
• The application interface model is a definition of the interface between the application and the user interface. It often takes the form of the sub-set of application entities (objects, functions) which need to be accessed by the user interface code.
Communication between these components is linear in nature so all input is routed between the presentation and application interface model components via the dialogue controller. The small box (the "by-pass" or "fast switch") represents an alternative, direct route between the application interface model and the presentation component for output events which do not affect the state of the dialogue model.
Figure 8. The Seeheim model (user - presentation - dialogue control - application interface model - application)
While providing a useful conceptual decomposition of roles within the UIMS, the Seeheim model has been criticised for two reasons:
• The linear nature of communication has been seen as a bottleneck for direct manipulation user interfaces where rapid semantic feedback is required.
• It relies on being able to decouple user interface functionality (represented by dialogue control) and application functionality (represented by the application interface model).
For most applications, the first point is only true if the architecture is implemented naively or if very fine-grained communications are needed; for example at the per-pixel level for dragging objects. In part, the problem is due to the fact that the by-pass is less well defined and correspondingly more difficult to implement as an architectural feature than the other components. In terms of UIMS to databases, where the database forms the application, the problem is unlikely to occur because the granularity of communication is typically coarse. In any case, the practicalities of the secure management of large volumes of data mean that the
DBMS is a far more significant performance bottleneck than the user interface’s run-time architecture. The second point is true for many applications where a clean separation of concerns is hard to achieve resulting in the user interface embodying application semantics and/or user interface functionality being embedded in the application. With the well-defined query interfaces provided by most DBMSs and through which a user interface must communicate, a clean separation is relatively easily achieved. In a UIMS supporting graphical user interfaces to an OODB, the three Seeheim components may be broadly characterised as follows: The presentation and dialogue control components are much as described above, where the entities with which the user interacts are database objects. The application interface is the set of query operations provided by the OODB management system and through which the user interface communicates with the database. Hence, user events are mapped by the dialogue controller onto queries on the state of the database or requests to update the database’s state. DBface [King 93] is a good example of a Seeheim-based database UIMS. Here, the display of database entities is managed by a representational component and the user interface functionality is managed by an operational component. These correspond approximately to the Seeheim presentation and dialogue control components. A novel feature of DBface is that the database acts, not only as the repository of the underlying application data, but also as the repository of the information used by the representational component to display the data. Moggetto [Sawyer 95] also adopts a Seeheim architecture and also exploits the DBMS to store user interface definitions along with the data itself. Embedded in Moggetto is a GUI builder which provides the user with a seamless means with which to define objects' user interfaces and manipulate the schema and type definitions. 
Moggetto's particular strengths are for database applications which must be prototyped and where the end user requires a simple means to tailor objects' user interfaces to their own requirements.
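The characterisation above of the three Seeheim components in a database UIMS can be sketched in a few lines. This is a conceptual illustration only, not the architecture of DBface or Moggetto: the class names, the event format and the in-memory dictionary standing in for an OODB are all hypothetical.

```python
class ApplicationInterface:
    """The query operations through which the user interface reaches the database."""

    def __init__(self, objects):
        self._objects = objects  # hypothetical in-memory stand-in for an OODB

    def fetch(self, name):
        """Query the state of a database object."""
        return dict(self._objects[name])

    def update(self, name, attribute, value):
        """Request an update to the database's state."""
        self._objects[name][attribute] = value


class Presentation:
    """Displays database entities; here it simply records rendered lines."""

    def __init__(self):
        self.shown = []

    def display(self, name, obj):
        fields = ", ".join(f"{k}={v}" for k, v in sorted(obj.items()))
        self.shown.append(f"{name}: {fields}")


class DialogueControl:
    """Maps user events onto queries or update requests, then refreshes the display."""

    def __init__(self, presentation, app):
        self.presentation = presentation
        self.app = app

    def handle(self, event):
        kind = event["kind"]
        if kind == "edit":
            self.app.update(event["name"], event["attribute"], event["value"])
        elif kind != "select":
            raise ValueError(f"unknown event kind: {kind}")
        # Both 'select' and 'edit' end by redisplaying the current object state.
        self.presentation.display(event["name"], self.app.fetch(event["name"]))


app = ApplicationInterface({"PetesMac": {"type": "SE30", "status": "up"}})
ui = Presentation()
control = DialogueControl(ui, app)
control.handle({"kind": "select", "name": "PetesMac"})
control.handle({"kind": "edit", "name": "PetesMac", "attribute": "status", "value": "down"})
print(ui.shown)
```

All traffic between presentation and database passes through the dialogue controller, which reflects the linear communication discussed above; the coarse, per-object granularity of the events is why that linearity is unlikely to be a bottleneck for database user interfaces.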
A UIMS can be viewed as a means to decouple a database and its user interface(s). If the UIMS is integrated with a tool set, then designers and users are provided with a flexible environment oriented towards the easy development and tailoring of user interfaces. However, to date little work has been done in this area. This is surprising because database applications are particularly tractable to the adoption of the UIMS approach to user interface design and management. It is likely that, as OODBMSs become more widespread and address applications with diverse presentation and interaction requirements, UIMS-like environments will become increasingly integrated into the database management system.
7. Discussion and Conclusions
The preceding sections have highlighted some of the major trends in user interfaces and user interface systems for databases. Although the paper covers only a subset of user interface issues (for example, we have ignored natural language and information retrieval techniques), it has covered a representative sample of work in this area; essentially those user interfaces exploiting computer graphics and direct manipulation. We have shown how a number of trends have emerged or are emerging:
• The distinction between end user and database design user interfaces. This is partly the historical result of DBMSs with static schema but persists even in more recent systems employing dynamic schema.
• Querying v. browsing. Graphical analogues of declarative query languages have emerged over a long period and attempt to make (mostly relational) queries easier to formulate, usually at some cost to completeness. These take many forms, from tabular to iconic tools. Recent developments in data models have encouraged an exploratory approach to retrieval of data from the database. Because the schema in such models tends to be relatively complex, users cannot be expected to have a sufficiently detailed mental model of the schema to enable them to formulate queries, graphically or otherwise. Browsers are graphical tools which allow both the schema and the data to be explored by following relationships between entities. Even in such systems, however, query user interfaces are still frequently useful.
• Novel visualisation techniques are beginning to attract interest, particularly the use of 3-dimensional graphics. Developments in this area are immature and it remains to be seen whether such techniques are best suited to individual applications or whether automatic methods can be used to generate useful 3-D visualisations generically.
• As data models become more sophisticated by including procedural semantics and different media types, and as target application domains for DBMSs diversify, the utility of generic user interfaces diminishes. Application-specific, and even entity-specific, user interfaces are increasingly important and database UIMSs are emerging to allow the design of these.
It is interesting to consider database user interfaces in relation to contemporary issues in (main-stream) HCI. As implied by the title of this paper, databases offer both problems and opportunities which other applications of HCI technology lack. The most obvious distinguishing feature of database systems is the sheer volume of data which the user of a database has to be equipped to manage. This is of course why query languages are so important and why graphical user interfaces to these have attracted so much interest.
Draper [Draper 92] has suggested that this approach is flawed, largely because the Boolean operators required to formulate a complex query map poorly onto the cognitive techniques used by humans to sort through large data sets; users tend to perform better if they can rely on recognition than if they have to recall details (such as the organisation of a database schema). Draper considers that an approach which employed iteration and convergence on the subset of data of interest would be more appropriate and suggests partial matching, as used in information retrieval, as a possible means to achieve this.
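To make the contrast with Boolean querying concrete, the following sketch ranks records by how many of the requested attribute values they match, so that near misses are shown to the user rather than silently excluded. It is an illustration under stated assumptions, not Draper's proposal in detail and not a real information retrieval ranking function: the scoring rule (a simple count of matching attributes), the function name and the sample records are all invented for the example.

```python
def partial_match(records, wanted):
    """Rank records by the number of wanted attribute values they match.

    Unlike a Boolean AND, a record need not satisfy every condition to be
    returned; it only has to satisfy at least one.
    """
    scored = []
    for record in records:
        score = sum(1 for key, value in wanted.items() if record.get(key) == value)
        if score > 0:
            scored.append((score, record))
    # sorted() is stable, so ties keep their original relative order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [record for _, record in ranked]


# Hypothetical sample data, invented for this illustration.
PAPERS = [
    {"title": "A", "topic": "querying", "year": 1993, "medium": "textual"},
    {"title": "B", "topic": "browsing", "year": 1993, "medium": "graphical"},
    {"title": "C", "topic": "querying", "year": 1995, "medium": "graphical"},
    {"title": "D", "topic": "storage", "year": 1990, "medium": "textual"},
]

# First pass: an exact Boolean AND would return only paper A.
first_pass = partial_match(PAPERS, {"topic": "querying", "year": 1993})

# Second pass: having recognised a near miss of interest, the user adjusts
# the request and converges on a different ordering of the same candidates.
second_pass = partial_match(first_pass, {"topic": "querying", "medium": "graphical"})
```

The point of the design is that the user works by recognition: candidate records are put in front of them, and each pass needs only a small adjustment to the request rather than a complete, correct Boolean expression.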
This would certainly be an approach which would reward closer examination but two of the existing techniques described above already meet his criteria, at least in part. Some example-based systems do permit iteration in so far as the results of one query can be used as the input to
subsequent queries. This allows the formulation of a query to be decomposed into small steps. This also, to some extent, circumvents the limited expressiveness of graphical query tools. More interesting still, however, is the development of browsers which allow users to explore data both through the schema and by exploring structural relationships on the data. When integrated with graphical query tools such an approach meets Draper's iteration and convergence criteria. The techniques used in the creation of information terrains can also support the human operation of recognition, whereby groupings and patterns of data impart information not easily spotted in massive volumes of textual data representation. Examples of these patterns can be found in [Mariani 92b].
Another important point arising from volume of data is that it effectively proscribes one of the central tenets of direct manipulation (so effective in other domains). As defined by Shneiderman, a direct manipulation user interface necessitates that the data of interest is continuously displayed for the period that the user retains interest in that data. Interpreted literally, this would require that a user confronted with, say, a direct manipulation user interface to a large relational database would be presented with a list (textual, iconic or otherwise) of all extant tuples in the database. More realistically, however, existing direct manipulation user interfaces do not work by presenting such a flat information space. If the Macintosh can be accepted as a good example, use of the open dialogue box in a typical Mac application lists only files within the current folder of interest and a means is provided for changing this focus. Hence the information is (necessarily) structured and is analogous to the relationship between tuples and relations or objects and types.
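The iterative style discussed above, in which the result of one query becomes the input to the next, can be sketched as a session that keeps a history of progressively narrower working sets. This is a minimal sketch under assumptions: the class name, its methods and the sample records are hypothetical and are not taken from any of the systems surveyed. At each step only the current working set is on display, which is also how the focus of attention stays small enough to present directly.

```python
class QuerySession:
    """Holds a stack of working sets; each query refines the latest one."""

    def __init__(self, records):
        self.history = [list(records)]

    @property
    def current(self):
        """The working set the user is currently looking at."""
        return self.history[-1]

    def narrow(self, predicate):
        """Run a small query over the current result, not the whole database."""
        self.history.append([r for r in self.current if predicate(r)])
        return self.current

    def back(self):
        """Undo the last refinement, widening the focus again."""
        if len(self.history) > 1:
            self.history.pop()
        return self.current


# Hypothetical sample data, invented for this illustration.
EMPLOYEES = [
    {"name": "Ada", "dept": "Research", "grade": 7},
    {"name": "Bob", "dept": "Research", "grade": 4},
    {"name": "Cy", "dept": "Sales", "grade": 7},
]

session = QuerySession(EMPLOYEES)
session.narrow(lambda e: e["dept"] == "Research")   # step 1: one simple condition
session.narrow(lambda e: e["grade"] >= 5)           # step 2: refine that result
```

Each step is a single simple condition, so a compound query is built up without the user ever writing a compound Boolean expression, and `back()` lets them retreat when a refinement turns out to be too narrow.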
We believe, therefore, that it is misleading to be pedantic about the minutiae of direct manipulation and other user interface paradigms. Perhaps it is more useful to observe (as Draper does) that databases are different from other user interface domains and that principles developed for other applications do not necessarily translate directly to database applications. A possible solution to the problems of direct manipulation of massive volumes of data may lie in the use of 3-D information terrains, where the user only has a moveable “window” on
the information space, and within that window, distance plays a part in limiting the number of data items available for manipulation. By changing position in the information space (an operation analogous to browsing), the user can move nearer to objects of potential interest and change the groups of data which can be operated upon.
Database design (manipulation of schema) and interaction with data have long been seen as fundamentally different. In particular, database design is a specialist and complex activity which, with the development of OODBMSs, is becoming increasingly associated with application development. High-level programming tools are therefore of great interest to the database community. The history of application generators and 4GLs illustrates that this has long been the case; this is another area where database issues have been developing in parallel with other areas of computer science. Of particular interest now is the requirement for tailored user interfaces on a per-application or per-object-type granularity. This has led to several OODBMS toolkits incorporating UIMS characteristics. It is interesting that these conform closely to the much-abused Seeheim model. We believe that, far from showing that database systems are behind main-stream UIMS research in this respect, it merely illustrates that DBMSs conform more closely to the abstract model of an underlying application implicit in the Seeheim model.
The Way Forward
Stonebraker's [Stonebraker 93a] fears notwithstanding, a brief glance at the reference section of this paper (which is not intended as a comprehensive survey of the field but rather an indication of trends) shows a substantial amount of work being carried out in this area. The question we attempt to address here is how to advance the work. Something which for sound technical reasons has become an article of faith in HCI is the need to assess and evaluate new user interfaces.
Issues in this area are addressed in [Yen 93, Paton 94, Catarci 95a]. Evaluation is of course a central activity in user-centred development and necessary to allow a user interface design to iterate towards one which meets its requirements; requirements which cannot be reliably elicited or modelled by "normal"
analytical methods. Evaluation would also permit the accumulation of empirical evidence on user interface performance which could be used to inform the designers of database systems. For example, evaluation data could contribute to a model of the relationship between end-user database functionality and user interface techniques. This is particularly urgent with the burgeoning interest in the application of new, for example 3-dimensional, user interface technology to databases. We suggest that user interface evaluation tools should become as much part of the database application designer's toolkit as have embedded query languages, 4GLs, form definition tools and GUI builders. Similarly, if 3-dimensional user interfaces prove to be of real value to database applications, there is a need for better and higher-level 3-dimensional toolkits and UIMSs oriented towards database applications.
The importance of appropriate user interface metaphors is inescapable [Catarci 95b]. Both 2- and 3-dimensional metaphors need to be identified, prototyped and evaluated. It may prove impossible to identify a metaphor as widely applicable as the desktop metaphor because the range of database application domains is so wide. Nevertheless, we suggest that it is essential for database application designers to use the domain to guide the design of their user interface rather than imposing a generic database culture on users; after all, most database applications are designed to support or automate existing, often long-established, non-automated tasks. This need not result in a sterile research topic dominated by variations on electronic card indexes, however. The explosive increase in use of the World-Wide Web illustrates how appropriate HCI technology can be used to help shape as well as meet users' expectations.
Further technical issues include the need to interface with distributed, heterogeneous databases.
We must discover how to cope with non-traditional data types such as multimedia data and investigate textual and graphical extensions to query mechanisms to cope with retrieval. Displaying the multimedia results of a query is of concern as are the issues of very large databases, particularly with the current trend of displaying the "whole database".
Final comments
In conclusion, user interfaces to databases are demonstrably an application domain which should be attracting far more interest from HCI researchers than it has hitherto. Even ignoring the two outstanding (in the authors' opinions) new areas of cooperative database systems and 3-dimensional/VR user interfaces, there is considerable scope for innovation and experimentation. The underlying database technology is in a state of continual flux (we have only touched on evolving data models; many other issues are of equal significance). The challenge to HCI researchers is to develop appropriate, generic or tailorable user interface mechanisms for users of these new technologies while recognising that even existing database systems would reward and benefit from HCI research.
8. References
[Akscyn 88] Akscyn, R.M., McCracken, D.L., and Yoder, E.A.: "KMS: A Distributed Hypermedia System for Managing Knowledge in Organisations", CACM, 31 (7), July 1988.
[Almarode 91] Almarode, J.: "Issues in the Design and Implementation of a Schema Designer for an OODBMS", in Proc. ECOOP '91, Geneva, July 1991.
[Angelaccio 89] Angelaccio, M., Catarci, T., Santucci, G.: "QBD: A Fully Visual System for E-R Oriented Databases", in Proc. 1989 IEEE Workshop on Visual Languages, Rome, October 1989.
[Arndt 95] Arndt, T., Petraglia, G., Sebillo, M., Tortora, G.: "Representing Concave Objects Using Virtual Images", in Proc. 3rd IFIP 2.6 working conference on Visual Database Systems, Lausanne, March 1995.
[Benedikt 91] Benedikt, M.: "Cyberspace: Some Proposals", in Cyberspace: First Steps (Benedikt, M. Ed.), MIT Press, 1991.
[Benford 93] Benford, S. and Mariani, J.A.: "Virtual Environments for Data Sharing and Visualisation - Populated Information Terrains", in Proc. 2nd International workshop on Interfaces to Databases, Lancaster, July 1994.
[Berners-Lee 94] Berners-Lee, T., Cailliau, R., Luotonen, A., Nielsen, H., Secret, A.: "The World-Wide Web", CACM, 37 (8), August 1994.
[Boyle 95] Boyle, J., and Gray, P.M.D.: "The Design of 3D metaphors for Database Visualisation", in Proc. 3rd IFIP 2.6 working conference on Visual Database Systems, Lausanne, March 1995.
[Brossier-Wansek 95] Brossier-Wansek, A., Mainguenaud, M.: "Manipulation of Graphs with a Visual Query Language: Application to a Geographical Information System", in Proc. 3rd IFIP 2.6 working conference on Visual Database Systems, Lausanne, March 1995.
[Catarci 95a] Catarci, T. and Santucci, G.: "Diagrammatic Vs Textual Query Languages : A Comparative Experiment", in Proc. 3rd IFIP 2.6 working conference on Visual Database Systems, Lausanne, March 1995.
[Catarci 95b] Catarci, T., Costabile, M.F., Cruz, I.F., Ioannidis, Y., and Shneiderman, B.: "Data Models, Visual Representations, Metaphors : How to Solve the Puzzle?", panel session, in Proc. 3rd IFIP 2.6 working conference on Visual Database Systems, Lausanne, March 1995.
[Chalmers 92] Chalmers, M. and Chitson, P.: "Bead: Explorations in Information Visualisation", in Proc. SIGIR'92, published as a special issue of SIGIR forum, ACM Press, June 1992.
[Chen 76] Chen, P.: "The Entity-Relationship Model: towards a unified view of data", ACM Transactions on Database Systems, 1 (1), March 1976.
[Cody 95] Cody, W. et al.: "Querying Multimedia Data from Multiple Repositories by Content: the Garlic Project", in Proc. 3rd IFIP 2.6 working conference on Visual Database Systems, Lausanne, March 1995.
[Conklin 87] Conklin, J.: "Hypertext: An Introduction and Survey", IEEE Computer, 20 (9), September 1987.
[Draper 92] Draper, S.: "HCI and Database work: Reciprocal relevance and challenges", in Proc. International workshop on Interfaces to Databases, Glasgow, July 1992.
[Egenhofer 95] Egenhofer, M., Bruns, T.: "Visual Map Algebra: A Direct Manipulation User Interface for GIS", in Proc. 3rd IFIP 2.6 working conference on Visual Database Systems, Lausanne, March 1995.
[Elmasri 94] Elmasri, R. and Navathe, S.B.: "Fundamentals of Database Systems", (second edition), Benjamin/Cummings, 1994.
[ESRI 90] ESRI: "Understanding GIS: The Arc/Info Model", Environmental Systems Research Institute Inc., Redlands, California, 1990.
[Goldman 85] Goldman, K., Goldman, S., Kanellakis, P., Zdonik, S.: "ISIS: Interface for a Semantic Information System", in Proc. 14th ACM SIGMOD International Conference on the Management of Data, 1985.
[Green 83] Green, M.: "Report on Dialogue Specification Tools", in User Interface Management Systems (Pfaff, G. Ed.), Springer-Verlag, 1983.
[Gulla 92] Gulla, B.: "A Browser for a Version Entity Relationship Database", in Proc. International workshop on Interfaces to Databases, Glasgow, July 1992.
[Halper 92] Halper, M., Geller, J., Perl, Y., Neuhold, E.: "A Graphical Schema Representation for Object-Oriented Databases", in Proc. International workshop on Interfaces to Databases, Glasgow, July 1992.
[Haw 94] Haw, D., Goble, C., Rector, A.: "GUIDANCE: Making it Easy for the User to be an Expert", in Proc. 2nd International workshop on Interfaces to Databases, Lancaster, July 1994.
[Hemmje 94] Hemmje, M., Kunkel, C. and Willett, A.: "LyberWorld - A Visualization User Interface Supporting Fulltext Retrieval", in Proc. of the 17th Annual Int. Conference on Research and Development in Information Retrieval (SIGIR '94), Dublin, July 1994.
[Hix 90] Hix, D.: "Generations of User-Interface Management Systems", IEEE Software, 7 (5), September 1990.
[Inder 94] Inder, R., Stader, J.: "Bags and Viewers: A Metaphor for Intelligent Database Access", in Proc. 2nd International workshop on Interfaces to Databases, Lancaster, July 1994.
[Jacobs 83] Jacobs, B., Walczak, C.: "A Generalised Query-by-Example Data Manipulation Language Based on Database Logic", IEEE Transactions on Software Engineering, 9 (1), January 1983.
[Jog 95] Jog, N., and Shneiderman, B.: "Starfield Information Visualisation with Interactive Smooth Zooming", in Proc. 3rd IFIP 2.6 working conference on Visual Database Systems, Lausanne, March 1995.
[Kim 88] Kim, H-J., Korth, H., Silberschatz, A.: "PICASSO: A Graphical Query Language", Software - Practice and Experience, 18 (3), March 1988.
[Kim 90] Kim, W.: "Object-Oriented Databases: Definitions and Research Directions", IEEE Transactions on Knowledge and Data Engineering, 2 (3), September 1990.
[King 93] King, R., Novak, M.: "Designing Database Interfaces with DBface", ACM Transactions on Information Systems, 11 (2), April 1993.
[Lockemann 90] Lockemann, P.C., Kemper, A., and Moerkotte, G.: "Future Database Technology : Driving Forces and Directions", in Database Systems of the 90s, (Blaser, A. Ed.), Springer-Verlag, 1990.
[Mariani 92a] Mariani, J.: "Oggetto: An Object Oriented Database Layered on a Triple Store", The Computer Journal, April, 1992.
[Mariani 92b] Mariani, J.A., and Lougher, R.: "TripleSpace : an experiment in a 3-D graphical interface to a binary relational database", Interacting with Computers, 4 (2), 1992.
[Mariani 94] Mariani, J., Rodden, T., Colebourne, A., Benford, S., Bullock, A., and Snowdon, D.: "PITS - Populated Information Terrains", chapter 2 of COMIC Deliverable D4.2, "Computable Models and Prototypes of Interaction", (Benford, S., Bullock, A., Fuchs, L., and Mariani, J., Eds.), 1994, available from http://www.comp.lancs.ac.uk/computing/research/soft_eng and ftp.comp.lancs.ac.uk
[Meyer 92] Meyer, B.: "Beyond Icons Towards New Metaphors for Visual Query Languages for Spatial Information Systems", in Proc. International workshop on Interfaces to Databases, Glasgow, July 1992.
[Misra 88] Misra, S., Jalics, P.: "Third-Generation versus Fourth-Generation Software Development", IEEE Software, 5 (4), July 1988.
[Monk 92] Monk, S., Sommerville, I.: "A Model for Versioning of Classes in Object-Oriented Databases", in Proc. BNCOD 10, Aberdeen, 1992.
[Monk 94] Monk, S.: "A Graphical User Interface for Schema Evolution in an Object-Oriented Database", in Proc. 2nd International workshop on Interfaces to Databases, Lancaster, July 1994.
[Myers 90] Myers, B.: "Taxonomies of Visual programming and program Visualisation", Journal of Visual Languages and Computing, 1, 1990.
[Olsen 93] Olsen, K.A., Korfhage, R.R., Sochats, K.M., Spring, M.B. and Williams, J.G.: "Visualisation of a Document Collection: The VIBE System", Information Processing and Management, 29 (1), 1993.
[Ozsoyoglu 93] Ozsoyoglu, G., Wang, H.: "Example-Based Graphical Database Query Languages", IEEE Computer, 26 (5), 1993.
[Paton 94] Paton, N.W., al-Qaimari, G., Doan, K., and Kilgour, A.C.: "Techniques for the Effective Evaluation of Database Interfaces", in Proc. 2nd International workshop on Interfaces to Databases, Lancaster, July 1994.
[Rao 95] Rao, R., Pedersen, J.O., Hearst, M.A., Mackinlay, J.D., Card, S.K., Masinter, L., Halvorsen, P-K., and Robertson, G.G.: "Rich Interaction in the Digital Library", Communications of the ACM, 38 (4), April 1995.
[Radermacher 92] Radermacher, K.: "An Extensible Graphical Programming Environment for Semantic Modelling", in Proc. International workshop on Interfaces to Databases, Glasgow, July 1992.
[Rapley 94] Rapley, M.H. and Kennedy, J.B.: "Three Dimensional Interface for an Object Oriented Database", in Proc. 2nd International workshop on Interfaces to Databases, Lancaster, July 1994.
[Sawyer 95] Sawyer, P., Colebourne, A., Mariani, J., Sommerville, I.: "Database object display definition and management with Moggetto", in Proc. 3rd IFIP 2.6 working conference on Visual Database Systems, Lausanne, March 1995.
[Shneiderman 82] Shneiderman, B.: "The future of interactive systems and the emergence of direct manipulation", Behaviour and Information Technology, 1 (3), 1982.
[Skarra 86] Skarra, A., Zdonik, S.: "The Management of Changing Types in an Object-Oriented Database", in Proc. OOPSLA '86, ACM Press, Portland, Oregon, September 1986.
[Stonebraker 91] Stonebraker, M., Kemnitz, G.: "The POSTGRES Next-Generation Database Management System", CACM, 34 (10), October 1991.
[Stonebraker 93a] Stonebraker, M., Agrawal, R., Dayal, U., Neuhold, E.J., and Reuter, A.: "DBMS Research at a Crossroads : The Vienna Update", Proc. VLDB 93, Dublin, August 1993.
[Stonebraker 93b] Stonebraker, M. et al.: "Tioga: Providing Data Management for Scientific Visualisation Applications", Proc. VLDB 93, Dublin, August 1993.
[Tsuda 89] Tsuda, K., Hirakawa, M., Tanaka, M., Ichikawa, T.: "IconicBrowser: An Iconic Retrieval System for Object-Oriented Databases", in Proc. 1989 IEEE Workshop on Visual Languages, Rome, 1989.
[Woodruff 95] Woodruff, A., Su, A., Stonebraker, M., Paxson, C., Chen, J., Aiken, A., Wisnovsky, P., and Taylor, C.: "Navigation and Coordination Primitives for Multidimensional Visual Browsers", in Proc. 3rd IFIP 2.6 working conference on Visual Database Systems, Lausanne, March 1995.
[Yen 93] Yen, M., Scamell, R.: "A Human Factors Experimental Comparison of SQL and QBE", IEEE Transactions on Software Engineering, 19 (4), April 1993.
[Zloof 75] Zloof, M.M.: "Query-by-Example", Proc. AFIPS National Computer Conference, 1975.