A Pen-based Database Interface for Mobile Computers Rafael Alonso V.S. Mani Matsushita Information Technology Laboratory 2 Research Way, 3rd Floor Princeton, NJ 08540 falonso,
[email protected]
Abstract
The appearance on the market of pen-based computers with wireless communication capabilities has led to the ability to access geographically distributed information bases. Moreover, such computing devices require database interfaces that are easy-to-use, avoid large scale data input via keyboard use, and are suited for smaller display screens. We establish the need for newer interfaces to deal with problems speci c to mobile computing devices that use the pen as a primary input device. We describe an implementation of a pen-based graphical database interface on a pen computer (using the PenPoint operating system) with a built-in cellular phone. Using the cellular phone capability, the interface connects to remote databases and displays the schema information of the database chosen by the user. In response to gestures made by the user using the pen, the interface automatically generates queries and retrieves results. We use the Universal Relation concept to aid in automatically generating queries based on the attributes chosen by the user. We provide a summary of the status of this interface, some implementation details and suggest some future directions. 1 Introduction
The advent of pen-based computers with wireless communication capabilities (also known as Personal Communicators or Personal Digital Assistants) has opened the door to a new wave of possible applications. Users' expectations have grown beyond the obvious requirements of inter-personal communication (e.g. electronic mail), simple computations (e.g. spreadsheets), basic record-keeping (e.g. calendars), to more sophisticated requirements such as accessing information stored in geographically distributed information bases, including database management systems (DBMSs). Typically, a DBMS is accessed either by directly supplying a data manipulation language such as SQL, or by accessing the data using an application program which embeds database commands. Most commercial DBMSs today support both interfaces. As applications are ported to pen-based computers, users can continue to access DBMS information in the usual manner. However, the obvious approach of try-
ing to use character recognition software to translate pen strokes into SQL commands has several problems. First, the state of the art in handwriting recognition is such that writing more than a few words at a time is very cumbersome. Additionally, even if perfect character recognition were available, - and the state of the art as evidenced by the AT&T EOs ([EO93]) and the Apple MessagePads ([App93]) is far from perfect a graphically-oriented interface is preferable for users than one based on direct SQL input. Furthermore, the size of the typical pen based computer is much smaller than that of a normal workstation, placing additional constraints on interface design. Indeed, the screen sizes of these devices vary from being as small as 15 square inches on the Apple MessagePads (\Newtons") to about 48 square inches on the larger EO 880s. Finally, users are interested in accessing data having a complex schema, and thus require a more powerful interface than is possible by supporting solely SQL commands. Thus, we believe that a powerful DBMS interface for mobile environments must have the following characteristics: (1) be graphically oriented (2) minimize pen-strokes (3) be more powerful than a simple SQL front-end and (4) make judicious use of the limited screen area. We have attempted to address these and other mobile interface issues as part of our work. Initially, we have tackled the rst two issues just listed. Others, such as optimization techniques for smaller screen sizes, operations in the face of frequent disconnections, etc., will be addressed in the future. Our basic approach is based on previous work in data modeling and in graphical user interfaces for databases. Our initial prototype has now been completed, and the status of the system is described here also. 2 The Interface
Most previous research on graphical user interfaces [AGS90, BH86, CERE90, KKS88, Kun92, SBMW91, Kim86] has made use of mouse operations to simplify user interactions. In pen-based environments, the mouse can almost always be replaced by the pen without any loss in functionality. However, workstation based GUIs typically use the keyboard to input large amounts of data. In a pen-based environment,
for reasons of faulty handwriting recognition, it is not feasible to replace all keyboard input with pen writing input. Furthermore, \soft" keyboards available on most pen computers do not fully address the problem. 1 The universal relation model [KKF+ 84, Ull89, Fag83] simpli es the relational model by providing a \relationless" view of the schema. We use the concept of the universal relation to gure out interesting join orders between tables in the database. However, we present the database schema as a collection of tables and attributes as opposed to the unilevel collection of relational attributes advocated by the universal relation. We believe that the structure within the schema often helps rather than hinders the user in understanding the semantics of the database in formulating queries. [Mak77, KR89, Kim90] all describe related work in nested relational databases and object-oriented databases that are relevant in this regard.
2.1 Implementation details
An earlier version of the interface resided on a DECstation 5000/200 workstation [AHK92]. The code was about 8000 lines of C++ code, and had bene ted largely from the use of the InterViews toolkit to generate X Window screens. The prototype had been connected to a Sybase relational DBMS containing small synthetic databases. The pen-based interface, based on the same ideas as the proof-of-concept prototype mentioned above, has been implemented successfully on two mobile platforms - the Intel based NCR 3125 and the Hobbit based EO 440 and 880 computers. The system is written in C (with additional macro support to support the object-oriented capabilities of the operating system) on top of the PenPoint operating system ([GO 92]). The lack of availability of a commercial SQL-based relational database engine that worked on a pen-based machine resulted in the need to write a small \quick and dirty" database engine that performed basic simple operations on the pen computer to enable basic database operations (scan a table, perform equi-join across two tables, select speci c columns of interest from a table etc.). Using the PenAscii ([Com93]) programming interface, which allows connection to a Unix host via a terminal emulation program over the phone line, the interface can connect to the Sybase database system on the Unix host. The character based terminal emulation program allows our interface to make database requests over to the server side. The results of the requests are gathered in a le and transferred over to the pen host. Currently, the program works over both the phone line as well as the cellular phone. The above solution ( le transfer using terminal emulators over the cellular phone line) represents only a rst generation solution towards being able to access remotely stored data over a wireless communication connection. In the future, as wireless communi1 A soft keyboard is a keyboard image that appears on the screen; the user taps on each individualkey using the pen. Needless to say, this is fairly cumbersome and is not a mechanism for either high speed or large scale input.
cation products for these pen-based devices become more commonplace and sophisticated, we will employ more elegant solutions to achieve our purposes. The pen-based interface currently is made up of about 20,000 lines of PenPoint C code, in addition to about 1,000 lines of C code residing on the DECstation 5000/200 under Ultrix. The latter code implements a simple server that serves as a gateway to the Sybase database system on the Unix host. The interface (on the pen machine) connects to this server using the terminal emulation program and uses a simple protocol to supply commands and receive results via a le transfer (using either the XMODEM or the KERMIT protocols).
2.2 System Architecture
As mentioned earlier, we use the universal-relation concepts to minimize the amount of actual \writing" the user must undertake in order to formulate a given query. Currently, the database schema is displayed as a set of tables with attributes that correspond to the columns of each table. Further work is necessary to integrate the interface to other types of information services which may not be relational in nature. The interface consists of two distinct components (see Figure 1), the user graphical interface piece that the user interacts with (using the pen) and the database access component that actually performs the actions of accessing the database and issuing commands and receiving results. In addition, there is a simple database engine present on the pen computer that performs very rudimentary database commands using data stored on the local storage medium of the pen computer (either a hard disk or ash memory). The database access component is responsible for all interactions with the terminal emulation program and issues commands to and receives results from the remote server. All interfaces shown on the pen computer are implemented using the message passing paradigm supported by the object-oriented PenPoint operating system. The implementation ensures that as superior communication facilities become available, only those pieces of the system need to be rewritten.
2.3 Sample Interactions
We present a brief description of a sample session interaction with the database using the interface. Pen-based computers allow the pen to combine the traditional roles of the mouse as well as the keyboard. The analogue to clicking a mouse button is \tapping" the pen on the item of interest. Handwriting recognizers help convert handwritten text into characters, thus serving the role of the keyboard in traditional computers. In addition, the pen allows the use of a set of core \gestures". Using the pen, the user draws a gesture on the screen which is captured by the system and delivered to the application to handle. The PenPoint operating system provides recognition of a number of gestures (ranging from alpha-numeric characters to gestures such as a quadruple-tap); in addition, some of the gestures have consistent meanings across applications. For instance, the caret gesture in many contexts is a request to display a list of items to choose from; in our domain, this maps very well to
User interacts
Graphical User Interface
DB commands
DB results
cellular line Simple Sybase Application
Database Access Component
Terminal
Emulator
Sybase library calls Simple Data base
Database engine
Remote Host
Pen Computer
Figure 1: System Architecture displaying a list of databases whose schemas are to be displayed. To allow users to be able to use the interface easily we allow for two modes of using the pen to input commands; one is the usual pull down menu, and the other is via gestures. In addition, some of the more common operations have associated control buttons at the bottom of the screen for easy access. The user connects to any DBMS the system recognizes by tapping on a control button with the pen or making a caret gesture with the pen (the list of gestures that the program understands can be obtained by drawing a question mark - \?" - on the screen which is a means of obtaining help). When the user connects to the database management system, a list of available databases is displayed. One of these is then selected by the user. As soon as the database represented by the user's choice is opened, the \top-level" relations are displayed. The user then picks attributes of interest by simply tapping on them with the pen. He or she may then ask to display all possible join paths between them with either another tap of the pen or a \P" gesture. All possible join paths are displayed one by one. The join paths are sorted with the shorter length paths appearing rst. The user now selects the desired join
path. Currently, the criterion for joining two tables is that the names of elds in the two tables should be identical. In future, we plan to limit the number of such join paths displayed by using a variety of criteria. For instance, rather than asking the user to pick a join path, the system would automatically pick one based on previous user access history. The resulting join is then displayed along with the SQL query that represents it and the results of the query. The query may be modi ed by the user by adding to (or deleting from) the SQL query text with the pen and re-submitting the query for execution. Using the pen, the user could similarly browse through a table's contents and perform update operations (insert, delete and update) on the table, if required. Most other obvious operations can be easily performed with the pen as well: scrolling, selecting and moving objects etc. All communication between the pen machine and the remote host is performed over the phone line (either standard wired telephone connection or cellular phone). Two screen dumps are shown in Figures 2 and 3; notice that the user has the option of having the type of the columns displayed or turned o in the interest of saving screen space (this may be crucial when the
Figure 2: Simple Join Query Execution
Figure 3: Insert into table \person"
schema consists of many tables and the user wishes to see as many tables displayed on the working area as possible).
dierence in baud rates in the communication speeds for the two modes (9600 for the wired and 1200 for the wireless modes). A solution using TCP/IP, for instance, would reduce processing times for the remote operations signi cantly, since the overheads involved in interacting with character-based terminal emulations will not be a factor any longer. The terminal emulation approach is also slowed down by the le transfer mechanism to return results to the pen computer. The le transfers occasionally had to resend packets of data (about 1 packet of 128 bytes was resent roughly for every 30 or so error-free packets sent). This occurred only in the wireless mode of operation. In the browse mode of operation, request for the rst row results in the query being actually executed. Subsequent row requests simply return rows one by one; this explains the disparity between processing times for the rst versus subsequent row retrieval.
2.3.1 Sample Measurements
We used two small databases \pizza" and \pubs" 2 containing 3 and 10 tables respectively. Both these databases contain tables containing anywhere from 10 to about 80 rows. The information in these database was replicated on the pen computer as well to facilitate local operations. We performed a set of simple queries on both these databases in three modes of operation - local, where the pen machine uses its local database system to access all the data, wired, where we use a phone line connection running at 9600 baud to connect to the Unix host, and wireless, where we use a cellular phone connection at 1200 baud 3 to connect to the Unix host. We tabulate the results in the form of a table (see Table 1). All times shown in the table are in seconds. As can be seen, there is an order of magnitude dierence in processing time between local and remote operations. Since, the results are returned via a le transfer, as the results' size becomes larger and larger, the dierence in processing speed between the wired and wireless modes increases. This is, of course, due to the This is the sample database shipped with Sybase [Syb90] Speeds faster than 1200 caused communication errors over the cellular line. 2 3
3 Conclusions and Future Work
We have established the need for pen-based database interfaces suited for mobile computers, and have presented a description of a working prototype that accesses remotely stored data over a wireless connection. In future, we would like to be able to develop the ideas described on a variety of mobile platforms with potentially dierent wireless access solutions. In addition, we are studying various other problems in the area of nomadic systems. For example, we have explored issues related to the presentation of large
Operation Local Wired Wireless Result size (in bytes) Login - 64.20 70.24 Connect to database 0.67 14.57 17.08 128 Open \pizza" 2.17 19.04 22.12 384 select * from person 0.84 16.46 22.17 768 select * from likes 0.69 25.56 57.07 3200 join person,toppings 0.94 19.84 22.59 128 Open \pubs" 7.18 31.33 48.92 1152 select * from authors 7.54 31.48 68.47 4992 select * from titles 8.03 39.74 109.90 7808 join authors,titleauthors 1.07 26.14 37.43 1664 Browse Operations Get rst row 0.15 29.22 32.06 128 Subsequent rows 0.06 14.01 16.44 128 Table 1: Measurements schemas in a limited screen space, developed query optimization techniques to minimize battery utilization [AG93], developed new crash recovery techniques that help users when their \local" database operations are lost owing to failures, etc. We have also outlined a number of relevant issues in mobile database management systems in [AK93]. We plan to integrate some of these ideas into our current prototype implementation. References
[AG93] R. Alonso and S. Ganguly. Query optimization for energy eciency in mobile environments. Proceedings of the Fifth Workshop on Foundations of Models and Languages forData and Objects, 1993. [AGS90] R. Agrawal, N. H. Gehani, and J. Srinivasan. Odeview: The graphical interface to Ode. In Proceedings of ACM-SIGMOD 1990 International Conference on Management of Data, Atlantic City, New Jersey, pages 34{43, May 1990. [AHK92] R. Alonso, E. Haber, and H. F. Korth. A database interface for mobile computers. In Proceedings of the 1992 Globecom Workshop on Networking of Personal Communication Applications, December 1992. [AK93] R. Alonso and H. Korth. Database system issues in nomadic computing. Proceedings of the ACM SIGMOD International Conference on Management of Data, 1993. [App93] Apple Computer, Inc. Newton MessagePad Handlbook, 1993. [BH86] D. Bryce and R. Hull. SNAP: a graphics-based schema manager. In Proceedings of the Second International Conference on Data Engineering, Los Angeles, page 151, February 1986.
[CERE90] B. D. Czejdo, R. A. Elmasri, M. Rusinkiewicz, and D. W. Embley. A graphical data manipulation language for an extended entity-relationship model. IEEE Computer, 23(3):26{36, March 1990. [Com93] Compsoft Services Inc. PenAscii Version 2.0 Application Developer's Guide, 1993. [EO93] EO. Lookup Guide to the EO Personal Communicator, 1993. [Fag83] R. Fagin. Acyclic database schemes (of various degrees): A painless introduction. Research report rj3800, IBM Almaden Research Center, San Jose, CA, February 1983. [GO 92] GO Corporation. PenPoint Architectural Reference Volumes I and II, 1992. [Kim86] H.-J. Kim. Graphical interfaces for database systems: A survey. In Proceedings of the 1986 Mountain Regional ACM Conference, Santa Fe, NM, 1986. [Kim90] W. Kim. Introduction to Object-Oriented Databases. MIT Press, Cambridge, MA, 1990. [KKF+ 84] H. F. Korth, G. M. Kuper, J. Feigenbaum, A. Van Gelder, and J. D. Ullman. System/U: A database system based on the universal relation assumption. ACM Transactions on Database Systems, 9(3):331{347, September 1984. [KKS88] H.-J. Kim, H. F. Korth, and A. Silberschatz. PICASSO: a graphical query language. Software Practice and Experience, 18(3):169{203, March 1988. [KR89] H. F. Korth and M. A. Roth. Query Languages for Nested Relational Databases, volume
361 of Lecture Notes in Computer Science, pages 190{206. Springer-Verlag, 1989. [Kun92] M. Kuntz. The gist of GIUKU { graphical interactive intelligent utilities for knowledgeable users of data base systems. ACM SIGMOD Record, March 1992. [Mak77] A. Makinouchi. A consideration on normal form of not-necessarily-normalized relation in the relational data model. In Proceedings of the Third International Conference on Very Large Databases, Tokyo, pages 447{453, 1977. [SBMW91] G. H. Sockut, L. M. Burns, A. Malhotra, and K.-Y. Whang. GRAQULA: A graphical query language for entity-relationship or relational databases. Research report RC 16877, IBM T. J. Watson Research Center, Yorktown Heights, NY, March 1991. [Syb90] Sybase Corporation. Sybase Release 4.2 Documentation Set, 1990. [Ull89] J. D. Ullman. Principles of Database and Knowledge-Base Systems, volume 2. Computer Science Press, Rockville, MD, 1989.