Integrating Physical Media into Human Computer Interaction
Master’s Thesis at Graz University of Technology submitted by
Raimund Schatz
Institute for Information Processing and Computer Supported New Media (IICM) Graz, University of Technology A-8010 Graz, Austria
Graz, April 2002
Thesis Supervisor: o. Univ. Prof. Dr.phil. Dr. h. c. Hermann Maurer
Die Einbindung physischer Medien in die Mensch-Maschine-Kommunikation
Diplomarbeit an der Technischen Universität Graz vorgelegt von
Raimund Schatz
Institut für Informationsverarbeitung und Computergestützte neue Medien (IICM) Technische Universität Graz A-8010 Graz
Graz, im April 2002
Diese Diplomarbeit ist in englischer Sprache verfasst.
Begutachter: o. Univ. Prof. Dr.phil. Dr. h. c. Hermann Maurer
Abstract
As computers become smaller and smaller while getting more powerful at the same time, they find their way into an ever-broadening number of human activities. The more pervasive computers become in our daily lives, the more the interaction with them becomes a critical factor for their success in aiding us in our tasks. This tendency calls for the evolution of novel interaction paradigms and interfaces which let human users keep pace with the progressing spread of information technology. This thesis describes novel interaction paradigms that have emerged during the last decades, suggesting the seamless integration of computation with our physical environment. It also discusses related technologies and systems for using physical media as an integral part of the human-computer interface. The main part of this work describes the design and implementation of the senseDesk, a fully functional interface system that enables users to interact with computers by means of direct manipulation of physical objects instead of handling traditional pointing devices. In addition to the discussion of the requirements analysis, design, and implementation of hard- and software, some demonstration applications for the senseDesk and related interaction design issues are presented.
Zusammenfassung Computer werden laufend kleiner und gleichzeitig immer schneller, eine Entwicklung, welche die Anwendungsbereiche von Rechnern stetig ausweitet. Je mehr Computer in unser tägliches Leben eindringen, desto mehr wird auch die Interaktion zwischen Mensch und Maschine ein kritischer Faktor für ihren erfolgreichen Einsatz. Diese Tendenz macht die Entwicklung neuer Interaktionsparadigmen und dazugehöriger Benutzerschnittstellen notwendig, um den Benutzer mit der wachsenden Verbreitung von Informationstechnologie Schritt halten zu lassen. Die vorliegende Diplomarbeit diskutiert neuartige Interaktionsparadigmen, die im Laufe der letzten Jahre entstanden sind und die nahtlose Integration von Computern mit unserer physischen Umgebung beschreiben. Weiters werden Technologien und Systeme erläutert, die physische Objekte und Medien als integralen Bestandteil der Mensch-Maschine Kommunikation einsetzen. Der Hauptteil dieser Arbeit beschreibt die Entwicklung des senseDesk, ein vollständig realisiertes System, welches dem Benutzer erlaubt, mit dem Computer mittels realer, physischer Objekte zu interagieren, anstatt unter Verwendung herkömmlicher Zeigegeräte. Es werden die Analyse der Anforderungen, Design, und Implementierung von Hard- und Software erläutert. Weiters werden einige Anwendungsbeispiele des senseDesks diskutiert, gemeinsam mit auftretenden Fragen zur Gestaltung der Interaktion.
Acknowledgements
I would like to thank Prof. Hermann Maurer for his thesis advice and for supporting me and my unorthodox project. Thanks to my brother Gernot Schatz, Axel Schmitzberger, Stefan Rinner, the Nofrontierans, and all my other friends and colleagues who have helped me with fruitful discussions and comments during the evolution of this work. I also have to thank Sibylle for her enduring love and patience during the hard times of writing this thesis. Finally, I would like to thank my parents Josef and Gunhild Schatz for their overwhelming love and support during all the years of my studies.
Raimund Schatz
Graz, Austria, April 2002
I hereby certify that the work reported in this thesis is my own and that work performed by others is appropriately cited.
Signature of the author:
Ich versichere hiermit, diese Arbeit selbstständig verfasst, andere als angegebenen Quellen und Hilfsmittel nicht benutzt und mich auch sonst keiner unerlaubten Hilfsmittel bedient zu haben.
"Any sufficiently advanced technology is indistinguishable from magic." The Third Law of Arthur C. Clarke
Table of Contents

1 Introduction .......... 1
1.1 Motivation .......... 1
1.2 Thesis Scope and Structure .......... 1
2 The Graphical User Interface .......... 3
2.1 Basic Idea .......... 3
2.2 Concepts .......... 3
2.2.1 The WIMP Paradigm .......... 3
2.2.2 The Desktop Metaphor .......... 4
2.2.3 Direct Manipulation .......... 6
2.3 Problems of GUIs .......... 7
3 Beyond GUI – Novel Interaction Paradigms .......... 10
3.1 Ubiquitous Computing .......... 10
3.1.1 The Computer of the 21st Century .......... 10
3.1.2 Calm Technology .......... 10
3.2 Augmented Reality .......... 12
3.2.1 Concept and Technology .......... 12
3.2.2 The DigitalDesk .......... 13
3.2.3 Tangible Interaction for Augmented Reality .......... 14
3.3 Graspable User Interfaces .......... 16
3.3.1 Concept and Technology .......... 16
3.3.2 Key Properties .......... 16
3.4 Tangible Media .......... 17
3.4.1 Tangible Bits .......... 17
3.4.2 Key Concepts .......... 18
3.4.3 Goals .......... 18
3.4.4 Tangible User Interfaces .......... 19
3.4.5 Additional Examples of Tangible User Interfaces .......... 24
4 senseDesk – System Design and Implementation .......... 32
4.1 Motivation .......... 32
4.2 System Definition .......... 32
4.2.1 Usage Scenarios .......... 33
4.2.2 Related Work .......... 33
4.2.3 System Concept .......... 35
4.3 System Requirements .......... 37
4.3.1 Interaction Space .......... 37
4.3.2 Complexity, Costs and Feasibility .......... 37
4.3.3 Integration of Multimedia .......... 37
4.3.4 Elegance and Aesthetics .......... 37
4.3.5 Number of Objects, Degrees of Freedom .......... 38
4.3.6 Flexibility and Extendibility .......... 38
4.3.7 Robustness .......... 38
4.3.8 Low Latency, High Accuracy .......... 38
4.4 Hardware Architecture .......... 39
4.4.1 Sensing of Physical Properties .......... 39
4.4.2 Sensing Tags .......... 40
4.4.3 Hardware for Sensing .......... 44
4.4.4 Prototype Hardware Architecture .......... 45
4.4.5 Hardware Details .......... 46
4.5 Software Architecture .......... 46
4.5.1 Research: Software Development Platform .......... 47
4.5.2 Macromedia Director .......... 47
4.5.3 Software Components Overview .......... 51
4.5.4 Software Framework Architecture .......... 52
5 senseDesk – Application Development .......... 62
5.1 Application Requirements .......... 62
5.2 Development Process .......... 63
5.3 humanExplorer .......... 63
5.3.1 Motivation .......... 63
5.3.2 Prototype .......... 65
5.3.3 Concept .......... 65
5.3.4 Application .......... 66
5.3.5 Discussion .......... 67
5.3.6 Conclusion and Future Work .......... 67
5.4 wayFind .......... 68
5.4.1 Motivation .......... 68
5.4.2 Prototype .......... 68
5.4.3 Concept .......... 71
5.4.4 Application .......... 72
5.4.5 Discussion .......... 73
5.4.6 Conclusion and Future Work .......... 75
5.5 forceField .......... 76
5.5.1 Motivation .......... 76
5.5.2 Concept .......... 76
5.5.3 Application .......... 77
5.5.4 Discussion .......... 78
5.5.5 Conclusion and Future Work .......... 79
5.6 spacePlan .......... 80
5.6.1 Motivation .......... 80
5.6.2 Concept .......... 80
5.6.3 Application .......... 82
5.6.4 Discussion .......... 83
5.6.5 Related Work .......... 85
5.6.6 Conclusion and Future Work .......... 87
5.7 megaPong .......... 87
5.7.1 Motivation .......... 87
5.7.2 Concept .......... 88
5.7.3 Application .......... 89
5.7.4 Discussion .......... 91
5.7.5 Conclusion and Future Work .......... 91
6 Results .......... 93
6.1 Advantages of the senseDesk .......... 93
6.2 Disadvantages of the senseDesk .......... 94
6.3 Design Guidelines .......... 96
7 Conclusion .......... 100
7.1 Summary .......... 100
7.2 Future Work .......... 101
Bibliography .......... 102
1 Introduction

1.1 Motivation
One of the important goals in Human Computer Interaction is to develop more intuitive and usable interfaces. The Graphical User Interface (GUI) with its visual metaphors has become the successful standard interaction paradigm for desktop computing during the last three decades. Nowadays computers are used in an ever-broadening number of both business and leisure activities, a trend moving computers away from the typical desktop configuration with its mouse, keyboard, and monitor, and distributing them in our environment. Computing devices are becoming smaller and ubiquitous, and interaction with them is becoming more and more pervasive in our daily lives. In all these settings, the need arises for more context-specific and human-centered ways of interacting with technology. Pointing, clicking, and typing, though still appropriate for many uses of computers, will not be the only way most people interact with the majority of computing devices in the foreseeable future. The interaction with computers merged into the real world cannot be handled with the limited traditional, screen-based GUI approach alone. Novel interaction paradigms along with appropriate interface systems are necessary in order to transcend the limitations of GUIs and to better integrate computing with the physical environments we inhabit.
1.2 Thesis Scope and Structure
This thesis begins with an explanation of the concepts underlying traditional Graphical User Interfaces, followed by a discussion of selected novel interaction paradigms which have emerged during the last decade. Furthermore, the design and development of a working interface system – the “senseDesk” – is described, which allows users to directly manipulate digital environments by means of physical media. The implemented system is then discussed and evaluated with regard to technology and interaction issues, and guidelines supporting the interaction design for this type of interface are presented.

Chapter 1 Introduction: This chapter gives an introduction to the topic and a thesis overview.

Chapter 2 The Graphical User Interface: The fundamental concepts underlying the Graphical User Interface and the relationships between them are explained, followed by a discussion of the inherent weaknesses of Graphical User Interfaces, motivating the investigation of novel interaction paradigms.
Chapter 3 Beyond GUI – Novel Interaction Paradigms: This chapter describes interaction paradigms such as Ubiquitous Computing, Augmented Reality, and Graspable User Interfaces along with emerging concepts of Tangible Media, focussing on the idea of Tangible User Interfaces.

Chapter 4 senseDesk – System Design and Implementation: The design and implementation of the senseDesk is described, a hardware/software system which allows exploration and demonstration of applications for desk-based Tangible User Interfaces.

Chapter 5 senseDesk – Application Development: The conception and development of various demonstration applications for the senseDesk are explained, which show the uses and benefits of desk-based TUIs. This chapter also illustrates the interaction design process and related implementation issues.

Chapter 6 Results: The senseDesk is evaluated and its tangible interface compared with traditional GUI interaction. In addition, a set of general design guidelines regarding tangible interaction design is presented.

Chapter 7 Conclusion: This chapter summarizes the work of this thesis and outlines future work.
2 The Graphical User Interface
This chapter describes the idea and fundamental concepts of the Graphical User Interface (GUI). Furthermore, it discusses the inherent shortcomings of GUIs, motivating the investigation of novel interaction paradigms.
2.1 Basic Idea
A GUI is the part of a computer’s operating system which enables a person to communicate with the machine by means of symbols, visual metaphors, and pointing devices. The GUI is based on the idea that pointing to something is the most basic human gesture, and that using the mouse is easier than typing commands with the keyboard, as known from operating systems such as MS-DOS or UNIX. Enabling the user to issue commands and manipulate objects directly by pointing gestures, instead of requiring him to remember and type in hundreds of keywords, fundamentally altered the nature of computing. Since the eighties, it has made computers accessible to a broad audience for a variety of applications such as word processing, spreadsheet calculation, and drawing. Today, GUIs are an integral part of widespread operating systems such as Microsoft Windows™ and Apple MacOS™.
Figure 2.1-1: Screenshot of a Windows GUI (left) and Macintosh GUI (right).
2.2 Concepts
The GUI itself is not a singular concept invented by one person or institution. It evolved over decades by means of a series of innovations and principles originating from different directions, which have formed today’s notion of GUIs.

2.2.1 The WIMP Paradigm
WIMP was conceived by Douglas Engelbart around 1968. At the SRI lab, located in the area of California which is known today as Silicon Valley, Engelbart experimented with a number of wooden mouse prototypes for selecting and highlighting, menus, and text-based windows. The acronym WIMP itself stands for Windows, Icons, Menus and Pointing device.
A WIMP interface consists of the following standard elements [Dix1998]:

• Windows have a display area, a title bar, and controls. The controls, when present, may be used to enlarge, shrink, or close a window. The title bar can generally be used to move the window. Some types of windows cannot be moved or resized. Scrollbars may or may not be present.

• Icons are small pictures used to represent entities or actions. Related icons may be grouped into a family in which each type in the family is a variant of a base icon design.

• Menus and Controls come in different styles and types such as pull-down menus, popup menus, toolbars, palettes, radio buttons, selection lists, dialog boxes, etc.

• Pointing devices, such as the mouse, are most common, but trackballs, touchpads, tablets and pens, etc., are also frequently used in WIMP interfaces.

Figure 2.2-1: A typical WIMP interface.

WIMP interaction can be summarized as follows [Bea2000b] (a minimal dispatch sketch follows the list):

• application objects are displayed in document windows.

• objects can be selected and sometimes dragged and dropped between different windows.

• commands are invoked through menus or toolbars, often bringing up a dialog box.
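To make this interaction style concrete, the following minimal sketch shows the core of a WIMP-style dispatch loop: pointer events are hit-tested against windows and their widgets, and the widget under the pointer receives the command. This is an illustration added here, not code from any system discussed in this thesis; all class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Widget:
    """A rectangular screen element such as an icon, button, or menu entry."""
    x: int
    y: int
    w: int
    h: int
    on_click: Callable[[], None]

    def contains(self, px: int, py: int) -> bool:
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h

@dataclass
class Window:
    """A document window owning a list of widgets."""
    x: int
    y: int
    w: int
    h: int
    widgets: List[Widget] = field(default_factory=list)

    def contains(self, px: int, py: int) -> bool:
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h

def dispatch_click(windows: List[Window], px: int, py: int) -> None:
    """Route a pointer click to the topmost window and the widget under the cursor."""
    for window in reversed(windows):            # topmost window first
        if window.contains(px, py):
            for widget in window.widgets:
                if widget.contains(px, py):
                    widget.on_click()           # invoke the command bound to the widget
                    return
            return                              # click landed on an empty window area

# Usage: one window containing a single "Open" button at (10, 10).
doc = Window(0, 0, 400, 300, [Widget(10, 10, 80, 24, lambda: print("Open command invoked"))])
dispatch_click([doc], 20, 20)                   # prints "Open command invoked"
```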
Metaphors are not an essential part of WIMP interfaces, but many applications involve some use of metaphors. A good example is the desktop metaphor, which merges elements and concepts related to real-world office desks with the interface.

2.2.2 The Desktop Metaphor
The Xerox Star from PARC (Palo Alto Research Center) was introduced in 1981 as the first commercial personal computer to use the desktop metaphor, which is still common in today’s GUIs.
An early publication [Smith1982] about the Star said: “Every user’s initial view of the Star is the Desktop, which resembles the top of an office desk, together with surrounding furniture and equipment. It represents a working environment, where projects and accessible resources reside. On the screen are displayed pictures of familiar office objects, such as documents, folders, file drawers, in-baskets, and out-baskets. These objects are displayed as small pictures, or icons. The Desktop is the principal Star technique for realizing the physical office metaphor. The icons on it are visible, concrete embodiments of the corresponding physical objects. Star users are encouraged to think of the objects on the Desktop in physical terms. You can move the icons around to arrange your Desktop as you wish. (Messy Desktops are certainly possible, just as in real life). You can leave documents on your Desktop indefinitely, just as on a real desk, or you can file them away.”
Figure 2.2-2: Screenshots of the Star desktop.
The advent of the desktop metaphor made the user deal mostly with documents (= data files), with the corresponding application associated with them automatically. Documents could be placed on the desktop, or be filed, or be placed in the out-basket for e-mailing, etc. Everything could be arranged in an arbitrary order, just as in real life. As a helpful unification, a small set of generic commands could be applied to all data (Move, Copy, Open, Delete, Show, Properties). In order to control the system, the user manipulated graphical elements on the screen instead of typing commands as in the traditional computer systems of the time. Icons made it easier to find the desired files and provided a familiar way to sort documents, either filing them or just piling them on the desktop. Another important quality was consistency, meaning that everything looked and worked the same way throughout the system. With the help of the desktop metaphor, the Xerox Star greatly reduced the steepness of the learning curve for its products and made computer technology available to a much larger class of users than before. This development has clearly had many benefits: it increased productivity in many office tasks and placed users in direct control of the computer, instead of making them use less efficient interaction techniques such as command line typing or having to rely on intermediary technical staff [John1989].
Figure 2.2-3: The Xerox Star system.
2.2.3 Direct Manipulation
In 1983 HCI researcher Ben Shneiderman published one of his most important papers, "Direct manipulation: A step beyond programming languages" [Shn1983]. In this paper, he offered a new vision of interactive systems. In a direct manipulation interface, task objects are presented on a screen and the user has a repertoire of manipulations that can be performed on any of them. The user has no command language to remember beyond a standard set of manipulations, few cognitive changes of mode, and a reminder of the available objects and their states shown continuously on the display. Direct manipulation interfaces provide

• a natural visual representation of task objects and actions of interest.

• rapid, incremental, and reversible actions.

• immediate presentation of the multiple impacts of a change.

• selection instead of typing.

• emphasis on task domain representations that produce low demands for syntactic and computer knowledge.
Direct manipulation interfaces have the advantage that they usually are rapidly learned, have low error rates, produce high user satisfaction, and are easy for users to memorize over time [Shn1998]. Direct manipulation requires that distinct functions be invoked and controlled in spatially distinct screen locations, in ways that are specific and appropriate for the particular function being controlled. Continuous functions (e.g. screen brightness, scroll position) should be controlled via continuous controls such as sliders, knobs, and dials. Discrete functions (e.g. choosing a text style) should be controlled via discrete means such as commands, multi-position switches, or menus. This way, a direct manipulation system has a different input channel for every function the user can have it perform. Previous interfaces such as command line shells were indirect in that there was a single, general interface to all functionality (e.g. a keyboard and command language or a menu). In other words, there was only one input channel for all kinds of input. Different kinds of input were distinguished linguistically, rather than spatially [Shn1997].
Figure 2.2-4: Controls in a direct manipulation interface (left) compared to a command line interface (right).
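The distinction between continuous and discrete controls can be illustrated with a small Tkinter sketch (an illustration added for clarity, not part of the original text): a slider is dedicated to the continuous brightness function, while an option menu is dedicated to the discrete text-style function, so each function gets its own spatially distinct control.

```python
import tkinter as tk

root = tk.Tk()
root.title("Direct manipulation: dedicated controls per function")

# Continuous function: screen brightness, controlled by a slider.
def set_brightness(value):
    print(f"brightness set to {value}%")

tk.Label(root, text="Brightness").pack()
tk.Scale(root, from_=0, to=100, orient="horizontal",
         command=set_brightness).pack()

# Discrete function: text style, controlled by a multi-position option menu.
style = tk.StringVar(value="Regular")

def set_style(*_):
    print(f"text style set to {style.get()}")

style.trace_add("write", set_style)
tk.Label(root, text="Text style").pack()
tk.OptionMenu(root, style, "Regular", "Bold", "Italic").pack()

root.mainloop()
```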
It is important to distinguish between WIMP interfaces and direct manipulation interfaces. Direct manipulation interfaces allow the user to manipulate objects of interest or perform actions on them directly, which usually entails using a pointing device. In general, WIMP interfaces can be considered a type of direct manipulation interface. A WIMP interface may, however, merely offer direct access to controls (i.e. buttons, menus) without letting the user manipulate the objects of interest directly. For instance, on most commercial websites the user may have to click a button to put an item into a shopping cart instead of dragging the item and dropping it on the cart as in direct manipulation. This distinction is subtle but important.
2.3 Problems of GUIs
The GUI has proven a successful and durable model for human-computer interaction, which has dominated the last two decades of interface design. The main problem with the dominance of the GUI paradigm is that it tends to blind user interface designers and researchers to other interaction styles. It has been almost universally assumed that a desktop-style direct manipulation user interface is the best fitting user interface for all applications. Almost all user interface design and implementation tools support only this style of interaction. For example, a binary two-dimensional selecting device (e.g. mouse, trackball) is implicitly written into the Java programming language, with no native means for the programmer to assume alternative devices. Existing user interface style standards therefore only address desktop user interfaces. However, the GUI approach has its weaknesses and therefore falls short in some respects:

GUIs do not always offer true direct manipulation of task objects. Most of the time, the user has to perform indirect manipulation of task objects through direct manipulation of mediating interface elements such as menus and dialog boxes, tools, and controls. These interface elements are not task objects, they are interaction instruments [Bea2000b], which introduce another level of indirectness between the user and the task object: when performing direct manipulation with a GUI interface, the user controls a single mouse pointer acting as a logical handle for working with interaction instruments, which makes three steps of interaction necessary: (1) acquire physical device (mouse), (2) acquire logical device (instrument), (3) manipulate logical task object [Fitz1995]. Due to this level of indirectness, switching between logical functions becomes costly in terms of cognitive effort, caused by distraction of the user’s attention from his actual task. Since in common everyday tasks people are used to picking up a tool (such as a pen) and working directly on the task object (e.g. drawing on a sheet of paper), the development of user interfaces should strive for similar interaction techniques in order to achieve a higher directness in manipulation.

GUIs are biased towards graphical output at the expense of input from the real world. GUIs strictly separate input from output. They define a set of graphical interface elements (windows, icons, menus, etc.) and task objects that reside in a purely digital form. Generic input devices (mouse, keyboard, etc.) are primarily used to manipulate these virtual interface elements, and interaction with them is limited to a generic vocabulary of gestures (point, click, drag). These interactions with GUI elements and task objects, which exist just digitally on the screen, are strictly separated from the ordinary physical environment which users live within and interact with. This results in a drastic limitation of the bandwidth of human-computer interaction, because it neglects the potential of existing rich interface modalities between people and physical objects. Physical objects do not only address our tactile and kinesthetic senses, they also yield stronger affordances than purely visual ones. People normally use their sophisticated everyday skills for manipulating objects: not just pointing and clicking, but also rotating, grasping, attaching. Most of these practices are neglected in GUIs because of the lack of diversity of input/output media [Ishii1997].

Visual representations and metaphors tend to be problematic. In a GUI the task objects and interface elements have to be visible on screen in order to be accessible for user manipulation. Consequently, visual representations have to be found for all of these objects and elements. But very often it is difficult to find suitable graphic representations or visual metaphors for the objects of the problem domain. Furthermore, users have to learn the meaning of the components of these visual representations. A graphic icon may be meaningful to the designer, but may require as much or more learning time than a word. Visual representations may also be misleading. Users may grasp the analogical representation rapidly the first time, but then overestimate or underestimate the functions of the computer-based analogy [Shn1998, p.204]. These problems get worse with increasing complexity of tasks and software.

User interfaces in general should provide ways of representation and manipulation matching human input/output capabilities in order to minimize physical and mental effort. Therefore the goal is to build on the equipment and skills humans have acquired through evolution and experience and to exploit their potential for human-computer interaction. This makes it possible to reduce the gap between the user’s intentions and the actions necessary to input them into the computer. In many cases, this goal can be achieved with the help of different novel interaction paradigms that move interaction away from purely visual on-screen output into the physical environment of the user.
3 Beyond GUI – Novel Interaction Paradigms
This chapter describes the historical development of novel interaction paradigms that transcend the input/output capabilities of traditional GUIs. After a discussion of paradigms such as Ubiquitous Computing, Augmented Reality, and Graspable User Interfaces, the emerging concept of Tangible Media and its vision of Tangible Bits are explained in detail.
3.1 Ubiquitous Computing

3.1.1 The Computer of the 21st Century
In 1991 Mark Weiser published an article on his vision of ubiquitous computing [Weis1991], in which he proposes a new paradigm of computing and HCI, pushing computers into the background and attempting to make them invisible. Ubiquitous computing is fundamentally characterized by the connection of things in the world with computation. Access to computational services is delivered through a number of different devices, which are designed for supporting various tasks in various contexts. Weiser also emphasized the importance of transparent integration of these devices within the physical environment. His team at Xerox PARC implemented a variety of computational devices including Tabs, Pads, and Boards, which have different sizes and affordances in order to fill different interface roles: Tabs are inch-scale machines that approximate active Post-It™ notes, Pads should replace laptop computers and behave like active paper, and the walls in our offices should be turned into yard-scale, reactive displays. This initial vision of Weiser has been an inspiration and catalyst for the user interface community for a long time. However, from a strict user interface viewpoint, most work in ubiquitous computing has followed traditional GUI approaches, adopting the button/stylus interaction with virtual widgets on a graphical surface and exporting this interaction style to small and large computer terminals.
Figure 3.1-1: The Tab, Pad, and Wall.
3.1.2 Calm Technology
In 1995, Mark Weiser extended ubiquitous computing with the notion of calm technology [Weis1995], claiming that user interfaces should be better integrated into the physical periphery of user activities. Calm technology engages both the center and the periphery of the user’s attention, and moves back and forth between the two. He exemplifies this concept by discussing the “Live Wire” display, which was originally designed by Natalie Jeremijenko: Live Wire is a piece of plastic cord which hangs from a small electric motor mounted in the ceiling. The motor is electrically connected to the area Ethernet network in such a way that each passing packet of information causes a tiny twitch of the motor. Formerly invisible bits flowing through the wires of the network become visible and tangible through motion, sound, and even touch. The activity of the wire is visible and audible without being obtrusive, and takes advantage of peripheral cues. While GUI screen displays of network traffic are common, their symbols require interpretation and the attention of the user, and hence do not peripheralize well. Because the string resides in the physical world and does not need direct attention, it has a better impedance match with human peripheral nerve centers [Weis1995].
Figure 3.1-2: A red dangling string serving as Live Wire.
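How such an ambient display might be driven can be sketched as follows. This is a speculative illustration, not a description of Jeremijenko's actual installation: the packet-counting approach via psutil and the serial-connected motor controller with its one-byte protocol are assumptions made for the example.

```python
import time
import psutil                      # cross-platform network interface counters
import serial                      # pyserial; assumed link to a small motor controller

# Hypothetical serial connection to the motor that twitches the cord.
motor = serial.Serial("/dev/ttyUSB0", 9600)

def pulse_motor(strength: int) -> None:
    """Send one twitch command; the single-byte protocol is an assumption."""
    motor.write(bytes([min(strength, 255)]))

counters = psutil.net_io_counters()
previous = counters.packets_recv + counters.packets_sent

while True:
    time.sleep(0.1)                            # sample the counters ten times a second
    counters = psutil.net_io_counters()
    total = counters.packets_recv + counters.packets_sent
    packets = total - previous                 # packets seen since the last sample
    previous = total
    if packets:
        pulse_motor(packets)                   # busier network -> stronger twitch
```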
The Live Wire and the concept of calm technology inspired many projects in HCI research such as ambientROOM [Ishii1998] and pinWheels [Ishii2001], which additionally introduced the notion of Ambient Media as a general mechanism for physically displaying activities in cyberspace. Ambient Media will be discussed in more detail in chapter 3.4. Though calm technology mainly focuses on the output side of the computer, the concept of using real-world physical objects to display digital information had a profound impact on the common notion of human-computer interfaces.
3.2 Augmented Reality

3.2.1 Concept and Technology
Augmented Reality (AR) is broadly concerned with the integration of the physical world with computational media. It tries to augment the interaction of the user with his physical environment by adding computer-generated information to his sensory perceptions. The most common AR approach is the visual overlay of graphics and text on the user's view of his surroundings with a variety of devices such as HMDs (head-mounted displays), handheld displays, or projections. For instance, AR systems using an optical see-through HMD track the position and orientation of the user's head so that the overlaid material can be aligned with the user's view of the world as shown in Figure 3.2-1.
Figure 3.2-1: Conceptual diagram and photo of an optical see-through HMD.
Through this alignment process, called registration, graphics software can place, for example, a three-dimensional image of a lamp on top of a real desk and keep the virtual lamp fixed in that position as the user moves about the room (Figure 3.2-2). AR systems make use of some of the same hardware technologies used in virtual reality (VR), but there is an essential difference between AR and VR: whereas virtual reality tries to totally replace the real world, augmented reality aims to supplement it.
Figure 3.2-2: Examples of augmented views.
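Registration can be summarized with the standard camera-projection relation, a textbook formulation added here for clarity rather than taken from the thesis: a world point is transformed by the tracked head pose and projected through the display's intrinsic parameters, so the overlay stays aligned only as long as the pose estimate is accurate and timely.

```latex
% p_screen: 2D overlay position (homogeneous coordinates)
% K: intrinsic matrix of the virtual camera / display
% [R | t]: tracked head pose (rotation and translation)
% P_world: 3D point in the real environment
\begin{equation}
  \mathbf{p}_{\mathrm{screen}} \sim \mathbf{K}\,[\,\mathbf{R} \mid \mathbf{t}\,]\,\mathbf{P}_{\mathrm{world}},
  \qquad
  \mathbf{K} =
  \begin{pmatrix}
    f_x & 0   & c_x \\
    0   & f_y & c_y \\
    0   & 0   & 1
  \end{pmatrix}
\end{equation}
```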
Azuma et al. define an AR system to have the following properties [Azuma2001]:

• Combines real and virtual objects in a real environment.

• Runs interactively and in real time.

• Registers (aligns) real and virtual objects with each other.
Many classes of potential AR applications are explored in scientific research: medical visualization, maintenance and repair, annotation, robot path planning, entertainment, and military aircraft navigation and targeting [Azuma1997, Azuma2001]. AR is often restricted to expensive high-performance hardware, because it places very high demands on the accuracy, resolution, robustness, and speed of tracking technologies. Therefore, a large part of the research in AR concentrates on these topics rather than on issues regarding human-computer interaction.

3.2.2 The DigitalDesk
One notable pioneering system in AR interaction research is the DigitalDesk of Pierre Wellner [Well1993], which supports augmented interaction with physical paper documents on a physical desktop. A video projector is mounted above a desk, and video cameras combined with image-analysis techniques are applied for sensing the actions of the user. The system projects its output onto the desk and paper documents, responds to pen- and finger-based interactions, and is also able to read paper documents placed on the desktop. Using computer vision techniques, the system can even recognize command icons drawn on small pieces of paper.
Figure 3.2-3: Concept and photo of the DigitalDesk
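A core ingredient of such projector-camera desks is mapping points detected in the camera image (for example a fingertip) into projector coordinates, so that output lands exactly where the user points. The sketch below shows this calibration step with a planar homography; it is an illustration using OpenCV and example calibration values, not the DigitalDesk's own image-analysis code, which predates these libraries.

```python
import numpy as np
import cv2

# Four reference points: where projected calibration crosses appear in the
# camera image, and their known positions in projector coordinates.
camera_pts = np.array([[102, 87], [538, 95], [547, 402], [96, 389]], dtype=np.float32)
projector_pts = np.array([[0, 0], [1024, 0], [1024, 768], [0, 768]], dtype=np.float32)

# Estimate the homography that maps camera coordinates onto the projected desktop.
H, _ = cv2.findHomography(camera_pts, projector_pts)

def camera_to_projector(point):
    """Map one detected point (e.g. a fingertip) into projector space."""
    src = np.array([[point]], dtype=np.float32)      # shape (1, 1, 2) as OpenCV expects
    dst = cv2.perspectiveTransform(src, H)
    return dst[0, 0]

# Example: a fingertip found at camera pixel (320, 240).
x, y = camera_to_projector((320, 240))
print(f"draw feedback at projector pixel ({x:.0f}, {y:.0f})")
```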
The DigitalDesk forms a computer-augmented desktop which helps to overcome some of the limitations of physical paper. This is illustrated by several prototype applications: the augmented Calculator allows numbers to be entered by pointing directly at numbers printed on paper. PaperPaint is a mixed paper and electronic drawing program that allows users to copy and paste paper documents in the same way that we copy and paste electronic documents. Symbols drawn on paper can be selected with a stylus, and electronic copies of them can be pasted by projection in various sizes and positions.
DoubleDigitalDesk allows shared editing of documents between two remotely connected DigitalDesks. Each user can draw with a real pen on both paper and electronic documents, and the other user sees these marks appear in the corresponding places as shown in Figure 3.2-4.
Figure 3.2-4: The Calculator (top left), PaperPaint (right), and DoubleDigitalDesk (bottom left) applications.
The DigitalDesk is a good example of how well physical and electronic artifacts can be merged by taking advantage of the strengths of both media and using them in a combined physical/digital interface.

3.2.3 Tangible Interaction for Augmented Reality
Until recently, most of the work in AR was concerned with displaying digital information and registering it to the real world rather than examining how potential users would interact with such systems (with a few exceptions such as the DigitalDesk). Prototypes were based on GUI interaction techniques and desktop metaphors or adapted virtual reality interfaces using gesture recognition or tracking of pointers. Therefore AR user interfaces have generally been realized as graphical abstractions, virtually manipulated by gestures and hand-held GUI devices, and not as a direct physical manipulation of the user’s physical environment. Hence the real world in AR has mostly played the role of a separate input channel for the computer and the user, rather than an active part of the interface mediating between them. However, recent AR projects were able to successfully integrate the physical world with tangible interaction techniques: the AR-Toolkit by Kato et al. [Kato2000] attaches 3D objects to 2D fiducials (black squares with a graphical symbol inside) printed on flat physical cards. Users of the system wear light-weight see-through HMDs, and by means of pattern recognition and pose estimation the 3D objects are perfectly aligned with the cards. People can apply a variety of real-world interaction techniques to the physical cards
such as pushing, tilting, swatting, and other motions, enabling them to interact directly with the 3D content.
Figure 3.2-5: The AR-Toolkit.
In the Studierstube project (http://www.studierstube.org), the AR-Toolkit and the previously mentioned interaction techniques are used in conjunction with the Personal Interaction Panel (PIP) [Szal1997]. The current PIP prototype system is a two-handed interface consisting of a vision-tracked blank physical board held in the non-dominant hand of the user, upon which virtual controls or parts of the artificial world are drawn (Figure 3.2-6). The user sees the virtual controls by means of a see-through HMD and interacts with them using a stylus, which is tracked electromagnetically. This way, the PIP provides basic haptic feedback and enables forms of manipulation similar to a real handheld display.
Figure 3.2-6: Studierstube and the Personal Interaction Panel (PIP).
As low-level perceptual topics such as latency, tracking, and registration become extensively studied and technology advances, AR systems with sufficient I/O capabilities will become commonly available. This will lead to strong growth in research on high-level issues dealing with questions such as how to present information to the user and how to let him interact with it. This tendency will also raise deeper interest in combined physical/digital interfaces for AR [Azuma2001].
3.3 Graspable User Interfaces

3.3.1 Concept and Technology
In 1995 Fitzmaurice, Ishii, and Buxton introduced the concept of Graspable User Interfaces [Fitz1995]. They use small physical blocks called “Bricks” for the manipulation of synthetic objects in order to make them graspable (Figure 3.3-1). The user operates these artifacts on top of a large horizontal display surface called "ActiveDesk". Bricks are essentially new input devices that can be tightly coupled or attached to virtual objects for manipulation or for expressing action, such as setting parameters or initiating processes. In a prototype drawing application, handling two bricks in parallel allows the flexible creation and manipulation of B-spline curves and the stretching, rotating, and scaling of geometric entities (Figure 3.3-2). One key idea of graspable user interfaces is that the bricks offer a significantly richer vocabulary of expression than conventional input devices [Fitz1995]. Compared to most pointing devices (such as the mouse), which only provide one XY location, the bricks offer multiple XYZ locations and orientation information at the same time. A brick is realized as a small cube which is magnetically tracked by an Ascension Flock of Birds™ system (http://www.ascension.com) with six degrees of freedom (XYZ position, angles). However, this system is limited because it only provides two bricks for the user to manipulate, and the bricks themselves are tethered, hindering the user in some of his actions.
Figure 3.3-1: Concept sketches of bricks.
Figure 3.3-2: GraspDraw application.
3.3.2 Key Properties
Several important properties of interfaces using graspable objects for interaction are also discussed in [Fitz1995]:

• Space-multiplexed input vs. time-multiplexed input. The primary principle behind Graspable UIs is to use a space-multiplexed input design. Fitzmaurice et al. classify devices as being space-multiplexed or time-multiplexed. With space-multiplexed input, each function to be controlled has a dedicated transducer, each occupying its own space. For example, an automobile has a brake, clutch, throttle, steering wheel, and gear shift which are distinct, dedicated transducers controlling a single specific task. In contrast, time-multiplexed input uses one device to control different functions at different points in time. For instance, the mouse uses time-multiplexing as it controls functions as diverse as menu selection, navigation using the scroll widgets, pointing, and activating buttons (a minimal sketch contrasting the two designs follows this list).

• Concurrency. Since Graspable UIs use a space-multiplexed design, the development of interaction techniques involving multiple devices at the same time is possible. Using multiple manual devices concurrently suggests that the interactions will involve the use of two hands, which makes the application of a large spectrum of two-handed interactions possible (e.g., two-handed discrete actions; one hand discrete with the other hand continuous; two-handed continuous, etc.).

• Spatial awareness and reconfigurability. Since in Graspable UIs some interface elements take on a physical form, they should also be aware of their position/orientation. Devices which know their position on a given surface (e.g. bricks) can be queried by the application for this spatial information. Especially graphical tasks, which are inherently spatial by nature, can benefit from such spatially aware input devices. Also the ability to rapidly reconfigure and rearrange a set of devices in a workspace is important, because it allows users to customize their space to facilitate task workflows and rapid task switching.
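As a contrast to the time-multiplexed mouse described in the first point above, the following sketch models space-multiplexed input: every tracked brick is a dedicated transducer bound to exactly one function, and several bricks can be read and applied concurrently within one tracker frame. The brick identifiers, the tracker frame format, and the bound functions are hypothetical; this is not code from the Bricks system.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class BrickState:
    """Pose of one tracked brick on the desk surface."""
    x: float
    y: float
    angle: float   # orientation in degrees

# Space-multiplexed design: each physical brick is permanently bound to one function.
bindings: Dict[str, Callable[[BrickState], None]] = {
    "brick_rotate": lambda s: print(f"rotate selected curve to {s.angle:.1f} deg"),
    "brick_anchor": lambda s: print(f"anchor control point at ({s.x:.0f}, {s.y:.0f})"),
}

def update(tracker_frame: Dict[str, BrickState]) -> None:
    """Apply one tracker frame; all bricks act concurrently, no mode switching."""
    for brick_id, state in tracker_frame.items():
        handler = bindings.get(brick_id)
        if handler:
            handler(state)

# One frame from a (hypothetical) tracker reporting both bricks at once,
# e.g. a two-handed anchor-and-rotate gesture.
update({
    "brick_anchor": BrickState(x=120, y=80, angle=0.0),
    "brick_rotate": BrickState(x=260, y=95, angle=34.5),
})
```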
The concept of Graspable UIs is considered a groundbreaking step in HCI research, because it systematizes the properties of interfaces which integrate physical handles and presents a new design space and taxonomy, laying the foundation for further exploration and evolution of tangible user interfaces [Broll2000, Gor1998, Ishii1997, Pat2000, Rek2000, Str1999, Stre1998, Ull1997a, Ull2000, Und1999a].
3.4 Tangible Media

3.4.1 Tangible Bits
In 1995 Professor Hiroshi Ishii founded the Tangible Media Group (TMG) at the MIT Media Lab (Massachusetts Institute of Technology). The group focuses on the design of seamless interfaces between humans, digital information, and the physical environment. The TMG has its own vision of human-computer interaction, called “Tangible Bits”:
“People have developed sophisticated skills for sensing and manipulating our physical environments. However, most of these skills are not employed by a traditional GUI (Graphical User Interface).”
“Tangible Bits seeks to build upon these skills by giving physical form to digital information, seamlessly coupling the dual worlds of bits and atoms. The goal is to change the "painted bits" of GUIs to "tangible bits", taking advantage of the richness of multimodal human senses and skills developed through our lifetime of interaction with the physical world.” [Ishii1997]
In their vision paper [Ishii1997], Ishii and Ullmer state that we live between two realms: our physical environment and cyberspace. Since the interactions between people and cyberspace are largely confined to traditional GUI-based interfaces, they are separated from the ordinary physical environment we inhabit. Tangible Bits should bridge the gaps between both spaces by seamlessly coupling bits and atoms and bringing haptic interactions with physical objects back into HCI design.

3.4.2 Key Concepts
Tangible Bits attempts to overcome the divide between cyberspace and the physical environment by making digital information tangible. The Tangible Media Group has developed several key concepts for making bits physically accessible [Ishii1997]:

• Interactive Surfaces. Each surface within architectural space such as walls, desktops, ceilings, doors, and windows can be transformed into an active interface between the physical and virtual worlds.

• Coupling of Bits and Atoms. Everyday graspable objects such as cards, books, and models can be seamlessly coupled with the digital information that pertains to them.

• Ambient Media. Ambient media such as sound, light, airflow, and water movement can be used for background interfaces with cyberspace at the periphery of human perception.
3.4.3 Goals
The ultimate goal of tangible bits is to find ways to turn each state of physical matter – solid matter, liquids, and gases – within everyday architectural spaces into interfaces between people and digital information. “Painted bits” of GUIs should be changed into “tangible bits” by taking advantage of multiple senses and the multi-modality of human interactions with the real world. Quality and bandwidth of interaction between people and digital information should be improved by

• allowing users to grasp and manipulate foreground bits by coupling bits with physical objects, and

• enabling users to be aware of background bits at the periphery using ambient media in an augmented space.
The concept of foreground and background bits originates from a taxonomy by W. Buxton presented in [Bux1995], which is part of a human-centric model for characterizing technologies and interactions. Foreground activities such as telephony or typing into a computer reside in the fore of human consciousness; they are intentional. Background activities take place in the periphery of our senses. Being aware of someone in the next room who is typing is an example of a background activity. Another example is the cooling fan of our computer: as long as the fan works and produces its sizzling noise, we do not consciously notice it at all, it just resides in the periphery of our perception. But when the fan and its noise suddenly stop, our attention is immediately drawn to it, because something unusual has happened. HCI research and computer interfaces primarily focus on intentional, foreground activities such as pointing and typing, while neglecting the background. GUIs are also very limited in that their output is confined to a static small rectangular viewport, having little capacity to address the human periphery. But background awareness is very important for perception and action, because humans are able to monitor many sources of information (e.g. environmental noises, light, temperature) while being occupied with a foreground task. If anything unusual is noticed (e.g. the noise of a machine suddenly stopping), it immediately comes to the center of human attention, i.e. a transition of focus happens. The study of this transition of focus between background and foreground is a promising avenue for developing interfaces which better adapt to human needs, and is therefore a key challenge of Tangible Bits [Ishii1997].
Figure 3.4-1: Ambient Media distinguishes between foreground and background activities.
3.4.4 Tangible User Interfaces
As part of the Tangible Bits project, a new type of HCI is also presented in [Ishii1997]: the Tangible User Interface (TUI), which augments the real physical world by coupling digital information to everyday physical objects and environments. Figure 3.4-2 illustrates the transition of HCI from the GUI of desktop PCs to the Tangible User Interface, which aims to change the world itself into an interface.

Figure 3.4-2: From Graphical to Tangible User Interfaces.
Example 1: The Abacus
In [Ishii1997] the abacus is suggested as a prototypical example of a tangible user interface. The abacus differs from a common electronic calculator in that it makes no distinction between “input” and “output”. Instead, the beads, rods, and the frame of the abacus serve as manipulable physical representations of abstract numerical values and operations. At the same time, these artifacts also serve as physical controls for direct manipulation in order to operate on numbers. The abacus therefore does not only use physical objects as an interface, it also embodies the seamless integration of representation and control, a characteristic property of TUIs. In addition, the simple and transparent mechanical structure of the abacus (without any digital black boxes) provides rich physical affordances. Anyone can immediately understand what he can do with this artifact without reading a manual (being able to execute correct calculations with it is another question, of course).
Figure 3.4-3: Historical abacus compared to today’s calculator.
Hiroshi Ishii first encountered the abacus when he was two years old. He could enjoy the touch and feel of the digits physically represented as arrays of beads. This simple abacus was not merely a digital computational device: because of its physicality, it also became a musical instrument, an imaginary toy train, even a back-scratcher. As a child, he was captivated by the artifact's sound and its tactile properties. But this childhood abacus was also a medium of awareness. When his mother used the abacus for household accounting, he was aware of her activities by the sound of it, knowing that he could not ask her to play while the abacus made its music [Ishii1997]. Hence the abacus also served as an ambient display in Ishii’s peripheral background, informing him about the foreground activities of his mother.

3.4.4.1 Example 2: Illuminating Light
Illuminating Light [Und1999b] by the Tangible Media Group is an example of a TUI for optical design and layout. In this system, users directly arrange and manipulate physical models representing lasers, mirrors, lenses, and other optical components on an augmented tabletop. The computationally mediated optical simulation of the system provides these models with the same meaning and function as their real-world counterparts, so that placing the laser model on the augmented table results in a graphical beam of light, apparently emanating from the laser’s front aperture. This beam remains registered with the laser as it is rotated and moved in the workspace. Position and orientation of all objects are recognized by the system, and the behavior of the laser light is projected onto the table in the same physical space as the optical components. Users are thus able to make full use of their hands and bodies in affecting the simulation, as well as use their spatial and kinesthetic senses in understanding the arrangements.
Figure 3.4-4: Illuminating Light, a TUI for holography simulation.
Illuminating Light also supports multi-user interaction: multiple users can simultaneously grab and manipulate the optical components in order to cooperatively create and explore simulated holography layouts. It is therefore a good example of how TUIs facilitate communication and collocated cooperative work between multiple users by distributing physical objects in a physical environment, supporting memory, learning, and interpersonal communication.
3.4.4.2 Differences between GUIs and TUIs
So far, Tangible User Interfaces (TUIs) have been defined as user interfaces which rely on physical representations as physical interfaces to digital information. However, this definition has to be narrowed, because GUI devices such as keyboard, mouse, and monitor are also physical objects, even though they are not TUI devices. In order to obtain a clear distinction, TUIs have to be directly compared with GUIs. As already discussed in this chapter, GUIs are based on graphical representations of virtual user interface elements: graphical widgetry such as windows, icons, and menus pervades modern computer interfaces. Physical input peripherals such as the mouse and the keyboard are used to indirectly access and manipulate these virtual interface elements, but it is fundamental to GUIs that graphical widgetry exists independently from the physical world. Hence GUIs represent information almost entirely in transient visual form [Ull2000]. In contrast, tangible user interfaces are based on tangible entities which physically embody the digital information and interfaces they represent. For instance, the models in Illuminating Light physically embody both the digital content and the means for manipulating its digital associations, thus seamlessly coupling virtuality and physicality by embodying both at once. The physical forms of the models representing specific optical tools, as well as their position and orientation on the table, play central roles in representing and controlling the state of the user interface. This seamless integration of representation and control also differs completely from the traditional GUI model in HCI.
GUIs make a fundamental distinction between output devices such as monitors displaying the digital representations, and input devices such as keyboard and mouse, which only have the role of physical controls [Ull2000]. TUIs try to eliminate this distinction by bridging the gap between representation and control, just as our first example, the abacus, does.
3.4.4.3 Interaction Model
Traditional computer interfaces frame human-computer interaction in terms of “input” and “output”. Computer output is delivered in the form of digital representations (e.g. on-screen graphics and text), while input is obtained from control peripherals such as the keyboard and mouse. The relationship between these components can be visualized with the “Model-View-Controller” or “MVC” pattern, an interaction model for GUIs developed in conjunction with the Smalltalk-80 programming language, shown in Figure 3.4-5: the “model” element represents the internal, digital computational structures such as code and data for the system’s core functionality. The user can only access the “model” indirectly, by means of input (“control”) and output (“view”) elements which are located at the boundary between the physical and the digital realm and mediate between them. MVC also emphasizes the GUI’s strong separation between the digital representations (= “view”) provided by the graphical display and the control mediated by mouse and keyboard.
Figure 3.4-5: GUI: MVC model.
Figure 3.4-6: TUI: MCRpd model.
In [Ull2000], Ullmer and Ishii extend the MVC approach in order to develop an interaction model for TUIs, which they call “MCRpd” for “Model-Control-Representation (physical and digital)”. They keep the “model” and “control” elements from the MVC model, but divide the “view” into two subcomponents, as shown in Figure 3.4-6: physical representations (“rep-p”) are the artifacts constituting the graspable, physically embodied elements of tangible interfaces (such as the beads of the abacus), while the non-graspable digital representations (“rep-d”) are the computationally mediated components of tangible interfaces without embodied physical form (e.g. video projection, audio). Whereas the MVC model of Figure 3.4-5 illustrates the strong separation between graphical representation and control in GUIs, the MCRpd model highlights the TUI’s strong integration of physical representation and control. TUI artifacts therefore physically embody both the control pathway and a representational part of the interface.
The MCRpd model can be better understood by applying it to a concrete TUI instance such as the Illuminating Light system: the underlying “model” element designates the digital code/data part of the system, which simulates optical phenomena and holography. This “model” could also be interfaced with a mouse, keyboard, and monitor in the tradition of a GUI, according to the MVC model. Instead, the TUI of Illuminating Light uses the presence, position, and orientation of physical models for input (“control”). These models also physically embody different tools and objects such as lasers, mirrors, and beamsplitters, and therefore carry information about their role and function (“rep-p”). The output of the simulation is projected (“rep-d”) onto the working surface along with additional information in the proximity of the objects, thus digitally augmenting them. As the example shows, the MCRpd interaction model provides a good basis for discussing concrete instances of TUIs. Furthermore, it is also useful for examining key conceptual characteristics of TUIs.
3.4.4.4 Key Characteristics
TUIs have several important properties or key characteristics, which are discussed in [Ull2000]. Three of them are basic relationships shared by the physical representations (“rep-p”) of TUIs, as shown in Figure 3.4-7.
Figure 3.4-7: Three key characteristics of TUIs visualized in the MCRpd.
These key characteristics of TUIs are:
• Physical representations (“rep-p”) are computationally coupled to underlying digital information (“model”). This coupling of “rep-p” and “model” is the central key characteristic of TUIs. The Illuminating Light example illustrates some such couplings, including the binding of digital models (code and data instances for laser sources, beam-reflectors, beam-splitters) to the physical models, and their influence on the behaviour of the simulated laser beam.
• Physical representations embody mechanisms for interactive control (“control”). The physical representations of TUIs also serve as interactive physical controls. The physical movement and rotation of these artifacts, their insertion into or attachment to each other, and other manipulations of these physical representations serve as tangible interfaces’ primary means of control. An example mechanism is the behaviour of the laser beam in Illuminating Light, which can be interactively controlled by positioning and rotating the physical models on the table.
• Physical representations are perceptually coupled to actively mediated digital representations (“rep-d”). TUIs rely upon a balance between physical and digital representations. While embodied physical elements play a central, defining role in the representation and control of TUIs, digital representations (especially graphics and audio) often present much of the dynamic information processed by the underlying computational system. In the Illuminating Light example, the physical models are therefore digitally augmented by means of additional projected information located in their proximity.
According to [Ull2000], a fourth TUI key characteristic is significant, which is not directly visible in the MCRpd interaction model:
• The physical state of the interface objects partially embodies the digital state of the system. Tangible interfaces are generally built from systems of physical objects. Considered as a whole, these collections of objects have several important properties: as physical elements bound to the laws of matter, TUI artifacts are persistent; they cannot spontaneously appear and disappear, as opposed to GUI elements such as windows and buttons, which can be arbitrarily created or destroyed on screen at the touch of a button. In addition, the physical configuration of the TUI artifacts is tightly coupled to the digital state of the systems they represent. The Illuminating Light system provides a good example of this key characteristic, because the actual spatial configuration of the physical models (lasers, mirrors, etc.) defines a specific set of parameters setting the state of the digital optical simulation.
Figure 3.4-8: The configuration of the tangible artifacts sets and displays the state of the system.
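To make these roles more concrete in software terms, the following minimal sketch in Director Lingo (the scripting language used for the senseDesk software later in this thesis) shows how a single tangible artifact could couple a tag identity (“rep-p”), bound digital data (“model”), and a projected sprite (“rep-d”), with position updates from tracking hardware acting as “control”. It is purely illustrative; all names are invented and it is not taken from [Ull2000] or from any of the systems described above.

-- Illustrative parent script "TangibleArtifact" (invented, not actual TUI code)
property pTagID      -- identity of the physical token ("rep-p")
property pModelData  -- digital information bound to the token ("model")
property pSpriteNum  -- number of the projected sprite attached to it ("rep-d")

on new me, tagID, modelData, spriteNum
  pTagID = tagID
  pModelData = modelData
  pSpriteNum = spriteNum
  return me
end

-- called whenever the tracking hardware reports a new position ("control")
on moveTo me, x, y
  sprite(pSpriteNum).loc = point(x, y)  -- the digital representation follows the physical one
end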
3.4.5 Additional Examples of Tangible User Interfaces
3.4.5.1 metaDESK
The metaDESK platform [Ull1997b] was the Tangible Media Group’s first fully operational implementation of a desktop-style TUI. The metaDESK is a nearly-horizontal, back-projected graphical workbench with an interactive surface, which senses and responds to physical-world stimuli.
The metaDESK consists of several components, as shown in the overview in Figure 3.4-9:
• the desk, the horizontal surface which is the center of interaction around which all other devices are arranged. Inside, it uses back-illuminated computer vision to track passive objects on its surface. It also hosts a Flock of Birds™ transmitter from Ascension for tracking two 6-DOF (degrees of freedom) position trackers above the desk’s surface.
• the active lens, an arm-mounted, moveable flat-panel TFT display located above the desk, which is used to display a 3D-view of the content. It is tracked with a 6-DOF Flock of Birds™ receiver, whose position and orientation define the virtual viewpoint of the 3D-view.
• the passive lens, an optically transparent surface used for viewing graphical overlays. It is a passive fiber-cluster disc through which the desk projects its graphical content, and it is tracked with a Flock of Birds receiver.
• a collection of various physical objects and instruments which can be used for tangible interaction. These objects are physical icons, a tray, and other instruments.
Figure 3.4-9: The metaDESK system.
One prototype application for the metaDESK is Tangible Geospace, an interface supporting the manipulation of graphical views of the MIT campus, driven by the TUI of the metaDESK. Several physical objects and instruments for interaction with the geographical space are hosted in a translucent tray mounted on the desk’s surface. The desk itself displays a projected 2D-map of the MIT campus which can be interfaced with various objects and instruments. When the user picks the small physical model of MIT’s Great Dome from the tray and places it onto the desk, the 2D-map is aligned with the Dome object according to its position and orientation, tracked by computer vision (Figure 3.4-10). In this way, the digital 2D-map (“rep-d”) is bound to the Great Dome (“rep-p”), which serves as a physical handle (= “phandle”) for manipulating the map. The physical dome model also acts as a container for the digital information about MIT (“model”) as well as referring to the real building, therefore serving as a physical icon [Ull1997b]. When a tangible artifact is used in this way, it is also called a “phicon” (= PHysical ICON), the TUI complement of the GUI icon.
The arm-mounted active lens displays a 3D-view of the MIT campus with its buildings in perspective, as shown in Figure 3.4-11. The active lens is coupled to the Dome-based model of the MIT campus and tracked with 6 DOF, so that moving the lens results in navigation in 3D-space consistent with the behaviour of a digital camera with an LCD backface, designed to decouple eye-position from camera-viewpoint. This approach also allows multiple users to share the lens view.
Figure 3.4-10: The Great Dome phicon.
Figure 3.4-11: The active lens in Tangible Geospace.
The phicon and the active lens are complemented with additional tangible instruments for further manipulation, such as the passive lens. The passive lens is a wood-framed surface that serves as an independent display when augmented by the back-projected desk (Figure 3.4-12). When the user puts the lens onto the desk, it displays a secondary, overlaid view of the MIT campus, such as an aerial orthographic photo. Since the content of the lens is displayed by means of the desk’s back projection, it requires no additional display resources (i.e. screens) and is also light-weight and cheap. The passive lens showing an alternate view of the 2D map is also conceptually supported by the “Magic Lens” metaphor introduced by Bier et al. [Bier1993].
Figure 3.4-12: The passive lens.
The underlying tangible interface design approach of the metaDESK is based on elements which are part of the GUI desktop metaphor. GUI widgets such as windows, icons, menus, and handles are physically instantiated and mapped back into the real world [Ull1997b], as shown in Figure 3.4-13: the GUI “window” is instantiated as a physical lens (active/passive lens), which can be grasped and moved above the desk surface. Small-scale models such as the Great Dome model are physical icons (= “phicons”) on the one hand, and physical handles (= “phandles”) for translating, rotating, and scaling the map on the other. The phandles are also an equivalent to the bricks in Graspable UIs, previously described in this chapter [Fitz1995]. The “tray” from which the models are picked up is a mechanism for choice, which maps to the functionality of a GUI menu. This comparison of concepts shows that the metaDESK is a good example of how a TUI can complement GUIs by embracing the richness of the physical environment and providing new ways of human-computer interaction.
Figure 3.4-13: TUI instantiations of GUI elements.
3.4.5.2 mediaBlocks
MediaBlocks is a tangible user interface based on small, electronically tagged wooden blocks that serve as physical icons (= phicons) which are dynamically bound to lists of online media elements [Ull1998]. It is also a system for handling and manipulating collections of these blocks, which physically embody videos, images, sound, text, and other media elements. MediaBlocks do not actually store data internally in the way common floppy discs do. Instead, they are embedded with tags having a certain identity (ID), allowing them to function as physically embodied URLs. Though the blocks are only a reference to online content, in the eyes of the user they behave like “containers” for media data. MediaBlocks interface with media input and output devices such as video cameras and projectors, and allow digital media to be rapidly “copied” from a media source and “pasted” into a media output display. As shown in Figure 3.4-14, they serve as a medium of interchange between media sources, display devices, manipulators, graphical interfaces, and tangible interfaces. In addition, they should fill the user interface gap between physical devices, digital media, and online content [Ull1998].
Figure 3.4-14: MediaBlocks as seamless gateway between tangible and graphical interfaces.
MediaBlocks have two major kinds of usage: first, they function as capture, transport, and playback mechanisms, supporting the movement of online media between different media devices. This is realized with a physical analog of the GUI core concept of “copy and paste”, by combining mediaBlocks with physical slots mounted upon associated media devices. Inserting a block into the slot of a media source such as a whiteboard starts the process of recording to an online server. When the block is removed, recording stops. This procedure can be understood as “copying” from the media source into the block. In the same way, the recorded content is “pasted” by inserting the block into the associated slot of a media output device, e.g. a projector or printer, as shown in Figure 3.4-15. Alternatively, inserting the block into a slot mounted on the face of a computer monitor allows the contents of a mediaBlock to be exchanged with traditional computer applications using GUI drag-and-drop.
Figure 3.4-15: Media slots mounted on a printer (left) and a monitor (right).
The system’s second major function allows mediaBlocks to serve as media containers for playback as well as media controls for modification with a dedicated device, the physical media sequencer. This sequencer provides “racks” which serve as a physical constraint used to digitally index and sequence mediaBlock contents as a function of the blocks’ physical configuration on the rack (Figure 3.4-16). The “sequence rack” on top resembles a tile rack of the Scrabble™ game and allows the contents of multiple adjacent mediaBlocks to be assembled into a single ordered media list and to be dynamically bound to a new mediaBlock container. In the same way, the “position rack” at the bottom maps the physical position of a mediaBlock into an indexing operation upon the contents of the block: when a mediaBlock is positioned at the left edge of the rack, the block’s first media element is selected. Moving the block to the right edge of the rack selects the block’s last element, while intermediate positions provide access to the elements in between.
Figure 3.4-16: The physical media sequencer.
The blocks themselves are symbolic “tokens” or “containers” representing a simple data structure, a list of media elements, which in terms of the MCRpd model is a strong coupling between “rep-p” and “model”. When a block is put into one of the racks of the sequencer, its contents fold out from the side of the screen at the corresponding position, perceptually coupling the digital representation (“rep-d”) and the block (“rep-p”). Because of the way the tangible objects are read by human and computer and constitute the state of the system, interfaces such as Illuminating Light and metaDESK represent a spatial approach: the position and orientation of the tangible artifacts within a frame of reference (e.g. a table surface) represent and control the state of the system. In contrast, mediaBlocks represents a relational approach: it maps sequences, adjacencies, or other logical relationships between systems of physical objects to computational representations [Ull1998].
3.4.5.3 ambientROOM
Compared to the previously described examples, the ambientROOM [Ishii1998] follows a different concept: TUIs such as Illuminating Light, metaDESK, and mediaBlocks support activities such as manipulating physical objects in order to control simulations or graphical representations, exchanging media, and watching the outcomes. These interfaces and the corresponding input/output activities occupy the foreground of the user’s activities, requiring his full, conscious attention. As already mentioned in this chapter, humans are also capable of monitoring many sources of information which reside in the background or periphery of perception. For example, we may have an idea of the weather outside from ambient cues such as light, temperature, sound, and air flow from nearby windows [Dahl1998, Ishii1998]. Following this principle, the ambientROOM shown in Figure 3.4-17 is not a single interface; it is an interface environment designed to provide information for background processing. It displays information through subtle cues of sound, light, or motion which do not attract the user’s attention and reside at the periphery of awareness. These ambient media displays broaden the concept of “display” in order to use the entire physical environment as an interface. They draw inspiration from natural phenomena such as wind, sunlight, or the sounds of the rainforest.
Figure 3.4-17: Concept of ambient media and diagram of the ambient room.
The room itself is constructed from a Personal Harbor™ from Steelcase Inc. (http://www.steelcase.com), a freestanding office room six feet wide and eight feet long. Parts of this room are augmented with the following ambient displays:
The Water Lamp [Dahl1998] is based on the visual effect of water ripples created by raindrops on the surface of still water. Instead of physical raindrops, “bits” falling from cyberspace create the ripples. Incoming events such as network traffic or webpage hits trigger three computer-controlled solenoids, which in turn tap the water in a pan. A light shines upward through the pan and produces changing patterns of light and shadow projected onto the ceiling.
The Light Patches provide awareness of the physical presence of other people. Electric field sensors measure the amount of human movement in a work area adjacent to the ambientROOM. This activity is represented by a pattern of illuminated patches projected onto an inner wall of the ambientROOM. This “active wallpaper” is rarely noticeable unless a sudden change occurs in the area’s activity or the number of people present.
The Natural Soundscapes communicate information on the audio channel. A subtle but still audible soundtrack of birds and rainfall is present in the ambientROOM. The volume and density of the soundtrack are modulated in order to display approximate quantities such as the number of unread e-mails or the value of a stock portfolio.
Figure 3.4-18: Photo of the ambient room.
In order to manage the previously mentioned information streams, the user can manipulate a number of ambient controls:
The Bottles [Ishii1999] serve as graspable containers for digital information. Custom-designed electromagnetic tags embedded in the bottles enable each one to be wirelessly identified. Uncorking a bottle activates its tag and “releases” information into the room. As an example, when the bottle is uncorked, the sound of vehicular traffic becomes audible, corresponding to the amount of traffic in the local network.
The large wall-mounted Clock is not only able to display time, it also serves as a control, because its exposed hands allow navigation through temporal events. Manual rotation of the clock’s hands prompts the displays of the ambientROOM to shift to their former or future states. Meanwhile, the actual time is projected onto the clock’s face. This way, a user who returns from an absence could review the activity of the displays over the past hours, or could read the displays of future events.
The ambientROOM project shows that ambient displays are capable of communicating a) people’s presence/state; b) atmospheric/climatic phenomena; and c) general states of large and complex systems (e.g. network traffic, activity in a building, webpage hits). Although ambient media is processed continually in the background, it can quickly move into the center of attention when a deviation from the normal pattern occurs or the foreground activity is interrupted. By using ambient media as an additional way to convey information, advantage is taken of the natural abilities of the human brain as a parallel processor and as an attention manager [Ishii1997].
4 senseDesk – System Design and Implementation
This chapter discusses the design and implementation of the senseDesk, a system of hard- and software which allows exploration and demonstration of the application domains of tangible user interfaces (TUIs). After a description of project goals, related work, and requirements, we will take a closer look at the research and development of the underlying hard- and software, along with issues concerning practical implementation.
4.1 Motivation
As already discussed in Chapter 3, Tangible User Interfaces have several advantages over traditional GUIs because they
• allow true direct manipulation by putting the task objects directly into the user‘s hand and making the computer “invisible”, so that far fewer traditional computer skills are required for handling the interface.
• increase communication bandwidth with the computer by allowing direct bi-manual interaction, making use of physical affordances and giving enhanced tactile and kinesthetic feedback.
• make use of already existing and highly developed real-world skills and work practices, therefore combining manipulational and representational powers from the physical and the digital world.
• facilitate communication and collocated cooperative work between multiple users by distributing physical objects in a physical environment, supporting memory, learning, and interpersonal communication.
While these advantages and benefits sound promising in theory, they also have to be verified in practice. Therefore, I have designed a custom system for further experimentation and evaluation of tangible interaction.
4.2 System Definition
As a first step, I had to find a concept for a feasible system which would be able to serve as a flexible environment for demonstrating and applying TUIs. I defined three possible contexts of usage in which the system should operate. I then looked at various already implemented tangible user interface systems and analysed them with reference to the prospective usage scenarios. This procedure enabled me to find a preliminary concept for my own TUI system.
4.2.1 Usage Scenarios
Inspired by various existing TUI installations (see also Chapter 3), I started with the conception of my own TUI system, which should work in three different contexts of usage:
a) Lab research, as a flexible and easily extendible hardware/software toolkit. Developers should be able to integrate and try out new concepts and components with minimal effort.
b) Demonstration, as an impressive and convincing prototype, communicating the high innovation potential of TUIs. Visitors should quickly grasp the concepts of TUIs and rapidly understand their benefits, possibilities, and usefulness for future projects.
c) As a robust and easy-to-use multimedia installation in settings like public spaces and fairs, where people are first-time users with no time for a learning curve. Nonetheless, they should enjoy playful interaction with digital spaces by applying well-known everyday interaction principles and habits, without “first having to learn a computer interface”.
4.2.2 Related Work
Keeping these three usage scenarios (research, demonstration, exhibitions) in mind, I compared different concepts of exemplary, successful TUI systems in order to find the most promising direction for my own system development:
• Rasa, developed by McGee and Cohen [Gee2001], is an augmented whiteboard system for the support of strategic planning activities in military command posts. It is based on multi-modal I/O supported by speech recognition for verbal input and computer vision for tracking and reading POST-IT Notes. These POST-IT Notes act as (spatially interpreted) physical tokens representing military units. The interaction space is a map attached to a wall, in front of which participants communicate with each other and additionally interact with the system physically and verbally. A video beamer projects graphical system output onto map and tokens, strongly connecting the physical representations with overlaid digital information.
Figure 4.2-1: Rasa, an interactive wall for computationally augmented strategic planning.
While interaction with a wall is a promising concept for gestural input (see also HoloWall [Mat1997], Laser Range Finder [Stri1998]) and collaboration (see also BrightBoard [Staf1996] and Interactive Mural [Guim2001]), the physical objects have to be attached to the wall to prevent them from falling down. While this is not a problem with sticky POST-IT Notes, heavier objects require a system using magnetic tags, thus raising costs and diminishing flexibility in producing the tangibles. As another side-effect, this approach would also limit the choice of position-sensing technology (e.g. tracking electromagnetic tags becomes impossible because of magnetic interference).
• Triangles, developed at MIT by Gorbet, Orth, and Ishii [Gor1998], is a physical/digital construction kit which allows people to manipulate digital information by connecting identical flat, plastic triangles. By combining the tangible building blocks into spatial relationships, users can tell non-linear stories, control multimedia presentations, or explore scheduling, group dynamics, and workflow systems. The triangles act as physical embodiments of the resulting digital information topography. Connection to the digital world is made via a “mother” triangle which is serially connected to a PC. Graphical output is displayed on monitors or similar devices, so the Triangles system strongly separates input controls from output representations. It is therefore not an ideal basis for demonstrating the TUI concept.
Figure 4.2-2: Triangles as physical embodiments of digital information topography.
• In the Luminous Room project at MIT [Und1999b], mentioned in the previous chapter, one or more users freely interact by moving real objects within a digital space which is projected onto a table-top surface. The physical objects (which can be made of almost any shape and material) are marked with patterns of coloured dots and tracked with computer vision software. The projected graphics immediately react to position and state of the objects, providing a strong coupling between real and virtual representations.
Figure 4.2-3: Luminous Room used as optical workbench, called “Illuminating Light”.
The realised system also looked robust and suited to my prospective usage scenarios, while being applicable to many different domains: the demo applications for Luminous Room range from the simulation of optics (Illuminating Light) and the exploration of fluid mechanics around free-form objects to simulations of shadow-casting and wind behaviour for assisting urban planning activities.
4.2.3 System Concept
To further define the project vision, I had to consider two fundamental issues:
1. Issue: Which tangible interactions should be demonstrated?
ο Which physical artifacts (tangibles) should be used?
Answer: people should be able to interact with the system by handling graspable and freely movable objects, because
ο this form of manipulation is very natural for humans and knowledge can be drawn from many everyday tasks and habits.
ο the objects can be used in many different ways: as tools (manipulation), tokens (references), or containers for information and media.
ο this kind of interaction is a very good demonstration of the strengths and possibilities of TUIs.
2. Issue: Which spatial setting should be used?
ο Which interaction space should be defined by the system? A device, a table, a wall?
ο Does the setting stimulate interaction among multiple users?
Answer: the interaction space of the system should be a reactive table-top environment in the fashion of a round table or a desk. The table setting appears to be most suited for the prospective usage contexts (research, demonstration, public installation) for several reasons:
ο Similarity to many real-world interaction situations: When creating or managing physical content, we mostly use a table as a working surface. Moreover, when working with multiple graspable objects, we often use a table for moving, organizing, and depositing them. Think of activities like cooking, drawing, assembling a tinkertoy, or playing with a jigsaw puzzle.
ο Stimulation of collaboration: In face-to-face meetings people usually congregate around a table for social interaction. They like to handle artifacts on the table related to the meeting, supporting their arguments and sharing information. Therefore a table setting not only stimulates social engagement, it also has the potential of transforming a single-user experience into a shared, collaborative multi-user activity.
ο Integration of multimedia: A video beamer mounted above can project graphics and animations directly onto the table, turning its surface into a large graphic display. This setting enables the user to directly manipulate digital content with his hands and enables the computer to directly attach digital output to real objects. Projecting on top of the objects also offers interesting possibilities like colourizing and animating them, effects which transform them into reactive entities by digital augmentation.
For these reasons, the concept of an interactive table serving as input/output-surface mediating between real and virtual representations seemed to be the most promising direction in order to demonstrate and apply TUIs. Figure 4.2-4 shows a first conceptual sketch of the system, as it has been defined so far: the core-component of the system is a tabular surface with the tangible interface objects on top. The states and positions of the objects are sensed by a tracking unit which transmits its results to the application unit. The application unit processes all input data and runs the application software. The graphical output is projected back onto the table surface and the objects, closing the loop of input/output-information processing. Since the very heart of the system is the tabular surface, which is sensitive to the real world, the name of the project became “senseDesk”.
Figure 4.2-4: Conceptual sketch of the system.
4.3 System Requirements
Since the main goal of the senseDesk project is the implementation of a real, working system, it was evident that the hard- and software being used would be subject to commercial and technical restrictions. In order to judge possible solutions for implementing components, a clear definition of minimum requirements was necessary.
4.3.1 Interaction Space
In order to allow natural, spatial manipulation of the tangible objects, the physical space where the interaction takes place must encompass a certain area. The size of this area should be at least 40x30 cm to support bi-manual activities. For multi-user interaction, 70x50 cm is the minimum.
4.3.2 Complexity, Costs and Feasibility
The system hardware should use no more than 2 PCs and consist of standard components only, except for the object-tracking hardware, which can be custom-made within a development time of max. 6 man-months. The tracking interface should cost max. 800 €, with an additional max. of 30 € per interface object.
4.3.3 Integration of Multimedia
Objects and table should serve as a canvas for rich graphical output, enabling the creation of impressive multimedia worlds extended with tangible interaction. High-quality stereo sound should be provided for enhancing the experience via the auditory channel.
4.3.4 Elegance and Aesthetics
The technical inner workings of the system should not prevent an aesthetically pleasant design of the physical interface. The whole appearance of the system should convey the illusion of an absence of technology and computers.
4.3.5 Number of Objects, Degrees of Freedom
Interaction should be possible with at least 4 objects which can be used simultaneously. Continuous tracking of the XY-position (= 2 DOF) is the minimum requirement, with solutions providing azimuthal rotation information (= 3 DOF) preferred.
Figure 4.3-1: 2-DOF and 3-DOF
4.3.6 Flexibility and Extendibility
Hardware: Arbitrary artifacts should be easily convertible to active interface objects. Technical limitations regarding form, size, and material should be minimal. The implications for the design of the interactive table and the space around it should also be minimal. Extensions with additional interfaces (like a trackball, buttons, etc.) should be effortless.
Software: The programming environment should allow quick and easy integration of new concepts and changes for rapid prototyping. Therefore it should provide sufficient abstraction from hardware and system issues.
4.3.7 Robustness
The installation of the system should be possible with minimum effort and technical knowledge. Calibration procedures at setup time should be minimal, and once the system is running and being used, no recalibrations should be required. During interaction, users should not have to handle table and tangibles with special care due to a possibly fragile technical nature. The software environment should enable developers to evolve a demo application from prototype level to a mature piece of software that excels in speed and stability.
4.3.8 Low Latency, High Accuracy
The illusion of seamless visual coupling between digital output and physical objects is important for TUIs, because the synthesized visual, auditory, and real haptic stimuli have to be perceived as simultaneous atomic events: imagine a tangible object with a digital picture “tied” to it via projection from above. When the object is moved in space and the projected image lags behind because of slow software or hardware, the illusion of the unity of bits and atoms is immediately destroyed. Therefore the maximum system latency should not exceed 100 ms: when an object is moved at 10 cm/s in space, the connected image
should stay 1 cm behind in the worst case. When an object is stopped, its position should be sensed accurately enough to almost perfectly align the digital information with it. Since the size of the active sensing area should be as big as possible and was more important for me than accuracy, I set the limit for the maximum static position error for each axis to +/- 5 mm.
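As a quick worked check of the latency bound above (using only the figures already stated), the worst-case offset of the projected image behind a moving object is
\[ \Delta x_{\max} = v \cdot t_{\mathrm{latency}} = 10\,\mathrm{cm/s} \times 0.1\,\mathrm{s} = 1\,\mathrm{cm}, \]
which corresponds to the 1 cm worst-case lag accepted above.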
4.4 Hardware Architecture
In order to specify the hardware architecture of a system which should meet all previously defined requirements, I had to identify the critical hardware components and then find a feasible way to implement them. The most critical component of the TUI system is the mechanism for identifying and tracking the interface objects, because its capabilities and constraints (like speed, accuracy, number of objects, degrees of freedom, etc.) define the quality of interaction and the persuasiveness of the “tangible bits” illusion. As the next step, I focused my research on methods for tracking physical objects, which can be divided into sensing innate physical properties and sensing special tags attached to the objects.
4.4.1 Sensing of Physical Properties
In theory, identifying and tracking the interface objects‘ innate physical properties (weight, form, colour) is the optimal solution: no electronics or other special devices have to be built into the tangibles, thus keeping cost and complexity low. Weight could be measured by a pressure-sensing array of piezo-electric wires, similar to the “Magic Carpet” system developed at MIT [Par1997]. Tracking form and colour could be realised with a computer vision system which uses a camera mounted above the table. After a number of investigations and experiments, I considered these methods not useful for my purposes, because
1. the sensed weight of the objects constantly changes during interaction, which makes it hard to fulfill requirements 4.3.7 (Robustness) and 4.3.8 (Low latency, high accuracy and precision).
2. recognition by form violates requirement 4.3.6 (Flexibility and Extendibility), because each object‘s form would have to be significantly different from the other objects‘ forms. During interaction, hands and body parts of the user would get in the line of sight between objects and camera, making continuous and reliable recognition difficult.
3. recognition by colour also violates requirement 4.3.6 and gives the interface objects a strange visual appearance, violating 4.3.4 (Aesthetics). Projection of readable text and aesthetic graphics onto the already coloured objects becomes difficult, too.
4.4.2 Sensing of Tags
After my investigation into sensing physical properties, I decided that I had to attach certain “tags” to the tangibles in order to turn them into trackable entities. By integrating such a tag into each object that is used for interaction, one attains the potential of identifying and tracking them. Because cost constraints limit the tag price (requirement 4.3.2), each tag should be passive (i.e. no battery required) and can only have minimal built-in computing circuitry.
4.4.2.1 Magnetic Tracking
Magnetic tracking of a graspable object is done with an attached small cube which acts as the receiver of a magnetic signal transmitted by a fixed source. Systems like NestOfBirds from Ascension (http://www.ascension.com) or Fastrak from Polhemus (http://www.polhemus.com) not only provide absolute position information along three axes in space, they also report orientation about the three axes, noted as yaw, pitch, and roll. Though magnetic tracking fulfills the requirements regarding range (10 feet), latency (4 ms), precision, stability, and number of available samples per second (120 Hz), the drawbacks prevail: the cubes are connected to the hardware with a 0.5-diameter cable. This results in permanently tethered objects, making free interaction difficult, especially when the number of objects exceeds two. Also, the price per system is very high, starting at 10,000 € for a NestOfBirds with 4 sensors, so I had to dismiss the idea of using magnetic tracking.
Figure 4.4-1: Tracking system by Ascension, consisting of two receivers, one transmitter, and electronics.
4.4.2.2 Optical Tracking
Optical tracking is a promising approach, because the tags/objects do not need to be tethered with cables. Successful systems such as Illuminating Light [Und1999b] and Shared Space [Kato2000], which were discussed in the previous chapter, prove the applicability of computer vision techniques to tangible and augmented-reality interfaces. In Illuminating Light, each object is marked with a unique pattern of coloured dots. The coloured dots are created with 3M reflective tape overlaid with coloured cellophane. A light source next to the camera causes those spots to appear much lighter than everything else in the scene. A standard CCD camera captures the objects from above and feeds a computer vision system that tracks and identifies the objects on the basis of their unique dot patterns. This allows robust, wireless sensing of multiple objects. Unfortunately, the vision-based approach also has strong negative implications: when an object becomes partially occluded (e.g. by the user’s hand), tracking and identification fail due to line-of-sight problems. To fulfill requirement 4.3.8 (low latency, high accuracy), a faster and more sophisticated vision system than the one in Illuminating Light is needed, requiring a separate vision workstation or costly DSP hardware. The use of coloured dots is the biggest problem: projecting colourful graphics onto the table surface confuses the vision system, the objects must have a minimum size (= the size of the pattern, at least the area of three 2x2 cm dots), and the coloured dots negatively influence the visual appearance of the objects, heavily violating requirement 4.3.4 (aesthetics).
Figure 4.4-2: In Luminous Room, tangible objects have to be marked with coloured dots.
Other systems using different types of visual markers such as symbols or 2D-matrix codes [Kato2000, Rek1999] suffer from the same problems. Higher-level vision systems could track the objects without the need for marking dots, by just comparing their visual 2D-appearance against internally stored 3D-models (model-based vision). For my applications these systems would be too complex and expensive; they would also require elaborate calibration procedures at setup time, heavily violating requirement 4.3.2. Despite these drawbacks it has to be kept in mind that the costs of computer vision systems are constantly dropping while their capabilities increase, so I expect this technology to become better applicable to TUIs in the near future.
4.4.2.3 Chipbased RFID Tags
Another way to realise a TUI is to embed each object with a small RFID tag. RFID stands for radio frequency identification: RFID tags operate in the radio frequency band, enabling a reader device to communicate with them remotely and wirelessly. The advantage of this principle is that no line-of-sight contact between sender and receiver is required; reader and tags can be completely hidden, so the visual appearance of interface objects and table does not suffer. Since these tags are microprocessor-equipped, but required to be passive by the requirements (no wired power supply or batteries), the reader has to transmit energy signals powering each individual tag via electromagnetic coupling. These inductive near-field tags are then able to respond with their unique ID to a reader
which is connected to a computer. Near-field means that the sensing distance is short compared to the wavelength and to the size of the antenna involved. The antennas of tag and tag-reader are inductively coupled together, in a manner similar to transformer windings, allowing exchange of supply energy and information via RF.
Figure 4.4-3: RF-Tags (left) and the communication principle (right).
The functionality of chipbased RFID tags ranges from providing just an identity code or sensory data to randomly reading/storing data on the tag. This very robust, compact, and cheap technology (0.5-3 € per tag) is extensively used for applications like asset-tracking, animal-identification, access-control, sports-timing, and ticketing. Commonly available inductive tags have an operating frequency in the range of 60 Hz to 100 MHz and come in sizes ranging from 3 mm² to 50 cm², with reading distances ranging from 0.5 cm to 1 m, depending on application and technology. Other types of RFID tags than inductive ones do exist, but either require an extremely short reading distance ( 1000 € for A3-size). Also, the number of objects which can be used at the same time is very limited: standard commercial tablets support at most two input devices at the same time, not enough for my TUI, which should support at least 4 objects (requirement 4.3.5). Patten et al. successfully built a TUI system based on two Intuos tablets from Wacom and specially modified pucks as input devices [Pat2001]. The pucks are much more complex than an LC-tag, rely on custom-made circuitry, are battery-driven and therefore quite expensive, and cannot be integrated into small objects. In addition, there is a noticeable latency when more than two objects are moved at the same time, because a randomly-switched serial communication scheme is used. Due to these limitations, I came to the conclusion that standard tablet hardware was not suitable for my purposes.
4.4.3.2 Custom LC-Tag Reader
Since I wanted to use the benefits of LC-tags (simplicity, low costs, small size) for tracking the interface objects, I had to find reader hardware that supports precise 2D-tracking of multiple tags at high speed. The ringdown tag reader presented in [Hsiao2001, 13] provided a good starting point: the reader sequentially sends out RF bursts at certain frequencies. When a tag with a matching frequency is in range, it produces an echo which is detected and analyzed by the reader.
Figure 4.4-6: Schematic of MIT ringdown tag reader.
Some differences and extensions characterize the custom hardware for the senseDesk interface, provided by Zowie Inc. (http://www.zowie.com): the 2D-position of the tags can be tracked with a specially structured antenna, a grid of multiplexed wires, similar to a graphics tablet’s sensing mechanism. The bursts of different frequencies which identify the different tags are generated by DDS (Direct Digital Synthesis) and can be changed by software. The tags’ identity and XY-intensity information is transmitted to the PC via serial communication. After several experiments, I was able to build an interaction surface with a sensitive area of 75x56 cm, large enough to fulfill requirement 4.3.1. A maximum of 8 tags can be tracked simultaneously with a precision of +/- 2 mm and an update rate of 20 Hz. The tags are coin-shaped, flat air-coils 35 mm in diameter, are cheap to produce, and easy to integrate into different tangible objects. Therefore, the custom hardware solution fulfills the requirements as shown in Table 4.4-1.

Property                                   Required        senseDesk reader
Interaction space                          > 70 x 50 cm    75 x 56 cm
Max. number of objects at the same time    4               8
Min. degrees of freedom (DOF)              2               2
Max. latency                               100 ms          50 ms
Max. static position error                 +/- 5 mm        +/- 2 mm

Table 4.4-1: Requirements compared with the properties of the custom reader hardware.
Restrictions: the system tracks the XY-position of the tags (2 DOF), but is not able to provide Z-position or orientation information. Due to the electromagnetic nature of the RF-coupling, conductive or magnetic material close to a tag causes problems. Therefore the tangible objects should not contain any metal or other conductive materials.
4.4.4 Prototype Hardware Architecture
Since the system’s most critical component (object recognition and tracking) was defined, the next step was to define the system architecture and develop a first prototype. Figure 4.4-7 shows the architecture of the first research prototype: the tabular interaction surface exchanges RF energy with the tags attached to the tangible objects. The tag-sensing hardware takes the necessary measurements for position and identity information and transmits the data via RS-232 to the Multimedia-PC. The application running on the PC interprets the incoming data packets and computes location and movement of the objects. According to what is happening on the table, the system changes its projected graphical output in real time, merging the digital with the physical world.
Figure 4.4-7: Hardware architecture of the research prototype.
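The concrete packet format of the reader’s serial protocol is not reproduced here. Purely as an illustration, the following sketch in Lingo (the scripting language chosen for the senseDesk software in Section 4.5) assumes a hypothetical comma-separated package of the form "tagID,x,y" and shows how such a package could be turned into object coordinates and applied to a projected sprite; the packet format and all names are assumptions, not the actual senseDesk protocol.

-- Illustrative only: parsing a hypothetical "tagID,x,y" package in Lingo
on parseTagPacket packetString
  oldDelimiter = the itemDelimiter
  the itemDelimiter = ","
  tagID = value(item 1 of packetString)    -- identity of the tag
  xPos  = value(item 2 of packetString)    -- horizontal position on the table
  yPos  = value(item 3 of packetString)    -- vertical position on the table
  the itemDelimiter = oldDelimiter
  return [#id: tagID, #x: xPos, #y: yPos]  -- property list describing one object
end

-- usage: move the sprite bound to a tag to the reported position
on updateObject packetString, spriteNum
  data = parseTagPacket(packetString)
  sprite(spriteNum).loc = point(data.x, data.y)
end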
4.4.5 Hardware Details
The Multimedia-PC is equipped with standard components: a 900 MHz AMD Athlon CPU along with 256 MB of main memory, because Windows 98 degrades in speed when running out of physical memory. I use a Matrox G400 DualHead graphics card with 2 VGA outputs. When developing a system which generates interactive projections, the additional VGA output is essential, especially during the programming and testing phases: while the beamer projects the graphical output onto the surface, a monitor connected to the second output keeps the development environment accessible for coding and debugging.
4.5 Software Architecture
With the different usage contexts (research, demonstrations, exhibitions) in mind, I had to find a software architecture which suits a system serving as a flexible RAD (rapid application development) tool for research purposes, enabling the quick implementation of prototypes. For impressive demonstrations, the creation of compelling multimedia presentations (graphics, animations, sound) should be supported as well. When used as a self-running multimedia terminal at exhibitions, the software applications should run with maximum stability and reliability, without the need for technical attendance.
4.5.1 Research: Software Development Platform
I cross-checked three widely-used software development platforms for Windows™ against the requirements (flexibility, multimedia capability, stability, and speed): Microsoft Visual C++, Microsoft Visual Basic, and Macromedia Director. The results are shown in Table 4.5-1: while Visual C++ (http://msdn.microsoft.com/visualc) excels in high speed and stability, it is not a RAD language. This especially holds true for multimedia-intensive applications, because even simple interface changes and multimedia effects result in noticeable coding effort and require good programming skills, even with the support of toolkits. Visual Basic (http://msdn.microsoft.com/vbasic) is a very popular tool for developing prototypes and applications due to its flexibility and the short learning curve for beginners. Unfortunately, Visual Basic is targeted more at application development than at multimedia and is therefore not ideal for developing highly interactive graphical interfaces. Macromedia Director (http://www.macromedia.com/software/director) is a professional multimedia authoring tool for PCs and Macintosh. Movies (= programs and media assets in one file) can be developed quickly and many media types can be integrated, therefore it is an ideal RAD tool for interactive multimedia applications. Advanced users can create complex movies within Director by using its native scripting language called Lingo. Director’s virtual machine (which executes precompiled bytecode) is not highly optimized, for compatibility reasons; therefore Lingo executes 3-4 times slower than Visual Basic and roughly 100 times slower than C++. This shortcoming in speed can be compensated with self-written Xtras, C++ plugins containing time-critical code or additional functionality. Standalone .exe applications (= projectors) can be created from authored movies, and since Director version 7.0 they are stable enough for self-running multimedia installations. These features make Director the development platform of my choice for implementing the senseDesk software.

                    Flexibility   Multimedia   Stability   Speed
Visual C++          -             -            ++          ++
Visual Basic        +             o            ++          +
Director (Lingo)    ++            ++           +           o

Table 4.5-1: Comparison of different software platforms.
4.5.2 Macromedia Director
This subchapter gives a very brief introduction to the basic concepts of Macromedia Director. It also discusses the possibilities of Director’s scripting language Lingo and some of its features which were important for implementing the software framework.
4.5.2.1 Authoring Metaphor
Director uses the metaphor of directing a film for authoring projects, therefore the files are called movies. The environment offers several tools and windows that enable the developer to have complete directorial control of all facets of the project. Figure 4.5-1 shows the most important windows when authoring in Director: stage, score, cast, and the property inspector. When a movie is authored, media elements (graphics, sound, text, buttons, etc.) are first imported into the cast window, so they become cast members. In the next step, the developer places the cast members on the stage, where they become instantiated as sprites. The stage is the window where the project actually exists and where all input/output of the movie happens. Sprites are the on-screen instances of cast members, and their appearance and position can be freely changed over time with the help of the property inspector. Each sprite occupies a sprite channel in the score, which directs the pace and flow of the movie. The score window consists mainly of a large table, where the columns are the frames (screens) of the project and the rows (sprite channels) are reserved for the sprites. Each sprite that is placed in a frame resides in a separate table cell. This arrangement allows fine control over the manipulation of sprites within a frame. Advanced developers can provide sprites and frames with functionality by attaching behaviours (Lingo scripts) to them, in order to produce loops, buttons, animations, etc.
Figure 4.5-1: The Director authoring environment.
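As a small invented illustration of such a behaviour (not part of the senseDesk software), the following script could be attached to a sprite to animate it; on every frame it shifts the sprite a few pixels to the right:

-- Behaviour (sprite script), invented example: moves its sprite to the right
property pSpeed

on beginSprite me
  pSpeed = 3  -- pixels per frame
end

on exitFrame me
  -- advance the horizontal position of the sprite this behaviour is attached to
  sprite(me.spriteNum).locH = sprite(me.spriteNum).locH + pSpeed
end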
When a movie is executed, Director displays one frame after the other, from left to right, at a specified frame rate. During a frame, Director renders the sprites onto the stage and executes the currently active behaviours. These behaviours (or scripts) can react to almost every kind of event (mouse, keyboard, system) and therefore allow fully interactive multimedia applications.
4.5.2.2 Scripting in Lingo
In order to create applications which reach beyond the complexity of simple animations and slide-shows, the developer has to obtain more control by adding Lingo scripts to his movie. Scripts support constructs like navigation, computations, decision trees, or access to input devices.
In detail, Lingo allows:
• Control of cast members and sprites (position, scaling, rotation, colours, blend, etc.)
• Logical structures like conditional branching, loops, and case constructs.
• Jumps and branches within a movie, between movies, or between application windows.
• High-level constructs like object-oriented scripting.
• Access to Director extensions (Xtras).
Several types of Lingo scripts exist: behaviours (sprite and frame scripts), movie scripts, and parent scripts. Behaviours are the most common scripts in Director authoring. They can be attached to a sprite or a frame in order to make them behave in a certain way. A standard frame script found in almost every movie is

-- Frame-script: A simple loop
on exitFrame
  go to the frame
end
Figure 4.5-2: A standard frame-script.
Scripts are not executed from the first to the last line of code; instead, they contain handlers which react to events like the exitFrame event. Events are messages which are sent from and to the Director environment. When Director arrives at a frame containing the script above during playback, it simply repeats this frame forever, or until something else happens, because the “go to the frame” code is executed when the frame is about to be left. A sprite script for a simple button would be

-- Sprite-script: A simple button
on mouseDown
  go to frame "menu"
end
Figure 4.5-3: A simple sprite-script.
When this script is attached to a sprite, a mouse-click on the sprite causes a mouseDown event, which in turn makes the movie jump to a frame called "menu". Movie-scripts are very similar to behaviours, but they relate to the whole movie and not to specific entities. They mainly contain global subroutines (like math functions, string functions, etc.) or event handlers for global events (like prepareMovie, startMovie, stopMovie, etc.).
An example of a movie-script is:

-- Movie-script: A simple stopMovie handler
on stopMovie
  put "Good Bye!"
end
Figure 4.5-4: A simple movie-script.
When the movie is about to stop or quit, the stopMovie handler is called and displays the message "Good Bye!" in the message window. Macromedia Director also supports the OOP (object-oriented programming) paradigm by offering parent scripts. Parent scripts look like normal Lingo scripts, but they cannot be directly executed or attached to a sprite, because they just act as class definitions, like here:

-- Parent script: demo class
-- The member name of this script is "DemoClass"
property pWert

on new me, parameter1
  pWert = parameter1
  return me   -- the constructor returns the new instance
end

on printWert me
  put pWert
end
Figure 4.5-5: A simple parent-script.
Such a class definition becomes an executable scripting object (= instance) when it is instantiated with the new() constructor:

-- Object instantiation
obj1 = script("DemoClass").new("Hi, i'm Object 1")
obj2 = script("DemoClass").new("Hi, i'm Object 2")
Figure 4.5-6: Instantiation of two objects from the class DemoClass.
The scripting objects obj1 and obj2 are now encapsulated instances with executable methods (printWert) and their own properties (= attributes):

-- Method execution
obj1.printWert()   -> "Hi, i'm Object 1"
obj2.printWert()   -> "Hi, i'm Object 2"
Figure 4.5-7: Execution of methods of objects of the same class with different attributes.
Using object-oriented Lingo was especially advantageous for implementing the software framework, which will be discussed in 4.5.4. A more detailed treatise of Director and Lingo would be beyond the scope of this thesis; for more information on this topic, please refer to [Leske2000].
4.5.3 Software Components Overview
Figure 4.5-8 shows an overview of the different software components and the sequence of their execution.
Figure 4.5-8: Software components and their sequence of execution.
4.5.3.1 Low-Level Driver
The low-level driver is a small autonomous application that starts up the tag-reader hardware. It has three functions:
• Detection and troubleshooting: the driver detects whether a senseDesk reader hardware is connected to one of the PC's serial ports and whether it is responding correctly. In case of an error, a troubleshooting dialogue provides the user with help.
• Initialization: basic parameters stored in the senseDesk.ini text-file, like the number of tags, sensing frequencies, and receiver sensitivity, are uploaded to the reader.
• Calibration: according to these parameters, the driver calibrates the reader hardware for maximum performance at the different tags' RF-frequencies.
4.5.3.2 Software Framework
The software framework is located in a Director movie called "senseDesk.dir", which is launched by a stub-projector. A stub-projector is a small, directly executable .exe-file which launches another Director movie, an editable .dir-file. The advantage: whenever changes are made to the main movie, the developer does not have to tediously convert it into an .exe-projector each time. The movie contains the various software modules for initialisation, diagnosis, screen calibration, a menu for launching the demo movies, and sub-screens.
4.5.3.3 Demo Movies
The demo movies are launched by the main movie, which also provides them with global code and data. Each demo is a Director movie containing one demonstration application for the senseDesk interface. The demos are separate .dir-files which are autonomous in execution: when the hardware is not present or the software framework is not loaded, the TUI-interaction is simulated by mouse input. This makes the software development phase much easier, because the senseDesk hardware system is not required for coding and testing the demos.
4.5.4 Software Framework Architecture
The software framework works primarily as a testbed for rapid and easy generation of TUI prototype applications. It is divided into four layers of abstraction, as shown in Figure 4.5-9: on top, the application layer handles hardware/software initialization tasks and offers functions for diagnosis and calibration of the hardware. New TUI applications or demos are also part of the application layer. The core of my software framework is the high-level API (application program interface) layer, which provides the developer with all Lingo commands and objects necessary for creating TUI-applications. The API layer completely hides the layers below, in order to let developers fully focus on the TUI-interaction routines without having to care about low-level functionality or hardware-specific issues. The Xtra layer is a suite of plugins which bridge the gap between Director's Lingo environment and the low-level system/hardware layer, which represents the Windows OS, parameter files, and hardware devices.
Figure 4.5-9: Software Framework Layers.
4.5.4.1 Application Layer
The top level of the software framework consists of the application layer modules, which are executed as shown in Figure 4.5-10. The initialization module
• reads in miscellaneous parameters (file paths, hardware information, timing), which are stored in the app.ini text-file.
• opens the serial port and connects to the senseDesk hardware.
• creates an instance of the tuiManager object (described later in 4.5.4.2).
Figure 4.5-10: Application modules and their execution sequence.
The main menu screen displays a random colourful screensaver animation, which prompts the user to put a physical interface object on the table. When one or more objects are placed on the senseDesk surface, the screensaver animation vanishes and the physical objects are highlighted with different colours and numbers. This is the user's first contact with tangible interaction: a computational graphical object (a coloured circle with a number) becomes attached to a physical object and follows its movement in real-time, which results in a perceptual coupling of a physical artifact with a digital representation. Though this interaction has no "result", it serves as a quick test to verify whether the system is working correctly. A menu bar at the bottom of the screen lets the user choose one of the external demo applications, which are started by the demo launcher module. For a technical setup or further checks, the internal calibration or diagnosis modules are accessed by pressing the keys 'C' or 'D'.
Figure 4.5-11: Screensaver with menu bar at the bottom.
Calibration module: for a convincing illusion of digital and physical objects being merged together, the coordinate space of the input surface (senseDesk) and the coordinate space of the output front-projection of the beamer have to be matched via a mathematical transformation. This ensures that the projected graphics are perfectly aligned with the tangible objects' positions. The calibration procedure is straightforward: the user has to align the center of an input object with a spot close to the top left corner of the projection and confirm. The same step is repeated with a second spot close to the bottom right of the projection. A simple calibration algorithm then relates the two known absolute 2D screen positions to the two measured 2D input points and computes the scaling and translation values of the 3x3 transformation matrix T. Whenever the physical object's position i is sensed during run-time, its corresponding projection screen coordinates d are computed by multiplying the input position vector i with the transformation matrix T:

d_1 = T \cdot i_1, \quad i_1 = (x_1, y_1, 1)^T
d_2 = T \cdot i_2, \quad i_2 = (x_2, y_2, 1)^T

T = \begin{pmatrix} s_x & 0 & t_x \\ 0 & s_y & t_y \\ 0 & 0 & 1 \end{pmatrix}

(s ... scaling, t ... translation; i lies in the input coordinate space of the interface, d in the screen coordinate space of the projection)
Figure 4.5-12: Mapping of input coordinates to screen coordinates.
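The calibration step above can be sketched in a few lines of Lingo. This is only an illustrative sketch with assumed handler names and return format, not the framework's actual code; it assumes pure scaling and translation (no rotation), as expressed by the matrix T:

-- Derive scaling and translation from the two calibration pairs (i1 -> d1, i2 -> d2).
-- d1, d2: known screen points; i1, i2: measured input points (Lingo point() values).
on computeCalibration d1, d2, i1, i2
  sx = float(d2.locH - d1.locH) / (i2.locH - i1.locH)
  sy = float(d2.locV - d1.locV) / (i2.locV - i1.locV)
  tx = d1.locH - sx * i1.locH
  ty = d1.locV - sy * i1.locV
  return [#sx: sx, #sy: sy, #tx: tx, #ty: ty]
end

-- Apply d = T * i to a sensed input position at run-time.
on mapToScreen cal, inputPoint
  return point(cal.sx * inputPoint.locH + cal.tx, cal.sy * inputPoint.locV + cal.ty)
end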
The diagnosis screen graphically and numerically displays the raw sensing data coming in from the reader hardware. The developer can examine the intensities of the RF-signals received by the reader's antenna and check whether parts of the antenna grid or the tags are damaged. Various software parameters relating to the current antenna's size and geometry can also be adjusted for optimum performance of the sensing mechanism.
The demo launcher module manages the transition from the main application to the various demo movies located in separate .dir-files. The launcher releases all allocated resources except those related to the senseDesk hardware: when the tag-reader is connected and started by the initialization module, all parameters and methods for TUI-interaction are saved in a global tuiManager scripting-object, which permanently stays in memory, even when changing from the main to a demo movie. After a demo movie has been started by the launcher, it accesses the senseDesk interface only via this global scripting-object, without any extra initializations required. With this method, the function calls in the demo movies stay unaffected by changes made to the underlying low-level hardware/software.
Storing methods in a global scripting-object is an elegant Lingo technique for sharing subroutines among several movies: instead of running into versioning problems by putting the code of each function into each movie, or having to deal with file-access problems by moving all functions into a shared script-file, the main movie stores all routines as methods in a global object, which serves as a script-library residing in memory. All subsequently called sub-movies can invoke these methods, avoiding the versioning or access problems mentioned before.
4.5.4.2 API Layer
The API is the core layer of the software framework: it contains all the commands and objects necessary for writing TUI-applications, completely hiding the inner workings of the layers below. This layer is also the most relevant one for application developers and will therefore be discussed in detail, with an emphasis on the TUI objects and methods. OOP techniques were especially useful for the design of the API layer, because the encapsulation of methods and properties into objects has two primary benefits:
• Modularity: The source code for an object can be written and maintained independently of the source code for other objects, which allows the flexible extension of the software framework. Also, an object can easily be passed around in the system, so all parameters and methods for TUI-interaction can be saved in a global senseDesk scripting-object, which also acts as a common script-library permanently staying in memory (a short sketch of this technique follows after this list).
• Information hiding: An object has a public interface that other objects or the application layer can use to communicate with it. The object can maintain private information and methods that can be changed at any time without affecting the other objects depending on it. This allows the shielding of the lower layers from the application layer. The application developer's job becomes easier, because he can deal with tag and tangible-object software classes that match the real-world classes of tags and interface objects, while the inner workings and specialities of the senseDesk hardware remain invisible to him. Future changes to the underlying hardware also do not affect already programmed applications, because they just interface the high-level API layer. For more information about object-oriented Lingo, please refer to [Kloss2000].
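The script-library technique can be sketched in a few lines of Lingo. This is only an illustration with assumed names (gTuiManager, the update() method), not the framework's actual code:

-- In the main movie (framework), executed once at startup:
global gTuiManager

on startMovie
  if voidP(gTuiManager) then
    gTuiManager = script("tuiManager").new()   -- create the shared manager object
  end if
end

-- In any demo movie (each script declares the same global):
-- global gTuiManager
on exitFrame
  gTuiManager.update()   -- hypothetical method refreshing tag states and positions
  go to the frame
end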
The API layer is divided into sets of classes which manage different resources: the TUI classes for interfacing the senseDesk, sound classes for controlling audio output, profiler classes for code-speed measurements, etc. I modeled the different software classes following the straightforward principle of using natural analogies: whenever possible, every software object relates directly to a real-world entity. This way, a tuiTag object directly relates to a real RF-tag, a tuiObject to a physical input object, a soundChannel object to a soundcard's sound channel, etc. This principle ensures a logically consistent object-oriented design of the API layer. In order to avoid a detailed listing of all functions, only the design of the TUI classes will be discussed here, as an example of how the API layer works: the TUI classes manage all aspects of TUI-interaction with the senseDesk hardware and are therefore the most interesting ones for discussion.
As shown in Figure 4.5-13, the tuiManager on top of the class hierarchy represents the TUI hardware, which in this case is the senseDesk tag-reader. It is the most complex class of the software framework, because it encapsulates all methods and data for handling the TUI hardware and interaction: initialization, serial interface handling, packet-oriented serial communication, hardware recalibration, reading/writing parameters from/to the hardware, and updating the state and position of RF-tags and objects. The tuiManager class is instantiated as the tuiManager object during the initialization of the software framework and then stays resident in memory as a shared, global scripting-object, constantly communicating with the tag-reader via the Xtra layer. When the situation on the desk changes (i.e. an object is added/removed or moves on the surface), the tuiManager receives data packets from the reader and processes them. The tuiManager interprets the data and generates messages (show/hide object, set position, etc.), which are sent to the corresponding tuiObject and tuiTag scripting-objects.
(Class hierarchy: the tuiManager represents the senseDesk hardware and contains tuiObjects, which correspond to tangible objects and in turn contain tuiTags, which correspond to RF-tags.)
Figure 4.5-13: TUI classes and their real-world counterparts.
The tuiTag class represents a single RF-tag and is the basic entity of tangible interaction. The attributes of a tuiTag are its state (present/not present), its position (XY-coordinates) and the ID of its associated real-world RF-tag. The tuiManager constantly updates the attributes of all tuiTags by interpreting the data stream received from the senseDesk hardware reader.

tuiTag attribute   value
state              present / not present
position           XY-coordinates
tagID              RF-tag number
Table 4.5-2: Attributes of the tuiTag class.
The tuiManager also manages a number of tuiObject scripting-objects, which can be instantiated and destroyed at any time. A tuiObject represents one real-world interface object, which contains one or two tuiTags. An interface object normally needs one RF-tag, but in many cases it is necessary to use two of them: because only the XY-position of a tag can be detected by the system, two tags are needed for measuring azimuthal rotation, as shown in Figure 4.5-14.
angle = arctan(Δy / Δx), with Δx and Δy being the differences between the two tags' X- and Y-coordinates
Figure 4.5-14: Measuring the azimuthal rotation angle of an object using two tags.
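The rotation measurement can be sketched as a small Lingo handler; the handler name is an assumption for illustration, not the actual tuiObject code:

-- Compute the azimuthal rotation angle (in degrees) from the two tag positions.
on computeRotation tag1, tag2
  dx = float(tag2.locH - tag1.locH)
  dy = float(tag2.locV - tag1.locV)
  if dx = 0 then
    if dy >= 0 then
      return 90.0
    else
      return -90.0
    end if
  end if
  angle = atan(dy / dx) * 180.0 / pi()   -- atan() returns radians
  if dx < 0 then angle = angle + 180.0   -- correct the quadrant
  return angle
end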
When the update method of a tuiObject is invoked by the tuiManager, the states (present/not present) and XY-positions of the associated tuiTag(s) are processed. If only one tag is associated with the object, the data is interpreted as Table 4.5-3 shows: if the tag is present and its position is valid, the position of the tuiObject is updated with the tag's position. Because the position measurement is done by reading and processing many different radio-frequency sensor values, an invalid tag position is computed from time to time. If such an invalid position (e.g. coordinates off the table surface, big distance jumps) is detected, an incorrect data reading is assumed and the update is omitted. When the tag is suddenly detected as not present anymore, two different causes could be responsible: either an incorrect data reading has occurred (→ omit update, do nothing) or the tag is really not present anymore (→ hide object). To distinguish these two cases, a timer is started, setting the object's state to hidden after 200 ms if no correct data reading follows. Since false readings are rare exceptions and are always followed by a correct reading, this delay timer acts as a low-pass filter removing the short peaks of false readings, which would otherwise cause disturbing flicker of the associated graphic (a sketch of this heuristic follows after the table).

tuiTag state   tuiTag position   assumption                                          tuiObject update
present        valid             reader data correct                                 update position and state
present        invalid           reader data incorrect                               no update
not present    valid             either incorrect data or tag not present anymore    no update, hide after 200 ms
Table 4.5-3: Update heuristics of a tuiObject with one tag.
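The one-tag heuristic of Table 4.5-3 can be sketched roughly as follows. Property names and the isValidPosition() helper are illustrative assumptions, not the original tuiObject implementation:

-- Behaviour of a tuiObject with a single tag (sketch).
property pState, pPosition, pLostSince

on update me, tagState, tagPosition
  if tagState = #present then
    if me.isValidPosition(tagPosition) then   -- hypothetical validity check
      me.pPosition = tagPosition              -- reader data assumed correct
      me.pState = #present
      me.pLostSince = 0                       -- cancel a pending hide
    end if
    -- invalid position: incorrect reading assumed, update omitted
  else
    if me.pLostSince = 0 then me.pLostSince = the milliSeconds
    if the milliSeconds - me.pLostSince > 200 then me.pState = #hidden
  end if
end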
If a tuiObject has two tags associated with it, the position processing is a little more complex, as shown in Table 4.5-4. If both tags are present and have valid positions, the tuiObject's position and rotation are updated. If one tag is missing or invalid positions are read (e.g. the distance between them is too small/large), an error is assumed and therefore nothing happens. If both tags of a currently active tuiObject are suddenly not present anymore, the associated tuiObject is set to hidden state after a delay of 200 ms.

tuiTag states   positions   assumption                                           tuiObject update
all present     valid       reader data correct                                  update position, rotation, state
all present     invalid     reader data incorrect                                no update
one missing     -           reader data incomplete                               no update
none present    valid       either incorrect data or tags not present anymore    hide after 200 ms
Table 4.5-4: Update heuristics of a tuiObject with two tags.
As an example of how these TUI classes work together, a simple interaction scenario is discussed here, as shown in Figure 4.5-15: an empty interaction surface is assumed in the beginning, therefore all tuiTags and tuiObjects are hidden. The user puts a physical interface object A, which contains tag 1, on the table (a) at position (x1, y1), and the tuiManager object receives a stream of data packets from the senseDesk hardware. The tuiManager processes the data, sets the state of tuiTag 1 to "present" and updates its position to (x1, y1). TuiObject A, which contains tuiTag 1, is also set to "present" and to the same position. A graphical object (e.g. a white dot) associated with tuiObject A becomes visible at screen position (x1, y1), thus highlighting the real object A. The user now moves object A on the surface (b) to position (x2, y2), which generates a data stream according to the movement. The tuiManager updates tuiTag 1 and tuiObject A, which causes the white dot to follow the object exactly to position (x2, y2). The user lifts object A off the table (c): tuiTag 1 is read as "not present" and tuiObject A is set to "maybe not present anymore", triggering the 200 ms delay timer. After 200 ms, tuiTag 1 is still detected as not present, so tuiTag 1 and tuiObject A are set to "not present" and the white dot disappears.
Figure 4.5-15: An example TUI interaction scenario.
        tuiTag 1                            tuiObject A
a       present, (x1, y1)                   present, (x1, y1)
b       present -> (x2, y2)                 present -> (x2, y2)
c       not present                         maybe not present anymore
c'      still not present (after 200 ms)    not present
4.5.4.3 Xtra Layer
Macromedia Director is not limited to the functionality of its native scripting language Lingo. It can also be extended via special plugins called Xtras. Xtras are packages of functions written in C++, which can be invoked like normal Lingo commands or media elements. In general, these plugins are dedicated to certain task domains like hardware access (serial port, printer, frame-grabber), OS access (manipulation of registry keys, screen resolution), graphic effects (transitions, special sprites) or fast computations (math routines, cryptography). Many commercial and freeware Xtras are available, offered by third-party vendors and private developers. Macromedia also provides a free XDK (Xtra Development Kit), which allows developers to build their own Xtras.
The Xtra layer of the software framework plays the role of a mediator between the Lingo-scripted API layer and the system/hardware layer. Three Xtras will be discussed here: the senseDesk-Xtra, the FileIO-Xtra and the BuddyAPI-Xtra.
The senseDesk-Xtra: As already mentioned, Director's native scripting language Lingo tends to be relatively slow for computations. On the other hand, response times (tangible input → graphical output) shorter than 100 ms are crucial for TUIs (see 4.3.8). Especially the low-level routines for interfacing the tag-reader, which handle the serial protocols and convert the sensory reader data into reliable XY-coordinates (plus error estimates), have to be as fast as possible to avoid unnecessary latencies. Therefore, I ported these time-critical functions from Lingo to C++ to achieve maximum speed: first I used Lingo for developing the low-level routines. By examining them during profiling sessions, I identified the time-critical functions, rewrote them in C++ and integrated them into a dedicated custom-written Xtra, the senseDesk-Xtra. This way, I used the advantages of both languages: the flexibility of scripted Lingo during the development stage and the speed of compiled C++ at execution time. The senseDesk-Xtra provides all necessary commands for interfacing the senseDesk hardware (a usage sketch follows the list):
• General handling of the serial interface (open, close, reset, flush buffers)
• Raw byte-wise communication (readRaw, writeRaw)
• Communication via a packet-oriented protocol (readPackets, writePackets)
• Basic processing of the incoming sensory data for position detection (getXY)
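A hypothetical usage of the senseDesk-Xtra could look like the sketch below. Only the command groups are named in this section; the exact argument lists, return formats, and the instantiation call shown here are assumptions for illustration, not the Xtra's documented interface:

-- Starting the reader and polling positions (sketch, names and signatures assumed).
global gSenseDesk

on startReader portNum
  gSenseDesk = new(xtra "senseDesk")
  gSenseDesk.open(portNum)       -- open the serial interface
  gSenseDesk.flush()             -- flush the receive buffers
end

on pollReader
  packets = gSenseDesk.readPackets()      -- packet-oriented raw sensing data
  return gSenseDesk.getXY(packets)        -- XY-coordinates plus error estimates
end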
The FileIO-Xtra from Macromedia¹ supports the handling of file-based I/O. It provides all commands needed for finding, opening, reading, writing and appending text-files. These functions are necessary for accessing the .ini-files, which contain all parameters regarding input devices, launching demo applications and hardware calibration.
¹ http://www.macromedia.com
The BuddyAPI-Xtra¹ is a commercially available utility package which contains more than 100 functions for accessing the Windows API and the Macintosh Toolbox. The functions used in the software framework are:
• getting information about the system environment (Windows version, QuickTime, etc.)
• preventing multiple instantiation of an application
• controlling screensavers (e.g. for deactivating them, to prevent unwanted interruptions)
• detecting and changing the current screen resolution
• manipulating and restricting the mouse cursor
• controlling other applications
The BuddyAPI-Xtra provides many functions for controlling the OS environment at run-time, which are essential for realising advanced applications but are actually missing in Lingo. Therefore it has become a standard Xtra for most multimedia developers.
4.5.4.4 System/Hardware Layer
The filesystem is primarily accessed for reading/writing the two .ini-files, which contain the parameters regarding the sensing hardware, application mode, demo file paths and timing intervals. The file app.ini contains the overall application parameters and, in my case, links to hardware.ini, which in turn contains the special parameters concerning the sensing hardware and its calibration. This separation of .ini-files increases flexibility, because developers can switch to a different hardware setup by just changing the link to a different hardware.ini file.
Figure 4.5-16: File structure of ini-files.
Figure 4.5-17 shows a typical app.ini file: the [Files] section contains the link to the hardware.ini file, while the [Hardware] section specifies whether a reader hardware is connected and on which port. [Applications] contains the paths to the demo movies. The [Timer] section sets the timing intervals (in milliseconds) for the regular refreshes of the tangible objects' states and positions and for the recalibration of the reader hardware. The [Misc] section enables the use of an optional trackball instead of a mouse, and, for kiosk mode, the screen resolution can be forced to the application's native resolution of 800x600 pixels.
¹ http://www.mods.com.au/budapi
[Files]
sdini=hardware.ini
[Hardware]
connected=1
port=1
[Applications]
numapps=5
paths=["..\app_humanexplorer\human04","..\app_urban\shadowTest08","flowFIELD","..\app_museum\magicLens\magicLens03","../app_pong/pong_04"]
isexternal=[1,1,0,1,1]
[Timer]
UpdateInterval=5010
CalibrationInterval=15000
AutoCalibrationInterval=1000
[Misc]
TrackballPresent=1
ForceScreenResolution=1
Figure 4.5-17: A typical app.ini file.
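Reading such a file from Lingo is straightforward with the FileIO-Xtra; the sketch below (a hypothetical handler with naive line-based parsing, not the framework's actual code) shows how the sdini entry could be extracted:

-- Read app.ini and return the linked hardware.ini filename (sketch).
on readHardwareIniPath
  fileObj = new(xtra "FileIO")
  openFile(fileObj, the moviePath & "app.ini", 1)   -- 1 = read-only mode
  contents = readFile(fileObj)
  closeFile(fileObj)
  repeat with n = 1 to the number of lines in contents
    thisLine = line n of contents
    if thisLine starts "sdini=" then
      return chars(thisLine, 7, length(thisLine))   -- text after "sdini="
    end if
  end repeat
  return EMPTY
end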
The Windows operating system environment (output screen, active tasks, mouse pointer) is controlled with the BuddyAPI-Xtra, as described in 4.5.4.3. Controlling the mouse pointer is especially necessary when alternative pointing devices are used. If a trackball is connected instead of a mouse, the menu bar for choosing the demo applications has to operate in a different way: the various menu options are highlighted and selected according to the relative horizontal movement of the trackball, instead of using an absolute mouse pointer for point-and-click. This is achieved by keeping the mouse cursor invisible and regularly resetting its absolute position to the middle point. The actual relative movement is then the difference between the current cursor position and the mid-point before the reset.
Figure 4.5-18: Measuring relative movement by subtracting absolute positions.
The senseDesk hardware designates the custom tag-reader, which detects the position of the interface objects on the desk surface, as already described in 4.4.3.2. The tag-reader communicates with the software via data packets. The packet-format is a structure of flexible length consisting of command, packet byte-count, data and checksum fields, which are preprocessed by the senseDesk-Xtra and interpreted by the tuiManager class.
5 senseDesk – Application Development
In the previous chapter, I introduced a new system for tangible user interaction called the senseDesk and described the design and implementation of its hard- and software. This chapter discusses the development of the software applications for the senseDesk, which demonstrate the uses and benefits of desk-based TUIs.
5.1 Application Requirements
In chapter 4, three scenarios of system usage have been defined: research, demonstration and public installation. Accordingly, the demonstration applications should
• reveal the benefits and capabilities of the senseDesk system.
• impress their audience.
• inspire ideas for new TUI concepts and applications.
• prove the applicability of the senseDesk to real-world application domains.
Simplifications: Since the applications are only demos for proving a concept, they should
• show little complexity and have no more features implemented than necessary, especially when these are not related to the TUI-interaction.
• present the content of the real-world task scenario of the suggested application domain in less detail, because the focus lies on showing the interaction principle.
5.2 Development Process
Figure 5.2-1 shows the course of the development process: each demo application is motivated by a basic idea, which either relates to a basic interaction principle (e.g. using the tangible object as a Magic Lens) or introduces the senseDesk to a new application domain (e.g. implementing a TUI version of the classic video game "Pong"). Next, the idea was verified by implementing a first prototype in order to study its applicability to tangible human-computer interaction. If the prototype showed promising results, I produced a concept by envisioning the possible application domains and defining one suitable real-world scenario to be demonstrated with the senseDesk. Next, I organized and produced the content (graphics, audio, text) fitting the scenario and implemented the demo as a Director movie. At the end of a process step, I often conducted informal hallway usability tests: I exposed staff members and visitors to my demos in order to integrate feedback from real-world users into the development process, which also provided clues for further refinements.
(Process steps: motivation, prototype, concept and application, each followed by implementation, evaluation or refinements, informed by knowledge from observing real-world users.)
Figure 5.2-1: Course of the demo development process.
5.3 humanExplorer
5.3.1 Motivation
5.3.1.1 Coupling Physical and Digital Representations
When a tagged object with a white and flat surface is put on the senseDesk, the image projected from above is also clearly visible on the object itself. The software is thus able to project an additional image just onto the white object, because its position and identity are constantly detected by the tracking system. When the object is moved by the user, the second projected image attached to it follows in real-time. This perceptual coupling of physical and digital representations creates the illusion that the object's surface is an active display, which is a very powerful concept: passive input objects become enhanced with (seemingly) active graphical output capabilities. This effect is introduced to the user in the main menu screen of the software framework, where the different objects are attached with projected colours and numbers, as discussed in chapter 4.
Figure 5.3-1: A neutral white disc augmented with projected circle and number.
5.3.1.2 The Tangible Magic Lens
The simulated spatial awareness of the tangible objects enables the implementation of another interesting concept: since the system tracks the location of an object, the coupled image is able to change according to its current position, an effect which by illusion turns objects into spatially aware displays. Such a display is able to give a different view of the content of the screen underneath, which may suppress or enhance data or reveal hidden information layers. Such an object acts as a visual filter called a Magic Lens, a concept introduced by Bier et al. [Bier1993]: a Magic Lens is a screen region (similar to a window) with an operator (like "magnify" or "show wireframe"), which is performed on the objects viewed in the region. The Magic Lens is a sophisticated visualisation tool, because it provides rich context-dependent feedback and the ability to view details and context simultaneously, as displayed in Figure 5.3-2: a magnifying Magic Lens is moved over a piece of text and the letters underneath the lens are rendered in an enlarged version, a common tool used in graphic editors.
Figure 5.3-2: A text-magnifying Magic Lens.
In traditional GUI systems, the Magic Lens is manipulated with a pointing device such as a mouse or trackball. This is an opportunity for TUI systems like the senseDesk: the digital Magic Lens could be intuitively and directly manipulated like a real-world tool by coupling it to a physical artifact.
5.3.2 Prototype
For the first prototype TUI application, I implemented a very simple effect: the user moves a white disc, a tangible Magic Lens, on a black surface and looks "through" the lens into a hidden world of graphics and texts (like peeking through a keyhole). Technically, the object is a moveable mask, and the region of the screen occupied by it is rendered with the content of a separate graphics layer, as shown in Figure 5.3-3.
Figure 5.3-3: Prototype of a tangible Magic Lens.
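The coupling between the physical disc and the mask can be sketched in a few lines of Lingo. The sprite number, the gTuiManager accessor and the property names are assumptions for illustration, not the demo's actual code:

-- Frame behaviour keeping the mask sprite glued to the tracked object (sketch).
global gTuiManager

on exitFrame
  obj = gTuiManager.getObject(1)           -- hypothetical accessor for tuiObject 1
  if obj.pState = #present then
    sprite(10).visible = TRUE              -- sprite 10: mask revealing the graphics layer
    sprite(10).loc = obj.pPosition
  else
    sprite(10).visible = FALSE
  end if
  go to the frame
end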
The simple movable-mask prototype already shows that the concept of the senseDesk and the concept of the Magic Lens go together very well: the user manipulates and moves the Magic Lens with his hands in a natural way, just like a real-world lens. During informal usability tests of the prototype, the test users
• quickly understood how to use the lens after a learning phase of just a few seconds, without the need for instructions.
• liked to explore the whole interaction surface with the lens in order to examine the content that was "hidden behind the screen".
• stated that they did not have the feeling that they were dealing with a computer program or with a computer in general.
These observations indicate that the strong analogy between a tangible Magic Lens and a real-world lens yields promising interaction techniques worth further investigation.
5.3.3 Concept
Since the prototype successfully featured the tangible Magic Lens as a tool for exploring or examining content, I used the same idea for the interaction scenario of the first demo: I conceived an application which could be installed as an exhibit in a public space, like a natural history museum. The visitor using the exhibit should be able to grab a tangible tool and examine the displayed content with it. The Magic Lens effect coupled to the tool should reveal another view of the content and provide contextual information. The theme of the application is exploring the human body, because it is a typical subject of examination and also because there is plenty of material freely accessible via the internet, published as a result of the "Visible Human Project" at the NLM¹.
5.3.4 Application
In the beginning, the application displays the front view of the human body to be examined and prompts the user (assumed to be a museum visitor) to take the tangible tool and start the exploration. The tangible tool is a magnifying lens with a grip, but the optical glass lens has been exchanged for a white disc with a tag inside. The white disc is tracked by the senseDesk and serves as a passive projection display for the digital Magic Lens. This application of an everyday tool as an input device for human-computer interaction is a good example of taking advantage of the user's already learned skills and habits: the magnifier simply affords being picked up and used for examining an object. How to use the interface is obvious right from the start, because the user's mental model matches the model of the system (= interface).
Figure 5.3-4: The real and the tangible magnifier.
When the user holds the lens above the desk, it displays an "x-rayed" view of the body showing the blood vessels and bones inside. The current vertical cross-section of the body, which depends on the x-position of the lens, is shown in the upper right of the screen. The application also visually marks some of the body's hotspots (organs like heart, lung, stomach, etc.) with a red dot and a textual label when the lens is moved close to them (Figure 5.3-5).
Figure 5.3-5: Interaction with the humanExplorer.
¹ http://www.nlm.nih.gov/research/visible/visible_human.html
When the lens is held directly above a hotspot, additional contextual information such as explanatory text and animated graphics concerning the focused object of interest is displayed. In order to enhance the sensual qualities of the interaction, I also added an auditory layer to the interface: the closer the user moves the lens to an organ, the louder its typical noises become. This effect creates the illusion of listening into the human body (as with a stethoscope) in addition to visually peeking inside.
5.3.5 Discussion
Concerning interaction, the current version of humanExplorer is quite simple: it consists of one interaction instrument, the tangible Magic Lens, with its 2D-position controlling the state of the application, comparable to the "passive lens" of metaDesk [Ull1997b]. The user is only able to spatially query the projected environment with the lens, without actually manipulating or changing any data. But the simplicity of this demo is also its key advantage: the test users especially enjoyed the playful interaction with the digitally augmented magnifier, which encouraged them to explore the application content and to learn more about it.
5.3.6 Conclusion and Future Work
I consider education, entertainment, and artistic expression as the main application domains for an installation like humanExplorer, because ease of use, aesthetic appearance, and effective presentation of information combined with entertainment value are of high importance here. Regarding future work, an interesting extension of the humanExplorer would be a more sophisticated magnifier with one or two buttons added: by pressing a button, the user could select a part of the body and go deeper into detail. The lens could also change its mode and display another layer of information, like the flow of blood or neural activity.
5.4 wayFind
5.4.1 Motivation
The humanExplorer demo featured a tangible Magic Lens which gives a view of a second graphical layer, so the operator of this Magic Lens is called "show second layer". Alternatively, the lens could also operate as a magnifier which enlarges the graphical objects underneath it, similar to a real-world magnifier. In addition, the user could vary the zoom factor continuously by rotating the lens (counter-)clockwise with his hand (Figure 5.4-1). Such a 3-DOF tangible interface would allow rapid spatial querying of graphical content.
Figure 5.4-1: Rotation controls zoom factor.
5.4.2 Prototype
Since I wanted to implement the concept of zooming into graphical content with high speed and high resolution, I chose to use a vectorized 2D ground plan as graphic material. Because Macromedia Director is not able to handle complex vector graphics, I had to import the vector data of the floorplan into a Macromedia Flash¹ movie, which can be embedded in and controlled from a normal Director movie.
5.4.2.1 Interface Object Design
After some experimentation, I set the diameter of the projected Magic Lens to 20 cm (= 200 pixels), which is a good compromise between showing enough processed content and occupying as little screen space as possible (Figure 5.4-2). Since I wanted to project the enlarged content directly onto the interface object, I first chose to use a big white plastic disc (also 20 cm in diameter) for the tangible Magic Lens, as shown in Figure 5.4-3.
Figure 5.4-2: Magic Lens magnifying a 2D map.
Figure 5.4-3: A big, white disc as tangible Magic Lens.
This solution turned out to be impractical, mainly for ergonomic reasons: in order to rotate a disc with a diameter of 20 cm while keeping it in position, the user has to use two hands.
¹ http://www.macromedia.com/software/flash
Also, moving the disc on the surface with moderate speed and accuracy cannot be accomplished using only one hand. Therefore the big disc is disadvantageous, because the user is forced to use both hands for a kind of interaction which could just as well be executed with one hand. Furthermore, another fact has to be considered when designing physical objects for the senseDesk: the bigger the interface object, the sooner it reaches the margin of the interaction surface, so less space can be used for effective interaction, as shown in Figure 5.4-4. For these reasons I decided that big objects (such as 20 cm in diameter) are not suitable for the zooming Magic Lens and that a smaller interaction object had to be found.
Figure 5.4-4: Effective interaction space of a big and a small disc.
Since in this application ergonomics were more important than achieving a desired aesthetic effect, I abandoned the idea of projecting the content of the lens directly onto the interaction object. This way, only ergonomic and technical requirements were left to be fulfilled:
• The object should be easily graspable with one hand and accurately movable on the surface.
• One-handed azimuthal rotation must be supported.
• The object should accommodate two RF-tags for position and rotation sensing.
Considering these requirements, I ended up with a small round Tupperware™ plastic container (normally used for food storage). The container is 8 cm in diameter and 5 cm high and serves as an easily graspable puck, which can also be effortlessly moved and rotated like a turning knob when resting on the surface, as shown in Figure 5.4-5.
Figure 5.4-5: Food container serving as prototype interaction object.
5.4.2.2 Zooming Mechanism
In my prototype, the projected Magic Lens resides directly in front of the puck, establishing a strong visual connection between the physical and the digital objects without overlapping them. Additional content layers of the floorplan, like special locations, details, etc., are displayed only in the region of the lens. This helps to avoid the visual clutter which would arise if all information were shown all over the screen. By turning the puck clockwise, the user zooms into the content, while turning counter-clockwise zooms out until the original 1:1 factor is reached. Unfortunately, the maximum zoom factor is limited by the resolution of the interface: the XY-sensing resolution of the senseDesk is +/- 2 mm, which is fine enough for applications where the relationship between the movement of the graphical and the physical objects is 1:1. But when having zoomed in, e.g. by a factor of 1:10, every small movement of the puck causes a ten times larger movement of the displayed content, so a 2 mm step makes the graphics shift by 2 cm! Therefore I limited the zoom factor of the prototype to 1:25, which makes the content of the lens jump in (still tolerable) steps of 5 cm when zoomed in to the maximum.
Figure 5.4-6: Controlling the lens with a puck.
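The mapping from puck rotation to zoom factor can be sketched as below; the rotation gain is an assumed value for illustration, only the clamping to the 1:1 to 1:25 range follows the text above:

-- Map the accumulated clockwise rotation angle (degrees) to a zoom factor (sketch).
on rotationToZoom angleDegrees
  zoom = 1.0 + (angleDegrees / 360.0) * 8.0   -- assumed gain: 8x per full turn
  return min(max(zoom, 1.0), 25.0)            -- clamp between 1:1 and 1:25
end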
5.4.2.3 User Testing
When I exposed the prototype to test users, I gave them only one little hint: they should move the puck and rotate it. The rest of the application they had to learn by themselves, which all participants accomplished in less than one minute. The testers rapidly learned how to zoom in and examine spots of interest without getting lost in the enlarged map, because the full floorplan was permanently displayed around the lens, serving as a reference. The short learning curve of the prototype interface can be explained by the concept of natural mappings: natural mappings are relationships between controls and actions which take advantage of physical analogies and cultural standards [Nor1990, p.23]. They lead to an immediate understanding of the interface, which makes them instrumental in designing usable systems. In the prototype, every planar XY-movement of the puck is mapped to an equal XY-movement of the lens (which also directly relates to the behaviour of a real-world magnifier), a spatial analogy which is immediately understandable for the user. The rotation of the puck resembles the rotation of a turning knob (like a volume knob found on many appliances). As a de facto standard in our western culture, turning a knob clockwise means "more" (like more volume), while turning a knob counter-clockwise means "less" (exceptions like water faucets do exist, but within the domain of electronic devices or information technology, the former statement holds). In the zooming interface, turning clockwise means "more zoom" or "go deeper", which is not a 100% cultural standard (it could also relate to "more out" or "more away"). Nonetheless, this mapping matches the user's everyday logic and experience much better than the mapping the other way round. This was underpinned by my observations, in which the test users remembered the mapping throughout the whole test after having tried the zoom once. With a second group consisting of four users, I tested the opposite mapping of the zoom, in which turning clockwise related to "zoom out". The participants turned the puck in the wrong direction again and again throughout the whole test, and when asked about it, they confirmed that they were confused by this mapping. In general, the test users appreciated the tangible zooming lens, because it allowed the exploration of the map in a playful and intuitive way. Some of the participants showed orientation problems when moving the lens at maximum zoom level, caused by the jumping of the graphics mentioned in 5.4.2.2 and by the fact that the difference in zoom levels between content and context had become too large. One participant made a good suggestion for improvement: the interface should provide more context information about the currently focused content.
5.4.3 Concept
As shown by the prototype, the tangible puck combined with the projected Magic Lens allows easy spatial queries and navigation. Therefore a promising application of the senseDesk could be an information kiosk serving as the central part of the visitor guidance system in a museum. Visitors could easily access information about the different parts of the building as well as receive information about how to get to a desired place. The simplicity of the tangible user interface would enhance the overall experience and increase the usability of the kiosk. An additional physical model of the building, with its actual parts of interest selectively highlighted by a second projector, could be placed in the proximity of the kiosk.
Figure 5.4-7: Conceptual sketches of wayFind.
5.4.4 Application
Figure 5.4-8 shows the interface screen of wayFind: the user (as a visitor) stands in the entrance hall and explores the interactive ground plan of the Museumsquartier¹, a well-known complex of art museums located in Vienna. The main interaction device is a zooming lens window which is controlled with a freely movable puck, similar to the prototype. A coarse ground plan of the complex is projected onto the surface, labeled with the names of the different buildings. When a specific building comes into the focus of the lens, it is highlighted with a different colour. Its name label grows bigger and snaps to the center of the lens, in order to clearly indicate which building is selected. The lens also displays a more detailed floor plan and additional information layers like toilets, info points, gastronomy and other points of interest to the visitor. An important feature is the actual way-finder, which indicates how to get from the current location ("You are here") to the focused point by displaying a red path and additional textual information. The bottom of the screen is the stage for a multimedia presentation of the current exhibition(s) and also provides textual information about artists, opening times, dates, etc. The window in the upper right shows an aerial view of the building complex and simulates the selectively highlighted architectural model, which could be installed in the immediate vicinity of the kiosk and serve as an additional attraction.
Figure 5.4-8: Interaction with wayFind.
¹ http://www.museumsquartier.at
When the visitor zooms in far enough with the lens, he triggers a pseudo-mouseclick and changes one scale-level deeper into the map: the whole screen zooms in until the currently focused building is displayed at full size (Figure 5.4-9). He is then able to examine the enlarged building room by room in the same way as he explored the whole complex in the overview-level before. When the puck is moved to the right border of the screen, the application changes back to overview level.
Figure 5.4-9: Zoom-in transition from the overview level to building level.
5.4.5 Discussion
Regarding interaction, the wayFind application is an extension of the humanExplorer: the tangible lens is more sophisticated, because it can be moved and also rotated to adjust the zoom factor. When turned far enough, the pseudo-mouseclick allows the transition between different scales/levels of the application. I introduced this click-by-rotation in order to add trigger functionality to the puck without increasing its complexity by adding physical push-buttons. This way, zooming deeper into the content is always accomplished by rotating the puck clockwise, which keeps the handling of the interface consistent. Though this interaction technique is straightforward and simple, the relationship between rotation and the transition is not really obvious to the user: I observed deeply surprised test users when they encountered this "hidden" functionality for the first time, and some of them were not sure what action had caused the zoom-in transition. This problem arose partly from the circumstance that turning knobs which additionally trigger a function at their maximum position are not common in our everyday world. This early version of wayFind did not provide any on-screen information about how far to turn the puck or that a transition down the building hierarchy was going to happen.
In order to make this functionality visible, I added a circular arrow around the puck and the text "ENTER", which are gradually filled and faded in as the user turns the puck clockwise (Figure 5.4-10). These visual and textual clues provide enough guidance for novice test users.
Figure 5.4-10: Visual and textual clues indicating the zoom-in feature.
Another interesting problem is the lower third of the screen, which is used for displaying contextual content and information. Since the lens is positioned directly in front of the puck, the user moves the physical device around in the lower half of the screen. Therefore the puck, the user’s arm and his hand interfere with the projected graphics and text during his interaction, as shown in Figure 5.4-11. Consequently, the screen-space between the puck and the user can’t be used for contextual information (about the current building, etc.), because it regularly becomes unreadable by the body’s intrusion and therefore has to be moved to another place: A distinction between presentation and interaction elements becomes necessary. Interaction elements provide immediate feedback for the tangible interaction, therefore they should be clearly bound to their input devices. The remaining presentation-related elements are not critical for handling the interface and therefore can be separated from the interaction elements in order to increase overall clarity of the interface.
Figure 5.4-11: Puck and body interfering with the multimedia presentation.
Solution A is to move all contextual presentation into the upper part of the screen, which is out of the user’s reach and therefore safe. The drawback of this solution is the limited screen-space which is left there.
Solution B solves the problem with hardware: A monitor is added to the system, which only presents the contextual information, while the interaction-related content is displayed on the surface of the senseDesk (Figure 5.4-12).
Figure 5.4-12: Solution A (left) and Solution B with additional presentation space (right).
The additional physical separation of presentation and interaction space in solution B yields many positive effects:
• More screen space is available because of the additional monitor.
• The presentation elements and the presentation directly related to the interaction can both be displayed at full size.
• The tangible interaction is simplified, because more screen space is available and only interaction-related graphics are shown on the senseDesk.
• Group experiences are better supported, because bystanders can comfortably watch the output of the installation on the additional monitor.
In this case, the presentation-related elements comprise the multimedia information about the current building, which does not need to be viewed during the interaction with the puck. I moved these elements away from the map, lens, and labels, i.e. the interaction-related content, which has to stay bound to the puck. This visual separation simplifies the understanding of the interface, because the interaction area is less visually cluttered.
5.4.6 Conclusion and Future Work
I consider information retrieval, information presentation, and education as the main application domains of an installation like wayFind, because it effectively supports querying and navigating content which can be mapped into 2D space. The system especially supports novice users through the simplicity and immediacy of its tangible user interface, resulting in a short learning curve. In order to support spatial queries of higher complexity (e.g. combination of multiple views and criteria, access to high-dimensional data), the 3-DOF puck is too limited and the interface has to be extended. One extension could be the use of push-buttons mounted at the bottom of the screen, associated with extra options displayed above (similar to the main menu of the software framework). Another interesting extension would be mounting a resistive touch-screen foil on the interaction surface: the user could then select graphical content directly with his finger-touch, which allows flexible buttons and menus plus another layer of direct interaction with the 2D map. These opportunities make resolving the resulting technical implications (like electromagnetic interference with the tag-sensing system and false touches caused by the puck) worthwhile future research.
5.5 forceField
5.5.1 Motivation
The previously discussed demo applications wayFind and humanExplorer both feature the usage of one tangible object, which is used as an instrument that enables the user to perform a certain operation (magnifying, highlighting, augmented viewing) on the digital content of the senseDesk. Since the senseDesk offers tracking of up to 8 tags, multiple interface objects can be used. As stated by Fitzmaurice et al. [Fitz1995], tangible interfaces using multiple objects allow space-multiplexed input. With space-multiplexed input, each function to be controlled has a dedicated transducer occupying its own space (like in a car with its steering wheel, pedals, etc.). In contrast, time-multiplexed input uses one device to control different functions at different points in time. This is typical for traditional GUIs, which channel a task through a sequence of actions performed with a pointing device like a mouse or trackball. A TUI that yields a higher degree of parallelism by offering multiple input objects would therefore be advantageous for applications where the user has to manipulate multiple representations on screen.
Figure 5.5-1: Multiple tangibles allow space-multiplexed manipulation.
5.5.2 Concept
An application which benefits from the manipulation of multiple controls is the simulation of physical systems, because in such systems multiple phenomena, usually influenced by a number of parameters, interact at the same time. The 2D simulation of an electric field caused by the presence of point-charges provides a good demo concept, because the direct manipulation of the electric charges and the immediate display of the resulting effects greatly enhance the understanding of the phenomenon. Small interface objects should act as the tangible representations of usually ungraspable electric charges, which influence the digitally simulated dynamic physical world.
5.5.3 Application
Figure 5.5-2 shows several stages of the forceField simulation: in the first screen, no charge is placed on the desk and therefore no electric field exists at this time. The small red and green dots scattered over the screen are small particles of positive or negative electric charge. These particles only probe the electric field; they do not influence it, they just react to it. In the beginning, no electric field is present (i.e. no force acts on them), so they remain idle and stay in place. When the user picks up one of the small tagged discs and puts it into the simulation, the situation changes: the tangible charge produces an electric field, which is visualised by a suddenly appearing grid of white arrows. When the user moves the charge, the white arrows instantly change, reflecting the changes of the electric field. The presence of the electric field also generates forces, which cause the particles to move in space. The user observes that charges attract particles of opposite sign (i.e. positive-negative), while they repel particles of the same sign. The electric forces acting on each particle are visualized with little lines which change according to their actual strength and direction.
Figure 5.5-2: Dynamic visualisation of the electric field.
The orientation of the arrows is determined by the electric field lines, which emerge from green (= negative) poles (or infinity) and lead into red (= positive) poles (or infinity). The field intensity is visualized by the brightness of the arrows. Orientation and intensity for discrete positions in the field are computed by calculating the effects of each charge separately and finally superimposing them. The whole grid of arrows visualizes the direction field of the underlying first-order differential equation. It is computed by superimposing the individual electric fields caused by the tangible point-charges. The field vector \vec{e}_i (direction and intensity) for each arrow i on the grid is calculated using the formula below:

\vec{e}_i = \sum_{j=1}^{n} \frac{\vec{a}_i - \vec{q}_j}{\left| \vec{a}_i - \vec{q}_j \right|^2} \cdot q_j \cdot c

n ... number of charges
\vec{q}_j ... location vector of point-charge j
q_j ... charge of point-charge j [Coulomb]
\vec{a}_i ... location vector of arrow i
c ... constant
Figure 5.5-3: Formula for computing the field vector on a certain location.
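Again purely as an illustration and not as the actual Director code, the following Python sketch applies the formula of Figure 5.5-3 to compute the field vector for a single arrow position; the constant c and the example charge values are hypothetical.

    import math

    def field_vector(arrow_pos, charges, c=1.0):
        """Superimpose the contributions of all point-charges at one arrow position.

        arrow_pos: (x, y) location of the arrow on the grid
        charges:   list of ((x, y), q) tuples - tag position and signed charge
        c:         scaling constant (hypothetical value)
        Returns the field vector (ex, ey); its direction orients the arrow,
        its magnitude is mapped to the arrow's brightness.
        """
        ex, ey = 0.0, 0.0
        ax, ay = arrow_pos
        for (qx, qy), q in charges:
            dx, dy = ax - qx, ay - qy
            dist_sq = dx * dx + dy * dy
            if dist_sq < 1e-9:          # arrow sits exactly on a charge: skip it
                continue
            ex += dx / dist_sq * q * c  # contribution of charge j, as in Figure 5.5-3
            ey += dy / dist_sq * q * c
        return ex, ey

    # Example: one positive and one negative tangible charge on the surface
    charges = [((100.0, 100.0), +1.0), ((300.0, 200.0), -1.0)]
    e = field_vector((200.0, 150.0), charges)
    angle = math.degrees(math.atan2(e[1], e[0]))   # arrow orientation
    intensity = math.hypot(e[0], e[1])             # mapped to arrow brightness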
The tangible objects are small white pucks (4 cm in diameter), each containing one tag. Up to eight pucks can be used simultaneously.
Figure 5.5-4: Small pucks serving as tangible point-charges.
A second version of forceField uses bigger pucks (8 cm in diameter) with two tags inside for additional rotation input. By rotating the puck like a turning knob, the user controls the intensity and polarity of the point-charge, which is indicated by fading the colour of the disc between red, grey, and green. Though no more than four objects can be used in this configuration, the added layer of control significantly increases the interactivity of the simulation.
Figure 5.5-5: Big pucks with two tags representing adjustable charges.
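A minimal sketch, assuming the puck angle is derived from its two tag positions, of how rotation could be mapped to a signed charge and to the red-grey-green colour fade; the angle range, maximum charge, and RGB blending below are illustrative choices, not the thesis's actual implementation.

    import math

    def puck_angle(tag_a, tag_b):
        """Azimuthal angle of a two-tag puck, derived from its tag positions (degrees)."""
        return math.degrees(math.atan2(tag_b[1] - tag_a[1], tag_b[0] - tag_a[0]))

    def angle_to_charge(angle_deg, q_max=2.0):
        """Map a puck angle in [-180, 180] degrees to a signed charge in [-q_max, +q_max]."""
        return max(-1.0, min(1.0, angle_deg / 180.0)) * q_max

    def charge_to_colour(q, q_max=2.0):
        """Fade between green (negative), grey (zero), and red (positive), returned as RGB."""
        t = max(-1.0, min(1.0, q / q_max))
        if t >= 0:   # blend grey -> red
            return (int(128 + 127 * t), int(128 * (1 - t)), int(128 * (1 - t)))
        t = -t       # blend grey -> green
        return (int(128 * (1 - t)), int(128 + 127 * t), int(128 * (1 - t)))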
5.5.4 Discussion
This demo features the senseDesk as a dynamic simulation environment with a tangible interface that allows direct physical manipulation: by changing the spatial configuration of the tangible objects on the table, the user dynamically sets the simulation parameters and the system reacts in real-time. In this application the objects do not serve as tools (i.e. interaction instruments); they serve as tangible representations of a digital task-domain object and, especially in the case of rotation input, also as tangible parameter-controllers. Because the interface offers many physically separate controls, the user can rapidly switch between objects or manipulate a number of them simultaneously with his hands. Using a mouse in a traditional GUI, the user would have to physically grab the input device, select the logical object on screen, manipulate the logical object, then release the object and sequentially repeat the last three steps with each object. With a TUI such as the senseDesk, the user directly grabs one or more physical/logical objects and directly manipulates them without the need for indirect or sequential operations. This space-multiplexed interaction with the system is more enjoyable and also more efficient than the usual time-multiplexed
interaction provided by a GUI with a standard pointing device. I could also observe that the tangible interface encouraged test users to ask "what if" questions about the exhibited physical phenomena, which they subsequently answered by experimenting with the system. Another advantage of the direct access to the senseDesk via multiple physical representations is the stimulation of communication and collaboration: one or more users can simultaneously interact with the table and mutually explore the dynamic digital world, which is a good way to actively learn and understand phenomena in groups. A standard single-user interface consisting of a joystick, trackball or touchscreen would force one user into the role of the master who owns the control, while the other participants have to watch passively, a setting which is less appropriate for interpersonal learning experiences and discussions.

5.5.5 Conclusion and Future Work
forceField allows true hands-on interaction with otherwise "intangible" phenomena, which shows the applicability of TUIs to the domain of simulation and presentation of dynamic systems. Educational institutions like science museums could especially benefit from an installation like forceField, because it combines exploratory interaction and experimenting with advanced visualisation and discussion of scientific phenomena. Considering future work, forceField could be enhanced in several ways: the visualisation could additionally display equipotential lines and field lines. I did not implement this feature in the current version of forceField, because Macromedia Director is too slow for computations and renderings of such complexity. Therefore an interesting alternative to using Director would be interfacing the senseDesk with specialized software packages for visualizing dynamic systems, such as Sysquake from Calerga (http://www.calerga.com). Another issue concerns the missing physical output capabilities of the senseDesk: a GUI relies only on digital representations, which can be controlled by both the user and the system. In a TUI system such as the senseDesk, the user manipulates physical and digital representations, while the system itself can only influence the digital ones. The computer lacks the ability to move the physical objects on the table, which is an imbalance of physical output capabilities between human and computer. As a consequence, the point-charges in forceField are not able to move (i.e. they have to have infinite mass in the physical simulation), because they are attached to the real-world pucks, which cannot be moved by the computer. In contrast, the particles move due to the simulated electric forces, because of their purely digital nature. In order to enhance realism, it would be highly desirable that the point-charges were able to attract or repel each other in the real world, too. Therefore the pucks would have to be mechanically actuated by the system, which makes force-feedback technology necessary: a device similar to an XY-plotter could be mounted underneath the surface of the desk. The pen-holder of the plotter could contain an electromagnet which enables the system to attract, move, and release a tangible object (containing a permanent
magnet) on the top side of the surface. Such a desk could spatially reconfigure the objects on its surface or clean itself up without any physical user intervention.
5.6 spacePlan

5.6.1 Motivation
The forceField application featured small white pucks serving as tangible input objects, which represent manipulable electric point-charges. The beamer gives them red or green colour through projection, depending on their electric polarity in the simulation. Taking only their physical appearance into account, these pucks are plain geometric entities (cylinders) of neutral white colour (Figure 5.6-1). Such objects can be called generic. The main advantage of a generic physical object is its flexibility: the same object can be used for different applications, and additional information can be dynamically projected on it in order to give it a certain meaning or state (as is the case with the digitally coloured charges). On the other hand, specific physical objects are much more common in our real world, because they take advantage of all their physical properties (form, colour, material, etc.) in order to communicate their purpose to the user [Fitz1997]. Because of their immediately perceivable specific appearance, these objects effectively designate a certain usage, task or meaning. This requires less interpretation work from the user and therefore reduces cognitive load: for example, it is obvious that the action figures in Figure 5.6-2 are toys that represent well-known cartoon characters. A tangible user interface which combines the immediacy of specific real-world representations with the flexibility of digital spaces is therefore able to provide natural ways of interaction that make accomplishing complex problem-solving tasks as intuitive and enjoyable as playing with toys.
Figure 5.6-1: Generic physical object.
Figure 5.6-2: Specific physical objects (from Monsters Inc., Disney).
5.6.2 Concept
An interesting application in which external representations play an important role for planning and problem-solving lies in the context of urban and regional planning. In the opinion of E. Arias, the focus of urban planning and design is on decision-making and the implementation of its outcomes, such as policies and plans [Arias1997]. Decision-making
usually happens in a number of meetings, where urban design problems and their solutions are discussed with the concerned stakeholders. In order to enhance discussion and problem-solving among the participants, visualisations such as maps, plans and physical scale models, as shown in Figure 5.6-3, are frequently used. During the course of a meeting, these artifacts usually do not remain untouched and unchanged, because they are an integral part of the collaborative planning process: while the participants explore different design solutions, they visualize their arguments by drawing directly on the maps and plans or by picking up and moving parts of the scale models. According to Arias, this interaction with verbal and physical gestures augmented by external artifacts creates a shared understanding between the stakeholders through collaborative design: using external representations matching the task as objects-to-think-with, users become able to keep track of complex events, participate fully in the design process, and collaboratively construct new knowledge [Arias2000].
Figure 5.6-3: Examples of a map and a scale model used in urban and regional planning.
It is obvious that artifacts like maps, plans, and scale models are of high value for collaborative decision-making in urban planning, but they have a severe shortcoming: due to their very physical nature they are also very static. Admittedly, people can draw on the maps and change parts of the scale model, but these representations cannot measure up to the complexity of effects present in real-world urban systems. Phenomena like solar shadow casting by buildings, pollution, air flow, noise, traffic, etc. are of a highly dynamic nature and demand sophisticated, complex computer-aided simulation tools. Software systems of this kind usually display their output on a monitor or wall-projection and are controlled by a single user who has to be a trained expert; therefore they are not designed for true CSCW (computer-supported cooperative work). Such a solution is too inflexible to be successfully used in meetings, because all participants (from novice to expert) should be able to directly manipulate the representations that support articulating their arguments. A tangible user interface that enables the user to control the digital simulation by direct manipulation of the physical scale model would therefore significantly enhance the discussion process: for example, a participant proposes a different location for a planned shopping mall. As soon as he moves the physical model of the building on the table from location A to location B, the computer projects a new digitally augmented map on it, which reflects the changes in traffic flow and pollution as a result of this activity. Such an immediate visualisation of the consequences of proposed actions would provide a better
understanding of the debated problems, serving as a basis for better-informed collective decisions.

5.6.3 Application
Because the focus of my demos lies on the interface and not on the content, spacePlan displays a fictional scenario in urban planning: two apartment buildings and a factory have to be placed in an optimum configuration, which should yield minimal negative implications for the abutting owners in the discussed region. In order to simplify the fictional debate, I chose shadow casting of the buildings and pollution caused by the factory as the main issues to be discussed. The tangible interface objects are three small building models made of cardboard, each containing two tags for position and rotation tracking. The models differ in form and size; they are specific objects which represent the buildings of interest.
Figure 5.6-4: Specific building model as interface object.
When a user puts a model on the map, it becomes highlighted in red or purple colour, depending on its function (apartment or factory), as shown in Figure 5.6-4. Furthermore, a shadow, perfectly registered to the building model, is rendered and projected on the screen, which creates the illusion of a real solar shadow cast by the physical model. This is a very good example of tangible user interaction: the physical building model and its simulated shadow become perceptually coupled (by accurate tracking and overlay) and form one conceptual entity. The main advantage of this digitally created shadow is its flexibility: depending on the current position of the sun in the simulation, the shadow changes in size and direction as if caused by a real light-source. The location of the sun in the sky is visualised by the diagram in the lower right of the screen and is determined by the current time of day and date, which can be adjusted using the controls at the bottom of the screen. When the buildings are placed on the desk, users can mutually shift and rotate them on the map and explore several possible spatial configurations in order to find the best one.
Figure 5.6-5: The building models become digitally augmented via simulated shadows and smoke.
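The thesis does not detail how the shadow is derived from date and time; assuming the simulation reduces the sun to an azimuth and elevation angle, the basic geometry could look like the following sketch (the function name and angle conventions are my own).

    import math

    def shadow_offset(building_height, sun_azimuth_deg, sun_elevation_deg):
        """Planar offset of a building's shadow tip relative to its footprint.

        The shadow points away from the sun; its length grows as the sun gets lower.
        Azimuth is measured clockwise from north (screen 'up'), elevation above the horizon.
        """
        if sun_elevation_deg <= 0:
            return None                     # sun below the horizon: no shadow to render
        length = building_height / math.tan(math.radians(sun_elevation_deg))
        away = math.radians(sun_azimuth_deg + 180.0)   # direction opposite to the sun
        dx = length * math.sin(away)
        dy = -length * math.cos(away)       # screen y grows downwards
        return dx, dy

    # Example: a 20 m building with the sun in the south-west, 35 degrees above the horizon
    print(shadow_offset(20.0, sun_azimuth_deg=225.0, sun_elevation_deg=35.0))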
With the digital shadows attached to the buildings, planners can immediately check whether one building blocks the sunlight of another one during the course of the day. Furthermore, the factory building pollutes the environment, visualized by clouds of smoke emerging from its stack. This helps stakeholders in judging whether one of the apartment buildings would reside in an unfavourable location because of bad air quality (Figure 5.6-5, picture 3). The path of the smoke and its distribution is determined by the prevalent direction of the wind, which can be dynamically adjusted. The controls for date, time and wind at the bottom of the screen can be adjusted in three different ways: by click-and-drag input with the mouse/trackball, by physically pressing the associated pushbuttons on the chassis, or by using a physical object. The latter is the most innovative one and is therefore explained here in detail: the user takes an interface object (e.g. a white puck), places it on the digital control and starts to rotate it. The selected digital control follows the rotation of the object in real-time, which gives the user haptic qualities similar to a real-world turning knob, as shown in Figure 5.6-6. Additionally, the user can handle two objects at the same time for simultaneously adjusting two digital controls and rapidly switch to other ones by just moving the pucks.
Figure 5.6-6: The physical/digital turning knob.
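As a sketch of the underlying idea rather than the actual senseDesk implementation, the relative rotation of a puck resting on a control could be accumulated into the control's value roughly as follows; the class name, value ranges, and gain are illustrative.

    class TangibleKnob:
        """Maps the rotation of a puck placed on a digital control to that control's value."""

        def __init__(self, value, lo, hi, degrees_per_range=360.0):
            self.value, self.lo, self.hi = value, lo, hi
            self.gain = (hi - lo) / degrees_per_range   # one full turn sweeps the whole range
            self.last_angle = None

        def update(self, puck_angle_deg):
            """Called every frame with the puck's current azimuthal angle (degrees)."""
            if self.last_angle is None:
                self.last_angle = puck_angle_deg        # puck just placed: no value jump
                return self.value
            delta = (puck_angle_deg - self.last_angle + 180.0) % 360.0 - 180.0  # shortest rotation
            self.last_angle = puck_angle_deg
            self.value = min(self.hi, max(self.lo, self.value + delta * self.gain))
            return self.value

        def release(self):
            self.last_angle = None                      # puck lifted off the control

    # Example: a 'time of day' control from 0 to 24 hours
    time_knob = TangibleKnob(value=12.0, lo=0.0, hi=24.0)
    for angle in (10.0, 25.0, 40.0):                    # puck rotated by 30 degrees in total
        time_knob.update(angle)
    print(round(time_knob.value, 2))                    # -> 14.0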
5.6.4 Discussion
The proposed system successfully augments physical models with digitally simulated phenomena, such as solar shadow casting and pollution. The simplicity of the interface and the meeting-table situation of the senseDesk encourage participants to manually move buildings and explore solutions during the course of their discussion. Support for this conclusion comes from the field of cognitive science: experiments prove that working
with physical objects can tremendously help in discussing and solving a problem, because the objects and their configuration embody and retain information about the task state and constraints in a direct way [Kir1995]. As an example, the physical models in spacePlan clearly identify specific buildings, and their spatial configuration represents the currently discussed planning situation. The ability to change the configuration of external objects in an intuitive way in order to quickly try out different scenarios is also an essential human need in problem-solving tasks: Kirsh and Maglio classify human motor activity into either epistemic or pragmatic actions [Kir1994]. Pragmatic actions are goal-oriented; they are based on a goal the user has previously formed in his mind, while epistemic actions have no specific goal available for initial action. Epistemic actions are performed to uncover information that is hidden or hard to compute mentally. Only after receiving feedback, which gives information on the available means, can a goal be formed. Figure 5.6-7 visualizes this difference with a revised version of D. A. Norman's original action cycle, as presented in [Nor1990, p.47]. Good examples of epistemic actions can be found in puzzle games, where users often "think" by physically trying out combinations of pieces instead of solving the puzzle purely in their mind.
[Diagram: the pragmatic action cycle (left) and the epistemic action cycle (right), each connecting task, goal-setting, execution, action, the world, feedback, and evaluation; in the epistemic cycle, action and feedback precede goal-setting.]
Figure 5.6-7: The original action cycle by D.A. Norman [Nor1990] revised for epistemic actions (right).
While the goal-oriented pragmatic actions, which bring the user physically closer to a stated goal, are well known and have been extensively studied in cognitive science and HCI (see Norman [Nor1990], Dix [Dix1998], Laurel [Lau1993], Carroll [Carr2001]), epistemic actions have often been considered unimportant or wasteful, which is true from a pure motor-skill perspective. But since epistemic actions are of an exploratory nature, they are an integral part of collaborative design and problem-solving activities. They are therefore of high value from a cognitive scientist's point of view: users can significantly reduce mental computation effort by manually trying out possible solutions with the help of external representations. Since spacePlan emphasizes simple, everyday modes of interaction with physical models, important exploratory epistemic actions can be carried out without the inhibitions known from indirect manipulation devices such as mouse or trackball. Users can easily try out their ideas with a low level of risk, which fosters a natural, impromptu interaction style in computer supported experimentation and exploration.
I admit that a simulation and collaboration system has to offer more options and tools than spacePlan in order to be useful for real-world urban planning tasks, but nonetheless the demo application gives a good idea of how existing software packages could be interfaced and extended in order to support true CSCW in face-to-face meetings.

5.6.5 Related Work

5.6.5.1 Urp
Urp has been developed by Underkoffler and Ishii [Und1999a] at MIT as part of their research on tools to support urban planning activities. It is technically based on the Illuminating Light framework, which uses optical tracking of coloured dots for tangible input and front projection for output, as already described in chapter 4. The version of Urp presented in [Und1999a] provides more planning tools than the current implementation of spacePlan: in addition to the display of solar shadows, it also renders wind flow around the buildings with acceptable accuracy and measures the real distance between them. Also, the object-tracking surfaces can be tiled and made larger in extent. On the other hand, the senseDesk performs better in terms of tracking speed and accuracy; in addition, its tangible objects can be much smaller than in Urp and have a more pleasant visual appearance, because they do not have to be optically marked for tracking.
Figure 5.6-8: Planning with Urp.
5.6.5.2 EDC
The Envisionment and Discovery Collaboratory (EDC) is an urban planning tool which has been developed at the University of Colorado, Boulder [Arias2000]. The main interface is a horizontal touch-sensitive electronic whiteboard called action space. By placing physical objects with pressure on the surface and moving them, participants directly interact with the content on the whiteboard. It is also possible to create virtual scenes on the surface and draw roads and bus-routes with physical tools. EDC is also able to run simulations of the created objects for evaluating proposed solutions.
Figure 5.6-9: The action space of EDC.
A second, vertical electronic whiteboard offers information concerning the object currently being manipulated and its current context. This reflection space gives a view into dynamic information spaces like the WWW and helps participants understand the current issue at hand, facts surrounding the issue, and related public opinion polls [Arias2000].
Figure 5.6-10: Reflection space.
Figure 5.6-11: The concept of EDC.
Though the touch-screen based handling of the physical objects is not as fluid and intuitive as with the senseDesk and its seamless tracking, I consider the EDC the currently most advanced urban planning system that involves tangible user interfaces: with its combination of Action and Reflection Space and its interface to the WWW, the EDC allows true collocated collaborative work as well as distributed cooperative work, even within large groups of stakeholders.
5.6.6 Conclusion and Future Work
Applications like spacePlan, Urp, and EDC demonstrate that tangible user interfaces are able to play a vital role in interactive systems which support collaborative group work. Intuitive handling and fidelity to reality let physical representations become ideal mediators for human-human and human-machine communication. As technology develops, the underlying input and output systems will be able to track an increasing number of objects with higher speed and accuracy, while becoming less obtrusive to the user. This will foster the integration of such systems into real-world workspaces. As already indicated, the current spacePlan application could be extended in many ways: advanced simulations of air-flow, traffic, or noise could be added, as well as the automatic overlay of real-world constraints such as legally defined boundaries or economic and statistical data about the local population. There is also a need for more tools (similar to those of Urp or EDC) which control the wind, probe certain spots, or define the camera-viewpoint of an additional 3D-visualisation.
5.7 megaPong

5.7.1 Motivation
The first breakthrough game in arcade history was "Pong". The game was created in 1972 by Allan Alcorn, the first engineer employed at Atari, a company founded by Nolan Bushnell. It was a simple game of electronic tennis, consisting of two small white bars that were used as bats, a small square that served as a ball, and a black background. The ball bounced around the screen, off walls and the players' bats (Figure 5.7-1). The object of the game was to shoot the ball behind the opponent's paddle in order to win a point. Due to its novelty and striking simplicity, Pong was a big commercial success from the beginning and has been copied and imitated by many developers and companies since then [Her1994].
Figure 5.7-1: The original Pong arcade game.
The interface hardware of the original arcade game was a simple turning knob, called a "paddle", for each player, which allowed him to control the vertical position of his bat by twisting it. Since then, this one-dimensional control has often been replaced with joysticks, trackballs, etc. in later versions and clones. Nonetheless, all of these devices represent a more or less indirect way of controlling the bats, which could be replaced by a more hands-on solution. A more direct way of interfacing Pong bats can be found in the physical world of pinball centers and recreation halls: in Air Hockey, one of the world's fastest table sports, two players, each with a mallet in his hand, try to strike a puck into the opponent's goal. The puck hovers on a thin cushion of air coming out of the table surface and can therefore reach high speeds (120 km/h and above) due to minimized friction.
Figure 5.7-2: The Air Hockey game in action.
Concerning game-play, Air Hockey is very similar to Pong: two players are located at opposite sides of the field and try to hit a ball/puck in such a way that interception becomes impossible for the opponent. But Air Hockey uses a very different interaction technique due to its analog nature: the players directly act on the puck on the surface through a physical object (the mallet) which serves as an extension of their body. A tangible user interface like the senseDesk could make a new version of Pong possible: the immediacy of Air Hockey could be blended with Pong's digital nature by using physical bats as interface objects for gaming. The players could hit a virtual ball with their real bats, a combination that would form an up-to-date version of a 70's arcade classic.

5.7.2 Concept
The main goal in conceiving megaPong was to create a sophisticated version of Pong which, on the one hand, preserves the essential idea of the original game. On the other hand, it should feature the benefits of using a TUI for entertainment purposes as well as the capabilities of modern hardware. In order to reach the first goal, I kept the original concept of the game almost unmodified: two players stand on opposite sides, one bat for each player. When the ball leaves the field at one player's side, the other player gets a point. At the end, the player with the most points wins.
5.7.2.1 Physics
Sticking close to the original Pong concept is necessary for maintaining the simplicity and the idea of the game, but when it comes to conceiving the game-physics, the qualities of the senseDesk as a TUI have to be taken into account: each player moves a tagged rectangular block on the surface, which acts as the physical/digital bat in the game (Figure 5.7-3). Since the user is able to move the physical bat in an unconstrained way, it does not make much sense to constrain the motion of the associated digital bat to the vertical Y-axis, as was necessary for the paddle-controlled original version. The digital bats have to stay perfectly registered to the physical bats and can therefore have any screen position and azimuthal angle.
Figure 5.7-3: The physical bat used for megaPong.
This choice has consequences for the game-physics of megaPong, because more complex mechanisms such as rigid body dynamics have to be taken into account: since the bats can be moved with three degrees of freedom (x, y, azimuthal angle), they can also be shifted forwards and backwards. Consequently, when the bat hits the ball, the momentum of the bat must influence the ball's momentum, i.e. its speed. Also, the reflection angle of the ball caused by collisions with bats and walls must be computed according to real-world physics. The advantage of this higher calculation complexity lies in enhanced gameplay: for the user, megaPong "feels" very much like a real Air Hockey table and he can influence the behaviour of the ball in many more ways than in the original version of Pong.

5.7.2.2 Multimedia
A Pong clone designed for the year 2001 like megaPong is obliged to take advantage of the graphics and sound capabilities of up-to-date hardware. Therefore I decided that the game experience should be enriched by colourful graphics and animations. The audio layer should also provide more variety than the monotonous beeps of the original version: different one-shot samples and sounds should accompany the current actions, while underlying music loops add a sense of drama. Voice samples like "get ready" or "player one wins" are also a good way to simulate a referee who comments on the current state of the game.

5.7.3 Application
megaPong is used by two players standing on opposite sides of the senseDesk, each one equipped with a physical bat.
Since the bats are real objects which the computer cannot control, many different physical configurations are possible at the beginning of the game: nothing is on the table, one bat is present, both bats are on one side, etc. In order to obtain a defined state for starting the game, megaPong prompts the players by voice to place their bats on designated spots, which are marked with flashing animated text and arrows, while the rest of the graphics stays dimmed in the background (Figure 5.7-4, picture 1). The music loop also uses only sparse sounds to create an atmosphere of suspense. Each player is assigned a colour (red or green) in order to obtain a clear distinction between both parties. When a bat is put on the screen, it is overlaid with the projection of a red or green coloured digital bat, which stays registered to its physical complement. When both players have moved their bats into the start position, the screen flashes and the match begins. When the ball comes into play, it starts from the center of the screen, moving in a random initial direction. It is a composition of rotating yellow squares which perceptually form a fading trail (Figure 5.7-4, picture 2), a reminiscence of the original Pong, where the ball also left a trail on screen because of the slow fade-out of the phosphor layer of 70's CRTs (cathode ray tubes).
Figure 5.7-4: Three different stages of the game.
When the ball hits one of the side-walls, it is reflected back into the game area, accompanied by sound. When a player hits the ball with his bat, the ball bounces off in the physically correct direction, according to the angle of its movement vector and the angle of the rectangular bat. For achieving a higher degree of realism, the component of the bat's translation vector parallel to its collision surface normal introduces an additional momentum on the ball, as shown in Figure 5.7-5.

[Diagram: the calculated path of the ball bouncing off a bat; the vector component of the bat's translational momentum parallel to the surface normal is added to the ball's momentum.]

Figure 5.7-5: Computation of the path of the ball colliding with a bat.
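A simplified 2D sketch of this reflection and momentum transfer is given below; it is not the actual Director implementation, and the transfer coefficient is a hypothetical tuning parameter.

    import math

    def reflect_off_bat(ball_vel, bat_vel, bat_angle_deg, transfer=1.0):
        """Reflect the ball's velocity off a bat and add the bat's momentum component.

        ball_vel, bat_vel: (vx, vy) velocities of ball and bat
        bat_angle_deg:     azimuthal angle of the rectangular bat
        transfer:          how strongly the bat's motion is passed to the ball (tuning value)
        """
        # Surface normal of the bat's long side, derived from its azimuthal angle
        nx = -math.sin(math.radians(bat_angle_deg))
        ny = math.cos(math.radians(bat_angle_deg))

        vx, vy = ball_vel
        v_dot_n = vx * nx + vy * ny
        # Mirror the velocity about the bat surface (angle of incidence = angle of reflection)
        rx, ry = vx - 2.0 * v_dot_n * nx, vy - 2.0 * v_dot_n * ny

        # Component of the bat's translation parallel to the surface normal (Figure 5.7-5)
        b_dot_n = bat_vel[0] * nx + bat_vel[1] * ny
        return rx + transfer * b_dot_n * nx, ry + transfer * b_dot_n * ny

    # Example: ball moving right, bat held vertically and pushed towards the ball
    print(reflect_off_bat(ball_vel=(4.0, 1.0), bat_vel=(-2.0, 0.0), bat_angle_deg=90.0))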
This mechanism enables the player to give the ball some extra speed by hitting it. With time, the ball also gets faster in order to keep the game challenging. megaPong uses stereo sound for one-shot sounds: when the ball is hit on the left side of the field, the sound also originates from the left speaker, significantly enhancing the realism of the game. Because the players are able to azimuthally rotate their bats, I was able to add a little extra to the game: when a bat is twirled through a full rotation (360 degrees), it switches to a "power-up mode". If the player then hits the ball, he launches two bombs which make the opponent's bat explode if the opponent does not avoid them (Figure 5.7-4, picture 3). When a player's bat has been destroyed or the ball leaves his side of the field, the other player gets a point. When one player reaches eight points, the opponent explodes and the game announces the winner with animated text and by voice.

5.7.4 Discussion
megaPong demonstrates how tangible input can extend the concept of an action video game: the tight coupling of real-world objects to digital game objects provides the players with more degrees of control and realism. As a consequence, the underlying game-physics have to take these factors into account by considering the laws of rigid body dynamics (masses, forces, moments), otherwise the illusion of direct physical interaction breaks down. The latency of the interface is an additional critical factor for maintaining this illusion: as soon as the digital environment starts to noticeably lag behind the physical actions of the players, the gaming experience is no longer satisfying. The fact that only the user is able to control the physical objects also has to be considered when designing the game: if special spatial configurations are required at a certain step, the users must be prompted to take the required actions, because the computer is not able to update the physical part of the interface. Users are also able to place the tangible objects at any time at any position of the screen, sometimes yielding illegal or unwanted configurations, situations to which the game has to react properly. Because Air Hockey and Pong are well-known games, the test users were able to play megaPong from the start. Many of them remarked positively that the reaction of the ball was "close to reality", which is a result of the extended game-physics implementation. In the beginning, all participants handled their bats with care, but with time they forgot about the technical nature of the objects and interacted with the senseDesk with almost the same physical enthusiasm as with an Air Hockey table.

5.7.5 Conclusion and Future Work
Action video games are digital real-time environments for entertainment purposes, where players have to react quickly with their body movements. Therefore their physical interaction with the system plays an important role for the overall gaming experience. megaPong demonstrates that the senseDesk with its robust, fast, and accurate tracking can be successfully applied in the domain of action games and entertainment, because it enables direct, physical manipulation of relevant game-objects.
As also mentioned in previous chapters, Macromedia Director is slow in scripted calculations, so I had to simplify the game-physics engine in order to keep overall latency low: only location, direction, speed and momentum of ball and bats are used for computation, and the collision handling uses only rough approximations. A faster implementation of the game engine (e.g. coded in C++) would not only compute collision points and responses with more accuracy, it could also enhance realism by taking effects like torque, dynamic friction, and spin into account. Similar to forceField, megaPong could also benefit from physical output mechanisms: the game would feel much more realistic if the interface provided additional haptic feedback. A bat could physically react to the ball's impact, an effect which can be achieved by stopping the bat by force or making it vibrate with low-frequency sound. Though such mechanisms potentially conflict with the existing electromagnetic sensing mechanism, providing this kind of force-feedback on the bats could be an interesting future research activity.
6 Results

This chapter evaluates the senseDesk project in terms of the experiences gained during design and implementation of the system. In the first part, the advantages of using the senseDesk interface are discussed in comparison to a traditional GUI setup. The second part describes the conceptual challenges and disadvantages of the current system. In the last part, I present a set of general design guidelines, which can be applied to the process of tangible user interface design for the senseDesk.
6.1 Advantages of the senseDesk

• True direct manipulation of digital objects of interest. As discussed in chapter 2, when performing direct manipulation with a GUI interface, the user controls a single mouse-pointer acting as a logical handle for working with interaction instruments. This makes three steps of interaction necessary: (1) acquire physical device (mouse), (2) acquire logical device (instrument), (3) manipulate logical task object [Fitz1995]. Due to this level of indirectness, switching between logical functions becomes costly. Alternatively, the senseDesk TUI uses real objects as physical handles, which represent tangible interaction instruments. By means of these physical instruments, the user can directly operate on task objects, resulting in a more immediate two-step way of interaction: (1) acquire physical device, (2) manipulate logical device. Good examples of this principle are wayFind, where the physical puck is visually and spatially connected to the zooming lens, and humanExplorer with its tangible magnifier. Furthermore, the augmentation of existing physical objects by adding digital output via projection allows for immediate feedback directly in the space where the physical interaction happens: when a user of forceField grasps and moves one of the electric charges on the active surface, the projected output in its proximity reacts instantly. This seamless integration of representation and control gives the user the feeling of directly working on the task without having to deal with a computer-mediated interface.

• Higher representational capabilities by combining the advantages of physical and digital objects. Within a GUI, the user manipulates digital objects by means of generic, standard devices such as raster display, mouse, and keyboard. These devices are cheap and universal, but their physical form gives no useful information about the actual objects, the task, possible commands, or the state of the system [Jac2002]. Because these input devices have no representational significance, all of this information has to be displayed on screen with the help of digital objects such as widgets, dialogues and metaphors (like the desktop, folders, etc.). In contrast, a TUI like the senseDesk lets the user directly manipulate physical tools and objects, which belong to the actual task domain by appearance and function. Examples are the humanExplorer application, which uses a real-world magnifier as interaction device for examining content, or the spacePlan application, which lets users directly manipulate an urban design scenario by means of small scale models of buildings. This combination of digital and physical representations allows for interfaces with less complexity and higher learnability, because the user can directly work with objects and instruments matching the task, without having to adapt to digital tools and metaphors.

• Higher effectiveness by taking advantage of human motor skills and parallel input. Almost every graphical manipulation in GUIs is accomplished by using a pointing device such as a mouse or trackball. Objects and widgets are manipulated with a limited gestural vocabulary (point, click, drag) and the user can only perform one manipulation at a time, because every action is channeled through one pointing device. This time-multiplexed input style forces the user to break up his actions into mutually exclusive sequential manipulations, which demand mental effort and lead to inefficiencies [Fitz1997]. In contrast, the senseDesk TUI supports a space-multiplexed input style by offering parallel input with multiple transducers (= tangible objects). For example, the forceField application associates each electric charge with a graspable puck. Therefore users are able to directly manipulate all objects at the same time, without having to sequentially select and move each one with a pointing device. Another example is megaPong, where players can interact with the ball with many degrees of freedom and gestures, because they handle the digital bats with physical blocks. These types of tangible interfaces tend to be more effective, intuitive, and enjoyable than GUIs, because they make much better use of already learned human motor skills by enabling bi-manual grasping, gesturing and manipulation.

• Support of collocated cooperative work between multiple users. As a direct consequence of providing space-multiplexed input by means of multiple tangible objects, the senseDesk facilitates interaction in a group of collocated persons. Since a GUI channels graphical manipulations through one input device handled by one user, conflicts arise when multiple persons have to share the same interface, which typically happens in meeting situations. The senseDesk TUI application spacePlan is an example of how to minimize this conflict by using small scale building models as input objects: each model can be individually picked up and manipulated by different users without the need for a special pointing tool and user synchronization. Such a democratic interface effectively supports collocated group work, because participants can use physical and digital objects for communication and problem-solving with low additional cognitive effort for handling them.
6.2 Disadvantages of the senseDesk

• Physical intrusion. Since the tangible objects are directly placed into the region of projected graphical output, they always take away a certain amount of screen-space, depending on their size and shape. Especially when many objects are present, they tend to obscure important graphical information and clutter up the interface. Furthermore, the user's arms and hands get in the way of the projection during his interaction, causing additional disturbances of the graphical output of the system. A GUI designer is not confronted with these issues, because the output device (raster display) is physically separated from the input devices (mouse, keyboard). Nonetheless, these problems can be overcome by carefully designing the interface objects and the graphical output in order to minimize these interferences, as shown in the wayFind application.

• Limited transportability. The current version of the senseDesk uses a special working surface in conjunction with a video beamer, a setup which is bulky in transportation and also needs a calibration procedure after installation. This limits the senseDesk to applications with a low demand for transportability, such as an interactive exhibit for museums, a presentation-kit for fairs, or a fixed installation for meeting rooms.

• Higher software design and implementation effort. Software systems based on a GUI are easier to design, because GUIs use a small set of well-understood techniques, which are also extensively supported by standard development tools [Bea2000b]. In comparison, a developer using a TUI like the senseDesk first has to understand the interface paradigm, its usage and implementation, which in general causes higher software development effort. As one step towards minimizing these efforts, I abstracted the TUI interface with the senseDesk software framework described in chapter 4.

• Demand for specialized interface objects. In order to fully exploit the advantages of interaction with physical intermediaries, custom tangible objects have to be designed which fit their intended usage and task. Examples are the building models in spacePlan or the magnifier in humanExplorer. In general, this specialization makes the hardware more expensive than when using a GUI with its standard components such as mouse and keyboard: special attention has to be given to form factors such as ergonomics, affordances, haptic qualities, and the materials used (noise, friction, robustness). These additional design efforts due to specialization can be justified by the advantage of creating an interface which is more intuitive, more efficient, and more enjoyable. I was also able to lessen these efforts by developing a set of semi-universal standard objects, such as the white pucks described in chapter 5.

• Lack of universality. GUIs can be applied to almost any task domain, because they are based on standard all-purpose interaction devices (mouse, keyboard, monitor) with a generic vocabulary of gestures (point, click, drag). This high degree of universality is a key strength of GUIs, and in general it cannot be achieved by TUIs, because they use specific physical forms to represent and manipulate the pieces of data in the system [Ishii1997]. Therefore TUIs have to be tailored to certain tasks and applications with dedicated, specific input devices in order to outperform GUIs. This especially holds true for the senseDesk tangible interface: in the current version of the senseDesk, the interface objects can only be sensed in position and rotation, without the possibility to trigger a mouse-click, which makes the implementation of typical GUI selection techniques difficult. Therefore I added menu-pushbuttons and a trackball to the senseDesk hardware, in order to effectively combine the universality of GUIs and the specific advantages of TUI interaction devices.
6.3 Design Guidelines

After reviewing the results of the senseDesk project and studying related work, I suggest the following set of guidelines concerning the design of desktop-based TUIs:

1. Use physical instances to externalise digital objects of interest.
2. Tightly couple digital representations to their corresponding physical objects.
3. Consider the persistence of physical objects.
4. Utilize the different advantages of generic and specific physical representations.
5. Respect the different mechanical input/output capabilities of user and system.
6. Consider the intrusion of the user's body during interaction.
7. Respect the physical limitations of the user's body.
8. Take advantage of the user's everyday motor skills and practices.
9. Facilitate problem-solving and collaboration by supporting epistemic actions and assuring low risk.

Table 6.3-1: Design guidelines for desktop-based TUIs.
1. Use physical instances to externalise digital objects of interest. Advantages of representing digital objects with physical instances are intuitive handling, truly direct (bi-manual) manipulation, and support of spatial problem-solving and reasoning skills. As discussed in this chapter, objects of interest either belong to the task domain or they are interaction instruments, which let the user act on task domain objects [Bea2000a]. So far, the question of which of these objects to externalise cannot be answered by strict rules. Nonetheless, general tendencies do exist: objects suitable for physical instantiation are static, need continuous representation, and are subject to frequent and direct manipulation. Good examples of such objects are icons, pointers, the lens in wayFind, or the buildings in spacePlan. Objects which should remain virtual are often transient and highly dynamic in function and appearance. Good examples are dynamic text, dialogue boxes, or menus. These objects depend on the flexible nature and redraw capabilities of GUIs and should not be bound to static, physical instances.

2. Tightly couple digital representations to their corresponding physical objects. A TUI uses physical objects for direct, immediate manipulation of digital entities in order to seamlessly couple bits and atoms [Ishii1997]. Therefore the system should connect its digital representations to their corresponding physical objects with minimum spatial and temporal offset. Low spatial offset (i.e. distance between physical and digital object) gives the user the impression of direct manipulation of digital content by means of a physical handle and ensures a strong causality between the actions of the user and the reactions of the system. Therefore the X-ray view is projected directly onto the lens in humanExplorer, and the digital bats are directly attached to the physical ones in megaPong. Minimal temporal offset (i.e. response to the user's actions in real-time) is essential for natural interaction and the realism of the tangible bits illusion: as soon as the digital objects start to noticeably lag behind the movement of their corresponding
physical handles, the user has to adapt his interaction style to the latency of the system, resulting in less speed, less accuracy, and a less satisfying overall experience. For example, when the magnifier of humanExplorer is moved too rapidly, the digital lens is not able to catch up. Consequently the X-ray view is projected partially off the tangible lens and the interface is no longer convincing. Therefore, a TUI should couple digital and physical objects together as tightly as possible.

3. Consider the persistence of physical objects. Digital representations are transient; they can arbitrarily pop up and disappear in the same way, because they are made of bits. Physical objects are made of atoms and are bound to the laws of matter. They cannot arbitrarily appear and disappear; they are persistent. At times, these objects can be physically present in the workspace without actually being necessary or having a function. Therefore they tend to clutter up the interface. In this case, the TUI designer can reduce this clutter by minimizing the total number of different objects, making objects smaller and less specific, and designing the interface in a way that the objects always have a meaningful function.

4. Utilize the different advantages of generic and specific physical representations. Generic physical objects are flexible: they can be coupled to different meanings and functions, and reused for different tasks and applications (e.g. the white pucks in wayFind and forceField). Specific objects have a customized form factor, therefore they are easier to manipulate, and serve as both visual and tactile reminders of the associated tool assignment or represented meaning [Fitz1997]. Good examples of specific objects are the building models in spacePlan, different styluses used in conjunction with graphic tablets, and household tools. On the one hand, such specialized devices can generally be handled with more ease and speed than generic ones, because they take advantage of physical affordances and constraints (like the magnifier in humanExplorer). On the other hand, generic representations can be used for different applications and actions. This reduces the total number of objects necessary in the interface and helps to avoid clutter (see Rule 3). Therefore the TUI designer has to consider these different advantages and trade-offs of generic and specific objects in order to match the physical interface to its designated task domain in an optimal way.

5. Respect the different mechanical input/output capabilities of user and system. In general, computers are able to fully control their visual and auditory output, while their mechanical output capabilities are limited: they need complex mechanical actuators like motors, robot arms, or force-feedback devices. Their performance is limited compared to humans, who can easily manipulate their physical environment with high speed and accuracy. The situation is even worse with the senseDesk: the system has no influence on the physical objects on the table, and only the user is able to manipulate them. Therefore, a TUI developer has to consider such imbalances of mechanical input/output capabilities by designing applications with minimal need for specific object configurations on the working surface or by choosing appropriate human-computer dialogues (e.g. in megaPong, the players are prompted to move their bats to the start position).
6. Consider the intrusion of the user's body during interaction. When working with a GUI, the user manipulates the digital representations indirectly by means of input devices (mouse, keyboard), which are physically separated from the output (screen). TUIs remove this barrier, because the user directly manipulates the digital representations with his body by touching or grasping. Depending on the technology used, the user's body more or less interferes with the digital output: for instance, the senseDesk system projects its output from above onto the working surface and objects. When the user interacts with the system, his arms and hands occlude parts of the projected output and cast shadows. Hence his physical interaction makes parts of the screen unusable for displaying important information. Such occurrences of physical intrusion have to be considered when designing a tangible interface.

7. Respect the physical limitations of the user's body. Since TUIs rely on direct, physical manipulation by the user, the interface designer also has to take various restrictions of the human body into account: people have a limited arm length, so the working surface and the application have to be designed in a way that physical objects are always within reach. Consequently, when designing for the senseDesk, the upper third of the screen should not be used for tangible interaction, because the objects would sometimes be more than 35 cm away and hard to reach. People also have only two hands, therefore no more than two objects should be required to be concurrently manipulated for a certain task. These two examples already show how deeply TUI design is intertwined with the discipline of human ergonomics.

8. Take advantage of the user's everyday motor skills and practices. Through a lifetime of practice, people have learned to manipulate objects and tools by grasping, turning, picking, and using bi-manual techniques. Therefore a TUI should make use of these already learned motor skills and habits. By utilizing tangible objects with affordances and constraints matching their intended usage, users become able to manipulate physical and virtual interface elements with very little learning effort. For example, the interaction instrument in humanExplorer has the appearance of a real-world magnifier, which affords being picked up and used for examining content. wayFind uses a round puck and the analogy of a turning knob in order to suggest rotation for zooming. Such applications of everyday knowledge and principles facilitate the design of highly learnable interfaces.

9. Facilitate problem-solving and collaboration by supporting epistemic actions and assuring low risk. By supporting internal cognition processes with the manipulation of external objects, people can solve task-related problems with lower mental effort than without [Kir1995]. For example, we sometimes use our fingers for counting. This kind of external computation by means of epistemic actions also makes thought processes accessible to other persons, because physical artifacts are used for communicating problems and their solutions. In order to foster these external cognition processes, a TUI should allow for direct, physical manipulation of the actual task objects with minimal mental effort: users should be able to concurrently manipulate multiple objects with many degrees of freedom in order to achieve high input bandwidth. This
minimizes the user's effort in handling the interface. The interface should also provide rapid feedback on the physical input of the user, in a way that he immediately sees the result of his actions in relation to his goals. The system should also support a trial-and-error style of problem-solving by allowing actions with a low risk of harming system and data, especially when multiple users are involved. In spacePlan, for example, users are able to freely manipulate the buildings of interest via small scale models and immediately see the results of different spatial configurations by means of a projected simulation. Therefore such an interface is able to support problem-solving and collaboration very well, because it supports external computation with physical artifacts at a low level of risk.
7 Conclusion

This chapter is structured into two parts. The first part summarizes the work presented in this thesis and its results. The second part provides directions for future work.
7.1 Summary

In this thesis I have described the fundamental concepts of the GUI paradigm and pointed out some of their limitations and shortcomings. I have also discussed several novel interaction paradigms that are able to overcome the inherent weaknesses of GUIs by breaking away from the image of the computer as a monitor, keyboard, and pointing-device based terminal. In particular, I have presented paradigms such as Ubiquitous Computing, Augmented Reality, and Graspable User Interfaces. These paradigms suggest a strong connection between digital and physical realms by means of embedding computers into the physical environment, augmenting the real world with overlaid digital information, or using physical handles for manipulation of digital information. Furthermore, I have described the concepts of Tangible Media and Tangible User Interfaces (TUIs), which use physical objects, instruments, surfaces, and spaces as interfaces to digital information. I have discussed particular examples of TUIs and the interaction mechanisms exemplified by them.

As the main part of this thesis, I have described the design and implementation of the senseDesk, a fully working system which allows user interaction with tangible media. The senseDesk is a tabular surface which electromagnetically tracks the position and orientation of tagged physical objects and digitally augments them by means of video projections.
In order to explore the various possibilities of desk-based TUIs, I have developed several demonstration applications which illustrate the usage of the senseDesk in different application scenarios. I have also discussed various design and implementation issues regarding tangible user interaction. Finally, I have evaluated the advantages and drawbacks of TUIs compared to GUIs by means of the experiences gained in the senseDesk project and the interface examples described in the theoretical part. In addition, I have presented a set of guidelines which can be applied to the process of tangible interaction design. The results of this thesis demonstrate that TUIs should not be considered a universal replacement for GUIs. The GUI concept is neither out-of-date nor ill-conceived: it just does not take the rich modalities between people and their physical environments into account. This neglect can also be seen as an advantage of GUIs, because it can help in escaping certain real-world limitations. Tangible user interfaces should therefore be understood more as a further evolution of GUIs or of direct-manipulation style interaction [Jac2002]. Both types of interface have their specific strengths and weaknesses and represent
only parts of a much larger design space: the domain of the human body interacting with physical and digital spaces.
7.2 Future Work

The tag sensing mechanism of the senseDesk could be improved in many ways, such as:

• Tag size. The current tag size is 3.5 cm in diameter, which limits the minimum size of objects, especially when two tags are needed for rotation sensing. Tags smaller in diameter but still reflecting the same amount of electromagnetic energy would therefore permit input with smaller physical objects such as pens, figures, or fine tools.

• Number of tags. The number of objects which can be simultaneously tracked is limited to eight. By expanding the range of radio frequencies scanned, more tags could be used for sensing. However, since only the frequency of the tags is used for identification, the number of possible IDs is very limited compared to the billions of possible IDs for RFID tags (the sketch after this list illustrates why frequency-only identification yields so few IDs).

• Sensing area. The active sensing area could be increased in size either by scaling the antenna or by tiling multiple antenna boards together, in order to cover larger working surfaces.
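The limitation mentioned under "Number of tags" can be illustrated with a small sketch: if each tag is identified solely by its resonance frequency, every ID needs a frequency slot wide enough to tolerate detuning, so only a handful of IDs fit into the scanned band. The sweep range, slot width, and tolerance below are assumed values for illustration only, not the actual senseDesk parameters.

    # Illustrative sketch (assumed values, not the senseDesk firmware):
    # frequency-only identification assigns each tag ID its own frequency
    # slot, so the ID space stays small compared to bit-encoded RFID IDs.

    def build_slots(f_min_khz: float, f_max_khz: float, slot_width_khz: float):
        """Return the centre frequency assigned to each possible tag ID."""
        slots = []
        f = f_min_khz
        while f <= f_max_khz:
            slots.append(f)
            f += slot_width_khz
        return slots

    def identify(measured_khz: float, slots, tolerance_khz: float):
        """Map a measured resonance frequency to a tag ID, or None."""
        for tag_id, centre in enumerate(slots):
            if abs(measured_khz - centre) <= tolerance_khz:
                return tag_id
        return None

    if __name__ == "__main__":
        # Assumed sweep range and slot width, purely for illustration.
        slots = build_slots(50.0, 250.0, 25.0)
        print(f"{len(slots)} distinguishable IDs")       # only nine slots
        print("measured 101.2 kHz ->", identify(101.2, slots, 5.0))

With the assumed 200 kHz band and 25 kHz slots, only nine IDs can be told apart, which makes the contrast to the practically unlimited, bit-encoded ID space of RFID tags obvious.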
Furthermore, the system could be extended with different input/output modalities, such as:

• Button input. Push-buttons could be attached to the tangible objects for issuing commands and clicks similar to the functionality of common mouse buttons. The buttons could either be realised by direct cable connections or by letting them switch the inductors of their associated tags, causing the tag frequency to change. This would allow the senseDesk TUI to be extended with point-and-click interaction techniques originating from the GUI.

• Touch input. The senseDesk interface could additionally support the activation of projected graphic elements by means of directly touching them. This functionality could be implemented using a resistive touch-screen foil mounted on the active surface.

• Physical output. Since at present only the user is able to move the tangible objects on the surface, the system could benefit from a mechanism which enables it to actuate the objects, too. This could be realised using objects with magnetic bases and a motor-controlled two-axis plotter arm underneath. The ability of the system to actuate the physical objects by itself would not only provide mechanical output to the user, it would also allow remote collaboration between multiple senseDesk installations (a minimal sketch of such an actuation step follows this list).
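As a rough illustration of the proposed physical output channel, the following sketch (hypothetical, not an existing senseDesk feature) moves the plotter arm in small steps towards a target position, for example one received from a remote senseDesk installation whose user has displaced the corresponding object.

    # Illustrative sketch (hypothetical, not an existing senseDesk feature):
    # a two-axis plotter arm under the surface drags a magnet-based object
    # toward a target position in small steps, e.g. to mirror the object's
    # position on a remote installation.
    import math

    def step_towards(current, target, max_step_mm=2.0):
        """Return the next arm position, moving at most max_step_mm per cycle."""
        dx, dy = target[0] - current[0], target[1] - current[1]
        dist = math.hypot(dx, dy)
        if dist <= max_step_mm:
            return target                    # close enough: snap to target
        scale = max_step_mm / dist
        return (current[0] + dx * scale, current[1] + dy * scale)

    if __name__ == "__main__":
        pos, target = (10.0, 10.0), (60.0, 40.0)   # millimetres on the surface
        while pos != target:
            pos = step_towards(pos, target)
            # here the real system would command the plotter motors to 'pos'
        print("object actuated to", pos)

Stepping in small increments rather than jumping directly to the target would keep the motion of the actuated object smooth and predictable for the observing user.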
The development of interfaces based on novel interaction paradigms such as Tangible Media is a matter of many experiments and studies. Nonetheless, the senseDesk project has shown that active surfaces which electromagnetically track multiple physical objects, augmented by video front-projection, are a reliable platform for building and exploring Tangible User Interfaces.
Bibliography

[Arias1997] E. Arias, H. Eden, and G. Fisher, "Enhancing Communication, Facilitating Shared Understanding, and Creating Better Artifacts by Integrating Physical and Computational Media for Design", Conference on Designing Interactive Systems, Amsterdam, NL, 1-12, 1997.
[Arias2000] E. Arias, H. Eden, G. Fisher, A. Gorman, and E. Scharff, "Transcending the Individual Human Mind - Creating Shared Understanding Through Collaborative Design", ACM TOCHI, vol. 7/1, 84-113, 2000.
[Azuma1997] R. T. Azuma, "A Survey of Augmented Reality", Presence: Teleoperators and Virtual Environments, vol. 6/4, 355-385, 1997.
[Azuma2001] R. T. Azuma, Y. Baillot, R. Behringer, S. K. Feiner, S. Julier, and B. MacIntyre, "Recent Advances in Augmented Reality", IEEE Computer Graphics and Applications, vol. 21/6, 34-47, 2001.
[Bea2000a] M. Beaudouin-Lafon and W. E. Mackay, "Reification, Polymorphism and Reuse: Three Principles for Designing Visual Interfaces", AVI'2000, Palermo, Italy, 102-109, 2000.
[Bea2000b] M. Beaudouin-Lafon, "Instrumental Interaction: An Interaction Model for Designing Post-WIMP User Interfaces", CHI'2000, The Hague, NL, 446-453, 2000.
[Bier1993] E. A. Bier, M. C. Stone, K. Pier, W. Buxton, and T. D. DeRose, "Toolglass and Magic Lenses: The See-Through Interface", International Conference on Computer Graphics and Interactive Techniques, Anaheim, California, 73-80, 1993.
[Broll2000] W. Broll, E. Meier, and T. Schardt, "The Virtual Round Table - A Collaborative Augmented Multi-User Environment", CVE'2000, San Francisco, California, USA, 39-46, 2000.
[Bux1995] W. Buxton, "Integrating the Periphery and Context: A New Model of Telematics", Graphics Interface '95, 239-246, 1995.
[Carr2001] J. M. Carroll, Human-Computer Interaction in the New Millennium. Massachusetts: Addison-Wesley, 2001.
[Dahl1998] A. Dahley, C. Wisneski, and H. Ishii, "Water Lamp and Pinwheels: Ambient Projection of Digital Information into Architectural Space", CHI'98, Los Angeles, California, USA, 269-270, 1998.
[Dix1998] A. J. Dix, J. E. Finlay, and G. D. Abowd, Human-Computer Interaction. Prentice Hall Europe, 1998.
[Fink1999] K. Finkenzeller, RFID-Handbuch. München: Carl Hanser Verlag, 1999.
[Fitz1995] G. W. Fitzmaurice, H. Ishii, and W. Buxton, "Bricks: Laying the Foundations for Graspable User Interfaces", Conference on Human Factors and Computing Systems, Denver, Colorado, 442-449, 1995.
[Fitz1997] G. W. Fitzmaurice and W. Buxton, "An Empirical Evaluation of Graspable User Interfaces: Towards Specialized, Space-multiplexed Input", CHI'97, Atlanta, Georgia, USA, 43-50, 1997.
[Gee2001] D. R. McGee and P. R. Cohen, "Creating Tangible Interfaces by Augmenting Physical Objects with Multimodal Language", International Conference on Intelligent User Interfaces 2001, Santa Fe, New Mexico, 113-119, 2001.
[Gor1998] M. G. Gorbet, M. Orth, and H. Ishii, "Triangles: Tangible Interfaces for Manipulation and Exploration of Digital Information Topography", CHI'98, Los Angeles, California, USA, 49-56, 1998.
[Guim2001] F. Guimbretière, M. Stone, and T. Winograd, "Fluid Interaction with High-resolution Wall-size Displays", UIST'2001, Orlando, USA, 21-30, 2001.
[Her1994] L. Herman, Phoenix: The Fall and Rise of Home Videogames. Rolenta Press, 1994.
[Hsiao1999] K. Y. Hsiao and J. Paradiso, "A New Continuous Multimodal Musical Controller Using Wireless Magnetic Tags", International Computer Music Conference, 24-27, 1999.
[Hsiao2001] K. Y. Hsiao, "Fast Multi-Axis Tracking of Magnetically-Resonant Passive Tags: Methods and Applications", Master's Thesis, Department of Electrical Engineering and Computer Science, MIT, 2001.
[Ishii1997] H. Ishii and B. Ullmer, "Tangible Bits: Towards Seamless Interfaces Between People, Bits and Atoms", Conference on Human Factors and Computing Systems, Atlanta, Georgia, USA, 234-241, 1997.
[Ishii1998] H. Ishii, C. Wisneski, S. Brave, A. Dahley, M. Gorbet, B. Ullmer, and P. Yarin, "ambientROOM: Integrating Ambient Media with Architectural Space", CHI'98, Los Angeles, California, USA, 173-174, 1998.
[Ishii1999] H. Ishii, R. Fletcher, J. Lee, S. Choo, J. Berzowska, C. Wisneski, C. Cano, A. Hernandez, and C. Bulthaup, "musicBottles", SIGGRAPH'99, Los Angeles, California, USA, 172-173, 1999.
[Ishii2001] H. Ishii, S. Ren, and P. Frei, "Pinwheels: Visualizing Information Flow in an Architectural Space", Extended Abstracts of CHI'01, 2001.
[Jac2002] R. J. K. Jacob, H. Ishii, G. Pangaro, and J. Patten, "A Tangible Interface for Organizing Information Using a Grid", CHI'2002, Minneapolis, Minnesota, USA, 2002.
[John1989] J. Johnson, T. L. Roberts, W. Verplank, D. C. Smith, C. Irby, M. Beard, and K. Mackey, "The Xerox 'Star': A Retrospective", IEEE Computer, vol. 22/9, 11-26, September 1989.
[Kato2000] H. Kato, M. Billinghurst, I. Poupyrev, K. Imamoto, and K. Tachibana, "Virtual Object Manipulation on a Table-Top AR Environment", ISAR 2000 Conference, 2000.
[Kir1994] D. Kirsh, "On Distinguishing Epistemic from Pragmatic Action", Cognitive Science, vol. 18/4, 513-549, 1994.
[Kir1995] D. Kirsh, "Complementary Strategies: Why We Use Our Hands When We Think", 17th Annual Conference of the Cognitive Science Society, Hillsdale, NJ, 1995.
[Kloss2000] M. Kloss, Lingo Objektorientiert. Bonn: Galileo Press, 2000.
[Lau1993] B. Laurel, Computers as Theater. Massachusetts: Addison-Wesley, 1993.
[Leske2000] C. Leske, T. Biedorf, and R. Müller, Director 8 für Profis. Bonn: Galileo Press, 2000.
[Mat1997] N. Matsushita and J. Rekimoto, "HoloWall: Designing a Finger, Hand, Body, and Object Sensitive Wall", 10th Annual ACM Symposium on User Interface Software and Technology, Banff, Alberta, Canada, 209-210, 1997.
[Nor1990] D. Norman, The Design of Everyday Things. New York: Doubleday Books, 1990.
[Par1997] J. Paradiso, C. Abler, K. Y. Hsiao, and M. Reynolds, "The Magic Carpet: Physical Sensing for Immersive Environments", CHI'97, Atlanta, Georgia, USA, 277-278, 1997.
[Pat2000] J. Patten and H. Ishii, "A Comparison of Spatial Organization Strategies in Graphical and Tangible User Interfaces", DARE'2000, Elsinore, Denmark, 41-50, 2000.
[Pat2001] J. Patten, H. Ishii, J. Hines, and G. Pangaro, "Sensetable: A Wireless Object Tracking Platform for Tangible User Interfaces", SIGCHI'2001, Seattle, Washington, 253-260, 2001.
[Rek1999] J. Rekimoto and M. Saitoh, "Augmented Surfaces: A Spatially Continuous Work Space for Hybrid Computing Environments", CHI'99, Pittsburgh, Pennsylvania, USA, 378-385, 1999.
[Rek2000] J. Rekimoto and E. Sciammarella, "ToolStone: Effective Use of the Physical Manipulation Vocabularies of Input Devices", UIST'2000, San Diego, California, USA, 109-117, 2000.
[Shn1983] B. Shneiderman, "Direct Manipulation: A Step Beyond Programming Languages", IEEE Computer Graphics and Applications, vol. 16/8, 57-69, 1983.
[Shn1997] B. Shneiderman, "Direct Manipulation for Comprehensible, Predictable and Controllable User Interfaces", IUI'97, Orlando, Florida, USA, 33-39, 1997.
[Shn1998] B. Shneiderman, Designing the User Interface: Strategies for Effective Human-Computer Interaction, 3rd ed. Addison-Wesley Longman, Inc., 1998.
[Smith1982] D. Smith, C. Irby, R. Kimball, W. Verplank, and E. Harslem, "Designing the Star User Interface", Byte, 242-282, 1982.
[Staf1996] Q. Stafford-Fraser and P. Robinson, "BrightBoard: A Video-Augmented Environment", CHI'96, 134-141, 1996.
[Str1999] N. A. Streitz, J. Geißler, T. Holmer, S. i. Konomi, C. Müller-Tomfelde, W. Reischl, P. Rexroth, P. Seitz, and R. Steinmetz, "I-LAND: An Interactive Landscape for Creativity and Innovation", CHI'99, Pittsburgh, Pennsylvania, USA, 120-127, 1999.
[Stre1998] N. A. Streitz and D. M. Russell, "Basics of Integrated Information and Physical Spaces: The State of the Art", CHI'98, Los Angeles, California, USA, 273-274, 1998.
[Stri1998] J. Strickon and J. Paradiso, "Tracking Hands Above Large Interactive Surfaces with a Low-Cost Scanning Laser Rangefinder", CHI'98, Los Angeles, California, USA, 231-232, 1998.
[Szal1997] Z. Szalavári and M. Gervautz, "The Personal Interaction Panel - A Two-Handed Interface for Augmented Reality", EUROGRAPHICS'97, Budapest, Hungary, 335-346, 1997.
[Ull1997a] B. A. Ullmer and H. Ishii, "Models and Mechanisms for Tangible User Interfaces", Master's Thesis, Massachusetts Institute of Technology, 1997.
[Ull1997b] B. Ullmer and H. Ishii, "The metaDESK: Models and Prototypes for Tangible User Interfaces", UIST'97, 223-232, 1997.
[Ull1998] B. Ullmer, H. Ishii, and D. Glas, "mediaBlocks: Physical Containers, Transports, and Controls for Online Media", SIGGRAPH'98, Orlando, Florida, USA, 379-386, 1998.
[Ull2000] B. A. Ullmer and H. Ishii, "Emerging Frameworks for Tangible User Interfaces", IBM Systems Journal, vol. 39/3-4, 915-931, 2000.
[Und1999a] J. Underkoffler and H. Ishii, "Urp: A Luminous Tangible Workbench for Urban Planning and Design", CHI'99, 386-393, 1999.
[Und1999b] J. Underkoffler, B. Ullmer, and H. Ishii, "Emancipated Pixels: Real-World Graphics in the Luminous Room", SIGGRAPH'99, Los Angeles, USA, 385-392, 1999.
[Weis1991] M. Weiser, "The Computer of the 21st Century", Scientific American, vol. 265/3, 66-75, 1991.
[Weis1995] M. Weiser and J. S. Brown, "Designing Calm Technology", Powergrid Journal, V1.01, 1995.
[Well1993] P. Wellner, "Interacting With Paper on the Digital Desk", Communications of the ACM, vol. 36/7, 86-96, 1993.