Supporting User Interface Evaluation of AR Presentation and Interaction Techniques with ARToolkit

Volker Paelke(+), Jörg Stöcklein(+), Christian Reimann(+), Waldemar Rosenbach(*)
University of Paderborn(+), Siemens Business Services(*)
C-LAB Visual Interactive Systems Group
Fürstenallee 11, D-33102 Paderborn, Germany
{vox, ozone, reimann, bobka}@c-lab.de

Abstract
Usability-oriented design is essential for the creation of efficient, effective and successful real-world AR applications. The creation of highly usable AR interfaces requires detailed knowledge about the usability aspects of the various interaction and information presentation techniques that are used to create them. Currently, expertise in the AR domain with regard to efficient and effective visual presentation techniques and corresponding interaction techniques is still very limited. To resolve this problem it is necessary to support designers of AR user interfaces with a knowledge-base that covers the AR-specific aspects of various information presentation and interaction techniques. This knowledge can only be gathered by evaluating different techniques in systematic usability tests. To make the systematic evaluation of a variety of AR information presentation and interaction techniques viable, we have created a workflow that supports the fast and easy creation of the necessary test applications. The workflow uses well-established tools, including Maya for 3D modeling, the i4D graphics system for rendering and ARToolkit for tracking, as well as some new custom developments that integrate them into a coherent workflow. This approach enables the creation of the small-scale AR applications that are required for user tests with minimal effort and thus enables us to systematically compare different approaches to common AR user interface design problems.

1. Introduction
Advances in mobile computing and wireless communication technology now enable the creation of interactive multimedia applications on a variety of mobile devices, ranging from mobile phones and smartphones to PDAs and other portable computing devices. Location-based services, mobile e-commerce, entertainment, maintenance support and virtual prototyping are often envisioned as future applications of these technologies. AR as an interface technology has high potential for the creation of intuitive and highly usable user interfaces in these domains, especially for the presentation of spatially organized information. However, research in AR has so far focused primarily on fundamental technological issues, with limited attention to usability. We believe that usability will be of critical importance for the development of successful real-world AR applications. Despite advances in several areas, the creation of attractive and usable AR user interfaces is still hindered by a lack of related design experience, guidelines, processes and corresponding tools. Key problems in the user interface design for AR applications are the development of appropriate visual presentation designs and the implementation of suitable interaction mechanisms, because these differ significantly from the well-known desktop GUI design domain ([4,5]). As a foundation for future usability-centred AR application design it is therefore necessary to create a knowledge-base of AR user interface design expertise. As a first step towards informed user-centred design of AR applications and their content, we have started to study two basic aspects of the user interface of every AR application: information presentation and content selection techniques.
To compare different approaches to these problems in a useful way, we employ an experimental approach in which a number of different information presentation or content selection techniques are used in simple test applications that are evaluated with real users. This approach requires the creation of many simple AR applications that differ only in the information presentation technique (or interaction technique) under examination. To create these “variations” of the test application “theme” in an effective way, we employ a set of existing tools that we have combined with some new developments and techniques into an affordable workflow. The approach uses ARToolkit for position tracking, both during the AR application design phase and in the actual tests conducted with the finished AR applications.
2. The Evaluation Environment

To be effective, a usability evaluation technique should be expressive, easy to use, fast and low-cost, and should require a minimal number of test subjects (which are usually difficult or costly to obtain). Research in human-computer interaction has therefore developed numerous usability evaluation techniques with different advantages and limitations ([8,9]). While an empirical evaluation of a system with real users typically provides the most expressive results, it also tends to incur high costs, requires working prototypes and often requires a large number of test subjects. Evaluation techniques like cognitive walkthrough, heuristic evaluation, design reviews and model-based evaluation have therefore been developed to decrease these requirements, often replacing tests with the application of existing design expertise and guidelines. However, these techniques cannot easily be applied to AR interaction, because only limited design expertise and few design guidelines currently exist for this domain. This leaves the empirical method of experimental evaluation as the most promising technique for AR user interface evaluation. The design of such an empirical evaluation requires a careful choice of test subjects, variables and hypotheses. Once the test design has been completed, the application must be instrumented to record and save the experimental data. In an experimental evaluation the test subjects then use the instrumented application under controlled conditions. The independent variable (e.g. visual presentation or interaction technique) designated by the test designer is manipulated to produce different conditions, while the dependent variables (e.g. completion time, error rate) are recorded to test the hypotheses. The cost of ad-hoc creation of test AR applications that differ in the variables under examination would be prohibitive if more than a small number of visual presentation or interaction techniques are to be examined. We have therefore decided to build a fairly generic AR test-bed application in which only a small part is modified for each test. The modification is furthermore supported by an efficient workflow and corresponding tools. The test-bed application is based on previous experience with an AR-based museum guide [7]. The museum guide application uses the infrastructure created as part of the AR-PDA project [1] and provides guidance through the museum as well as information on individual exhibits. The original AR-ENIGMA illustration was designed to explain the history, function and operation of the Enigma encryption machine (Fig. 1).
Figure 1. The AR-ENIGMA Illustration
The Enigma is a mechanical encryption machine that was used by the German military in WW2. It became famous when – after decades of secrecy – the successful attacks of the Allied codebreakers became public knowledge in the 1970s. Since then the Enigma has continued to captivate public interest, due to the vital role that intelligence from deciphered Enigma intercepts played in the Allied war effort, and because the seemingly impossible task of breaking the Enigma ciphers was achieved by a fascinating combination of mathematical brilliance, daring espionage actions and one of the first “industrialized” information processing applications: the electromechanical “bombe” designed by Alan Turing and Gordon Welchman, one of the predecessors of today's computers. The AR-ENIGMA application enables museum visitors to actively explore and experience the operation of an Enigma encryption machine in the Heinz Nixdorf MuseumsForum [3]. The application provides background information on a wide range of topics, from the Enigma's historical context and the basics of cryptography to the mathematical foundations of the enciphering algorithm implemented by the Enigma and the techniques employed by the successful Allied codebreakers. In an interactive simulation mode visitors can use a virtual keyboard on the PDA display to encipher and decipher messages. The events that correspond to this enciphering in a real Enigma machine (e.g. key-presses, rotor movements) are illustrated in a 3D model of the Enigma that is registered with its real-world counterpart and displayed on the PDA. The simulation allows arbitrary start-states to be configured (selection and position of rotors, which correspond to different keys in the Enigma's cipher) and enables several visitors to explore the functionality of the Enigma independently while the physical exhibit remains safely in its glass box (Fig. 2).
Figure 2. AR-ENIGMA: Illustration and Simulation Mode
For the evaluation environment we have developed an AR illustration of another exhibit, the Brunsviga RK, a mechanical calculator from the 1920s that uses pin-wheels to perform addition, subtraction, multiplication and division (see Fig. 8). The advantage of the Brunsviga is that the exhibit is far less expensive and less fragile, so that users of the AR illustration can experiment with the actual machine and are not restricted to a simulation as with the AR-ENIGMA. When conducting user tests, the existing AR museum guide is first used to familiarize test users with the general AR setup and the application context (see Fig. 3).
The AR-based museum guide has several features that make it useful as a test-bed for usability tests:
• The explanation of a museum exhibit like the Brunsviga RK offers a wide range of presentation tasks that are representative of many AR applications.
• Similarly, a wide range of interaction tasks can be examined in the museum scenario, especially when users can physically manipulate the illustrated object, as with the Brunsviga RK.
• The museum provides a controlled environment (lighting, noise, climate) that eliminates many common problems of AR applications in less controlled locations that could otherwise interfere with the evaluation.
• The museum setup makes it possible to install recording equipment for the tests, a prerequisite for the interpretation of the test results.
• The museum setup enables the use of cables both for power supply and data transmission and thus reduces problems associated with power limitations, EM interference and lag that could otherwise distort the evaluation results.
• The museum provides easy access to test users.
Figure 3. Museum Guidance Introduction
For the evaluation of a specific visual presentation or interaction technique, test users are then presented with a specific information gathering or interaction task that they have to perform using a version of the Brunsviga AR illustration that has been modified to feature the technique under examination.
3. Requirements

To derive useful information on information presentation and content selection techniques from tests, it is necessary to compare different versions of the same museum application in which only a single aspect of the user interface is changed. For example, in the explanation of the function of a lever on the Brunsviga RK, the same content selection technique could be used in all “variations” while different approaches to information presentation are employed. These “variations” of the Brunsviga RK illustration “theme” are then tested by a group of users to derive the necessary insight into the differences and advantages of the different information presentation techniques. The creation and examination of these kinds of test applications results in a number of requirements, both at design-time, when the AR application and its content are created, and at run-time, when users interact with the resulting application as part of a test.

At design-time the main requirements are:
• It must be easy and fast to create new presentation techniques.
• It must be easy and fast to create new interaction techniques.
• It must be easy and fast to create a spatial reference between the information presentation and the physical exhibit.
At run-time the resulting AR application must support the following requirements:
• Reliable real-time tracking
• Real-time 3D rendering of the (possibly animated) presentation techniques and presentation on either optical or video see-through displays
• Handling of user interaction
• Recording of user interaction
The following two sections describe how these requirements are addressed in our approach. It should be noted that since we use a test-bed application (the museum guide) and focus on a specific application context (the museum), the task is much simpler than in a general AR content creation situation.
4. Design-Time Support

4.1. Fast Creation of Presentation Techniques: In our approach the presentation techniques that are required for the tests can be easily created in the well-known 3D modeling package Maya [10] (see Fig. 4).
Figure 4. Presentation Technique Development
We use the MEL scripting language supplied by Maya to create a custom export format that can be used in the AR application to load the presentation technique designs at run-time (see below). Using Maya, it is possible to create a variety of presentation techniques for a similar purpose quickly and with optimal tool support. Our custom Maya export function is not limited to static presentation techniques but also supports the use of animation as part of a presentation technique (Fig. 5).
Figure 5. Maya to i4D Export

4.2. Fast Creation of Interaction Techniques: The i4D system that we use as the run-time platform for our test application is fully integrated with the Tcl/Tk scripting language [6]. New interaction techniques can be scripted easily in Tcl/Tk using the i4D behaviour concept. Since only the individual new interaction technique under test has to be scripted, the performance penalty incurred by the interpretation of Tcl/Tk is usually acceptable. Otherwise it is also possible to implement new interaction techniques in C++.
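To illustrate the scripting level at which such techniques can be written, the following minimal sketch implements one possible interaction technique, a dwell-based selection, in plain Tcl. The i4D behaviour API is not reproduced in this paper, so the way pointer positions are delivered and the selectPointOfInterest callback are our assumptions; only the selection logic itself is shown.

# Hypothetical dwell-based selection: trigger a callback when the
# tracked point stays within a small radius for a minimum dwell time.
set dwell(start) 0
set dwell(anchor) {0 0}

proc updatePointer {x y} {
    global dwell
    set now [clock clicks -milliseconds]
    foreach {ax ay} $dwell(anchor) break
    if {[expr {hypot($x - $ax, $y - $ay)}] > 15} {
        # Pointer moved too far: restart the dwell timer at this position
        set dwell(anchor) [list $x $y]
        set dwell(start) $now
    } elseif {$now - $dwell(start) > 800} {
        selectPointOfInterest $x $y   ;# hypothetical application callback
        set dwell(start) $now         ;# avoid immediate re-triggering
    }
}

The radius (15 pixels) and dwell time (800 ms) are illustrative values; in a real test they would themselves be candidates for experimental variation.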
4.3. Establishing Spatial Reference: Many information presentations in AR applications are spatially located with respect to an object. This usually involves measurement of the intended locations and careful calibration and fine-tuning of the resulting positions. To simplify this process we use ARToolkit not only for tracking at run-time but also to establish a spatial reference between an object and its associated information presentation techniques. To achieve this we use so-called “Post-It” markers in addition to the marker(s) that are used for tracking. Each information presentation technique is assigned to one “Post-It” marker. The marker is simply placed at the desired position on the object surface. ARToolkit is then used to determine the positions of both the tracking marker and the “Post-It” marker. The relative offset between the two positions determines the spatial reference that is later used in the application at run-time. To minimize errors the camera can be placed on a tripod and the resulting positions and orientations can be averaged over several “measurements”, e.g. using a median filter.
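The offset computation itself is simple; the following sketch, in plain Tcl, derives a component-wise median offset from repeated position measurements. It assumes positions arrive as {x y z} lists (e.g. from the ARToolkit component described in Section 5.1); the median and markerOffset procedures are our helpers, not part of i4D.

# Median of a list of numbers (robust filter over repeated measurements)
proc median {values} {
    set sorted [lsort -real $values]
    set n [llength $sorted]
    set mid [expr {$n / 2}]
    if {$n % 2 == 1} {
        return [lindex $sorted $mid]
    }
    return [expr {([lindex $sorted [expr {$mid - 1}]] + [lindex $sorted $mid]) / 2.0}]
}

# Component-wise offset between the tracking marker and a "Post-It"
# marker, each given as a list of repeated {x y z} measurements.
proc markerOffset {trackSamples postitSamples} {
    set offset {}
    foreach axis {0 1 2} {
        set t {}
        set p {}
        foreach s $trackSamples  { lappend t [lindex $s $axis] }
        foreach s $postitSamples { lappend p [lindex $s $axis] }
        lappend offset [expr {[median $p] - [median $t]}]
    }
    return $offset  ;# {dx dy dz}, assuming both markers are seen from one camera pose
}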
5. Run-Time Support
Run-time support is provided by extension components in the i4D framework [6] (Fig. 6). i4D is a high-level, component-based library for interactive 3D animations. Based on the actor model, an i4D application consists of a number of actors placed on stages. Actors include conceptual objects like lights and cameras, visual 3D objects and software elements without a visual representation. Actors perform actions (e.g. animations, sending messages, sound) by continuously modifying their attributes. Conceptually, a stage is viewed by a number of cameras and displayed on monitors, which are themselves actor objects. The assignment of monitors, cameras, stages and actors can be altered interactively at runtime. Similar to scene graphs, actors can be hierarchically structured. The new visual presentation or interaction techniques to be tested are specialized actors in the i4D framework.
Figure 6. i4D System Architecture

In contrast to existing scene graph APIs, i4D provides access to high-level components that can be loaded into the system at runtime and can be combined into more complex building blocks. i4D can be extended with additional components, including third-party components. We have exploited these mechanisms to extend the basic i4D toolkit with the AR capabilities required to conduct the intended user tests, namely through the integration of video streams and the marker-based tracking capabilities provided by ARToolkit [11].
5.1. Real-Time Tracking: At run-time we use the tracking functionality provided by ARToolkit to establish the current position of the exhibit in real-time. To simplify the integration with graphics generation and interaction handling, ARToolkit has been integrated as a component into the i4D system, where it provides the AR application with a transformation matrix [12].
5.2. 3D Rendering: For rendering we use the i4D system. The extensibility of i4D through new components has been exploited to integrate AR support, e.g. video compositing for video see-through applications. A special extension created for this purpose allows the information presentation techniques created in Maya to be imported directly as new actors into i4D. This component also includes the capabilities necessary to control the animation of animated techniques defined in Maya [12].
5.3. User Interaction: User interaction is also handled by i4D, exploiting the Tcl/Tk scripting capabilities. The scripting interface provides access to the complete i4D functionality – including the AR extensions – and is independent of the host language used, since a native interpreter is integrated in i4D. The current version of i4D uses Tcl/Tk as its host language.
5.4. Recording of User Interaction: Through a simple extension, all user interactions are logged to a file by i4D and are thus available for later analysis.
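Conceptually the extension does little more than append timestamped events to a log file. A minimal sketch of such a logger in plain Tcl might look as follows; the file name and event fields are illustrative assumptions, not the actual i4D log format.

# Minimal interaction logger: append timestamped events to a file.
set logFile [open "session.log" a]
fconfigure $logFile -buffering line

proc logEvent {event args} {
    global logFile
    puts $logFile "[clock clicks -milliseconds] $event $args"
}

logEvent selection lever-3      ;# e.g. user selected a point of interest
logEvent task-complete trial-7  ;# used later to compute completion times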
6. Example

The basic structure of an i4D AR application is best illustrated with a simple example. The following i4D/Tcl script creates a “HelloARWorld”: if the specified pattern is recognized, a simple red cube is projected:
load Illu   ;# load the i4D library into Tcl
Create Stage s
s Create Camera c
s Create LiveCam lc
MainMonitor SwitchTo s.c
s.background s.lc
s Create DirectionalLight dl orient ((1 1 1) 30)
s Create RealObject ro1
s.ro1 Create 3DObject Cube o1 color (1 0 0)
s Create PatternRecognition pr
s.pr InsertPattern s.ro1 "Data\\patt.sample1"
s.pr.picture s.lc
s.lc Start
Using Tcl/Tk scripts like the one above, it is easy to modify the Brunsviga RK test-bed application to cover different visual presentation or interaction techniques. Figure 7 shows a scenario in which different highlighting techniques are used to indicate points of special interest on the Brunsviga RK.
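As an illustration, such a “variation” script might differ from the Section 6 listing only in the actor attached to the recognized pattern. In the sketch below the red cube is replaced by a presentation technique exported from Maya; the Import3DObject command and the file name are assumptions modeled on the listing above, not documented i4D API.

;# Variation of the Section 6 script: the red cube is replaced by an
;# animated highlight exported from Maya (Import3DObject and the file
;# name are illustrative assumptions).
load Illu
Create Stage s
s Create Camera c
s Create LiveCam lc
MainMonitor SwitchTo s.c
s.background s.lc
s Create DirectionalLight dl orient ((1 1 1) 30)
s Create RealObject ro1
s.ro1 Import3DObject highlight "Data\\highlight_arrow.i4d"
s Create PatternRecognition pr
s.pr InsertPattern s.ro1 "Data\\patt.sample1"
s.pr.picture s.lc
s.lc Start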
Figure 7. Different Highlighting Techniques Indicate Points of Interest
In the same way, different interaction techniques can be evaluated using the Brunsviga RK test-bed. Since we currently explore interaction and presentation techniques for the AR-PDA scenario, in which an AR-enabled PDA is used to present AR content, the interaction techniques can be based on the information provided by the camera, the buttons on the PDA, or pen-based interaction. In head-mounted display (HMD) based applications the set of applicable interaction techniques obviously differs, so additional experiments are necessary for the corresponding techniques. The Brunsviga RK test-bed application is designed in a way that allows it to be extended to other AR input and output devices such as HMDs or different tracker technologies. We plan to use these capabilities in future experiments with completely vision-based interaction techniques that exploit the video information provided by the camera on an HMD to detect simple user gestures that are coupled to selection and command actions.
Figure 8. The Brunsviga RK Illustration
To simplify the tests and to avoid problems with lag, limited power supply and limited processing power on the PDA, we perform the tracking and graphics rendering on a PC server. The PDA is only used to display the resulting graphics stream and to handle user interaction, which is then transmitted as a stream of events to the server. Data transmission from the camera to the server and back to the PDA can use either a wireless connection, as in the original AR-ENIGMA application, or a direct wire-based connection. When conducting user tests, the high power requirements of the camera and the wireless LAN card lead to very short battery runtimes for the PDA, so a cable is typically used for the power supply; this does not interfere with the test scenario, which takes place in a limited area around the exhibit (see Fig. 9).
Figure 9. Hardware Setup
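On the PDA side, forwarding interaction as an event stream requires very little code. A minimal sketch in plain Tcl follows; the host name, port and line-based message format are our assumptions, since the actual AR-PDA protocol is not described here.

# Forward PDA interaction events to the rendering server as a
# line-based stream (host, port and format are illustrative assumptions).
set server [socket "ar-server" 5000]
fconfigure $server -buffering line

proc sendEvent {type args} {
    global server
    puts $server "[clock clicks -milliseconds] $type $args"
}

sendEvent pen-down 120 96   ;# pen tap on the PDA display
sendEvent button select     ;# hardware button mapped to a selection action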
7. Conclusions and Outlook
Our system is currently used to study several information presentation and content selection techniques for the AR-PDA project. As a first subject, the question of appropriate content selection techniques is addressed. Based on the 3D position information provided by AR tracking, several camera-based approaches to the selection of augmentation information can be used. For example, an AR-PDA user interface can be designed around the magic lens metaphor. In this case the information that is spatially related to a certain location on the illustrated object is shown when that location is in the viewing area of the camera. The technique can be refined to adjust the amount of detail presented according to the distance between the camera and the object. De-selection of the information presentation can be either explicit, by user action, or implicit, when the point of interest leaves the field of view (although this could be unintentional or due to loss of tracking information). Alternatively, an interface with the same content selection functionality can be based on the gun-sight metaphor, in which the user is presented with a cross-hair on the video image and additional content information is selected by an explicit trigger action. In our test-bed application all these AR-based interaction techniques can be compared to conventional pen-based menus as a base-line (Fig. 10).
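The distance-dependent refinement of the magic lens can again be expressed at the scripting level. The following plain-Tcl sketch picks a level of detail from the camera-to-object distance; the thresholds and the showDetail callback are our assumptions, not part of the test-bed.

# Choose a level of detail from the camera-object distance (in cm).
# Thresholds and the showDetail callback are illustrative assumptions.
proc updateMagicLens {distance} {
    if {$distance < 30} {
        showDetail full     ;# close up: full annotation and animation
    } elseif {$distance < 80} {
        showDetail label    ;# medium range: short textual label only
    } else {
        showDetail none     ;# too far away: de-select the presentation
    }
}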
Figure 10. PDA Menus vs. AR Cross-Hair Interface
Based on the results of these ongoing experiments it will become possible for developers to make informed decisions when selecting information presentation and interaction techniques for AR applications. Our aim is to integrate the resulting findings into a public repository of information on AR and MR interaction and presentation techniques that is currently under construction [2] and for which third-party contributions are actively invited.
Acknowledgement: This work was supported by the Deutsche Forschungsgemeinschaft (DFG) as part of the research grant SFB 614 “Selbstoptimierende Systeme des Maschinenbaus” (self-optimizing systems in mechanical engineering).
8. References

[1] www.ar-pda.de
[2] mixed-reality-gems.net
[3] www.hnf.de
[4] Bergman, E. (Ed.): Information Appliances and Beyond, Morgan Kaufmann Publishers, 2000.
[5] Paelke, V.; Reimann, C.; Rosenbach, W.: “A Visualization Design Repository for Mobile Devices”, in: Proc. ACM Afrigraph 2003, Cape Town, February 2003.
[6] Paelke, V.: Design of Interactive 3D Illustrations, Dissertation, University of Paderborn, 2002.
[7] Paelke, V. et al.: “The AR-ENIGMA – A PDA based Interactive Illustration”, in: SIGGRAPH Sketches and Applications, San Antonio, Texas, USA, July 2002.
[8] Preece, J.: Human-Computer Interaction, Addison-Wesley, 1994.
[9] Shneiderman, B.: Designing the User Interface: Strategies for Effective Human-Computer Interaction, 3rd ed., Addison-Wesley, 1998.
[10] www.aliaswavefront.com
[11] www.hitl.washington.edu/research/shared_space
[12] Stöcklein, J.: Prototyping von Mixed Reality Illustrationen (in German), Diploma Thesis, University of Paderborn, forthcoming.