Translation and Rotation of Virtual Objects in Augmented Reality: A Comparison of Interaction Devices

Stefan Reifinger, Florian Laquai, Gerhard Rigoll
Lehrstuhl für Mensch-Maschine Kommunikation
Technische Universität München
Munich, Germany
[email protected],
[email protected],
[email protected]
Abstract—This paper describes an evaluation that compares three interfaces for translating and rotating virtual objects in an Augmented Reality environment. We used a mouse/keyboard interface, an infrared-tracking based gesture recognition system, and our tangible user interface, which we developed for this evaluation. The tangible user interface consists of two accelerometers, one gyroscope, and four buttons integrated into a cuboid casing. The evaluation focused on a comparison of immersion, intuitiveness, mental and physical workload, and task execution times. The test persons had to solve given tasks with all interfaces (translating an object, rotating an object, and a combination of translation and rotation). The results show that translating a virtual object takes nearly the same amount of time with all interfaces, but considerably longer than a similar manipulation in a real environment. Rotation is slowest with the mouse/keyboard interface. For a combination of translation and rotation, gesture-based interaction turned out to be the fastest way of interacting. Across all kinds of manipulation, gesture recognition provides the most immersive and intuitive interaction with the lowest mental workload. These benefits are limited by the highest rating of physical workload, caused by the additional hardware mounted on the user's hand and by the fact that the user has to hold the arm up during task execution.
I. INTRODUCTION AND MOTIVATION
Humans generally interact with a computer via a graphical user interface (GUI). Since 1981, GUIs have become the standard in human-computer interaction [5]. In combination with mouse and keyboard, GUIs allow the user to interact simply by pointing at and clicking objects of the GUI. In this way, manipulations of virtual objects are possible; such manipulations mainly consist of translation and rotation. Humans, however, are able to interact in a much more complex way. Especially in virtual 3D environments such as Virtual Reality, mouse and keyboard interfaces lead to an abstract manipulation of virtual objects. Desktop applications also limit the user's movement area, because mouse and keyboard have to be placed on a flat surface. This lowers the user's feeling of immersion. The mental workload of the user is high, because complex manipulations in three dimensions have to be split into a set of two-dimensional manipulations, which is an unnatural way of interaction. Augmented Reality enhances the
traditional GUI from a 2D environment to a 3D environment. Interaction takes place in a virtual environment that is visualized in three dimensions, similar to a real environment. Such a virtual environment therefore needs an appropriate interface that enables the user to interact with it in three dimensions. Thus, this paper presents a tangible user interface for manipulating virtual objects in three dimensions. To compare this interface with other 3D interfaces, we performed an evaluation concerning immersion, intuitiveness, and mental and physical workload. We also separated the tests into translation, rotation, and a combination of both manipulations.
II. PREVIOUS WORK
Tangible Augmented Reality denotes the use of tangible user interfaces for the manipulation of virtual objects [6]. Virtual objects in a virtual environment are associated with real objects in the real environment; manipulating the physical representation of an object results in the manipulation of its associated virtual object. The "TARBoard" is a tangible Augmented Reality system consisting of a glass table, a mirror, and two cameras [3]. Square markers are placed on the bottom side of the real objects. These markers are tracked via the mirror by a camera placed under the table. The second camera films the real environment as the basis for the integration of virtual objects. In the "Shared Space" application, the markers themselves act as the tangible user interface [6]. In a sample gaming application, an animation is triggered if two markers that belong to each other are placed side by side. "Illuminating Clay" [2] is a tangible Augmented Reality application that is not based on optical markers. It uses a laser scanner mounted on the ceiling to retrieve information about the objects placed below on a table. These objects form models of landscapes (streets, houses, etc.). The scenery is analyzed regarding the influences of insolation and shading, erosion, and the flow of water, and the results are projected by a beamer directly onto the model.
III. IMPLEMENTATION OF OUR TANGIBLE USER INTERFACE
A. Overview
Our tangible user interface consists of a USB board for the connection to a PC, plus two accelerometers, one gyroscope, and four buttons for user interaction. All components are integrated into a cuboid that is connected to the computer via USB. The accelerometers deliver data about the rotation around the X and Y axes, and the gyroscope measures the rotation around the Z axis. The buttons are simple binary buttons whose state can be true or false. All data can be retrieved and used in any application. To ensure this, we designed the system as a master-client architecture: the master retrieves the data of all connected sensors, processes them, and sends them to the client, which is connected via a UDP network. The client is built as a C# software library that offers all sensor data and can easily be integrated into any application.
B. Hardware
Our tangible user interface consists of a sensor system integrated with a USB board, which enables the connection between the sensors and the computer. These components are mounted in a cuboid casing that is extended by four buttons on its top side. We use the Microchip "PICDEM FS" USB board, which is specialized for USB applications and provides an easy-to-program memory [4]. Its main component is a "PIC18F2455" USB microcontroller, which supports USB 2.0 at data transfer rates of 1.5 Mbit/s and 12 Mbit/s. The board comes with a programming interface including several basic functions, which enables easy development of USB applications; these applications can be transferred to the board's memory by the included software. As sensors we use a two-dimensional "ADXL320" accelerometer for each of the X and Y axes. For measuring rotations around the Z axis, we use the "ADXRS300" gyroscope. The sensor system is connected to the "PICDEM FS" USB board via its 5 V supply voltage. For our tangible user interface we limit the retrieved and processed sensor data to the rotations around each axis. The two accelerometers can detect the inclination up to an angle of ninety degrees, because angles greater than ninety degrees cannot be distinguished from angles smaller than ninety degrees. Based on these data, translations in the X and Y directions could also be calculated. However, this calculation requires the sensor system to be placed on a flat surface, otherwise it could yield falsified data. In addition, accelerometers introduce a drift in translation, so we decided to leave out this approximation, which would limit the user's free interaction with our tangible user interface. Our tangible user interface is integrated into a cuboid casing of 25 cm x 19 cm x 4 cm. A cuboid offers the well-known appearance of technical interfaces, e.g. remote controls. With three different side lengths, the cuboid's orientation can always be identified by the user. Fig. 1 shows our tangible user interface and its coordinate system.
Fig. 1. Casing with integrated sensor system, USB board, and four buttons (coordinate axes X, Y, Z; buttons marked "1" and "2").
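The conversion from raw sensor readings to the rotation angles described above can be illustrated by a minimal sketch. The paper only states that the accelerometers resolve inclination up to ninety degrees and that the gyroscope covers the Z axis; the calibration constants and method names below are placeholders and do not reflect the actual firmware.

```csharp
// Minimal sketch (not the authors' firmware): converting raw sensor readings
// into the rotation angles used by the tangible user interface.
using System;

static class SensorConversion
{
    // Accelerometer: a static tilt produces a = g * sin(angle), so the
    // inclination can only be resolved up to +/- 90 degrees.
    public static double InclinationDeg(double rawAccel, double zeroGOffset, double countsPerG)
    {
        double g = (rawAccel - zeroGOffset) / countsPerG;  // acceleration in g
        g = Math.Max(-1.0, Math.Min(1.0, g));              // clamp to the valid asin range
        return Math.Asin(g) * 180.0 / Math.PI;             // -90 .. +90 degrees
    }

    // Gyroscope: the ADXRS300 outputs an angular rate, so the rotation around
    // the Z axis is obtained by integrating the rate over time.
    public static double IntegrateZRotation(double previousAngleDeg, double rawRate,
                                            double zeroRateOffset, double countsPerDegPerSec,
                                            double dtSeconds)
    {
        double rate = (rawRate - zeroRateOffset) / countsPerDegPerSec;
        return previousAngleDeg + rate * dtSeconds;
    }
}
```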
We added four buttons to the tangible user interface for enhanced interaction possibilities. Two buttons snap in after being pressed (marked "2" in Fig. 1), and two buttons are only active while pressed (marked "1" in Fig. 1). They are arranged for a two-handed interaction style, which allows the user to reach each button with his thumbs without changing the position of the tangible user interface.
C. Software
The software for our tangible user interface is based on a master-client approach. This allows the distribution of the components over different computers, which raises the system's overall performance. Communication between master and client is performed over a network connection. As a stand-alone application, the master module extracts the data delivered by the connected tangible user interface and sends them via a UDP connection to the client module. For the differentiation of the data, a predefined set of network messages is used: sensor data are marked either as "x Accel:" or "y Accel:" for the accelerometers or "Rot z:" for the gyroscope, and "Buttons:" marks data describing the current state of the four buttons. The client module is implemented as a C# library, which allows a simple integration into any application. Listening to the UDP messages sent by the master module, the client receives information about the tangible user interface and the state of all sensors and buttons. This information is stored locally in the client module and is accessible to the application in which the client is integrated. The client provides an interface that allows easy access to these data.
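The message prefixes above are those used by the master module; the following sketch illustrates how a client could parse them. The port number, class name, and field layout are illustrative assumptions, not the actual library.

```csharp
// Sketch of a client that listens for the master's UDP messages and stores
// the latest sensor and button state. Port and message layout (prefix
// followed by a value) are assumptions, not the authors' implementation.
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

public class TuiClient
{
    public double XAccel { get; private set; }   // inclination around X
    public double YAccel { get; private set; }   // inclination around Y
    public double RotZ   { get; private set; }   // rotation around Z
    public string Buttons { get; private set; }  // state of the four buttons

    public void Listen(int port = 5000)
    {
        using (var udp = new UdpClient(port))
        {
            var remote = new IPEndPoint(IPAddress.Any, 0);
            while (true)
            {
                string msg = Encoding.ASCII.GetString(udp.Receive(ref remote));
                if (msg.StartsWith("x Accel:"))
                    XAccel = double.Parse(msg.Substring("x Accel:".Length));
                else if (msg.StartsWith("y Accel:"))
                    YAccel = double.Parse(msg.Substring("y Accel:".Length));
                else if (msg.StartsWith("Rot z:"))
                    RotZ = double.Parse(msg.Substring("Rot z:".Length));
                else if (msg.StartsWith("Buttons:"))
                    Buttons = msg.Substring("Buttons:".Length).Trim();
            }
        }
    }
}
```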
IV. EVALUATION
The main focus of this paper is the comparison of different interaction devices for manipulating virtual objects. Manipulation in our evaluation is defined as translation and rotation of virtual objects, which are placed in an Augmented Reality environment. As interaction devices we use a mouse/keyboard interface, a gesture recognition system based on infrared tracking, and the tangible user interface described in this paper. All interfaces are compared with similar tasks in a real environment. The evaluation focused on task execution time and four characteristics of the interfaces, for each of which we made assumptions:
A. Assumptions on the focused characteristics
In the evaluation, we focus on four characteristics: we would like to know how the different interfaces are rated regarding their degree of immersion, their intuitiveness, and their mental and physical workload.
Immersion: We assume that interaction using the gesture recognition system will be rated as the most immersive. Because of the limited working space when using mouse and keyboard, we expect our tangible user interface to be rated better than mouse/keyboard interaction.
Intuitiveness: Similar to our assumption regarding immersion, we expect interaction with the gesture recognition system to turn out to be the most intuitive way of manipulating a virtual object. The tangible user interface and mouse/keyboard have in common that the user has to learn their way of working: each function has to be triggered by a button or a combination of buttons. This will lead to lower intuitiveness.
Mental workload: Every interaction requires attention from the user. In this evaluation we compared the mental workload that each interaction induces. We assume that mouse/keyboard will be the interface with the highest mental workload, because the user has to map 2D movements onto 3D movements. Also, this interface is a combination of two devices, leading to combinations of right- and left-hand interactions. We believe gesture interaction to be the interface with the lowest mental workload, because it is very similar to real interaction.
Physical workload: In our opinion, the physical workload will be highest when interacting with gestures. This type of interaction requires the user to keep his hand and arm raised during task execution, which will be exhausting and result in a high degree of physical workload. We think the tangible user interface will also be rated high, because it has to be held in the hand for interaction. Mouse/keyboard will not be very exhausting, because these devices are placed on a flat surface on which the user can rest his arm.
B. Experimental set-up
The evaluation took place in our laboratory. We differentiated the test set-up between the real and the virtual environment. The experiments in the real environment did not require additional hardware to be worn by the user; the user sat in front of a desk on which all real objects were placed for interaction. For the tests in the virtual environment, all users had to wear a Head-Mounted Display for the visualization of the virtual content, and additional hardware had to be used depending on the interface. For all experiments, the user sat in front of a desk on which the virtual objects were placed (visualized by the Head-Mounted Display). In total, we used three different interfaces for the interaction in the virtual environment.
1) Reference task in the real environment: For the reference task of manipulating objects in a real environment, no additional hardware was needed. The users had to manipulate the real objects using only their hands.
2) Mouse/keyboard interface: Our mouse/keyboard interface had a set of predefined functions for manipulating virtual objects. An object is selected by clicking and holding the left mouse button. While the button is held, the manipulation is performed according to a selected key on the keyboard: "T" triggered the translation mode, "R" the rotation mode. By default, moving the mouse right/left leads to a translation in the X direction or a rotation around the Y axis, and moving the mouse up/down to a translation in the Y direction or a rotation around the X axis. Translation in the Z direction and rotation around the Z axis were triggered by pressing the right mouse button in addition to the left button used for selection.
3) Gesture recognition system: We used the gesture recognition system described in [1]. This system triggers the selection of a virtual object by grasping (thumb and index finger touching at the fingertips). If a virtual object is close enough to the thumb, it is selected and bound to the thumb. Manipulation is performed by moving and rotating the hand; un-grasping releases the object. For this system, the user had to wear additional hardware on his fingers.
4) Tangible user interface: Selecting objects with our tangible user interface is done by two buttons. The left button selects the object with the next lower internal identification number, the right button the next higher one. Thus, there is no direct selection of an object; the user has to "step through" all objects until the desired object is selected. Selected objects are highlighted by a short change of their size (they "pop up"). Manipulation of an object is coupled to the inclination of the tangible user interface. One button triggers the translation mode, another button the rotation mode. In translation mode, inclining the interface around the X axis leads to a translation in the Y direction (and accordingly, Y inclination to the X direction). Translating in the Z direction requires pressing an additional button. In rotation mode, objects are rotated continuously by inclination; rotation around the Z axis is performed by turning the interface around its Z axis. A minimal code sketch of this mapping is given below, after the evaluation procedure.
C. Evaluation procedure
At the beginning of the evaluation, every test person was introduced to Augmented Reality. After that, all interfaces were explained, including the assignment of buttons to their associated functions. The introduction was limited to 5 minutes per interface to ensure equal conditions for evaluating the intuitiveness of the interfaces. The evaluation is divided into three single tests: test 1 only refers to translation, test 2 only deals with the rotation of a virtual object, and test 3 combines translation and rotation of virtual objects in a complex way. We measured the task execution time for each test and interface to compare the time the user needs to solve the given task with a given interface.
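The following sketch illustrates one possible mapping of the tangible user interface's inclination and mode buttons to object manipulation, as described in subsection 4 above. Class names, scale factors, and the update loop are illustrative assumptions, not the implementation used in the evaluation.

```csharp
// Illustrative sketch (assumed names and scale factors) of how the TUI's
// inclination and mode buttons could drive the currently selected object.
public class VirtualObject
{
    public float X, Y, Z;      // position
    public float Rx, Ry, Rz;   // orientation in degrees
}

public static class TuiMapping
{
    const float MoveSpeed = 0.05f; // assumed units per degree of inclination and second
    const float RotSpeed  = 30f;   // assumed degrees per second at full inclination

    public static void Update(VirtualObject obj, double xIncline, double yIncline,
                              double zRotation, bool translationMode,
                              bool zButtonPressed, float dt)
    {
        if (translationMode)
        {
            if (zButtonPressed)                               // extra button: Z translation
                obj.Z += (float)(xIncline * MoveSpeed * dt);
            else
            {
                obj.Y += (float)(xIncline * MoveSpeed * dt);  // X inclination -> Y translation
                obj.X += (float)(yIncline * MoveSpeed * dt);  // Y inclination -> X translation
            }
        }
        else                                                  // rotation mode: continuous rotation
        {
            obj.Rx += (float)(xIncline  * RotSpeed * dt / 90.0);
            obj.Ry += (float)(yIncline  * RotSpeed * dt / 90.0);
            obj.Rz += (float)(zRotation * RotSpeed * dt / 90.0); // turning the cuboid around Z
        }
    }
}
```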
Fig. 2. Test 3: translating and rotating virtual building blocks in a virtual environment using gesture-based interaction.
Fig. 3. Test 1: task execution times (minimum, average, and maximum in seconds for the real environment, mouse/keyboard, gestures, and the TUI).
After each test, the test person had to fill out a questionnaire, which included questions and ratings about immersion, intuitiveness, and the mental as well as physical workload of the interface. All characteristics were rated on a scale from "1" (very bad) to "5" (very good).
Test 1 (translation only): This test only deals with the translation of virtual objects, so the user had to move virtual objects. At the beginning of the experiment, five letters are arranged randomly, but with the same starting conditions for each interface. The task of the user is to arrange those letters so that they spell "AR TUI".
Test 2 (rotation only): The second task is to rotate a virtual object. This test only evaluates the user's interaction while rotating a cube, which has six city names and numbers printed on its sides. The user's task is to enumerate the city names in ascending order, according to the numbers printed on the sides.
Test 3 (combined translation and rotation): The third task combines translation and rotation for manipulating virtual objects. We used a building-block scenario for this test. Eight building blocks are arranged sorted by their type at the beginning of this test. The user's task is to arrange the building blocks according to a defined specification (e.g. by gestures, as shown in Fig. 2).
D. Results of the evaluation
For our evaluation we had a total of 15 test persons with an average age of 25 years, ranging from 23 to 27. Four persons were female, and 13 persons had heard of Augmented Reality before this evaluation. Ten test persons had no experience with Augmented Reality systems, two persons had little experience, and one person was considered to be an expert. Comparing these groups, we did not find any significant difference in the results of the evaluation.
Test 1 (translation only) and Test 2 (rotation only): Fig. 3 shows the minimum, maximum, and average task execution times for solving the task given in test 1. The diagram
compares the time needed for the task in the real environment and in the virtual environment using the three different interfaces. The average time was 7 seconds interacting with real objects, 66 seconds using mouse/keyboard, 60 seconds using gestures, and 83 seconds using the tangible user interface. Compared to interaction in the real environment, manipulation takes much more time in the virtual environment: no matter which interface was used, task execution time is greater than when arranging real objects. This is caused by the fact that interaction in the real environment is performed using both hands, whereas in the virtual environment only one object can be manipulated at a time. Interaction by mouse/keyboard turned out to be the fastest way to translate virtual objects in this task. This could be caused by the necessity of moving the hand to the object when using gestures, and by the way objects are selected with the tangible user interface. While interacting with gestures, the user has to move his hand next to the object that should be selected. As there is no direct selection of an object with the tangible user interface, the user has to "step through" the objects, which requires time for navigating through all visible objects. Also, the moving speed of an object cannot be controlled as precisely as with the hand, which leads to many manipulation steps until an object is placed correctly.
Fig. 4 shows the minimum, maximum, and average task execution times for rotating a cube in a real and a virtual environment using the different interfaces. On average, users finished the task in the real environment in 9 seconds, with mouse/keyboard in 30 seconds, with gestures in 17 seconds, and with the tangible user interface in 18 seconds. Solving the given task in the virtual environment using gestures is only about twice as slow as interaction in the real environment, and the tangible user interface is almost as fast as gestures. The mouse/keyboard interface is much slower in comparison. This results from the fact that it is a 2D input device used for a 3D task (rotation around three axes), whereas gestures and the tangible user interface are devices used within a 3D working space; the user does not have to transfer 2D actions into 3D reactions of the virtual object. Fig. 5 shows the average rating of the main characteristics for translation and rotation,
Fig. 4. Test 2: task execution times (minimum, average, and maximum in seconds for the real environment, mouse/keyboard, gestures, and the TUI).
Fig. 6. Test 3: task execution times (minimum, average, and maximum in seconds for the real environment, mouse/keyboard, gestures, and the TUI).
Fig. 5. Tests 1/2: ratings of immersion, intuitiveness, mental and physical workload (WL) for the real environment, mouse/keyboard, gestures, and the TUI.
which we focused on in this evaluation. Real interaction was rated 5 regarding both immersion and intuitiveness. For immersion, gesture-controlled interaction was rated 4.3, the tangible user interface 3.4, and mouse/keyboard 1.7. Intuitiveness of real interaction was rated 5, of gestures 4.5, of the tangible user interface 3.5, and of mouse/keyboard 2.5. Looking at the average ratings for virtual interaction, immersion is rated highest when interacting with gestures; the gesture recognition system also received the best intuitiveness rating for translating objects. This can be explained by the way of interacting, which is very similar to real interaction: grasping objects is a very familiar interaction for the test persons, and only the haptic feedback is missing compared to real interaction. Interacting with mouse/keyboard turned out to be the least intuitive and least immersive way of moving objects in virtual environments. It is remarkable that gestures received almost the same marks as real interaction. While gestures and the tangible user interface have in common that there is a direct connection between the user's action and the object's reaction, the mouse/keyboard control is very abstract. This also leads to its higher rating of mental workload. Mental workload was lowest when using gestures for interaction, but at the same time, physical workload was highest for this
way of interaction. Gestures require that the user moves his hand to the object and grasps it, so the user has to keep his arm raised during interaction, which is very exhausting. Mouse/keyboard does not cause physical exhaustion, because the interface is placed on a flat surface on which the user can rest his hand.
Test 3 (combined translation and rotation): Fig. 6 shows the minimum, average, and maximum task execution times of the third test, which focused on a combination of translation and rotation of (virtual) objects. The average time needed for the task execution in the real environment was about 10 seconds. Using mouse/keyboard in the virtual environment took on average 111 seconds, gestures 93 seconds, and the tangible user interface 138 seconds. Again, this shows that interacting in the real environment is much faster than interacting in the virtual environment for this task. The main reason is the missing physical simulation, which in the real world enables the user to stack building blocks: virtual objects do not interact with each other, so they do not collide as real objects do. Also, the missing two-handed interaction is one possible explanation for the big difference between the real and virtual environments. The tangible user interface is weak at selecting virtual objects; this task required the selection of several objects, which takes a lot of time with this interface. Here, mouse/keyboard has an advantage, because objects can easily be selected by clicking on them. The best interface for this task was the gesture recognition system. Fig. 7 shows the average rating of the main characteristics for the combined translation and rotation, which we focused on in this part of the evaluation. Similar to tests 1 and 2, real interaction was rated 5 regarding immersion and intuitiveness. Virtual interaction with gestures is rated best with 4.3 points. The rating for mouse/keyboard is as low as in tests 1 and 2, with 1.7 and 2.5 points. The tangible user interface scores an average of 3.5 and 3.2 points. Looking at the ratings of mental workload, interaction with our tangible user interface was rated higher than in tests 1 and 2. This is caused by the higher number of virtual
Fig. 7. Test 3: ratings of immersion, intuitiveness, mental and physical workload (WL) for the real environment, mouse/keyboard, gestures, and the TUI.
objects, which caused more steps for the selection of an object. For the gesture and mouse/keyboard interfaces, selection is independent of the number of virtual objects. Physical workload was rated similarly to the first two tests.
V. CONCLUSIONS
This paper presented our tangible user interface, which is independent of a tracking system. The aim of this interface is the manipulation (translation and rotation) of virtual objects within Augmented Reality environments. As sensors, we used two two-dimensional "ADXL320" accelerometers and one "ADXRS300" gyroscope. The accelerometer data were used to determine the inclination of the interface around its X and Y axes up to an angle of ninety degrees; the gyroscope data are used to calculate the rotation around the Z axis. Communication between the interface and the computer was realized by a USB connection with a "PICDEM FS" USB board by Microchip. We added four buttons to enhance the interaction possibilities, and all components were built into a cuboid casing. We developed master-client software in order to integrate the tangible user interface into any application. Master and client communicate via a UDP network, which leads to better performance due to distributed computational power. The master receives the data of the interface, processes them, and sends them to the client, which provides these data via a library. We performed an evaluation that focused on the comparison of four different interaction interfaces and four different characteristics. As interfaces we used interaction in a real environment, and interaction in the virtual environment using a mouse/keyboard interface, a gesture recognition based interface, and the presented tangible user interface. As characteristics we evaluated the immersion, intuitiveness, and mental
as well as the physical workload induced by each interaction. We separated translation, rotation, and the combination of both into three tests. The results showed that the tangible user interface performs well for translation and rotation with few virtual objects. Tasks that require several objects to be manipulated suffer from the lack of a simple selection method: the tangible user interface selects objects by a "step-through" method using buttons, whereas the mouse/keyboard and gesture-based interfaces can select objects directly by clicking or grasping. Gestures reached high ratings for immersion and intuitiveness (almost as good as interaction in the real environment), caused by the direct connection between the user's action and the object's reaction. The tangible user interface was rated better in these characteristics than the mouse/keyboard interface, which was rated low because it maps 2D onto 3D movements. The most exhausting interface was the gesture-based interface, because the user has to keep his arm raised for the whole manipulation. Here, mouse/keyboard scores better, which can be explained by the fact that the user can rest his arm on the flat surface required by mouse and keyboard. The tangible user interface was rated between gestures and mouse/keyboard, but was rated worst regarding mental workload, which is caused by its several buttons: the user has to learn combinations of buttons for manipulations in all degrees of freedom. Mouse/keyboard has a similar problem and was also rated worse than gesture-based interaction. Concluding, we found that a gesture-based interaction system performs very well for our evaluated tasks, reaching high ratings in immersion and intuitiveness, but this comes with a higher physical workload. Mouse/keyboard falls short in these characteristics, but interaction is physically very undemanding for the user. The tangible user interface provides a solution that combines properties of both gestures and mouse/keyboard.
REFERENCES
[1] S. Reifinger, F. Wallhoff, M. Ablassmeier, T. Poitschke, and G. Rigoll, "Static and Dynamic Hand-Gesture Recognition for Augmented Reality Applications," in J. A. Jacko, editor, Human-Computer Interaction, volume 4552 of LNCS, pages 728-737, Springer, 2007.
[2] B. Piper, C. Ratti, and H. Ishii, "Illuminating Clay: A 3-D Tangible Interface for Landscape Analysis," in CHI '02: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2002.
[3] W. Lee, W. Woo, and J. Lee, "TARBoard: Tangible Augmented Reality System for Table-top Game Environment," in PerGames 2005, 2005.
[4] Microchip Technology Inc., PICDEM FS USB Demonstration Board User's Guide, 2004.
[5] A. van Dam, "Post-WIMP User Interfaces," Communications of the ACM, vol. 40, no. 2, 1997.
[6] M. Billinghurst, H. Kato, and I. Poupyrev, "Collaboration With Tangible Augmented Reality Interfaces," 2001.