A SENSORS-BASED TWO HANDS GESTURES INTERFACE FOR VIRTUAL SPACES
Arturo Arroyo Palacios, Daniela M. Romano
University of Sheffield, Department of Computer Science
Regent Court, 211 Portobello Street, Sheffield S1 4DP, United Kingdom
[email protected],
[email protected]
ABSTRACT
Three-dimensional (3D) environments are currently built and manipulated in two-dimensional (2D) spaces through the WIMP (Windows, Icons, Menus and Pointers) paradigm, which visualizes the objects from multiple views in several windows. This paper presents the advantages of using a six degree-of-freedom input device for the creation and manipulation of 3D objects in 3D spaces. A new ergonomic interaction technique for 3D spaces, involving the use of both hands, has been designed based on the theory of bimanual gestures, and a newly created gesture-based language is interpreted into input commands. The user is fully immersed in the environment he/she is building, and the hands' positions are tracked by magnetic sensors indicating the center of the back of each hand. The advantages of this bimanual, six degree-of-freedom input technique are presented together with the gesture-based language designed. The design and implementation have been evaluated for the construction of an urban virtual space, and the results are reported in this paper.

KEY WORDS
Gesture recognition, graphical user interfaces, virtual reality, wearable devices.
1. Introduction
A fully immersive computer approach for creating and manipulating 3D objects is presented, which avoids providing the limited 2D views given to users in traditional WIMP applications. The intention is to provide a better and more immersive interaction space for designers of three-dimensional scenes. Creating a 3D scene with one's own hands while immersed in the actual space intuitively seems more appropriate and natural than using an interface that proposes a third-person experience and the use of tools such as pointers. Navigation and manipulation of objects can instead be done with hand gestures, while the user constructs the desired scene immersed in the 3D space of his/her creation. A gesture language, based on the theory of asymmetric bimanual activities, has been created and tested to navigate and to manipulate 3D objects, in order to offer a more intuitive interaction with the created space.
2. Related Work
The third generation of User Interfaces (UI), known as the WIMP paradigm (Windows, Icons, Menus and Pointers), was developed in the eighties and shows its age and limitations, in particular for the navigation and manipulation of virtual objects in 3D spaces. Some 3D world builders (e.g. Autodesk Maya [www.autodesk.com/fo-products-maya] and 3ds Max [www.autodesk.com/fo-products-3dsmax]) use 2D WIMP techniques not only for the menus and windows, but also to manipulate the 3D objects through handles that allow rotation around the axes, such as the virtual sphere [1] and the arcball [2], which work as crystal spheres around an object. These techniques, used in 2D software that outputs various bi-dimensional views of a tri-dimensional scene, completely lack realism. In addition, the limitations of bi-dimensional input devices, mainly the mouse, the keyboard or the 3D ball, do not give users an intuitive sense of what they are designing. We propose a new user interface paradigm based on gesture recognition and virtual reality technology. The idea of using gestures for interaction in 3D environments has already been explored in systems like Put-That-There [3] and VIDEODESK [4], which offer a more natural way to manipulate objects. Such systems, although they use gestures for the manipulation of objects, still present the disadvantage of offering only a 2D projection of the 3D space. This limitation has been overcome in the system presented here by using a 3D projection system on a 10ft x 8ft screen and active glasses, which give the user a 3D perspective and the feeling of being surrounded by the environment being generated. Furthermore, Pierce et al. [5] describe different visual techniques used for the selection and manipulation of objects, and Boussemart et al. [6] present a vision-based approach for this task. However, vision-based techniques for gesture recognition still present some inaccuracy; consequently, in the system presented here the user's hands are tracked by magnetic sensors indicating the center position of the back of the hands. Guiard [7] analysed the advantages, across a range of human activities, of using both hands over gesticulations that employ only one hand. The interface presented here uses bimanual interaction to create the 3D space. Other, more recent approaches to hand movement patterns, motivated by the good results of handwritten character recognition techniques [8][9], attempt to reduce the 3D position sequence of the hands to a linear or bi-dimensional trajectory [10]. These methods have been successful for command-like gestures, but present disadvantages when used in virtual reality; for example, the set of commands available to navigate the scene is very limited.
3. A Gesture Interface in a Three-Dimensional Space
In this section the gesture interface implemented is described through its components: the devices utilised, the gesture language developed, and how the insertion and manipulation of objects is performed.
3.1 Interaction and Visualization Devices
The application has been developed to work within a Reconfigurable Automatic Virtual Environment (RAVE) system, a back-projection system onto a 10ft x 8ft screen that provides a sense of being immersed in the 3D scene when viewed through active crystal glasses. As well as the glasses, the user wears a pair of 'pinch gloves'. Each glove has a magnetic sensor placed on the back of the hand, indicating the position of the glove/hand, and sensors on each fingertip that allow signals like pinching (making contact between the sensor on the thumb and that of any other finger) to be interpreted by the computer. Furthermore, the position of the user in the world is tracked by a magnetic sensor located on the side of the active crystal glasses.
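To make the device set-up concrete, the sketch below shows how the per-frame input described above could be sampled. It is only an illustration under stated assumptions: the tracker and glove objects, their method names and the data layout are hypothetical, not the actual RAVE or pinch-glove API used in the system.

from dataclasses import dataclass
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class HandState:
    position: Vec3              # magnetic sensor on the back of the hand
    pinching: bool              # thumb in contact with any other fingertip
    contacts: Dict[str, bool]   # per-fingertip thumb-contact flags

def read_hand(tracker, glove) -> HandState:
    # 'tracker.position()' and 'glove.finger_contacts()' are hypothetical driver calls.
    contacts = glove.finger_contacts()          # e.g. {"index": True, "middle": False, ...}
    return HandState(tracker.position(), any(contacts.values()), contacts)

def read_frame(trackers, gloves):
    # Head position comes from the sensor on the side of the active glasses;
    # each hand position from the sensor on the back of the corresponding glove.
    head = trackers["head"].position()
    left = read_hand(trackers["left"], gloves["left"])
    right = read_hand(trackers["right"], gloves["right"])
    return head, left, right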
3.2 The Gesture Language and the Bimanual Interaction Interface
The system uses only hand gestures to give instructions to the computer. The movements involve asymmetric gestures, which were established by considering Guiard's theory of bimanual interfaces [7]. The higher principles of the theory of human bimanual gestures are:
1. Right-to-Left Spatial Reference in Manual Motion.
2. Left-Right Contrast in the Spatial-Temporal Scale of Motion.
3. Left-Hand Precedence in Action.
In this implementation the right hand has been considered the preferred hand and the left the non-preferred one. The list of bimanual commands to be performed by the user is shown in Table 1. The initial command, which tells the application that the creation of the scene is starting, is performed with gesture number one, in which both hands touch each other. To insert an object the user performs gesture number two: he/she pinches the object from a displayed list and places it in the desired location on a 3D grid by releasing the thumb-index contact. A list of objects, rather than the pie menus proposed by Hopkins [11], is provided to the users because of the large number of different objects that can be presented to the user. To remove an object the user occludes the view of the object to be selected with the right hand and moves it away from the screen, extending his/her hand towards the right as far as possible (see gestures 3 and 4 in Table 1). To move an object to a different position the user can simply pinch it and place it in a different location.
No.  Command             Gesture
1.   Initial command     BOTH HANDS touch
2.   Insert an object    (Initial command) + LEFT HAND move
3.   Select an object    RIGHT HAND pinch index
4.   Remove an object    (Initial command) + RIGHT HAND move
5.   Move an object      RIGHT HAND pinch index and move
6.   Zoom in             (Initial command) + RIGHT HAND moves up
7.   Zoom out            (Initial command) + RIGHT HAND moves down
8.   Rotate the scene    RIGHT HAND pinch index and move
9.   Save a project      (Initial command) + BOTH HANDS move down
10.  Load a project      (Initial command) + BOTH HANDS move up
11.  Close application   (Initial command) + BOTH HANDS cross

Table 1. Hand gesture commands used in the system (the gesture illustrations of the original Image column are not reproduced).
To zoom in and out of the scene, gestures six and seven are used: the user moves his/her hands with the palms towards the floor, respectively up (zoom in) or down (zoom out). To rotate the scene the user can simply pinch the 3D grid on which the objects are placed and rotate it. To save a scene both hands have to be placed with the palms towards the screen and moved slightly down. To close the application the arms have to cross, making an 'X'. Figure 1 shows an example of pinching an object and placing it in a different position.

Figure 1. Interface in use: pinching an object and placing it in a different position.

The 3D models are imported into the system in ASCII format and stored in a list. In order to manipulate or insert a 3D object, the vertices and their positions in the virtual space are required, as well as the faces or polygons that compose the whole object. Two system parsers read every polygon and produce two .txt files with all the information needed to render them: one stores the vertex data and the other the polygonal mesh information.
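The two-file split performed by the parsers can be sketched as follows. The paper does not specify the ASCII format or the output file names, so an OBJ-like layout ('v x y z' vertex lines, 'f i j k' face lines) and the names used below are assumptions.

def split_mesh(ascii_path, vertex_txt, face_txt):
    # Read an OBJ-like ASCII mesh and write one .txt file with the vertex
    # positions and another with the polygon (face) definitions used for rendering.
    with open(ascii_path) as src, \
         open(vertex_txt, "w") as vertices, \
         open(face_txt, "w") as faces:
        for line in src:
            parts = line.split()
            if not parts:
                continue
            if parts[0] == "v":                   # vertex: x y z position in the virtual space
                vertices.write(" ".join(parts[1:4]) + "\n")
            elif parts[0] == "f":                 # face: the indices of the vertices it uses
                faces.write(" ".join(p.split("/")[0] for p in parts[1:]) + "\n")

# Example with hypothetical file names:
# split_mesh("house.obj", "house_vertices.txt", "house_faces.txt")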
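As an illustration of how the gesture language of Table 1 could be turned into commands, the sketch below classifies one frame of hand data into a command name. It is a simplified assumption of such a recogniser, not the authors' implementation: the thresholds, the arming of the initial command and the HandState fields reuse the hypothetical types introduced in Section 3.1.

import math

TOUCH_THRESHOLD = 0.05   # metres: backs of the hands this close count as touching (gesture 1)
MOVE_THRESHOLD = 0.02    # metres of vertical right-hand motion treated as a zoom step

def distance(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def classify(left, right, armed, prev_right_y):
    """Map one frame of left/right HandState values to a command name.
    'armed' records whether the initial command (both hands touching) has been seen."""
    if distance(left.position, right.position) < TOUCH_THRESHOLD:
        return "initial", True                          # gesture 1: both hands touch
    if armed and left.pinching:
        return "insert_object", armed                   # gesture 2: left (non-preferred) hand places the object
    if armed and right.position[1] - prev_right_y > MOVE_THRESHOLD:
        return "zoom_in", armed                         # gesture 6: right hand moves up
    if armed and prev_right_y - right.position[1] > MOVE_THRESHOLD:
        return "zoom_out", armed                        # gesture 7: right hand moves down
    if right.pinching:
        return "select_or_move", armed                  # gestures 3, 5, 8: pinch with the right hand
    return "idle", armed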
4. Evaluation
A user evaluation of the gesture interface was conducted with 10 male volunteers, with an average age of 25.6 years. The volunteers were all computer literate, but none of them had used virtual reality hardware before. At the beginning of the experiment a user guide was given listing all the gestures/commands that the users could perform, and each one of them was explained graphically by the researcher. After feeling comfortable wearing the input devices, each volunteer was asked to perform five simple tasks: 1. open the application, 2. insert at least four objects, 3. move each inserted object to a different corner of the plane, 4. select each inserted object and remove it from the scene, 5. close the application. When the application was opened a timer started counting, and when it was closed the end time of the experiment was registered. The volunteers were asked to repeat the same five tasks in the same order five times. The average time spent by a participant completing the whole experiment was fourteen minutes and nineteen seconds. Finally, after completing all the repetitions, the volunteers were asked to answer a questionnaire about the experience of using the gesture interface. The first three questions in the questionnaire were about the general use of the interface, its friendly/unfriendly layout, and the whole 3D experience. Questions four to eight enquired about the ease of use of the bimanual gesture commands. The questions posed to the users were the following, where the volunteers were given a five-point Likert-like scale from 'not easy at all' to 'very easy' and from 'Poor' to 'Excellent':

1. How easy is it to use the interface?
2. What do you think about the interface layout?
3. How do you rate the overall experience of constructing a space within an immersive 3D environment?
4. How easy is it to insert an object?
5. How easy is it to move an object?
6. How easy is it to delete an object?
7. How easy is it to change the point of view?
8. How easy is it to close the application?

Furthermore, a general box for further comments was given at the end of the questionnaire. The mean and standard deviation of the volunteers' answers are shown in Table 2, while Graphic 1 shows the distribution of the answers graphically.
Graphic 1. Results of the usability test: distribution of the answers to questions 1-8, rated from 'poor/not easy' to 'Excellent/Very Easy'.

None of the users judged the interface poor. The majority of users judged the quality of the interface and the ease of use of the gestures above average, and the 3D immersive environment excellent; see Table 2.
Questions   Mean   Standard Deviation
Q1          3.7    0.67
Q2          3.5    0.85
Q3          4.3    0.67
Q4          4.7    0.48
Q5          3.9    0.74
Q6          4.2    0.63
Q7          3.5    1.08
Q8          4.1    1.10

Table 2. Results of the Gesture Interface evaluation
Consequently, it can be concluded that the overall functionality of the system was well accepted by all the participants, and in particular the ease of use of the system was very satisfactory. This result was expected, since the gestures are natural, respecting the bimanual theory, and there are no added third-person points of view, extra complexity or icons.
5. Conclusions
A new gesture-based interface has been presented that provides a new way of constructing and manipulating virtual scenarios through two-hand gesture commands within a three-dimensional environment. The gesture language implemented in the system is based on the theory of bimanual gesture activities: the preferred hand is used where precision is necessary, tasks are performed faster and more easily using two hands when appropriate, and gestures are registered without having to point at the screen all the time. The use of the bimanual theory of interaction, as shown by the results of the usability test, facilitates the interaction with the system, suggesting a more natural immersive interface.
Acknowledgements
The authors gratefully acknowledge the financial support received from the Mexican Council of Science and Technology (CONACyT).
References
[1] M. Chen, S. J. Mountford, A. Sellen, "A Study in Interactive 3-D Rotation Using 2-D Control Devices," Computer Graphics, 22(4), August 1988.
[2] K. Shoemake, "ARCBALL: A User Interface for Specifying Three-Dimensional Orientation Using a Mouse," Graphics Interface, 1992.
[3] R. A. Bolt, "'Put-that-there': Voice and Gesture at the Graphics Interface," Proceedings of the 7th Annual Conference on Computer Graphics and Interactive Techniques, Seattle, Washington, United States, 1980.
[4] M. Krueger, Artificial Reality II (Reading, MA: Addison-Wesley, 1991).
[5] J. S. Pierce, A. S. Forsberg, M. J. Conway, S. Hong, R. C. Zeleznik, and M. R. Mine, "Image Plane Interaction Techniques in 3D Immersive Environments," Proceedings of the 1997 Symposium on Interactive 3D Graphics, ACM Press, 1997, pp. 39-ff.
[6] Y. Boussemart, F. Rioux, F. Rudzicz, M. Wozniewski, J. R. Cooperstock, "A Framework for 3D Visualisation and Manipulation in an Immersive Space Using an Untethered Bimanual Gestural Interface," Proceedings of the ACM Symposium on Virtual Reality Software and Technology, Hong Kong, 2004.
[7] Y. Guiard, "Asymmetric Division of Labor in Human Skilled Bimanual Action: The Kinematic Chain as a Model," The Journal of Motor Behavior, 19(4), 1987.
[8] Y. Ha, S. C. Oh, J. H. Kim and Y. B. Kwon, "Unconstrained Handwritten Word Recognition with Interconnected Hidden Markov Models," IWFHR-3, May 1993, pp. 455-460.
[9] T. Fujisaki, K. Nathan, W. Cho and H. Beigi, "On-line Unconstrained Handwriting Recognition by a Probabilistic Method," IWFHR-3, May 1993, pp. 235-241.
[10] Y. Nam and K. Y. Wohn, "Recognition of Space-Time Hand-Gestures Using Hidden Markov Model," ACM Symposium on Virtual Reality Software and Technology, Hong Kong, 1996, pp. 51-58.
[11] D. Hopkins, "The Design and Implementation of Pie Menus," Dr. Dobb's Journal, 16(12), 1991.