Comparing Eye and Gesture Pointing to Drag Items on Large Screens

ITS '13 Posters, October 6–9, 2013, St. Andrews, UK

Ilkka Kosunen, Helsinki Institute for Information Technology HIIT, Aalto University, [email protected]
Antti Jylhä, Department of Computer Science, University of Helsinki, [email protected]
Imtiaj Ahmed, Helsinki Institute for Information Technology HIIT, University of Helsinki, [email protected]
Chao An, Helsinki Institute for Information Technology HIIT, University of Helsinki, [email protected]
Luca Chech, University of Padova, Italy, [email protected]
Luciano Gamberini, University of Padova, Italy, [email protected]
Marc Cavazza, Teesside University, England, UK, [email protected]
Giulio Jacucci, Department of Computer Science, University of Helsinki, [email protected]

Abstract
Large screens are populating a variety of settings, motivating research on appropriate interaction techniques. While gesture pointing has been popularized by depth cameras, we contribute a comparison study showing that eye pointing is a valuable substitute for gesture pointing in dragging tasks. We compare eye pointing combined with gesture selection against gesture pointing and selection. The results clearly show that eye pointing combined with a selection gesture allows more accurate and faster dragging.

Author Keywords
Eye tracking; hand tracking; evaluation methods; human factors; large screens; multimodal interaction

ACM Classification Keywords
H.5.2. [Information Interfaces and Presentation (e.g., HCI)]: User Interfaces

Introduction

Large displays have become ubiquitous in both public and private spaces. Store windows run video advertisements and bus stops display up-to-date timetables. In private spaces, the size of TVs has increased dramatically and even full-wall video projectors are not uncommon. Still, most displays are either passive or used, as in the case of TVs, with dedicated devices such as remote controls that


are not practical in public spaces and do not support natural interaction. A number of solutions have been explored for interacting with large displays at different distances (e.g., [5]). One solution for interactive public displays is multitouch interaction, which can be useful in some cases but is impractical when touch is inappropriate or the user is far from the screen. A more recent approach is to track movements of the body, such as gestures, which became popular with the spread of depth cameras that provide a cheap and easy way to access depth-tracking information. There are, however, known limitations of gesture-based interaction, including fatigue, the reliability of gesture recognition, and the difficulty of designing gestures that are both natural and intuitive for all users.

Another direct modality, eye gaze, has also been studied extensively, especially for accessible interfaces. Generally, the eyes have been found to offer a fast and reasonably accurate way of pointing, but to be less suitable for selection tasks [1, 2]. Two main issues are usually associated with this modality: selecting with the eyes seems to incur high cognitive load and discomfort, and there is the so-called Midas Touch phenomenon [3], i.e., users inadvertently activating the objects they are looking at.

Figure 1: Interaction flow.

To overcome the problems inherent in both approaches on their own, we propose a solution that does not require any physical devices, does not impose restrictions on the distance from the screen or on the size and placement of the content, and is hygienic while at the same time easy, natural, and fast to use. The solution combines the speed and precision of eye pointing with the naturalness and effortlessness of hand selection. We contribute a comparison study in which we compare, in dragging tasks, eye pointing combined with gesture selection against gesture pointing and selection.

Comparison Study
We compared eye pointing with gesture selection against a similar setup in which the eye pointing was replaced by pointing with the same hand used for the gesture selection. Thus, the only difference between the two conditions was the pointing paradigm.

Interaction Techniques
The goal of the user was to point at a target object, use a pinch gesture to select the object, then move the object to a target location, and release the object by releasing the pinch gesture. The interaction flow can be seen in Figure 1.

Hand Pointing
After assessing different pointing paradigms, we chose to combine the concepts of the Fixed-origin technique [4] (casting a ray along the line from one fixed point, e.g., the hip center, to the hand) and the Relative space technique [6] (mapping movement in a space relative to the body onto the absolute screen space). We used the right hand for pointing. The center, or origin, of the hand movement space was below the right shoulder. The hand movement in 2D space relative to this origin was transformed into 2D screen coordinates, which were used to position the hand cursor on the screen. Since the pointing depended only on the hand movement relative to the right shoulder, users were able to move and rotate while pointing.

Eye Pointing
We mapped the normalized 2D gaze point provided by the eye tracker into 2D screen coordinates. If the eye tracker reported invalid samples, for example when the subject was blinking or looking outside the screen, we let the hand-shaped cursor remain stationary. This allowed the experience to seem smooth even during large head movements or other interference.
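As a rough illustration of the two pointing mappings described above, the following sketch shows how a hand position relative to the shoulder origin, or a normalized gaze sample, might be converted to screen coordinates. The screen resolution, gain factors, and tracker field names are assumptions for illustration and are not taken from the authors' implementation.

```python
# Minimal sketch of the two pointing mappings; gains, resolution, and
# tracker field names are illustrative assumptions.

SCREEN_W, SCREEN_H = 1920, 1080   # assumed screen resolution in pixels

def clamp(v, lo, hi):
    return max(lo, min(hi, v))

def hand_to_screen(hand_xy, shoulder_xy, gain=(3.0, 3.0)):
    """Relative-space hand pointing: map the hand's displacement from a
    fixed origin below the right shoulder to absolute screen pixels."""
    dx = hand_xy[0] - shoulder_xy[0]            # tracker units (e.g., metres)
    dy = hand_xy[1] - shoulder_xy[1]
    x = SCREEN_W / 2 + dx * gain[0] * SCREEN_W
    y = SCREEN_H / 2 - dy * gain[1] * SCREEN_H  # screen y grows downward
    return clamp(x, 0, SCREEN_W - 1), clamp(y, 0, SCREEN_H - 1)

def gaze_to_screen(sample, last_xy):
    """Eye pointing: scale the tracker's normalized gaze point to screen
    pixels; keep the cursor where it was when the sample is invalid
    (blink, gaze off the screen)."""
    if sample is None or not sample.valid:
        return last_xy
    return sample.norm_x * SCREEN_W, sample.norm_y * SCREEN_H
```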


Pinch Gesture
For selection we implemented a pinch gesture: a gesture of connecting the index finger and the thumb. We first tried to use a grabbing motion as the selection gesture, but the open hand and the closed fist can be very hard to distinguish in the depth camera data when the user is allowed to freely move and rotate their hand. The pinch gesture, on the other hand, creates a distinct shape both when "open" and when "closed", allowing robust recognition from any angle. We also found that pinching feels more accurate than grabbing: when grabbing, the hand moves slightly and the detected position even more, while the pinching motion involves only two fingers, resulting in negligible movement.
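The paper does not describe the recognizer itself. One plausible sketch is to threshold the thumb-index fingertip distance, with a small hysteresis band so the selection does not flicker near the threshold; the distances below are assumptions, not the authors' values.

```python
# Illustrative pinch detector based on thumb-index fingertip distance.
# Thresholds and the hysteresis band are assumptions, not the authors'
# implementation.

import math

PINCH_CLOSE = 0.025  # metres: distance below which a pinch (selection) starts
PINCH_OPEN = 0.045   # metres: distance above which the pinch (selection) ends

class PinchDetector:
    def __init__(self):
        self.pinching = False

    def update(self, thumb_tip, index_tip):
        """Return True while the pinch is held (object stays selected)."""
        d = math.dist(thumb_tip, index_tip)
        if self.pinching and d > PINCH_OPEN:
            self.pinching = False   # fingers separated: release the object
        elif not self.pinching and d < PINCH_CLOSE:
            self.pinching = True    # fingers touched: select the object
        return self.pinching
```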

Participants
Eight participants were recruited for this study (3 male, 5 female). Their ages ranged from 19 to 31 years, with a mean of 24.4 years (SD = 4). All participants had normal or corrected-to-normal vision.

Experiment Setup
The task was to move a hand-shaped cursor over a spherical object, select and drag the object over a circular target area, and finally drop it as close as possible to the center of the target, which was marked with a black cross. The object could be located 718, 844, or 970 pixels away from the target. Three different values were also chosen for the radius of the target area (75, 89, and 103 pixels) to encompass a wide range of movements with different levels of complexity. Each subject completed 9 blocks of trials, resulting from all possible combinations of the object-target distances and the target radii. Both the object-target distance and the target radius varied from one block to another, but not within the same block.

Each block was made up of 5 trials, so the whole task comprised 45 trials. The targets composing a block were arranged in a circular fashion [7] and only one of them was shown to the participant at a time (see Figure 2). At the end of each trial (when the object was released) the old target would disappear and a new one would appear at the next designated position. Furthermore, at the end of each trial the object was automatically placed at the center of the previous target (regardless of where it had been released). The order in which the blocks were presented was randomized.
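A sketch of how the 3 × 3 design above could be generated: nine blocks from all distance-radius combinations, each with its targets laid out on a circle and presented one at a time. The layout center and the exact placement rule are assumptions; the paper only states that the targets were arranged circularly [7].

```python
# Sketch of the block/trial layout: 3 distances x 3 radii = 9 blocks,
# each with 5 targets placed on a circle. Layout centre and placement
# rule are illustrative assumptions.

import itertools
import math
import random

DISTANCES = (718, 844, 970)   # object-to-target distance, pixels
RADII = (75, 89, 103)         # target radius, pixels
TRIALS_PER_BLOCK = 5
CENTER = (960, 540)           # assumed layout centre on the screen

def circular_targets(distance, n=TRIALS_PER_BLOCK):
    """Place n target centres evenly on a circle whose diameter equals the
    nominal object-target distance."""
    r = distance / 2
    return [(CENTER[0] + r * math.cos(2 * math.pi * i / n),
             CENTER[1] + r * math.sin(2 * math.pi * i / n))
            for i in range(n)]

blocks = [{"distance": d, "radius": r, "targets": circular_targets(d)}
          for d, r in itertools.product(DISTANCES, RADII)]
random.shuffle(blocks)        # block order was randomized for each participant
```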

Figure 2: Screen capture of the experiment setup. The user is asked to move the circular object to the green target.

Two aspects were measured: the duration of the dragging task, from the moment the object was selected to the moment it was released, and the accuracy of the task, measured as the number of pixels from the center of the target circle to the location where the object was released.
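For illustration, the two measures could be computed per trial roughly as follows; the trial log field names are hypothetical.

```python
# Per-trial measures described above; the log field names are hypothetical.

import math

def trial_measures(trial):
    """Duration from selection (pinch closed) to release (pinch opened),
    and error as the pixel distance from the target centre to the point
    where the object was dropped."""
    duration = trial.release_time - trial.select_time             # seconds
    error = math.dist(trial.release_xy, trial.target_center_xy)   # pixels
    return duration, error
```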

Results
Each subject performed 45 trials of dragging tasks with both eye and hand pointing. Trials were divided into nine groups of five, where each of the five trials had the same


distance and target size. However, due to an issue in the logging procedure, only the first four of the five trials were logged, producing 4 × 9 = 36 trials per pointing modality, for a total of 72 trials per subject.


The results for trial completion time and pointing error are summarized in Figure 3. There was a significant difference in the time required to perform one trial between eye pointing (M = 2.73 s, SD = 1.7) and hand pointing (M = 3.51 s, SD = 2.26): eye pointing was approximately 29% faster than hand pointing, and the difference was highly significant (p < .0001). There was also a significant difference in the amount of error, i.e., the distance between the location where the dragged object was released and the target center, between eye pointing (M = 75.52 px, SD = 156.29) and hand pointing (M = 199.67 px, SD = 263.81): eye pointing was approximately 2.5 times more accurate than hand pointing, and the difference was highly significant (p < .0001). We also examined whether pointing speed and accuracy follow Fitts's law, but there was no statistically significant difference in pointing speed or accuracy between the different distances and target sizes.
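The paper reports the p-values but does not name the statistical test. As an illustration only, a paired comparison over per-participant means could be run as in the sketch below (SciPy is assumed; this is not necessarily the authors' analysis).

```python
# Hypothetical analysis sketch: paired t-test on per-participant means for
# one measure (trial time or pixel error), eye vs. hand pointing.
# The paper does not state which test produced its p-values.

import numpy as np
from scipy import stats

def compare_modalities(eye, hand):
    """Return the paired t statistic, p-value, and the relative reduction
    of the mean (e.g., ~0.29 for the reported 29% speed-up)."""
    eye = np.asarray(eye, dtype=float)
    hand = np.asarray(hand, dtype=float)
    t, p = stats.ttest_rel(eye, hand)
    return t, p, 1.0 - eye.mean() / hand.mean()
```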

Conclusion

Figure 3: Differences between pointing modalities. Top: error, bottom: trial completion time.

Comparing hand pointing and gesture selection to eye pointing with hand gesture selection, we found that eye pointing was significantly faster and more accurate than hand pointing in a dragging task. Moreover, combining eye pointing with gesture-based selection helps overcome typical issues with eye-based selection, such as the Midas Touch phenomenon and dwell-time effects. In future work we will extend this study beyond the abstract setup to a more contextualized setting.

Acknowledgements
This work has been partially funded by the EU FET grant CEEDS ICT-258749.

References
[1] Bieg, H.-J., Chuang, L. L., Fleming, R. W., Reiterer, H., and Bülthoff, H. H. Eye and pointer coordination in search and selection tasks. In Proc. 2010 Symposium on Eye-Tracking Research & Applications (ETRA '10), ACM (2010), 89–92.
[2] Fono, D., and Vertegaal, R. EyeWindows: Evaluation of eye-controlled zooming windows for focus selection. In Proc. CHI '05, ACM (2005), 151–160.
[3] Jacob, R. J. The use of eye movements in human-computer interaction techniques: What you look at is what you get. ACM Transactions on Information Systems (TOIS) 9, 2 (1991), 152–169.
[4] Jota, R., Nacenta, M. A., Jorge, J. A., Carpendale, S., and Greenberg, S. A comparison of ray pointing techniques for very large displays. In Proc. Graphics Interface (GI '10), Canadian Information Processing Society (2010), 269–276.
[5] Vogel, D., and Balakrishnan, R. Interactive public ambient displays: Transitioning from implicit to explicit, public to personal, interaction with multiple users. In Proc. UIST '04, ACM (2004), 137–146.
[6] Vogel, D., and Balakrishnan, R. Distant freehand pointing and clicking on very large, high resolution displays. In Proc. UIST '05, ACM (2005), 33–42.
[7] Zhang, X., and MacKenzie, I. S. Evaluating eye tracking with ISO 9241 Part 9. In Human-Computer Interaction. HCI Intelligent Multimodal Interaction Environments. Springer (2007), 779–788.
