Papers, CHI 99, 15-20 MAY 1999

Navigation as Multiscale Pointing: Extending Fitts’ Model to Very High Precision Tasks

Yves Guiard
Mouvement & Perception, CNRS & Université de la Méditerranée - France
+33(0)491 172257
[email protected]

Michel Beaudouin-Lafon
Laboratoire de Recherche en Informatique, CNRS & Université de Paris-Sud - France
+33 (0)1 69 15 69 10
[email protected]

Denis Mottet
Faculté des Sciences du Sport, Université de Poitiers - France
+33 (0)5 49 45 33 43
[email protected]

ABSTRACT
Fitts’ pointing model has proven extremely useful for understanding basic selection in WIMP user interfaces. Yet today’s interfaces involve more complex navigation within electronic environments. As navigation amounts to a form of multi-scale pointing, Fitts’ model can be applied to these more complex tasks. We report the results of a preliminary pointing experiment that shows that users can handle higher levels of task difficulty with two-scale rather than traditional one-scale pointing control. Also, in tasks with very high-precision hand movements, performance is higher with a stylus than with a mouse.

Keywords
Fitts’ law, pointing, navigation, multiscale interfaces, input devices, stylus, mouse, graphical tablet

INTRODUCTION
Interfaces in the 1980s relied heavily on pointing to modify window size, select icons, and choose from menus. Today’s user interfaces have moved beyond the desktop metaphor, enabling users to explore far richer information spaces, including virtual offices, libraries and “cities of knowledge” [e.g., 4, 6, 22]. These interfaces can all be considered “zoomable” in one way or another, allowing users to navigate through electronic worlds at different scales [9]. These new interfaces require additional theorizing on the nature of human movement in human-computer interaction (HCI). We believe the distinction between pointing and navigation is of special importance. Pointing, which involves simple aimed movements of the hand typically performed on the surface of a desktop by a seated individual, has long been understood in HCI in light of Fitts’ law [see 17 for a review]. Navigation, unlike pointing, is a metaphor in HCI: that of a living organism moving itself as a whole relative to a complex environment that is only partially accessible to the senses.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. CHI ’99 Pittsburgh PA USA Copyright ACM 1999 0-201-48559-1/99/05...$5.00

So pointing and navigation, two categories of human motion, apparently call for different conceptualizations. Below we discuss some difficulties that seem to hinder the application of Fitts’ pointing model to multiscale navigation. We suggest that some apparent obstacles vanish if the abstractness and generality of Fitts’ model is fully acknowledged.

The Scale Problem in Graphical User Interfaces

Scale has always been a concern in HCI. The first computer screens showed too few lines to make an entire document visible. An early solution was to provide users with four cursor keys for local movements of the cursor and page-up and page-down keys for large-scale movements. WIMP (Windows, Icons, Menus, and Pointing) interfaces provide a clearer separation between document and desk levels, with the option of either working within the document or manipulating it as a whole on the virtual desktop. The problem remains, however, that windows are typically smaller than the documents they display (all the more so with multiple windowing), hence the scrollbar, widely used in today’s applications. The scrollbar represents all the space along one dimension while the thumb represents the position, and sometimes the size, of the window. To reach a particular out-of-view target, one first moves the thumb to move the window; when the target is revealed within the window, one positions the cursor to make the selection. This amounts to two-scale pointing with a coarse-grain first component and a fine-grain second component.

Scrollbars represent the document at a scale s = w/d determined by the ratio of window size (w) and document size (d). Moving the thumb by p pixels scrolls the document by p/s pixels. If the document is larger than the square of the window size (d > w²), a problem arises: moving the thumb by one pixel scrolls the document by more than a full window, making parts of the document inaccessible.
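The scrollbar arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example sizes are hypothetical, and the scrollbar track is assumed to be as long as the window, as the definition s = w/d implies.

```python
def scrollbar_scale(window_px: float, document_px: float) -> float:
    """Scale s = w/d at which the scrollbar represents the document."""
    return window_px / document_px


def document_scroll(thumb_px: float, window_px: float, document_px: float) -> float:
    """Document displacement (pixels) caused by moving the thumb by thumb_px pixels (p/s)."""
    return thumb_px / scrollbar_scale(window_px, document_px)


def one_pixel_skips_content(window_px: float, document_px: float) -> bool:
    """True when a one-pixel thumb move scrolls by more than a full window (d > w squared)."""
    return document_scroll(1, window_px, document_px) > window_px


# Hypothetical sizes: a 600-pixel window on documents of 30,000 and 400,000 pixels.
print(document_scroll(1, 600, 30_000))        # about 50 document pixels per thumb pixel
print(one_pixel_skips_content(600, 30_000))   # False: 30,000 < 600 squared
print(one_pixel_skips_content(600, 400_000))  # True: 400,000 > 600 squared
```

With the smaller document, each thumb pixel already moves the content by about 50 pixels, which matches the loss-of-context problem the text describes for documents 20-50 times the window size.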
Also, when documents are large relative to the window size, even small thumb movements make the contents of the window jump, causing the user to lose context. This occurs frequently, typically with documents that are 20-50 times the size of the display window. One solution is to let the user control the scale. Many drawing programs (e.g., Claris Draw or Adobe Photoshop) allow zooming in and out, usually by a predetermined set of scales. Zooming interfaces (e.g., Pad++) offer continuous control over the scale, allowing elements of the space to be visible or concealed according to viewing scale [semantic zooming, 9]. Attempts to improve the scrollbar device include new mice with a finger device, a wheel (Intellimouse) or a small joystick (Scrollpoint), bound to the scrollbar of compatible applications. These devices allow users to maintain their focus on the document rather than shift attention to the scrollbar.

Because global navigation (e.g., with a scrollbar) and local pointing within a window are used in combination to reach a specified target, we can treat them as the macro and micro components of the same act of selection. This insight enables us to extend the application of Fitts’ law from simple pointing to more complex navigation tasks.

Fitts’ Law in WIMP Interfaces

According to Fitts [7, 8], movement time (MT, the time to reach a target of width W placed at a distance D) varies linearly with log2(2D/W). Better data fits may be obtained with linear [21] or power [19] models, but in this paper we need only to retain the basic, truly important fact acknowledged by all versions of Fitts’ law, namely that task difficulty is determined by the dimensionless ratio D/W. According to this generic formulation of Fitts’ law, MT = f(D/W).

Since Card’s first demonstration of the suitability of Fitts’ model to characterize the user’s motor action in HCI [5], Fitts’ law has been a very successful tool for designing and evaluating interfaces [17]. The success of Fitts’ model is largely due to the fact that most actions carried out with the pointer in a desktop environment amount to, or at least involve, simple target acquisitions. Accot and Zhai [2] recently argued that Fitts’ model does not take into account constraints on trajectories such as those imposed by hierarchical menus, but they were able to demonstrate that the law does generalize to the case of motion along paths of arbitrary shapes and widths.

Peter Pan Pointing in Complex, Multiscale Electronic Worlds

Does Fitts’ law have any relevance at all to navigation in the first place? Imagine that, like Peter Pan, you can freely move in a purely kinematic 3D space, that is, in a world with time and length but no mass. As you fly in 3D space, you experience none of the constraints of the real world (e.g., gravity, limited speed, limited acceleration). Your only limit is your rate of information intake, since to guide your locomotion, you need to accommodate the flow of optical information induced by your motion. In such a world, if you started from Paris, you could easily pick up a flower in Central Park, New York City (assuming this is permitted!). You would just rise up to a sufficient height (zooming out) that you could simultaneously see Europe and North America, then quickly traverse the Atlantic (panning), and finally plummet to the East Coast of the U.S. (zooming in and panning), regulating your progress as your perception of location got finer and finer: New York City, Manhattan, Central Park, a nice flower bed, and, in the end, one particular flower. As you grasped the stem with the tip of your fingers, you would have just executed a pointing movement over a huge distance (D > 5,000 km) and, more importantly, with a tremendous accuracy (W = about 1 mm) given the distance covered. According to Fitts’ index of difficulty (ID), this task would involve a spatial difficulty of log2(5 × 10^6 / 10^-3 + 1) = about 32 bits.¹

Using the terminology of Furnas and Bederson [9], Peter Pan’s world is a 2+1D world with two dimensions of space (latitude and longitude) and one of scale (altitude). Obviously, such a world is no longer as imaginary as it was in 1952, when Walt Disney produced his animated movie. For example, geographical information systems will let you move to any location on the planet, even specific buildings, by panning and zooming. In fact, this characterizes the management of navigation in complex information visualization systems in general [e.g., 4, 6].

So Fitts’ pointing model does indeed seem to have relevance to the problem of navigation in complex, multiscale electronic worlds. Yet, navigation is certainly not pointing. Below we examine some apparent and real differences involving the spatial dimensionality of movement, coordinate systems, and task difficulty.

The Spatial Dimensionality of Movements

In a tapping task like Fitts’ [7], participants must use the hand to reach a target placed on a tabletop. Although the hand moves typically in 3D space, the task is one-dimensional (1D) in the sense that D, W, and movement endpoints are measured along a single line. In recent years, cognitively oriented studies [e.g., 19] have often used even simpler mono-articular movements, since a single spatial dimension or a single articular degree of freedom (DOF) suffices to capture the essence of Fitts’ law. Whereas the bulk of Fitts’ law research has been concerned with 1D space, navigation typically has to do with 2D or 3D space. Fitts’ model, however, need not be confined to the 1D world. Fitts’ law has already been generalized to the 2D space [18, 20] and accommodating aimed movements in 3D space would require just one more step, presumably using the same logic as in [18]. So the spatial dimensionality of movements should cause no real difficulty when applying Fitts’ ideas to multidimensional navigation.

¹ As is customary in HCI, we define the ID as log2(D/W + 1) (see [17] for justifications).
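As a quick check of the Peter Pan figure, the ID definition in the footnote can be evaluated directly. This is a minimal sketch: the function name is an illustrative choice, and D and W are expressed in metres (5,000 km and 1 mm).

```python
import math


def index_of_difficulty(distance: float, width: float) -> float:
    """ID in bits, defined as log2(D/W + 1); D and W must share the same unit."""
    return math.log2(distance / width + 1)


# Peter Pan pointing: D = 5,000 km = 5e6 m, W = 1 mm = 1e-3 m.
peter_pan_id = index_of_difficulty(5.0e6, 1.0e-3)
print(round(peter_pan_id, 1))  # 32.2
```

Because only the ratio D/W enters the formula, any common unit gives the same result.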

The Task Coordinate System

Fitts’ law has been studied with a variety of tasks involving either hand translations in task space or rotations in the angular space of a single joint. Note that since Fitts’ law links MT to the dimensionless ratio D/W, it must hold with any coordinate system, whether Cartesian or articular. Recourse to single-joint movement to study Fitts’ law may look attractive to those interested in rigorous experimental control. However, this simplification takes us one step toward physiology and one step away from the spirit of Fitts’ research because of its emphasis on the actor’s body space. Fitts’ law, whose validity is quite independent of the muscular and skeletal means involved in reaching the target, is in essence a relationship within task space. To make this clear, let us compare four target-acquisition tasks of increasing complexity in body space.

• Task 1: Moving a cursor to a target by means of a wrist pronation/supination as in [19].

• Task 2: Hand tapping on a tabletop as in Fitts’ experiments, with the participant seated.

• Task 3: Same as Task 2 except that D = 100 m, so that actual running is required.

• Task 4: Peter Pan pointing with D = 5,000 km, so that virtual flying is required.

In body space, Task 1 is the only one that requires a simple movement. The movement in Task 2 is already very complicated: Fitts [7] had his participants cover a distance with the hand that varied from 5 to 40 cm. Considering the body’s coordinate system, a 5-cm and a 40-cm hand movement represent strikingly different acts since the former mainly involves the most distal joints of the upper limb (wrist and fingers) whereas the latter heavily involves the proximal joints of the elbow and shoulder [15]. So it is tempting to criticize Fitts’ experiment on the grounds that changes in D were confounded with dramatic biomechanical changes. In fact such a criticism does not apply if one uses the appropriate coordinate system. The task paradigm with which Fitts established his law involves a kind of movement that, however complex in body space, remains quite simple in task space.

Importantly, the same argument holds for Tasks 3 and 4. Whether one physically runs in order to move the stylus 100 m or virtually flies to move oneself (or a visualization window) 5,000 km is in no way more problematic than mobilizing all the DOF of one’s arm in Fitts’ tapping task. In Tasks 2 through 4, the uncontrolled biomechanical complexity is indeed enormous, but irrelevant to Fitts’ model. This model, which only considers target distance and target width, has essentially to do, not with movements of arm segments in body space, but rather with moves in task space.² In task space, which incorporates the target and the cursor, lengths D and W are measurable for all four tasks and hence an ID can always be computed. Tasks 2 through 4 comply just as rigorously as Task 1 with the operational requirements of Fitts’ paradigm.

In sum, the motor activities involved in pointing and navigation are quite different in terms of the usual body-centered view of human movement. However, they can be tackled in the same conceptual framework, provided they are treated as moves in task space, the coordinate system that happens to be most appropriate to Fitts’ law.

Scale and Task Difficulty

As first hinted by Fitts, the difficulty of an aimed movement is best estimated as the logarithm of the inverse of a probability ratio, namely the ratio of the size of the target destination subset and the size of the set of accessible destinations. The best justification for using Fitts’ logic here is that it provides an abstract metric of task difficulty, based on Shannon’s notion of information, for jointly thinking about pointing and navigation.

[Figure 1: histogram. Horizontal axis: ID Max (bits), 2 to 10. Vertical axis: Number of Studies, 0 to 12.]

Figure 1. Distribution of ID max for a random sample of 20 Fitts’ law studies. IDs were (re)computed as log2(D/W + 1).

Using the metric of information, task difficulty in typical navigation tasks far exceeds that seen in traditional pointing. For instance, selection of one Web page from among the 5,000,000 pages made accessible by the Library of Congress [22] involves an ID of about 22 bits. Compare this to Fitts’ law research, in which the task ID rarely exceeds 8 bits. Figure 1 shows our estimate of the distribution of the maximum ID used in Fitts’ law research,³ based on a random sample of 20 studies.

² The game of chess is an exemplary instance in which only moves (piece displacements in the chessboard coordinate system) are considered, the gestural technique of manipulating the pieces (the hand movement) being ignored as irrelevant.

³ Figure 1 is based exclusively on studies using the classic time-minimization paradigm [19].
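The information metric described above, the logarithm of the ratio between the set of accessible destinations and the target subset, can be illustrated as follows. This is a sketch under assumptions: the function name is hypothetical, and the Web page example treats every page as an equally sized destination.

```python
import math


def selection_difficulty_bits(accessible_size: float, target_size: float = 1.0) -> float:
    """Difficulty in bits: log2 of (size of accessible set / size of target subset)."""
    return math.log2(accessible_size / target_size)


# One Web page chosen among 5,000,000 accessible pages.
print(round(selection_difficulty_bits(5_000_000), 1))  # 22.3
```

The result agrees with the figure of about 22 bits quoted in the text.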


It is no surprise that the range of manageable IDs appears so limited in Fitts’ law research. In tapping, for example, if one uses only the arm to the exclusion of the trunk and legs, then the hand can cover a maximum D of about half a meter, while the smallest workable target size, without the help of a magnifying lens, is about a millimeter. So an ID of about 9 bits (log2(0.5/0.001 + 1)) does indeed represent an upper limit for hand pointing in traditional Fitts’ law experimentation. Consistent with this observation, numerous data sets from the literature show dramatic decrements in bandwidth as the ID rises beyond 4-6 bits [7, 8, 10, 12, 16].

The ID is strongly limited in traditional pointing because it involves selection at a single scale. As has been occasionally noted [e.g., 9], this limitation must disappear in multiscale navigation. This is most easily understandable in the 1D case, by simply comparing one vs. two levels of scale. Consider a device like an optical microscope with a macro and a micro control knob. Notice that even with a double-knobbed microscope, the task of adjusting the lens/section distance remains in essence a 1D target acquisition task: it is simply a two-scale, as well as a two-step, pointing task, and this task must abide by Fitts’ law.

Many sophisticated devices have been designed with multi-knob control systems in order to accommodate the limitations of the human perceptual-motor system. Multi-knob controls make it possible to handle very high D/W ratios that cannot be achieved with single-scale controls. For example, one regulated power supply of Lambda Electronics Inc. has two concentric knobs for the control of the output voltage. The external, macro knob offers a high-gain control (148 mV per degree of knob rotation), making it possible to cover, with a limited resolution, the whole voltage range (0-40 V for a 270° rotation).
The gain of the internal, micro knob is only 4 mV/deg, which allows the user to set the voltage to the nearest 1/100 of a volt. With this two-scale control, a total of 4,000 settings can be differentiated and the information the user can produce is 12 bits, a figure well above the maxima of task difficulty identified in Figure 1.

AN EXPERIMENT ON TWO-SCALE POINTING

We investigated two-scale pointing with an ID of 12.2 bits (D/W = 4800), a level of task difficulty which, to our knowledge, has not yet been explored in Fitts’ law research. Participants were presented with two cursors representing a single input device, a puck or a stylus on a digitizing tablet. One cursor, the macro cursor, provided a complete, but low-resolution view of the task, while the other, the micro cursor, provided a high-resolution view of the target region. The experiment was designed to test two hypotheses.

• H1. In standard single-scale pointing, the decay of the rate of information processing makes it difficult to handle IDs greater than 9 bits. This limitation should vanish with a suitable two-scale (if not multiscale) representation of the movement. Not only should users provided with double-scale control be able to successfully handle an ID far above the usual maxima of Fitts’ law research, but their processing rate should remain similar to that observed in standard single-scale pointing.

The capacity limitation for single-scale pointing obviously reflects the existence of both a numerator maximum and a denominator minimum in the D/W ratio. In the present experiment, participants were asked to handle an unusually small W by using the equivalent of a magnifying lens, while the D to cover remained relatively large. In their famous microscopic pointing experiment, Langolf et al. [16] required very fine-grained movements, but the whole range of the movement was visible within the field of the microscope, making this a single-scale task of standard difficulty (ID max = 8.2 bits). Our hypothesis was that people can master higher levels of ID if they are provided with a two-scale control.

• H2. At a high level of task difficulty, the stylus should allow better performance than the puck or the mouse because it more fully exploits the high-resolution movement capability of the fingers.

Card’s [5] early demonstration that the mouse was a nearly optimal input device has received much confirmation, even after the introduction of digitizing tablets with a stylus as the pointing device [e.g., 14]. So far, however, authors have only considered relatively easy pointing tasks with single-scale control. We reasoned that the stylus would most clearly surpass the mouse with movements involving a very fine terminal adjustment because the stylus, with its pen shape, elicits a precision grip favorable to fine finger motion, whereas the mouse elicits a power grip which allows very limited mobilization of joints more distal than the wrist. We used a Wacom puck with absolute mapping, rather than a mouse with relative mapping, to make shape the only experimental contrast between the puck and the stylus. Note that the Wacom puck, quite similar in shape to a regular mouse, elicits a similar hand posture.

Methods

Participants

Eight unpaid adult volunteers (seven male and one female, all with normal or corrected-to-normal vision) participated in the experiment.

Equipment

The experiment was programmed on an Apple Workgroup Server 7250 connected to an 832 x 634 pixel screen. We used a Wacom A4 digitizing tablet (304 x 304 mm), which discriminates 15,240 positions on each of its two dimensions. The pointing tools were the Wacom puck UltraPoint Ergonomic UC-520 and the Wacom stylus UltraPen Eraser UP-801E.

Screen Display and Task Conditions

The pointing task was one-dimensional, with the pointing tool (either a puck or a stylus) and the screen cursors moving along the horizontal dimension. Because the tablet range (T) was more than 18 times larger than the screen range (S), no one-to-one mapping of the former onto the latter was possible. The screen display offered, on two horizontal lines separated by a distance of 150 pixels, two pointer representations that involved opposite, complementary compromises (see Figure 2).

On the lower line, the macro cursor represented tool motion over the full range of the tablet at the cost of a poor visual resolution, with a display/control (DC) gain of only 0.05. The equation of the macro cursor was y = x · S / T, with y denoting cursor position on the screen and x denoting tool position on the tablet.⁴

On the upper line, the micro cursor represented tool position with a complete resolution (DC gain = 1), at the cost of an incomplete coverage of the tablet range. This cursor, as if appearing under a magnifying lens, only became visible when the macro cursor below entered its target. The equation of the micro cursor was z = x - T + S, with z denoting the position of the micro cursor on the screen. Both the macro and the micro cursors had the form of a vertical line segment 1 pixel thick and 50 pixels high.

The participants were asked to perform the movements as fast as possible, but there was no time pressure between target acquisition and initiation of the next movement. The task involved a discrete pointing movement in alternating directions, rather than a reciprocal movement [12], and in this sense it was more representative of usual pointing in HCI.
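The macro cursor mapping y = x · S / T can be sketched as below. The numeric values are assumptions drawn from elsewhere in the paper rather than from this paragraph: T = 15,240 tablet positions comes from the equipment description, and S = 828 pixels follows from the task description, which gives ½ S = 414 pixels. The names are illustrative only.

```python
TABLET_RANGE_T = 15_240  # tablet positions along the horizontal axis (equipment description)
SCREEN_RANGE_S = 828     # screen range in pixels (assumed from the stated 1/2 S = 414 pixels)


def macro_cursor_position(x_tablet: float) -> float:
    """Macro cursor position y on the screen for tool position x: y = x * S / T."""
    return x_tablet * SCREEN_RANGE_S / TABLET_RANGE_T


dc_gain = SCREEN_RANGE_S / TABLET_RANGE_T          # display/control gain of the macro cursor
range_ratio = TABLET_RANGE_T / SCREEN_RANGE_S      # how much larger the tablet range is

print(round(dc_gain, 3))      # 0.054, reported as about 0.05
print(round(range_ratio, 1))  # 18.4, i.e. more than 18 times the screen range
print(macro_cursor_position(TABLET_RANGE_T))  # the full tablet range maps onto the full screen range
```

Under these assumed values, one screen pixel on the macro line stands for roughly 18 tablet positions, which is why the macro cursor alone cannot resolve a target only a few positions wide.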

[Figure 2: screen display. Labels recovered from the figure: “Difficult, double-scale task”, “Micro cursor”, “Macro cursor”.]

Figure 2. The screen display in the single-scale (A) and double-scale (B) condition (here targets are represented in gray). In each of the two examples shown, the left target has just been acquired on the lower, macro line, and so the micro cursor above has just popped in from the right. Acquisition of the right macro target would cause the micro cursor to pop in from the left.

In the easy, single-scale (1S) task condition, used as a control, participants had to move the macro cursor back and forth to left and right targets visualized by red-colored rectangles 45 pixels wide (and 40 pixels high, as were all targets of this experiment) separated by a D of 674 pixels from center to center (ID = 4.0 bits). The currently-to-be-acquired target appeared brighter than the other. As soon as the cursor entered one of the two targets on the macro line, the micro cursor popped in above. In this easy condition, however, this event conveyed no information as the micro target (also a red rectangle) covered the whole width of the screen (w’ = 828 pixels); that is, acquisition of the lower, macro target automatically caused acquisition of the upper, micro target. When the macro cursor had stayed in a target for 0.5 s, the alternate target became brighter, an invitation for the participant to start the next move.

In the difficult, double-scale (2S) task condition, acquisition of the macro target (d = 788 pixels, w = 45 pixels, ID = 4.2 bits) again caused the micro cursor to appear on one end of the upper line, but here the micro target, placed in the middle of the upper line, was very narrow (w’ = 3 pixels) and so target approach had to be continued, using the high-resolution visual feedback offered on the upper line. As the distance that remained to be covered on the upper line was ½ S = 414 pixels, the ID for this second component of the movement was 7.12 bits. In tablet units, w’ was 3 points (0.060 mm), d’ was 14,500 points (289 mm), and hence the overall ID for micro-target acquisition was 12.24 bits. In essence, this task amounted to pointing to a minuscule target with the help of a fixed magnifying lens for the terminal phase of the movement [see 1 for an exploratory experiment on pointing through a mobile 2D magnifying glass].

⁴ Below we denote tool position on the tablet as x, macro-cursor position as y, and micro-cursor position as z. For the distance and the tolerance, we use upper-case characters (D and W) for the tablet, lower-case characters (d and w) for the macro cursor, and primed lower-case characters (d’ and w’) for the micro cursor. Note that all three variables x, y, and z correspond to the horizontal dimension of the tablet and screen.
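The IDs quoted for the two task conditions can be recomputed from the stated distances and widths with the paper’s definition ID = log2(D/W + 1). The sketch below is illustrative only: the variable names are hypothetical, and the conversion to millimetres assumes that the 15,240 tablet positions are spread evenly over the 304 mm tablet width.

```python
import math


def index_of_difficulty(distance: float, width: float) -> float:
    """ID in bits, log2(D/W + 1); distance and width in the same unit."""
    return math.log2(distance / width + 1)


id_1s = index_of_difficulty(674, 45)             # single-scale control task (pixels)
id_2s_macro = index_of_difficulty(788, 45)       # macro component of the 2S task (pixels)
id_2s_micro = index_of_difficulty(414, 3)        # micro component of the 2S task (pixels)
id_2s_overall = index_of_difficulty(14_500, 3)   # whole 2S movement (tablet points)

print(round(id_1s, 2), round(id_2s_macro, 2), round(id_2s_micro, 2), round(id_2s_overall, 2))
# 4.0 4.21 7.12 12.24

# Assumed even spacing of tablet positions: 304 mm / 15,240 positions.
mm_per_point = 304 / 15_240
print(round(3 * mm_per_point, 3), round(14_500 * mm_per_point))  # 0.06 289
```

All four values agree with the figures reported in the text (4.0, 4.2, 7.12 and 12.24 bits), as do the tablet dimensions of 0.060 mm and 289 mm.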

Procedure

Participants were tested individually in a single session which lasted about 90 min. Each participant executed 32 blocks of movements. The first 16 comprised only four movements each and served as warm-up blocks (the data were not analyzed). The next 16 were experimental blocks, with four blocks for each of the four treatments (1S puck, 2S puck, 1S stylus, and 2S stylus). Each experimental block comprised 15 movements, only the last ten of which were considered for data analysis. So each participant provided 40 MTs per condition, and 160 MTs overall. The order of the four experimental treatments obtained with the two factors was balanced with Latin squares within each group of 16 blocks.

Data Elaboration

MT was measured from the moment hand velocity on the tablet reached a threshold of 50 mm/s to the moment the target-acquisition criterion was satisfied (less 0.5 s, the time required for the system to acknowledge target acquisition). The 2S condition offered the opportunity to decompose MT into its macro and micro components, with the former measured from movement initiation to the time when the macro cursor entered the macro target and the latter measured as the remaining time until acquisition of the micro target (less 0.5 s).

Index of Performance

As shown in Figure 3, the effect of task condition on the IP was tool dependent (F(1,7) = 46.86, p = .0002). With the puck, 2S pointing yielded an IP considerably lower than 1S pointing (3.09 vs. 5.17 bits/s, Wilcoxon T(8) = 0, p < .01). With the stylus, however, there was no significant difference between 1S and 2S pointing (4.30 vs. 4.16 bits/s, respectively, T(8) = 13.5, non significant). So, even though our 12-bit test was successfully passed with both the puck and the stylus, it was only with the latter tool that participants conserved a normal processing rate.
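The index of performance used in this paper is the ratio of ID to MT, in bits/s. The sketch below shows the computation only: the function name is illustrative, and the movement time in the example is a hypothetical value, not a measurement reported in the paper.

```python
def index_of_performance(id_bits: float, mt_seconds: float) -> float:
    """IP in bits/s, defined as ID divided by MT."""
    if mt_seconds <= 0:
        raise ValueError("movement time must be positive")
    return id_bits / mt_seconds


# Hypothetical movement time of 3.0 s on the 12.24-bit two-scale task.
print(round(index_of_performance(12.24, 3.0), 2))  # 4.08
```

Because IP normalizes MT by task difficulty, it allows the 4-bit and 12-bit conditions to be compared on a common scale.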

To compare 1S and 2S performance across very different levels of ID we used Fitts’ [8] index of performance (IP), simply defined as the ratio of ID and MT (in bits/s).

Results and Discussion

An analysis of variance (ANOVA) was run on the MT and IP data with task condition (1S vs. 2S) and tool (puck vs. stylus) as within-participant factors. A further ANOVA was run on the 2S data with movement component (initial macro vs. terminal micro) and tool as within-participant factors.

Movement time

Table 1 shows a strong cross-over interaction between condition and tool (F(1,7) = 53.02, p = .0002). For the easy 1S condition, all participants but one performed faster with the puck than with the stylus (Wilcoxon T(8)=1, p
