101 Spots, or How Do Users Read Menus?

Antti Aaltonen, Aulikki Hyrskykari, Kari-Jouko Räihä
Department of Computer Science, University of Tampere
P.O. Box 607 (Kehruukoulunkatu 1), FIN-33101 Tampere, Finland
+358 3 215 6111
{aa, ah, kjr}@cs.uta.fi
http://www.cs.uta.fi/research/hci/

ABSTRACT

In modern graphical user interfaces, pull-down menus are among the most frequently used components. Yet even after years of research there is no clear evidence on how users carry out the visual search process in pull-down menus. Several models have been proposed for predicting selection times. However, most observations are based only on execution times and therefore cannot explain where the time is spent. The few models that are based on eye movement research are conflicting. In this study we present an experiment in which eye movement data was gathered during a menu usage task. By analyzing the scan paths of the eye, we found that menus are read in sequential sweeps. This may explain why the best models produced by previous research are hybrid models that combine systematic reading behavior with random reading behavior.

Keywords

Menu selection, visual search process, eye movement, eye tracking

INTRODUCTION

Pull-down menus (also known as drop-down menus) form one of the key components of the popular WIMP (Windows, Icons, Menus and Pointing devices) interfaces. It is therefore important to understand how users read and search menus in order to improve the usability of software products. There has been a lot of research in the past on menu use in general [11], but not all of it deals directly with pull-down menus. Pull-down menus are usually command menus, and interaction with them may differ greatly from interaction with other kinds of menus.

Menu research has traditionally focused on performance. In empirical research the easiest way to evaluate performance is to measure the selection time of a specified menu item. A model can then be constructed to fit the data and to predict selection times. A popular class of models is based on Fitts’ law [2], which predicts motor response time, that is, the time required to move the pointer of a pointing device over a certain distance to a target item. The predicted time is essentially a logarithmic function of the ratio of the distance to the target and the width of the target. Fitts’ law treats the menu as a ‘black box’: it takes user input, and after some time and a mysterious process a menu item is selected. The main drawback of such models is that they are unable to explain what actually happens during the process. Obviously, in addition to the movement of the pointing device, a cognitive process is required.

The few studies that try to explain the selection process in more detail are based on analyzing eye movements during a selection task. When a subject scans a menu (or examines anything on the screen), the eyes have to remain relatively still for some period of time for the information to be processed. The points where the eye remains still are called fixations. A movement that connects two fixations is called a saccade. The pattern formed by fixations and their connecting saccades is called a scan path. The velocity of a saccade is very high, and therefore practically no information is gathered during it. Information is acquired only while the eyes are relatively still during a fixation. The duration of a fixation is typically 120-600 ms, while saccades last only 30-120 ms.

Two distinct visual processes, searching and reading, can be found in every menu-related task [3, 12]. Eye movements are necessary for reading and searching because the acuity of the visual system is limited. Visual acuity is highest in the foveal region, whose radius is only one degree of visual angle from the centre of the fixation. Beyond that, acuity decreases markedly; for example, the ability to identify letters is extremely low. Eye movements are therefore needed to reposition the fovea during searching and reading.
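As a concrete illustration, Fitts’ law is often written in the Shannon formulation MT = a + b · log2(D/W + 1), where D is the distance to the target and W its width. The sketch below uses this formulation; the constants a and b are device- and user-specific and the values used here are purely illustrative, not fitted to any data from this experiment.

```python
import math

def fitts_mt(distance, width, a=0.1, b=0.15):
    """Predicted movement time (in seconds) under the Shannon formulation
    of Fitts' law: MT = a + b * log2(D/W + 1).

    The constants a and b must be fitted to empirical data; the defaults
    here are purely illustrative.
    """
    index_of_difficulty = math.log2(distance / width + 1)  # in bits
    return a + b * index_of_difficulty

# With 20-pixel-tall menu items, an item ten rows down (D = 200) is
# predicted to take longer to reach than an item two rows down (D = 40).
print(fitts_mt(200, 20))
print(fitts_mt(40, 20))
```

Note that the prediction grows only logarithmically with distance, which is why such models alone cannot account for the roughly linear growth of search time with menu position reported in menu studies.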
It seems natural to assume that there is a strong correlation between selection time and the number of fixations during a menu selection task. Evidence for this was found as early as 1982 [1]. However, no satisfactory explanation has been found for the number of fixations needed. Various search strategies (systematic vs. random) have been envisioned, and the number of menu items scanned during a fixation has been considered (sequential vs. parallel strategy). The resulting models are conflicting and have to rely on a combination of strategies to achieve a good prediction level. However, none of the studies have analyzed the scan paths in detail and explained what really happens during a menu selection task. It is hard to believe that an otherwise systematic search process would occasionally be interrupted by completely random hops.

We carried out an experiment in order to understand the selection process more deeply. Twenty subjects were each given 101 tasks. Our analysis of their eye movements confirms previous findings that selection is indeed neither systematic nor random. Our main new result is that the random component can be largely explained by the fact that, given a reasonably long menu (more than just a few items), search appears to occur in alternating top-down and bottom-up sweeps. This, of course, is not very surprising, but it can form the basis of a model that is both predictive and intuitively acceptable.

The rest of the paper is organized as follows. We first review previous research that has used eye movement analysis to study menu selection. We then describe our experiment and the methods for analyzing raw eye tracking data. The results are analyzed and presented using information visualization methods.

RELATED RESEARCH

Card’s result [1] that search time depends on the number of saccadic eye movements the user makes to find a desired item is a first logical step in analysing the search process. Differences between search times are caused by the difficulty of locating the target visually. This is also affected by the user’s experience with the menu structure: the more familiar the user is with the menu, the fewer saccadic movements have to be made.

Several theories about visual search patterns in menu reading have been proposed. Interestingly, two of the visual search theories conflict with each other, and none of them has been proven by empirical data. The first theory concerning search strategies is based on Card’s [1] hypothesis that visual search order is completely random. Intuitively this assumption seems unnatural. MacGregor and Lee [9] suggested a systematic strategy where search proceeds step by step from the first menu item at the top of the menu until the desired menu item is found. Hendrickson [4] studied these competing theories about search order in more detail. He observed that the visual search patterns could be described neither as purely random nor as purely deterministic. He concluded that menus were scanned to a large extent by systematic patterns of eye movements, and conjectured that menu scanning could be represented reasonably well as a first-order Markov process. The nature of the “systematic patterns” was not explained in the paper.

In a recent study Hornof and Kieras [5] developed and studied six computational cognitive models, which simulated the visual search strategy of a human interacting with unordered pull-down menus. The lengths of the pull-down menus in the experiment were three, six or nine items. Each menu item was a single number. The basic models studied by Hornof and Kieras were serial (one menu item processed at a time), parallel (many items processed at the same time), random and systematic search order; the last two models were combinations of the others. The evaluation of the models was based on selection times observed in experiments run with human subjects. No eye tracking data was gathered, although the models were based on predicted eye movements. The cognitive models were evaluated with respect to how well they matched the data trends observed in the experiments. Hornof and Kieras found that parallel search, combined with both random and systematic search, conformed very well to the observed data, and on that basis they suggest that: 1) people seem to process more than one item at a time, and 2) people use both random and systematic search strategies when scanning menus.

The question that still remains is: which factors cause a diversion from a systematic search strategy into a random one? In fact, it may still be the case that although the hybrid model fits the data well, there is some non-random logic that explains the deviations from a pure systematic strategy. This is what we set out to study in our experiment.

THE EXPERIMENT

Task design

The experiment was carried out in our usability laboratory in two stages. In the first stage, three pilot tests were performed to evaluate the usefulness of our eye tracking system for determining the visual search pattern. In the pilot tests we used Corel Corporation’s WordPerfect as the test program. After the pilot tests, we proceeded to the next stage with a larger test population. We decided to replace the test program used in the pilot tests with a custom-made one in order to vary the menu structure more precisely. In our previous test program, there were too many variables that could not be studied simultaneously. For instance, we decided to exclude hierarchical menus and concentrate on linear pull-down menus. Furthermore, it was observed that indications of menu item status (e.g., checkmarks to the left of the menu items) and shortcut keys caught the users’ attention, even though they were not used in the selection tasks. These were excluded as well to simplify the setup. Finally, the effect of greyed out (disabled) menu items was also left as a topic for further study.

Next, we planned the menu structure. Menu items were grouped into collections of similar items. Previous research (e.g., [11, 13]) clearly shows that this improves selection times, and it is a technique customarily used in current software. Group size varied from 1 to 9, with one control menu containing 20 items without any grouping. Within the groups the menu items appeared in a random order, since the groups in command menus seldom convey a structure that would be useful in searching for items within the groups. The lengths of the menus were 3, 11 or 20 items. The upper limit of 20 items comes from the biggest single pull-down menu that we found in commonly used programs. The lower limit of 3 items is the smallest reasonable length of a menu, and 11 is the approximate mean of the two extremes.
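As a sketch, menu layouts of this kind (fixed groups, item order randomized only within each group) could be generated along the following lines. The group contents here are placeholders, not the actual items used in the experiment.

```python
import random

def build_menu(groups, rng):
    """Flatten groups into one linear menu, shuffling items only within
    each group so that group order and group boundaries stay fixed."""
    menu = []
    for group in groups:
        items = list(group)
        rng.shuffle(items)
        menu.extend(items)
    return menu

rng = random.Random(42)
# Placeholder groups; the real menus used familiar names and concepts.
animal_menu = build_menu(
    [["fish-1", "fish-2"], ["mammal-1", "mammal-2", "mammal-3"]], rng
)
print(animal_menu)       # group order preserved, order within groups random
print(len(animal_menu))  # 5
```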

The actual items were chosen to be names or concepts that were assumed to be familiar to our subject population. The menu structure, groups and lengths of the menus are shown in Figure 1.

Figure 1. Menu structure of the test program. There were eight menus (Movie, Geography, Animal, Person, Music, Date, Drink and Car), each divided into groups of related items, e.g., film genres (Action, Drama, Western, Sci-fi, War), animal classes (Fish, Reptiles, Insects, Mammals, Birds) and drink types (Wines, Alcoholic, Non-alcoholic, Beer).

The total number of tasks was 101. The tasks were divided into two categories, similar to an experiment carried out by McDonald, Stone and Liebelt [10]. The first 60 tasks were explicit commands to select a particular menu and menu item, such as “Select Person | Jack Nicholson” or “Select Date | Monday”. The second category, the last 41 tasks, consisted of so-called dictionary definitions. There was no direct cue as to what to select. Typical descriptions of the target item were “The leading actor in One Flew Over the Cuckoo’s Nest” or “First workday of the week”.

Test program

The test program was written in C++ and consists of standard Windows 95 components. In the program, the tasks were placed in the top right corner of the window, just below the menu bar. This position was chosen so that the menus did not hide the task descriptions. A screen shot of the test program is shown in Figure 2.

Figure 2. Screen shot of the test program.

The test program logs the users’ mouse events in pull-down menus. The logged events are menu and menu item selection as well as highlighting of an item. All these events are recorded in a log file with time stamps.

Test users

We had twenty test users. All users were right-handed and had normal vision. To get as reliable eye tracking data as possible, the subjects were chosen so that they did not need eye glasses or contact lenses. Ten test users were male and ten were female. Their average experience in using computers was 3.6 years. On the average, their experience with GUIs was 3.0 years. In the following, the eight users whose experience with GUIs was more than 3 years are called experienced users, and the twelve users with less experience are called novice users. The average experience with GUIs in these two groups was 6.9 years and 0.4 years, respectively.

The computer used in this experiment was a 100% IBM-compatible PC with a 90 MHz Pentium processor, 32 MB of memory and a 17” Super VGA color monitor. The computer was used without a keyboard to prevent the user from using shortcut keys in the pilot test, and the same setup was retained for the actual test. The operating system was Microsoft Windows 95.

The eye tracking system used in the experiment was an Applied Science Laboratories model 4250R+ with floor-mounted optics. The system illuminates the eye of the test user with infrared light, producing a bright pupil and a corneal reflection. The system computes the distance between the centre of the pupil and the reflection. These measurements and the calibration of the system are then used to calculate the point of gaze. Similar systems have been previously used in HCI research by, e.g., Jacob [7]. The system accommodates tracking mirrors and extended head movement options, which allow the subject to move his/her head more freely. This freedom makes the user feel comfortable, but it has one major disadvantage: the distance between the eye and the monitor varies. Therefore, depending on the distance between the eye and the screen, the area covered by the fovea can vary from one menu item to three. However, our subjects remained reasonably still and there was very little variation in the users’ distance from the monitor.

In theory, the accuracy of the eye tracking system is better than half a degree, but the human eye can change its fixation point within one degree of visual angle without physically moving [7]. One degree of visual angle is more than one centimetre on the screen at the typical viewing distance from a 17” monitor. A centimetre is more than the height of a menu item in Super VGA mode, and therefore the 640x480 resolution with 256 colors (VGA mode) was chosen for the experiment.

The computer was connected to a scan converter because the eye tracking system uses the screen as a scene. The control unit of the eye tracker superimposes the point-of-gaze cursor on the scene, and the combined video signal is recorded with a Super VHS video tape recorder. The test setup also included a video camera. The camera recorded the users’ distance from the monitor, which is needed in the analysis to calculate the size of the area visible in the fovea.

Test setup

Each test session began with an introduction of the equipment and the purpose of the test. After calibration of the eye tracker, the subject was first asked to read two of the menus aloud in order for us to obtain some baseline data. Then the first 60 tasks were carried out. After the subject made the correct choice, a new task appeared in place of the previous one (see Figure 2). The subjects were told that this was not strictly a speed test; nevertheless, they were asked to act promptly, in the manner they would normally use when interacting with menus. After the first 60 tasks there was a short break. The eye tracker was then recalibrated and the remaining 41 tasks carried out. The session ended with an interview. We first asked each subject to reproduce from memory as much as s/he could of the contents of each menu. This was done to study the effects of learning, and is not covered in this paper. The subjects were then asked to give their own impressions of the search strategies they used, and also of the organization of the menus: did they notice the groupings, were the groupings logical, and did they make use of the groupings.

On the average, the sessions lasted less than 40 minutes, varying between 30 and 55 minutes.

ANALYSIS

Initial analysis was made with the Applied Science Laboratories Eyenal analysis software package. The eye tracking system takes 50 samples per second, so tracking the eyes for, e.g., 15 minutes produces 45000 points of raw data. The Eyenal package can then be used to calculate the fixations from the raw data points. The algorithm uses a sliding window technique with a window size of 5 observations. We used a threshold of 120 ms for determining fixation starts as the first step in filtering the noise in the data.

Eye tracking data is never perfect. The system may lose track of the pupil or the corneal reflection, or an observation may simply be incorrect (e.g., beyond the screen limits even though the subject is clearly looking at the screen). Tasks during which noisy data appeared were excluded from the analysis. Similarly, fixations that took place between tasks (a task here spanning from the opening of a menu to the selection) were excluded. This was done automatically by programs we wrote that filtered the merged log files produced by our test program and the eye tracker.

Moreover, some data could not be used because of user actions. If several menus were opened during a task, or if the user made an incorrect choice, we excluded the task from our analysis. We realized only after the tests that leaving the task description on the screen after the menu was opened was not a good idea: occasionally a user started a task by opening a menu, scanned the menu for a short time, and returned to read the task description more carefully before continuing to scan the menu. Such tasks were eventually also excluded from the analysis. In the initial analysis phase we allowed one or two rescans of the task description. This would have produced a larger set of observations, but we chose the safe option of excluding all potentially unreliable data; there was no noticeable difference in the results.

After all this filtering, 1273 of the 2020 tasks (63%) were accepted for further analysis. We feel that the filtering was justified to obtain reliable data, and plenty of data was still left for analysis: altogether 7928 fixations took place during the 1273 tasks. To give an indication of the mass of data, all the fixations are plotted in Figure 3 using the Spotfire information visualization tool [6], which was used to analyze the data in more detail. It is illustrative to compare Figure 3 with Figure 1: some of the long menus clearly stand out. The size of each data point (though hardly visible) is proportional to the duration of the corresponding fixation.

Figure 3. A plot of all the fixations that were analyzed.

RESULTS

We start by examining differences in eye movement between reading and selection tasks, and continue by showing that execution times in our tests conform to those obtained in previous research. We then analyze eye movements in more detail and show that they give additional evidence that menu search is not random. Finally, we introduce the concept of sweeps, passes through the menu either top down or bottom up, and discuss how they explain menu reading behavior.

Vertical eye movement

To begin with, we present some data about eye movements in reading and selection tasks. Figure 4 shows, side by side, typical scan paths found in our experiment. The scan path on the left was created during the read-through task at the beginning of the test session, whereas the scan path on the right shows what happens during a selection task (here the user is selecting “Staying Alive”). The difference is remarkable. In general, the scan paths during the selection tasks followed a fairly straight vertical line, which often ran slightly to the right of the left edge of the menu. However, as Figure 3 shows, there was a lot of horizontal variation.

Figure 4. A test subject is reading the Music menu (on the left) and searching for “Staying Alive” (on the right).

The left part of Figure 4 shows that in a reading task, there was a fixation on almost every menu item, but only rarely (in the case of long menu items) more than one fixation on the same level. The latter result is understandable, since from our viewing distance and with our screen setup, the fovea covered an area roughly the size of the circle at the beginning of “Depeche Mode” on the left. During selection tasks, the average saccade length in the vertical direction (over all saccades during the tasks that were analyzed) was 2.21 menu items. This supports the parallel search strategy (suggested by Hornof and Kieras [5]) in which more than one menu item is processed at a time.

Selection times

Previous research has produced a fair amount of data on performance times, and it is interesting to see how our selection times fit the observations in previous studies. The selection times are plotted in Figure 5 for our three different menu lengths (3, 11 and 20 menu items).

Figure 5. Average selection times for menus of various length.

An observation that has been made in previous studies (e.g., [5]) is that there is a small drop in the time curve at the second menu item, i.e., that selecting the second item is faster than selecting the first item. The phenomenon can be seen in our data, too, although not in the case of very long menus. We do not know why this is the case.

One of the main observations of our study is that producing predictive models for menu selection times is extremely difficult because of the high variation between test subjects. For instance, the surprising peak at item 9 of menus with 11 items would not be nearly as high had we used median times instead of average times. The extremely long execution times of three test subjects for that particular item bias the results. Even using the median times would not completely remove the problem, however. The ninth item in menus with 11 items was selected only once in our 101 tasks. The particular item happens to be “Bear” in the Animal menu, and the task was to select “A wild animal that likes honey”, not a particularly difficult task. But it was only the third dictionary task, so the users were not completely comfortable with the tasks at that point. A further problem with eye tracking data is that its variance is extremely high. Moreover, Figure 5 shows the combined results for explicit selection tasks and dictionary tasks, and for the outlying data point there were no corresponding explicit selection tasks.

Fixation patterns

We now turn to the main question of our paper: what actually happens during a selection task? In Figure 6 we have divided the fixations into three categories. We say that a fixation is stable if its y-coordinate differs by less than half the height of a menu item from the y-coordinate of the previous unstable fixation. Thus, for instance, in the reading task in Figure 4, the fixations during the rescanning of “Still Crazy After All These Years” are stable, whereas the first fixation on “Like a Virgin” represents a down movement. Figure 6 plots the average number of down, up and stable fixations for each subject. The value on the x-axis is the total average number of fixations for that user.
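The stable/down/up classification just defined can be sketched as follows. The fixation y-coordinates and the 20-pixel item height below are made-up values for illustration, and screen coordinates are assumed to grow downward, so a positive difference is a down movement.

```python
def classify_fixations(ys, item_height):
    """Label each fixation after the first as 'down', 'up' or 'stable'.

    A fixation is stable if its y-coordinate differs by less than half a
    menu-item height from the y-coordinate of the previous *unstable*
    fixation; otherwise the sign of the difference gives its direction.
    """
    labels = []
    ref = ys[0]  # y-coordinate of the last unstable fixation
    for y in ys[1:]:
        dy = y - ref
        if abs(dy) < item_height / 2:
            labels.append("stable")  # small wobble: reference unchanged
        else:
            labels.append("down" if dy > 0 else "up")
            ref = y
    return labels

# A hypothetical scan path over 20-px menu items: a downward sweep with
# one re-reading wobble, then a hop back up.
print(classify_fixations([10, 30, 52, 55, 90, 12], item_height=20))
# ['down', 'down', 'stable', 'down', 'up']
```

Comparing each fixation against the last unstable one (rather than its immediate predecessor) prevents a slow drift of small wobbles from being counted as movement.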


Figure 6. Down, up and stable fixations.

Figure 6 provides strong evidence that menu search is indeed not random. In the case of random search the numbers of down- and up-moving fixations should be roughly the same, and this is clearly not the case. It can also be seen that there are only a few fixations within the same menu item, since the average number of stable fixations is below 1. Figure 6 also shows that differences between the search strategies of different subjects can be huge. In fact, the same subject can use different strategies in different tasks. In Figure 6, the first subject (who was the fastest of all) had roughly the same number of down- and up-moving fixations. Next in the up/down ratio was the sixth subject, also an experienced, fast user. Clearly, these subjects used a more random and opportunistic strategy, especially in the tasks that belonged to the first category.

This leads to the hypothesis that experienced users resort to different search strategies than novice users. Indeed, if we order the subjects by the ratio of up and down fixations, then four of the top five subjects were experienced users, that is, half of all experienced users.

If search is not random, what can explain the number of fixations and their direction? When we viewed the video tapes and plotted numerous scan paths, a pattern began to emerge: users often scanned the menus in sweeps, sequences of eye movements in the same direction. Formally, we say that a sweep is a sequence that consists of saccades that are stable or move in the same direction, together with the fixations that are the end points of those saccades. A down sweep starts with a saccade that moves down, and an up sweep starts with a saccade that moves up. (At the beginning of a task, a sweep may start with a stable saccade; the first unstable saccade then determines the direction of the first sweep.) For instance, the scan path on the right side of Figure 4 consists of two sweeps, the first a down sweep and the second an up sweep. Figure 7 shows the average number of fixations, the average number of sweeps and the average selection times for the test subjects.

Figure 7. Average number of fixations, sweeps, and selection times (in seconds) for the test subjects.

How many sweeps should one expect to find in a scan path? By computing E(k, n), the expected number of sweeps in a menu of n items and a scan path of k fixations, we could at least find out whether the numbers seen in Figure 7 differ from what random search would produce. Unfortunately, it is difficult to find a formula for E(k, n). The expected number of sweeps without replacement (i.e., when the eye can never return to a previously visited menu item during a scan sequence) has been analyzed (Exercise 15 in Section 5.1.3 of [8]), but the case where replacement is allowed is harder. On the other hand, it is fairly easy to prove the asymptotic lower bound E(k, n) ≥ k/2. Figure 8 shows that the number of sweeps in a fixation sequence of length k is invariably much smaller than even this lower bound of k/2, giving evidence that sweeps do not occur at random.
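Segmenting a labeled fixation sequence into sweeps, and comparing the observed sweep counts against a random-search baseline, can be sketched as follows. This is a rough illustration under the paper's sweep definition, not the authors' analysis code, and the label sequences are invented.

```python
import random

def count_sweeps(labels):
    """Count sweeps in a sequence of movement labels ('down', 'up', 'stable').

    A sweep is a maximal run of saccades that are stable or move in the
    same direction; stable movements never end a sweep, and the first
    unstable movement determines the direction of the first sweep.
    """
    sweeps = 0
    direction = None
    for label in labels:
        if label == "stable":
            continue
        if label != direction:  # a direction change starts a new sweep
            sweeps += 1
            direction = label
    return sweeps

# A path like the right side of Figure 4: one down sweep, then one up sweep.
print(count_sweeps(["down", "down", "stable", "down", "up", "up"]))  # 2

# Baseline: if every directed saccade chose up or down at random, a path
# with k directed saccades would contain about (k + 1) / 2 sweeps on
# average, far more than the roughly two sweeps per task observed here
# (2670 sweeps in 1273 tasks).
random.seed(0)
k = 20
mean_sweeps = sum(
    count_sweeps([random.choice(["down", "up"]) for _ in range(k)])
    for _ in range(10_000)
) / 10_000
print(mean_sweeps)  # close to (k + 1) / 2 = 10.5
```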


Figure 8. Observed average number of sweeps for menus and scan paths of varying length.

To summarize, sweeps do not occur at random, they fit well the behavior observed in the video-taped tests, and they are intuitively acceptable as the basis of a model of menu search. Altogether there were 2670 sweeps in the 1273 tasks analyzed. The average sweep length was 6.85 menu items and the average sweep duration was 1.14 seconds. To get a better view of how sweeps occur, Figures 9 and 10 plot the lengths and durations of all sweeps in the analyzed tasks. The x-axis is the ordinal number of the sweep within a task. The jitter in the figures was added to make the data points stand out.

Table 1 shows the average and standard deviation values for the same data. In some rare cases the first sweep was upwards, but even then it was short in length or duration, and the remaining sweeps followed the same pattern as the sweeps in other tasks. Therefore a total of 44 sweeps that went in the opposite direction from the general trend have been excluded from Table 1.

As Figure 7 already showed, the majority of the tasks were solved in a rather small number of sweeps. From Figures 9 and 10 we can see that after the first sweep, sweeps that take a long time disappear: later sweeps generally take less than 3 seconds, despite being in some cases as long (in menu items) as the first sweeps.

Figure 9. Lengths of sweeps (in menu items) as the function of the sweep number within task.

Figure 10. Durations of sweeps (in seconds) as the function of the sweep number within task.

Sweep #   Direction   Sweeps   Average length (SD)   Average duration, s (SD)
1         down          1259        8.50 (5.57)           1.55 (1.05)
2         up             609        4.20 (3.77)           0.60 (0.44)
3         down           358        7.46 (5.11)           1.03 (0.86)
4         up             174        4.91 (4.42)           0.72 (0.65)
5         down            97        6.34 (4.73)           1.00 (0.77)
6         up              56        5.26 (5.28)           0.63 (0.43)
7         down            40        5.92 (5.03)           1.00 (0.83)
8         up              19        5.44 (5.04)           0.68 (0.57)
9         down             9        4.46 (2.85)           1.52 (0.73)
10        up               3        2.49 (2.05)           0.83 (0.47)
11        down             1        1.55 (n/a)            0.16 (n/a)
12        up               1       12.24 (n/a)            1.48 (n/a)

Table 1. Sweep lengths (in menu items) and durations (in seconds) by the sweep number within task.

We saw above that sweeps do not occur at random. There are two plausible explanations for their occurrence. First, the user may read the menus in consecutive passes, presumably top down, making a quick hop from the bottom of the menu back to an item higher up. The second possibility is careful reading scans alternating between the top-down and bottom-up directions. The data (and the video tapes) support the existence of both reading styles. Figure 10 shows that the upward sweeps sometimes take almost as long as the downward sweeps. However, Table 1 shows that in the majority of cases upward sweeps are shorter than downward sweeps, both in length and in duration. The shorter duration is explained by the sweep being a fast hop to the top of the menu. The shorter length is partly caused by returns to a menu item that was just passed and that the user wants to recheck before proceeding.
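The statistics in Table 1 are per-ordinal aggregates over the sweep records of all tasks. A sketch of how such a table could be computed is shown below; the three tasks and their sweep lengths/durations are invented for illustration, not data from the experiment.

```python
from collections import defaultdict
from statistics import mean, stdev

# Each task is an ordered list of (length_in_items, duration_s) sweeps.
tasks = [
    [(9, 1.6), (4, 0.5)],
    [(8, 1.4), (5, 0.7), (7, 1.0)],
    [(10, 1.8)],
]

by_ordinal = defaultdict(list)
for sweeps in tasks:
    for ordinal, sweep in enumerate(sweeps, start=1):
        by_ordinal[ordinal].append(sweep)

for ordinal in sorted(by_ordinal):
    lengths = [length for length, _ in by_ordinal[ordinal]]
    durations = [duration for _, duration in by_ordinal[ordinal]]
    n = len(lengths)
    length_sd = stdev(lengths) if n > 1 else None  # SD undefined for n = 1
    direction = "down" if ordinal % 2 else "up"    # first sweep is usually down
    print(ordinal, direction, n,
          round(mean(lengths), 2), round(mean(durations), 2))
```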

To summarize, many reading styles emerge from the data, the most common being consecutive top-down scans of the menus. The difference between the up and down directions becomes less noticeable when several sweeps are needed.

CONCLUSIONS AND FUTURE WORK

Previous research has shown that menu selection times cannot be satisfactorily explained by a simple model based on any single search strategy. The models that best fit execution time data have been hybrid models combining various search strategies. We have carried out an extensive empirical test in which menu reading behavior was studied using both program log data and eye tracking data. The main observation presented in this paper is the usefulness of sweeps, sequences of fixations moving in the same vertical direction, in explaining menu reading behavior.

The experiment produced a massive amount of data, and we have presented only the first results here. We plan to analyze carefully the effects of learning (using both observed behavior and the interviews), task type (explicit selection vs. dictionary definition), groupings, and position within both menu and group. Initial analysis of the video tapes has also revealed interesting differences in menu reading styles when the mouse cursor and the point of gaze are studied simultaneously. Some users use the mouse cursor as a reading stick (moving it from one item to the next), others leave it at the menu title and first search for the target with their eyes, and yet others use the mouse cursor as a landmark, leaving it to highlight the most likely candidate found so far while continuing the search with their eyes. Analyzing the reading styles used in each task could help create more accurate models for predicting menu selection times. Many interesting questions require further experiments. These include the effects of greyed items, of shortcut key marks in menu items, and of screen resolution.

ACKNOWLEDGEMENTS

We would like to thank all the volunteer test users who made this experiment possible.

REFERENCES

1. Card, S.K. User perceptual mechanisms in the search of computer command menus, in Proceedings of CHI’82 (Gaithersburg MD, March 1982), ACM, 190-196.
2. Fitts, P.M. The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology 47 (1954), 381-391.
3. Graf, W., and Krueger, H. Ergonomic evaluation of user-interfaces by means of eye-movement data, in Work with Computers: Organisational, Management, Stress and Health Aspects, Smith, M.J., and Salvendy, G. (eds.), Elsevier Science Publishers B.V., 1989, 659-665.
4. Hendrickson, J.J. Performance, preference, and visual scan patterns on a menu-based system: Implications for interface design, in Proceedings of CHI'89 (Austin TX, April-May 1989), ACM Press, 217-222.
5. Hornof, A.J., and Kieras, D.E. Cognitive modelling reveals menu search is both random and systematic, in Proceedings of CHI'97 (Atlanta GA, March 1997), ACM Press, 107-114.
6. IVEE Corporation. Spotfire Pro 2.2. See http://www.ivee.com/.
7. Jacob, R.J.K. Eye tracking in advanced interface design, in Virtual Environments and Advanced Interface Design, Barfield, W., and Furness, T.A. (eds.), Oxford University Press, 1995, 258-288. Available at http://www.eecs.tufts.edu/~jacob/papers/barfield.html.
8. Knuth, D.E. The Art of Computer Programming, Vol. 3: Sorting and Searching. Addison-Wesley, 1973.
9. MacGregor, J., and Lee, E. Menu search: random or systematic? Int. J. Man-Machine Studies 26 (1987), 627-631.
10. McDonald, J.E., Stone, J.D., and Liebelt, L.S. Searching for items in menus: The effects of organization and type of target, in Proceedings of HFS’83, Human Factors Society, 834-837.
11. Norman, K.L. The Psychology of Menu Selection: Designing Cognitive Control at the Human/Computer Interface. Ablex Publishing Corporation, 1991.
12. Rayner, K. Eye movements and cognitive processes in reading, visual search, and scene perception, in Eye Movement Research: Mechanisms, Processes and Applications, Findlay, J.M., Walker, R., and Kentridge, R.W. (eds.), Elsevier Science B.V., 1995, 3-22.
13. Shneiderman, B. Designing the User Interface. Addison-Wesley, 1992, 99-137.