Systematic Inspection of Information Visualization Systems

Carmelo Ardito, Paolo Buono, Maria F. Costabile, Rosa Lanzilotti
Dipartimento di Informatica, Università di Bari, 70125 Bari, Italy

{ardito, buono, costabile, lanzilotti}@di.uniba.it

ABSTRACT
Recently, several information visualization (IV) tools have been produced and there is a growing number of commercial products. To contribute to a widespread adoption of IV tools, it is indispensable that these tools are effective, efficient and satisfying for the intended users. Various evaluation techniques can be considered and applied at the different phases of the IV software life-cycle. In this paper we propose an inspection technique based on the use of evaluation patterns, called Abstract Tasks, that take into account the specific nature of information visualization systems.

1. INTRODUCTION AND MOTIVATION
Information Visualization (IV) is becoming mature. It is moving out of research laboratories, with a growing number of commercial products (such as those from Spotfire, Inxight, and HumanIT), additions to statistical packages (SPSS/SigmaPlot, SAS/GRAPH, and DataDesk), and commercial development environments (e.g. ILOG JViews). For a widespread adoption of information visualization tools, it is indispensable that these tools are effective, efficient and satisfying for the intended users.
In the field of IV, the literature reports several usability studies and controlled experiments that help to understand the potential and limitations of visualization tools. Plaisant suggests that we need to consider other evaluation approaches that take into account the long exploratory nature of users' tasks, the value of potential discoveries, or the benefits of overall awareness [12]. We need better metrics and benchmark repositories to compare tools, but also guidelines to be used by designers and evaluators. Researchers are encouraged to, and rewarded for, designing techniques that are generic in nature and can be used with a wide variety of data and in many application domains [12]. To this aim, information visualization techniques should provide mechanisms that allow users to perform tasks that are specific to such types of systems.
Different methods can be used for evaluating interactive systems; choosing among them is a trade-off between cost and effectiveness. The most commonly adopted are user-based methods and inspection methods.

User-based methods mainly consist of user testing, in which usability properties are assessed by observing how the system is actually used by some representatives of real users [5]. User-based evaluation currently provides the most complete form of evaluation, because it assesses usability through samples of real users. However, this technique has a number of drawbacks. The system is usually evaluated with users in the field only at the end of its development; thus, the evaluation results arrive too late to be considered by developers during the design phase, and it is not possible to avoid serious mistakes and to save re-implementation time. The effort and time needed to set up reliable user testing are often considerable: it is difficult to properly select adequate user samples, and training these users to manage the advanced functions of an interactive system might require a considerable amount of time. Moreover, when testing early prototypes of a system, it is difficult to reproduce ecological settings of use in a limited amount of time, and failures in creating real-life situations may lead to "artificial" conclusions rather than to realistic results.
Inspection methods involve expert evaluators only, who inspect the application and provide judgments based on their knowledge [11]. With respect to user-based evaluation, usability inspection methods are more subjective, since they depend heavily on the inspectors' skills. Their main advantage, however, is cost saving: they "save users" and do not require any special equipment or lab facilities [10]. In addition, experts can detect a wide range of problems and possible faults of a complex system in a limited amount of time.
Among the inspection methods, the most commonly used is heuristic evaluation, which Nielsen defines as a "discount usability" method [10]. It involves a small set of experts inspecting the system and evaluating the interface against a list of recognized usability principles: the heuristics. However, heuristic evaluation presents some limitations. Preece et al., discussing heuristic evaluation, write: "… some of these core heuristics are too general for evaluating new products coming onto the market and there is a strong need for heuristics that are more closely tailored to specific products" [13]. This is also pointed out by other researchers, who have developed more specific heuristics for particular classes of systems, e.g. heuristics for the usability evaluation of groupware systems [2], for systems with large displays [7], and for systems used at fairs or other expositions [15]. In our opinion, providing more specific heuristics is not enough, since an accurate heuristic evaluation requires skilled inspectors with knowledge not only of human factors, but also of the application domain, the users, and the tasks they perform. For this reason, training the inspectors before they perform the evaluation is usually required. Thus, a question arises: how can evaluators best be supported with this kind of training, while keeping the cost as low as possible? Moreover, even more specific heuristics do not provide adequate guidance, especially to novice evaluators.
In the next section, we describe an inspection technique that overcomes the drawbacks of heuristic evaluation and systematizes the work of inspectors by giving them adequate support in carrying out their evaluation. In Section 3 we show how this technique can be applied to the evaluation of IV systems.

2. AT INSPECTION
In order to overcome the drawbacks of heuristic evaluation, a novel inspection technique was originally introduced in [8] as part of a methodology for the systematic usability evaluation of hypermedia systems. This technique has later been revised and applied to different types of applications in various domains. It is called AT inspection since it exploits evaluation patterns, called Abstract Tasks (ATs), that guide the inspectors' activities. ATs precisely describe which elements of the application to look for, and which actions the evaluators must perform in order to analyse such elements. In this way, even novice evaluators, lacking expertise in usability and/or in the application domain, are able to come up with more complete and precise results.
We use the term abstract task since it describes a task that an inspector performs when evaluating an application, but this description is provided in general terms and abstracts from the specific application to be evaluated. ATs can be seen as evaluation patterns, making it possible to maximize the reuse of evaluators' know-how by capturing usability inspection expertise and expressing it in a precise and understandable form, so that it can be easily reproduced, communicated, and exploited. They therefore allow evaluators to take "... advantage of any of the efforts done in previous works, to reduce the effort needed to achieve a new one" [9].
ATs are precisely formulated by means of a template that provides a consistent format and includes the following items:
− AT Classification Code and Title: they univocally identify the AT and succinctly convey its essence.
− Focus of Action: it shortly describes the context, or focus, of the AT, by listing the application components that are the evaluation entities.
− Intent: it describes the problem addressed by the AT and its rationale, trying to make clear which specific goal is to be achieved through the AT application.
− Activity Description: it describes in detail the activities to be performed during the AT application.
− Output: it describes the output of the fragment of the inspection the AT refers to.
During the inspection, the evaluator uses the ATs to perform a rigorous and systematic analysis and produces a report in which the discovered problems are described, as suggested in the AT. The list of ATs provides systematic guidance to the evaluator on how to inspect an application. Most evaluators are very good at analysing certain features of interactive applications; however, they often neglect other features that strictly depend on the specific application category. Exploiting a set of ready-to-use ATs allows evaluators with limited experience in a particular domain to perform a more accurate evaluation.
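To make the template concrete, the sketch below encodes an AT as a simple record and assembles the inspector's report from the applied ATs. It is only an illustration: the class, field, and function names are our own hypothetical choices (here in Python) and are not part of the AT methodology itself.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AbstractTask:
    """One evaluation pattern, mirroring the items of the AT template."""
    code: str             # AT Classification Code, e.g. "FT_01" (hypothetical encoding)
    title: str            # succinctly conveys the AT's essence
    focus_of_action: str  # application components that are the evaluation entities
    intent: str           # problem addressed by the AT and its rationale
    activity: List[str]   # activities to perform during the AT application
    output: List[str]     # what the inspection report must describe
    findings: List[str] = field(default_factory=list)  # filled in by the inspector

def report(tasks: List[AbstractTask]) -> str:
    """Assemble the inspector's report from the applied ATs."""
    lines = []
    for t in tasks:
        lines.append(f"{t.code}: {t.title}")
        lines.extend(f"  - {finding}" for finding in (t.findings or ["no problems found"]))
    return "\n".join(lines)
```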

AT inspection has been used for evaluating different types of applications in various domains, e.g. museum hypermedia [4] and e-learning systems [1]. Some controlled experiments have been performed to demonstrate the validity of this technique in comparison with other techniques. Specifically, to validate the AT inspection for hypermedia systems, 28 novice inspectors were divided into two groups and asked to evaluate a commercial hypermedia CD-ROM by applying either the AT inspection or the traditional heuristic evaluation [4]. In another controlled experiment, reported in [6], 73 participants, divided into three groups in a between-subjects design, evaluated a commercial e-learning system by applying the AT inspection for e-learning systems, heuristic evaluation, or user testing. The results of both experiments have shown an advantage of the AT inspection over the other usability evaluation methods, demonstrating that Abstract Tasks are effective and efficient tools to drive evaluators and improve their performance. In the next section we show how AT inspection can be used for evaluating IV systems.

3. ABSTRACT TASKS FOR IV SYSTEMS
We are adapting the AT inspection technique to evaluate information visualization systems. By considering the literature on information visualization and the experience of human-computer interaction experts, we are defining ATs that address specific characteristics of IV systems, in order to support the evaluators in their inspections. Shneiderman and Plaisant have indicated some major tasks that users would like to perform and that IV systems should therefore support. Such tasks are briefly described in the following [14].
Overview shows users an overview of the entire collection. Overview strategies include zoomed-out views of each data type to see the entire collection, plus an adjoining detail view.
Zoom allows users to zoom in/out on items of interest. Users typically have an interest in some portion of a collection, and they need tools that enable them to control the zoom focus and the zoom factor. Zooming is particularly important in applications for small displays.
Filter allows users to filter out uninteresting items. When users control the contents of the display, they can quickly focus on their interests by eliminating unwanted items.
Details-on-demand enables users to select an item or group of items to get details when needed. Once a collection has been trimmed to a few dozen items, it should be easy to browse the details about the group or the individual items.
Relate shows relationships among items. Within visual displays, there are opportunities for showing relationships by proximity, by containment, by connected lines, or by colour coding. Highlighting techniques can be used to draw attention to certain items in a field of thousands of items.
History provides the possibility to perform undo, replay, and progressive refinement. It is rare that a single user action produces the desired outcome. Information exploration is inherently a process with many steps, so keeping a history of actions and allowing users to retrace their steps is very important.

Extract allows users to extract sub-collections and query parameters. Once users have obtained the item or set of items they desire, it would be useful to be able to extract that set.
The above tasks are examples of basic tasks that IV systems should allow users to perform, especially those systems that aim to be more generic, i.e. usable with a wide variety of data and in several application domains. Other, more specific tasks are currently being identified. The next step is to define ATs that support the inspectors in checking that, in the systems they are evaluating, all the appropriate tasks are available to the users. In Table 1 we report two ATs chosen among those we are currently defining; they address the filter and details-on-demand tasks, respectively.

Table 1. Two examples of ATs addressing filter and details-on-demand tasks

FT_01: FILTER OUT UNINTERESTING ITEMS
Focus of action: the visualization of a large number of items
Intent: verify the presence of mechanisms that allow users to filter out uninteresting items
Activity Description: given a visualization of a large number of items:
− Choose an attribute that characterizes interesting items
− Try to activate the mechanisms that permit filtering out (e.g. by darkening or hiding) uninteresting items
− Try to return to an unfiltered visualization of the items
Output: a description reporting whether:
− There are mechanisms that allow users to choose an attribute that characterizes interesting items
− There are mechanisms that allow users to filter out uninteresting items
− There are difficulties in using the filtering mechanisms, describing such difficulties

DD_01: GET DETAILS ABOUT ITEMS
Focus of action: the visualization of some items
Intent: verify the presence of mechanisms that allow users to get details about items
Activity Description: given a visualization of some items:
− Choose an item or a group of items
− Try to activate the mechanisms that permit getting details about the chosen items
− Try to return to the previous visualization of the items
Output: a description reporting whether:
− There are mechanisms that allow users to select an item or a group of items
− There are mechanisms that allow users to get details about the chosen items
− There are difficulties in using the mechanisms to get details, describing such difficulties
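To give a concrete feel for the mechanisms that FT_01 and DD_01 probe for, the sketch below filters a toy item collection by a user-chosen attribute and then retrieves the details of a selected item. The dataset and column names are invented for illustration; a real IV tool would expose these interactions through its own interface rather than through code.

```python
import pandas as pd

# Hypothetical item collection: each row is an item with a few attributes.
items = pd.DataFrame({
    "name":  ["A", "B", "C", "D"],
    "price": [10, 250, 40, 980],
    "year":  [1999, 2004, 2001, 2005],
})

# FT_01: filter out uninteresting items by an attribute the user chooses...
interesting = items[items["price"] < 100]  # filtered view keeps only the interesting items
unfiltered = items                         # ...and keep a way back to the unfiltered view

# DD_01: select an item (or group of items) and get its details on demand.
selection = interesting[interesting["name"] == "A"]
details = selection.to_dict(orient="records")  # full attribute values of the chosen item
print(details)
```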

We are also defining ATs that may be used to inspect specific types of visualizations, such as Time Series visualizations. Time series are widely used in applications such as electrocardiograms (EKGs), seismographs, industrial processes, meteorology, and sound recordings. They consist of sequences of real numbers representing the measurements or observations of a real variable at equal time intervals. From an algorithmic perspective, there is a long history of work on time series, originally grounded in statistical analysis. Today, with the aid of computers, users can analyze time series data using classical statistical models, and can also explore the data using visualization tools. In this way, users can see the data and interact with them, also exploiting their perceptual abilities to identify trends or spot anomalies [3]. In Table 2 we report two ATs that inspectors should consider when evaluating Time Series visualizations.

Table 2. Two examples of ATs addressing issues related to Time Series visualizations

TS_06: ZOOM IN/OUT ON A SPECIFIC INTERVAL
Focus of action: mechanisms to zoom in/out on a time series
Intent: verify whether users can zoom in/out on a time series
Activity Description: given a visualization of the time series data:
− Choose a series
− Focus on a specific interval and try to zoom in/out on it
Output: a description reporting whether:
− The visualization permits zooming in/out on a time series
− There are difficulties in zooming in/out, describing such difficulties

TS_08: DETAILS ABOUT SPECIFIC POINTS
Focus of action: time series data
Intent: verify whether users can get details about specific data
Activity Description: given a visualization of the time series data:
− Choose a series
− Focus on a specific point of the time series
− Try to find its actual value
Output: a description reporting whether:
− The visualization permits reaching the raw value of the selected point
− There are difficulties in reaching the raw value, describing such difficulties
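As an illustration of the interactions TS_06 and TS_08 look for, the sketch below plots a synthetic time series, zooms in on a specific interval, and reads back the raw value at a chosen time point. The data and the chosen interval are invented for the example; in an actual IV tool the inspector would try to trigger the same operations through the user interface.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic time series: one measurement per time step.
t = np.arange(0, 500)
values = np.sin(t / 20.0) + 0.1 * np.random.randn(t.size)

fig, ax = plt.subplots()
ax.plot(t, values)

# TS_06: zoom in on a specific interval of the series.
ax.set_xlim(100, 150)

# TS_08: get the raw value at a specific point (here, time step 120) and show it.
point = 120
raw_value = values[t == point][0]
ax.annotate(f"{raw_value:.2f}", xy=(point, raw_value))

plt.show()
```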

4. CONCLUSION
We have discussed issues related to the usability evaluation of information visualization systems. Various evaluation techniques can be considered and applied at the different phases of the IV software life-cycle. In particular, we have compared user-based methods with inspection methods, discussing advantages and disadvantages of both. Inspection methods, such as heuristic evaluation, are worth considering especially because they are cost-effective. However, heuristic evaluation has some drawbacks, which are overcome by the AT inspection described in this paper. It uses evaluation patterns, called Abstract Tasks, to drive the inspectors' activities. This method has been used to evaluate various types of applications in different domains, and controlled experiments with novice evaluators have proved its value with respect to other evaluation methods [4, 6]. We have shown how the AT inspection can be applied to evaluate IV systems, thanks to the availability of Abstract Tasks specific to these systems. We have already defined a number of ATs that show the feasibility of this method for IV systems; other ATs are currently under development.

5. REFERENCES
[1] Ardito, C., Costabile, M.F., De Marsico, M., Lanzilotti, R., Levialdi, S., Roselli, T., and Rossano, V., An Approach to Usability Evaluation of e-Learning Applications, Universal Access in the Information Society International Journal, 4/5, 2005, 1-14.
[2] Baker, K., Greenberg, S., and Gutwin, C., Empirical Development of a Heuristic Evaluation Methodology for Shared Workspace Groupware, In Proceedings of the ACM Conference on Computer Supported Cooperative Work (CSCW '02), New Orleans, Louisiana, USA, November 16-20, 2002, 96-105.
[3] Buono, P., Aris, A., Plaisant, C., Khella, A., and Shneiderman, B., Interactive Pattern Search in Time Series, In Proceedings of Visualization and Data Analysis (VDA 2005), San Jose, CA, USA, January 16-20, 2005, 175-186.
[4] De Angeli, A., Matera, M., Costabile, M.F., Garzotto, F., and Paolini, P., On the Advantages of a Systematic Inspection for Evaluating Hypermedia Usability, International Journal of Human-Computer Interaction, Lawrence Erlbaum Associates, 15(3), 2003, 315-335.
[5] Dix, A., Finlay, J., Abowd, G., and Beale, R., Human-Computer Interaction (3rd Edition), Prentice Hall Europe, London, 2003.
[6] Lanzilotti, R., A Holistic Approach to Designing and Evaluating e-Learning Quality: Usability and Educational Effectiveness, PhD dissertation, Dip. Informatica, Università di Bari, Bari, Italy, 2006.
[7] Mankoff, J., Dey, A., Hsieh, G., Kientz, J., Lederer, S., and Ames, M., Heuristic Evaluation of Ambient Displays, In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI '03), Ft. Lauderdale, FL, USA, April 5-10, 2003, 169-176.
[8] Matera, M., Costabile, M.F., Garzotto, F., and Paolini, P., SUE Inspection: An Effective Method for Systematic Usability Evaluation of Hypermedia, IEEE Transactions on Systems, Man and Cybernetics - Part A, 32(1), 2002, 93-103.
[9] Nanard, M., Nanard, J., and Kahn, P., Pushing Reuse in Hypermedia Design: Golden Rules, Design Patterns and Constructive Templates, In Proceedings of Hypertext '98: the Ninth ACM Conference on Hypertext and Hypermedia, Pittsburgh, PA, USA, June 20-24, 1998, 11-20.
[10] Nielsen, J., Usability Engineering, Academic Press, Cambridge, MA, 1993.
[11] Nielsen, J., and Mack, R.L., Usability Inspection Methods, John Wiley & Sons, New York, 1994.
[12] Plaisant, C., The Challenge of Information Visualization Evaluation, In Proceedings of the Conference on Advanced Visual Interfaces (AVI 2004), Gallipoli, Italy, May 25-28, 2004, 109-116.
[13] Preece, J., Rogers, Y., and Sharp, H., Interaction Design, John Wiley & Sons, 2002.
[14] Shneiderman, B., The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations, In Proceedings of the IEEE Symposium on Visual Languages, Boulder, CO, USA, September 3-6, 1996, 336-343.
[15] Somervell, J., Wahid, S., and McCrickard, D.S., Usability Heuristics for Large Screen Information Exhibits, In Proceedings of Human-Computer Interaction (INTERACT '03), Zurich, Switzerland, September 3-6, 2003, 904-907.