MODELLING HUMAN-COMPUTER INTERACTION AS DISTRIBUTED COGNITION

Peter Wright, Bob Fields, Michael Harrison
Department of Computer Science, University of York, Heslington, York YO10 5DD
Email: {pcw, bob, mdh}@cs.york.ac.uk
ABSTRACT

In this article we describe an approach to modelling human-computer interaction (HCI) based on recent theoretical developments in cognitive science, namely distributed cognition (DC) research. There are of course many approaches to modelling HCI, ranging from task-oriented models to so-called contextualised models such as Activity Theory. The motivation of this paper is not to provide just another approach, but rather to offer one which spans some of this range while remaining relatively comprehensible in its theoretical foundations. A second motivation is that the recent theoretical developments in DC already alluded to are clearly relevant to HCI theory and design, yet the ideas have lacked visibility in the HCI community. Distributed cognition research, while identifying resources for action as central to the interaction between people and technologies, stops short of providing a model of such resources. The resources model described in this paper defines a limited number of resource types as abstract information structures which can be used to account for decisions and actions. It is shown how these abstract types can be differently represented in an interface and how this affects performance. The paper also identifies a number of interaction strategies which utilise different configurations of resources. These two components of the resources model, information structures and interaction strategies, provide, through the processes of co-ordination and integration, the link between devices, representations and action that is not well articulated in the DC literature.

Our aim in this paper is to introduce a way of understanding interaction and HCI design in DC terms. We refer to this as the information resources model. After a review of relevant DC literature we introduce the components of the resources model. We then show how the resources model can be used to compare different interfaces and to analyse interaction episodes. Finally, we use the model to explain observed differences in performance across interface styles.

INTRODUCTION

Within both the science and engineering of HCI, most models of interaction are task-based. Since Card, Moran, and Newell's (1983) seminal contribution, the process of engineering usable computer systems has been seen as requiring some form of task representation early in design. A task is defined as the way in which a goal is attained, taking into account factors such as competence, knowledge, resources, and constraints. A task representation is a formal abstraction of some aspect of the task, such as goals and methods (Card, Moran, and Newell, 1983), task knowledge structures (Johnson, 1992), or the semantic components of a task (Green, Schiele, and Payne, 1988).

Problems with the task as a unit of analysis have emerged as HCI research has moved towards the study of computer supported cooperative work (CSCW) (Marti, in press). Grudin (1990), for example, argued for a change in focus in HCI away from the individual user, using a single computer, towards groups of individuals communicating through technologies. Rogers and Ellis (1994) argue that traditional task analysis is ineffective for modelling collaborative work and that a more appropriate unit of analysis is the network of people and technological artefacts. In this approach, analysis focuses on the transformation of information representations as they are propagated around the network, and also on how shared representations are used to co-ordinate collaborative work.
This so-called distributed cognition (DC) perspective has been used to describe a range of activities from navigating warships to solving children's puzzles, and DC has been discussed as one component of a theory to bridge research in CSCW and HCI (Nardi, 1996). Yet despite this claim, and despite the fact
that the DC perspective is so obviously relevant to HCI theory and design, the ideas from distributed cognition research have not really gained visibility in the HCI community.

In this paper, we propose an approach to modelling HCI which attempts to frame the ideas from DC research in a way that is more usable by HCI designers. We have called our approach the information resources model (or resources model for short). In many ways the term 'model' is too grand if taken in the spirit of GOMS or other formal cognitive models. Rather, we use the term to capture the idea that our approach, while incomplete, is systematic and bounded. The resources model takes seriously an idea first introduced by Suchman, namely that various kinds of information can serve as resources for action, and it defines a set of abstract information structures which, when represented in an interaction, can be distributed between people and technological artefacts. These information structures, when combined, can be used to inform action, and in taking action new resources are configured. In addition, the resources model introduces the concept of interaction strategy and describes the way in which interaction strategy and resources mutually shape the outcome of an interaction. In this sense the resources model is an HCI model in the DC tradition. The model is applied here to examples of single-user, single-system interaction rather than collaborative work. This is intentional, since our aim is to show how DC ideas can be used in the more traditional domain of HCI. But work is beginning that aims to apply the resources model to collaborative work settings (Fields, Wright, Marti and Palmonari, 1998). Before describing the resources model, we review key DC literature in order to provide a motivation for its development.

Distributed Cognition

Cognitive science research on the distributed nature of cognition comes from two distinct but not independent intellectual traditions, cognitive anthropology and cognitive psychology. In the former, the emphasis is on the study of real work settings and the role that artefacts play in the practice of work. The research is based on field studies and relates most naturally to the concerns of CSCW. The emphasis of the latter is the study of the individual in a technological environment, and the distinction between knowledge in the head and knowledge in the world is central. While everyday work is taken seriously, there is a laboratory tradition at the heart of this approach. Both of these traditions are reviewed in this section and the commonalities between them drawn out as a motivation for the resources model.

The anthropology of Hutchins

Hutchins (1995a) is perhaps seen as a chief protagonist of DC in cognitive science, although other cognitive anthropological traditions exist (see, for example, Lave, 1988). As an anthropologist, Hutchins' primary concern has been to consider how cognitive tasks are achieved in everyday settings. He argues that when cognition is studied in the wild it is apparent that it is best studied not as an individualistic mental phenomenon, or an information process occurring inside the head of a solitary thinker. Instead it is necessary to consider cognition as a joint activity involving several agents, some human and others technological. Hutchins (1995a), in a paper entitled 'How a cockpit remembers its speeds', makes explicit the unit of cognitive analysis as a network of people and technologies.
The activity studied in this paper is that of an aircraft being flown on its final approach to landing by two pilots supported by a number of technological artefacts. The key problem that the distributed system has to solve is to ensure safe airspeeds during the final approach to landing. As the aircraft approaches the ground it must slow down, but as it slows, flaps on the wings have to be extended far enough (but not too far) to maintain the lift required to support the aircraft. There is a minimum safe air speed for a particular flap setting and this varies depending on the weight of the aircraft at the time. Hutchins begins by identifying how information about speeds, weights and flap settings is represented in the cockpit, how these representations are transformed as they flow between agents, and also how they are co-ordinated to generate new information and transform the tasks of the pilots. A number of representational artefacts are involved in this task. The mapping between aircraft weights and minimum air speeds for a range of flap settings is represented as a book of look-up tables, which can be positioned above the primary flight instruments, visible to both pilots. This representational device not only transforms a complex arithmetic calculation into a simple look-up task, but also makes the estimate of aircraft weight and the minimum approach speeds available for inspection by both pilots for cross-checking and later revision if necessary. Once the appropriate airspeeds have been looked up they are further transformed from digital representations to analogue ones by the setting of 'speed bugs' around the dial of the air speed indicators of each pilot. Speed bugs are small slider devices that are moved around the rim of the air speed indicator to mark the position of the safe minimum speeds. Re-representing the digits of the look-up table as positions on the analogue display of the air speed indicator in this way serves to transform the task of remembering several mode-dependent speeds into a perceptual task of locating the airspeed indicator needle in a particular segment of the dial.
Such a perceptual task provides information not only about the difference between the current speed and the minima, but also indicates what configuration the aircraft should be in for the current speed.

In a similar anthropological study on board a US warship, Hutchins (1995b) has studied how tasks such as fixing the location of the ship at sea are achieved by computations distributed across a number of individuals involved in a team navigation activity. As with the aircraft study, much attention is spent on describing how the technology used by the team helps transform analogue to digital representations and how different representations are brought together to generate new information. In addition, however, Hutchins explains how the work is organised to maximise error recovery or minimise the impact of error, and to distribute different cognitive tasks (e.g. sensing, recording and computing) amongst appropriate individuals. Some detailed consideration is also given to how actions in a shared plan are distributed amongst individuals and synchronised to ensure that the correct information processing steps are taken at the correct time.

As these examples show, a key feature of DC theory is that the unit of analysis is a network of technologies and actors and that cognition is viewed as a process of co-ordinating distributed internal representations (i.e. in the minds of team members collaborating together) and external representations (i.e. in the designed world of information artefacts) to provide a robust information processing system. While Hutchins' emphasis has been on team work, others from cognitive psychology have taken a DC approach to individual problem solving, where the emphasis has been on how the cognitive components of a problem are distributed between a human problem solver and the problem solving artefact.

Distributed cognition in psychology

Norman (1988) first introduced to the HCI community the idea that knowledge might be as much in the world as it is in the head. Norman's aim here was to point out that the information embedded in technological artefacts was as important to the achievement of a task as the knowledge residing in the mind of the individual who used that artefact. Furthermore, Norman argued, well-designed artefacts could reduce the need for users to remember large amounts of information, while badly designed artefacts increased the knowledge demands made of the user. The rhetorical aim of the knowledge-in-the-head, knowledge-in-the-world distinction was thus to draw designers' attention to the implications that design decisions have for cognitive processing. The cognitive systems engineering approach advocated by Norman is premised upon this view of cognition as distributed between user and artefact (Norman, 1986). In search of theoretical foundations for distributed cognitive systems, Zhang and Norman (1994) have taken the DC approach back into the laboratory to study human problem solving. Here, unlike much of Hutchins' work, the emphasis is not on collaborative team working, but rather on the interaction between an individual and a representational artefact, and a comparison of behaviours with different information representations. Zhang and Norman studied performance using the 'white rat' of problem solving, the Towers of Hanoi (ToH) problem. In its most common form, the Towers of Hanoi problem is an apparatus comprising three pegs. On one peg are stacked four disks of decreasing size.
The object of the game is to move the four disks one at a time from the starting peg to a finishing peg via the intermediate peg. Moves are restricted so that a larger disk cannot be stacked on top of a smaller disk. It is possible to generate many different versions of the Towers of Hanoi problem, all of which have the same underlying structure. These structural isomorphs have been used in the past to study issues such as transfer of learning across formally equivalent tasks and the role of context in problem solving, but the valuable contribution of Zhang and Norman is not the identification of the sameness, or abstract equivalence, of different versions of the Towers of Hanoi but rather the development of a systematic way of describing the differences between versions of the same problem. Zhang and Norman argue that differences between Towers of Hanoi versions can be captured by identifying whether the rules are represented externally in the problem solving apparatus or internally in the problem solver's memory. They also distinguish between rules that are implicitly represented and rules that are explicitly represented. For example, the rule that a disk cannot be placed on a peg occupied by a smaller disk can be explicitly represented in the instructions of the problem. Alternatively, in a version which uses containers that stack inside each other like Russian dolls, the fact that a larger container cannot be stacked inside a smaller one provides a physical constraint that is an implicit representation of the rule. In their analysis, Zhang and Norman measure performance on a range of Towers of Hanoi tasks and discover that problems requiring explicit internal representations of the ToH rules make greater demands on problem solvers than problems where the rules are implicit and externalised in the problem apparatus.
More recently, Zhang has attempted to provide a more substantial model for the analysis of distributed problem solving (Zhang, 1997). The model defines external representations as "the knowledge and structure in the environment, as physical symbols, objects or dimensions, and as external rules, constraints, or relations embedded in physical configurations." He goes on to argue that information in external representations can be 'picked up, analysed and processed' by perceptual systems alone. These are contrasted with internal representations, such as schemata and productions, which have to be retrieved from memory. Most interesting cognitive tasks have, in Zhang's terms, an internal and an external component (that is to say, they are distributed problems), and as a consequence the problem solving process involves integrating or exchanging information from these representations to produce new information. Inference is an example of such a process (where, for example, an externally represented piece of information serves as the antecedent to some internally represented conditional rule of action). Here we shall refer to this process more neutrally as computation. A traditional view of cognition would argue that such computation is a matter of internalising external representations, but as Zhang points out, this is not always necessary: "[external representations] need not be re-represented as internal representations in order to be involved in a distributed cognitive task: They can directly activate perceptual processes and directly provide perceptual information that in conjunction with internal representations, determine people's behaviour" (Zhang, 1997). Hutchins, perhaps, makes the point more concretely when he concludes: "As we have seen, a good deal of the computation performed by a navigation team is accomplished by processes such as hand-eye co-ordination.... The task of navigation requires internal representations of much less of the environment than traditional cognitive science would have led us to expect." (Hutchins, 1995b, p. 132)

Two key conclusions from Zhang's work are, firstly, that appropriate external representations can reduce the difficulty of a task by supporting recognition-based memory or perceptual judgements rather than recall, and secondly, that certain kinds of externalisation can trigger inappropriate problem solving strategies or inferences. This latter finding was further developed in Zhang (1996). This paper identifies a general class of displays, including alphanumeric, graphical and tabular displays, which he calls Relational Information Displays (RIDs). RIDs are "displays that represent the relations between information dimensions". Zhang categorises information dimensions into four types, nominal, ordinal, interval and ratio, each with characteristic properties forming a hierarchical structure. For example, nominal scales have the property that each item can be categorised as the same as or different from any other item, whereas ordinal scales have the nominal property plus the property that each item can be ranked (greater than/less than, next to, etc.). Using as an example the Desktop display of a Macintosh computer, Zhang goes on to compare the dimensional properties of represented information (for example, file size) with the dimensional properties of the representing display. The represented information structure is categorised in terms of the number of dimensions and the scale types of each dimension.
This is then used to classify different types of display. So, for example, an underlying information structure which has two dimensions, one of which is ratio and one of which is nominal, can be represented by a pie chart or a bar chart. Zhang goes on to consider how different information structures support general information processing tasks. For example, three general tasks are: information retrieval (e.g., what is the size of the Word file called work?), comparison (is the Word file final larger than the Word file work?) and integration (for example, integrating information about breadth and depth to determine area). Zhang argues that although there is no task-independent way of determining the best display, there is a general principle that can be derived from the RID framework, namely that "the information perceivable from a RID should exactly match the information required for a task, no less and no more". If the display does not obey this so-called "mapping principle" then the task cannot be achieved unless the user has other information internally represented. The mapping principle is useful in highlighting that displays are not meaningfully evaluated in a task-independent way, and also in highlighting the relation between externally represented information and the residual information required to complete some task. It is usually the case, of course, that displays have to support more than one task, and hence they need to be evaluated in the total task context.

Motivation for the resources model

Those aspects of the DC literature reviewed above provide an indication of the breadth of application of the basic ideas of this research paradigm. A number of commonalities exist amongst these basic ideas.
In summary, the DC position is that the appropriate unit of analysis is not an individual but rather a distributed cognitive system of people and artefacts. This distributed system is studied by analysing the way in which distributed representations are co-ordinated, propagated and transformed. By analysing this process of representation and re-representation we are able to account for how the tasks of humans in the system are transformed and made more or less difficult.

For modelling HCI, the DC paradigm has some obvious attractions. It might be used to understand how properties of objects on the screen can serve as external representations and reduce cognitive effort (Scaife and Rogers, 1996). It can also serve to bring together work on CSCW and HCI by considering how technology mediates the propagation of representations between individuals. The deliberate softening of the boundary between the user and the system inherent in the DC view also brings into focus the design question of the information requirements for interaction. What information is required in order to carry out some task, and where should it be located: as an interface object or as something that is mentally represented by the user?

But as it stands, the DC paradigm also has some limitations as a model of HCI. Central to such a model there needs to be an account of action. We see this in all HCI models. For example, the GOMS concept of operations ties representations to action (Card, Moran and Newell, 1983), and in a less formal account, Norman's execution-evaluation loop (Norman, 1988) links the mental representation of plans to actions and their effects on the world. The RIDs framework is concerned more with inferences drawn from representations, and tasks are seen as something supported more or less well by representations. Indeed, Zhang's notion of task is very specific to information search and does not connect with tasks more familiar to HCI. For Zhang and Norman's study of the ToH problem, the link from representation to action is presumably through a notion of planning or plan construction, but plans themselves are not viewed as representations that could potentially be distributed between device and problem solver. Hutchins' account of how a cockpit remembers its speeds emphasises representations of the state of the world (the speed and configuration of the aircraft, its proximity to landing and so on) and goals to be achieved (keeping the airspeed above a certain limit and so on). There are obvious connections between this work and Zhang's mapping principle, but how these state representations inform the actions of the pilots is less explicit. Elsewhere, Hutchins has considered in some detail the representation of plans for action in the form of written procedures for achieving tasks. He has examined how these are distributed in artefacts and how they are internalised by people (Hutchins, 1995a). Within artificial intelligence, plan construction and plan-following are paradigmatic of the link between representation and action, but there a plan is seen as a control structure or program which is deterministically followed. An alternative view has been put forward by Suchman (1987). She argues that plans are resources for action.
To make sense of plans as resources for action, plans themselves must be treated not as program-like control structures but rather as possible objects of conscious attention that can be manipulated, externalised and evaluated.1 Viewed in this way, plans, like RIDs, can be seen as distributed representations. But where RIDs are representations of state information, plans are representations of possible courses of action. The pilot's checklist for final approach described by Hutchins (1995a) is a plan partly internalised and partly externalised. It is a representation of a course of action, not state, but in order for the plan to be carried out it may need to be co-ordinated with other distributed representations, such as representations of the state of the aircraft. In addition, the work as practised may not follow precisely the plan as externalised, but the plan nevertheless provides a structuring resource for the activity.

But plans are only one kind of resource for action, and a rather self-evident one. When we visit a restaurant, we may have no plan as to what to order. We may rather choose to view the menu and decide on the basis of this. We might also recall the last few times we have been at the restaurant, what we have eaten on those occasions, and then eliminate those from the set of possible choices represented by the menu. Here we are using two resources for action. The menu is a representation of what actions are possible, and our memory of the visits is a history of our previous choices.

In pursuit of a distributed cognitive model of human-computer interaction, we have identified and defined a number of resources for action and explored how different system implementations distribute these resources between people and technologies.
1. It is not our intention here to suggest that all control of action must necessarily be deliberate or available to introspection. This is clearly not the case. Rather we mean to suggest that plans can be reasoned about, altered, communicated to others, diverged from, and so on. Neither do we mean to suggest that, even as conscious objects, they are necessarily formulated prior to engaging in action. Plans may emerge from interaction in a situated way (see Suchman, 1993).
Like Zhang and Norman's ToH rules, we have also observed how these resources are sometimes explicitly represented and sometimes less so. In this model, which we refer to as the resources model, Zhang's RIDs and Suchman's plans are but two of a number of possible resources for action. Our unit of analysis, like Hutchins', is the distributed system, but in the examples we use below to introduce the model, we are concerned only with single users as a special case of the application of the model. One of the consequences of the resources model is that different styles of interaction, or what Zhang might refer to as problem solving strategies, emerge as a consequence of how the interaction is resourced, both in terms of what resources are available and in terms of how and where they are represented (or implemented).

THE RESOURCES MODEL

The resources model (Wright, Fields and Harrison, 1996; Fields, Wright and Harrison, 1997) identifies six abstract information structures that can be used to classify the types of information that inform interaction in a distributed cognitive system. Following Suchman's critique of plan-based models of cognition, we argue that these information structures are not cognitive control structures; rather they are representations in Hutchins' sense, objects of perception and cognition that can be consciously appraised, evaluated, and modified by human agents in a distributed system. The information structures of the resources model can be defined as abstract information types in the same way that Zhang defines a taxonomy of information displays and Zhang and Norman define the information structures of the ToH. We also distinguish between an abstract information structure and its representation (or implementation) as a component of a distributed system. Thus we can consider two instances of interaction as equivalent (or isomorphic, to use Zhang and Norman's terminology) at an abstract level by virtue of involving the same abstract information structures, but these same instances of interaction might be non-equivalent by virtue of the different ways these abstract structures are implemented or represented. One possible implementation difference is the distribution of the resources across agents in the system. A second possible implementation difference is how the abstract information is displayed. This is analogous to Zhang's distinction between the properties of the represented and representing structures. A third possible implementation difference is whether the abstract information is implemented in an explicit or an implicit way, in the sense intended by Zhang and Norman (1994).

The six information structures are plans, goals, affordances, history, action-effect relations and current state. These types have been derived from the HCI and CSCW literature in an inductive way. For example, the concept of plan comes from the work of Suchman (1987) and Schmidt (1997), the concept of an action-effect mapping is a response to the work of Monk (1990), and so on. It is entirely possible that more structures could be required. The six already identified are described below.

Abstract information structures

Plans: In the resources model, a plan is a sequence of actions that could be carried out. Plans can involve conditional branches that are dependent on the current state of the world or system (see below). Plans can be represented internally as memorised procedures to complete some task. They can also be represented externally as step-by-step procedures, for example checklists, standard operating procedures, or user instructions.
They may also be represented across individuals, as in the case of Hutchins' description of team navigation (Hutchins, 1995b).

Goals: Goals, like plans, are traditionally viewed as mental constructs, but within the resources model goals, like plans, are abstract information structures that can be represented internally or externally. An everyday example of an externalised goal is a shopping list. A shopping list can be used to represent the state of the world that a person wishes to bring about, namely that all of the listed items are in a shopping trolley. Partial achievement of this goal often occurs when only a subset of the items can be found on the shelves. Note that in this example the list is not a plan: it does not prescribe an order in which actions should be carried out, nor what to do if some item on the list is missing. To consider a more technological example, aircraft flight control interfaces often include something called a flight director. This indicates to the pilot which way she must steer in order to continue on the desired flight path. It is not a plan because it does not describe the actions required to get the aircraft onto the right path; it is a goal because it specifies a state of the world to be achieved.2
2. More precisely, the relation of the represented position of the aircraft to the represented target position, in the context of an intention to follow a planned route, generates, for the user of the resources, a goal to close the gap.
Affordances: The term affordance has been used by others in the HCI literature. Norman (1988), for example, uses the term to refer to some intuitive way of using an artefact. So, for example, certain kinds of door handle are intuitive in that they afford pushing not pulling, or vice versa. Within the resources model we use the term affordance in a more fundamental way, to refer to the set of possible next actions that can be taken given the current state of the system. Everyday examples of affordances are menus (both on computers and in restaurants), which represent the set of possible things that can be chosen, and road junctions, which represent the set of possible routes to take. In both cases the artefact or situation affords a set of possible actions. Whether or not one particular action is more intuitive or appealing is a representation or implementation issue. Notice that menus sometimes indicate not only the possible actions but also actions that are temporarily unavailable.

History: An interaction history is a list of the actions already taken, or states achieved, to get to the current state. A UNIX command line interface, for example, displays some of the previous commands that the user has issued. Undo commands in word processors and back and forward commands in web browsers rely on a representation of history. Plan-based models of human-computer interaction, such as Norman's, tend not to consider the effect that previous actions have on the on-going stream of interaction.

Action-effect: An action-effect relation is a statement of the effect that an action will have if it is carried out. If the effects of a set of possible actions are known, we refer to this resource as an action-effect model. Most user manuals and help systems provide this information (Monk, 1990).

Current state: The current state is the collection of relevant values of the objects being acted on. In the case of the ToH puzzle analysed by Zhang and Norman, it is the positions of the four disks on the three pegs. In Hutchins' cockpit example, it is the values of the flap and slat settings and the current indicated air speed. The user may carry out an action which brings about a particular state that is judged to be nearer to the goal.

Processing in the resources model

Within the resources model, action is informed by the configuration of resources represented in the interaction at any particular time, either externally in the interface or internally in the head of the user. When an action is taken, the configuration of available resources changes. Minimally, for example, the history of the interaction changes when an action is taken. By resource configuration we mean the collection of information structures that find expression as representations at a given step in the interaction and which can be used as resources for action. By using the term "action" we do not confine ourselves to atomic actions such as selecting a menu item; at a given point in the interaction, the resource configuration may be sufficient to determine a whole sequence of such primitive actions. An interaction sequence can thus be described as a number of steps from one resource configuration to another. We also make a distinction between our primary concern, which is a logical action such as choosing a menu item, and the physical instantiation of that action, such as moving a mouse cursor to a particular area of the screen.
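To make the abstract types concrete, the following is a minimal sketch, not part of the original model, of how the six information structures and a resource configuration might be written down as data types. The Python names and fields are our own illustrative assumptions.

```python
# Illustrative sketch only: the six abstract information structures of the
# resources model expressed as simple data types, plus a resource configuration
# that records where each resource is represented ('user' or 'system').
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Plan:
    steps: List[str]                  # a sequence of actions that could be carried out


@dataclass
class Goal:
    desired_state: Dict[str, object]  # a state of the world to be brought about


@dataclass
class Affordances:
    possible_actions: List[str]       # the possible next actions in the current state


@dataclass
class History:
    actions_taken: List[str]          # actions already taken to reach the current state


@dataclass
class ActionEffects:
    effects: Dict[str, str]           # action name -> description of its effect


@dataclass
class CurrentState:
    values: Dict[str, object]         # relevant values of the objects being acted on


@dataclass
class ResourceConfiguration:
    """The resources represented at one step of an interaction, together with
    their locations ('user' for internal, 'system' for external)."""
    resources: Dict[str, object]
    locations: Dict[str, str]

    def after(self, action: str) -> "ResourceConfiguration":
        """Taking an action yields a new configuration; minimally, the history grows."""
        history: History = self.resources.get("history", History([]))
        resources = dict(self.resources, history=History(history.actions_taken + [action]))
        return ResourceConfiguration(resources, dict(self.locations))
```

The after method captures only the minimal change noted above; in a real interaction the current state, the affordances and other resources would typically change as well.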
Co-ordination

The decision as to what logical action to take involves bringing together these different representations through processes which Hutchins describes as co-ordination. Co-ordination processes are discussed in great detail by Hutchins (1995b); in this paper we shall use the term in a relatively restricted sense. How easy it is to co-ordinate one or more resources is determined in part by the location and physical attributes of their implementation. For example, our restaurant diner described above requires both the menu and a memory of the meals she has chosen previously in order to decide what to have for her meal. The memory of previous choices is a history and the menu is a set of affordances. The history is an internal resource and the affordances an external one. The process of co-ordination in this case requires our diner to read and understand the menu, to recall accurately enough the names of previously eaten meals, and to search for items in the menu that are not in the history. Thus the diner can identify a smaller set of possible choices but not necessarily single out a particular choice. In order to illustrate co-ordination, we return to two examples discussed previously: achieving target speeds (adapted from Hutchins, 1995a) and following a checklist (Hutchins, 1995b; Wright, Fields and Pocock, 1998).
Achieving target speeds: Co-ordinating goals and states

In Hutchins' example of cockpit speeds, described above, it is necessary for a correspondence to be maintained between the aircraft's speed (which the pilot will tend to want to reduce as the approach progresses) and the settings of the flaps (which must be extended to maintain lift as the speed is reduced). Part of this process involves knowing the "target speed" at which the flap setting is to be increased; it is necessary to notice when the declining speed reaches this target, at which point the flap setting should be adjusted. One of the co-ordination processes that is carried out is therefore to make a comparison between a target or goal state (the target speed) and the current state (i.e., the current speed). In order to do this, the goal and current state resources must be brought into co-ordination, and precisely how this happens is highly dependent on the way the resources are represented. Figure 1 shows three examples of how the goal and current state resources may be represented to support this activity; in each case the process of co-ordinating resources to make the task-relevant comparison is different.

[Figure 1 shows three airspeed representations: (a) an analogue dial marked in knots and Mach, with a pointer indicating the current state and a speed bug marking the target; (b) a numeric readout of the current speed, with the target held in the pilot's memory; (c) numeric readouts SPEED: 246 and TARGET: 127, together with the annotation TOO FAST.]
Figure 1: Co-ordinating goal and state resources.
a) The current and target speeds are represented as a pointer and "speed bug" respectively. The process of comparison and co-ordination then becomes a perceptual judgement about the relative locations of the pointer and the speed bug.
b) The current speed is represented in a numeric display line, and the target is represented internally (i.e., remembered) by the pilot. In this case the process of co-ordination involves the ability of the pilot to read the display, interpret it (employing knowledge about decimal numerals), and make the comparison with the remembered target speed to determine whether the current speed is higher or lower than the target.3
c) Both current and target speed values are represented in numerical form in the flight deck display, allowing the pilot to make an explicit co-ordination similar to case (b), with the difference that both values are read and interpreted, and neither of them is represented only in the pilot's memory. In addition, the flight deck computer system also makes the comparison and displays the result as a piece of text in the display.

Following a checklist: Co-ordinating plans and histories

As a second example, consider the activity of following a checklist, an instance of an activity we refer to as plan-following. Plan following is possible when what Hutchins (1995b) refers to as a pre-computed plan exists and the user follows it. In abstract terms the process for selecting the next action to perform is simple: compare the complete plan with the history of which actions have already been performed in order to determine which is the next item on the plan. Once again, the actual work involved in accomplishing this co-ordination is highly dependent on the representational form of the resources. In Figure 2, three illustrations are given of how the nature of the plan and history representations determines the kind of co-ordination work that is needed for plan-following.
3. Note that interpreting the numeric display relies on a further co-ordination of information structures, both internal and external, that we won't explore here, but see Zhang (1997) or Hutchins (1995b) for a detailed discussion.
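In abstract terms, both of the co-ordinations just described are simple computations over resources. The sketch below is our own illustration, not the authors' formalisation; the function names and the TOO SLOW/ON TARGET branches are assumptions, while the example values are taken from Figures 1(c) and 2.

```python
# Illustrative sketch: two co-ordination processes expressed as functions.

def compare_speed(current_knots: float, target_knots: float) -> str:
    """Co-ordinate a current-state resource with a goal resource, as the flight
    deck computer does in Figure 1(c) when it displays 'TOO FAST'."""
    if current_knots > target_knots:
        return "TOO FAST"
    if current_knots < target_knots:
        return "TOO SLOW"      # assumed branch; only TOO FAST appears in the figure
    return "ON TARGET"         # assumed branch


def next_action(plan: list, history: list):
    """Co-ordinate a plan resource with a history resource, as in checklist
    following (Figure 2): the next item is the first plan step not yet performed."""
    performed = set(history)
    for step in plan:
        if step not in performed:
            return step
    return None                # plan exhausted


if __name__ == "__main__":
    print(compare_speed(246, 127))                                   # TOO FAST
    print(next_action(["Action A", "Action B", "Action C", "Action D"],
                      ["Action A", "Action B"]))                     # Action C
```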
[Figure 2 shows three checklist representations: (a) a printed checklist listing Action A to Action D, with the user's finger marking the next item; (b) an electronic checklist in which actions already performed are shown on a grey background and the next action is highlighted; (c) a display that shows only the next action, "Do Action C next".]
Figure 2: Co-ordinating plans and histories.
a) The "plan" resource is represented as a printed checklist of actions to perform. The "history" is differentiated by the user of the checklist marking the next item to perform with her finger. The co-ordination simply involves looking for the item next to the finger. Note that this means the representation of the plan, the next action to perform, and the history of actions that have been performed are very tightly interrelated.
b) Here, the plan, the history and the marker for the next action are all maintained electronically by the computer system and are represented in a single display. Actions performed already (the intersection of the history and the plan) are shown on a grey background, and the next action is shown in reverse type. Again, the co-ordination is trivial: it has already been performed by the machine.
c) This is a variation on (b): both the plan and the history are stored inside the machine, but only the next action is actually displayed.

As both Figures 1 and 2 illustrate, in some situations co-ordination can be particularly well supported if the resources that need to be co-ordinated are represented in the same external space. In Figure 1a, for example, state and goal are co-located in the air-speed indicator. Similarly in Figure 2a, by introducing her finger as a marker, the pilot produces a co-ordination of history and plan in a single external representation. When this occurs we describe the resources as being integrated.

Interaction Strategies

It is not intended that the co-ordination of resources deterministically generates a next action or sequence of actions in an activity. Rather, co-ordination in our sense of the term involves combining information resources to guide action. Our restaurant diner need not use history and the menu to choose something new to eat; she may use exactly the same resources to choose something she previously enjoyed. The goal is the same (to choose something to eat), the history and the affordances are the same, yet the action taken is completely different. We use the term interaction strategy to describe the different ways in which resources can be used to make decisions about action. It should be emphasised that a strategy does not emerge as a consequence of the available resource configuration, but is an active process of interpreting representations as resources. Within the cognitive psychology literature there have been many demonstrations of the way in which people adapt their problem solving strategies to exploit constraints and opportunities in the problem solving environment (Sternberg and Wagner, 1994). To take an everyday example, a learning test which adopts a multiple-choice answer format allows the student to answer questions by directly recalling the answer from memory or by eliminating the implausible responses from the set of responses offered. The success of the elimination strategy will depend on the details of the responses offered. Thus the ability of the student to answer the question is a product not only of his learning but also of the details of the test design. In the case of HCI a similar distinction can be made: for example, command line interfaces require the user to recall appropriate commands, while menu-based interfaces support a range of interaction strategies, such as eliminating implausible options.

Interaction strategies presuppose certain configurations of resources to make them effective; conversely, a particular configuration of resources makes particular interaction strategies possible. Below we describe a range of interaction strategies relevant to HCI and relate these to the information resources they require.
Plan following

As an interaction strategy, plan-following involves the user in co-ordinating a pre-computed plan with the history of actions so far undertaken. In its simplest form, the plan is followed by determining the next action on the list until the list is exhausted. Some plans may have conditional steps requiring the plan-follower to examine the current state of the system. Plans are usually followed for a particular goal, but it is possible to follow a plan blindly without any knowledge of what it will accomplish. A pre-computed plan is central to the plan-following strategy; thus a plan-following interaction will have the plan as a resource, either externalised in the interface (see Figures 2a and 2b), recalled by the user, or recorded in some other form (as an operating procedure kept in a handy manual, for example). In a plan-following interaction, the plan and the interaction history need to be maintained, and co-ordination will combine them in order to keep a sense of position within the plan. Aspects of the current state and goal may also be co-ordinated with this position to deal with conditionals or to assess whether the goal has been achieved.

Plan construction

Plan following requires the use of a pre-computed plan. A more elaborate activity may involve an initial phase of plan construction, followed by a phase of following the plan that has been defined. While the output of a plan-following strategy is a sequence of actions that is taken, the output of a plan construction activity is a plan for use as a resource. Plan construction typically requires the user to compare the current state of the world with some goal state and to select from the possible next actions those that reduce the difference between the two states. Plan construction thus co-ordinates four types of information resource: goal, current state, action affordances and action effects. Plan construction may be a complicated process because, as the state changes, the actions that are available or appropriate will also change. This makes plan construction a costly activity if it is done entirely in the head of the user. Resources necessary for plan construction can be externalised, however. A system that supports plan construction might, for example, make the developing plan available for manipulation by the user and clarify the nature of the changing availability of the actions and their effects. Plan construction itself will require the planner to adopt some form of strategy to construct the plan. Artificial intelligence research has produced many such plan construction strategies (see, for example, Russell and Norvig, 1995).

Goal matching

In many situations, humans do not act by recalling and following previously formulated plans, nor by computing plans to follow 'on the fly'. Many of the studies of users of graphical user interfaces conclude that this is not how users act, and that they often have rather poor recall of the tasks being performed and the interfaces used (see, for example, Mayes, Draper, McGregor and Oatley, 1988; O'Malley and Draper, 1992). One alternative to plan construction or plan-following is goal matching. In a goal matching strategy, the user decides what to do in a much more localised way by matching the effects of an action with the current goal, and checking to see if the resulting system state satisfies the goal. The resources co-ordinated to support this strategy are a goal, a set of affordances and action-effect relations.
When the effects and affordances are represented externally in the interface, the strategy has been called display-based interaction (Howes and Payne, 1990). Kitajima and Polson (1995) have put forward a model of this strategy, but referred to it as label following. Others have referred to it as semantic matching (Lewis, Polson, Wharton and Rieman, 1990). While this may seem a rather minimalist strategy, it is common among novice users of menu-driven interfaces, particularly when the activity is akin to browsing, but it is also found in more sophisticated systems where the complexity of the problem space makes planning very difficult. Graphical user interfaces provide support for goal matching by providing, in the interface, representations of the commands that are available and the effects that they have (see Figure 3).
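As a rough illustration of goal matching (ours, not a published algorithm), the effect descriptions that an interface externalises can be scored against the user's goal. The word-overlap scoring and the AutoSum entry below are purely hypothetical.

```python
# Illustrative sketch: goal matching as selecting the affordance whose
# externally represented action-effect description best matches the user's goal.

STOPWORDS = {"a", "an", "the", "of", "or", "for", "from", "to", "in", "as"}


def goal_match(goal: str, action_effects: dict) -> str:
    """Return the action whose effect description shares the most non-trivial
    words with the goal. `action_effects` maps action names to effect text."""
    goal_words = set(goal.lower().split()) - STOPWORDS

    def overlap(action: str) -> int:
        effect_words = set(action_effects[action].lower().split()) - STOPWORDS
        return len(goal_words & effect_words)

    return max(action_effects, key=overlap)


if __name__ == "__main__":
    effects = {
        "ChartWizard": "Creates embedded chart or modifies active chart",
        "AutoSum": "Inserts a sum of the numbers in a column",   # hypothetical entry
    }
    print(goal_match("make a chart from the selected data", effects))  # ChartWizard
```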
[Figure 3 shows the goal "to make a chart" being matched against an action affordance, the ChartWizard button, whose action effect is described as "Creates embedded chart or modifies active chart".]
Figure 3: Matching a goal to an action affordance

History-based selection and elimination

HCI has given little attention to the role of history in decisions about action, but as we saw with our earlier story about the restaurant diner, a possible strategy for choosing among affordances is to eliminate those that have already been chosen. Alternatively, history could be used not to eliminate but as the basis for selection. Interfaces that support these strategies might have some inspectable representation of history, such as the "Go" function available in many web browsers. A key feature of web browsers is that the history can also be used as an affordance: clicking on an item in the history list takes you there, providing strong support for history-based selection.4

The interaction strategy a user adopts will, in part, be shaped by the resources that are available to her. Table 1 summarises the description of interaction strategies given above and relates these to the abstract resources that are required for their use.

Strategy | Resources required
plan following | goal, plan, history (current state if conditional plan)
plan construction | goal, current state, affordances, action-effects
goal matching | goal, affordances, action-effects
history-based choice | goal, affordance, history
Table 1: Strategies and the resources they require

In designing an interactive system, decisions are made about which resources are externalised in the interface, and thus different interface designs support different strategies to different degrees. But there is a danger here of over-simplification leading to erroneous conclusions about good design. Simply externalising all resources will not guarantee a good design. This is so for a number of reasons. Firstly, making a resource available to the user through externalisation does not guarantee it will be used. Secondly, in some situations having resources represented internally can confer advantages. Thirdly, a given resource configuration may support a number of interaction strategies. For these reasons, knowing what resources are externalised does not in any useful way allow a designer to determine what strategy will be adopted. However, the resources model does have some practical value in the design and analysis of interactive systems, as will be demonstrated in the next section.
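To illustrate the third point, that a given resource configuration may support several strategies, Table 1 can be read as data. The following minimal sketch (our illustration, not part of the original model) checks which strategies a given set of available resources could, in principle, support.

```python
# Illustrative sketch: Table 1 as data.  Resource names follow the table;
# 'plan following' may additionally need the current state for conditional plans.

STRATEGY_RESOURCES = {
    "plan following":       {"goal", "plan", "history"},
    "plan construction":    {"goal", "current state", "affordances", "action-effects"},
    "goal matching":        {"goal", "affordances", "action-effects"},
    "history-based choice": {"goal", "affordances", "history"},
}


def supported_strategies(available: set) -> list:
    """Strategies whose required resources are all present in the configuration,
    whether represented internally or externally."""
    return [name for name, required in STRATEGY_RESOURCES.items() if required <= available]


if __name__ == "__main__":
    print(supported_strategies({"goal", "affordances", "action-effects", "history"}))
    # -> ['goal matching', 'history-based choice']
```

As the sketch makes plain, knowing that a configuration could support a strategy says nothing about whether the user will actually adopt it, which is the point made above.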
4. Note that in many commonly used browsers, the options available in a "Go" menu are not a true representation of the interaction history (see Wright, Fields and Merriam, 1997).
USING THE RESOURCES MODEL TO ANALYSE INTERACTION

In this section we describe three ways in which the resources model has proved useful in the analysis of interaction: as a means of comparing different dialogue styles, as a means of evaluating interaction episodes, and as a means of better understanding differences in performance with different interface styles.

Comparing interfaces

In Figure 1 we examined a number of different implementations of speed information in a cockpit and in Figure 2 we compared a number of different ways to implement checklist following. These examples were largely fictitious but were useful in showing how interface implementations can be compared in terms of the different resources they implement. Figures 4 and 5 illustrate examples of actual interfaces that can be compared in this way.
Figure 4: Microsoft chart making “wizard”
Figure 5: ClarisWorks chart options dialogue

Figure 4 shows a dialogue from a Microsoft spreadsheet application and Figure 5 a dialogue box from a similar ClarisWorks spreadsheet. In order to complete a chart in the Microsoft application, the user goes through a prescribed sequence of dialogue boxes by clicking the next button after various choices have been made in the box itself. In the Claris application the same effect is achieved in one dialogue box and the order in which changes and parameter selections can be made is unconstrained. A resource-based account of these differences would point to the support that is given to plan-following in the Microsoft application.
In total there is a sequence of five dialogue boxes through which the user progresses, and these represent a high-level plan for generating the chart. This sequence of dialogue boxes is an externalisation of a plan resource. The implementation does not, however, make the plan explicit to the user, since at any one time only a small fragment of the plan is visible in the form of the current box (cf. Figures 2a and 2c for an analogous distinction). This plan is externalised in the interface and it prescribes the order in which actions are taken. The user need have no memory of this plan in order to interact with the application. In the Claris application, there is no plan-following resource externalised for the user. The order in which the actions are carried out is unconstrained. For plan following to occur with this interface the plan must be known by the user. An alternative strategy that can be used, however, is goal matching based on the meanings of the five 'modify' buttons, or an elimination strategy based on previous history. These differences are summarised in Table 2 below.

 | Claris application | Microsoft application
resources externalised | goal, affordances | goal, plan, affordances, current state, action effect
strategy supported by the design | goal matching | goal matching and plan following
Table 2: Resources and strategies in two spreadsheet applications

While the difference between the two interfaces is easily apparent, surprisingly few models of interaction provide the necessary concepts to capture that difference. A task-based model might capture the concurrent/sequential distinction, but what is harder to see from such a model is what that difference means in terms of how the interaction needs to be managed by the user. But viewing the difference as the presence or absence of an external plan immediately leads to a consideration of what strategies the user might adopt for achieving the goal and what implications these have for ease of use. In the next section we shall analyse the Microsoft application in more detail to explore further issues of ease of use in relation to resources and strategies.

Analysing interaction episodes

The resources model can be used to identify parts of an interaction episode where scarcity of externalised resources places heavy demands on the user's knowledge and consequently might shape their choice of interaction strategy. As an example, we consider in more detail the use of the dialogue wizards described in the previous section. This example also highlights the way in which, even within a relatively small episode of interaction, we might find changes of interaction strategy and a hierarchical organisation of strategies, reflecting more than one level of action occurring simultaneously. In order to illustrate the activity of chart making, we assume an imaginary user with no previous experience of dialogue wizards and attribute to her the top-level goal of displaying the selected data as a chart in an area adjacent to the data. This goal is a resource that is represented in the head of the user. It is not represented in the interface at all (see Figure 3).

Step 1

At the start of the interaction a number of action affordances are available to the user, most particularly the ChartWizard button in the toolbar. In common with many modern user interfaces, when the pointer is moved over a button, a pop-up text box appears, showing the name or a brief description of the function activated by the button (see Figure 6). This can serve as an external representation of an action-effect mapping, providing a resource that aids matching between chart-making goals and the affordances provided. Figure 3 illustrates the screen at the start of this step and Table 3 summarises the resource configuration, location and relevant content.
Figure 6: Toolbar buttons and pop-up text
Resource type | Relevant content | Resource location
Goal | To display selected data as a chart | User
Current state | Selected data range | Selected data is visible and is highlighted
Affordance | Chart wizard function | Wizard button on the toolbar
Action effect | Chart wizard assists in specifying the parameters for a new chart | When the mouse is over the button, a pop-up text box indicates the button's function; text at the bottom of the screen gives a more detailed description
Interaction history | none | -
Table 3: Resource configuration at the start of the episode (step 1)

Step 2

After clicking on the chart wizard button to start the chart production process, new interaction resources are configured. The spreadsheet window changes so as to appear as in Figure 7, and the resources that are available in this step are summarised in Table 4. The mouse pointer has changed shape to a cross hair and the border around the selected data has also changed. These external resources, on their own, are insufficient for the user to decide what action to take next. But at the bottom of the screen, an action-effect mapping is expressed: "drag in document to create a chart". This action-effect mapping, in the context of the user's goal, is the key resource for action and supports a goal-matching strategy; however, it is easily overlooked, placed as it is at the bottom left of the screen.
Figure 7: The screen after clicking on the Chart Wizard button (start of step 2)

Resource type | Relevant content | Resource location
Goal | To display selected data as a chart | User
Current state | Selected data | Selected data visible, border is dashed line
Affordance | Dragging action | Shape of mouse pointer
Action effect | "drag in document to create a chart" | Text at the bottom of the display
Interaction history | Selection of wizard, change of display relating to selection, change to list of affordances | User
Table 4: Resource configuration after clicking on the wizard (start of step 2) Steps 1 and 2 are not supported by an external plan and their correct execution depends on the user having some resources internally represented. In particular the second step relies entirely on the user seeing and understanding the action-effect information at the bottom of the screen. They must also be able to relate that external resource to their mental representation of the top-level goal as part of a goalmatching strategy.
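To make the idea of a resource configuration concrete, the following sketch (our own illustration, not an implementation described in the paper; the class and field names are invented for the example) encodes the step 2 configuration of Table 4 as a simple data structure that records, for each resource, its type, its content and whether it is represented internally (in the head of the user) or externally (in the interface).

```python
from dataclasses import dataclass
from enum import Enum, auto

class ResourceType(Enum):
    GOAL = auto()
    CURRENT_STATE = auto()
    AFFORDANCE = auto()
    ACTION_EFFECT = auto()
    INTERACTION_HISTORY = auto()
    PLAN = auto()

class Location(Enum):
    INTERNAL = auto()   # represented in the head of the user
    EXTERNAL = auto()   # represented in the interface

@dataclass
class Resource:
    rtype: ResourceType
    content: str
    location: Location

# The resource configuration at the start of step 2 (cf. Table 4); wording is illustrative.
step2 = [
    Resource(ResourceType.GOAL, "display selected data as a chart", Location.INTERNAL),
    Resource(ResourceType.CURRENT_STATE, "selected data visible, dashed border", Location.EXTERNAL),
    Resource(ResourceType.AFFORDANCE, "dragging action (cross-hair pointer)", Location.EXTERNAL),
    Resource(ResourceType.ACTION_EFFECT, "drag in document to create a chart", Location.EXTERNAL),
    Resource(ResourceType.INTERACTION_HISTORY, "ChartWizard selected", Location.INTERNAL),
]

def internal_demand(configuration):
    """The resources the user must hold in the head at this step."""
    return [r for r in configuration if r.location is Location.INTERNAL]
```

Querying internal_demand(step2) makes visible what the analysis above argues informally: only the goal and the interaction history are held in the head, but the goal-matching strategy cannot proceed unless the externally represented action effect is actually noticed.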
Step 3

Carrying out the dragging action leads to the screen configuration shown in Figure 8. The dialogue box has now appeared, marking the starting point of the plan-based part of the interaction. The length of the plan and the user's current position in it are clearly indicated. This step triggers a new and unexpected sub-goal for the user, that of selecting appropriate data for charting. The user has an action history that is internally represented which, if consulted, will indicate that this action has already been carried out. In addition to the internally represented action history, state information, expressed on the screen as a box around the data and a range of data co-ordinates, indicates which data is currently selected. Even so, the user must remember whether the information selected is the information he or she wishes to chart. Note also that once the next step is taken, the box around the data disappears and the information about which data is going to be charted is no longer visible. The user must rely entirely on memory. See Table 5.
Figure 8: The screen after dragging a presentation area (start of step 3)

Resource type | Relevant content | Resource location
Goal | Displaying the selected data as a chart in an area adjacent to the data, and a sub-goal of modifying the data selection | In the head of the user for the top-level goal; changing the data selection is indicated in the dialogue box
Current state | Selected data | The selected data is visible on the display; border highlighting changes. The current data selection appears as a default entry in the dialogue box
Affordance | Information about selected data, buttons indicating movement forwards through the plan, other possible actions | Implemented in the system (buttons)
Action effect | Information about the data entry box, information about the nature of the buttons | Some of this depends on standard knowledge of Macintosh dialogue boxes; some is provided by information on the buttons
Interaction history | Position in the defined plan; data has already been selected | Indicated on the dialogue box; remembered by the user

Table 5: Resource configuration after dragging (start of step 3)

Step 4

The next step involves defining the nature of the chart to be produced based on the selected data. A number of alternative images are offered suggesting the style of chart that is to be produced. Highlighting is supposed to afford an association with the idea that this is the type of chart to be selected in the context of the rest of the plan. Buttons are offered in order to continue the plan. Other buttons are also offered which suggest the effects of either aborting the plan and returning to the beginning, having had no effect, or finishing, which suggests completion taking all the defaults that are offered in the remaining steps of the plan. See Figure 9 and Table 6.
Figure 9: The screen after selecting data (start of step 4)

Resource type | Relevant content | Resource location
Goal | Displaying the selected data as a chart; select type of chart | User; type of chart is depicted in the display, selected by highlighting
Current state | Selected data, type of chart | Selected data is visible; selected chart type highlighted on the display
Affordance | Buttons to navigate through the plan; chart types can be selected | Implemented in the system (buttons); use of highlighting, knowledge in the head about how the mouse can be used for selection
Action effect | Information about what the buttons do, and what the effect of the selected chart is | Provided by button labels and the user's ability to relate the image of the chart to the nature of the outcome of the plan
Interaction history | Position in the defined plan | Indicated on the dialogue box

Table 6: Resource configuration after selecting data (start of step 4)

Interestingly, the images serve not only as a goal resource indicating what charts are possible but also as action affordances, since clicking on an image will move the user towards the chosen goal. As well as being expressed implicitly as a gallery of images, the new sub-goal is also expressed explicitly by the prompt 'Select a chart type:'. Before this point in the interaction, the user may not have had an explicit type of chart in mind, and may not even have known about some of the possibilities. Hence the action taken is an emergent property of the interaction of external and internal resources. Choosing an appropriate chart itself requires an interaction strategy. The gallery of images supports a goal-matching strategy or some form of elimination. This strategy is embedded within the higher-level plan-following strategy for completing the top-level goal.
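The embedding of one strategy within another can be pictured with a brief sketch (ours, purely illustrative; the wizard steps and gallery entries are hypothetical stand-ins, not the actual ChartWizard dialogue) in which a plan-following loop over the wizard's externally represented steps invokes a goal-matching selection when it reaches the chart-type gallery.

```python
# Illustrative sketch of strategy nesting: plan following at the top level,
# with a goal-matching sub-strategy invoked at one step of the external plan.

WIZARD_PLAN = ["select data", "choose chart type", "set options", "add labels"]  # hypothetical steps
CHART_GALLERY = ["column", "bar", "line", "pie"]                                 # hypothetical gallery

def goal_matching(goal_description, candidates):
    """Pick the candidate that best matches the internally held goal."""
    matches = [c for c in candidates if c in goal_description]
    return matches[0] if matches else candidates[0]  # fall back to a default

def plan_following(goal_description):
    """Step through the externally represented plan, delegating where a choice is forced."""
    for step in WIZARD_PLAN:
        if step == "choose chart type":
            choice = goal_matching(goal_description, CHART_GALLERY)
            print(f"{step}: {choice}")
        else:
            print(f"{step}: accept defaults")

plan_following("display sales figures as a bar chart")
```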
Step 5 onwards

The process of chart construction continues in much the same way; the user is guided in the selection of parameters and options by the pre-planned sequence of dialogue boxes. In the final steps of this episode, the current state of the chart is represented as an image of how it will look on completion of the dialogue sequence (see Figure 10). This "image of achievement", in conjunction with the user's internal goal to produce a graphical representation of the original data, can be used to evaluate the quality of the finished product before it is finally produced. It might be concluded that the user could finish the dialogue here. Whether using a goal-matching or a plan-following strategy, the image of achievement provides clear feedback that the goal has been achieved. Clicking on Next produces a further dialogue box, affording further actions, namely to add labels and legends to the chart. This is a continuation of the pre-computed plan that is embedded in the ChartWizard dialogue.
Figure 10: Dialogue box showing an image of the finished product
Resource type | Relevant content | Resource location
Goal | Display selected data as a chart; make appropriate selections for data series, etc. | The overall goal and the style of the chart are depicted in the display as they would appear at this stage of the process; indications on the display prompt the new goal of changing column format selections
Current state | Chart preview | Effect of selections made so far is displayed
Affordance | Data series / column selections | Axis fields displayed afford data entry
Action effect | Changing selections | Understanding the role of the data entry fields, radio buttons etc. depends on previous experience of this style of interface

Table 7: Resources after selecting a chart type (end of step 5)

Results of the analysis

Three key points emerge from the interaction analysis.

i) The first two steps are poorly supported by external resources, and rely heavily on what resources the user may possess. One of the few resources that could prove helpful here, the text at the bottom of the screen, is easy to overlook and can be difficult to interpret. A goal-matching strategy would be expected to be most effective in these early parts of the interaction.
ii) After the first two steps the action is shaped by an external, pre-determined plan. The plan that is available is not explicitly stated as a visible, inspectable sequence of items; rather, it is implicitly expressed in the form and structure of the dialogue. The user is given feedback about the length of the plan and their progress through it, but in order to find out what the plan is they would have to navigate forward explicitly through the plan using the appropriate buttons. Similarly, in order to determine what they have done so far, they must recall the history or explicitly navigate back through the plan using the appropriate buttons.

iii) During this plan-based phase, new sub-goals are externalised by forcing the user to make new choices and decisions (for instance, by presenting galleries of images showing possible chart types). These trigger possible choices that the user may not have considered and need not be aware of in advance. Hence the action taken by the user is an emergent property of the interaction between internal and external resources. The user's top-level goal of creating a chart is given a rendering in the interface as an image of the final product before it is actually produced. This instantiation of the user's goal serves as a resource for early evaluation of the whole interaction.

Our analysis of the dialogue wizard using the resources model allowed us to identify a number of characteristics of the design which enhance usability, and also some potential weaknesses in the resources provided by the designers of the wizard. This is particularly apparent in the early stages of the interaction.

An empirical study of interaction strategies and interface design

In the above analysis we had only a fictitious user and an imagined interaction, yet claims were made about the relationship between externalised resources and their implications for the use of interaction strategies and, ultimately, for usability problems. We now take some actual empirical results from the literature and explore how the resources model might account for them.

In a recent empirical study by Golightly and Gilmore (1997), participants were asked to solve a computerised version of the "8-puzzle" using two interfaces that differ only in the "directness" of the interaction required. In one version, interaction occurred by directly manipulating the pieces of the puzzle by clicking on them with a mouse (see Figure 11a), whereas in the other, pieces were indirectly manipulated by clicking on a separate array of buttons (Figure 11b). In both cases, a static picture showing the arrangement of pieces in the goal state was also provided (the right hand "Goal" square of Figures 11(a) and (b)).
Figure 11: Direct (a) and indirect (b) manipulation user interfaces for the "8-puzzle". In (a) a single board presents the current state and affords actions, shown alongside a "Goal" board; in (b) the current state board, a separate array of action buttons and the "Goal" board are distinct regions of the display.

The result obtained was that users of the indirect manipulation (IM) interface required fewer moves to solve the problem than users of the direct manipulation (DM) interface. Furthermore, an analysis of the standard deviation of time taken for moves and a qualitative analysis of move patterns suggested that members of the IM group were following a different strategy from their direct manipulation counterparts. The IM users tended to plan ahead based on the current problem state, while the DM group acted in a short-term, opportunistic, "trial and error" fashion within the confines of a fairly rigid structure of high-level goals.

The strategies identified by Golightly and Gilmore are characterised in language specific to the 8-puzzle. For example, they talk of goals involving "getting the top row in place" and so on. In a resource-based account of these results, the strategies identified can be seen as instances of the more generic interaction strategies discussed in the previous section. Across the two groups using the IM and DM interfaces, three strategies appear to have been employed: goal matching in the direct manipulation case, and alternating between plan construction and plan following in the indirect case. Table 8 below shows the resources that are required and manipulated by these three strategic components.
Strategy | Input resources | Output
Goal matching | A goal; current state; action effects / affordances; means of articulating action | Action carried out
Plan construction | A goal; current state; action effects / affordances | Constructed plan; projection about future states
Plan following | Plan; means of articulating actions | Actions carried out

Table 8: Strategies and resources for the 8-puzzle
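To make the input/output relations in Table 8 concrete, the following sketch (ours, purely illustrative; the tuple encoding of the board and the tiles-out-of-place heuristic are assumptions made for the example, not claims about Golightly and Gilmore's participants) casts the three strategies as functions over 8-puzzle resources.

```python
# Illustrative sketch: the three strategies of Table 8 as functions over 8-puzzle resources.

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)   # 0 marks the blank; positions are row-major

def affordances(state):
    """Moves afforded by the current state: positions of tiles adjacent to the blank."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    steps = ((-1, 0), (1, 0), (0, -1), (0, 1))
    return [r * 3 + c for r, c in ((row + dr, col + dc) for dr, dc in steps)
            if 0 <= r < 3 and 0 <= c < 3]

def action_effect(state, pos):
    """Effect of sliding the tile at pos into the blank."""
    blank = state.index(0)
    board = list(state)
    board[blank], board[pos] = board[pos], board[blank]
    return tuple(board)

def mismatch(state, goal):
    """Tiles out of place: a stand-in for the user's judgement of closeness to the goal."""
    return sum(1 for a, b in zip(state, goal) if a != b and a != 0)

# Goal matching: goal + current state + action effects/affordances -> action carried out
def goal_matching(state, goal):
    return min(affordances(state), key=lambda pos: mismatch(action_effect(state, pos), goal))

# Plan construction: goal + current state + action effects/affordances -> constructed plan
# (with projection about future states, held internally)
def plan_construction(state, goal, lookahead=3):
    plan = []
    for _ in range(lookahead):
        move = goal_matching(state, goal)
        plan.append(move)
        state = action_effect(state, move)   # projected, not yet articulated
    return plan

# Plan following: plan + means of articulating actions -> actions carried out
def plan_following(state, plan):
    for move in plan:
        state = action_effect(state, move)   # each move is now articulated on the device
    return state
```

In the DM interface all of the inputs to goal_matching are co-located on the game board; in the IM interface the means of articulating the chosen move sits on the separate keypad, which, as the discussion that follows argues, is what makes the plan_construction / plan_following pairing the more economical option.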
One of the challenges for a resource-based account of the strategic differences observed by Golightly and Gilmore is that both interfaces present exactly the same information resources: the current state and the highest-level goal are represented in the two copies of the game board, and the actions afforded by the device and their effects are represented either in the game board itself (in the DM case) or in the game board and keypad (in the IM case). Thus the presence or absence of information is clearly not a decisive factor in strategy selection here. What is different, however, is the physical location of the resources in each interface, and the way in which these resources are integrated.

Golightly and Gilmore's explanation for the observed strategic differences is that when the point of action is separated from the point of consequence (as in the indirect interface) the user is forced to bridge the gap by constructing a more sophisticated cognitive representation of the problem state, which encourages a more plan-based style of action. In the resources model, these strategic differences are explained by differences in the integration between physical and logical action in the two interfaces. In the DM interface most of the resources required to support a trial and error (or goal matching)5 strategy are integrated in the same physical region of the screen. The current state of the puzzle, the available actions, their effects and the means of articulating them are integrated in the same display. Only the current sub-goal has to be remembered by the user. By contrast, in the IM interface, although the same resources are present, those required in order to adopt a trial and error (or goal matching) strategy are not integrated, at least not to the same degree. The current state of the puzzle, the action effects and affordances are represented together (in the left hand game board), but the means of articulating actions is separated (in the keypad display). In order to carry out the goal matching strategy, all these items need to be co-ordinated, and an additional overhead exists of collecting the resources together. This burden can be alleviated to some extent by using goal, state, action effect and affordance resources to plan several moves ahead. Attention then shifts to the keypad and the plan can be executed. This requires the plan to be recalled along with the means of articulating actions.

An assumption underlying this argument (which is also found in Golightly and Gilmore's paper) is that humans display a natural economy in strategy selection and the construction of associated internal representations. To adopt a plan-following strategy in the DM interface would create the unnecessary burden of constructing and manipulating a plan structure. In the IM interface, on the other hand, shifting attention between the game and the keypad creates an additional burden, and the user must rely on an internal representation of the game in a way that is unnecessary in the DM case. Once such a representation is in place, following a plan-based strategy becomes more economical as it reduces the need for attention shifting.

Distributed cognition, time and moves

It might be noted at this point that Golightly and Gilmore's conclusion that "increasing the difficulty of manipulating these problem domains improved users' performance" (p. 157) may be an overstatement.
Other performance data collected by Golightly and Gilmore suggest a slightly different picture. These data taken from the paper are presented in Table 9.
5 Given the data available in the Golightly and Gilmore paper, it is not possible to distinguish between the idea of a trial and error strategy and our more generic goal matching strategy.
Interface | No. of problems solved in 15 minutes | Average move time (ln minutes) | Moves taken to solve puzzles
Indirect manipulation | 5.33 | 0.46 | 75
Direct manipulation | 8.25 | -0.23 | 104

Table 9: Data on time and moves from Golightly and Gilmore
As Table 9 shows, users of the IM interface did take fewer moves to solve the problem, as Golightly and Gilmore point out, but they also took longer to make a move and solved fewer problems overall than their DM counterparts. The more subtle point that these results show is that there is a trade-off here between producing a solution with minimal moves and the time taken to find such a solution.6

From a DC perspective, the trade-off between time and moves can be seen as a result of the distribution of the cognitive activities involved in solving this problem. The same process of search that is externalised, and consequently measurable as moves to solution, in the direct case is internalised in the indirect case, and superfluous moves are edited out of the solution path before they are articulated in the external artefact as a result of plan following. Thus the average time per move is increased, since it represents not only the time to find the particular move but also the time to generate (and discard) all of the redundant moves generated internally but not externalised.

Using the resources model to produce a retrospective account of the Golightly and Gilmore findings provides a parsimonious explanation of the causes of the strategic differences and, through the constructs of co-ordination and integration, gives a stronger sense of what it means for the point of consequence to be separated from the point of action. The apparent paradox between time and moves to solution overlooked by Golightly and Gilmore also has an interesting explanation when viewed from a distributed cognition perspective. The analysis highlights, however, that equating the ease of use of an interaction strategy simply with the identification of which resources are externalised is an over-simplification.

6 It is possible to envisage settings (perhaps real-time, safety-critical ones) where partially solving several problems as quickly as possible might be seen as a better approach than producing an optimal solution to only some of the problems (see Wright, Pocock and Fields, 1998).

DISCUSSION AND CONCLUSIONS

Distributed Cognition research as an approach makes a valuable contribution to both CSCW and HCI theory and design because it leads us to re-examine the relationship between actors, artefacts and the settings in which interactions occur. By identifying information structures or representations flowing through functional systems as objects of analysis, it becomes possible to reason not only about design artefacts but also about cognitive artefacts within a single conceptual framework. But DC research, while identifying resources for action as central to the interaction between people and technologies, stops short of providing a model of such resources.

The resources model described in this paper has defined a limited number of resource types as abstract information structures which can be used to account for decisions and actions. It has shown how these abstract types can be differently represented in an interface and how this affects performance. It has also identified a number of interaction strategies which utilise different configurations of resources. These two components of the resources model, information structures and interaction strategy, through the process of co-ordination and integration, provide the link between devices, representations and action that is not well articulated in the DC literature. We have not yet used the resources model to drive an HCI design case study. As a first step, however, we have used the model to (i) evaluate alternative designs; (ii) analyse interaction episodes; (iii) explain behavioural differences resulting from different interface designs.
We argue that with the resources model we are better placed than traditional DC work to inform design because of the model's emphasis on (i) the relations between representation and action; (ii) definitions of information types; (iii) the identification of strategies for interaction. The idea of abstract information structures provides a 'top down' view of design analysis: it is possible to reason about and evaluate interaction design without commitment to implementation detail. The idea of representation, co-ordination and integration does not force a differentiation between actor and artefact as analytical categories but instead provides a means of reasoning about actors in an information
context as the unit of interaction analysis. But at the same time, all our analyses have shown that it is at the implementation level that design differences begin to have an impact. It is the way resources are implemented and used that is significant for interaction strategies, not simply whether or not particular resources are available.

In the spirit of distributed cognition research, the resources model deliberately seeks to soften the boundary between actor and designed artefact. Nardi (1996) has critiqued DC research, arguing that it places machines and people on an equal epistemic footing. We do not believe that this is an accurate reflection of DC theory (see for example Hutchins and Klausen, 1991), although it is perhaps an understandable misunderstanding. The resources model views both resources and interaction strategies as objects of cognition. That is to say that resources and strategies can be thought about and used in a deliberate and conscious way by human actors. Thus human actors have a different epistemic status from technological artefacts, although both are representational systems. This conception of the relation between human actors, interaction strategies and information resources is thus closer to so-called metacognitive theories of intelligence and problem solving (see Sternberg and Wagner, 1994, for example).

We view the resources model as a model of HCI embedded in the DC framework of theorising and methods. It is not a replacement for studies of cognition in the wild. But as yet, neither the DC research of Hutchins nor the resources model is a fully contextual model of HCI, or indeed of activity in general. Within both DC research and the resources model there is a tendency to view the relation between a representation of information and its interpretation and use in a setting as unproblematic. Thus a plan, while it may not be followed, is unproblematically interpreted as something which could be followed. But this provides only what Clark (1996) refers to as a literal meaning for the information structure. A resource also takes on a pragmatic interpretation in a context of use. Thus a shopping list can be interpreted as a plan to be followed or a goal to be attained. Making a similar point, Schmidt (1997) provides evidence that plans in the form of kanbans for co-ordinating production-line work can function both as scripts to be followed and as maps for orientation. Representations can also have meanings beyond their use to inform action. For example, checklists in an aircraft cockpit are resources for action, but they also have legal status and define what it means to violate a procedure (McCarthy, Healey, Wright and Harrison, 1997). What is important, however, is that within human-computer interaction, as within computer-supported cooperative work, we begin to take seriously what it means for representations to serve as resources for action, how such a concept, if properly understood, might help to bridge between HCI and CSCW, and how this theoretical orientation can be used to design better interactive systems.

REFERENCES

Card, S.K., Moran, T.P. and Newell, A. (1983) The psychology of human-computer interaction. Lawrence Erlbaum Associates.
Carroll, J.M. (1990) Infinite detail and emulation in an ontologically minimized HCI. In Chew, J.C. and Whiteside, J. (Eds.), CHI '90: Empowering People, conference proceedings, pp. 321-328. ACM.
Clark, H.H. (1996) Using Language. Cambridge University Press.
Fields, B., Wright, P.C. and Harrison, M.D. (1997) Objectives, strategies and resources as design drivers. In Howard, S., Hammond, J. and Lindgaard, G. (Eds), INTERACT '97: IFIP TC13 International Conference on Human-Computer Interaction. London: Chapman & Hall.
Fields, R.E., Wright, P.C., Marti, P. and Palmonari, M. (1998) Air traffic control as a distributed cognitive system: a study of external representations. In Proceedings of the 9th European Conference on Cognitive Ergonomics (ECCE 9). EACE Press.
Golightly, D. and Gilmore, D. (1997) Breaking the rules of direct manipulation. In Howard, S., Hammond, J. and Lindgaard, G. (Eds), Proceedings of INTERACT '97, pp. 156-163. Chapman & Hall.
Green, T.R.G., Schiele, F. and Payne, S.J. (1988) Formalisable models of user knowledge in human-computer interaction. In Green, T.R.G. (Ed.), Working with Computers. Academic Press.
Grudin, J. (1990) The computer reaches out: The historical continuity of user interface design. In Proceedings of CHI '90, ACM SIGCHI Conference, Seattle, WA. ACM.
Howes, A. and Payne, S.J. (1990) Display-based competence: towards user models for menu-driven interfaces. International Journal of Man-Machine Studies, 33(6), pp. 637-655.
Hutchins, E. and Klausen, T. (1991) Distributed cognition in the cockpit. In Engeström, Y. and Middleton, D. (Eds), Cognition and Communication at Work. Cambridge University Press.
Hutchins, E. (1995a) How a cockpit remembers its speeds. Cognitive Science, 19, pp. 126-289.
Hutchins, E. (1995b) Cognition in the Wild. MIT Press.
Hutchins, E.L., Hollan, J.D. and Norman, D.A. (1986) Direct manipulation interfaces. In Norman, D.A. and Draper, S. (Eds.), User Centered System Design: New Perspectives on Human-Computer Interaction. Lawrence Erlbaum Associates.
Johnson, P. (1992) Human-Computer Interaction: Psychology, task analysis and software engineering. McGraw-Hill.
Kitajima, M. and Polson, P.G. (1995) A comprehension-based model of correct performance and errors in skilled, display-based human-computer interaction. International Journal of Human-Computer Studies, 43(1), pp. 65-100.
Lave, J. (1988) Cognition in Practice: Mind, mathematics and culture in everyday life. Cambridge: Cambridge University Press.
Lewis, C., Polson, P., Wharton, C. and Rieman, J. (1990) Testing a walkthrough methodology for theory-based design of walk-up-and-use interfaces. In Chew, J.C. and Whiteside, J. (Eds.), CHI '90 conference proceedings: Empowering People, pp. 235-241. ACM Press.
Marti, P. (in press) The choice of the unit of analysis for modelling real work settings. Ergonomics.
Mayes, R.T., Draper, S.W., McGregor, A.M. and Oatley, K. (1988) Information flow in a user interface: The effect of experience and context on the recall of MacWrite screens. In Jones, D.M. and Winder, R. (Eds.), People and Computers IV, pp. 257-289. Cambridge: Cambridge University Press.
McCarthy, J.C., Healey, P.G.T., Wright, P.C. and Harrison, M.D. (1998) Accountability of work activity in high-consequence work systems: human error in context. International Journal of Human-Computer Studies, 47(6), pp. 735-766.
Monk, A.F. (1990) Action-effect rules: A technique for evaluating an informal specification against principles. Behaviour and Information Technology, 9, pp. 147-155.
Nardi, B.A. (1996) Context and Consciousness: Activity theory and human-computer interaction. Cambridge: MIT Press.
Norman, D.A. (1988) The Psychology of Everyday Things. Basic Books.
Norman, D.A. (1986) Cognitive engineering. In Norman, D.A. and Draper, S. (Eds.), User Centered System Design: New Perspectives on Human-Computer Interaction. Lawrence Erlbaum Associates.
O'Malley, C. and Draper, S. (1992) Representation and interaction: Are mental models all in the mind? In Rogers, Y., Rutherford, A. and Bibby, P. (Eds.), Models in the Mind: Theory, perspectives and applications, pp. 73-92. London: Academic Press.
Rogers, Y. and Ellis, J. (1994) Distributed cognition: an alternative framework for analysing and explaining collaborative working. Journal of Information Technology, 9(2), pp. 119-128.
Russell, S.J. and Norvig, P. (1995) Artificial Intelligence: A Modern Approach. Englewood Cliffs: Prentice Hall.
Scaife, M. and Rogers, Y. (1996) External cognition: How do graphical representations work? International Journal of Human-Computer Studies, 45, pp. 185-213.
Schmidt, K. (1997) Of maps and scripts: The status of formal constructs in cooperative work. In Hayne, S. and Prinz, W. (Eds), Group'97: Proceedings of the International ACM SIGGROUP Conference on Supporting Group Work, pp. 138-147.
Sternberg, R.J. and Wagner, R.K. (1994) Mind in Context: Interactionist perspectives on human intelligence. Cambridge: Cambridge University Press.
Suchman, L.A. (1987) Plans and Situated Actions: The problem of human-computer interaction. Cambridge: Cambridge University Press.
Suchman, L.A. (1993) Response to Vera and Simon's Situated Action: A symbolic interpretation. Cognitive Science, 17, pp. 71-77.
Wright, P.C., Fields, R. and Harrison, M.D. (1996) Distributed information resources: A new approach to interaction modelling. In Green, T.R.G., Cañas, J.J. and Warren, C.P. (Eds), Cognition and the Worksystem: Proceedings of the 8th European Conference on Cognitive Ergonomics (ECCE 8), pp. 5-10. EACE Press.
Wright, P.C., Fields, B. and Merriam, N.A. (1997) From formal models to empirical evaluation and back again. In Palanque, P. and Paternò, F. (Eds), Formal Methods in Human-Computer Interaction, pp. 283-314. Berlin: Springer.
Wright, P., Pocock, S. and Fields, R.E. (1998) The prescription and practice of work on the flight deck. In Proceedings of the 9th European Conference on Cognitive Ergonomics. EACE Press.
Zhang, J. (1996) A representational analysis of relational information displays. International Journal of Human-Computer Studies, 45, pp. 59-74.
Zhang, J. (1997) The nature of external representations in problem solving. Cognitive Science, 21(2), pp. 179-217.
Zhang, J. and Norman, D.A. (1994) Representations in distributed cognitive tasks. Cognitive Science, 18, pp. 87-122.