Barnard, P. & May, J. (1999). Representing cognitive activity in complex tasks. Human-Computer Interaction, 14, 93-158.

Representing Cognitive Activity In Complex Tasks

Philip J. Barnard
MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge, CB2 2EF, U.K.
Email: [email protected]

Jon May
Dept of Psychology, University of Sheffield, Western Bank, Sheffield, S10 2TP, U.K.
Email: [email protected]

To be published in Human-Computer Interaction

Abstract

While cognitive theory has been recognised as essential for the analysis of Human-Computer Interaction, the representations that have been developed have been directed more towards theoretical purposes than practical application. To bridge the gap between theory and application, representations need to satisfy requirements for broad scope, a unified theoretical basis, and abstraction. Interacting Cognitive Subsystems (ICS) is proposed as a unified cognitive theory that can be used as the basis for such representations, and two approaches based upon the theory are described. One entails the description of Cognitive Task Models, which are a relatively complete representation of the cognitive activity required of a user in the course of an interaction. The other entails the production of less complete diagrammatic notations that are intended to provide support in small-scale problem identification and resolution, which can be applied across task, visual interface and sound interface issues, and which can handle both static and dynamic situations. While the former can be implemented in a production-rule expert system (ICSpert), and so does not require detailed modelling knowledge on the part of the analyst, the latter is a pencil-and-paper technique that does require theoretical knowledge, but which is intended to facilitate the acquisition of such knowledge in the interest of educating its users about the human aspects of HCI. The representations differ in the knowledge required for their use, in the support that they offer, and in the situations for which they are appropriate. They have been used to represent problems from experimental situations, core HCI scenarios, and ‘real world’ design projects.
They share breadth of scope and abstraction, and their parent theory supports transfer of knowledge across domains of application and from older to newer technologies, and supports feedback between the domain of application and the domain of theory.

Philip Barnard is a psychologist with an interest in theories of mental architecture and their application to complex tasks, emotion and a range of psychopathologies; he is on the scientific staff of the Medical Research Council's Cognition and Brain Sciences Unit. Jon May is a psychologist with an interest in the application of unified models of cognition to perception, particularly with regard to the effects of task and context; he is a Lecturer in the Department of Psychology at the University of Sheffield.

Acknowledgements. We thank our partners in both Amodeus projects for their constant requirement that we explain ourselves more clearly, and in particular Anker Jørgensen and his students at Copenhagen University for supporting the development of the Diagrammatic Representations.

Notes: Authors are in alphabetical order.

Support. This work was carried out as part of the Amodeus-2 project, ESPRIT Basic Research Action 7040, funded by the Commission of the European Communities. Technical reports from the Amodeus project are available via the World Wide Web at http://www.mrc-cbu.cam.ac.uk/amodeus/

CONTENTS

1. Introduction
2. Problems in developing applied theory
2.1: Requirement for Breadth of Scope
2.2: Requirement for Unified Theory
2.3: Requirement for Abstraction
3: Meeting the requirements
3.1: The Nature of the Architecture
3.2: Two Approaches to Application
4: Interacting Cognitive Subsystems
4.1: Modelling with ICS
5: Building Cognitive Task Models
6: The CTM Attribute Space
6.1: Proceduralised Knowledge
6.2: Record Contents
6.3: Dynamic Control
6.4: ICS: First and Second Order Principles
6.5: Application of CTM
7: Development of the expert system
7.1: Examples of Rules in ICSpert
7.2: Application to an HCI Design Problem
8: Limitations and potential
9: Representations for HCI problem solving
9.1: Diagrammatic Notations
9.2: Comparison with CTMs
9.3: Applicability of Diagrammatic Notations
10: Conclusions
References


1. Introduction

The cognitive aspects of interactive computing systems have been recognised ever since they started to appear on people’s desks. The tasks they are required to support are of a conceptual rather than mechanical nature, and the skills required to use them draw upon the full range of users’ cognitive abilities in communication, memory and thinking. It is therefore natural that the vision driving much research in human-computer interaction (HCI) has centred upon the idea that cognitive theory can be deployed to support the design and evaluation of interfaces. The application of cognitive theory to HCI has drawn directly from theories and paradigms current within the core cognitive sciences. Differences in the researchers’ native paradigms have resulted in quite different approaches, but all have shared the fundamental assumption that a predictive evaluation of interface effectiveness can be derived from a systematic, theory-based analysis of the computer user’s tasks, and the representations that they have spawned reflect this. Derivatives of human information processing models, such as the model human processor and the keystroke level model of Card, Moran & Newell (1983), break the overall task down into subcomponents to arrive at estimates of the time required to perform specific interaction sequences. Grammatical techniques originating in linguistics, such as BNF grammars (Reisner, 1982) and the Task Action Grammar developed by Payne & Green (1986), are used to characterise the complexity of the organisation of the task by treating it as a dialogue. Artificial intelligence research has given rise to approaches such as programmable user models (Young, Green & Simon, 1989) and cognitive complexity theory (Kieras & Polson, 1985), which represent and simulate the knowledge requirements of learning each step of the task sequence and their relationships within a problem-solving framework.
For theory in HCI to prosper, it must either have indirect effects, through feedback to the core disciplines enabling the development of an improved body of basic theory concerning complex tasks, or have immediate and direct benefits for design. Despite a consensus on the importance of the task as a starting point in any analysis, each theoretical approach tends to have provided a representation that is directed towards a specific evaluative function, such as predicting performance times for well-practised tasks or acquisition times for novel ones. The use to which a specific model is put typically depends upon the historical derivation of the underlying theory. In consequence, the models address different aspects of HCI problems using different methods, and tend to be irreconcilable alternatives. Some models have been highly influential within HCI research, most notably those associated with the GOMS family. However, most research has been aimed at validating a specific model. There has been little real integration between different theoretical approaches or wider feedback to the parent disciplines. Specific theories are, of course, used in both HCI research and basic paradigms, such as analyses of working memory phenomena. Architectures used in this way include ACT-R (Lovett, Reder & Lebiere, In Press), EPIC (Kieras, Meyer, Mueller & Seymour, In Press), ICS (Barnard, In Press), and SOAR (Young & Lewis, In Press). However, the feedback from applied work in HCI to cognitive theory has been rather meagre and fragmentary when set against the level of research effort. Most lessons learned in application still seem to have had only marginal impact on the overall development of basic theory.
This weakness in wider theoretical pay-off would be acceptable if the research were providing very real material benefits in the applied domain, but as Carroll and others have frequently pointed out (e.g., Carroll, Kellogg & Rosson, 1991), there are few examples where cognitive theory has had a major and direct impact on design within the commercial community. On this criterion, cognitive science can more easily provide illustrations of its value through the application of empirical methodologies, through the data they provide, and through the direct application of psychological reasoning in the invention and demonstration of design concepts (Anderson & Skwarecki, 1986; Card & Henderson, 1987; Carroll, Kellogg & Rosson, 1991; Hammond & Allinson, 1988; Landauer, 1987). In practice, the limited scope of the applied techniques is now also curtailing their effectiveness within the domain of HCI. None of the approaches incorporates apparatus for integrating over the requirements of both knowledge acquisition and information processing constraints (e.g. see Barnard, 1987), which limits their application in the increasingly ‘interactionally rich’ settings that typify modern, multimodal computer interface design.


2. Problems In Developing Applied Theory

The difficulties associated with the development of truly applicable theory are well known, so we need do little more than summarise them here. A major problem is that theoretical enquiry and synthesis invariably occur after the development of the products that they seek to deal with, so that the design community is facing new questions by the time that theory encompasses their old concerns. Common complaints from design practitioners are that the theories are too low level and of restricted scope; that as abstractions from behaviour they fail to deal with the real context of work; and that they fail to accommodate fine details of implementations and interactions that may crucially influence the use of a system (Carroll & Campbell, 1986; Newell & Card, 1985; Whiteside & Wixon, 1987). Even where theory can predict significant effects and receive empirical support, those effects may be of marginal practical consequence in the context of a broader interaction, or less important than the effects that are not specifically addressed (Landauer, 1987). Coupled with this is the inability of theories to frame their output in the economic language of design decision making, where what matters is not the direct usability of a design, but the trade-off between the costs of correcting a weakness and improved market share, or decreased training overheads. Developing appropriate theory in an applied context is a very demanding process. The practical techniques derived from basic theory are sometimes rather time-consuming for the benefits offered, difficult to apply systematically, and hard to ‘scale up’ to big design problems. Many techniques require detailed specifications to be generated for each application modelled, with each specification requiring the formulation of many rules, the construction of which necessitates a modelling expert. All of this work has to be redone for each problem considered.
Such techniques require a large commitment of resources, which is difficult to justify in many design contexts. One reaction to these difficulties has been the ‘softening’ of theory to provide practical heuristic methodologies (Polson, Lewis, Rieman & Wharton, 1992), but this weakens further the already weak link between theory and its applied realisation (May & Barnard, 1995b). Other reactions include an increasing reliance upon empirical methods and iterative design (e.g. see Landauer, 1995); ‘discount usability evaluations’ (Nielsen, 1993); the use of wider design space representations to aid design decision making (MacLean, Young, Bellotti & Moran, 1991); or the development of new structured human factors methods tailored to the requirements of software engineering processes (e.g. see Lim & Long, 1994). These reactions are all entirely reasonable given the practical aims of getting the existing methods into practice as rapidly as possible. A longer-term, but potentially more fruitful, reaction is to analyse the problems that have been encountered and rebuild the theory base and strategies for its application in the light of current failings. The development of theoretical apparatus needs to be coherently guided by a systematic requirements analysis, rather than by an opportunistic approach that capitalises upon the available modelling tools and the topical issues that fall within their capability. In the following sections we briefly set out the requirements that we believe applied theories need to meet if they are to fulfil the needs of design practitioners, and still support the development of core cognitive theory.

2.1: Requirement for Breadth of Scope

In order to understand human action in interactionally rich settings, we need theories and models of broad scope. By moving away from the detailed analyses of well specified, low-level theory, we may have to compromise on our ability to explain and predict the micro-patterning of human behaviour in specific performance settings. This form of prediction has characteristically driven the development, testing, and elaboration or rejection of ‘pure science’ theory in the laboratories of experimental psychology, but it is not necessarily the best way to approach applied problems, which are typically embedded in less well specified contexts, and which require more general descriptions. Where the specific details of a task cannot be known, a low-level theory cannot operate. The requirement for ‘engineering style’ approximations of human mental representations and of the processing potential of the underlying mechanism is an inevitable consequence of this line of reasoning. We need to guide the development of applied theory by considering how to provide explanations of the largest and most important effects that occur (cf. Landauer, 1987). Current practice in theory development and evaluation is oriented towards the derivation of detailed contrastive predictions that enable us to choose among alternative models. In this context, the relative importance of research tends to be determined within the space of possible theories rather than in the space of what is important in terms of effects in the world. The limited scope of competing theories defines the range of problems that are used to differentiate between them, to the exclusion of the much larger range of problems that none of the theories address well. A theory that is rejected because it does not account as well as another theory for one key problem may well have provided useful insights into other problems. Although the consequences are indirect, this reduces the potential relevance and applicability of the resulting research to real world concerns. This argument does not imply that efforts to validate and compare theories should be abandoned, but it does imply a rebalancing of concerns in which less emphasis is placed upon traditional strategies for testing theories and models, which have the positivist goal of working towards a ‘true’ theory. It may prove more productive in the long term to assess the utility of a range of theories in terms of their ability to produce explanations of a corresponding range of important phenomena in a systematic and principled way, and to seek to relate those theories within the overall framework of human information processing.

2.2: Requirement for Unified Theory

The design of future interfaces that are interactionally rich will itself involve a great deal of complexity, since numerous inter-relationships among different parts of the design space will need to be considered. As technology design advances towards greater levels of integration and multimodality, interface issues that previously could be regarded as separable will themselves become directly interconnected. A theory that will be applicable in these circumstances will have to deal with a range of perceptual-motor phenomena and with the effects of mood and affect arising from the interactions, as well as with more standard cognitive aspects of interface use. While each of these aspects may currently be well understood in isolation, a unified theory would bring them together within a coherent framework so that a single analytical method could be economically applied to all aspects of a problem. A unified theory is also the only practical way for interactions between these different aspects of behaviour to be studied. Most importantly, the technical contribution made by a successful applied theory will have to be relevant to the needs of the target application domains and design processes. There is not a single design and development process but many, each tailored to meet the demands of different kinds of systems and task domains. Equally, there is no single, ideal point in the development process where model-based information can most usefully constrain or guide decision making. As the design process iterates between initial requirements specification and the development of early plans for the overall design, the guidance required is of a general form, and could best be described as requirement elicitation, focusing the designer's attention upon the user's needs.
When the overall design has been committed, and the focus is upon specific interface characteristics within that design, fairly precise design options need to be compared on theoretical grounds. Once evaluations of the performance of an operational prototype are available, problems that have been picked up need to be explained and solutions suggested. If we are to develop a body of theory of broad scope and utility, the implications are straightforward. There must be multiple routes for mapping the theory through into design processes, so that model-based information is available at the stages at which it can serve a useful function in design decision making, and in the ultimate commercial decision to release a system.

2.3: Requirement for Abstraction

In developing routes for mapping theory into application (cf. Long & Dowell, 1989), it must also be recognised that theories in the human sciences do not ‘do design’. Although they are often unable to resist the temptation to do so, neither should theorists or human factors specialists. Systems are typically designed and developed by specialists in the required hardware or software technology. The consequences of any mapping from cognitive theory to design must thus be in a form that is communicable to the end-users of the analysis. In addition, model-based information from the human sciences is clearly only going to form one part of the wider design space, for there will also be constraints of quite different forms, such as technological constraints, time pressure, cost factors, competitors' activities and end-user fashions. Any theoretically derived consequences must be communicated in a form that enables them to be weighed and assessed against considerations with quite different form, content and pertinence. Taken together with the issues raised by the possibilities afforded by future ‘interactionally rich’ interfaces, the problems of developing theories that might meet all of these requirements appear quite formidable. On immediate time horizons, not all of the various hurdles can be overcome. Theory cannot be expected to supplant valid knowledge gained from practical operational experience, from knowledge of technologies, or from experimental tests of human performance. By the same token, we simply cannot expect experimental research to map and synthesise all of the relevant parameters while industrial application waits for the theoretical answers. Abstractions of one sort or another are the only tractable way forward. We should not be attempting to deliver a mature body of valid theory that is both relevant and fully applicable in the short term. Practical consequences will flow from the intermediate products of an approach based on the cumulative acquisition of principled knowledge, provided that it is directed towards meeting the requirements listed here.

3: Meeting the requirements

Despite the difficulties discussed above, it is our view that information processing theory has evolved to a point where establishing a systematic unified theory of broad scope is now a tractable task. A theoretical approach developed over the last decade, Interacting Cognitive Subsystems (ICS), has sought to integrate a broad schema for information flow within human cognitive processing with a means of analysing the mental representation of information. This model was originally developed to address basic concerns with representation and flow across a range of comprehension and memory tasks. The form of the theory was specifically framed to deal with differences in the processing of visually and auditorily presented information (Barnard, 1985), and it was subsequently developed to deal with influences of mood and body state on a range of cognitive tasks (e.g. Barnard & Teasdale, 1991).

3.1: The Nature of the Architecture

As a form of basic theory, the overall architecture is capable of providing high level accounts of a significant range of cognitive-affective phenomena in both laboratory and practical settings (Teasdale & Barnard, 1993; Barnard, In Press). The theory not only has some scientific authority derived from its ability to synthesise and explain a significant range of basic phenomena, it has also been extensively applied to practical problems in HCI design over successive generations of interfaces (Barnard, 1987; Barnard, Wilson & MacLean, 1988; Barnard & May, 1993; May, Blandford & Barnard, 1993; Barnard & May, 1995). The cognitive architecture is derived not from within the AI tradition, but from research in the human information processing tradition. Where an AI-based approach would attempt an explicit simulation of the cognitive activity required by the interaction, applying the ICS architecture involves building an approximate representation which ‘describes’ the properties of mental activity that might be expected in the performance of particular tasks with a particular interactive system. The descriptions of mental activity are assumed to be systematically related to overt user behaviour and, in consequence, can be used to derive predictions about user behaviour or to provide more direct advice to design teams about the behavioural consequences of their particular design options. The underlying representations are referred to as Cognitive Task Models (CTMs) because they relate to the internal, mental ‘task’ faced by the cognitive mechanism when interacting with a computer or, indeed, any other real world artifact or person. CTMs result from a cognitive task analysis (CTA). This contrasts with more conventional task analyses, which focus upon the external ‘task’ steps that the user is setting out to accomplish through the interaction, or the ‘task’ that the device has to do in the domain of application.
While these uses of the word ‘task’ might at first seem to be simply differing levels of abstraction, they are actually quite different in nature. As a cognitive model, ICS can say very little about how tasks are structured per se, and so external task analysis is outside its domain of enquiry, and is taken as ‘input’.


3.2: Two Approaches to Application

Our primary approach has, for some time, been predicated on the assumption that we can automate the process of building cognitive task models through the use of expert system technology. The expertise automated in this way is the knowledge of modelling on the basis of the ICS theory. The potential outcome is to allow members of a design team (usually a cognitive specialist) to ask the expert system to build representations of the interaction, rather than asking a human modelling expert. This should be possible without a major time commitment or extensive expertise in cognitive modelling. It should be possible to build a whole family of such CTMs for different versions of an interface on a relatively short time scale (e.g. an afternoon’s work). The overall strategy for realising this form of modelling has involved three stages. First, the basic ICS theory and its associated decomposition of cognitive resources are used to develop a general attribute space for describing cognitive activity. The basic theory specifies the constituent processes of cognition, the memory records that they use, the overall organisation of these resources into subsystems, and the organisation of subsystems within the broader cognitive architecture. This is outlined in Section 4. The approximations are based around four main components: (a) a representation of the configuration of mental processes required to carry out a task; (b) a description of the "procedural knowledge" embodied in each of these processes - that is, their ability to generate particular output given a particular input; (c) a description of the properties of any memory records that the individual processes may call upon to carry out the target task; and (d) a description of the overall way in which the action of the various processes is co-ordinated and controlled. This attribute space is detailed in Section 6.
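Purely as an illustrative sketch, the four component approximations (a)-(d) can be pictured as a simple record structure. The field names and example values below are our own shorthand, not terminology from ICS or from the CTM attribute space itself:

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical shorthand for the four CTM component approximations;
# field names and values are illustrative, not ICS terminology.

@dataclass
class CognitiveTaskModel:
    # (a) the configuration of mental processes the task requires,
    #     e.g. a chain of transformations between levels of representation
    configuration: List[str]
    # (b) "procedural knowledge": each process's ability to generate
    #     a particular output given a particular input
    procedural_knowledge: Dict[str, str]
    # (c) properties of the memory records the processes may call upon
    record_contents: Dict[str, List[str]]
    # (d) how the action of the various processes is co-ordinated
    dynamic_control: str

ctm = CognitiveTaskModel(
    configuration=["obj->prop", "prop->lim"],
    procedural_knowledge={"obj->prop": "proceduralised",
                          "prop->lim": "requires record access"},
    record_contents={"prop": ["command names", "task goal"]},
    dynamic_control="configuration stable during task step",
)
```

A family of such records, one per interface variant, is the kind of output the expert system is intended to construct automatically.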
We make use of experimental evidence concerning human-computer interaction to infer principled relationships among concepts in the four component approximations. On the one hand, these principles relate configurations, procedural knowledge, and record contents to the overall dynamic control of the mental mechanism. On the other hand the principles relate properties of the components, taken together, to properties of user behaviour. Once principles have been established, they are encoded within the set of knowledge bases that make up an integrated expert system adviser. This requires three basic types of rules: those that take as input descriptions of users, tasks and systems, and map them onto concepts in the attribute space of the CTM representation; rules that operate entirely on components of the CTM representation; and rules that map from entities in the CTM representation to characteristics of human behaviour. The structure of a proof-of-concept demonstration system is described in Section 7, together with examples of the classes of rules that it contains and an illustration of its application to an HCI design problem. Our secondary approach to the application of ICS within HCI has developed as a consequence of our work towards meeting the goals of the first approach. While an automated tool can build the required models, and can hide the technicalities of building a representation of the interaction from the people using it, it cannot completely remove the need for them to have some domain specific knowledge — and in fact, it relies on this. Rather than attempt to represent all possible interface aspects and their consequences within the expert system, our approach has been to provide it with the conceptual structures necessary for it to recognise what aspects of the interface it needs to know about, and then to ask its users for that knowledge. 
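The three classes of rules can be caricatured as follows; this is a toy sketch in which every predicate and conclusion is invented for illustration, and none is drawn from ICSpert itself:

```python
# Toy caricature of the three rule classes; all conditions and
# conclusions here are invented, not taken from ICSpert.

def mapping_rule(description):
    """Class 1: map descriptions of users, tasks and systems onto
    concepts in the CTM attribute space."""
    if description.get("user_experience") == "novice":
        return {"procedural_knowledge": "partial"}
    return {"procedural_knowledge": "proceduralised"}

def ctm_rule(ctm):
    """Class 2: operate entirely on components of the CTM representation."""
    if ctm.get("procedural_knowledge") == "partial":
        ctm["dynamic_control"] = "record access required"
    return ctm

def behaviour_rule(ctm):
    """Class 3: map entities in the CTM representation onto
    characteristics of human behaviour."""
    if ctm.get("dynamic_control") == "record access required":
        return "slower, more error-prone performance predicted"
    return "fluent performance predicted"

prediction = behaviour_rule(ctm_rule(mapping_rule({"user_experience": "novice"})))
```

Chaining the three classes in this way, from interface description to CTM attributes to behavioural consequence, mirrors the pipeline structure described above, though the real system encodes many such rules in interacting knowledge bases.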
One of the most important aspects of any interaction is, of course, the ability that users have to comprehend the information that the system is presenting to them, and to understand and learn the structure of the tasks that they have to perform. Any approach which does not take into account individual patterns of expertise will be unable to predict the easily observable fact that the same interface is used in different ways by different people, and that the same person will use the same interface in different ways over time. Task structures change, plans can be elaborated or simplified, and even a person’s perception of the objects and entities within a scene or an interface can change markedly. CTA uses the idea of a hierarchical structure of ‘basic units’ of meaning within the different forms of mental representation identified within ICS: primarily the semantic, object, and sound-based structures. Different structural groupings of these basic units can reflect learning, and the amount of ambiguity or similarity between the units can be predictive of the ease with which learning can occur (this is elaborated in section 6.1). The expert system obtains this information by asking the people consulting it to describe elements of specific types of information one at a time, and then asking them to assess the inter-relationships between them (this is described in section 6.3). These types of information can be objects upon the screen, sounds, commands to be typed, or higher order task steps. In practice, it has been found that once the expert system has identified the relevant type of information and begun to ask its consultant to describe it, they are often able to realise the direction that the analysis is taking, and to solve the problem themselves (Shum & Hammond, 1993). We decided to build upon this and to develop a ‘cut-down’ technique for CTA that concentrates on representations of the information that the user is acquiring and processing. While the expert system tool forms representations itself, and uses them to highlight problematic aspects in an interface or interaction, this second approach is really more a representational technique to help analysts solve problems themselves. This technique is described in section 9.

4: Interacting Cognitive Subsystems

Interacting Cognitive Subsystems (ICS) represents the human information processing mechanism as a parallel architecture, built up from nine independent but architecturally uniform subsystems. Each subsystem receives information (at an ‘input array’), stores it and abstracts regularities over time (in an ‘image record’), and produces output (by a number of parallel transformation processes). While the subsystems are operationally identical, the nature of the information that they receive and operate upon differs, and this defines their identity and role. In consequence, each subsystem can be thought of as dealing with a distinct level of mental representation (Figure 1).

[Figure 1 not reproduced]

Figure 1: On the left, the common architecture of a subsystem, showing the input array, storage and transformation processes. On the right, the overall architecture, showing the information flows.

Three subsystems deal with incoming, sensory information (acoustic, visual and ‘body state’ or proprioceptive information), while two deal with outgoing, effector information (a ballistic, skeletal level termed ‘limb’ information, and a finer, more expressive level termed ‘articulatory’ information). These five subsystems exchange information with the world, in that they either receive information from the senses or produce motor commands to control behaviour. The remaining four ‘central’ subsystems mediate between the sensory and effector subsystems, and exchange information with each other, not with the external world. Two operate on abstract, structural descriptions of the world derived primarily from the visual subsystem (the ‘object’ level of representation) and the acoustic subsystem (the ‘morphonolexical’ level of representation). Both of these subsystems can interpret their respective structural representations to derive propositional information about the identity and relationships of entities in the world. The eighth subsystem can process this propositional information to derive implicational models that link the ‘meaning’ of propositions with more complex, schematic models of the world that have been developed over the individual’s lifetime. The final subsystem processes these implicational representations, together with affective information derived by the sensory subsystems, to produce further ‘internally derived’ propositional representations about the world, and to control the bodily, affective state of the individual (thus providing a link between cognitive and affective states).

Rather than model how each of these subsystems stores and transforms the information it receives, as an AI-based approach might, ICS focuses on the flow of information between subsystems, and through the architecture as a whole. An important consequence of the central subsystems’ ability to exchange information with each other is the possibility of ‘reciprocal’ flows, mainly centred on the propositional subsystem. As well as receiving information from the two structural subsystems (object and morphonolexical), this subsystem can also produce structural information based on its currently active propositional representation: this results in phenomena such as mental imagery (both visual and auditory) and in ‘top-down’ effects of expectation upon perception. It can also exchange information reciprocally with the implicational subsystem, with the result that the interpretation of events in the world is influenced by the individual’s currently active implicational representations (i.e., their ‘understanding’ of the situation or context).

The flow of information indicated by the solid black arrows in Figure 2 shows information from the eyes being structurally interpreted by the visual subsystem (at the lower left of the diagram) to produce an object representation, which is then used to produce a limb representation to control hand movements.
While the visual representation is a rich description of the raw visual scene, the object representation is a more abstract description of entities in the scene and their spatial relationships. The limb representation is a description of the motor actions that the individual intends to make within this scene.

Figure 2: The information flows involved in visuomotor co-ordination.

At the same time as it produces a limb representation, the object subsystem is also outputting a propositional interpretation of the scene, which is an even more abstract description of the identities and semantic relationships of the entities within the scene. This representation can then be operated upon by the propositional subsystem, which can provide co-ordinated output back to the object subsystem. This illustrates two important aspects of the dynamic nature of information flow within ICS: the object subsystem is receiving a ‘blend’ of information from external (visual) and internal (propositional) sources; and it is locked into a cycle of reciprocal processing with the propositional subsystem, whereby each subsystem influences the information that is being processed by the other. The black arrow on the right of Figure 2 shows another reciprocal loop and another instance of blending, as the body-state subsystem returns proprioceptive feedback about the position of the hands to the limb subsystem, where it is blended with the representations produced by the object subsystem.

The pattern of information flow in this simple example reflects the nature of the information that each subsystem can process, and the nature of the representations that they can produce. The visual subsystem, for example, cannot produce limb or propositional representations directly, and so the flow must go through an object level of representation. In more complex cases the presence of sound might result in the acoustic and morphonolexical subsystems processing information to produce an additional ‘stream’ of information for the propositional subsystem to blend with the visually derived stream that it receives from the object subsystem.

4.1: Modelling with ICS

The dynamic processing of information that arises as representations are repeatedly transformed, interpreted and re-represented throughout the overall cognitive system gives rise to the phenomena that can be used to make predictions about the patterns of cognitive activity. These patterns are ‘approximate’ in that they do not describe precisely what is happening to a given token of information, nor the result of its transformation, nor do they directly produce estimates of processing latency. Instead of such quantitative values, the model generates qualitative assessments of the complexity of cognition, which vary according to the dynamics of the flow of information between the subsystems, and the competition for resources by transformations within subsystems.

For modelling, the salient features of an interface are those which affect the course of the processing of information, and hence the patterns of cognitive activity. These aspects relate to the matching by each subsystem of incoming representations in their appropriate code to representations that they have operated upon before, and to the development and function of skilled performance (which in ICS terminology is called ‘proceduralised knowledge’).

The image record of a subsystem represents a complete record of all information that it has received. Over time, and with the repeated processing of similar information, these experiential records give rise to generalised abstracted records which can be used to elaborate and interpret incoming representations, which may be novel, contain errors or be incomplete. While a transformation process can be supported by access to its subsystem’s image record, an architectural constraint of ICS is that the image record can only revive one representation at a time. This prevents different transformation processes within a subsystem accessing the image record simultaneously.
The need to rely on the image record would consequently impose a bottleneck on information flow in complex situations. This can be circumvented over the longer term by the development by each transformation process of proceduralised knowledge that links a given input representation directly to its associated output representation. While the content of the image record reflects the information that a subsystem has received, procedural knowledge reflects the transformations that have been carried out. Together, the image record and procedural knowledge allow the subsystems to build representations of information, and to elaborate them or to transform them into other codes.

In summary, to frame an account of any form of behaviour, it is necessary to consider four aspects of the system-wide properties of processing activity. First, the behaviour of the complete system depends on the identity of the transformation processes that are required for a task (i.e., the ‘configuration’). Second, each process in a configuration is constrained by what recodings it can perform more or less automatically, which depends on what that process has learned (its proceduralised knowledge). If any process in a configuration does not possess the capability to transform the information its subsystem is receiving, then this will adversely affect the overall operation of the complete system. Third, performance may therefore also depend upon the nature and properties of the image records that may need to be accessed. These considerations have dealt with the capabilities of individual subsystems. Since the flow of information depends upon interactions between subsystems within a configuration, a fourth aspect must be considered, to take into account the overall dynamic co-ordination and control of the complete system.
Although there is no ‘attention scheduler’ or ‘central executive’ within ICS, any configuration will ‘march at the pace of its slowest soldier’, and so the overall pattern of processing (or ‘dynamic control’) may be determined predominantly by the activity of the one subsystem (the ‘locus of dynamic control’) which has the greatest requirements for image record access, or which is constrained by its simultaneous activity within a concurrent task’s configuration (in which case a process may need to ‘interleave’ between two configurations). In demanding circumstances, where two or more subsystems have heavy requirements for image record access, the locus of dynamic control may ‘oscillate’ between them.
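These four factors can be given a concrete, if purely illustrative, shape as a data structure. The sketch below is our own rendering in Python, not part of ICS or the CTM notation; the field names and the oscillation heuristic are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class CTMSnapshot:
    """An illustrative 'snapshot' of the state of the whole system:
    the four factors a Cognitive Task Model records at one moment."""
    # 1. Configuration: the transformation processes in play,
    #    e.g. ("vis", "obj") standing for the VIS->OBJ process.
    configuration: list = field(default_factory=list)
    # 2. Proceduralised knowledge: how automatic each process is.
    proceduralisation: dict = field(default_factory=dict)
    # 3. Image record access: subsystems that must revive records.
    record_access: set = field(default_factory=set)
    # 4. Dynamic control: the subsystem currently pacing the flow.
    locus_of_control: Optional[str] = None

    def control_oscillates(self) -> bool:
        # Per the text: when two or more subsystems have heavy
        # record-access requirements, control may oscillate.
        return len(self.record_access) >= 2

snapshot = CTMSnapshot(
    configuration=[("vis", "obj"), ("obj", "prop"), ("prop", "obj")],
    proceduralisation={("obj", "prop"): "moderate"},
    record_access={"prop", "mpl"},
    locus_of_control="prop",
)
```

A family of such snapshots, indexed by phase of activity and stage of expertise, would then correspond to the structure developed in the next section.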

5: Building Cognitive Task Models

The basic properties of the architecture outlined above suggest that overt behaviour is, over the very short term, an approximate function of the four factors of processing configuration, proceduralised knowledge, image record access and dynamic control. Building a cognitive task model (CTM) from these four factors is akin to forming a ‘snapshot’ of the state of the human information processing system as a whole, and would provide enough information to anticipate the immediate cognitive or behavioural consequences of a mental task. In practice, a sequence of behaviour is likely to consist of a number of different phases, organised over a short period of time, each with different configurations, degrees of proceduralisation, requirements for image record access and, consequently, different patterns of dynamic control.

A prototypical HCI scenario, for example, may involve the user in noticing that some icon on their display is flashing, and that they have some new email to read. They recall that the command for reading their mail is ‘New Mail’ from the menu, and that it has a keyboard short-cut of ‘command-M’. They locate the relevant keys (puzzling for a moment over which key is ‘command’) and execute the command. This scenario, and others like it, could be broken down into a task decomposition at almost any level of complexity. The finer the grain of the decomposition, however, the more specific it becomes to the task in question, and the less general to task performance per se. For our purposes, it is sufficient to characterise task performance as consisting of three phases: goal formation (where the user identifies that there is something to be done); action specification (where they decide what needs to be done to reach the goal); and action execution (where they express the actions necessary to reach the goal).
In our example, noticing the flashing icon, realising what it means and choosing to read their mail corresponds to goal formation; recalling the full command name and its abbreviation is action specification; and then co-ordinating the visual search for the appropriate keys and pressing them simultaneously is action execution. Where the four components of a CTM describe the state of the cognitive architecture, and hence behavioural consequences, in the ‘very short term’, the three phases of cognition produce a more extended description, in the ‘short term’.

The components of each phase within the performance of a task will inevitably be inter-related to some degree, and the phases may overlap, iterate, or be omitted. They cannot be considered as independent and sequential in any real sense, despite the logical necessity for goals to be formed before action can be specified and executed. At the very least, the execution of actions may lead to the respecification of the actions to correct or refine performance, and to take into account unexpected changes in the world. There might even be the formation of new goals that interrupt execution, or lead to former goals being displaced. Since different phases will usually require different configurations of processes, interleaving of phases leads to the interleaving of configurations, and hence an increase in the complexity of dynamic control. If any process within a configuration for a particular phase does not possess the requisite proceduralised knowledge required for a task, then the overall pattern of data flow cannot be smooth: extra transactions will be required by other processes in different subsystems to revive relevant record contents, or to infer pertinent information that is lacking. These transactions will alter the state of these other processes in subsequent phases of task performance, thus having knock-on consequences for behaviour.
The CTM for each phase will, of course, change over time as the user’s experience accumulates. Novice users who are unfamiliar with the non-alphanumeric keys on their keyboards may require visual search to execute certain key-combinations (as in the above example), while more practised users will ‘know’ each key’s location, and perhaps even which finger to press it with. Over the ‘longer term’, as learning progresses, the range of record contents that is available to a user grows, more comprehensive and less contextualised abstractions develop, and their response patterns increasingly acquire the characteristics of fully proceduralised knowledge. As they become fully expert in the use of an interface, the action specification phase may become so fully proceduralised that their behaviour will apparently progress directly from goal formation to action execution. Accordingly, in modelling task performance it is appropriate to consider not just a single CTM, with its ‘snapshot’ of cognition, but a family of CTMs, reflecting also the short term consequences of each phase of cognition, and the longer term development of behaviour from novice, through intermediate to expert knowledge (Figure 3).

Figure 3: A ‘family’ of CTMs, showing very short term aspects of cognition within each model, short term aspects across the phases of cognitive activity, and longer term aspects in the progression from novice to expert. The stack runs from Novice through Intermediate to Expert models of Goal Formation, Action Specification and Action Execution; the example entries shown for the three phases read:

Goal Formation. Configuration: *VIS→OBJ::OBJ→PROP::PROP→IMPLIC:. Proceduralised Knowledge: OBJ→PROP is moderate. Image Record Contents: @OBJ (Word(flashing)—Doc+Arrow).... Dynamic Control: configural oscillation is medium. Behavioural Consequences: users may not recognise icons.

Action Specification. Configuration: :PROP→IMPLIC::IMPLIC→PROP::PROP→MPL:. Proceduralised Knowledge: :PROP→MPL: is well proceduralised. Image Record Contents: @PROP (Pa(P1 & P2) & (P3 & P4)).... Dynamic Control: configural oscillation is low. Behavioural Consequences: the command names will be recalled.

Action Execution. Configuration: :PROP→MPL::MPL→ART::ART→MOT(typing)*. Proceduralised Knowledge: ART→MOT(typing) is poor. Image Record Contents: @MPL (command name).... Dynamic Control: configural oscillation is high. Behavioural Consequences: user will have difficulty typing.

The ‘family’ of CTMs portrayed in Figure 3 provides a basis for cognitive task analysis, through which a theoretically justified decomposition of cognitive resources can be systematically related to performance. Cognitive task analysis examines the representational and mental processing demands of a task, and relates these to patterns of performance that have been observed or derived empirically. The aim of the analysis is to infer rules interrelating properties of the complete cognitive mechanism. However, if we are systematically to infer rules that govern activity, we must have a more detailed specification of what we mean by configurations of processes, proceduralised knowledge, record contents and dynamic control. Each of these constructs should be clearly related to the underlying theory. Using graph structures as an illustrative medium, the following section outlines fuller specifications of these four concepts.
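The scope of such a family (stage of expertise crossed with phase of activity) can be enumerated mechanically. The following Python fragment is only a sketch of the shape of that space; the labels and the empty component slots are our own choices, not part of the CTM notation.

```python
from itertools import product

STAGES = ("novice", "intermediate", "expert")        # long term dynamics
PHASES = ("goal formation", "action specification",
          "action execution")                        # short term dynamics
COMPONENTS = ("configuration", "proceduralised knowledge",
              "record contents", "dynamic control")  # very short term

# One model per combination of expertise stage and phase of activity;
# each model holds the four content components (unfilled here).
family = {(stage, phase): {c: None for c in COMPONENTS}
          for stage, phase in product(STAGES, PHASES)}
```

Filling the nine component dictionaries with theory-derived values is exactly the task that the attribute space of the next section is intended to structure.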

6: The CTM Attribute Space

Several of the points made through the ‘stack’ representation of the family of CTMs in Figure 3 are reformulated in Figure 4 as a graph structure. This figure captures the notion that the family of CTMs should have a specific scope with respect to: an application domain, task set and system type (i.e., the overall task context); the phases of cognitive activity (goal formation, action specification, action execution); and the type of people they are applicable to and their stage of expertise. The CTMs also have content in that each model has a configuration, proceduralised knowledge, record contents, and dynamic control.

Figure 4: A graph structure representing the core attributes required to determine a family of cognitive task models.

The third attribute is concerned with the required consequences of the model. At one level, a description of the CTM attribute space might itself be an adequate consequence in that it is a theoretical description of the state of the cognitive architecture. Going a stage further, this state description might be used to extrapolate predictions about user behaviour, in terms of reaction times, learning curves, or error rates. It should also be possible to go beyond simple predictions and model reports, to use the attribute space to help designers understand the underlying psychological issues and to produce direct design advice.

The four ‘content’ components of each CTM can be described in more detail. The configuration, as indicated in Section 4, is a set of information processing resources that transform mental representations, distributed between the nine subsystems. Each subsystem has a primary copy process, which transfers incoming representations to the image record, and a number of secondary transformation processes, which produce representations that can be used by other subsystems.

Figure 5: A graph structure depicting the attribute space for configurations of cognitive activity.

In Figure 2, the basic information flow for visuo-motor co-ordination was illustrated. This can be represented in terms of a configuration by the set of secondary processes:

*VIS→OBJ:    (incoming visual to object representations)
:OBJ→PROP:   (reciprocal processing between object
:PROP→OBJ:    and propositional representations)
:OBJ→LIM:    (object to limb representations)
*BS→LIM:     (proprioceptive feedback from body state)
:LIM→MOT*    (limb representations produce motor action)

In this notation, an → indicates a transformation within a subsystem, a : indicates an exchange of representations between subsystems, and an * indicates an exchange of data with the world (through the senses or effector action). Although there is a rough correspondence in this ‘list’ of processes to a serial, top-to-bottom sequence, they should be seen as simultaneously active processes. Thus the object subsystem is receiving representations simultaneously from the visual and propositional subsystems, and is simultaneously producing propositional and limb representations, while the limb subsystem is receiving representations from both the object and body-state subsystems. This configuration might operate during a ‘tracking task’, where someone has to keep a pointer aligned with a target moving unpredictably on a screen. The purpose of this notational formalism is mainly brevity, since it allows simple configurations to be represented as a linear sequence. The configuration annotated above, for example, can be represented textually as:

*VIS→OBJ::OBJ→PROP::PROP→OBJ::OBJ→LIM::LIM→MOT*  *BS→LIM:


There are clearly difficulties in representing parallel flows within such a linear sequence, as shown by the need to convey both the object and body state inputs to the limb subsystem. Even harder to portray are the reciprocal loops that are so important in governing central cognitive activity, such as the object-propositional loop here. More precise mathematical formalisms are being investigated in the context of Syndetic Modelling, which aims to describe cognitive modelling and system modelling within a single framework (Duke, Barnard, Duce & May, 1995; in press).

This configuration represents a fully proceduralised flow, with no image record access, and with each process operating in a direct mode upon the representations that are arriving at its subsystem. In some circumstances, there would be a requirement for at least one of the processes to switch to an indirect, buffered mode, in which the incoming representations are first copied to the image record, and only then used as the source for the transformation process. This allows the ‘buffered’ process to operate at a slower rate than the pace of the incoming representations would otherwise allow, and also supports a degree of short-term temporal integration of incoming representations. The operation of such a buffer in visuomotor co-ordination might occur in the Object subsystem, resulting in the configuration:

*VIS→OBJ::copyOBJbuff→PROP::PROP→OBJ::OBJ→LIM::LIM→MOT*  *BS→LIM:

In this configuration, the object to propositional transformation is operating on buffered representations, perhaps supporting the recognition of patterns in the motion of the tracked target, while the object to limb transformation is operating directly upon the incoming representations.
The constraints that buffered processing imposes upon the rate of information flow through the configuration mean that, in practice, only one process can be buffered at a time (the location of the buffer corresponds to the level of mental representation of which the person is focally aware). In consequence, if more than one process is insufficiently proceduralised, the buffer may have to ‘oscillate’ between subsystems, leading to a complex pattern of dynamic control of the configuration (see below).
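The textual formalism above also lends itself to simple mechanical manipulation. The sketch below is our own illustration (not the Syndetic Modelling formalism cited earlier): it splits a configuration string into its component transformation processes, writing the arrow as ASCII '->' and treating ':' and '*' as the between-subsystem and world-exchange delimiters.

```python
import re

def parse_configuration(config: str) -> list:
    """Split a configuration such as '*VIS->OBJ::OBJ->PROP:' into
    (source, target) pairs, one per transformation process."""
    processes = []
    # Tokens are delimited by ':' (exchange between subsystems)
    # or '*' (exchange of data with the world).
    for token in re.split(r"[:*]+", config):
        if token:
            source, target = token.split("->")
            processes.append((source, target))
    return processes

flow = parse_configuration(
    "*VIS->OBJ::OBJ->PROP::PROP->OBJ::OBJ->LIM::LIM->MOT*")
# The object subsystem appears both as a producer and a consumer,
# reflecting the reciprocal object-propositional loop.
```

The parallel *BS→LIM: flow would have to be parsed as a separate string, which underlines the point made in the text: purely linear strings struggle to express parallel and reciprocal flows.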

6.1: Proceduralised Knowledge

Each of the transformation processes in the configurations described above effects a mapping from a particular input representation to a different output representation. In the case of the object to propositional transformation :OBJ→PROP:, the process can map from the visuospatial structure of any scene (visually derived or mentally imaged) to its semantic, referential meaning. The corresponding :MPL→PROP: transformation produces meaning from the surface structure of sounds (verbal and non-verbal). These transformations clearly perform a substantial number of mappings, and so while the basic concept is one of mapping from a particular instance of one form of representation to another, the complete process can produce a set of mappings (or a repertoire).

The concept of proceduralised knowledge deals with the capabilities of these mappings, but it would clearly be an enormous task to attempt to model the capabilities of every possible transformation. The resulting model would be so specific to an individual person at a set time and place that it would serve no applicable purpose. To build approximate models it is necessary to consider the properties of sets of mappings within a process. For :MPL→PROP: a specific mapping might be the recognition of a particular instance of a spoken English word, while a repertoire would reflect the ability to recognise that word spoken by different people, or even more broadly, simply recognising spoken English. Figure 6 illustrates this approximation from the specific to the general by placing on the left hand side of the dotted line the theoretical constructs that apply to a single mapping, and on the right the salient approximations when these are generalised over a repertoire. It is this distinction between the specific mappings and the approximate repertoire, and its emphasis upon the modelling of the approximate, that distinguishes ICS from simulation based models of human information processing.
By definition all mappings have an input and an output, which are representations of information at particular mental levels (all processes within a subsystem having inputs of the same level). Each representation will have a degree of completeness, since an input may be only a partial specification of the information required for the process to complete its mapping to an output. At all mental levels, the representations can be specified in terms of a structural description, consisting of a set of basic units, each with their own constituent substructure, and which are themselves organised into superstructures. The substructure of the basic units reflects the encoding dimensions that are relevant to the level of representation: at an object level, for example, these dimensions would encompass the attributes of structurally integrated visual objects, their relative dimensions, positions and dynamic characteristics. The superstructures, which determine the organisation of the basic units into ‘chunks’ or ‘wholes’, can reflect both learning on the part of the bottom-up mappings (typically the sensory subsystems learning how to structurally group units, or the structural subsystems learning to produce semantic units in propositional code), and the contextual influence of top-down mappings (usually based upon propositional knowledge, but also implicational models). This structural organisation is not a fixed hierarchy, determined by the environment or the task, but is interpreted dynamically throughout the course of cognitive activity. Since each subsystem has its own representation, multiple representations will be present at the same time, and will influence each other. The analysis of this dynamic modification of representational structure underlies the diagrammatic notation outlined in section 9 of this paper.

Figure 6: The organisation of attributes characterising the Proceduralised Knowledge of a mapping, and its approximations over a repertoire of mappings.

Together the input and output representations constitute any mapping’s identity, since given that particular input, the mapping would normally result in the corresponding output. In reality, additional constraints may affect its accuracy. Figure 6 indicates three such constraints that are relevant to approximate modelling for HCI (others may be relevant for other purposes). It can be assumed that the accuracy of a mapping is determined by how practised it is (its developmental status), and its operation is likely to be moderated by the neurological state of the mental device – which may be affected through intoxication, stress, arousal, fatigue, or even brain damage. The mapping will also have a contextual precision which indicates how acceptable any variation in the output representation is, in the current context. For example, the propositional input representation to ‘make information visible on the screen’ might result in a morphonolexical output representation of ‘put’, ‘display’ or ‘show’, all of which are equally adequate when talking to someone else, who is operating the computer, but only one of which is the correct command when interacting directly with the computer.

The right hand side of Figure 6 depicts approximations of these attributes of proceduralised knowledge, taken over a repertoire of mappings. There would naturally be a wide range of possible mappings, some of which are specifically concerned with the task context being modelled (task specific), but many others would be completely unrelated (task irrelevant). Between these extremes would be mappings that are relevant through being generic to a range of contexts including the current context (and so whose use in the current context may be affected by experience in another), and task relevant mappings that are not specifically concerned with the current context, but which are derived from related contexts, and so which may provide support in the absence of appropriate task specific mappings (e.g., when interface objects or functions are represented through metaphor or analogy). Each of these repertoires can be described by two sets of constraints, which generalise over the mapping constraints of the specific processes on the left hand side, and over the data constraints of the representations that they process. As properties of a set of mappings, these constraints can be accorded something like the character of a statistical distribution—with means, variance, skew and so on. In the case of mapping constraints, we can assign a simple scale to a property like contextual precision (set size n, precision nil, low, medium, high or full). Similarly, we can approximate over the data constraints by assuming that, for the current task, the different levels of representational structure will have degrees of contextual utility indicating how useful they will be in supporting the mappings, and that the degree of completeness implies a requirement for accessing record contents to resolve ‘missing data’.
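As a toy illustration of approximating mapping constraints over a repertoire, the fragment below assigns the suggested ordinal scale to contextual precision and summarises a small, invented set of mappings; the example values are our own, not drawn from the paper.

```python
from statistics import mean

# The simple ordinal scale suggested in the text.
PRECISION = {"nil": 0, "low": 1, "medium": 2, "high": 3, "full": 4}

# Hypothetical repertoire: contextual precision of five task-specific
# mappings (invented values, purely for illustration).
repertoire = ["high", "medium", "full", "medium", "high"]

scores = [PRECISION[p] for p in repertoire]
summary = {
    "set size n": len(scores),
    "mean": mean(scores),               # central tendency of the set
    "spread": (min(scores), max(scores)),
}
```

A fuller treatment would also record variance and skew, as the text suggests, but the point is simply that a repertoire is characterised by a distribution over an ordinal scale rather than by any single mapping.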

6.2: Record Contents

The graph structure for record contents (Figure 7) shares a number of attributes with that for proceduralised knowledge, since it reflects the nature of representations that are stored in each subsystem’s image record. Storage is carried out by the ‘copy’ process, and so record contents are identical to the representations that form the input to a proceduralised mapping. When revived from the image record, they can thus serve as input for a transformation process, just as if they were newly received representations. The pattern of flow through the overall ICS architecture means that the output from one subsystem’s transformation processes becomes the input, and hence forms the record contents, of other subsystems.

As the left hand side of Figure 7 shows, revival of representations from the image record occurs through the presence within a transformation process of an input representation, which can start a revival mapping. This mapping, just like those of the transformation processes, has the attributes of proceduralised knowledge, and can be carried out with varying degrees of precision, depending upon context, developmental and neurological state, and so on. If successful, it produces an output representation, but unlike the mappings in transformation processes, this output is at the same level as the input. In effect, the record contents have been used to elaborate the input according to the subsystem’s receptive history (i.e., the individual’s past experience). Like the mappings, the image record is physically instantiated in the neural structures of the brain, and so may be affected by the same neurological considerations. In the theoretical description of ICS, record contents play a central role in learning, and fulfil many of the properties of ‘memory’ in other theories.
Since record contents are formed by the copy process, without any transformations being carried out, the record contents begin to develop in a subsystem before the secondary transformations. Indeed, the proceduralisation of mappings depends upon, and cannot occur without, the support of record contents. Repeated co-occurrences of information within representations arriving at the subsystem lead to their subsequent co-revival, effectively providing an abstracted common record that provides the basis for transformational processes. While many skills may develop into fully automated mappings after substantial practice (e.g., pressing RETURN on a keyboard), others may continue to rely upon revival of records (e.g., forming an image of an icon that is to be located). The repertoires of records that can be approximated for cognitive modelling (the right hand side of Figure 7) differ from the mappings of proceduralised knowledge. While many records may remain completely irrelevant to the current task (task irrelevant records), four subsets of related records can be usefully distinguished.

Barnard & May (1999) HCI 14

Figure 7: The organisation of attributes for record contents and its approximations.

The basic form of a record is provided by the experiential task records (ETRs), which correspond to episodic memory traces for particular events that the individual has experienced: every representation that arrives at the subsystem, and which is copied into the image record, forms an ETR. These can be uniquely revived by the arrival at the subsystem of a closely matching representation, derived from the output of another subsystem. In interface use, ETRs would support specific knowledge such as the name used to label a file, or the rough location of a document within a folder on the desktop. As similar ETRs recur, their commonalities are abstracted into a common task record (CTR), which may consequently lack detail at the substructural (and even basic unit) level of representation when revived. The non-specificity of CTRs means that they are valuable in supporting generalisation and the acquisition of abstract patterns of information, such as where to look to find the salient features of an icon, how far to drag the mouse to locate the position of options within a menu, or the identities of the ‘slots’ of a task sequence. A second form of abstraction is provided by the entity property records (EPRs), which refer to recognisable entities within the domain of use. In HCI, these would encode the information that an object on the screen will respond in a particular way to a double-click, or that it can be dragged. The distinction between CTRs, which refer to the cognitive task the user is conducting, and the EPRs, which refer to the objects with which the user is interacting, is fine, but important, since they have different consequences for the transfer and generalisation of knowledge across contexts. As a crude generalisation, CTRs know ‘how to do it’, while EPRs know ‘what it does’. The fourth class of record in Figure 7 is the active task records (ATRs).
These are only available to support transformation processes in the very short term, since they are the input representations that a transformation process has just acted upon: while not strictly speaking part of the image record contents, they are included here because they are representations that can be used again directly just as if they had been revived from record contents, and share attributes with the other sets of record contents. Of course, as soon as a process operates upon a new representation, this becomes its ATR, and the previous representation is no longer available in this way.
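The abstraction of a common task record from recurring ETRs can be pictured, as an illustrative sketch rather than anything drawn from the theory's formal machinery, by treating each representation as a set of features and intersecting them, so that instance-specific detail drops out while the shared structure survives:

```python
# Sketch: each arriving representation is copied into the image record as an
# experiential task record (ETR), here approximated as a set of features.
# Recurring commonalities are abstracted into a common task record (CTR) that
# lacks instance-specific detail. The feature names are illustrative.

def abstract_ctr(etrs):
    """Intersect recurring ETRs to approximate a common task record."""
    common = set(etrs[0])
    for etr in etrs[1:]:
        common &= set(etr)
    return common

# Three episodes of locating a document icon on the desktop:
etrs = [
    {"icon:document", "region:top-left", "name:report.txt"},
    {"icon:document", "region:top-left", "name:notes.txt"},
    {"icon:document", "region:top-left", "name:draft.txt"},
]
ctr = abstract_ctr(etrs)  # the instance-specific file names drop out
```

The surviving features (the icon's appearance and rough location) are exactly the kind of abstract pattern the text attributes to CTRs, while the file names remain recoverable only from the individual ETRs.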


6.3: Dynamic Control

In modelling a distributed architecture of independent but interacting subsystems, it is necessary to add to the static attributes of configurations, proceduralised knowledge and record contents a range of dynamic attributes that describe the behaviour of the complete system over time (Figure 8). Given the potential interconnectivity of the whole mechanism, it is useful to distinguish a core configuration for a particular task from subsidiary configurations that may nonetheless play a significant role. We can, for example, usually drive a car while listening to the radio or holding a conversation with a passenger. This can happen because the configuration of processes for driving need not necessarily draw on any of the processes required for language comprehension or production. However, if the driver has to form a mental image of a sports commentary, or wishes to assess their passenger’s reaction to a particular point of discussion, processing resources in the visual and object subsystems must be diverted from the driving task and incorporated into the configuration for interpreting the conversation. There must be a shift in the configuration of processing activity, however momentary, which may affect their ability to drive.

Figure 8: The organisation of attributes characterising dynamic control and its approximation over time (very short term character: 100-1000 msec; short term character: 1-10 sec).

In an HCI context, a user who is in an action specification phase of cognitive activity preparing to issue an interface command may experience a similar configural shift, when they change from trying to recall the command to searching through a menu, where the use of visual information becomes directly relevant. Similarly, while writing with a keyboard a user may notice a peculiar pattern of pressure at their finger tip when they accidentally press two adjacent keys instead of the intended one. For the brief moment in which they are ‘paying attention’ to the finger tip information, a central process has switched from handling information about the content of the text that they are constructing to one involving a data stream within a proprioceptive process, instigated a proceduralised motor action to hit delete, and then switched back to the meaning of their writing. These considerations are captured in the left hand side of Figure 8. For the purposes of approximation, a subsidiary process configuration can fulfil a number of information roles. In a conversation, the arms and hands may be gesturing in a manner semantically related to the

ongoing conversation. In this case, the core configuration controls the speech output, while a subsidiary configuration constructs gesture that presages the content of the speech stream in a ‘feedforward’ manner. If the hands are occupied with a wine glass or a small child, the secondary configuration is suppressed, with no direct effect upon the core configuration, although an absence of proprioceptive feedback from the speaker’s body state subsystem (another subsidiary configuration) may indirectly affect the development of their implicational representations. Unlike Figures 6 and 7, there are two vertical dotted lines in Figure 8. These distinguish approximations over different timescales. The first partition concerns dynamics in the very short term. It focuses upon the approximate description of activity within a single member of the family of models portrayed in Figure 5. In any ‘snapshot’ of (say) 50 msec of activity, it is unlikely that there can be much in the way of a shift from one coherent pattern of activity to another. In the range between 100-1000 msecs, configural patterns may be relatively stable or subject to significant shifts, captured in the very short term by the attribute of interleaving between different configurations. As described earlier, the ICS theory allows only one process within a configuration to be configured for buffered processing, since this controls the rate of data flow throughout the configuration. Thus, one level of representation is effectively dominant within a pattern of configural activity. This is captured by the concept of locus of dynamic control.
So, for example, when we are concentrating upon the sound of a speaker’s voice, the locus of dynamic control will be in the acoustic subsystem; when we are concentrating on how someone is saying something it may be at the morphonolexical subsystem; when we are concentrating on what they mean, it will be at the propositional subsystem; and when we are concentrating on the emotional feelings generated by what it means, the locus will be at the implicational subsystem. A requirement for buffered processing at more than one subsystem will require oscillation of the locus. Increases in either the amount of interleaving or oscillation result in an increasingly complex dynamic control of the overall pattern of cognitive activity. Although there may be periods with relatively stable patterns of configural activity, patterns may differ in terms of both the complexity of process interchange and the extent of record access. In the earlier section dealing with configurations we illustrated how processes within the central subsystems could form a simple ‘loop’ of reciprocal processing activity—as for example when the Propositional to Implicational transformation and the Implicational to Propositional transformation enter into an extended period of reciprocal interaction for planning an utterance. Activity within this loop may be more or less stable and more or less extended in time. However, more complex patterns may develop, driven by both internal cognitive requirements for more detailed representations by any process within the configuration, or by external changes in the world leading to a discrete transition in the information being processed. Within the simple propositional to implicational loop, there is no necessity to ‘verbalise’ the thoughts involved, and a basic :IMPLIC→PROP: :PROP→IMPLIC: cycle might adequately support the cognitive task.
However, the configural pattern might involve :IMPLIC→PROP: :PROP→MPL: :MPL→PROP: :PROP→IMPLIC:, in which case the thoughts would be verbalised into a surface structure code, fed back to the propositional subsystem, and then back to the implicational subsystem. This pattern would be more complex than the simple reciprocal loop, and would have the benefit of providing two inputs to the propositional subsystem, both derived from the same propositional material, thus resulting in a richer and more stable semantic model. This comes at the cost of increasing both the configural complexity and the extent of activity, and, by requiring additional resources, it leaves the configuration open to interference from conflicting secondary task demands. Adding further oscillations or interleaving of subsidiary configurations would make matters worse. Extending the analysis of dynamic control over longer periods is a relatively straightforward matter. We can consider the properties of information roles, interleaving, oscillation, locus of control, complexity of process interchange and extent of record usage in some sense ‘averaged over’ the co-ordination of activity within the short term dynamics of task performance (1-10 secs and up). Here though, we need to draw in two further considerations. First, cognitive activity operates in a task context and subserves task goals. We therefore need to consider the overall coherence of that activity with respect to the task goals.


The activity may subserve the goals in a coherent way or, for example, extended activities may be less than coherent, as when an individual switches from one set of actions to another in an apparently aimless fashion. However, the switching between phases of cognitive activity may be very coherent indeed. When phases of cognitive activity switch from one to another, this is captured by interleaving. Over the short term, users characteristically do not plan their actions, obtain all the visual information they require and then execute the actions. While entering a computer command sequence, they may type a bit, look, think, look again and type some more. On a larger scale they may interact at the terminal, answer the phone, service an interruption from a visitor and then resume activities at the terminal. Taken together, the concepts of oscillation, interleaving and goal coherence enable us to restrict our phase analysis to goal formation, action specification and action execution. Rather than finessing the problem that cognitive activities in a task are not strictly seriated (e.g., see Norman, 1986), this approach explicitly captures properties of non-seriality.
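The contrast drawn above between the simple reciprocal loop and the more complex verbalised pattern can be sketched with two crude indices: the number of distinct processes engaged (extent), and the number of shifts of activity between subsystems (a rough stand-in for oscillation of the locus). These measures are our own illustrative assumptions, not the published attribute definitions:

```python
# Sketch: a configural pattern as an ordered list of transformation
# processes, written 'source->target'. The two indices below are crude
# approximations of configural extent and of oscillation of the locus
# (approximated here as changes of source subsystem between steps).

def extent(config):
    """Number of distinct transformation processes engaged."""
    return len(set(config))

def oscillations(config):
    """Number of shifts of activity from one source subsystem to another."""
    sources = [step.split("->")[0] for step in config]
    return sum(1 for a, b in zip(sources, sources[1:]) if a != b)

simple_loop = ["implic->prop", "prop->implic"]
verbalised = ["implic->prop", "prop->mpl", "mpl->prop", "prop->implic"]

# The verbalised pattern engages more processes and shifts more often,
# matching the text's claim that it is the more complex configuration.
assert extent(verbalised) > extent(simple_loop)
assert oscillations(verbalised) > oscillations(simple_loop)
```

On these indices the verbalised pattern scores higher on both counts, which is the qualitative ordering the surrounding discussion relies on.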

6.4: ICS: First and Second Order Principles

The basic four-component structure of the CTM introduced in section 5 and the more detailed attribute space for the expert system elaborated in section 6 are coherently related. The theoretical resources used to define the underlying architecture are used as the basis for the language used in the CTM attribute space. The relationship between underlying theory and the CTM encompasses both description and explanation. The operation of the underlying architecture is assumed to be governed by a small number of basic principles concerning information flow, the way in which mental representations are formed, and the recovery of stored memory records. These include principles specifying that any one process that transforms information from one mental code to another can only deal with one coherent stream of information at a time; that the copy process copies all incoming information to the record associated with its subsystem; that there exists a retrieval mechanism within each subsystem which enables a description to be revived to enable recent representations to be re-used; and a separate mechanism for re-using representations preserved in the longer term, through which a description generated through the action of a process in one subsystem is used to access the record of another (Barnard, 1985). Such first-order principles can be used to motivate second-order, more approximate principles governing the operation of the broader architecture that can be expressed within the attribute space for a CTM. For example, if any given process that transforms information can only handle a single stream of data at a time, it follows that it must disengage from an incoming stream to access a memory record of past experience. If the transformation is not fully automated or proceduralised, a record may be used as the basis for the transformation.
From the first-order principles, it can be conjectured that the complexity of dynamic control of the architecture will, within limits, increase as a function of the number of processes for which the recoding of information is unproceduralised. Similarly, other conjectures can be formed to inter-relate the revival of descriptions in the short term to the use of active task records (ATRs) and to link the principle allowing access to and re-use of experiential task records (ETRs) in one subsystem through descriptions originating in another. So, for example, in both forms of access it is likely that a well-formed structural description will facilitate re-use of a representation, and that the structure of information in ATRs will bias processing activity when recoding operations are relatively unconstrained by prior procedural knowledge or ETRs. A number of such principles were originally proposed by Barnard (1987) and evaluated against empirical evidence derived from studies of command language, menu usage and information layout. Naturally, the task of developing and testing principles across a broad range of settings is a substantial one. Following on from Barnard (1987), further consideration of evidence from lexical dialogues (Barnard, Grudin and MacLean, 1989), iconic search (Green & Barnard, 1990) and multi-modal settings (Barnard & May, 1995) have led to the formulation and testing of an increased range of second-order principles applicable to the dynamic control of auditory-verbal and visuo-spatial behaviours, some of which have been shown to have predictive validity (May, Tweedie & Barnard, 1993; May, Barnard & Blandford, 1993). The second-order principles need more than a simple grounding in the ICS architecture and the authority of some empirical evidence. 
Since all subsystems within ICS obey the same principles, the second-order principles need to be formulated in a sufficiently abstract form that they can be applied across different mental domains, and must be internally consistent in their application. A high level principle governing

complexity of dynamic control will need to generalise from classic problems such as visual search and lexical commands to advanced interfaces such as multimedia applications or virtual environments. There must also be a clear means of applying them to practical problems and design scenarios with realistic degrees of complexity.

6.5: Application of CTM

Other than those already outlined, cognitive task analyses based upon ICS have now been conducted for interface issues as diverse as the dynamics of screen changes (drawing upon an analysis of cinematographic practices, May & Barnard 1995a), the use of hand gestures for user input (Duke 1995, Duke et al 1995), mouse and keyboard input in a multimodal system, a range of multimodal output issues (Barnard & May, 1995), and the deictic use of speech (Barnard, May & Salber, 1996). Beyond interface issues, it has also been applied to the wider problems of access and availability of people working in ‘ubiquitous computing environments’, where desktop computers with video channels provide the potential for breaches of personal privacy (Bellotti et al, 1996). The diverse nature of these examples provides some evidence that the CTM representation is potentially capable of meeting the requirement of scope outlined in section 2. While the classic interface issues are the sort of well-specified problem that is likely to be met frequently in a design setting, where two or more options have to be compared and evaluated, the general problems of multimodal interfaces and ubiquitous computing environments are clearly very abstract in the information they provide to the modeller, and in the sort of information that is required in return. This supports our claim that this representation also meets the requirement for abstraction. All of these examples of application were the result of the authors’ exploration of the contribution that ‘user modelling’ could make to an interdisciplinary approach to design support, as part of the Esprit projects Amodeus and Amodeus-2 (Barnard et al, 1992; Barnard et al 1995).
While ICS itself is based upon empirical evidence, both from the psychological literature and from experiments that have been conducted explicitly to test deductions made from the theory, within the Amodeus projects it was applied to two forms of HCI scenarios: those intended to reflect ‘core issues’ within HCI which any approach to modelling ought to be able to address (Young & Barnard, 1987), and those drawn from ‘real world’ design situations to which the project members had access. Over the course of the project, the emphasis shifted from the former to the latter, and these in turn evolved from small-scale design problems experienced by members of the project to larger-scale, multimedia issues experienced by designers in real, on-going design projects in Computer Based Learning, Computer Supported Collaborative Work, and Safety Critical Systems. The apparent success with which the CTM representation could be applied to such a wide range of design scenarios gave rise to a key question: How much of the analytical insight was being provided by the representation; and how much was ‘merely’ the craft skills of the authors, dressed up in the formalism of the representation? Once a particular problem space has been defined, the application of general principles and their mapping to a specific application setting should be formulaic. One way to settle the craft-skill issue would be to demonstrate that the modelling process could be automated, with both second-order principles and their mapping to a domain of application rendered explicit. A vital motivation for the original development of the CTM attribute space described in section 5 was that it would enable us to specify the principles of ICS and the relationships between the component parts of the attribute space as production rules within an expert system.
It was envisaged that it should be possible for someone with no knowledge of the modelling to describe a problem, and for the expert system to build the representation. The additional role that the expert system could play as an ‘existence proof’ of the principled nature of the representation was serendipitous, for the rules that the expert system executed in the course of its ‘analysis’ of the problem would explicitly show the nature of the knowledge needed to construct the representation, and the lack of heuristic insight or implicit solution recognition.



7: Development of the expert system

Before the development of the CTM attribute space, some preliminary HCI problems had been analysed using ICS constructs, and these analyses had been implemented as production rule systems (e.g. see Barnard et al, 1987; 1988). These preliminary systems were separate, unintegrated illustrations of what such an analytical tool might do. In addition, they were largely concerned with early command and menu style dialogues, rather than with the broader, visually based interactions of current systems. As mentioned above, the CTM attribute space representation had been created to bring these individual systems together, and support the development of an integrated expert system, capable of constructing CTMs for a range of problems, and of using them to reason about user behaviour. Initially we brought together the existing analyses and began to extend the general applicability of our techniques to the interpretation of structural and dynamic properties of visual interface objects. We then began the development of an integrated ‘expert system’ demonstrator in which we could encode many examples in a unified way. In terms of the production of the demonstrator, called ‘ICSpert’, we set as a target the implementation of sufficient rules to cover six areas, which drew upon:

Two topics previously implemented:
1. reference to keystroke commands
2. conceptual structure of tasks and the sequencing of their constituents

Two new areas derived from more recent experimental work in the HCI literature:
3. icon search
4. mental models or device knowledge

Two new design scenarios:
5. navigation functions in a hypertext
6. location of command options within a WIMP interface

These examples were selected to provide coverage over key domains of interface design and to illustrate the re-use of general rules over a range of issues. Across these examples, we have cases of keystroke command, menu and visually based, direct manipulation interfaces.
In each instance there are particular forms of tasks, reference is made to command operations, and so on. This means that knowledge based rules dealing with the structure of tasks (derived from case 2) play a powerful role in other cases (e.g. 1, 4 and 6). Similarly, rules dealing with command reference derived from case 1 also play a role in cases 2, 5, and 6. Those rules dealing with the structural and dynamic properties of visual interface objects derived primarily from case 3 also play a role across cases 4, 5 and 6. Although we list here six cases used to specify generalisable rules, the coverage of the system is actually quite broad. So, for example, we know that we can describe many variants dealing with the constitution of command name sets, including semantically specific words, general words, pseudowords, abbreviations, random consonant strings, etc. Likewise, many different forms of iconic interface and task organisation may be described. The applicability of the present system goes beyond the test cases used to motivate the rules. The implementation of ICSpert was carried out within a commercially available product called Xi+, marketed by Inference Europe Ltd. In attempting to use this product within a research project of this type, we were clearly exploring capabilities that would not normally be used in the production of a ‘classic’ expert system. These systems are not primarily designed as theoretically based modellers. As such, a considerable proportion of the effort was devoted to making the control rules work in the manner we required. We estimate that two thirds of the implementational effort was devoted to getting control passed among knowledge bases appropriately, together with the parameters we required to be passed. Around a third of the effort went into encoding the relevant theoretical approximations derived from ICS and the CTM attribute space. 
The complete listing of the expert system knowledgebases is provided in an Amodeus project report (Barnard, Blandford & May, 1992), and a more detailed account of the workings of

ICSpert than can be accommodated within this paper can be found in May, Blandford and Barnard (1993). In the rest of this section we present an overview of ICSpert to illustrate the relationship between its operation and the construction of the CTM attribute space. The product of this implementational phase involved around 400 rules distributed over several knowledge bases (Figure 9). First, the essential rules that map from descriptions of tasks and interfaces are concentrated in a knowledgebase that builds hierarchical, structural descriptions (around 50 rules). This generates a database of reusable information. In addition, of course, other knowledge bases have the capability to call for more information on a particular point relating to a value of one of the attributes. The rules that map from the CTM attribute space back to the properties of user behaviour are concentrated in an ‘analyst’ knowledge base (around 75 rules). Those rules that deal with setting values in the CTM attribute space are distributed over the knowledge bases that relate to the specific subsystems of the ICS model and a co-ordinating ‘modeller’ knowledge base. The distribution of rules between knowledgebases is not equal: since we were not dealing with examples involving speech input, the ‘acoustic’ knowledge base is essentially empty (1 rule). Similarly, it is a property of the ICS theory that the central subsystems can enter into reciprocal cycles of processing. Although we know that two reciprocal processes are unlikely to have identical properties (e.g. the vocabulary that an individual can produce is very much smaller than the vocabulary they can comprehend), we have nonetheless used this reciprocal property in our approximations. In practice this means that most of the work done to establish what is going on in the central subsystems is accomplished within the knowledge base for the propositional subsystem (over 100 rules).
The knowledge base for the implicational subsystem, which is involved primarily in reciprocal exchanges with the propositional subsystem, is considerably simpler (7 rules). If, on the other hand, we had been constructing a modeller in the domain of affect and cognition (as proposed by Teasdale & Barnard, 1993), the implicational knowledgebase would have contained most of the rules.

Figure 9: The core knowledgebases (ellipses) of the ICS Expert System for Cognitive Task Modelling in HCI, showing the sequence of modelling and the information produced at each point (rectangles).

7.1: Examples of Rules in ICSpert

Figures 10 to 14 show examples of the modelling rules that are contained in ICSpert. Figure 10 shows three related rules drawn from the ‘modeller’ knowledgebase, which co-ordinates the overall construction of the CTM. These three rules define the configuration that is required in a phase of cognition for a particular task context. The left hand side (LHS) of each rule, following the ‘if’, sets the conditions under which the facts on the right hand side (RHS), following the ‘then’, will be assumed to be true. In these rules the first part of the LHS describes the phase of cognition that is being modelled (goal formation, action specification or action execution), and the rest of the LHS describes the general type of interface that is being used (i.e., whether the user must type or select a word for a ‘verbal’ interface, or locate and click an icon in an ‘iconic’ interface). In the case of action execution, the configuration is the same whether the user has to press a single function key or click on an object on the screen: here the rule has two LHSs, linked by an ‘or’ conjunction.

if    phase is goalf
and   command form includes verbal
then  config includes vis|obj
and   config includes obj|mpl
and   config includes mpl|prop
and   config includes prop|implic
and   config includes implic|prop
and   config includes prop|mpl

if    phase is actionsp
and   command form includes iconic
then  config includes prop|implic
and   config includes implic|prop
and   config includes prop|obj
and   config includes obj|prop

if    phase is actionex
and   input from user includes keystroke
and   keystroke form includes function keys
or if phase is actionex
and   input from user includes direct manipulation
then  config includes prop|implic
and   config includes implic|prop
and   config includes prop|obj
and   config includes obj|lim
and   config includes lim|mot

Figure 10: Three rules from the ‘modeller’ that are used to define the configuration of processes required for a particular CTM.

These ‘if’ rules are used in ‘backward chaining’, where ICSpert is searching for possible values of ‘config’. The inference engine of the expert system shell finds all rules that include a fact about ‘config’ in their RHS, and then attempts to evaluate the LHSs to see if the conditions are satisfied. With these three rules, this can lead to a further search to find rules that include ‘command form’, ‘keystroke form’, and ‘input from user’ among their consequences. Eventually the inference engine may come to a rule that requires it to display a question to the person consulting ICSpert, to ask them for information about the task, the interface, or the user. In this way, ICSpert only asks the questions it needs to know the answers to. Once the configurations have been established, the ‘modeller’ requests the inference engine to search for the values of identifiers in the attribute spaces of procedural knowledge and record contents, again using backward chaining.

Figure 11 shows four rules from the ART knowledgebase, which deals with the activity of the ‘articulatory’ subsystem, including speech, typing, and other verbal output. These rules would fire if the inference engine were constructing a CTM for the action execution phase, and the configuration included the production of motor output for keyboard use. The first two determine the approximate contextual precision (acp) of task specific (ts) procedural knowledge for the :ART→MOT: transformation process, according to the typing experience of the class of user that is being examined. The second two rules determine the approximate contextual precision (acp) of the active task records (atr) within the subsystem. These attributes of the CTM are defined in Figures 6 and 7, respectively.


Barnard & May (1999) HCI 14

if  phase is actex and subset is pk and output code is mot_t
    and typing experience is expert
then acp at art|mot_t of ts in actex = 5
    and automaticity = 5

if  phase is actex and subset is pk and output code is mot_t
    and typing experience is novice
then acp at art|mot_t of ts in actex = 1
    and automaticity = 1

if  phase is actionex and subset is rc and output code is mot_t
    and typing experience is expert
then acp at art of atr in actionex = 5
    and automaticity = 5

if  phase is actionex and subset is rc and output code is mot_t
    and typing experience is novice
then acp at art of atr in actionex = 1
    and automaticity = 1

Figure 11: Four rules from the ART knowledgebase, determining attributes of procedural knowledge (pk) or record contents (rc), according to the typing experience of the user.

Rules like those shown in Figure 11 in the ART and other subsystem specific knowledgebases map from the problem description provided by the person consulting ICSpert (e.g., attributes such as the users’ anticipated typing skills) to attributes of the CTM, usually via intermediary identifiers inferred by other modelling rules (such as the nature of the output codes, or the identity of the processes within the configuration). To the extent that they provide values for CTMs, these rules can be said to contain the knowledge that supports cognitive modelling. The values that can be taken make a relatively economic number of contrasts: the degree of typing experience is either expert, intermediate or novice, for example, and most internal attributes such as ‘approximate contextual precision’ are made on a five point scale: 1 (none), 2 (a little), 3 (average), 4 (a useful amount), 5 (complete). The use of a small number of contrasts is regarded as a strength of approximate modelling, since neither excessive time nor spurious accuracy is required in order to proceed with modelling. The effects of combining several attributes also allow a much larger range of analytic outcomes to occur. In the rules illustrated, for example, the difference between an expert typist and a novice typist is reflected in their ability to automatically produce typed output from a representation based on the sound of the material being typed, without having to visually search for the keys: expert typists can do this perfectly (a value of 5), and novices cannot do it at all (a value of 1). Since these are backward chaining rules, the inference engine has to be told which attributes it must establish values for, by forward chaining rules such as those in Figure 12. In place of ‘if’, these have ‘when’, so that as soon as their LHS conditions are met, their RHS will be assumed true.
These three rules use variables (indicated by identifiers that start with a capital letter) so that whenever the inference engine begins to look for the attribute values for ‘Contents’ (either procedural knowledge or record contents) in any of ‘Phase’ (goal formation, action specification or action execution), then it will be instructed to search for values of some specific intermediate identifiers. These will in turn fire backward chaining rules that require specific attribute values to be set. Here, depending upon the nature of the interface task, the rules infer that the modelling requires information about either ‘order ambiguity’ or ‘item ambiguity’ of the information being processed, or the clarity of the ‘meaning_of_text’ presented to the user.
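The variable-binding behaviour just described can be sketched as a small pattern matcher. The tuple encoding of facts and the fire() function below are illustrative assumptions; only the rule content paraphrases Figure 12, with Capitalised symbols acting as variables.

```python
# A sketch of the variable binding used by the 'when' rules: identifiers
# beginning with a capital letter (Contents, Phase) match any value.  The
# fact encoding and matching machinery are illustrative assumptions.

def match(pattern, fact, bindings):
    """Unify a tuple pattern against a fact, treating Capitalised symbols
    as variables; return the extended bindings, or None on failure."""
    if len(pattern) != len(fact):
        return None
    b = dict(bindings)
    for p, f in zip(pattern, fact):
        if p[:1].isupper():             # a variable: bind or check
            if b.get(p, f) != f:
                return None
            b[p] = f
        elif p != f:                    # a constant: must match exactly
            return None
    return b

def fire(facts):
    """when check Contents of Phase and units of Phase are task steps
       then check order ambiguity in Phase"""
    goals = []
    for f1 in facts:
        b = match(("check", "Contents", "of", "Phase"), f1, {})
        if b is None:
            continue
        for f2 in facts:
            if match(("units", "of", b["Phase"], "are", "task steps"), f2, b):
                goals.append(("check", "order ambiguity", "in", b["Phase"]))
    return goals

facts = [("check", "rc", "of", "actionsp"),
         ("units", "of", "actionsp", "are", "task steps")]
print(fire(facts))   # the Phase variable has been bound to 'actionsp'
```

The same rule thus sets modelling goals for whichever phase and contents are currently being checked, which is what lets a handful of forward chaining rules drive the whole consultation.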


Representing Cognitive Activity

when check Contents of Phase
    and units of Phase are task steps
then check unit ordering in Phase
    and check order ambiguity in Phase

when check Contents of Phase
    and units of Phase are commands or text strings
then check commands
    and check item ambiguity in Phase

when check Contents of Phase
    and units of Phase are task steps
    and command_form includes verbal
then check meaning_of_text

Figure 12: Three ‘forward chaining’ rules from the propositional knowledgebase.

These rules indicate the power of generalisation that derives from the similarity between the attribute spaces of procedural knowledge and record contents. Although one refers to the ‘knowledge’ contained within a transformation process, and the other to the ‘knowledge’ contained within the image record, both attribute spaces have similar structures, and so their analysis can be driven by the same high-level rules. Similarly, the uniform subsystem architecture means that strings of identifiers can be concatenated in a systematic way by a comparatively small number of modelling rules, to reflect the detailed structure of the CTM attribute space in an economic manner. The consequence is that within ICSpert a limited number of forward chaining rules set the modelling goals, varying according to aspects of the interaction problem, and cascades of backward chaining then determine the appropriate attribute values of the relevant areas of the CTM, asking for more specific information as and when it is required. The next set of rules (Figure 13) shows how the attribute values for particular phases of cognition can be used to reason about the dynamic control of cognition, such as the attributes of ‘oscillation’. The first rule depends upon a record contents attribute from one phase (the approximate utility of the superstructure of the common task records available in the propositional subsystem image record, during goal formation) to determine the consequences during another phase (action specification).

if  phase is actionsp
    and command_form includes verbal
    and item ambiguity in actionsp is high or very high
    and ausuper at prop of ctr in goalf < 3
then locus of control at mpl|prop in actionsp is probable
    and locus of control at prop|mpl in actionsp is possible
    and locus of control at prop|implic in actionsp is very probable
    and locus of control at implic|prop in actionsp is possible
    and oscillation in actionsp is high

if  phase is actionsp
    and command_form includes iconic
    and item ambiguity in actionsp is zero or low
then locus of control at obj|prop in actionsp is possible
    and locus of control at prop|obj in actionsp is possible
    and locus of control at prop|implic in actionsp is very probable
    and locus of control at implic|prop in actionsp is possible
    and oscillation in actionsp is low

Figure 13: Rules from the ‘modeller’ knowledgebase that predict dynamic control attributes of a family of CTMs, dependent upon attributes from different phases of cognition.

This occurs because the modelling has predicted a high degree of ‘item ambiguity’ of the verbal commands that are to be generated (e.g., semantic confusions between command names such as delete and remove), which could be overcome if there was sufficient support from the goals that the user had formed (i.e., the task had been well learnt, to the extent that the goal is specifically to ‘delete’ or ‘remove’ rather than just ‘get rid of’). The second rule covers the corresponding case where there is predicted to be no problem of item ambiguity. The RHSs of these two rules specify the approximate probabilities that a given transformation process would require buffered processing in the given circumstances. If a process operates in buffered mode, then as explained in section 6.3, it would become the locus of dynamic control. Where two or more processes are rated as ‘probable’ or ‘very probable’ then the locus will oscillate between them (as is the case for the first rule), but if only one process is so rated, oscillation is unlikely to occur (the second rule). These rules can clearly only fire once the identifiers relating to item ambiguity have been instantiated, and this is where the information collected by the Structor knowledgebase is used. This essentially presents a set of questions about each ‘item’ within an interface or interaction, where the nature of the item depends upon the interface class being described: the items could be screen objects, for example, or task steps. Once an item has been described, Structor asks if it has a ‘constituent structure’, i.e., whether it consists of any elements that can also be described. In this way it iteratively builds up a representation of a hierarchy of elements within the interface, whose interrelationships can then be evaluated to arrive at overall approximations of attributes such as ‘item ambiguity’. These attributes do not appear in the graph structures for the CTM, since they are not strictly speaking part of the CTM: they are intermediate attributes that describe some property of the task environment, which could in principle be instantiated in a true CTM identifier such as the approximate utility of the basic units of a representation. Once the CTMs have been constructed, the modelling is essentially complete.
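The oscillation judgement encoded in the Figure 13 rules can be sketched as a simple count over the locus-of-control ratings. The rating vocabulary follows the figure; the function itself is an illustrative assumption about how the ‘two or more processes’ criterion could be operationalised.

```python
# Sketch of the oscillation judgement: the locus of dynamic control
# oscillates when two or more processes are likely to require buffered
# processing.  Rating labels follow Figure 13; the rule is an assumption.

def oscillation(locus_ratings):
    """Map per-process locus-of-control ratings to an oscillation level."""
    likely = [p for p, r in locus_ratings.items()
              if r in ("probable", "very probable")]
    return "high" if len(likely) >= 2 else "low"

# Ratings from the first rule of Figure 13: two processes at least 'probable'.
verbal = {"mpl|prop": "probable", "prop|mpl": "possible",
          "prop|implic": "very probable", "implic|prop": "possible"}
# Ratings from the second rule: only one process so rated.
iconic = {"obj|prop": "possible", "prop|obj": "possible",
          "prop|implic": "very probable", "implic|prop": "possible"}

print(oscillation(verbal), oscillation(iconic))   # high low
```

Applied to the two rules' RHS ratings, this reproduces their differing oscillation conclusions for the verbal and iconic command forms.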
ICSpert has constructed a large database of user-defined identifiers, intermediate identifiers, and identifiers describing the CTMs. The final stage is to interpret the models, and this is performed by forward chaining rules in the ‘analyst’ knowledgebase, such as those shown in Figure 14. These rules would fire during analysis of the ‘action specification’ phase of cognition for an interaction that required the user to construct verbal commands. Depending upon the degree of support from propositional experiential task records (ETRs) and the likelihood that the propositional image record will be required to support the :PROP→MPL: transformation, differing predictions can be made about the extent to which users will make semantic confusions between command names.

when analysis is actionsp
    and acp at prop of etr in actionsp < 5
    and acp at prop of etr in actionsp > 1
    and command_form includes verbal
    and locus of control at prop|mpl in actionsp is probable or very probable
then report
    Users will make a few errors due to semantic confusion between the command names.

when analysis is actionsp
    and acp at prop of etr in actionsp = 1
    and command_form includes verbal
    and locus of control at prop|mpl in actionsp is probable or very probable
then report
    Users will make several errors due to semantic confusion between the command names.

Figure 14: Two output rules from the ‘analyst’ knowledgebase which use the attribute values in the CTMs to produce reports about the predicted usability of an interface.
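The graded reporting in these two rules can be read as thresholds on the approximate contextual precision (acp, scale 1 to 5) of the propositional experiential task records. The report wording follows Figure 14; the threshold function is an illustrative assumption, and the rules' other conditions (verbal command form, probable locus of control at prop|mpl) are taken here as already satisfied.

```python
# Sketch of the two 'analyst' reporting rules as thresholds on acp (1-5).
# Wording follows Figure 14; the function itself is an assumption.

def confusion_report(acp):
    if acp == 1:                       # no contextual support at all
        return ("Users will make several errors due to semantic "
                "confusion between the command names.")
    if 1 < acp < 5:                    # partial support
        return ("Users will make a few errors due to semantic "
                "confusion between the command names.")
    return None                        # acp = 5: neither rule fires

print(confusion_report(1))
```

Note that a value of 5 (complete contextual precision) leaves both rules' conditions unsatisfied, so no confusion report is generated at all, which is the intended behaviour for a well-differentiated command set.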

The ‘analyst’ knowledgebase contained around seventy reporting rules like those in Figure 14, each having conditions drawn from CTM attributes and problem specific identifiers such as those describing the ‘command_form’. The reports that they generated were of course determined by the scope of the scenarios that had been used to build ICSpert, but the majority of these rules would fire in more than one situation. Very few were specific to particular cases or instances of the original problems, and those that were generally used information about the nature of the interface in order to phrase the report appropriately. Other reporting rules did not contain problem specific identifiers at all, but were based entirely upon the CTM identifiers, especially those reporting consequences of dynamic control.


7.2: Application to an HCI Design Problem

As the diagrams in section 6 illustrate, the number of possible identifiers in the complete attribute space is of course large. In producing a CTM for a particular phase, however, ICSpert does not attempt to instantiate every possible identifier. The use of backward chaining means that only those relevant to the current context are inferred. Figure 15 lists a section of the database constructed by ICSpert when asked to model the second of its examples, taken from a study by Barnard, MacLean & Hammond (1984). This paper examined the conceptual structure of tasks and the sequencing of their components by asking people to carry out a set of electronic mail operations in which the command name sets varied in semantic confusability, and the task steps were organised either as a set of four paired operations, or as two sequences of four operations. All eight operations had to be executed in a specified, serial order. The operations were of two kinds: ‘establish’ operations that obtained a piece of information about a mail message; and ‘action’ operations that processed a piece of information. In the ‘four pairs’ structure, each pair consisted of an ‘establish’ operation, followed by its associated ‘action’ operation. In this design option, there is a clear pragmatic constraint on the ordering of the commands within the pairs, since the action cannot be performed until the requisite information has been established, but the ordering of the four pairs is not so constrained. In the ‘two sets of four’ option, all four ‘establish’ operations are grouped together, and then the four ‘action’ operations are performed. Here the pragmatic constraint on ordering is between the groups of operations, but there is no pragmatic constraint on the ordering within the groups.
The two options differ in the support that they give to the user in learning the correct sequence; and the design question is whether the support is better given ‘early’ or ‘later’ in the task structure. To produce the model listed in Figure 15, the ‘four pairs’ structure with semantically related names was described to ICSpert. Here only the attributes of the action specification phase are listed; an equivalent number of attributes were inferred for the goal formation and action execution phases. The knowledgebase that inferred each attribute is indicated in parentheses. The processes that are part of the configuration are each modelled in turn, by the ‘modeller’ knowledgebase asking the knowledgebase indicated by the ‘input’ side of the transformation process about the production of the ‘output’. Procedural Knowledge and Record Contents are examined in turn, and so the model first contains descriptions of task specific (ts) procedural knowledge for the :PROP→IMPLIC: transformation, produced by the ‘prop’ or propositional knowledgebase. To infer the approximate contextual precision (acp) of this transformation, the propositional knowledgebase had first to establish the degree of item ambiguity in action specification, which it did by evaluating two measures of the maximum number of ‘confusions’ possible in the task sequences that had been described. Because of the backward chaining, these confusion measures enter the database first, followed by the item ambiguity attribute, and finally the ‘acp’ attribute for the transformation process. The ‘automaticity’ of the process is then inferred, before the individual record contents attributes are determined, and finally an overall evaluation of the ‘record utility’ for the process (i.e., a general measure of how well processing in this subsystem can be supported by its image record contents). 
The propositional knowledgebase then passed control back to the modeller, which asked the implicational knowledgebase to return values for the reciprocal :IMPLIC→PROP: transformation. As described in section 7.1, the implicational knowledgebase takes advantage of assumptions of ‘symmetry’ in reciprocal processing which are almost certain to be oversimplifications, but which are useful as heuristics. Instead of repeating the modelling that has been carried out by the propositional knowledgebase, then, it just transfers the final estimates into its own attribute values for automaticity and record utility. A degree of safety is provided in that the specific attributes are not copied, so that if later rules needed to know more precise information, it would have to be specifically evaluated rather than derived through this heuristic approach. The next transformation process, :PROP→MPL:, again requires the propositional knowledgebase to add some attribute values to the database, and so the procedural knowledge available to produce morphonolexical output is inferred. In another oversimplification, ICSpert assumes that the record contents available to a subsystem are the same regardless of the process that is to make use of them, and since the record contents of the propositional subsystem have already been assessed for the :PROP→IMPLIC: process, no new record contents attributes are added for the :PROP→MPL: transformation.



aubasic at prop|implic of ts in actionsp = 5 (prop)
factorial confusions of est = 120 (prop)
factorial confusions of act = 1 (prop)
total confusions in actionsp = 7 (prop)
item ambiguity in actionsp is high (prop)
acp at prop|implic of ts in actionsp = 1 (prop)
automaticity of prop|implic in actionsp = 3.4 (prop)
ausuper at prop of atr in actionsp = 5 (prop)
ausuper at prop of etr in actionsp = 5 (prop)
ausuper at prop of ctr in actionsp = 5 (prop)
ausuper at prop of epr in actionsp = 5 (prop)
acp at prop of etr in actionsp = 1 (prop)
acp at prop of ctr in actionsp = 1 (prop)
acp at prop of epr in actionsp = 1 (prop)
aubasic at prop of etr in actionsp = 5 (prop)
aubasic at prop of ctr in actionsp = 5 (prop)
aubasic at prop of epr in actionsp = 5 (prop)
record utility of prop in actionsp = 3.67 (prop)
automaticity of implic|prop in actionsp = 3.4 (implic)
record utility of implic in actionsp = 3.67 (implic)
acp at prop|mpl of ts in actionsp = 1 (prop)
aubasic at prop|mpl of ts in actionsp = 5 (prop)
automaticity of prop|mpl in actionsp = 3 (prop)
acp at mpl|prop of ts in actionsp = 1 (mpl)
automaticity of mpl|prop in actionsp = 1 (mpl)
aubasic at mpl of etr in actionsp = 5 (mpl)
aubasic at mpl of ctr in actionsp = 5 (mpl)
aubasic at mpl of epr in actionsp = 5 (mpl)
ausub at mpl of etr in actionsp = 5 (mpl)
ausub at mpl of ctr in actionsp = 5 (mpl)
ausub at mpl of epr in actionsp = 5 (mpl)
record utility of mpl in actionsp = 5 (mpl)
average record utility in actionsp = 4 (modeller)
average configural automaticity in actionsp = 2.7 (modeller)
locus of control at mpl|prop in actionsp is possible (modeller)
locus of control at prop|mpl in actionsp is probable (modeller)
locus of control at prop|implic in actionsp is very probable (modeller)
locus of control at implic|prop in actionsp is possible (modeller)
oscillation in actionsp is high (modeller)

Figure 15: The attribute values inferred by ICSpert for the action specification phase of a task taken from Barnard et al (1984), examining the organisation of task steps in an electronic mail task. The knowledgebase that inferred each attribute is indicated in parentheses.

The final process to be examined is the :MPL→PROP: transformation, and here the symmetry of reciprocal processing heuristic is not used, and so the relevant procedural knowledge and record contents attributes are inferred specifically. Once this has been done, average values for record utility and configural automaticity are computed from the results of the four transformation processes (the propositional record contents attributes being counted twice, even though they have only entered the database once), and rules similar to those shown in Figure 13 determine the likely locus of dynamic control and the consequent amount of oscillation in the phase. The output that is produced by ICSpert for this scenario, based on the attribute values shown in Figure 15 and those of the goal formation and action execution phases, is shown in Figure 16. As ICSpert is a ‘proof of concept’ demonstrator, intended to show that this approach to the automated production and assessment of cognitive task models was feasible, the reporting rules have been intentionally written to be potentially generalisable - that is, they should be applicable to problems other than those from which they have been derived. In part this reflects the division of responsibility between the knowledgebases of ICSpert, with the analyst knowledgebase forming its reports predominantly upon the CTM attribute space rather than directly upon the problem descriptions. Our interest has not been to produce perfect problem solutions from known material, as might be the situation in a Case Based Reasoning approach, but to show that the theoretical analysis of a problem can in principle lead to reasonable conclusions.
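The two ‘modeller’ averages in Figure 15 can be reproduced from the per-process values listed there. As noted above, the propositional record utility is counted twice, once for each propositional process in the configuration, even though it enters the database only once; the arithmetic below is a direct check of that reading.

```python
# Reproducing the two 'modeller' averages in Figure 15.  The propositional
# record utility enters the average twice, once for each propositional
# process (prop|implic and prop|mpl) in the configuration.

record_utility = {"prop": 3.67, "implic": 3.67, "mpl": 5}
automaticity = {"prop|implic": 3.4, "implic|prop": 3.4,
                "prop|mpl": 3, "mpl|prop": 1}

avg_utility = (2 * record_utility["prop"] + record_utility["implic"]
               + record_utility["mpl"]) / 4
avg_automaticity = sum(automaticity.values()) / 4

print(round(avg_utility), round(avg_automaticity, 1))   # 4 2.7
```

This matches the ‘average record utility in actionsp = 4’ and ‘average configural automaticity in actionsp = 2.7’ lines of the figure.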


Analysis for goal formation:
Users will initially experience difficulty in remembering the correct order in which to issue the commands. With practice, users will be able to learn the correct order of commands. Users should be able to infer what to do next from knowledge of ordering. Users will be able to generalise performance to different sequences of these commands, if that is required. When planning how to carry out individual task steps, the overall co-ordination and control of users’ mental activity will be quite straightforward. During planning, processing activity will involve relatively few shifts of emphasis between ‘structural’ and ‘semantic’ forms of mental representations.

Analysis for action specification:
Users will initially find this set of command names very difficult to learn. Users will make several errors due to semantic confusion between the command names. Occasional users are likely to be able to recall this command set. When formulating the constituents of individual task steps, the overall co-ordination and control of users’ mental activity will be quite involved. During action formulation, processing activity will involve many shifts of emphasis between ‘structural’ and ‘semantic’ forms of mental representations.

Analysis for action execution:
When engaged in the performance of individual task steps, the overall co-ordination and control of users’ mental activity will be very straightforward. During performance, processing activity will involve relatively few shifts of emphasis between ‘structural’ and ‘semantic’ forms of mental representations.

Overall, the task phases of planning, formulating constituents and actually performing them, will be ‘interleaved’ with considerable frequency.

Figure 16: The output produced by the reporting rules of the ‘analyst’ knowledgebase, given the complete family of CTMs summarised in Figure 15.

An important feature of the output, even in this limited system, is that it not only identifies probable usability difficulties, but it explains why they occur, in terms of the user’s cognitive tasks. In this example, the output suggests that, while there may be some initial difficulty with recalling the order, it will soon be learnt (a reflection of the low value within the CTM for ‘order uncertainty’ for the goal formation phase). This is in contrast to the output that would have been produced for this phase for the ‘two sets of four’ design option, in which all four ‘establish’ commands had to be issued before the four ‘action’ commands. In the CTM for that option, the ‘order uncertainty’ attributes would be higher, and the text output by ICSpert would reflect that.


The indication that the semantic confusability of the command names is causing difficulties during the phase of action specification, but not action execution, reflects the different values of ‘item uncertainty’ in these two phases. Because ICSpert does not ‘know’ what the constraints on the design are, it cannot actually ‘suggest’ corrective action to resolve these problems; it does not even really ‘know’ that these are problems. It is the role of the person consulting ICSpert to decide whether these are acceptable difficulties, or whether they should attempt to produce a design option that removes them. If they take the latter course, they will be guided towards a resolution by their understanding of what it means to reduce ‘semantic confusability’ in action specification: they will at least know that the difficulties are a consequence of the user not being able to decide which name refers to the goal that they have chosen, but that having chosen a name, they do not have any difficulty executing the operation. Because the output rules are only operating on the CTMs, and not on the original problem description, they cannot explain why a particular cognitive difficulty has arisen, except in general terms, and because only one option is being modelled at a time, ICSpert cannot itself weigh up options against each other: this is again the role of the person using ICSpert.

8: Limitations and potential

The current implementation of ICSpert should not be seen as anything more than the ‘proof of concept’ demonstration that it was built to be. It is certainly nowhere near being a real, applicable tool. It is best viewed as a snapshot of a development process, but one with the potential for further development. We are now exploring the application of a similar system to ‘mainstream’ cognitive psychology experimental tasks, as a concrete example of the contribution that effort in the applied science of HCI can make to one of its ‘parent’ sciences. The visibility of the rules within the knowledgebases is important, as it lays bare the modelling process, and justifies the claim that the theory, as well as the craft skill of the modeller, is making a contribution. The rules in the knowledgebases correspond to the second-order principles described in Section 6.4, rather than the first-order principles, but in each case the derivation should be apparent. Thus the rules in Figure 10 that define the configurations of processes cannot call on the same process for two purposes, as this would contravene the first-order principle that a process can only operate on a single data stream at any one moment. The rules in Figure 13 can determine the locus of control and an assessment of its oscillation, since image record access, and hence the locus of control, is restricted to a single process at any moment by another of the first-order principles of ICS operation. ICSpert works for the six test cases, which were chosen as part of a wider investigation into the multidisciplinary analysis of ‘core HCI scenarios’, and were not chosen specifically for the purpose of constructing this system. In some of these instances (notably the hypertext scenario), it ‘works’ through a somewhat inelegant route: the output is sometimes generated via an intermediate rather than a proper CTM attribute.
These intermediate attributes, such as ‘item ambiguity’ and ‘order ambiguity’, have been used within ICSpert as ways of generalising the questions that the expert system asks its user, so that they can be presented in several situations where different ‘surface’ problems have underlying cognitive similarities. Item ambiguity might correspond to an assessment of the utility of basic units of a propositional representation in one configuration, but the utility of the superstructure of an object representation in another, depending upon the phase of activity, source of ambiguity, and the configuration being used. The same rules within the Structor knowledgebase can derive the intermediate attribute in all cases, however, and subsequent rules should map it onto the appropriate CTM attribute. In a complete ICSpert that built full CTMs, all intermediate identifiers would be re-expressed as CTM attributes, but since this is not a complete version, some intermediate identifiers persist in the final model, and so these must be used to derive the analytical reports. We are confident that as we gain a grasp of the full interrelationships between the intermediate and CTM attributes, we will be able to generate properly motivated rules to make the connection. The current implementation also clearly demonstrates one key feature of this approach, an understanding of which has emerged in the course of our development process. The descriptions of tasks, users and interfaces are collected in a manner not unlike that of the AI programme ‘Eliza’ (Weizenbaum, 1966). The Structor knowledgebase has no semantics for the commands, tasks, or visual interface objects that are given to it.
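The intermediate-to-CTM re-expression just described could be sketched as a dispatch table keyed on the representation and phase being processed. The table entries and function below are illustrative assumptions about the kind of rule intended, not a worked-out part of ICSpert.

```python
# Sketch of re-expressing an intermediate identifier as a CTM attribute.
# The mapping entries are illustrative assumptions, not ICSpert rules.

CTM_TARGET = {
    ("prop", "actionsp"): "aubasic at prop",   # utility of basic units
    ("obj", "actionex"): "ausuper at obj",     # utility of the superstructure
}

def reexpress(intermediate, subsystem, phase):
    target = CTM_TARGET.get((subsystem, phase))
    if target is None:
        return None    # incomplete version: the intermediate identifier persists
    return f"{target} in {phase} reflects {intermediate}"

print(reexpress("item ambiguity", "prop", "actionsp"))
```

The None branch corresponds to the current situation described above, in which an unmapped intermediate identifier survives into the final model and has to feed the analytical reports directly.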
However, the specialist knowledgebases do contain a relevant form of semantics, because these knowledgebases ‘know’ about the properties of mental representations likely to apply to different classes of entities at the interface (through the intermediate identifiers of item ambiguity, order uncertainty, spatial uncertainty, etc.). In this way, our particular form of approximation avoids one of the classic difficulties associated with normal AI-based simulation methodologies. We do not need rules that specify exactly how a particular process generates a particular output. We avoid the classic knowledge explosion problem by confining our theoretical reasoning to abstract characterisations of structure in mental representation, and to inferences about mental processing activity based upon abstractions concerning the properties of particular levels of representation. The use of these intermediate identifiers also lets the system ‘work’ for scenarios not included in our test cases, provided that the ‘input’ questions can be interpreted loosely, to cover the relevant aspects of the scenario. Sometimes this leads to unexpected outputs which have, so far, mostly appeared reasonable, but in some cases are clearly peculiar. We regard the explicitness that leads to this as an advantageous property of the approach rather than a limitation: it provides a clear basis for the validation and advancement of the rule set in a systematic manner. As a computer-based tool, ICSpert also suffers from the limitations that its domain of expertise was intended to correct: the computers it was built for are now technologically outdated, and the software shell it was written in is no longer supported. Since its interface design was constrained by the software shell, it is also vulnerable to the HCI equivalent of the comment ‘physician, heal thyself’.
One of the findings of the evaluation of its use (Shum & Hammond, 1993) was that, while it successfully removed the need for its users to have an in-depth knowledge of ICS, they did need training in the peculiarities of the expert system shell, and in interpreting the meaning of some of the questions that ICSpert was asking: especially when the questions were ‘generic’ and relied upon the context of the consultation to make their referents clear. A more anecdotal finding was that in answering the questions ICSpert posed, people started to work out ‘where it was going’, and in having to iteratively describe task steps or commands to the Structor, often noticed design weaknesses themselves, without having to complete the consultation. While we felt that this was in itself a laudable consequence of ICSpert’s ability to home in on the salient aspects of an interface, and to direct its user’s attention to them, it did call into question the necessity of the complete representation of the CTM. This led us to develop our secondary approach to the application of ICS in HCI, in which it forms the theoretical basis for a set of ‘problem solving techniques’ based around the ideas of mental representation and flow, and emphasising the utility of the intermediate identifiers that had become important in the development of ICSpert.

9: Representations for HCI problem solving

In addition to the quasi-formal, principled CTMs, ICS also provides an informal language for reasoning about cognition. For situations where a principled model is not appropriate, this language may suffice to define the points of a problem that need to be considered, and so support the process of problem resolution. Of particular importance are the nature of the information processed by the four central subsystems within ICS, the flows of representations between them, and the structure of the information within those representations. As part of the EU Esprit project AMODEUS-2, we wrote a working paper which outlined the steps that were necessary to identify the elements of the object representation that a user would form from a computer display, based upon the ideas underlying the Structor knowledgebase and the rules that the Modeller and Object knowledgebases would apply to the resulting hierarchical structure. This was taken up by the project as a potential example of the way that modelling skills could be ‘transferred’ from theorists to practitioners, and so the paper was developed into a set of tutorial materials, whose effectiveness could be ‘assayed’. The materials, and the techniques themselves, have continued to develop, and have now been presented as tutorial materials several times (May, Scott and Barnard, 1995; May & Barnard, 1997; Jørgensen & May, 1997).

9.1: Diagrammatic Notations

In their current form, the tutorial materials include an eighty-page booklet which describes the idea of a hierarchical structure for mental representations, and presents the ICS framework to convey the idea of multiple levels of mental representation and their interaction. In place of the hierarchical structures built up

by ICSpert’s Structor, designers are provided with a representational technique for recording Structure Diagrams, and Transition Path Diagrams (TPDs), which record the way that the focus of processing moves through the structures over time. By constructing TPDs for different levels of mental representation, designers can identify points at which information at one level can be ‘out of step’ with another. These are the sort of situations that the CTMs built by ICSpert would indicate as requiring a high degree of cognitive activity, resulting in poorer user performance. A typical cause of such a situation might be the user wanting to carry out one action while the display presents information pertaining to another; or where the display changes without helping the user identify the element that they are required to attend to next. Problems in the Structure Diagrams or the TPDs can often be resolved by rearranging display elements or sounds, adding or removing task steps, or otherwise changing the structure of information provided by the interface to help the user attend to the salient elements. Two simple Structure Diagrams are illustrated in Figure 17. These represent the two alternative task structures used in the ‘email’ experiment of Barnard, MacLean & Hammond (1984), for which ICSpert produced the CTM listed in Figure 15. While the Structure Diagrams represent the overall hierarchy of the information that is potentially available, only a subset of this can actually be actively processed at any moment. In the CTM, the actively processed subset would correspond to the ‘basic units’ of the representation, and their grouping would be the ‘superstructure’. The element that is the focus of processing is called the ‘psychological subject’, and the other elements that share its superordinate, grouping element form the ‘psychological predicate’.
As time passes, the focus of processing can move through the hierarchy from the subject to any of the elements in the predicate, ‘up’ to the grouping element, or ‘down’ into the constituent elements of the subject. Because the CTMs do not attempt to simulate the course of cognition, the TPDs do not have a direct analogue within ICSpert, but each row corresponds to an ‘active task record’ at a given level of cognitive representation (see Figure 7). Overall, a TPD represents the sequential changes in an ‘active task record’ as the interaction progresses. In the upper part of Figure 17, the ‘ordering’ task consists of four pairs of commands, with the highlighted member of each pair being a ‘pragmatic subject’ that the user knows to execute first (in this example, these are commands that the user must execute to find out or ‘establish’ a piece of information that the second command will then ‘act’ upon when it is executed). Pragmatic subjects are important, because on moving the focus of processing ‘down’ into a subject’s constituent structure, the pragmatic subject will automatically form the next subject. The lower diagram represents an interface in which the commands have been grouped functionally, so that the four ‘establish’ commands must be performed before the four ‘actions’. Here the only pragmatic subject is the element representing the whole group of ‘establish’ commands: while it is clear that these must be performed before the ‘actions’, there is no pragmatic reason for performing any of the four ‘establish’ commands before any of the others, and so there is no ‘pragmatic subject’ within the group. The same applies to the four ‘actions’.
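The structural notions introduced above can be sketched in code. The following is an illustrative sketch only, assuming a simple nested encoding of a task structure; the `Node` class and `subject_and_predicate` function are our own hypothetical names, not part of the ICS tools described here.

```python
# Illustrative sketch only: a hierarchical task structure with optional
# 'pragmatic subjects'. All names here are hypothetical, not from ICSpert.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)
    pragmatic_subject: Optional[str] = None  # constituent known to come first

def subject_and_predicate(group: Node, focus: str):
    """The focused constituent is the psychological subject; its siblings,
    which share the same superordinate grouping element, form the predicate."""
    predicate = [c.name for c in group.children if c.name != focus]
    return focus, predicate

# The 'two sets of four' email task structure (cf. Figure 17):
establish = Node("establish", [Node(n) for n in
                               ("display", "locate", "index", "identify")])
actions = Node("actions", [Node(n) for n in
                           ("stamp", "confirm", "store", "dispatch")])
ordering = Node("ordering", [establish, actions], pragmatic_subject="establish")

# Moving 'down' from 'ordering', the pragmatic subject automatically
# becomes the next psychological subject, with 'actions' as its predicate.
print(subject_and_predicate(ordering, ordering.pragmatic_subject))
# -> ('establish', ['actions'])

# Within 'establish' there is no pragmatic subject, so the next subject
# is ambiguous among four equally likely constituents.
print(establish.pragmatic_subject is None)  # -> True
```

The point of the sketch is that the pragmatic-subject attribute belongs to the grouping element, so a move ‘down’ is only unambiguous when the group supplies one.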

Figure 17: Structure Diagrams for two interfaces with identical task steps but different structures.

Structure Diagrams such as that illustrated in Figure 17 have proven straightforward for designers to understand, since they are familiar with hierarchical decompositions. The training materials start off by discussing an object-based structure, and then move on to propositionally based task structures, to introduce the idea that representations of different cognitive content can be dealt with in the same way. TPDs such as that shown in Figure 18 evolved from the need to introduce a temporal dimension into the representation. It was necessary to include within each row of the diagram all the elements that the focus of processing could move to subsequently: the predicate elements, the superstructural elements, and the constituent elements of the psychological subject. The TPD in Figure 18 shows how the active representation changes over time as a user correctly carries out the start of the ‘two sets of four’ task structure. The focus of processing at each step is now indicated by the highlighted box, and the predicate elements are shown immediately to its right, in an ellipse bounded by the same rectangle to emphasise that they are all basic units at the same level of decomposition. In an ellipse to the left is the superordinate group that the subject and predicate belong to, and to the far right is an ellipse containing the constituent structure of the subject. The user starts with the focus on the overall task element, and then moves down into its constituent structure (a move ‘down’ is indicated by the inverted-U between the lines).
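The contents of one TPD row can be expressed compactly. The sketch below is our own hypothetical encoding (a dictionary of `node: (parent, children)` pairs), not a notation from the paper; it simply gathers, for a given focus, everything a row must display.

```python
# Illustrative sketch only: one TPD row lists every place the focus of
# processing can move to next. The encoding and names are hypothetical.
def tpd_row(structure, focus):
    parent, constituents = structure[focus]
    siblings = [] if parent is None else structure[parent][1]
    return {
        "superordinate": parent,                           # move 'up'
        "subject": focus,                                  # current focus
        "predicate": [s for s in siblings if s != focus],  # moves 'across'
        "constituents": constituents,                      # moves 'down'
    }

# The 'two sets of four' structure as {node: (parent, children)}:
email = {
    "ordering":  (None, ["establish", "actions"]),
    "establish": ("ordering", ["display", "locate", "index", "identify"]),
    "actions":   ("ordering", ["stamp", "confirm", "store", "dispatch"]),
}

row = tpd_row(email, "establish")
print(row["superordinate"], row["predicate"], row["constituents"])
# -> ordering ['actions'] ['display', 'locate', 'index', 'identify']
```

Each successive row of a TPD is then just `tpd_row` applied to the new focus after a transition.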

Figure 18: A Transition Path Diagram for the ‘two sets of four’ structure, showing the temporal sequence of the focus of processing for the first two operations.

Since the ‘establish’ element is a pragmatic subject, it becomes the new focus of processing directly. This is reflected by the ambiguity value of ‘1’ for the transition, shown at the left of the figure. The next change in the focus is a move into the constituent structure of the ‘establish’ group, to focus on the four individual task steps. At this point, the designer constructing the TPD can note that there is no pragmatic subject, and so there is ‘ambiguity’ about the transition. Since there are four constituent elements, all equally likely to be chosen, the ambiguity value of this transition is 4. Once this element has been dealt with, there are still three predicate elements to choose between for the next task step, and so the ambiguity of the next step is 3. The complete diagram would continue with each of the four establish actions being represented as the subject of a row, followed by a transition ‘up’ the structure to their parent element, and a transition ‘across’ to the ‘action’ element. The four ‘action’ operations would then have to be processed in the same way as the ‘establish’ actions, with corresponding ambiguity values. In Figure 19 the initial section of the TPD for the ‘four pairs’ option is shown. The first subject is again the overall task of ‘ordering’ something, but its constituent elements are now the four pairs, instead of the two sets. The first transition is to one of these, and since they have no pragmatic subject, this has an ambiguity value of 4. The next step, into the pair’s constituent structure, does have a pragmatic subject, since one of its elements is an ‘establish’ action, while the other is the related ‘action’ operation. Both of these within-pair transitions are unambiguous, and have the value 1, as does the fifth transition, back up to the pairing element. The next transition, to another pair of operations, would have a value of 3, and so on.
The overall ambiguity of a TPD is given simply by multiplying all of these ambiguity values together, to give a metric whose reciprocal is the probability that someone would execute the operations in the correct sequence if they had no knowledge of the ordering of the items, but were guided entirely by the pragmatic constraints of the design. The full TPD for the ‘two sets of four’ option has the value 4×3×2×4×3×2 = 576, since there is ambiguity within each set of four operations, while the TPD for the ‘four pairs’ option has the ambiguity value 4×3×2 = 24, since the ambiguity resides between the pairs, but not within them.
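The arithmetic behind these two totals can be written out as a minimal sketch. This is our own formulation of the rule, not code from ICSpert; the function name is hypothetical.

```python
# Minimal sketch of the ambiguity metric (our own formulation, not ICSpert
# code): an unordered choice among n siblings contributes a factor of n,
# then n-1, and so on down to 2 (the last item is forced); a pragmatic
# subject fixes the order and contributes a factor of 1.
from math import prod

def sequencing_ambiguity(n_items: int, has_pragmatic_subject: bool) -> int:
    if has_pragmatic_subject or n_items < 2:
        return 1
    return prod(range(2, n_items + 1))  # n * (n-1) * ... * 2

# 'Two sets of four': the order of the two sets is pragmatically fixed,
# but neither set of four constrains the order of its members.
two_sets_of_four = (sequencing_ambiguity(2, True)
                    * sequencing_ambiguity(4, False) ** 2)

# 'Four pairs': no constraint among the four pairs, but within each pair
# the 'establish' command is a pragmatic subject.
four_pairs = (sequencing_ambiguity(4, False)
              * sequencing_ambiguity(2, True) ** 4)

print(two_sets_of_four, four_pairs)  # -> 576 24
```

The sketch makes the comparison direct: the ‘four pairs’ grouping converts eight within-set choices into pragmatically fixed ones, leaving only the between-pair ambiguity.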


Figure 19: A Transition Path Diagram for the ‘four pairs’ structure, showing the temporal sequence of the focus of processing for the first two operations.

9.2: Comparison with CTMs

In this simple description, an assessment of ‘ambiguity’ will, it is hoped, have been made by the designer as a consequence of constructing the TPD, but it is not forced upon them in the way that it would be in a consultation with ICSpert. If they do make it, however, they can see directly where it has come from, and what has to be done about it in terms of the elements of the interface structure, rather than having to infer it from a textual output. These representational techniques are tools that can help designers analyse their designs, rather than tools that will build a representation and do the analysis for them. On the other hand, they can be much quicker for the designer to work through, and are not limited by the need to complete every step of the process, as an ICSpert consultation is. Most helpfully, though, repeated use of the techniques allows the designer to learn about the ICS framework, and to become practised in conceiving of the interface in terms of the user’s cognitive tasks. It should be clear that, since both representational forms are derived from the same theory, ICS, they share the same underlying constructs of ‘item uncertainty’ and ‘order uncertainty’. The major difference is that ICSpert takes the person consulting it through the process step by step, automating the identification of the elements of the interface and the description of the relationships between them, and then using ICS to determine which aspects of the interface need to be assessed in terms of ambiguity, while the diagrammatic notations require the designer to do all of this themselves, and just provide notational support.

9.3: Applicability of Diagrammatic Notations

As with the CTM representation, the diagrammatic notations are generalisable to all of the levels of mental representation defined within ICS. The electronic mail task structure example is at the propositional level of representation, while the Hypertext illustrated in Figure 20 is at the object level. This figure shows two versions of an online guide to York (after Hammond & Allinson, 1988), which differ only in the presence or absence of a navigation button that gives access to an alphabetical index to the hypertext. While this was intended by the designers to be a minimal design change that would leave the pattern of use of the rest of the interface unchanged, the respective TPDs show that the most frequent navigation operation, returning to the previous screen, is altered substantially. In the full version, this navigation operation required the user to focus on the Map button, since it is the leftmost element of the array of buttons, and hence the pragmatic subject of an object representation for people experienced in reading from left to right. The Map button gave access to a graphical overview of the Hypertext, and the designers found that removing the Index button had the side effect of reducing users’ use of and knowledge about the availability of the Map. The TPD for the index-less version shows why: users no longer have to attend to this button in their primary navigation tasks of going ‘back one’ and going ‘home’. They therefore fail to experience the Map button in the context of navigation.


Figure 20: Two versions of a Hypertext with different navigation buttons (after Hammond & Allinson, 1988), and the respective TPDs for the selection of the ‘back-one’ button.

Hypertext applications such as that illustrated in Figure 20 are difficult for conventional modelling techniques to represent, due to the different types of information that they can contain (text, graphical objects, moving graphics, and sound) and the fact that the screen frequently changes dynamically, with and without the user executing any operations. The ability of the diagrammatic notations to deal with multiple levels of cognitive representation, simultaneously, provides a way to handle all of these complexities, and opens the way to the representation of multimodal interaction. An example of a dynamic screen change is shown in Figure 21, where a user clicks on Baden-Baden in a small scale map of a region of south-western Germany, and the screen changes to show a more detailed, larger scale map of the town. The change in the user’s object representation in this case has been caused externally, by a change in the interface rather than internally by a change in their propositional task representation. It is essential, therefore, that the new object representation is coherent with the existing propositional task structure, or else the user will be provided with an additional task of locating an appropriate object within the display, or of reformulating their task structure. The TPD in Figure 21 shows that the designers have met this criterion by making it likely that the object element the user was focusing on prior to the screen change (i.e., the element that they operated upon) is replaced by a similar, task-related element in the new screen. The relationship between the task the user is performing and the screen display is mapped by the relationship between the propositional and object levels. Although we have only shown here TPDs for propositional and object representations, the notations can be applied to all levels of representation within ICS. 
Sound can be represented as basic units of sound or speech in morphonolexical code, derived from external acoustic information that the user hears (in the case of interfaces that produce sound) or from internal propositional information that the user produces (in the case of interfaces that require the user to speak, such as telephone banking systems). The notations are of the same form and follow the same transition rules, reflecting the uniform way that ICS treats the operation of the cognitive subsystems. Changes in sound structures made by the interface (which would, in current interfaces, normally be the production of a brief alert sound) can be useful in supporting the execution of a task sequence only if they are produced at appropriate points in the transitions through the propositional structure, just as screen changes need to be. When they occur at other points, they are likely to orient the user away from the task, or to be missed or unrecognised. Sounds that the user is required to produce will also depend upon the propositional task structure, and so users are unlikely to be able to execute different operations simultaneously in two modalities, to take an extreme example.


Figure 21: Dynamic changes in an interface result in externally driven changes in an object representation, as shown in the TPD.

10: Conclusions

Representations can be judged on two criteria: the completeness with which they represent information, and the facility with which they enable completion of the tasks they are supporting. The representations currently used within HCI support tasks of theoretical relevance more than practical design tasks, and so while they have been successful within the theoretical community, they have had little influence on design. The representations we have described in this paper are specifically aimed at application rather than theoretical use. The two approaches differ in the completeness with which they seek to represent an interaction, and in the degree of support they provide in design evaluation and problem resolution. The more complete representation, Cognitive Task Models (CTMs), can be constructed by a skilled theorist or, in principle, by an automated tool; the less complete representations, Structure and Transition Path Diagrams, are intended as a notational form to support rapid, small-scale problem identification and resolution. The CTM representation describes the nature of cognitive activity underlying the performance of complex tasks. The process of building them can be relatively automatic, given the principles and some domain knowledge. The ICSpert system ‘knows’ what kinds of configurations are associated with particular phases of cognitive activity; it ‘knows’ something about the conditions under which knowledge becomes proceduralised, and it ‘knows’ the properties of memory records that might support recall and inference in complex task environments. It also ‘knows’ something about the theoretical interdependencies between these factors in determining the overall patterning, complexity and qualities of the co-ordination and dynamic control of cognitive activity.
An abstract representation of cognitive activity can be constructed in terms of a four component model specifying attributes of configurations, procedural knowledge, record contents and dynamic control. Finally, in order to produce an output that can be used in the design process, ICSpert ‘knows’ something about the relationships between these representations and the attributes of user behaviour. As an applications representation, ICSpert is very different from cognitive walkthroughs (Polson et al., 1992) or programmable user models (Young, Green & Simon, 1989). Like PUMs, the actual tool embodies explicit theory drawn from the science base, and the underlying architectural concept enables a broad range of issues to be addressed. Unlike PUMs, it more directly addresses resources across perception, cognition and action. It also applies a different trade-off in when and by whom the modelling knowledge is specified. While ICSpert must contain a complete set of rules for mapping between the world and the model, this does not mean that the expert system must necessarily ‘know’ each and every detail, nor need

the details be very exact. Rather, within some defined scope, the complete chain of assumptions from artifact to theory and from theory to behaviour must be made explicit at an appropriate level of approximation. Some mappings will be more detailed, and more specific, than others; but those that are not well specified may be those that are rare or unlikely to be of relevance to the domain of application. Equally, the input and output rules must obviously be grounded in the domain of the inquiry, which here is the language of interface description and user-system interaction. Although some of the assumptions may be heuristic, and many of them may need crafting, both theoretical and craft components are there. The modelling knowledge is laid out for inspection in the content of the rules. However, at the point of use, the expert system has some properties more closely akin to those associated with the cognitive walkthrough. Both require considerably less precision than PUMs in the specification and operationalisation of the knowledge required to use the application being considered, and require the consultant or analyst to use their own knowledge to identify entities within the design, and to assess their interrelationships. As with walkthroughs, ICSpert can build a family of models very quickly and without its user necessarily acquiring any great level of expertise in the underlying cognitive theory. In this way, it is possible for that user to explore models for alternative system designs over the course of something like one afternoon. Since the system is modular, and the models are specified in abstract terms, it is possible in principle to tailor ICSpert’s input and output rules without modifying the core theoretical reasoning. The development of the tool could then respond to requirements that might emerge from empirical studies of the real needs of design teams or of particular application domains.
In contrast, the Diagrammatic representational techniques are more like a heuristic evaluation technique, in which the analyst is invited to consider various aspects of their design, and to ask themselves whether they really are adequate. The most important difference is that the relationship between the heuristics and the underlying theory, ICS, is made explicit, and the users of the techniques have to develop a degree of conceptual knowledge about ICS in order to carry out the analysis. Through using the techniques, it is intended that practitioners learn to adopt a ‘user centred approach’, and one which is, moreover, explicitly grounded in principled cognitive theory. Guidelines, checklists and general heuristics are simultaneously too abstract to teach much about specific design situations and too concrete to teach much about theoretical issues which can be transferred across domains or situations. An added benefit is that the Structure Diagrams and Transition Path Diagrams provide a systematic way of recording the process of the evaluation, aiding both its execution and its communication to others. The CTM representation constructed by ICSpert can also be recorded so that the implications of different design options can be compared in terms of specific cognitive consequences. Whether as a form of Design Rationale (MacLean et al 1991) or as a form of design advocacy, such representational techniques can help designers to convey their arguments and the reason for their designs to others, who may be the ones in the organisation who actually have the power to make design decisions. In Section 2, we outlined three main requirements that applied theories need to meet if they are to fulfil the needs of design practitioners, and still support the development of core cognitive theory. ICS concentrates on different levels of mental representation and on the processes that transform them, rather than task and paradigm specific concepts. 
This enables it to be applied across a very broad range of settings, within and beyond HCI. For the purposes of constructing a representation to bridge between theorists and practitioners, ICS provides explicit, yet approximate, representations of cognitive activity which are systematically derived and specified. Its modular form allows this to be done with economy, in the use of a small number of general principles which apply across the range of human cognitive faculties. In this paper we have described two forms of representation derived from ICS which can be applied in system development and design, and which both benefit from the conceptual economy of ICS. A relatively small number of rules support the instantiation of the CTM attribute spaces, and these can be automated within a production rule system. The diagrammatic notations can be applied across tasks, visual interface and sound interface issues, and can handle static and dynamic situations without needing ad hoc modifications. It is unlikely to be possible to develop a single representational framework for HCI that can satisfy all needs, since the situations in which design support is required vary so widely. Different representations must be developed if theoretical analysis is to have any beneficial impact upon design practice, but equally importantly, practitioners cannot be expected to learn a multiplicity of theories and to learn how to identify exactly which theory’s representational form is applicable to their current problem; nor can they fairly be

asked to solve the problems of translating between different representations that they might have had to use at different points within a single design project. These two conflicting demands of breadth of representational form and simplicity of theory require a single theoretical framework. In this paper we have tried to show how such a theory can be used to derive different representational forms, to minimise the conceptual burden on practitioners of the techniques we want them to adopt. The representational techniques presented in this paper represent opposite ends of the spectrum in terms of the knowledge required for their use, in the support that they offer practitioners, and in the situations for which they are appropriate. Both evolved to meet specific demands that we ‘transfer’ the modelling potential of ICS to different audiences, for use in different ways. Other representations could presumably be developed for different purposes, given an accurate identification of the contexts within which they should be used. So far, the CTMs and Diagrammatic Notations have been used mainly within research projects and by HCI students, but they have been used to represent problems from experimental situations, core HCI scenarios, and ‘real world’ design projects. They share the breadth of scope and abstraction that we have identified as primary requirements, and their parent theory meets the requirement for a unified theory, supporting transfer of knowledge across domains of application and from older to newer technologies, and supporting feedback between the domain of application and the domain of theory. It is essential that the communication between theory and practice in HCI operate in both directions if the discipline is to survive and evolve as an applied, practical science.

References Anderson, J.R. & Skwarecki, E. (1986) The automated tutoring of introductory computer programming. Communications of the ACM, 29, 842-849. Barnard, P.J. (1985). Interacting cognitive subsystems: A psycholinguistic approach to short term memory. In A. Ellis, (Ed.), Progress in the psychology of language (pp. 197-258). London: Lawrence Erlbaum Associates. Barnard, P.J. (1987) Cognitive resources and the learning of dialogues. In J. M. Carroll (Ed) Interfacing Thought (pp. 112-128) Cambridge, Mass: MIT Barnard, P (In Press). Interacting Cognitive Subsystems: modelling working memory phenomena within a multi-processor architecture. In Miyake, A. & Shah, P. (Eds.), Models of Working Memory: Mechanisms of active maintenance and executive control. New York: Cambridge University Press. Barnard, P.J., Bernsen, N.O., Coutaz, J., Darzentas, J., Faconti, G., Hammond, N. H., Harrison, M.D., Jørgensen, A.H, Löwgren, J., May, J., Young, R.M. (1995) Assaying Means of Design Expression for Users and Sysytems. AMODEUS-2 Project Final Report, D13, CEC Brussels, pp43. Barnard, P.J., Blandford, A.E & May J. (1992) Documentation accompanying Amodeus Deliverable 19, Brussels: CEC DG XIII Barnard P.J, Coutaz, J., Hammond, N., Harrison, M., Jørgensen A, MacLean, A. & Young, R.M. (1992) AMODEUS: Final Report, D23, CEC, Brussels. Barnard, P., Grudin, J. and MacLean, A. (1989). Developing a science base for the naming of computer commands. In J.B. Long and A. Whitefield (Eds.), Cognitive Ergonomics and Human Computer Interaction. (pp. 95-133) Cambridge: Cambridge University Press. Barnard, P.J., MacLean, A. & Hammond, N.V. (1984) User representations of ordered sequences of command operations. Proceedings of Interact ‘84: First IFIP Conference on Human-Computer Interaction,. 434-438, London: IEE, Barnard, P.J. & May, J. (1993) Cognitive Modelling for User Requirements. In P.F. Byerley, P.J. Barnard & J. 
May (eds) Computers, Communication and Usability: Design issues, research and methods for integrated services. (pp. 101-146). Amsterdam : Elsevier

Page 37

Representing Cognitive Activity Barnard, P.J. & May, J. (1995) Interactions with Advanced Graphical Interfaces and the Deployment of Latent Human Knowledge. In F. Paterno’ (ed) Eurographics Workshop on the Design, Specification and Verification of Interactive Systems. (pp. 15-48) Berlin: Springer Verlag. Barnard, P. J., May, J. & Salber, D. (1996) Deixis and Points of View in Media Spaces: an empirical gesture, Behaviour and Information Technology, 15, 37-50. Barnard, P.J. & Teasdale, J. (1991) Interacting Cognitive Subsystems: A systemic approach to cognitive affective interaction and change. Cognition and Emotion, 5 , 1-39. Barnard, P.J., Wilson, M. & MacLean, A. (1987) Approximate modelling of cognitive activity with an expert system: a strategy for the development of an interactive design tool.Proceedings of CHI+GI ‘87, 21-26, New York: ACM. Barnard, P.J., Wilson, M. & MacLean, A. (1988). Approximate modelling of cognitive activity with an Expert system: A theory based strategy for developing an interactive design tool. The Computer Journal, 31, 445-456. Bellotti, V., Blandford, A., Duke, D., Maclean, A., May, J., & Nigay, L. (1996). Interpersonal access control in computer-mediated communications: A systematic analysis of the design space. Human Computer Interaction, 11, 357-432. Card, S.K. & Henderson, D.A. (1987) A multiple virtual workspace interface to support user task switching. Proceedings of CHI+GI ‘87, . 53-59, New York: ACM Card, S.K., Moran T.P. & Newell, A. (1983) The psychology of human computer interaction Hillsdale, NJ: Lawrence Erlbaum Carroll, J.M. & Campbell, R.L. (1986) Artifacts as psychological theories: The case of human-computer interaction. Behaviour and Information Technology, 8, 247-256 Carroll, J.M., Kellog, W.A. & Rosson, M.B. (1991) The Task-Artifact Cycle. In J.M. Carroll (ed.) Designing Interaction, (pp.174-102) Cambridge: CUP Duke, D.J. (1995) Reasoning about Gestural interaction. Computer Graphics Forum, 14, 55-66. 
Duke, D.J., Barnard, P.J., Duce, D.A. & May, J. (1995) Systematic development of the human interface. APSEC'95: Second Asia-Pacific Software Engineering Conference, 313-321. IEEE Computer Society Press. Duke, D.J., Barnard, P.J., Duce, D.A. & May, J. (in press) Syndetic Modelling. Human Computer Interaction.. Green, A.J.K. and Barnard, P.J. (1990). Iconic interfacing: The role of icon distinctiveness and fixed or variable screen location Human-Computer Interaction - INTERACT '90. 457-462. Amsterdam: Elsevier Science Publishers, B.V.,. Hammond, N. & Allinson, L. (1988) Travels around a learning support environment Proceedings of CHI ‘88, 269-273 New York: ACM. Jørgensen, A. & May, J. (1997) Evaluation of a Theory-Based Display Guide. Proceedings of HCI International 97. 403-406 Amsterdam: Elsevier. Kieras, D.E., Meyer, D.E., Mueller, S. & Seymour, T. (In Press). Insights into Working Memory from the Perspective of the EPIC Architecture for Modelling Skilled Perceptuo-Motor and Cognitive Human Performance. In Miyake, A. & Shah, P. (Eds.), Models of Working Memory: Mechanisms of active maintenance and executive control. New York: Cambridge University Press.


Barnard & May (1999) HCI 14

Kieras, D.E. & Polson, P.G. (1985) An approach to the formal analysis of user complexity. International Journal of Man-Machine Studies, 22, 365-394.
Landauer, T.K. (1987) Relations between cognitive psychology and computer systems design. In J.M. Carroll (Ed.) Interfacing Thought (pp. 1-25). Cambridge, MA: MIT Press.
Landauer, T.K. (1995) The Trouble with Computers. Cambridge, MA: MIT Press.
Lim, K.Y. & Long, J.B. (1994) The MUSE Method for Usability Engineering. Cambridge: Cambridge University Press.
Long, J.B. & Dowell, J. (1989) Conceptions of the discipline of HCI: Craft, applied science and engineering. In A. Sutcliffe & L. Macaulay (Eds.), People and Computers V (pp. 9-32). Cambridge: Cambridge University Press.
Lovett, M.C., Reder, L.M. & Lebiere, C. (in press) Modelling working memory in a unified architecture: An ACT-R perspective. In A. Miyake & P. Shah (Eds.), Models of Working Memory: Mechanisms of active maintenance and executive control. New York: Cambridge University Press.
MacLean, A., Young, R., Bellotti, V. & Moran, T. (1991) Questions, Options, and Criteria: elements of design space analysis. Human Computer Interaction, 6, 201-250.
May, J. & Barnard, P.J. (1997) Modelling multimodal interaction: A theory-based technique for design analysis and support. Human-Computer Interaction: INTERACT '97, 667-668. London: Chapman & Hall.
May, J. & Barnard, P.J. (1995a) Cinematography and interface design. Human-Computer Interaction: INTERACT '95, 26-31. London: Chapman & Hall.
May, J. & Barnard, P.J. (1995b) Towards supportive evaluation during design. Interacting with Computers, 7, 115-143.
May, J., Barnard, P.J. & Blandford, A. (1993) Using structural descriptions of interfaces to automate the modelling of user cognition. User Modeling and User-Adapted Interaction, 3, 27-64.
May, J., Scott, S. & Barnard, P. (1995) Structuring Displays: A Psychological Guide. Eurographics Tutorial Notes PS95 TN4, ISSN 1017-4656. Geneva: European Association for Computer Graphics.
May, J., Tweedie, L. & Barnard, P. (1993) Modelling user performance in visually based interactions. In J.L. Alty, D. Diaper & S. Guest (Eds.), People and Computers VIII: Proceedings of the HCI '93 Conference (pp. 95-110). Cambridge: Cambridge University Press.
Newell, A. & Card, S.K. (1985) The prospects for psychological science in human-computer interaction. Human Computer Interaction, 1, 209-242.
Nielsen, J. (1993) Usability Engineering. London: Academic Press.
Payne, S. & Green, T. (1986) Task-action grammars: A model of the mental representation of task languages. Human-Computer Interaction, 2, 93-133.
Polson, P.G., Lewis, C., Rieman, J. & Wharton, C. (1992) Cognitive walkthroughs: a methodology for theory-based evaluation of user interfaces. International Journal of Man-Machine Studies, 36, 741-773.
Reisner, P. (1982) Further developments towards using formal grammar as a design tool. Human Factors in Computer Systems, 304-308. New York: ACM.


Shum, S. & Hammond, N. (1993) Analysis of the expert system modeller as a vehicle for ICS encapsulation. Amodeus-2 project document TA/WP5: ftp://ftp.mrc-apu.cam.ac.uk/pub/amodeus/assay/ta_wp05.rtf
Teasdale, J. & Barnard, P.J. (1993) Affect, Cognition and Change: Re-modelling depressive thought. Hove: Lawrence Erlbaum Associates.
Weizenbaum, J. (1966) ELIZA - a computer program for the study of natural language communication between man and machine. Communications of the ACM, 9, 36-44.
Whiteside, J. & Wixon, D. (1987) Improving human-computer interaction: A quest for cognitive science. In J.M. Carroll (Ed.) Interfacing Thought (pp. 353-365). Cambridge, MA: MIT Press.
Young, R.M. & Barnard, P.J. (1987) The use of scenarios in human-computer interaction: turbocharging the tortoise of cumulative science. Proceedings of CHI+GI '87, 291-296. New York: ACM.
Young, R.M., Green, T.R.G. & Simon, T. (1989) Programmable user models for predictive evaluation of interface designs. Proceedings of CHI '89, 15-19. New York: ACM.
Young, R.M. & Lewis, R.L. (in press) The SOAR cognitive architecture and human working memory. In A. Miyake & P. Shah (Eds.), Models of Working Memory: Mechanisms of active maintenance and executive control. New York: Cambridge University Press.

