Using Structural Descriptions of Interfaces to Automate the Modelling of User Cognition

Jon May, Philip J. Barnard, Ann Blandford
Medical Research Council Applied Psychology Unit, 15 Chaucer Road, Cambridge, UK, CB2 2EF

To appear in: User Modelling and Adaptive User Interfaces, 3(1), 1993

Abstract

One approach to user modelling (Barnard et al., 1988) involves building approximate descriptions of the cognitive activity underlying task performance in human-computer interactions. This approach does not aim to simulate exactly what is going on in the user's head, but to capture the salient features of their cognitive processing. The technique requires several sets of production rules. One set maps from a real-world description of an interface design to an internal theoretical description. Other rules elaborate the theoretical description, while further rules map from the theoretical description to properties of user behaviour. This paper is concerned primarily with the first type of rule, for mapping from interface descriptions to theoretical description of cognitive activity. Here we show how structural descriptions of interface designs can be used to model user tasks, visual interface objects and screen layouts. Included in our treatment are some indications of how properties of cognitive activity and their behavioural consequences can be inferred from such structural descriptions. An expert system implementation of the modelling technique has been developed, and its structure is described, together with some examples of its use in the evaluation of HCI design scenarios.

Keywords: cognition, usability, interface, HCI, design, task structure, icons, screen layout, expert systems

Acknowledgements: The work reported in this paper was carried out as part of the ESPRIT Basic Research Action 3066, ‘Amodeus—Assimilating Models of Designers, Users and systems’, supported by funding from the Commission of the European Communities. We would like to thank an anonymous referee whose helpful comments on an earlier version of this paper enabled us to improve its structure and clarity. ‘This paper has not been submitted elsewhere in identical or similar form, nor will it be during the first three months after its submission to UMUAI.’


1. Introduction

Although it is generally accepted that human factors research should be taken into account in the design of interactive computer systems, designers face several difficulties in obtaining appropriate usability information, in relating it to their immediate questions, and in applying its recommendations (Eason and Harker, 1991). As a consequence, rapid prototyping and user trialling still provide the main source of information about users' responses to new designs. Unfortunately these methods can be costly and resource intensive and, as they require some form of prototype system, they are often not used until it is too late in the design process to challenge major design decisions.

The application of cognitive psychology to human computer interaction provides the possibility that objective information about user behaviour can be provided at an early stage in design, without the need to produce mockup systems or to sample representative groups of users. Unfortunately, as Gardiner and Christie (1987) note, 'trained cognitive psychologists with experience in information technology are few and far between in industry and are certainly not available to most product designers 'on tap' to contribute to the design of specific products under development', and they advocate the development of software design tools which enable practical use to be made of the results of cognitive psychology research in the design process.

In this paper we describe the development of such a system. Constructed as a set of expert system knowledgebases, it contains approximately four hundred separate rules, and has been produced as a 'proof of concept' demonstration in the course of the Esprit Basic Research Action 3066, 'Amodeus'. Unlike earlier human factors support tools, which sought to assist designers by basing their output upon a fixed set of 'guidelines', this system automates the process by which a cognitive psychologist analyses a design, constructing a model of the cognition of a user interacting with the proposed design, and basing its output upon that model. The advantage of this approach is that the support that it gives is not limited to design situations that it 'already knows about' through its guidelines. Any design which can be described to the expert system can be modelled. We present a way of describing the aspects of the interface which are relevant to the modelling process in a structural form, which can be used directly by an expert system that has no real world or semantic knowledge.

As with the application of any body of theoretical knowledge to real-world problems, the task of modelling user cognition requires several distinct steps to be performed. The first involves the description of the real-world problem in a way which makes it amenable to theoretical treatment, by representing the relevant aspects of the problem in an abstract way, and omitting aspects which do not affect usability. Once this has been done, theoretical knowledge can be applied to model the user's cognition. Finally, the results of this modelling must be mapped back to the specific design situation to provide appropriate reports and advice. A fourth step, not explicitly part of the modelling process but of crucial importance in design, is the use of this information to modify and improve the design, before perhaps repeating the cycle until a satisfactory result is achieved.

[Figure 1 schematic: interface -(1)-> theoretical description -(2)-> cognitive model -(3)-> design recommendations -(4)-> interface]
Figure 1: Steps in modelling user cognition to provide design advice

These steps (figure 1) can be compared with those identified by Long (1989), who characterises HCI science support as a process of analysing the real world in terms of an acquisition representation, generalising from this to a science knowledge representation to produce scientific theories and engineering principles, which can be particularised back via an application representation to allow synthesis with the real world.


In the following three sections we present three examples of interface design questions which, despite addressing different aspects of their respective designs, show that a single form of acquisition representation can support the generalisation to science knowledge. This acquisition representation is based upon the structure of the mental information that must be transformed by the user's cognitive resources in the interaction with the interface. It is the ease with which these structures can be parsed and transformed into other mental representations that determines the ease or complexity of the cognition that is required to use the design which is being examined. Furthermore, since it is the structure which is important, and not the content of the structure, this approach to deriving the acquisition representation means that it can be used across a wide range of design questions.

2. Modelling of task structures

To show how an acquisition representation that is based upon structural description of an interface can enable the modelling of tasks, consider the example of an email system such as that described in Barnard, MacLean and Hammond (1984). This system contained twelve commands, six of which allowed users to 'establish' information associated with an email message, and six of which let them perform some 'action' upon it. In one experimental condition, subjects had to perform four 'establish' operations in each trial followed by four 'action' operations. Another group of subjects used the same system, but had to enter commands grouped as four pairs, each containing one 'establish' and one 'action' operation (see figure 2). Both groups of subjects learnt the same conceptual structure of commands, as a set of six establish commands and a set of six action commands, together with the implicit constraint that establish commands had to be performed before their respective action commands (since information could not be acted upon until it had been established). The two groups differed, however, in the task structures that they learnt.

[Figure 2 task structures. Condition 1 (2 x 4): establish (author, cost, index, identify) followed by action (append, enter, store, dispatch). Condition 2 (4 x 2): reference (author, append), price (cost, enter), file (index, store), send (identify, dispatch).]
Figure 2: Example task structures from the two conditions used in the e-mail task of Barnard et al (1984)

In the first condition a user's task consisted of two main elements, first to 'establish' the information and then to perform 'actions' upon it. Since there was a logical constraint that the 'establish' tasks must be performed before the 'action' tasks, the order of these two elements was unlikely to be confused. In performing the 'establish' tasks, a user learnt that there were four things to do, although these could vary from trial to trial. Similarly, while performing the 'actions', a user knew that they had four tasks to carry out. In each of these cases, however, there were no obvious identity or sequencing cues within the four sub-tasks, and so errors could occur in the order in which subjects attempted to carry them out within the sub-groups, but not between them.


In the second condition, the task was broken down into four elements, each of which had two constituents, one establish and one action command. Now the logical constraint upon the task sequence was at the lowest level, and so there was no confusion about which of the two tasks within each pair had to be performed first. However, there were no cues about identity or sequencing at the middle level, and so errors could occur here (users attempting to perform the 'file' pair before the 'price' pair, for example). This explanation, based upon the ambiguities in the users' respective task structures, is consistent with the pattern of errors made by the subjects in the study. The explanation is derived from features of the structures themselves, and not from the content or meaning of the nodes from which it is built. One way of representing these two task structures is as 'facts' where an element of the task is stated to consist of a set of subelements:

condition_1    consists_of    establish : action
establish      consists_of    author : cost : index : identify
action         consists_of    append : enter : store : dispatch

condition_2    consists_of    reference : price : file : send
reference      consists_of    author : append
price          consists_of    cost : enter
file           consists_of    index : store
send           consists_of    identify : dispatch

If these facts are presented to an expert system by a designer who is consulting it, the system has only to ask if there are any pragmatic constraints governing the order of the subelements within each element, and it can then calculate the remaining number of permutations of the subelements in order to derive a measure of the overall 'ambiguity' of the structure. It can also highlight points within the structure where this ambiguity is most likely to cause local confusion and hence to increase the likelihood of transposition errors. Although some additional assumptions are required, it is clear that a structural description forms the foundation of the analysis and explanation.

In the division of labour between the design consultant and the expert system modeller, the modeller can competently perform symbolic processing upon the structure, but any semantic information has to be provided by the consultant. Thus the modeller can count the number of elements and recognise symmetry, common shapes of substructures, repetition of elements, and so on, but the consultant has to provide all of the semantic information about sequences, relationships and groupings that require 'real-world' knowledge of the content of the nodes in the structure. When the modeller identifies a feature of the structure, such as the fact that 'condition_2' consists of more than two elements, it cannot reason about their content, and so must ask the consultant a question, for instance:

"Is there an obvious reason why the 4 elements in condition_2 should be performed in the sequence given? Choose one of: yes no"

This question is phrased appropriately for a structure dealing with task representations rather than, say, screen layout. However, the underlying modelling that identified a possible ambiguity in the structural sequencing would be common to any representational structure, and the answer to this question would set an identifier to mark the presence of ambiguity in the structure per se rather than one marking, say, ambiguity in task decomposition. This allows the same knowledgebases and rules to be reused in different semantic domains.
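To make the symbolic processing concrete, the following is a minimal sketch, in Python, of how a modeller might derive the order-uncertainty measure from the consists_of facts above. It is our illustration rather than the Xi+ implementation described in section 6; the function names are ours, and the constrained set stands in for the consultant's answers to the sequencing questions.

from math import factorial

# 'consists_of' facts for the two e-mail task structures described above.
structures = {
    "condition_1": ["establish", "action"],
    "establish":   ["author", "cost", "index", "identify"],
    "action":      ["append", "enter", "store", "dispatch"],
    "condition_2": ["reference", "price", "file", "send"],
    "reference":   ["author", "append"],
    "price":       ["cost", "enter"],
    "file":        ["index", "store"],
    "send":        ["identify", "dispatch"],
}

# Elements whose sub-elements are pragmatically constrained to a fixed order
# (the consultant answered 'yes' to the sequencing question for these nodes).
constrained = {"condition_1", "reference", "price", "file", "send"}

def local_permutations(element):
    """Orderings left open at one node once pragmatic constraints apply."""
    subs = structures.get(element, [])
    if element in constrained or len(subs) < 2:
        return 1
    return factorial(len(subs))

def ambiguity(element):
    """Overall order uncertainty: product of local uncertainties in the tree."""
    total = local_permutations(element)
    for sub in structures.get(element, []):
        total *= ambiguity(sub)
    return total

print(ambiguity("condition_1"))  # 576 (4! x 4! within the two sub-groups)
print(ambiguity("condition_2"))  # 24  (4! orderings of the four pairs)

Run on these two structures, the counts mirror the explanation above: the residual orderings lie within the two sub-groups in condition 1, and between the four pairs in condition 2.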

3. Modelling visual interface objects

In the previous example ambiguity in the interface structures was due to the possibility of confusion in the order of elements within a task structure, or 'order uncertainty'. A second source of ambiguity in interface design is 'item uncertainty', where there is potential for confusion between the identities of elements. This can be seen in a study carried out by Arend, Muthig and Wandmacher (1987), and replicated by Green and Barnard (1990), which contrasted the efficiency with which two sets of icons were searched and a specified icon selected. The icons represented functions of a simple word-processor. One of the sets portrayed them representationally, using a document shape with lines and arrows to show the consequences of the action, while the other set was abstract, the shapes being comprehensible only once their context was understood (figure 3).


Twelve icons made up the complete set of commands, and Arend et al trained users to associate each command with an icon. Half of the users were taught the abstract icons, and half the representational set. The users were then tested by being given the name of a command, and asked to select the appropriate icon from the array, with the position of each icon within the array being varied from trial to trial. Users who had learnt the abstract icon set could find the target icon faster than those who had learnt the representational set. The global similarity of the representational icons was hypothesised to be responsible for this difference, since it meant that each icon had to be examined, while the degree of difference between the abstract icons meant that incorrect icons were easier to reject.

[Figure 3: two panels showing the twelve Representational Icons and the twelve Abstract Icons]

Figure 3: The two icon sets used by Green and Barnard (1990) (following Arend et al, 1987)

A structural analysis of the icons supports this hypothesis, but it also allows us to go further. Each icon can be thought of as having a structure which, as in the email example, consists of a series of elements, except that now the elements are the constituent parts of the visual image instead of being task steps. In searching the icon array, the user would have to compare the structure of the icon they are looking at (their 'candidate' icon), element by element, with the mental representation of the 'target' icon that they have generated from the name of the command. If a mismatch is found between the structures, then the candidate can be rejected and another icon can be located and assessed. Since the icons are arranged randomly within the array, the factor that governs the ease with which an icon can be found, and hence the ease with which a particular set of icons can be used, is the depth to which the structure of each candidate must be evaluated before it can be rejected.


Structure | others rejectable at 1st | 2nd | 3rd | 4th | group | search time (ms)
Word search: Circle : Line (solid) | 11 | | | | a | 739
Insert: Lines (horizontal, dotted) : Line (solid, horizontal) | 11 | | | | a | 950
Justify: Line (horizontal, mixed) : Line (solid, vertical) | 11 | | | | a | 1023
Delete Line: Lines (solid, oblique) : Line (dotted, horizontal) | 11 | | | | a | 1041
Word Ahead: Rectangle (vertical, dotted) : Triangle (right) | 10 | 1 | | | b | 1186
Word Back: Rectangle (vertical, dotted) : Triangle (left) | 10 | 1 | | | b | 1268
Store file: Rectangle (vertical, solid) : Line (oblique, solid) | 10 | 1 | | | b | 1310
Delete Word: Cross : Line (dotted, horizontal) | 10 | 0 | 1 | | c | 1063
Replace: {Cross : Line (horizontal, dotted)} : {Rectangle (vertical, solid) : Line (horizontal, dotted)} | 10 | 0 | 1 | | c | 1361
Scroll Ahead: Rectangle (horizontal, dotted) : Polygon (down) | 9 | 2 | | | d | 1188
Scroll Back: Rectangle (horizontal, dotted) : Polygon (up) | 9 | 2 | | | d | 1371
Move Line: Rectangle (horizontal, dotted) : Line (solid, horizontal) | 9 | 2 | | | d | 1572

Table 1: Structural details of Abstract icons used by Arend et al (1987)

The order of the elements within the structure of an icon depends upon the visual characteristics of the icon, following conventional perceptual strategies such as 'outside before inside', 'left before right' and 'top before bottom', with large, bold or otherwise salient elements being more likely to figure earlier in the structure than equally positioned but less salient elements. The abstract icon set, whose structural decompositions are listed in table 1, presents users with little difficulty, since several icons have a unique 'subject' — i.e., the first element of their structure does not feature in any other icon. When one of these icons is the target, any incorrectly selected icons can thus be rejected immediately. When the target is an icon whose subject appears in another icon, and that other icon is being scanned, it must be evaluated further before it can be rejected. Table 1 also shows the number of icons within the set which can be disambiguated at each level of the structure. The majority of these icons can be disambiguated on the evaluation of the first or second element of their structure, only two requiring the third element to be evaluated. As Table 2 shows, the representational icons require more evaluation to be carried out before an incorrect candidate can be rejected since, as is suggested by Arend et al, all but one have a subject which appears in every other icon. Furthermore, the majority of these icons cannot be unambiguously identified until the third or fourth element of their structure has been evaluated.
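The depth-to-rejection computation just described is easy to state procedurally. The sketch below is our simplification, not the Parser knowledgebase itself: it flattens each icon's decomposition into a sequence of element tokens with attributes written in a canonical alphabetical order, and because it ignores the nested sub-structures of the Replace icon and similar details, the counts it prints approximate rather than exactly reproduce Table 1 (the Store file row comes out slightly differently).

from itertools import zip_longest

# Abstract icon decompositions from Table 1, flattened to token sequences;
# attribute order inside an element is normalised alphabetically.
icons = {
    "Word search":  ("Circle", "Line(solid)"),
    "Insert":       ("Lines(dotted,horizontal)", "Line(horizontal,solid)"),
    "Justify":      ("Line(horizontal,mixed)", "Line(solid,vertical)"),
    "Delete Line":  ("Lines(oblique,solid)", "Line(dotted,horizontal)"),
    "Word Ahead":   ("Rectangle(dotted,vertical)", "Triangle(right)"),
    "Word Back":    ("Rectangle(dotted,vertical)", "Triangle(left)"),
    "Store file":   ("Rectangle(solid,vertical)", "Line(oblique,solid)"),
    "Delete Word":  ("Cross", "Line(dotted,horizontal)"),
    "Replace":      ("Cross", "Line(dotted,horizontal)", "Rectangle(solid,vertical)"),
    "Scroll Ahead": ("Rectangle(dotted,horizontal)", "Polygon(down)"),
    "Scroll Back":  ("Rectangle(dotted,horizontal)", "Polygon(up)"),
    "Move Line":    ("Rectangle(dotted,horizontal)", "Line(horizontal,solid)"),
}

def rejection_depth(target, candidate):
    """Number of elements evaluated before a candidate mismatches the target."""
    for depth, (t, c) in enumerate(zip_longest(target, candidate), start=1):
        if t != c:
            return depth
    return len(target)          # structures identical: never rejected

for name, target in icons.items():
    depths = [rejection_depth(target, other)
              for other_name, other in icons.items() if other_name != name]
    summary = {d: depths.count(d) for d in sorted(set(depths))}
    print(f"{name:12s} others rejectable at: {summary}")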


Structure | others rejectable at 1st | 2nd | 3rd | 4th | group | search time (ms)
Store file: Bracket : Arrow (left) : Page | 11 | | | | a | 1675
Word search: Page : Circle : Arrow (down) : Circle | 0 | 11 | | | b | 1587
Delete Line: Page : Lines (oblique) : Line (dotted, horizontal) : Arrow (left) | 0 | 11 | | | b | 1929
Insert: Page : Lines (horizontal, dotted) : Space : Arrow (right) : Line (horizontal, solid) | 0 | 10 | 1 | | c | 1837
Replace: Page : Cross : Line (horizontal, dotted) : Arrow (right) : Square (solid) | 0 | 10 | 0 | 1 | d | 2051
Delete Word: Page : Cross : Line (horizontal, dotted) : Arrow (left) | 0 | 10 | 0 | 1 | d | 2390
Word Back: Page : Arrow (left) : Blob (filled) : Blob (unfilled) | 0 | 8 | 3 | | e | 1975
Word Ahead: Page : Arrow (right) : Blob (unfilled) : Blob (filled) | 0 | 8 | 3 | | e | 2031
Justify: Page : Arrow (right) : Line (vertical, solid) : Lines (horizontal, dotted) | 0 | 8 | 3 | | e | 2091
Move Line: Page : Rectangle (horizontal, dotted) : Arrow (horizontal, solid) : Lines (horizontal, solid) | 0 | 9 | 1 | 1 | f | 2092
Scroll Back: Page : Rectangle (horizontal, dotted) : Rectangle (horizontal, solid) : Arrow (up) | 0 | 9 | 1 | 1 | f | 2173
Scroll Ahead: Page : Rectangle (horizontal, dotted) : Rectangle (horizontal, solid) : Arrow (down) | 0 | 9 | 1 | 1 | f | 2226

Table 2: Structural details of Representational icons used by Arend et al (1987)

Tables 1 and 2 also list the mean search times reported by Arend et al for the case when all 12 icons were shown in the same array (taken from figure 4 in their paper). If the icons are grouped into categories according to the depth to which the other icons must be evaluated, means for each category can be calculated (figure 4). Categories where incorrect candidates can all be rejected on the basis of the subject of the icons' structures can be seen to have shorter detection latencies than those categories where several candidates must be evaluated to greater depths before they can be rejected. Representational icons (with one exception) all belong to categories which require deeper evaluation of more of the candidate icons than do any of the abstract icons. The categories with the slowest search times require some candidates to be evaluated to their third or fourth elements. In summary, the consideration of icons as a visual structure which must be sequentially evaluated allows us to make predictions not only about the relative simplicity of the two contrasting icon sets, but also about the rank orderings within each set.
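For reference, the category means plotted in figure 4 can be recovered directly from the search times in Tables 1 and 2; a short sketch follows, with the grouping letters taken from the tables:

from statistics import mean

# Search times (ms) from Tables 1 and 2, keyed by evaluation-depth category.
abstract = {"a": [739, 950, 1023, 1041], "b": [1186, 1268, 1310],
            "c": [1063, 1361], "d": [1188, 1371, 1572]}
representational = {"a": [1675], "b": [1587, 1929], "c": [1837],
                    "d": [2051, 2390], "e": [1975, 2031, 2091],
                    "f": [2092, 2173, 2226]}

for name, groups in (("abstract", abstract),
                     ("representational", representational)):
    for category, times in sorted(groups.items()):
        print(f"{name:16s} {category}: {mean(times):7.1f} ms")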

[Figure 4: bar chart of mean search latency (1000-2400 ms) by evaluation depth category, for the Abstract Icons and the Representational Icons]

Figure 4: Icon search latencies (from Arend et al, 1987) plotted against evaluation depth categories

A variable manipulated by Green and Barnard which has not been discussed in the above description, and yet which has a clear effect upon the search times, over-riding the effects of 'item uncertainty', was positional variability. In one condition in their replication of the original study, the icons were fixed and always appeared in the same position within the array; in another, half were fixed and half varied; and in the third each icon could appear in any position. This corresponds to the 'order uncertainty' described in the previous e-mail example, since when the position of the icons is fixed and there is no order uncertainty, users can look directly toward the expected location of the target icon, reducing the probability of needing to reject an incorrectly selected candidate. Low order uncertainty thus aids performance, just as it allowed users in the email example to select one of a pair of commands on pragmatic grounds without confusion. This shows that it is important for an expert system modeller to take into account all the possible sources of variation in the salient features of an interface.

The key point of this example is that the success or failure of the method described depends upon the structural features of the interface, and not upon the meaning of the elements. This makes the analysis suited to an expert system which can manipulate tokens, without needing extensive knowledge about the meanings of the tokens. Where the structural features are such that semantic or relational information might influence the pattern of cognition, the consultant can be asked to make the relevant judgement, as is shown in the next example.

4. Modelling visual interface layout

The previous two examples have concentrated upon different aspects of design. The email example showed that it was possible to model task structures, and the icon search example has shown that it is possible to deal with visual properties of an interface, using the same structural approach. In this example we will show that it is also possible to deal with these aspects simultaneously, to model the way that the visual structure of an interface combines with typical patterns of use to affect the way users will learn how to use a particular design. In this example, a structural description helps to explain some empirical results obtained in user trials with an experimental hypertext system—a guide to the city of York (Myers and Hammond 1991).

In a set of experiments conducted to examine the effects of different forms of navigational aids upon users' performance with a hypertext system (figure 5), one set of users were provided with access to a map of the hypertext links, an alphabetical index of topics, a 'back to previous frame' button and a 'back to first frame' button. All of these buttons appeared in the same place on every frame of the system, while the pictures and text varied. The users were shown how the system and the four buttons worked, and were then given a set of questions about York to answer, using the hypertext system to find the required information. After they had finished, a post-test questionnaire showed that these users understood the structure of the hypertext system well, and had developed an accurate understanding of the information presented by the map facility and how it should be used.

A second group of users were given exactly the same system, but with the index button removed so that they did not have access to an alphabetical index of topics. With the intention of keeping other factors constant, the position and effects of the other buttons were not altered—which meant that an empty space was left between the map and the back one buttons. From the post-test questionnaire, it was found that some of these users had not understood what the map button did, nor what it represented, and they did not know how it could be used. Since the facility that had been withheld did not seem to have any relevance to the use of the map button, it was initially difficult to understand these results, but a structural description of the two interfaces provided the solution.


[Figure 5: a hypertext frame titled 'Welcome to York', with a row of buttons along the bottom: MAP, INDEX, BACK ONE, RESTART]

Figure 5: The layout of the hypertext screen discussed in Myers and Hammond, 1991 (from the condition including the index button)

In the full interface, the bottom line buttons formed a distinct perceptual group, and when users focussed on this group, and then attended to the structure within the group to use any of the buttons, map was the head of the structure, given the subjects' acquired tendency to scan and read such structures from left to right. For example, in the format used in the email example earlier, the structure of the hypertext screen can be represented by:

hypertext screen    consists_of    title : picture : text : buttons
buttons             consists_of    map : index : back one : restart

In order to use any of the navigational buttons, the users had to focus upon map, and then make a visual transition to the appropriate button within the structure. In searching the hypertext for the answers to the test questions, users most frequently used the back one button, to move to the previous frame, and restart, to move back to the 'home' frame. From their names and from their actions, these buttons are clearly related to moving to another screen. However, to find them within the structure of four buttons, users had to scan over map and index. Removing index, however, broke up this grouping, and since map was often closer to the pictures shown on the hypertext frames than it was to back one and restart, these two buttons formed a unit on their own, separated from map, which shifted to an earlier position in the structure, to reflect its position close to the picture. The structure of the interface became:

hypertext screen    consists_of    title : picture : map button : text : back buttons
buttons             consists_of    back one : restart

Use of the two 'back buttons' could continue as before, but now back one was the head of its structure, and on attending to the group, could be located immediately. The users did not need to attend to map. In the four button version, users continually encountered map while performing a navigational task, looking for back one or restart, and so with increasing use of the interface it became understood in a navigational context. In the three button version, however, such semantic support for the interpretation of map was lacking, and users had to infer its meaning from other sources — in fact, in the absence of other information, users are likely to expect a map button in a geographical hypertext system to present them with a map of the city rather than an abstract display of hypertext links.

The anticipation of such semantic transfer between elements of a substructure is clearly beyond the capabilities of a simple, token-processing modeller. This is an example of a situation where the modeller must ask the consultant for real-world information: in this case if there is any reason why an element could be expected to benefit from being interpreted in the same context as other elements in its structural group. For the map button, this question can be answered positively in the case of the four button interface, but not for the three button interface. The modeller only needs to know whether such information is available and what its consequences are, not what that information actually is. In effect, it asks a structural question and not a content question: whether or not any relationships exist, and not what their nature is. Once again, this balance between structure and content means that the underlying rules and identifiers can be used regardless of context.

5. Interacting Cognitive Subsystems (ICS)

These examples have shown that it is possible to carry out the first step illustrated in figure 1, mapping from a surface description of an interface to an acquisition representation that is based upon structural information about the interface. This acquisition representation describes the attributes of the basic units that are being examined, together with their relationships to the superstructure which orders them into groups, and details about the substructure of each unit. The nature of the units that are described and the level of description that is necessary varies according to the design question that is being asked. It can be about a task sequence, as in the email example, in which case the units are the task steps and their superstructure is relevant, or it may be about the visual design of icons, in which case the units are the screen objects themselves and their substructures are relevant. The hypertext example required a consideration of the way that the visual structure of the full-screen display would combine with the typical tasks that users would perform and hence the way that they would come to interpret individual screen objects. Since the structural description is largely independent of the semantic content of the interface design, it is suitable for use by an expert system modeller of user cognition. Basing the modelling process upon the structure of the information available from the interface rather than its content makes it possible for the modeller to be applicable to a wide range of design issues.

The second step illustrated in figure 1 carries out the modelling of cognition, based upon this structural description. In principle, any approach to the modelling of cognition could be applied here, provided that it is able to deal with the abstract nature of the information contained in the structural description. A modelling technique based upon semantic information would not be appropriate, given that the structural descriptions omit such information in the interests of generalisation. One approach to the modelling of cognition which has been applied within the field of HCI, and which is well suited to the use of structural information, is that of Interacting Cognitive Subsystems, or ICS (Barnard 1987; Barnard, Wilson & MacLean 1988; Barnard, Grudin and MacLean, 1989). One advantage of ICS for the development of a general-purpose expert system modeller is that it does not aim to model the exact nature of information as it is processed by the human cognitive system. Instead it builds approximate models of cognition, based upon the demands that processing places upon cognitive resources. In contrast to the Model Human Processor of Card, Moran and Newell (1983), 'ICS represents the nature of processing more strongly and in a less parameterised fashion. It is able therefore to capture wider variations in behaviour due to performative factors' (Simon, 1988). In this section we present a brief overview of the ICS approach to modelling, and show how it can make use of the information collected by the structural descriptions.

[Figure 6 schematic: Inputs are copied (COPY process) to an Image Record, which preserves a history of input; Transform processes convert the subsystem's representation into Outputs for other subsystems]

Figure 6: The basic architecture of a cognitive subsystem

ICS is a resource-based model that does not make use of any general purpose capabilities such as a central directing executive, limited capacity working memory or a unified long term memory. Instead, these aspects of cognition arise from the coordinated operation of a large number of specific, single purpose transformation processes operating independently, and in parallel. Each of these processes fulfils a particular function in the manipulation, storage or recall of mental representations. These processes are organised into distinct subsystems, according to the nature of the mental representations upon which they operate. Within a subsystem, processes operate upon the same mental representation, transforming it into the representations required by other subsystems. These processes share a common long term store of the representations that their subsystem has received. Although each subsystem operates upon a different mental code, they all share the same basic architecture (figure 6).

The model makes use of nine subsystems (see figure 7). The Acoustic, Visual, and Body State subsystems deal directly with sensory information. The Morphonolexical, Object, Propositional and Implicational subsystems represent 'central' processes, and operate upon the information produced by the sensory subsystems and by each other. The Articulatory and Limb effector subsystems are driven by the output of the central subsystems to produce overt physical actions (for a more complete description of the nature of the information dealt with by these subsystems, see Barnard and Teasdale, 1991).

[Figure 7 schematic: the Sensory subsystems (AC Acoustic, VIS Visual, BS Body State) feed the Central subsystems (MPL Morphonolexical, OBJ Object, PROP Propositional, IMPLIC Implicational), which drive the Effector subsystems (ART Articulatory, LIMB Limb)]

Figure 7: The three classes of subsystem, with arrows showing the flow of information between them

Within the Object subsystem, for example, is a set of transformation processes which, given an object-based representation, describing the shape, size and position of physical objects, produces an output representation in propositional code, representing the identities and attributes of the objects and their relationships to each other. This set of transformation processes is collectively called the OBJ⇒PROP transformation. If the visual objects are actually written words, another set of transformation processes, OBJ⇒MPL, is simultaneously capable of transforming the object representation into morphonolexical code, which is a representation of speech-based or linguistic information. There are also processes transforming the representations into effector codes, with OBJ⇒ART producing the representation required for articulatory output (ie 'naming'), and OBJ⇒LIMB that for a motor movement. Taken together, these sets form the Object subsystem. The transformation processes within a single subsystem all work in parallel upon the same input representation. Thus when a person's object subsystem receives a representation of a ball moving rapidly through the air towards them, it could simultaneously produce representations of the name of the object in morphonolexical code ("ball") using the OBJ⇒MPL transformation, its semantic attributes in propositional code ("ball approaching quickly") using the OBJ⇒PROP transformation, and generate motor specifications for an avoidance action with OBJ⇒LIMB.

In practice, the precise way in which individual processes are called into play depends upon the information represented and upon task demands. When a process has transformed the representation received by its own subsystem into the representation required by a different subsystem, the information becomes available for use by that latter subsystem. Since the subsystems are themselves operating in parallel, representations produced by more than one subsystem can be combined to form the input representation to a subsequent subsystem. The propositional subsystem, for example, could work upon a combination of the outputs of the morphonolexical (via MPL⇒PROP) and object subsystems (via OBJ⇒PROP), rather than just one or the other. Because each subsystem operates upon a different class of information, information is repeatedly transformed, interpreted and re-represented throughout the overall cognitive system.

It is this dynamic processing of the representations of information that gives rise to the phenomena that can be used to make predictions about the patterns of cognitive activity. These patterns are 'approximate' in that they do not describe precisely what is happening to a given token of information, nor do they directly produce estimates of latency or values for memory load. Instead of such quantitative values, the model generates qualitative assessments of cognition, which vary according to the dynamics of the flow of information between the subsystems. From the point of view of user modelling for interface design, the salient features of an interface are those which affect the course of the processing of information, and hence the patterns of cognitive activity. These aspects relate to the matching by each subsystem of incoming representations in their appropriate code to representations that they have operated upon before (the 'record contents'), and to the development and function of skilled performance (which in ICS terminology is called 'procedural knowledge').

The record contents of a subsystem represent a complete record of all inputs that it has received – in other words, all representations that the cognitive system has processed that are in the appropriate code for that subsystem to operate upon. Over time, and with the repeated processing of similar information, these experiential records give rise to generalised abstracted records which can be used to elaborate and interpret incoming representations, which may be novel, contain errors or be incomplete. Procedural knowledge is likewise acquired through the repeated processing of similar information and the production of similar outputs, and enables one subsystem to produce an output representation for another subsystem directly, rather than by going through an intermediary set of transformations. Thus while the object representation of a ball flying through the air towards someone would lead them to derive an implicational representation that it is going to hit them, they would not need to indulge in further cycles of processing based upon this representation to start to move out of the way. Experience will have led them to develop procedural knowledge such that the object subsystem itself can produce an appropriate representation for the motor subsystem. Record contents and procedural knowledge allow the subsystems to build representations of information, and to elaborate them or to transform them into other codes.
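The configurations of processes that such flows imply can be treated as paths through a small graph of subsystems. As a rough sketch of the idea in Python (the set of transformations listed here is an illustrative subset inferred from the description above and figure 7, not a complete statement of ICS):

from collections import deque

# Transformation processes between subsystem codes (illustrative subset).
transforms = {
    "VIS":    ["OBJ"],                         # visual input parsed into objects
    "AC":     ["MPL"],                         # acoustic input into speech-based code
    "OBJ":    ["MPL", "PROP", "ART", "LIMB"],  # the Object subsystem's processes
    "MPL":    ["PROP", "ART"],
    "PROP":   ["MPL", "OBJ", "IMPLIC"],
    "IMPLIC": ["PROP"],
}

def configuration(source, target):
    """Shortest chain of transformations linking two representational codes."""
    queue, seen = deque([[source]]), {source}
    while queue:
        path = queue.popleft()
        for nxt in transforms.get(path[-1], []):
            if nxt == target:
                return path + [nxt]
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Naming a displayed word: the proceduralised OBJ=>ART route is found directly.
print(" => ".join(configuration("VIS", "ART")))   # VIS => OBJ => ART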
A major element of the ICS modelling approach concerns the degree to which an interface allows or demands the recruitment of existing record contents and procedural knowledge, the support that it gives for their acquisition and development, and the extent of any conflicts between the requirements of the interface and existing knowledge. For the purposes of user modelling, it is not necessary for an expert system to develop a detailed description of how abstraction over record contents or the development of procedural knowledge occurs. All that is necessary is to be able to make approximate predictions of what will happen in a given situation with a particular set of representations. ICS deals with this by examining the structure of the representations that are being processed, and the requirements for their flow through the configuration of individual subsystems. The requirements for processing structural information, for moving the focus of processing within and between the levels of a representation, and for transforming the content of the elements from one subsystem code to another, are the aspects that govern the complexity of processing and hence the nature of the approximate model that is derived. A central point of the ICS approach to modelling cognition is that, since all the subsystems have the same logical and functional architecture, the transformation processes that they contain and the representational structures upon which they operate can be formally modelled in the same way, even though they deal with different classes of information. A valuable consequence for any automation of the modelling process is the economy of coding that this allows.


In the past, several small demonstrator expert systems had been written which embodied various aspects of the ICS model, to show how the model could reason about specific HCI scenarios (Barnard, Wilson and MacLean, 1987, 1988). Work has recently been completed on an integration of these small systems into a general purpose demonstrator running on an IBM PC, using the expert system shell Xi+ produced by Inference Ltd. This system is able to model the three examples described earlier in this paper, along with several other cases drawn from the domain of HCI. These examples include the design of command names, presentation of task structures in training, and the grouping of operations within different pop-up or pull-down menu locations. The disparate nature of these design scenarios, and the comparatively concise nature of the rules within the expert system, encourages us to believe that the expert system can be expanded to cover a wider range of design questions without encountering the problem of 'knowledge explosion' that occurs when a modelling approach requires the consideration of content or semantic knowledge as well as structural information.

6. The implementation of an ICS structural modeller

The current implementation of the general purpose demonstrator (figure 8) is subdivided into several distinct knowledgebases which pass control between each other with some iteration. The first knowledgebase, Introduction, collects up general information which configures the modelling (i.e., restricting the modelling to certain classes of user, or certain stages of the interaction, and requesting particular levels of detail in the output) and which describes invariant aspects of the design (such as the type of peripherals, pointing devices, reasons for use of the design, etc).

[Figure 8 schematic: the Introduction knowledgebase gathers general background information into a database; a Controller passes control between the Structural Descriptor (which takes the structure of the interface), the Parser (which builds a database of specific information), the Modeller (which calls the subsystem knowledgebases ac, vis, bs, mpl, obj, prop, implic, art and limb to build the model of cognition) and the Analyst, which produces design, modelling and usability reports]

Figure 8: The knowledgebases implemented in a general purpose approximate modeller of user cognition

Some sample rules from this knowledgebase are:

when input_from_user includes "direct manipulation"
then do form command_form

when input_from_user includes "keystroke"
then do form keystroke_form

when keystroke_form includes "string entry" or "control codes"
then command_form includes "verbal"


The first rule checks the value of the identifier ‘input_from_user’, and if the person consulting the system has specified that the interface includes direct manipulation, it presents a dialogue called ‘command_form’ which allows the consultant to say whether the manipulated objects are iconic (eg a palette of drawing tools in a graphics program) or verbal (eg a pull down menu of textual command names). The response is stored in the identifier ‘command_form’, which has been defined as an identifier that can ‘include’ several different values simultaneously—so the consultant could specify both iconic and verbal (eg an array of icons that are labelled with text, like files within the Macintosh Finder). General information of this type is saved to enable subsequent consultations with the modeller to be resumed at this point, allowing varying design options to be compared without requiring the background information to be respecified (a complete transcript of a consultation, giving the questions asked by the expert system and the designer’s answers, together with an explanatory commentary, is given in the Appendix). The second knowledgebase, the Controller, co-ordinates the passing of information between the main components of the expert system.
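The flavour of these rules can be captured in a few lines of Python. The following is our sketch, not Xi+ itself (whose syntax and internals differ); it shows multi-valued identifiers and forward chaining over 'when ... then ...' rules such as the third one above (the two 'do form' rules are omitted, since forms are interactive dialogues):

# Multi-valued identifiers: each can 'include' several values at once.
facts = {}

def includes(identifier, *values):
    return any(v in facts.get(identifier, set()) for v in values)

def add(identifier, value):
    facts.setdefault(identifier, set()).add(value)

# when keystroke_form includes "string entry" or "control codes"
# then command_form includes "verbal"
rules = [
    (lambda: includes("keystroke_form", "string entry", "control codes"),
     lambda: add("command_form", "verbal")),
]

def run(rules):
    """Fire rules repeatedly until no new values are derived (a fixed point)."""
    while True:
        before = {k: set(v) for k, v in facts.items()}
        for condition, action in rules:
            if condition():
                action()
        if facts == before:
            return

add("keystroke_form", "string entry")   # the consultant's answer to a form
run(rules)
print(facts["command_form"])             # {'verbal'}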

6.1 Entering the structural descriptions

The first step that was illustrated in figure 1, which maps from this 'real world' information to an acquisition representation, is largely contained within the third knowledgebase, the Structural Descriptor. Its role is to build up descriptions of the structural aspects of the interface, which will need to be available to the second set of knowledgebases to produce an approximate model of cognition. Because this structural description contains no information about the meaning of the elements, but consists solely of descriptions of the presence, absence and strength of relationships within the structure and of the size of each element's substructure, this knowledgebase is able to operate within any domain of inquiry, facilitating the generalisation of the modeller to cover novel scenarios and design problems.

In the current version of the expert system implementation, the Structural Descriptor allows the consultant to describe any or all of three structures, corresponding to the cognitive phases of goal formation, action specification and action execution. These phases, based upon the ideas of Norman (eg Norman, 1986), represent the main types of cognitive activity that must be performed in any short term sequence of behaviour. They can be thought of as reflecting the stages of realising that something needs to be done, deciding what to do, and then doing it.

In the goal formation phase, the consultant may enter a list of text labels that identify the 'task steps' required in the interaction—in the email example from section 2, for example, the elements to be labelled would be the eight steps that establish and act upon the information in an incoming message. These labels are not analysed in themselves, but are just used to co-ordinate the entry of structural information and the subsequent production of reports from the modelling. Once the system has been given the labels for the elements, it asks a series of questions that gather up the description of the structure. It begins by asking which superordinate group an element belongs to, and again the consultant simply enters a text label. In the email example, they would use either two or four superordinate group labels, depending upon which of the conditions they were modelling—two groups (perhaps labelled 'establish' and 'action') would correspond to the condition in which users had to perform the four establish commands, followed by the four action commands, while four groups (perhaps labelled 'pair 1' to 'pair 4') would correspond to the condition where they performed the commands in four establish–action pairs. Note that these are placeholders for the eight task steps that would be carried out on each trial, rather than the exact commands that would be given, since the commands vary from trial to trial.

The consultant is then asked if the element is fixed in its sequence or position within the group, in relation to the other elements. This attribute can be used to establish the 'order uncertainty' discussed in section 2. The next question asks how well the 'meaning' of the group encompasses the element, with the consultant being asked to select from the options 'better than other groups', 'as well as other groups', or 'worse than other groups'.
This covers the possibility that users might expect an element to belong to a different group in the structure from the one the designer intends it to belong to, and is an example of the expert system relying upon the consultant to supply the semantic knowledge that it cannot infer. Note that the consultant is not asked to give an absolute quantitative evaluation of the goodness of fit of the element's membership of the group, but just a qualitative estimate compared to its possible membership of the other groupings. Two other questions of this type ask how distinguishable each group is from the others ('very distinguishable', 'distinguishable', or 'confusable'), and how closely the form or appearance of each element corresponds to its underlying meaning ('good', 'average' or 'poor'). Together these allow the system to infer the level of 'item uncertainty' within the superstructure. These attributes are collected for each of the elements that the consultant has labelled. Since there can be a great deal of similarity between the attributes that the elements within a structure have, a facility is provided to allow the consultant to 'copy' the attributes from an element that they have already described onto other elements.

The structure that is entered for the phase of action specification describes the decision space of the user, and should contain the set of commands or operations from which they can construct their side of the interaction. In the email example, the consultant can describe the complete set of twelve commands, with the two superordinate groups of 'establish information' and 'act upon information'. Once the labels have been entered, the same questions are asked as were asked in goal formation, with the exception of the one which set a value for 'order uncertainty', since this is not relevant to action specification (the decision space does not have any ordering). It is replaced by a question that asks whether an element could be confused with any other elements in its group, since this is a possible source of 'item uncertainty' in the context of action specification.

For the phase of action execution the consultant can describe the interface objects that the user has to interact with. Whether or not they choose to enter a structural description for this phase of the interaction, they are asked what the nature of the elements is, being given a choice of icons, screen objects, operations, text strings or 'other'. This information is used to help select contextually relevant statements in the subsequent reporting phase (if 'other' is chosen then they are asked to enter a text label), and to control the modelling process so that irrelevant steps can be omitted or defaulted. An option is also provided in the action execution phase to allow the consultant to enter the structure of each interaction object, so that the Parser knowledgebase can derive values for item uncertainty between the elements by assessing the similarities between the structures of elements within and between groups, on the basis of disambiguation depth. If the structures of the icons from the study described in section 3 are entered, for example, the Parser derives the disambiguation depths given in tables 1 and 2. Once the structural descriptions have been entered, the data is once again saved to allow reconsultation, and the Controller and Modeller knowledgebases carry out the second step in modelling, to produce models of the flow of activity through the cognitive subsystems.

6.2 Constructing the model of cognition

The Modeller is able to call separate knowledgebases which each contain rules specific to the processes within a single cognitive subsystem. The knowledgebases that will be called in any of the three phases of cognition which are modelled correspond to the subsystems that would, according to ICS, need to be involved in the mental processing of the information involved in that phase. The Modeller contains a set of rules defining these sequences for various mental operations. For example, if it has inferred that the process transforming Object code to Morphonolexical code is involved in the cognition (as it would be for any interaction that involved reading words), the Modeller calls the Obj knowledgebase to determine how well proceduralised the transformation is, given the state of the object record contents and the nature of the codes being transformed. An example of a rule in the Obj knowledgebase that might fire in this case is:

if phase is goal formation
and subset of model is procedural knowledge
and output code is mpl
and display text is legible
then approximate utility of basic units at obj_to_mpl of task specific in goal formation = 5

This rule, if it fires, sets an attribute called 'approximate utility of basic units' for the transformation of object code to morphonolexical code to its highest value (all such attributes take values from 1 to 5). This shows that procedural knowledge is available to transform the units of the object representation into a morphonolexical representation. In this instance, some extra information about the interface design is required in addition to the structural information, namely whether or not the display text is legible. When the expert system encounters a rule like this with an untested fact it asks the consultant for additional information to allow it to decide whether the fact is true or false—in this case the simple question:


How easy is the display text to read?

Choose one of: legible indistinct

Because these questions are only asked when the Modeller has identified a potential problem in the interface design, according to the structural description given earlier, they are restricted to a minimum, avoiding the need for the designer to give a detailed description of aspects of their design which turn out to be irrelevant to its usability.
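The mechanics of this on-demand questioning can be sketched briefly. This is our illustration of the behaviour, not the Xi+ dialogue system; the question table is taken from the example above, and one condition of the published rule ('subset of model is procedural knowledge') is omitted for brevity:

facts = {"phase": "goal formation", "output code": "mpl"}

# Question text and options for facts the consultant may have to supply.
questions = {
    "display text": ("How easy is the display text to read?",
                     ["legible", "indistinct"]),
}

def value_of(identifier):
    """Return a fact's value, asking the consultant if it is still untested."""
    if identifier not in facts:
        prompt, options = questions[identifier]
        facts[identifier] = input(f"{prompt} Choose one of: {' '.join(options)} ").strip()
    return facts[identifier]

attributes = {}
if (facts["phase"] == "goal formation"
        and facts["output code"] == "mpl"
        and value_of("display text") == "legible"):
    # approximate utility of basic units at obj_to_mpl = 5 (highest of 1 to 5)
    attributes[("approximate utility of basic units", "obj_to_mpl")] = 5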

6.3 Analysing the cognitive models

Once the models have been built by the Modeller and saved, the Controller knowledgebase calls the Analyst. The Analyst carries out the third step, to map from a model of cognitive activity to a description of user behaviour, a design commentary or a description of the model in terms of the ICS approach. For clarity of output, the reporting rules are tested in three passes, so that those relevant to goal formation are fired first, followed by those for action specification, and finally those for action execution. An example of an output rule which refers to user behaviour in the action specification phase is:

when analysis phase is action specification
and reporting style includes behavioural reports
and approximate utility of basic units at prop of experiential task records in action specification = 5
then report Occasional users are likely to be able to recall this command set.

This rule fires when the commands described in the action specification phase have the highest value for their approximate utility—which means that they have low item ambiguity (so are not likely to be confused with each other) and their names are well related to the effects that they produce. The distinction between the three steps in modelling makes it possible for the Analyst to produce different levels of detail or forms of analysis from the same underlying model, depending upon the uses to which the analysis is to be put. It is thus feasible to extend the capabilities of the system by incorporating additional rules to map onto novel forms of analysis, without revising the rules which operate in the first step, building the acquisition representation, or the second step, in constructing the model of cognition.

Once the report has been produced, the consultant can return to one of several points in the consultation and repeat the subsequent modelling, giving them the possibility to change the answers that they gave to questions asked during the modelling process, or to redescribe the structures of the interface. The Modeller and Analyst will then run on the new information, and produce a revised analysis which can be used to contrast different design options. If, after entering one of the conditions in the email example of section 2, a designer ran the consultation again and gave the alternative task structure, they would find that the Analyst reported "With practice, users will be able to learn the correct order of commands" for the four pairs of commands, but "Users' difficulty in remembering the correct order of commands will persist despite practice" for the two sets of four commands.
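The three-pass reporting loop can likewise be sketched compactly. The tuple encoding of model attributes here is our own shorthand, with the example rule above as the only entry:

# One reporting rule per tuple: (phase, condition on the model, report text).
reporting_rules = [
    ("action specification",
     lambda m: m.get(("prop", "experiential task records",
                      "action specification")) == 5,
     "Occasional users are likely to be able to recall this command set."),
]

model = {("prop", "experiential task records", "action specification"): 5}

# Three passes keep the output ordered by phase of cognition.
for phase in ("goal formation", "action specification", "action execution"):
    for rule_phase, condition, report in reporting_rules:
        if rule_phase == phase and condition(model):
            print(report)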

7. Conclusions

The approach to user modelling that we have presented here involves building 'approximate' descriptions of the cognitive activity underlying task performance in human-computer interactions. This approach does not aim to simulate exactly what is going on in the user's head, but to capture the salient features of their cognitive processing. In this way we strive to achieve economy in our theoretical representation and its predictive mechanisms. The actual technique requires several sets of production 'rules'. One set of rules maps from the real-world description of an interface design to an acquisition representation which describes the structural features of the interface. A second set of rules operates to build a model of the user's cognition required to use the interface, while further rules map from the model to properties of user behaviour.

We have illustrated how a set of rules can be re-used to gather up structural descriptions of user tasks, visual interface objects and screen layouts. In doing so, we have tried to show how such descriptions provide a firm basis both for predicting performance in experimental settings, such as learning task structures or searching for icons, and also for analysing and understanding user behaviour with dynamic display implementations, such as the hypertext example.


The actual decompositional technique is itself very simple. The rules that carry out the decomposition do not, in and of themselves, have to know about the semantics of the items being described, although the ‘labelling’ of constituents in those descriptions (provided by the consultant) implicitly carries semantic information. Within the wider modelling technique the semantic properties of the items are further assessed and elaborated, but by a separate set of rules, whose operation in the derivation of behavioural predictions we have also illustrated through properties assigned to the structural descriptions.

Techniques like cognitive complexity theory, programmable user models or various grammars require detailed specifications to be generated for each application modelled. Each application can require the specification of many rules, the construction of which requires a modelling expert, and all of this work has to be redone for each problem considered. With approximate modelling of cognitive activity, the main effort goes into the initial production of the knowledgebases. We currently estimate that a useful and stable body of ‘core’ modelling rules can be developed on a foreseeable timescale. Once that core is stable, the input and output rules can be tailored relatively quickly to meet the demands of new applications. This part of the process obviously requires a cognitive specialist with significant craft skill and knowledge of the underlying theory.

The end user of the modeller, operating in the design context, may require some relevant theoretical skills but need not be particularly skilled in the mechanics of cognitive modelling. Their skills lie more in the anticipation of likely sources of difficulty in a design, so that they know which aspects of the interface to describe to the expert system, and the appropriate level of precision with which to answer its questions. They would also be skilled in suggesting viable alternatives for the designers to consider. The role of the expert system modeller in the design process is thus to support the human factors specialist by automating the application of cognitive theory. Even with the current implementation, it is quite possible to build a significant number of different models, covering a range of possible issues in an interface design space, in the course of an afternoon. This is a meaningful target for such an ‘automated’ modeller. Familiarity with the form of the underlying model is also helpful: in the course of developing the modeller we have found that attributes in the model space reminded us of problems and issues in a design scenario that we had not directly considered.

Clearly, we do not yet have a complete cognitive modeller, and all three sets of rules need additions. The first set, which builds the structural descriptions, needs to be modified to allow additional attributes of an interface to be specified. The core rules need to be supplemented with a capability for handling the dynamics of system behaviour over time and for multisensory integration; such rules are required to cope with multimedia designs. This may require a shift away from the description of structures for each ‘phase of cognition’, towards a description of the interface in terms of the representations used by each of the cognitive subsystems. This would make it possible for the person consulting the modeller to describe an interface that used, for example, sound co-ordinated with a visual display.
The second set of rules, building the models of cognition, can also be expanded to cover a more complete range of cognitive models. The current version has rules that deal mainly with the short-term dynamics of human-computer interaction (ie the performance of a single interaction), with fewer rules modelling the very-short-term dynamics (ie the interleaving and oscillation between different cognitive configurations or simultaneous tasks) or the long-term dynamics (ie the pattern of learning over several interactions with the interface).

The third set of rules, mapping from the models to user behaviour, is currently in a fairly basic form and produces a limited range of behavioural predictions. These can be improved by adding further, and more finely differentiated, output statements associated with finer granularity in the attribute values of the cognitive models. We are collaborating with groups researching designers’ requirements, and expect to be able to modify the way that the modeller produces reports so that they are more closely tailored to the realities of the design process.

In principle a smaller group of specialist modellers could maintain, extend, tailor and empirically validate the system to bring new or more specific applications within its scope. We anticipate that these developments can be accommodated within the structure of the expert system described here, and that the concepts used to move from the theoretical description of the interface to the approximate model of the user’s cognition will not require major elaboration.
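To make the point about granularity concrete, a graded family of report statements could be indexed directly by attribute value. In this Python sketch the five-point scale follows the approximate utility attribute of section 6.3, but the intermediate wordings are our inventions, not the implemented rule set:

```python
# Illustration only: finer-grained attribute values indexing a graded
# family of report statements. The value 5 and its wording come from
# the rule in section 6.3; the rest are invented for the example.

RECALL_REPORTS = {
    5: "Occasional users are likely to be able to recall this command set.",
    4: "Regular users should recall most of this command set.",
    3: "Users will recall this command set only with frequent practice.",
    2: "Users will often need to look these commands up.",
    1: "Users are unlikely to recall this command set at all.",
}

def recall_report(approximate_utility):
    return RECALL_REPORTS[approximate_utility]

print(recall_report(5))
```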


8. References

Arend, U., K.-P. Muthig and J. Wandmacher: 1987, ‘Evidence for global feature superiority in menu selection by icons’, Behaviour and Information Technology 6, 411-426.
Barnard, P.J.: 1987, ‘Cognitive resources and the learning of computer dialogues’. In: J.M. Carroll (ed): Interfacing Thought: Cognitive Aspects of Human-Computer Interaction, Cambridge, MA: MIT Press, pp. 112-158.
Barnard, P.J., J. Grudin and A. MacLean: 1989, ‘Developing a science base for the naming of computer commands’. In: J.B. Long and A. Whitefield (eds): Cognitive Ergonomics and Human Computer Interaction, Cambridge: Cambridge Univ. Press, pp. 95-133.
Barnard, P.J., A. MacLean and N.V. Hammond: 1984, ‘User representations of ordered sequences of command operations’. In: B. Shackel (ed): Proceedings of Interact ’84: First IFIP Conference on Human-Computer Interaction, Volume 1, London: IEE, pp. 434-438.
Barnard, P.J. and J. Teasdale: 1991, ‘Interacting Cognitive Subsystems: A systemic approach to cognitive-affective interaction and change’, Cognition and Emotion 5(1), 1-39.
Barnard, P.J., M.W. Wilson and A. MacLean: 1986, ‘The elicitation of system knowledge by picture probes’. In: Proceedings of CHI ’86: Human Factors in Computer Systems, New York: ACM, pp. 235-240.
Barnard, P.J., M. Wilson and A. MacLean: 1987, ‘Approximate modelling of cognitive activity: towards an expert system design aid’. In: J.M. Carroll and P.P. Tanner (eds): Proceedings of CHI+GI ’87: Human Factors in Computing Systems and Graphics Interface, New York: ACM, pp. 21-26.
Barnard, P.J., M. Wilson and A. MacLean: 1988, ‘Approximate modelling of cognitive activity with an expert system: A theory-based strategy for developing an interactive design tool’, The Computer Journal 31, 445-456.
Card, S.K., T.P. Moran and A. Newell: 1983, The Psychology of Human-Computer Interaction, Hillsdale, NJ: Lawrence Erlbaum.
Eason, K. and S. Harker: 1991, ‘Human factors contributions to the design process’. In: B. Shackel and S. Richardson (eds): Human Factors for Informatics Usability, Cambridge: Cambridge Univ. Press.
Gardiner, M.M. and B. Christie: 1987, ‘Introduction’. In: M.M. Gardiner and B. Christie (eds): Applying Cognitive Psychology to User Interface Design, Chichester: Wiley & Sons, pp. 4-12.
Green, A.J.K. and P.J. Barnard: 1990, ‘Icon interfacing: The role of icon distinctiveness and fixed or variable screen location’. In: D. Diaper, D. Gilmore, G. Cockton and B. Shackel (eds): Proceedings of Interact ’90, Amsterdam: Elsevier Science Publishers B.V., pp. 457-462.
Long, J.: 1989, ‘Cognitive ergonomics and human-computer interaction’. In: J.B. Long and A. Whitefield (eds): Cognitive Ergonomics and Human Computer Interaction, Cambridge: Cambridge Univ. Press, pp. 4-34.
Myers, K.J. and N.V. Hammond: 1991, ‘Consolidated report of workshop on scenario matrix analysis’, Esprit 3066 ‘Amodeus’ Deliverable D9, Dept. of Psychology, Univ. of York, UK.
Norman, D.A.: 1986, ‘Cognitive engineering’. In: D.A. Norman and S.W. Draper (eds): User Centered System Design, Hillsdale, NJ: Erlbaum, pp. 31-62.
Simon, T.: 1988, ‘Analysing the scope of cognitive models in human-computer interaction’. In: D.M. Jones and R. Winder (eds): People and Computers IV, Cambridge: Cambridge Univ. Press, pp. 79-93.


Appendix: A consultation with the ICS Expert System Cognitive Modeller

This appendix lists the questions asked by the expert system and the answers given by a designer who is considering the email scenario described in section 2. The design option they are consulting the expert system about is the one in which the commands must be entered in four pairs, each pair consisting of an ‘establish’ command followed by an ‘action’ command. The first four questions are asked by the Introduction knowledgebase. Only one option can be selected for a question, except for those marked †, where multiple options can be chosen, and *, where the designer enters a text string or a number. (This question-and-answer style of dialogue is somewhat constrained by the expert system shell that was used to develop the modeller.)

†How are the commands communicated to the system?
    direct manipulation / keystroke

†What form do the keystroke commands take?
    string entry / control codes / function keys

†How does the system communicate with the user?
    vdu / print

Is the user’s task determined by the information on the screen?
    yes / no

The next series of questions is asked by the Structural Descriptor. For the first phase of cognition, it asks about the structure of the task steps. In this scenario, the designer enters eight labels to stand as placeholders for the four pairs of commands that would be performed on an individual trial.
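In effect, the dialogue that follows fills in a small record of attributes for each basic unit. A Python sketch of such a record; the field names, and the sample answers for establish1, are ours:

```python
from dataclasses import dataclass, replace

# A sketch of the record that the dialogue below assembles for each
# basic unit. Field names are ours; the values are merely plausible
# answers for establish1 in the paired design.

@dataclass
class BasicUnit:
    name: str
    group: str                  # superordinate group, e.g. "pair1"
    fixed_in_sequence: bool     # fixed relative to its group?
    meaning_fit: str            # "better" / "as well as" / "worse"
    encountered_first: bool     # met first when the group is processed?
    group_distinct: str         # "very" / "distinguishable" / "confusable"
    form_to_meaning: str        # "good" / "average" / "poor"

establish1 = BasicUnit("establish1", "pair1", True, "better", True,
                       "very", "good")

# The copying facility mentioned below amounts to duplicating a record
# and re-answering only the group question (plus, for a novel group
# label, the group-level questions):
action1 = replace(establish1, name="action1", encountered_first=False)
```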

Do you want to enter a structure for task steps?
    yes / no

*The task steps form 4 superordinate groups
*Please list the basic units: establish1 action1 establish2 action2 establish3 action3 establish4 action4

The following questions concern the relationships between establish1 and the superordinate group that it belongs to.

*What is the name of the group that establish1 belongs to? pair1

Is establish1 fixed in sequence or position with respect to its superordinate group?
    yes / no

How well does the meaning of the superordinate group encompass establish1?
    better than other groups do / as well as / worse than

On processing the group that establish1 belongs to, will users encounter it first? (ie on natural or pragmatic grounds)
    yes / no


How distinguishable from the other superordinate groupings is the one that establish1 belongs to?
    very distinguishable / distinguishable / confusable

How closely does the form or appearance of establish1 correspond to its underlying meaning? (ie if the meaning is known, how well can the form be generated?)
    good / average / poor

This completes the entry of the attributes for the first unit in the structure. Similar questions would be asked for the other seven, but there is a facility to copy the entries from one unit to another. In this example, the designer copies the attributes of establish1 to the other seven units. The only question that has to be answered again for each unit is the one asking for the name of its superordinate group. Each time a novel label is entered as the name of a group, the questions relevant to groups are asked, ie those beginning ‘On processing the group…’ and ‘How distinguishable from the other superordinate groupings…’.

When all units have been described to the Structural Descriptor, it proceeds to the phase of Action Specification, and asks for the structure of the command set. This time the designer enters all twelve of the commands used in the interface, clustered into the two groups of ‘establish’ and ‘action’.

Do you want to enter a structure for commands?
    yes / no

*The commands form 2 superordinate groups
*Please list the basic units: append author confirm cost dispatch display enter identify index locate stamp store

The following questions concern the relationships between append and the superordinate group that it belongs to.

*What is the name of the group that append belongs to? action

How well does the meaning of the superordinate group encompass append?
    better than other groups do / as well as / worse than

Is the meaning of append confusable with that of any other units in its superordinate group?
    yes / no

How distinguishable from the other superordinate groupings is the one that append belongs to?
    very distinguishable / distinguishable / confusable


How closely does the form or appearance of append correspond to its underlying meaning? (ie if the meaning is known, how well can the form be generated?)
    good / average / poor

Once again, the attributes for the other eleven commands can be copied from those entered for append, apart from the group name. The first time that establish is entered as the group name (for the command author), the question beginning ‘How distinguishable from the other superordinate groupings…’ will have to be answered again. Once this phase has been completed, the Structural Descriptor asks for details of the Action Execution phase.

What is the nature of the basic units relevant to action execution?
    icons / screen objects / operations / text strings / other
    (if you select ‘other’ you will be asked to enter a name for them)

Do you want to enter a structure for text strings?
    yes / no

The designer could proceed to enter the structures of the text strings, in which case they would describe the strings letter by letter, and the Parser could determine their item ambiguity. In this case, however, the designer is not interested in the usability of the command names themselves, and so decides not to enter the structure for this phase. If they do not enter a structure and the Modeller later finds that it needs this information, the designer can volunteer approximations for the attributes that the Parser would have inferred.
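The paper does not give the Parser's ambiguity measure, but a crude stand-in conveys the idea: score each pair of command names by how much of their beginnings they share. Everything in this Python sketch beyond the command names themselves is our assumption:

```python
from itertools import combinations

# Purely a stand-in for the Parser's unspecified item ambiguity
# measure: score each pair of command names by the length of their
# common prefix relative to the shorter name.

def pair_ambiguity(a, b):
    shared = 0
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            break
        shared += 1
    return shared / min(len(a), len(b))

COMMANDS = ["append", "author", "confirm", "cost", "dispatch", "display",
            "enter", "identify", "index", "locate", "stamp", "store"]

most_confusable = max(combinations(COMMANDS, 2),
                      key=lambda pair: pair_ambiguity(*pair))
print(most_confusable)   # -> ('dispatch', 'display'), which share 'disp'
```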

The Modeller now takes over, and in the course of modelling the following questions are asked by the individual cognitive subsystem knowledgebases:

How easy is the display text to read?
    legible / indistinct

How easy is the display text to understand?
    clear / unclear

Pragmatic constraints upon the task sequence are: (these are grounds for the user to know that some tasks have to be done before or after others)
    present at all levels / present for groups / present for units within each group / absent

Are users given direct instructions about what to do?
    yes / no

The full command names are:
    non words / lexical

The command names have everyday meanings which are:
    used / lost

The everyday form of the command names is:
    kept / lost

What is the level of confusability of the text strings?
    high / medium / low / zero

(The previous question would not have been asked if the designer had entered a structural description of the text strings.)

How good can the users be expected to be at typing?
    novice / intermediate / expert
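How these answers feed the model can only be guessed at from section 6.3, where the ‘occasional users’ report requires the top approximate utility value of 5. Purely as an illustrative guess, favourable answers to the name-property questions above might be combined like this:

```python
# An illustrative guess, not the published rule set: one way the
# subsystem knowledgebases might combine the answers above into the
# approximate utility attribute tested by the Analyst. Utility rises
# with lexical names, preserved everyday meaning and form, and low
# confusability, topping out at 5.

def approximate_utility(lexical, meaning_used, form_kept, confusability):
    score = 1
    score += int(lexical)
    score += int(meaning_used)
    score += int(form_kept)
    score += int(confusability in ("low", "zero"))
    return score   # 1 (worst) .. 5 (best)

# Favourable answers yield the top value, which is what the
# 'Occasional users ... recall this command set' rule requires:
print(approximate_utility(True, True, True, "low"))   # -> 5
```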

The behavioural output produced by the Analyst, given the model built from this input, is as follows:

Analysis for goal formation:
    Users will initially experience difficulty in remembering the correct order in which to issue the commands.
    With practice, users will be able to learn the correct order of commands.
    Users should be able to infer what to do next from knowledge of ordering.
    Users will be able to generalise performance to different sequences of these commands, if that is required.
    When planning how to carry out individual task steps, the overall co-ordination and control of users' mental activity will be quite straightforward.
    During planning, processing activity will involve relatively few shifts of emphasis between 'structural' and 'semantic' forms of mental representations.

Analysis for action specification:
    This set of command names will not be difficult for users to learn.
    Occasional users are likely to be able to recall this command set.
    When formulating the constituents of individual task steps, the overall co-ordination and control of users' mental activity will be very straightforward.
    During action formulation, processing activity will involve relatively few shifts of emphasis between 'structural' and 'semantic' forms of mental representations.

Analysis for action execution:
    When engaged in the performance of individual task steps, the overall co-ordination and control of users' mental activity will be very straightforward.
    During performance, processing activity will involve relatively few shifts of emphasis between 'structural' and 'semantic' forms of mental representations.

Overall, the task phases of planning, formulating constituents and actually performing them will be smoothly interrelated.


List of contents

1. Introduction
2. Modelling of task structures
3. Modelling visual interface objects
4. Modelling visual interface layout
5. Interacting Cognitive Subsystems (ICS)
6. The Implementation of an ICS structural modeller
    6.1 Entering the structural descriptions
    6.2 Constructing the model of cognition
    6.3 Analysing the cognitive models
7. Conclusions
8. References
Appendix: A consultation with the ICS Expert System Cognitive Modeller

Tables:
Table 1: Structural details of Abstract icons used by Arend et al (1987)
Table 2: Structural details of Representational icons used by Arend et al (1987)

Figures:
Figure 1: Steps in modelling user cognition to provide design advice
Figure 2: Example task structures from the two conditions used in the e-mail task of Barnard et al (1984)
Figure 3: The two icon sets used by Green and Barnard (1990), following Arend et al (1987)
Figure 4: Icon search latencies (from Arend et al, 1987) plotted against evaluation depth categories
Figure 5: The layout of the hypertext screen discussed in Myers and Hammond (1991), from the condition including the index button
Figure 6: The basic architecture of a cognitive subsystem
Figure 7: The three classes of subsystem, with arrows showing the flow of information between them
Figure 8: The knowledgebases implemented in a general purpose approximate modeller of user cognition
