Man-Computer Interaction: A Challenge for Human Factors Research

By R. S. NICKERSON

Bolt Beranek and Newman Inc., Cambridge, Massachusetts, U.S.A.

This paper claims that the increasing heterogeneity of the community of computer users poses a challenge to psychologists and human factors researchers. There follows a brief discussion of why this challenge apparently has not yet evoked a strong response. Three problems, or problem areas, are identified as being particularly in need of human factors research. These are (1) the development and evaluation of conversational languages, (2) the determination of how the use patterns adopted by users depend on system characteristics, and (3) the description, or modelling, of man-computer interaction.
1. Introduction

The June 1968 issue of Computers and Automation lists, under 59 major headings, some 1400 applications to which computers are being put. One wonders if space could be saved by listing those areas in which the computer is not being used! But what is more significant, to this symposium, than the proliferation of uses for computers, is the fact that the community of computer users is becoming increasingly heterogeneous. Until fairly recently the user community was a select group of individuals sharing a common interest in computer technology per se. Today it includes many scientists, engineers, administrators, educators and students whose main area of expertise is not computer technology, and who are interested in the computer only as a tool that may facilitate their work in other fields. In the foreseeable future, we are told, it is likely to include the average housewife. As Mills (1967, p. 227) puts it, 'the future is rapidly approaching when "professional" programmers will be among the least numerous and least significant system users'. The challenge that this development offers to psychologists and human factors researchers is obvious; unfortunately, it is not so obvious that this challenge has yet been accepted. In fact, opinions to the contrary have been expressed, on occasion, rather forcefully. Davis, for example, in her 1966 review of work in the area of man-machine communication, made the following appraisal of the impact of human factors research on the development of interactive displays: 'It appears to this reviewer that the clamour of human-engineering specialists to have their field recognized as essential to display technology has somewhat subsided. Having gained the attention they sought, they were faced with the reality of having "to put up or shut up". Their field - be it one of science or technology - was not up to the claims of its proponents.
As a result, their work now appears to be primarily that of laboratory experiments, of cementing terminology, and of limited operational design on displays with a simple, straightforward purpose, e.g. altimeter design for aircraft cockpits' (Davis 1966, p. 240). It is not my purpose to debate whether this remarkable indictment was justified, but simply to note what the first Annual Review of Information Science and Technology records concerning the contribution of human factors research to the advancement of one aspect
of man-computer interaction - and that an aspect for which the importance of human factors considerations is particularly clear. In a speech before the American Management Association in 1967, Turoff phrased his appraisal of the impact of psychology on the design and development of interactive, or immediate-access, computer systems rather more euphemistically. But the message was the same. 'In attempting to aid mental processes, we would conjecture that, perhaps, the field of psychology would be able to contribute greatly to the construction of meaningful immediate-access systems. However, when one looks at some of the current systems of this nature, it becomes quite evident that the evolution of these systems has not been overly influenced by this field' (Turoff 1967, p. 2). Unfortunately, little has happened in the meantime that would be likely to make the author of this appraisal change his mind. If the assessment were accurate then, it is little less so now. A careful search of the major human factors, and applied psychology, journals - Ergonomics, Human Factors, IEEE Transactions on Man-Machine Systems (formerly IEEE Transactions on Human Factors in Electronics), The Journal of Applied Psychology, The Journal of Engineering Psychology - is sufficient to convince one that, if there is not a lack of activity on the part of psychologists and human factors specialists in the area of man-computer interaction, there is at least a lack of visible evidence of that activity in those places where one tends to look to find out what is going on in the human factors world. In short, there is remarkably little evidence of research that has been undertaken for the express purpose either of increasing our understanding of man-computer interaction or of providing information that will be useful in the development of systems that are optimally suited to users' needs and preferences.
I have been able to find only about a dozen articles in these sources on the topic of interest, about half of these being in the special (March 1967) issue of the IEEE Transactions on Human Factors in Electronics, edited by Robert Taylor. Why should this be so? If, as is being claimed, the interactive computer system ushers in the age of machine-aided cognition, and opens up all the exciting possibilities that that notion implies, one would expect that researchers in the field of psychology - or human factors - would be deeply involved in both the design and study of such systems. How do we explain the fact that they are not? While this is not an easy question to answer, it is possible, I think, to identify several more or less plausible attitudes that might tend to dampen one's enthusiasm for undertaking research in this field. One might take the position, for example, that research in this area is unnecessary, that man-computer systems are not different in principle from other man-machine systems, and that the information needed to 'human engineer' such systems already exists. From this point of view, the need is not to acquire new information but to apply that which we already have. I do not believe this position is really defensible. While one would be hard-pressed to argue convincingly that designers of interactive systems have made maximum use of the type of information one finds in human engineering handbooks (e.g. Morgan et al. 1963, Woodson and Conover 1964), even if the utmost care were taken to ensure conformity to acknowledged human factors principles of system design, the designer would still find it necessary to base many of his most
critical decisions on common-sense guesses or intuitive hunches. This is so because many of the questions that are relevant to the design of an interactive computer system have not been encountered in other contexts. In the case of no other man-machine system, for example, does it make sense to refer to the interaction as a 'conversation'. The designer of an interactive system is faced with genuinely new questions, the answers to which he will not find in a handbook.
Another possible attitude, and one that I find somewhat more defensible, is that research in this area is futile. Computer technology is moving so fast, so this argument goes, that knowledge obtained through experimentation is bound to be obsolete before it can be applied. Design improvements are discovered faster by trial and error than they could be by laboratory research. New systems incorporate, for the most part, the better features of their predecessors, while improving upon their weaknesses. Moreover, new systems come along with such rapidity that computers may well replace fruit flies as the primary object of study for those scientists who are concerned with the processes of evolution and natural selection. It seems to me that this argument has some force. Computer technology is indeed moving very fast, and research that is too specifically directed toward ironing the wrinkles out of any particular system is unlikely to be very useful. However, this is not an argument against research per se, but against research that is too narrowly conceived; the point being that if research in this area is to be worth while, it must be directed toward the discovery of general principles which can then be translated into the specifics of particular system designs. But, if we are not willing to admit that research on the human factors aspects of man-machine interaction is futile, we must at least concede that it is difficult. If a researcher wants to run controlled experiments on an operational interactive system, he faces many problems. First of all, he must have such a system available for experimental purposes, which can be costly. Moreover, if he must run his experiment while the system is servicing a user community, he must be careful that the manipulations he performs for the purposes of the experiment do not degrade the system's service to the non-experimental users.
Assuming that he can solve these and similar practical problems, there remains the critical one of ensuring that the data he collects will have implications that are not limited to the system and situation in which they were obtained. This problem is a very difficult one. A few efforts have been made to investigate experimentally the relative merits of interactive against batch-processing modes of operation, and the results bear witness to the problems involved in attempting to draw general conclusions from experimental data obtained with particular and highly idiosyncratic systems (Adams and Cohen 1969; Gold 1967; Grant and Sackman 1967; Schatzoff et al. 1967; Smith 1967). (See Lampson 1967 for a discussion of some of these problems.) But the fact that a problem is difficult is not usually sufficient cause for ignoring it - unless, perhaps, it is clearly intractable. Let us assume that 'difficult' is the appropriate word here, and that 'intractable' is too strong. In another place, some colleagues and I have taken the position that research on interactive systems is needed, and that this need represents a challenge to which the human factors community should respond (Nickerson et al. 1968). I want to argue this position again today and to comment on three specific
problem areas to which we might address ourselves: (1) the development and evaluation of conversational languages; (2) the investigation of how the use patterns adopted by users depend on system characteristics, and on system dynamics in particular; and (3) the problem of describing, or modelling, man-computer interaction.
2. Conversational Languages
What makes the man-computer interaction qualitatively different from other types of man-machine interaction is the fact that it may be described, without gross misuse of words, as a conversation. That is to say, the interaction involves a two-way exchange of information in the form of commands, requests, queries, answers to queries, and messages of sundry sorts. For all practical purposes, the user of an interactive system may think of the computer simply as a thing that 'understands' a 'language', or that obeys commands that are expressed in strict accordance with a set of rules. It is not necessary for him to know anything at all about the nature of the machine that is on the other side of his wall plug. It is desirable, however, that he become fluent in the language in which his conversations with the machine will take place. As languages begin to proliferate, it is natural that classification schemes emerge. Thus, already a number of distinctions are made: general-purpose v special-purpose, procedure-oriented v problem-oriented, business or administrative v scientific, algebraic v list-processing, supervisory v service. (The last distinction is particularly relevant to the use of a time-sharing system, since a user typically converses with the machine in both types of language during any given on-line session. Licklider (1968) has pointed out that there is a problem of 'interlingual compatibility' stemming from this fact.) Such distinctions suggest the need for a variety of languages of different types that are suited to different purposes. But when we confine our attention to a particular type of language and discover the variety of dialects that exists, we are bound to suspect that this sort of variability reflects a lack of understanding of what the users' needs really are.

BASIC (Kemeny and Kurtz 1966), CAL (Morey et al. 1969), CPS (Andrada et al. 1969), JOSS (Baker 1966; Marks and Armerding 1967) and TELCOMP (Myer 1967), for example, are conversational, algebraic languages, each of which was developed for a user community composed primarily of non-programmer scientists, engineers and students. Although these languages have many similarities, they also differ from each other in numerous ways: e.g. in the basic operations available to the user, the command grammar, the constraints imposed on variable names, conventions for defining functions and procedures, array-handling and subscripting techniques, output-formatting options, data-handling methods, provision for program annotation, preprogrammed functions provided, means for program segmentation, editing features, string-manipulating techniques, and flexibility of executing program segments. The important question, from the human factors point of view, is which of these differences are inconsequential, and which are likely to be reflected in the quality of the man-computer interaction. What are the basic operations that should be available to the user? How should the command grammar be defined? How much freedom should the user have in assigning names to variables? What sorts of program segmentation schemes should be provided? And so forth. Such decisions, typically, have been made by programmers or system designers in accordance with their own preferences and design philosophies. In some cases the differences in the resulting languages stem from differences among the hardware systems for which the various languages were originally developed. (A language may be given certain properties in order to compensate for the limitations or to capitalize on the idiosyncrasies of a particular machine.) More often, however, they represent differences of opinion concerning how an interactive language should be designed.

Taylor (1967) notes the importance of 'software human engineering' and illustrates the need by mentioning, among other things, the differing requirements of expert and novice users for language aids. How does one design a conversational language so that the language aids satisfy the requirements of users representing all levels of expertise with the system? All users tend to be impatient with redundancies in the language and with non-informative computer-to-user messages. The problem is that the extent to which any particular communication from the computer is redundant or non-informative depends on the amount of experience that the user has had with the system. Thus, a novice needs self-explanatory messages expressed in natural language; an expert, on the other hand, will undoubtedly be happier with messages that are as cryptically coded as possible, in accordance with his own mnemonic conventions. One common approach to this problem has been to allow the user to classify himself with respect to his level of expertness, and then to produce computer-to-user messages appropriate to that level. One problem with this approach is that it fails to recognize that any given user masters different aspects of the system at different rates; thus he may be quite familiar with some of the system's features, while remaining unskilled with respect to others.
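The self-classification approach just described can be sketched in modern terms. The snippet below is a minimal illustration, not a reconstruction of any system named in this paper; the message codes, texts, level names and the `Console` class are all invented for the example.

```python
# Sketch of the self-classification approach: the user declares an
# expertise level once, and every computer-to-user message is then
# rendered at that single level of verbosity. All names are illustrative.

MESSAGES = {
    "E12": {
        "terse": "E12",
        "verbose": "ERROR E12: the variable name exceeds the permitted length.",
    },
    "E07": {
        "terse": "E07",
        "verbose": "ERROR E07: a numeric constant was expected after the operator.",
    },
}

class Console:
    def __init__(self, level: str = "novice"):
        # 'novice' gets self-explanatory text; 'expert' gets coded messages
        self.level = level

    def emit(self, code: str) -> str:
        form = "verbose" if self.level == "novice" else "terse"
        return MESSAGES[code][form]

novice = Console("novice")
expert = Console("expert")
print(novice.emit("E12"))  # full natural-language explanation
print(expert.emit("E12"))  # just the code: E12
```

The weakness noted above is visible in the sketch: the level is a single global setting, even though a real user's expertise varies from one feature of the system to another.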
An alternative possibility would be to store each computer-to-user message in two forms: (a) a concise (perhaps one- or two-character) code and (b) a complete self-explanatory statement. The coded form of the message would always be output first; the complete self-explanatory statement would be output only when the user specifically requests it. The Conversational Programming System (CPS) developed by IBM for the OS/360 uses a technique of this sort for error messages: when the user receives a coded error message, he may respond by simply typing '?', which is a request for a complete self-explanatory message relating to the type of error that was committed (Andrada et al., op. cit.). An approach of this kind would seem to recommend itself in several ways. First, the same error-handling procedure would be appropriate for all users. Secondly, it would make time-consuming messages unnecessary: the user would never receive a lengthy message unless he had requested it, and, in any case, he should always have the option of terminating any message output (by pressing the carriage return) as soon as he recognizes what the message is. Thirdly, it would facilitate the user's acquisition of knowledge about the system's features.

Suppose that one wished to evaluate, from the human factors point of view, all of the conversational languages currently in use. How would one make the comparison? One possibility would be a tabular comparison, in which each column heading would give the name of one of the languages to be compared, and each row heading would
identify a feature with respect to which the comparison was to be made. Then each column of the table would constitute a description of a particular language in terms of the specified set of features. But what are the features with respect to which the comparison should be made? What would an ideal language look like in terms of these features? Given that none of the existing languages is ideal, how are we to combine the various features to come up with a single figure of merit? Ruyle et al. (1967) compared four systems that were designed to provide on-line assistance for mathematical analysis involving the manipulation of non-scalar entities such as functions and matrices. The systems compared were AMTRAN (Reinfelds et al. 1966), the Culler-Fried System (two versions) (Culler and Fried 1965), the Lincoln Reckoner (Stowe et al. 1966), and MAP (Kaplow et al. 1966). The paper points out the relative strengths and weaknesses of each of the systems with respect to a variety of capabilities and operating characteristics; it also makes clear the difficulty of arriving at a single figure of merit with which one might rate an overall system. (For another comparison of several programming languages see Raphael et al. 1968.) It may seem out of place to suggest that human factors specialists should take an active interest in the design and evaluation of conversational languages. Traditionally, human factors activity has been oriented very much toward hardware design problems. But, traditionally, men have not 'conversed' with their machines, at least not according to standard operating procedures. In discussing engineering-psychology problems in communications, Klemmer (1968) has pointed out the need to 'be interested in more than optimizing the man-machine interface; more than speech intelligibility, more than optimizing displays, controls, more than optimizing information flow and control actions in well-defined systems.
We must be concerned with improved methods of communication generally; with information dissemination and retrieval; with indexing, with ways of characterizing, describing and measuring communication (not merely information flow)'. The observation has equal force, I think, whether one is concerned with communication among people, or between people and machines.

3. Effects of System Characteristics on User Behaviour

The way in which one interacts with a system must depend to some degree on the nature of the facilities that the system provides. But such dependencies are relatively obvious: one cannot input or output graphics, for example, if the system does not have graphical input and output devices. A more difficult question concerns the way in which use patterns are influenced by system dynamics and operating characteristics. How is the user's strategy affected by such things as the system's accessibility, responsiveness, predictability, charging policy, etc.? Most of the evidence on this question, so far, is anecdotal; however, it is suggestive of possible directions for research. As one example of how system dynamics may affect use patterns, Scherr (1965) noted that the effect of placing a scheduling penalty on large programs (in MIT's Compatible Time-Sharing System) was to reduce the average program size by one-third in a space of three months. This observation led him to suggest that 'scheduling, if properly executed, could be used to "mould" the
users to some extent by assigning priority on the basis of job type, program size, program running time, user think time, etc. Then the user, in trying to "beat the system", will tend to conform to the image of what the writers of the scheduling program considered to be the ideal user' (p. 105). Another example of an operating characteristic that has been observed to influence use patterns is the 'overhead' associated with getting on and off the system. A user of a system that relies on magnetic tape for its bulk storage capability is likely to encounter particularly long delays when he first enters the system and calls system programs, or his own user programs, off the tape; similarly, he may have to wait a long time when getting off the system if he wishes to create new tape files before doing so. If he tends to use the system frequently, but for short periods of time, he will find that a large fraction of his on-line time is consumed by the process of starting and stopping the work session. Thus, he is encouraged to adopt a pattern of use in which the sessions are relatively long and infrequent. With a rapid-access bulk storage capability, on the other hand, the (time) overhead of entering and exiting can be considerably reduced, and the pressure of planning long work sessions may be diminished as a consequence. At Bolt Beranek and Newman we have recently converted from magnetic tape to disc for bulk storage on a time-shared system that is used by the technical staff. Although it is too early to say definitely, it looks as though one of the effects of this conversion is indeed to change use patterns in the way suggested, i.e. the frequency of work sessions seems to be increasing while the mean duration of a session is going down. A third example of a system property that may affect user behaviour is system response time (Carbonell et al. 1968). Fast response times are generally considered to be the sine qua non of an interactive system.
Moreover, 'the faster the better' might appear to be a reasonable design objective. However, response times tend to vary with such things as the load on the system and the type of command that is being executed; and there is some indication that the variability of the response-time distribution may be nearly as important, from the user's point of view, as its mean. It appears that the degree to which a user is frustrated by a delay is not simply a function of its duration. Rather, it depends on such psychological factors as the degree of uncertainty in the mind of the user concerning how long the delay will be, the extent to which the actual delay contradicts his expectations, and what he considers to be the cause of the delay. That the user's uncertainty concerning how long he must wait for a response from the computer may, in some cases, be a greater source of aggravation than the delay itself was pointed out by Neisser (1965). One implication of this is that a relatively long response time of constant (and known) duration may, in some cases, be more tolerable than a highly variable one with a relatively short mean. In keeping with this idea Simon (1966) has suggested that some thought be given to the possibility of quantizing delays rather than allowing them to vary over a continuous range, the idea being that, if the computer cannot respond to a command immediately, the delay should, perhaps, be artificially extended to a known duration that would be long enough to allow the user to attend temporarily to another task. The importance of the user's expectations has been tacitly recognized by at least one supplier of time-shared services, who programmed artificial delays into
the system so that it never responded as though there were fewer than three or four users on line. Thus, even if a user happened to be the only one on the system at a particular time, the delays that he encountered were similar to those that he would have encountered had he been sharing the system with two or three others. The purpose of this subterfuge was to guard the user against discovering the fact that there were conditions under which the system's response time was very short indeed. Given that these conditions for extremely short response times were rarely realized, it seemed better to the system designer that the user never be allowed to experience them. It was felt that an awareness of what the system could do under ideal circumstances would only serve to make one less satisfied with the system under normal operating conditions. The idea that the apparent cause of a delay determines, to some extent, the user's willingness to tolerate it is also based strictly on anecdotal evidence, but seems eminently plausible. If a user has given the computer a command which involves a great deal of computation, he will perhaps be less frustrated by a delayed response than if he has simply asked that it type a program statement. That is to say, he is less likely to be frustrated by delays as long as he feels the computer is working on his problem. Being ignored - or apparently ignored - is what hurts. It has been suggested that the user's terminal should contain a light which would be on whenever the computer is attending to his program, thus making those long delays that result from the user's demands more tolerable (Turoff 1967). The problem with this solution is that it may also increase the user's frustration over those delays during which the computer is, in fact, ignoring his program.
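Simon's quantizing suggestion, discussed above, can be sketched as follows. This is a minimal illustration under assumed quantum values (1, 5, 15 and 60 seconds); the paper specifies no actual durations, so the numbers are invented for the example.

```python
import math

# Sketch of Simon's (1966) delay-quantizing idea: rather than letting
# response delays vary continuously, round each delay UP to the next
# fixed quantum, so the user always faces one of a few known waits and
# can plan to attend to another task. Quantum values are illustrative.

QUANTA = [1, 5, 15, 60]  # seconds; the only delay durations ever advertised

def quantized_delay(actual_delay: float) -> float:
    """Return the advertised (artificially extended) delay."""
    for q in QUANTA:
        if actual_delay <= q:
            return q
    # Beyond the largest quantum, extend in whole multiples of it.
    return math.ceil(actual_delay / QUANTA[-1]) * QUANTA[-1]

print(quantized_delay(0.3))   # 1  -> the user knows it is a one-second wait
print(quantized_delay(7.2))   # 15
print(quantized_delay(130))   # 180
```

The trade-off is exactly the one the text describes: some delays are made longer than necessary, in exchange for removing the user's uncertainty about how long each wait will be.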
The first empirical evidence that system response time can indeed affect the user's strategy for interacting with the system has been reported by a group of researchers at MIT's Lincoln Laboratory (Yntema 1968, Grossberg et al. 1969). Subjects used the Lincoln Reckoner system (Forgie et al. 1966) to solve a variety of computational and graphical problems under conditions in which the system's response time was intentionally varied. Although the results depended somewhat on the nature of the problem being solved, they clearly showed that users adapted to the system's dynamics. In some cases, as response time increased, subjects took more time to solve a problem, but they solved it with fewer commands. As Yntema put it, 'as an interaction becomes more expensive, the subject makes fewer of them. He evidently thinks harder about each interaction, and tries to make each one count'. This experiment is significant, not so much because of the particular results obtained, however, but as a demonstration that controlled studies of interactive problem solving can be made, and meaningful results obtained. We have argued in another place (Carbonell et al., op. cit.) that the procedure for calculating the cost of use of an interactive system should be of interest to the human factors specialist as well as to supplier accountants. The reasons for this claim are (1) that different suppliers of time-shared computing services use different algorithms for computing costs; and (2) that it seems reasonable to expect that different charging procedures will foster different patterns of user behaviour. For example, the user who is charged solely on the basis of 'hookup' or 'on-line' time may feel compelled to interact with the system in a rather different way than the user whose charge is strictly a function of the
amount of CPU time used. Many suppliers figure both on-line and CPU time into their cost calculations, but they do not all use the same combination rule. Although little empirical evidence is yet available on exactly how costing procedures reflect themselves in user work habits, there are many indications in the literature of an awareness of the significance of this question. O'Sullivan (1967), for example, discusses the different billing philosophies of a number of time-sharing services to which one company simultaneously subscribed, and points out the fact that some costing structures may work to the advantage of users whose CPU-time/on-line-time ratio is low, while others favour those whose compute demands are high. The differences, in this case, were apparently significant enough to lead the (buyer) company to assign users with low compute demands to one vendor and those whose compute demands were high to another. Turoff (op. cit.) has suggested that, when a company subscribes to an interactive computing service for use by its employees, the employees should not be charged for their use of the system on an individual basis. Rather, he argues, 'financing should be out of overhead for a group of similar users ... The user should have the same attitude toward this service as he would have in obtaining the use of a secretary from a pool existing for that purpose' (p. 10). Whether this is a viable suggestion remains to be seen. The point I want to make is that the suggestion stemmed from a concern for the possibility that charging on an individual basis might discourage potential users from learning how to use the system; one might hesitate to 'waste his budget on the learning effort'. We might note, in passing, that the highly successful Dartmouth College time-shared system is operated within the college community on such a basis.
All students are free to use the system - through the numerous terminals scattered over the campus - whenever and for whatever purpose they like. 'The student is not charged directly for his computer use; the charges for all students are sent as a single bill to the college at the end of the year' (Kurtz 1969, p. 653). It would be absurd, of course, to attribute the success of the Dartmouth experiment to this policy alone; nevertheless, one suspects that things might have gone differently had student use of the system been carefully monitored and tightly controlled. Selwyn (op. cit.) has discussed the problem of pricing and has argued that not only should computing facilities and services be charged on a 'pay-for-what-you-use' basis, but communications charges should be similarly derived. At the moment, the latter are figured in terms of the time that one is on the line and the length and bandwidth of the line. Selwyn's contention is that a more equitable charging scheme would consider only the number of characters - or, perhaps, bits - actually transmitted. From the human factors point of view, a particularly interesting suggestion that he makes is that the user should be given certain options on the type of service he receives. For example, he should be charged less when he is willing to use the system during non-prime time, or to accept a relatively low priority in the service queue. The variety of pricing schemes currently in use and the diversity of opinions concerning what pricing policy should be guarantee that differences will exist for some time. Moreover, as the facilities and services offered by suppliers become more varied and complex, so will the algorithms for computing costs. The interesting problem for the human factors specialist will be to discover
how alternative charging policies affect the ways in which users interact with the systems.

4. Problem of Description

The problem of description is this. There exists an interactive, time-sharing computer system which is designed to service x users simultaneously. How are we to describe the system, its capabilities, and its operation? Describing the system itself is an easy problem, at least if we are willing to consider a specification of components and their interconnections to be a description. Describing the system's capabilities is more difficult. Even in the case of conventional batch-processing systems, anyone who has had the problem of comparing the capabilities of various systems knows that this is a non-trivial undertaking. Such standard descriptors as storage capacity, processor speed, word length, instruction set, etc., tell only a part of the story in this regard. Thus, people who have the responsibility of evaluating or comparing the capabilities of different systems tend to rely heavily on the use of so-called 'bench-mark' problems or programs. Candidate systems are evaluated in terms of the way in which they perform some well-defined task (e.g. computing a payroll, or inverting a matrix of a specified size). Given such a task, one can compare systems with respect to such measures as the size of the program required, execution time, and accuracy of the results. Although the bench-mark technique has been a useful one, it has its drawbacks. First, there is the problem of deciding what tasks should be used as bench marks. This is easy if the system is going to be used repetitively to perform the same, or a small set of, tasks, as would be the case in many business applications. It is less straightforward, however, if the system is to be used for a wide variety of tasks, some of which are not even known in advance, as is often the case in scientific applications.
The fact that a computer system makes a good showing on one bench-mark task is no guarantee that it will do so on others. Second, there is the fact that the computer's showing on a bench-mark problem depends not only on the capabilities inherent in the system, but also on the skill of the man who programs it. Third, bench-mark evaluations typically overlook such nontrivial factors as the time and effort required to develop and debug the program that is to perform the bench-mark task. The difficulty of describing the system's capabilities is magnified in the case of multi-user interactive systems, because what a system can do will depend, usually in nonlinear ways, on the particular mix of demands placed upon it. Moreover, if the system is intended to be a general-purpose computing facility for a heterogeneous community of users, meaningful bench-mark problems are next to impossible to identify. In evaluating interactive systems, one typically resorts to a check-list approach in which a variety of pertinent questions is considered regarding such things as languages and subsystems provided, average load on system (how close to saturation is the system usually run?), storage limitations, types of terminals used, costs (including communication costs), system-response time, reliability, accessibility, etc. (Computer Research Corporation 1968, Hammersmith 1968, O'Sullivan op. cit.). How one combines the answers to these questions in attempting to assess the relative merits of a particular system is not very clear.
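A bench-mark evaluation of the kind just described can be sketched in modern terms: run a well-defined task (here, inverting a small matrix by Gauss-Jordan elimination) and record both execution time and the accuracy of the result. The harness below is only an illustration of the idea; the figures it produces characterize whatever machine it happens to run on, which is precisely the point of a bench mark.

```python
import time

def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with
    partial pivoting (pure Python, for illustration only)."""
    n = len(a)
    # Augment each row with the corresponding row of the identity.
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest element to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        d = m[col][col]
        m[col] = [x / d for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0.0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]

def benchmark(a):
    """Return (execution time, worst deviation of A*inv(A) from I)."""
    t0 = time.perf_counter()
    inv = invert(a)
    elapsed = time.perf_counter() - t0
    n = len(a)
    err = max(abs(sum(a[i][k] * inv[k][j] for k in range(n))
                  - (1.0 if i == j else 0.0))
              for i in range(n) for j in range(n))
    return elapsed, err

a = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
elapsed, err = benchmark(a)
print(f"time: {elapsed:.6f} s, max deviation from identity: {err:.2e}")
```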
Tougher still than the problem of describing a system's capabilities is that of describing its operation. It is this problem, however, that should be a particular challenge to the psychologist. How is one to describe the man-computer interaction? What are the concepts that we need? In other areas of man-machine interaction, notably manual control, it has been considered essential that the behaviour of the man and that of the machine be described in the same terms. It seems likely that, as long as it is the interaction that we wish to describe (and not just one or the other side of it), the same principle will hold when the machine happens to be a computer. There is a danger here, though, to which psychologists should be particularly sensitive; namely, that of choosing concepts which are well suited to describing the dynamics of a computer system, but which capture little of the psychology of the man-computer interaction. We shall return to this point presently. Most of the work that has been done in this area to date has been directed to the description of system time parameters. This commonly involves defining a state space, noting the times at which a transition is made from one state to another, and then calculating such measures as the relative amounts of time spent in each state, probabilities of transitions between specific states, distributions of intertransition times, etc. The problem of generating a set of states is that of establishing a set of equivalence classes which are: (1) exhaustive, (2) mutually exclusive, (3) unambiguous and (4) interesting. That is to say, it is necessary that there be a class for every possible observation, that no observation belongs in more than one class, that there be well-defined criteria for determining to which class any particular observation belongs, and, finally, that the classification scheme be non-trivial.
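The bookkeeping just described is easy to sketch. Assuming only a timestamped log of state transitions (the state names and times below are invented for the example; any exhaustive, mutually exclusive classification would do), one can compute the relative time spent in each state and the first-order state-to-state transition probabilities:

```python
from collections import Counter, defaultdict

# A hypothetical transition log: (time in seconds, state entered).
log = [(0.0, 'working'), (2.0, 'output wait'), (2.5, 'input wait'),
       (9.5, 'working'), (11.0, 'input wait'), (15.0, 'working'),
       (16.0, 'dormant')]

def time_fractions(log):
    """Fraction of total observed time spent in each state."""
    total = log[-1][0] - log[0][0]
    spent = defaultdict(float)
    for (t0, state), (t1, _) in zip(log, log[1:]):
        spent[state] += t1 - t0
    return {s: d / total for s, d in spent.items()}

def transition_probabilities(log):
    """First-order state-to-state transition probabilities."""
    counts = Counter((a[1], b[1]) for a, b in zip(log, log[1:]))
    outgoing = Counter(a[1] for a in log[:-1])
    return {(s, t): c / outgoing[s] for (s, t), c in counts.items()}

print(time_fractions(log))
print(transition_probabilities(log))
```

The same two functions, given a real log, yield exactly the measures listed above: relative time in each state and probabilities of transitions between specific states.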
What one wants to be able to do, in short, is to look at the system (man and/or computer) at any point in time and to be able to answer the question (in terms of the activity classes incorporated in the model) 'What is he/it doing now?' And one would like the answer to be an interesting one. The problem is essentially that of deciding on a set of descriptors which avoids the extreme of triviality on the one hand and that of unmanageable complexity on the other. What constitutes an interesting state taxonomy depends on the model builder's purpose. A model that is useful to a systems programmer trying to develop efficient scheduling algorithms may be of little interest to a psychologist attempting to understand interactive problem solving. And the converse is equally true. It may be instructive to consider several examples of the state-space modelling approach in enough detail to gain some appreciation for the latitude one has in defining system states. Perhaps the first attempt to describe the dynamics of an interactive system (M.I.T.'s Compatible Time-Sharing System) in this way was made by Scherr (op. cit.). At the most molar, or macroscopic, level, he distinguished two states: 'the working part of the interaction', the time during which the user is waiting for the system, and 'the console part of the interaction', or the time during which the system is waiting for the user. At a more microscopic level, these two states were broken down into six: 'dead', 'dormant', 'working', 'waiting command', 'input wait' and 'output wait', one, and only one, of which characterizes the internal disposition of the user's program in the CTSS system at any point in time. (The third and fourth constitute the working part of the interaction; the console part is made up of the remainder.) No finer grain partitioning of activities was
attempted. For example, no distinction was made between the time that the user spends inputting data and the time he spends thinking; 'thinking time' was equated with the console part of the interaction in this analysis. Within this framework, Scherr considered how various performance parameters were related to the number of users on the CTSS system, and was able to predict how modification of the scheduling algorithm would affect overall system performance measures. Brown (1965) used a similar approach to obtain some time measures on an interactive system used in the analysis of bubble chamber films. (The user's role in this system is to identify those aspects of a film which warrant quantitative analysis. This he does by typing command sequences which specify the measurements that are to be made and by pointing with a light pen to specify those parts of the film frame to which the commands apply.) Brown distinguished three states of the system reflecting three functions of the executive program: (1) ingesting data from the film, (2) controlling typewriter dialogue and (3) maintaining the output data (CRT) display. By studying the frequency distributions of delays terminated by different state-to-state transitions, Brown was able to draw some conclusions regarding the economics of time-sharing such a program. Wallace and Rosenberg (1966) have discussed the problem of describing systems which have a sufficiently large number of states (i.e. thousands) to exclude the feasibility of either standard analytic techniques or Monte Carlo simulation. They propose, as one approach to the problem, the determination of equilibrium probabilities of a system by means of iterative numerical methods which capitalize on the sparseness and internal redundancy of the matrices which specify the state-to-state transition probabilities.
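The iterative idea can be illustrated on a toy chain. The sketch below finds the equilibrium (stationary) distribution of a small Markov chain by repeatedly multiplying the distribution vector by a sparsely stored transition matrix, touching only the non-zero entries. The three-state chain is invented for the example, and Wallace and Rosenberg's actual methods are considerably more sophisticated.

```python
# Equilibrium probabilities of a Markov chain by simple iteration.
# The transition matrix is stored sparsely: for each state, only the
# transitions with non-zero probability. The chain itself is invented.
P = {
    'run':  [('run', 0.6), ('wait', 0.4)],
    'wait': [('run', 0.5), ('io', 0.5)],
    'io':   [('run', 0.9), ('io', 0.1)],
}

def equilibrium(P, iterations=200):
    """Iterate pi <- pi * P from a uniform start until (effectively)
    convergence; only non-zero matrix entries are ever touched."""
    states = list(P)
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(iterations):
        nxt = {s: 0.0 for s in states}
        for s, prob in pi.items():
            for t, p in P[s]:
                nxt[t] += prob * p
        pi = nxt
    return pi

pi = equilibrium(P)
print({s: round(p, 4) for s, p in pi.items()})
```

For a chain with thousands of states and a sparse matrix, the cost per iteration grows with the number of non-zero entries rather than with the square of the number of states, which is the saving the authors exploit.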
They were able to show impressively large savings in computation requirements; however, they were careful to point out that the description of a system in terms of Markov chains implies a certain amount of sophistication and cleverness at best, and at worst may be impossible. Smith (1966) distinguished five phases of system operation in terms of the disposition of the program of a particular user: (1) CPU execution, (2) an information transfer involving high-speed core memory and an intermediate bulk-storage module, (3) operator response, (4) an information transfer involving a very large capacity file-storage module, and (5) queueing for phases 1, 2 or 3. Associated with each of the first four phases is a single parameter, the mean execution time for that phase. (Execution times were assumed to be exponentially distributed.) The state of the system at any point in time was described by a four-component vector, each element of which represents the length of the queue awaiting one of the four basic operations. Given this conceptualization, Smith then suggested several measures which one might use to relate the performance of the system to such variables of interest as the nature of the scheduling algorithm and the number of users in the system. Among the various measures mentioned, the two most relevant to the problem of describing the man-computer interaction are (1) 'user busy fraction: the average fraction of each interaction period that a user is busy, that is, making a response', and (2) 'user program response: the average fraction of the total time a user program is eligible to use system processors (CPU and data channels) that it does actually use them'. The first measure, according to Smith, 'probably
represents the average user's subjective evaluation of the system'; the second 'is a measure of the overall queueing delays experienced by user programs' (p. 90). Grignetti (1969) has recently developed a queueing model of the Bolt Beranek and Newman TELCOMP system, in which he conceptualizes the system as a set of 'servers': run server, drum server, TTY-in server and TTY-out server. A user's program is said to be in one of five states (run, drum, TTY-in, TTY-out, or wait), depending on whether it is being served by one of the above servers or is in a queue waiting for service. Grignetti's primary objective was to identify bottlenecks in an operating system, that is, to determine which components impose the throughput limitations under different degrees of saturation. This he was able to do by investigating how a variety of measures (probability density of time in each state, probability density of number of users in each state, first-order state-to-state transition probabilities) depend upon the number of users on the system. (For a summary of several other queueing models see Estrin and Kleinrock 1967.) The state-space approach to the problem of describing the behaviour of an interactive system has the virtue of conceptual elegance: one can visualize the behaviour of the system (or a program) in terms of a directed graph in which the nodes represent states and the weights on the inter-node paths represent state-to-state transition probabilities. Moreover, the taxonomies that have been used have the advantage, in most cases, of clear-cut objective criteria for determining what state the system is in at any time (sufficiently objective, in fact, that the computer can note the times at which transitions from one state to another occur and can keep a record of the amount of time spent in each state). The possibility of ambiguity is ruled out by the way in which the states of the system or program are defined.
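How a queueing description exposes saturation can be shown with a deliberately crude sketch. The model below (a single processor shared by n users, with invented submission and completion probabilities) is far simpler than Grignetti's, but it exhibits the phenomenon of interest: as the number of users grows past what the server can absorb, the average queue length climbs steeply, marking that server as the bottleneck.

```python
import random

def simulate(n_users, p_submit=0.05, p_finish=0.15, steps=200000, seed=1):
    """Crude time-stepped model of n_users sharing one processor.
    Each step, every 'thinking' user submits a request with probability
    p_submit; the request at the head of the queue completes with
    probability p_finish. Returns the average queue length (requests
    waiting or in service). All parameters are invented."""
    random.seed(seed)
    thinking = n_users
    queue = 0
    total = 0
    for _ in range(steps):
        for _ in range(thinking):
            if random.random() < p_submit:
                thinking -= 1
                queue += 1
        if queue and random.random() < p_finish:
            queue -= 1
            thinking += 1
        total += queue
    return total / steps

for n in (1, 5, 10, 20):
    print(n, round(simulate(n), 2))
```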
Finally, the approach has proved useful in the solving of practical problems. It has been possible, in this way, to identify throughput bottlenecks, and, hence, to determine how scheduling algorithms might be modified to reduce waiting time and make more efficient use of the system's resources. From a human factors point of view, the main weakness in the descriptive frameworks that have been developed is that they shed little light on the psychology of the man-machine interaction. The user tends to appear as a unit in various service queues and is of interest primarily because of the demands that he places on the system's resources. One would like to see some descriptive taxonomies and models in which the user is at least as much a focal point as the system with which he is interacting. This is not to suggest that taxonomies of this type should replace those considered above, but, rather, to stress the need for approaches to the problem of description which would complement the current emphasis on the non-psychological aspects of the interaction. Carbonell has recently developed a conceptual model of man-computer interaction which does in fact focus on the man. In his words, his purpose was to model 'what a human operator does when he is sitting in front of a console of a time-sharing system: how he inputs a problem, observes results, and makes decisions conditional upon those results and other factors; how he is affected by either intrinsic or operating-system characteristics, etc.' (Carbonell 1969, p. 16). It is interesting to note that one of his reasons for taking this approach, rather than a more conventional one, was the conviction that man is the more
constant member of the man-computer partnership, and, hence, more and more likely to become the limiting factor in man-computer achievements as computer technology advances. By focusing on the man, and treating the computer as a tool that he uses, one increases his chances of developing models that are not system-specific and that may be valid for a significant period of time in spite of the rapidity of the technical developments in the computer field. Carbonell conceptualizes the man-computer interaction process as involving communication between two information structures, one of which is internal to the man, the other being within the computer. Mills (1966, 1967) has expressed a similar view. The modelling task then is to describe the process by which these structures, particularly that of the man, become modified. The user interacts with the computer in general accordance with some plan of action in an attempt to reduce or eliminate his uncertainty regarding some problem. In the cases of greatest interest, the plan is not detailed; that is, the man does not preprogram himself, and, consequently, he is constantly evaluating the computer's output, reformulating hypotheses concerning possible problem solutions, and deciding among alternatives for subsequent courses of action. The dynamics of this process of hypothesis formulation, testing, reformulation, etc., is what Carbonell suggests should be the locus of interest, and for which he attempts to develop a descriptive framework. The problem is that he finds himself with a variety of variables which are considerably more difficult to measure than those incorporated in the models mentioned above. Nevertheless, it seems to me that the attempt to model the psychology of man-computer interaction is a step in the right direction, and it is to be hoped that this attempt will be followed by others shortly.

5. Concluding Remarks

There are, of course, other problem areas, besides the three mentioned, that could profit from human factors research. A particularly obvious case in point is the problem of interface design. Although many of the most promising and exciting forms of man-computer interaction involve user terminals with graphical capabilities, it seems probable that for some time to come a keyboard device will be the basic component of the input-output terminal for many, if not most, users of interactive systems. If this is a reasonable assumption, it makes sense to devote considerable effort to the human engineering of these devices. At the moment, the most common keyboard instruments are standard, commercially available typewriters and teletypes. With the exception of the work done by the RAND Corporation in connection with their development of the JOSS system (Baker 1967), little attempt has been made to determine what constitutes an optimally designed keyboard for the user of an on-line system. What are the symbols that such a device should have? What mathematical and logical symbols does the user need? What special formatting symbols would facilitate his work? Is it important that he have both upper and lower case characters? How should the symbols be arranged on the keyboard? What can be done to help make programs and hard-copy records of the man-computer dialogue easier to read? What are the relative merits of large keyboards with many keys, each having a unique function, as opposed to small keyboards with few keys in which each key can have many functions? Or, in more general terms, what is the optimal combination of
number of keys per keyboard and number of functions per key? Perhaps the basic question is whether it is reasonable to think in terms of a general-purpose keyboard. Possibly different applications imply radically different designs. Hopefully, the pressures to improve keyboards, which should increase with the number of users of on-line systems, will provide the needed stimulus for more research addressed to these and similar problems. Corbato and Fano (1966) have noted the need for considerable improvement in output devices before time-sharing can eventuate in general intellectual utilities. They point out that 'efficient communication between the time-sharing system and its users will become at least as important as the operation of the system itself' (p. 136), and they consider the challenge that this problem presents to system designers to be a particularly crucial one. The need for a low-cost, reliable, versatile, durable, easy-to-use terminal becomes increasingly great as the community of users of interactive systems grows and becomes increasingly more heterogeneous. One wonders how soon computing and information utilities would become a reality if such a device were currently available. Another research need has been pointed out by Licklider in his 1968 review of man-computer communication. He mentioned two areas that he had particularly wanted to cover, but could not for lack of significant material. The areas to which he referred, man-computer interactive techniques and on-line problem solving and decision making, are both areas with which human factors researchers should be particularly concerned. This paper has been rather narrowly focused on one type of man-computer interaction; namely, that which may be characterized as 'conversational' and which occurs in a time-sharing environment. There are other types of interaction between men and computers, however, that also should be of concern to the psychologist and human factors researcher. Let me draw
an example from the field of psychology itself. Until a very few years ago the use of a computer for the real-time control of psychological experiments was something of a novelty; today it is commonplace (Uttal 1968). A major factor contributing to this development has been the fact that the cost of computing hardware has decreased to the point that many individual researchers have found it possible to have their own 'dedicated' machine(s) (Markowitz 1969). The notion of man-computer interaction has a somewhat different connotation when applied to the long-term relationship between a scientist and the machine that he uses to control his experiments than when applied as it has been in this paper, but it is no less interesting for that. It seems reasonable to expect that this type of interaction may have significant effects not only on the research strategy and experimental procedures that one employs, but on the nature of the problems that one works on and the theories that he develops as well. In any case, here is an interesting question for the psychologist who finds the behaviour of his colleagues a worthy object of study. The remarks in this paper have been based on the assumption that the community of computer users will continue to increase, both in size and in heterogeneity. If that assumption is sound, then human factors problems will become increasingly significant for the design and use of computer systems. To paraphrase Holmes (1969), the need of the future is not so much for computer-oriented people as for people-oriented computers. Perhaps the very
fact that a symposium such as this has been organized may be taken as an indication that human factors researchers are beginning to respond to that need.

This work was sponsored by the Electronic Systems Division (ESD), Air Force Systems Command, USAF, L. G. Hanscom Field, under ARPA Order No. 627, Contract No. F19628-68-C-0125.
References

ADAMS, J., and COHEN, L., 1969, Time-sharing vs instant batch processing: An experiment in programmer training. Computers and Automation, 18, 30-34.
ANDRADA, J. E., et al., 1969, CPS Terminal Users' Manual. TUCC Memorandum No. LS-55, Triangle Universities Computation Center, Research Triangle Park, N.C.
BAKER, C. L., 1966, JOSS: Introduction to a helpful assistant. Memorandum RM-5058-PR, The Rand Corp., Santa Monica, Calif.
BAKER, C. L., 1967, JOSS: Console design. Memorandum RM-5218-PR, The Rand Corp., Santa Monica, Calif.
BROWN, R. M., 1965, An experimental study of an on-line man-computer system. IEEE Transactions on Electronic Computers, EC-14, 82-85.
CARBONELL, J. R., ELKIND, J. I., and NICKERSON, R. S., 1968, On the psychological importance of time in a time-sharing system. Human Factors, 10, 135-142.
CARBONELL, J. R., 1969, On man-computer interaction: A model and some related issues. IEEE Transactions on Systems Science and Cybernetics, SSC-5.
COMPUTER RESEARCH CORPORATION, Fall 1968, Time-sharing system scorecard: A survey of on-line multiple user computer systems, No. 7.
CULLER, G. J., and FRIED, B. D., 1965, The TRW two-station on-line scientific computer. In Computer Augmentation of Human Reasoning (Edited by M. A. SASS and W. D. WILKINSON) (Washington, D.C.: SPARTAN BOOKS).
DAVIS, R. M., 1966, Man-machine communication. In Annual Review of Information Science and Technology, 1 (Edited by C. A. CUADRA) (New York: INTERSCIENCE PUBLISHERS).
ESTRIN, G., and KLEINROCK, L., 1967, Measures, models and measurements for time-shared computer utilities. Proceedings-Association for Computing Machinery, 85-96.
FANO, R. M., and CORBATO, F. J., 1966, Time-sharing on computers. Scientific American, 215, 128-140.
GOLD, M. M., 1967, Time-sharing and batch processing: An experimental comparison of their value in a problem solving situation. Dept. of Computer Science Report, Carnegie-Mellon University.
GRANT, E. E., and SACKMAN, H., 1967, An exploratory investigation of programmer performance under on-line and off-line conditions. IEEE Transactions on Human Factors in Electronics, HFE-8, 33-48.
GRIGNETTI, M. C., 1969, Microanalysis and probabilistic modelling of a time-shared system. Bolt Beranek and Newman Inc., Cambridge, Mass., Scientific Rept. No. 11.
HAMMERSMITH, A. G., 1966, Selecting a vendor of time-shared computer services. Computers and Automation, 12, 16-22.
HOLMES, D. C., 1968, Computers in oil-1967-1987. In Computer Yearbook and Directory, 2nd ed. (Edited by F. H. GRILLE) (Detroit: AMERICAN DATA PROCESSING).
KAPLOW, R., BRACKETT, J., and STRONG, S., 1966, MAP, a system for on-line mathematical analysis: Description of the language and user manual. Massachusetts Institute of Technology, Cambridge, Mass., Project MAC Technical Rept. MAC-TR-24.
KEMENY, J. G., and KURTZ, T. E., 1966, BASIC, 3rd ed. (Hanover, N.H.: DARTMOUTH COLLEGE).
KURTZ, T. E., 1969, The many roles of computing on the campus. Proceedings-Spring Joint Computer Conference, 649-656.
KLEMMER, E. T., 1968, Key engineering psychology problems in communications. Paper presented as part of a symposium entitled 'Key Research Problems in Engineering Psychology' at the Annual Convention of the American Psychological Association, San Francisco, Calif.
LAMPSON, B. W., 1967, A critique of 'An exploratory investigation of programmer performance under on-line and off-line conditions'. IEEE Transactions on Human Factors in Electronics, HFE-8, 48-51.
LICKLIDER, J. C. R., 1968, Man-computer communication. In Annual Review of Information Science and Technology, 3 (Edited by C. A. CUADRA) (Chicago, Ill.: ENCYCLOPAEDIA BRITANNICA, INC.).
MARKOWITZ, J., 1969, Can a small machine find happiness in a company that pioneered time-sharing? DECUS Proceedings (in press).
MARKS, S. L., and ARMERDING, G. W., 1967, The JOSS primer. Memorandum RM-5220-PR, The Rand Corp., Santa Monica, Calif.
MILLS, R. G., 1966, Man-computer interaction-present and future. IEEE International Convention Record, 14, 196-198.
MILLS, R. G., 1967, Man-machine communication and problem solving. In Annual Review of Information Science and Technology, 2 (Edited by C. A. CUADRA) (N.Y.: INTERSCIENCE PUBLISHERS).
MOREY, C. F., SMITH, P. W., Jr., and STERN, R., 1969, CAL Manual. Bolt Beranek and Newman Inc., Cambridge, Mass. (in preparation).
MORFIELD, M. A., WIESEN, R. A., GROSSBERG, M., and YNTEMA, D. B., 1969, Initial experiments on the effects of system delay on on-line problem-solving. Lincoln Laboratory, Massachusetts Institute of Technology, Lexington, Mass., Tech. Note TN-1969-5.
MORGAN, C. T., COOK, J. S., III, CHAPANIS, A., and LUND, M. W. (Eds.), 1963, Human Engineering Guide to Equipment Design (N.Y.: MCGRAW-HILL).
MYER, T. H., 1967, TELCOMP Manual. Bolt Beranek and Newman Inc., Cambridge, Mass.
NEISSER, U., 1965, MAC and its users. Massachusetts Institute of Technology, Cambridge, Mass., Project MAC Memo 185.
NICKERSON, R. S., ELKIND, J. I., and CARBONELL, J., 1968, Human factors and the design of time-sharing computer systems. Human Factors, 10, 127-134.
O'SULLIVAN, T. C., 1967, Exploiting the time-sharing environment. Proceedings-Association for Computing Machinery, 169-175.
RAPHAEL, B., BOBROW, D. G., FEIN, L., and YOUNG, J. W., 1968, A brief survey of computer languages for symbolic and algebraic manipulation. In Symbol Manipulation Languages and Techniques (Edited by D. G. BOBROW) (Amsterdam: NORTH HOLLAND PUBLISHING CO.).
REINFELDS, J., FLENKER, L., SEITZ, R., and CLEM, P., Jr., 1966, AMTRAN, a remote-terminal conversation-mode computer system. Proceedings-Association for Computing Machinery, 469-478.
RUYLE, A., BRACKETT, J. W., and KAPLOW, R., 1967, The status of systems for on-line mathematical assistance. Proceedings-Association for Computing Machinery, 151-167.
SCHATZOFF, M., TSAO, R., and WIIG, R., 1967, An experimental comparison of time-sharing and batch processing. Communications of the ACM, 10, 261-265.
SCHERR, A. L., 1965, An analysis of time-shared computer systems. Ph.D. thesis, Massachusetts Institute of Technology, Cambridge, Mass., MAC-TR-18.
SELWYN, L. L., 1966, The information utility. Industrial Management Review, 7, 17-26.
SIMON, H. A., 1966, Reflections on time-sharing from a user's point of view. Computer Science Research Review, Carnegie Institute of Technology, 43-51.
SMITH, J. L., 1966, An analysis of time-sharing computer systems using Markov models. Proceedings-Spring Joint Computer Conference, 87-95.
SMITH, L. B., 1967, A comparison of batch processing and instant turnaround. Communications of the ACM, 10, 495-500.
STOWE, A. N., WIESEN, R. A., YNTEMA, D. B., and FORGIE, J. W., 1966, The Lincoln Reckoner: An operation-oriented on-line facility with distributed control. Proceedings-Fall Joint Computer Conference, 433-444.
TAYLOR, R. W., 1967, Man-computer input-output techniques. IEEE Transactions on Human Factors in Electronics, HFE-8, 1-4.
TUROFF, M., 1967, Knowns and unknowns in immediate access time-shared systems. Paper presented at the American Management Association Conference on The Computer Utility, New York.
UTTAL, W. R., 1968, Real-Time Computers: Technique and Applications in the Psychological Sciences (New York: HARPER & ROW).
WALLACE, V. L., and ROSENBERG, R. S., 1966, Markovian models and numerical analysis of computer system behavior. Proceedings-Spring Joint Computer Conference, 141-148.
WOODSON, W. E., and CONOVER, D. W., 1964, Human Engineering Guide for Equipment Designers, 2nd ed. (Calif.: UNIVERSITY OF CALIFORNIA PRESS).
YNTEMA, D. B., 1968, Engineering psychology in man-computer interaction. Paper presented at Annual Convention of the American Psychological Association, San Francisco, Calif.