Theory
Theory is the backbone of all sciences, but many researchers feel that the field of human-computer interaction (HCI) lacks a theory. HCI designers are often skeptical of the contribution that theory makes to their creative work. Engineers can be impatient with the abstract nature and lack of specific guidance provided by theoretical accounts of HCI. In light of these critiques, many people assert that theory has no role in HCI. To answer this assertion, one needs to understand the nature of scientific theories and review the history of theoretical development in HCI before envisaging the prospects for a theory in HCI.
The Role of Theory in Science
Classically, the scientific method follows a cycle: people observe phenomena within nature and notice regularities, infer causal theories about those regularities, deduce hypotheses from the theories, and subject the hypotheses to experimental evaluation, allowing the theories to be falsified. The theories are then modified or replaced, and the scientific process continues toward ever more accurate and comprehensive theories of nature. Theories have three roles to play in the furtherance of knowledge: They explain what is already known; they predict what will happen in unknown situations; and they generate the discovery of novel phenomena. HCI is a particularly challenging domain for the development of scientific theory. It is an interdisciplinary domain, where the sciences of psychology and computer science meet in a context driven by continual advances in electronic engineering, chemistry, and physics. Commercial and technological changes mean that the scope of its phenomena is changing far faster than in any other scientific domain. We can hardly be surprised that theories have had difficulty fulfilling even their basic explanatory role, let alone predicting the unknown and generating the new. The sciences of psychology and computer science are also comparatively young and notably divided within themselves; thus, HCI faces the difficulties of communication and conceptual focus between different theorists within these two parent disciplines. The very difficulty of developing theory within a rapidly changing domain is an attractive feature for many researchers. As an applied field, HCI holds out the prospect of fruitfully applying theoretical predictions in novel technologies and making a real difference to society. Economically, an HCI technology developed with some theoretical basis is more likely to succeed than one driven by feasibility alone. 
Attractions also exist for theorists who wish to remain within the parent disciplines but who seek new phenomena with which to test and extend existing theories. Psychologists can explore perception, attention, memory, and learning within the context of entirely novel situations or virtual reconstructions of known situations, where many variables can be controlled with much greater precision than in real-world counterparts. Computer scientists can explore the behavior of complex systems when faced with the apparent unpredictability of human operators and organizations and can evaluate the robustness of architectures in the context of technological implementation. These opportunities may justify the use of HCI as a domain for scientific investigation, but they do not require a theory of HCI, and without a theory, HCI is not a scientific discipline. A theory consists of a definition of the phenomenon that it intends to explain and of the things in the world that are thought to be involved in causing the phenomenon or that are affected by it (the theoretical entities). Crucially, a theory also defines the relationships between entities, so that if the states of all of the
relevant entities are known for a particular moment, the theory allows predictions to be made about the future for these entities within a certain time frame. The scope of the phenomenon and of the entities involved marks out the domain of the theory: the range of things that it is intended to deal with. These entities can have distinct and measurable states, or values, that vary in time and so define the variables that can be measured and used in experiments. Theories are abstract rules that people infer from many particular instances or observations but that are thought to hold true in general for all instances. They can therefore be tested by deducing specific hypotheses about situations that have not yet been observed and by applying a general rule to predict what will happen. People can then set up a situation (by controlling the values or states of the variables that a theory defines as causative) and observe the outcome. Such tests of hypotheses are called “experiments,” and they provide empirical tests of the applicability of a theory. If the outcome differs from that predicted by the theory, then it has been falsified and needs to be modified. This result may require a minor alteration to some small part of a theory, or it may require a major alteration. The result may be catered for by a situation-specific additional rule or the incorporation of an additional variable. Over time the incremental addition of specific rules and variables may lead the theory as a whole to become internally contradictory or unable to make clear predictions for as-yet-unobserved situations. At some point the theory will be rejected, and, instead of being modified, it will be replaced by a new theory. The new theory might start with a completely different set of entities and variables to explain the same phenomena as the old theory but in a more economical manner. 
It might divide the phenomena in a different way, so that the observable events that are thought worth explaining are not the same as those explained by the old theory, even though the domain may be the same. If the outcome of an experiment is in line with the theoretical prediction, then the theory is supported but cannot logically be said to have been proven true, because other reasons may exist for why the results turned out as they did. The best that we can say is that the theory has received empirical support. Not all predictions provide equally strong empirical tests of a hypothesis: We need to weigh the likelihood of the predicted outcome against the likelihood of outcomes that would falsify the hypothesis. If a large range of possible outcomes is consistent with the hypothesis and only a small number could falsify it, then the prediction is not very useful, because its confirmation gives us little certainty about the particular outcome. The worst theories are those that account for all possible outcomes, because such theories cannot be falsified. Theories have to be potentially falsifiable to be accepted scientifically: An unfalsifiable theory may have great explanatory power, but it has no predictive power and cannot generate new discoveries. The longer a theory survives without substantial falsification, the more likely it is to be taken as true, especially if supporting evidence comes from quite different applications of the theory to apparently different phenomena. Converging evidence from different problems is the strongest support for a theory because it points to the theory's generality and hence its value in economically explaining the behavior of a large number of variables by a smaller number of entities and rules. The true value of scientific theory for society lies beyond the purely scientific desire to explain causal relationships in nature. 
Theories’ definitions of phenomena, and of entities, provide a conceptual framework that can direct investigations and hence generate the discovery of novel phenomena that also need explanation. Theories therefore continually widen the scope of things that need to be explained,
potentially leading to their own rejection. The discovery of new phenomena, the construction of new technologies, and the improvement of existing technologies follow from this generative power of theory.
The Development of Theory in HCI
Over time, developments in technology have meant that the interaction between human and computer has taken three forms. Originally, computers were large and expensive machines shared by expert users to perform well-defined tasks in corporate settings. These computers then developed into cheaper personal computers used by a single person to perform many varied tasks. They are now becoming fashionable consumer products, used by individuals to communicate in many settings, not least for entertainment and enjoyment rather than work (mobile phones, electronic organizers, and music players are all small, powerful computers with generic abilities but niche-specific hardware). HCI theory has had to keep pace with these changes. People could address the first two forms by considering the interaction between a single person and a single computer, often a single software program. Researchers dealt with ergonomic considerations of input and output modalities in isolation, without needing to consider the interaction itself. Psychologists theorized about the human side of the interaction, and computer scientists theorized about the computer side. Initial HCI theory was thus heavily influenced by the dominant theoretical positions within the parent disciplines and tended to consider the two sides of the interaction independently rather than the interaction itself. From psychology, information-processing models of cognition provided an attractive way to construct a model of what might be happening in a user's head during an interaction. Psychologists Stuart Card, Thomas Moran, and Allen Newell defined a "model human processor" that, in analogy with computing devices, perceived events on a computer screen, processed the information, formed goals, decided how to carry them out, and executed actions through a keyboard (and later through a mouse). 
Each cognitive operation required a specified time to execute, estimated from the psychological literature, and by specifying exactly what sequence of operations was required to perform a task, an analyst could work out how long the entire task would take (this results in what has become known as a "keystroke-level model," although what is actually being modeled are the internal mental operations). The model human processor gave rise to the GOMS approach to HCI, in which Goals are achieved through a sequence of Operators, which are collected into Methods, which are chosen through the use of Selection rules. GOMS models took a well-understood task to be performed by an expert and predicted how long the task should take. They were thus well suited to the quantitative evaluation of design alternatives typical during the first phase of HCI, when implementation was expensive and tasks well defined. As the second phase developed, the weaknesses of GOMS models became more apparent. They said little about how information was acquired or structured, and they said little about the interaction at a higher level than task execution. This type of theory tended to have high predictive power within its domain, but as the domain altered to include different types of users and less well understood tasks, its predictive power waned, and its lack of generative ability became more obvious. Psychologist Don Norman defined a more abstract theory for the second phase of HCI, in which an interaction was seen as a set of seven stages, organized into two gulfs that divided the users' minds from the world in which they were acting. The gulf of execution spanned the stages of goal formation, intention to act, action specification, and action execution and resulted in a person making some change to the state
of the world (that is, interacting with a device). The gulf of evaluation spanned the stages of perceiving the new state of the world (or device), interpreting the changes, and evaluating the outcome to compare it with the original goal. The cycle could then continue, with the modification of the original goal or the formation of new goals. This theory was more generative than the GOMS approach because the seven stages corresponded to design questions that researchers could use to guide design as well as to evaluate it. Norman listed these questions as asking how easily one could determine a device's function, identify the actions that are possible, infer the relationships between intention and action, perform the actions, identify system states, compare system states with intended states, and define the relationship between system state and interpretation. Inherent in Norman's theory was the idea of a user's mental model of a device: a recognition that people base their actions not on an immediate evaluation of the observable state of the world but rather on their inferences about the unobservable internal states of other entities, derived from previously observed aspects. This view holds people to be natural scientists who develop causal theories to simplify the complexity of their observable world and to help them predict what is about to happen and how they can behave to influence it. It is consistent with the mental models approach in cognitive psychology advocated by Philip Johnson-Laird, along with the literature on errors in human reasoning caused by phenomena such as confirmation bias, by which people fail to search for evidence that would falsify their models of the world and thus can persist with false models. A false model of the state of a computer system will, sooner or later, lead to an interaction error, the learning of an inefficient action execution (in GOMS terms), or the failure to discover system functions. 
The task of the designer is, in this view, to help the user to build an appropriate model by making the unobservable and observable aspects of the system correspond closely. Norman used this approach to advocate user-centered design, in which design is focused on the needs of users, the tasks that they are to perform, and their ability to understand the device and also seeks to involve them throughout the design process.
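The keystroke-level analysis described earlier reduces to simple arithmetic: sum the estimated times of the operators in a task's execution sequence. A minimal sketch follows, using the classic operator-time estimates published by Card, Moran, and Newell; the `predict_task_time` function and the example task sequence are hypothetical illustrations, not part of any original GOMS tooling.

```python
# Sketch of a keystroke-level model (KLM) time prediction.
# Operator times (in seconds) are the classic estimates from
# Card, Moran, and Newell; treat them as illustrative values.
OPERATOR_TIMES = {
    "K": 0.28,  # K: press a key or button (average typist)
    "P": 1.10,  # P: point with a mouse to a target on the screen
    "H": 0.40,  # H: home the hands between keyboard and mouse
    "M": 1.35,  # M: mental preparation before an action
}

def predict_task_time(operators):
    """Predicted execution time for a sequence of KLM operators."""
    return sum(OPERATOR_TIMES[op] for op in operators)

# Hypothetical task: prepare mentally, move a hand to the mouse,
# point at a menu item, prepare again, then type a three-letter command.
sequence = ["M", "H", "P", "M", "K", "K", "K"]
print(round(predict_task_time(sequence), 2))  # → 5.04
```

Comparing two candidate designs then amounts to comparing the predicted totals of their operator sequences, which is how GOMS models supported the evaluation of design alternatives before implementation.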
Newer Developments
By the time that these theories had matured and had influenced HCI, the pace of change was already moving the field away from the dyad (pairing) of one user and one computer to more socially interactive situations in which technological devices supported communication between humans in a variety of roles, typified by the field of computer-supported cooperative work (CSCW). The models borrowed from cognitive psychology had little to say about such use. Indeed, within psychology itself a divide exists between those who study the workings of an individual mind and those who study the social interactions of individuals. During the 1990s HCI researchers turned to social psychology, and to the social sciences in general, to find conceptual methods to suit these wider contexts of interaction. One approach that focused on the contextualized nature of HCI drew from the developmental psychology of Lev Vygotsky. He had proposed that, instead of occurring through a series of maturational stages, competencies develop independently in different domains, with little transfer of skills between domains. Children can be at different stages of development in each domain, and what is critical is their zone of proximal development: the difference between what they are already able to do and what they would be able to learn to do if presented with the challenge. Much of device use is discovering what the device can do, and activity theory applies Vygotsky's ideas about development to knowledge acquisition in general. Scandinavian and eastern European researchers, who had (for political and cultural reasons) a traditional focus upon group and work psychology, first applied activity theory to HCI. Psychologist Susanne Bødker
shaped the approach to focus on the computing device as mediating human activity, where activity can be construed as the development of expertise or knowledge in specific contextual domains. Sociologist Lucy Suchman compared HCI with other forms of situated action, in which people are able to apply sophisticated, situation-specific skills and knowledge without needing a mental model or any naïve theory to drive their planning. Suchman proposed that much behavior is fitted to the immediate demands of a situation rather than being shaped by a higher goal. Behavior that appears to be rationally based and coherently directed to a particular goal is actually determined by local factors and in particular by the social and cultural context of the moment at which decisions need to be made. Although plans do exist and can be used to guide one's behavior, they are just one resource among many other sources of information about the most appropriate action to take next. Suchman's work was grounded in a sociological technique called "ethnomethodology," which seeks to understand how people make sense of their world. Ethnomethodology is similar to the mental models approach in seeing the person as a theorist, but it takes the social interaction between individuals as its focus rather than causation in general. Applied to HCI, especially by psychologist Paul Dourish and sociologist Graham Button, ethnomethodology addresses the nature of work and the communication between the workers who are to use a technological system. Designing the system becomes not so much an analysis of the functionality required and how best to provide it as an analysis of the flow of information between workers (or, in general, people) and how to facilitate it.
Prospects for Theory in HCI
Although the more socially oriented theories of HCI have explanatory power, especially in the hands of their proponents, they have not yet proven themselves in terms of predictive or generative power. It is too early to decide whether any of these approaches will survive in their current form or whether they will need to be further adapted to the peculiar demands of HCI, with its rapid progression of technological context. HCI certainly has changed decisively from the simpler information-processing models of the 1980s, and phenomena of social interaction and communication are now key parts of the domain. We may have to concede that HCI is not a single discipline amenable to a single body of theory and that it will, like other applied sciences, continue to adapt and borrow theories from its parent disciplines, modifying them and applying them even after psychology, sociology, and computer science have moved on to other approaches. This pattern of one-way communication between basic science and applied science would weaken the claim that HCI is a valuable domain for basic scientists to explore and evaluate their theories. Psychologist Philip Barnard and colleagues have proposed that what HCI needs is not a single theory, because the domain includes phenomena that are being described at several levels of systemic complexity. Just as psychology has its own theories for dealing with different levels of analysis of human behavior, ranging from neuropsychology through cognitive psychology to social and organizational psychology, so HCI needs theories that deal with the traditional dyadic interactions of users and their devices, with people communicating through devices, and with communities interacting as groups. Each level of the system requires its own type of theory, but a new form of theory is also needed to map between the concepts at the different levels of theorizing. 
Plenty of competing theories exist for the different levels, but little communication exists between them. In recognizing the need to incorporate social psychological and sociological theorizing into HCI, we would be mistaken to discard the existing body of cognitive and system theory simply because it does not address the new levels of interest. The task for future HCI researchers is to find ways of communicating between the phenomena explained, predicted, and generated by each level of theory. If HCI could succeed in this, it would acquire a conceptual unity as a science and would also make a major contribution to science.
Jon May
See also Computer-Supported Cooperative Work; Psychology and HCI; Social Psychology and HCI
Further Reading
Barnard, P. J., May, J., Duke, D., & Duce, D. (2000). Systems, interactions and macrotheory. ACM Transactions on Computer-Human Interaction, 7, 222–262.
Bødker, S. (1991). Through the interface: A human activity approach to user interface design. Hillsdale, NJ: Lawrence Erlbaum Associates.
Card, S. K., Moran, T. P., & Newell, A. (1983). The psychology of human-computer interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.
Dourish, P., & Button, G. (1998). On "technomethodology": Foundational relationships between ethnomethodology and system design. Human-Computer Interaction, 13(4), 395–432.
Norman, D. A. (1988). The design of everyday things. Boston: MIT Press.
Suchman, L. A. (1987). Plans and situated actions: The problem of human-machine communication. New York: Cambridge University Press.