
THE AFFORDANCES OF MOBILE APPLICATIONS

Sanna Raudaskoski
Researcher, University of Tampere, Finland
Tel. +358 3 2157185 / Fax: +358 3 2156080 / Email: [email protected]

Abstract

This paper concerns both theoretical and practical dimensions of the concept ‘affordance’. It deals with the possibilities and constraints of the everyday usage of mobile devices. Artefacts have properties that afford various action potentials in use. Following James J. Gibson’s theory of ecological psychology, these potentials can be called affordances. Affordances are not primarily matters of perception, but of action and interaction. We can achieve a better knowledge of the interactional affordances by studying user-device systems in real life. My study of interaction belongs to the social scientific tradition. My aim is to show that conversation analysis (CA) is a good method for making affordances visible. CA, based on the analysis of everyday communication, gives methodological tools to analyse the sequential organisation of action. Affordances, for their part, give a chance to connect the analysis of this moment-by-moment proceeding activity to its material and social circumstances. The combination of affordance theory and conversation analytical methods is a new way of studying user-device interaction. Together with academic knowledge, the analytical observations that arise from this kind of research can be seen as a resource for a better design of the artefacts involved.

1. Introduction

Video recordings of the actual use of mobile devices make up the empirical data of my research. I have twelve video recordings of the actual use of various mobile devices.[1] Preliminary findings show that in practical activities we can perceive affordances that the device does not have in its design, or we might not notice potential affordances it has. The analysis also reveals our assumptions about the properties of the application, e.g. what the ‘necessary’ affordances of the device are. At the moment I am collecting more empirical data from everyday use of mobile devices.

In the following, I first present the basis of Gibson’s ecological psychology and the idea of affordances. After that, I will look at Donald Norman’s viewpoint on affordances, especially in human-computer interaction (HCI) studies. I will also compare Gibson’s and Norman’s theories. Then I consider under what conditions it is possible to study the affordances of mobile devices. After that I present conversation analysis as a method for my studies. Then I will talk about usability. I suggest that the study of affordances and the study of usability are related to each other. Usability is a complex matter and the different levels of usability can be seen in relation to different kinds of affordances. I will finally illustrate this with an example of the conversation analytical way of studying affordances.

[1] This data comes from the research project carried out by Professor Ilkka Arminen and myself (Raudaskoski and Arminen 2003) during the spring and summer of 2002. The aim of that project was to evaluate experts’ opinions of current and future mobile media services. Eighteen people were interviewed, and twelve video recordings were made of the actual use of various mobile devices (Nokia communicator, GPRS-phone, and pocket pc).

2. Gibson and the theory of ecological psychology

The concept ‘affordance’ was developed by James Jerome Gibson, and it is a significant part of his theory of ecological psychology. Ecological psychology concerns mostly visual perception. For Gibson, perception is a process which unites a human being and her/his environment. The observer and the environment are complementary. The verb ‘afford’ can be found in the dictionary, but not the noun ‘affordance’. Gibson made it up. By it he meant something that refers both to the environment and the human being (animal); it implies the complementarity of the human and the environment.

In Gibson’s theory, all parts of the environment afford some kind of behaviour, e.g. holding, sitting, eating and so on. We perceive these possible functions directly. When we look at objects, we perceive their affordances, not qualities. In a scientific experiment, we may discriminate the dimensions of different qualities if required, but what we normally pay attention to is what the object affords us (Gibson 1979, 134).

Affordances are relative to the behaviour of the animal being considered. They are unique to the animal and they also have to be measured relative to that animal. Gibson says that affordances cannot be measured in the same way as we measure in physics (127-128). The physical qualities of an object may stay the same, but there can be different affordances in relation to different observers.

The central element in the ecological theory of perception is ‘information pick-up’. Gibson asserts that the perception of the environment is direct, not mediated by retinal, neural or mental pictures. The perceiving of an affordance is not a process of perceiving a value-free physical object, to which meaning is somehow added, but is a process of perceiving a value-rich ecological object. Any substance, any surface, any layout has some affordances to benefit or injure someone.
Gibson says that physics may be value-free, but ecology is not (140). Direct perception is a process of information pickup that involves the activity of looking around, getting around, and looking at things. Perception is not a reaction to the stimulus, but is an active and ongoing process, where the perceiver – without mental information processes – takes direct advantage of the information that the environment offers. The perceiver does not deal with representations or sensations, but is, through behaviour, in direct connection with the environment. For instance, light does not cause visual perception. Light does not function as a stimulus to the eyes, because one cannot see light as energy. Things can be seen with the help of light: it reflects from the surfaces around us, separates information, and visual perception is based on picking up information (the invariants that the light reveals), not on mental responses to the stimuli from the eye.

Information about the environment accompanies information about the self and these two are inseparable. As one perceives the environment, one simultaneously perceives oneself. This is inconsistent with dualism of any kind, either mind-matter dualism or mind-body dualism. The awareness of the world and one’s relations to the world are not separable (Gibson 1979, 126, 141). The separation between the human being and the environment is artificial, because both belong to the same acting system.

The Finnish psychologist Timo Järvilehto (1998) has developed a theory of the organism-environment system, which has close connections to Gibson’s ecological psychology. The theory of the organism-environment system starts with the proposition that in any functional sense the human (or any other organism) and the environment are inseparable and form one unitary system. The organism cannot exist without the environment and the environment has descriptive properties only if it is connected to the organism. According to this theory, mental activity is an activity of the whole organism-environment system and traditional psychological concepts describe only different aspects of the organisation of this system. All phenomena get their properties inside the organism-environment organisation.

As Gibson defines it, all parts of the environment are relational to the observer: edibility is relational to the mouth, teeth and digestive organs; manageability is relational to the hands; movability is relational to the feet, and so on (Gibson 1979). Gibson says that an important fact about affordances is that in a sense they are neither objective nor subjective properties; or they are both. An affordance cuts across the dichotomy of subjective-objective. It is equally a fact of the environment and a fact of behaviour. An affordance points both to the observer and the environment. It is both physical and psychical, yet neither. Even though affordances are relational, they are not imagined. They are there, but cannot be measured as we measure physical qualities (129). According to Gibson, the affordance of something in itself does not change when the observer changes.
We may or may not perceive the affordance, according to our needs, but the affordance is always there to be perceived (138-139). However, we do not know these affordances before they are revealed inside a certain observer-environment system. Although affordances are in a way constituted by objective features (invariants) of the environment, these features only become affordances when some organisms relate to them in their activity. Affordances are therefore features of activity systems that include both the physical environment and the organism (Järvilehto 1998, Bærentsen & Trettvik 2002).

Let us take the example of a handheld mobile device. One critical feature of this device is that one can easily carry it. Carry-with-ability is, however, a relational quality. When experts in the mobile sphere talked about their own mobile device usage, it turned out that the same device would be carried differently in a different situation (Raudaskoski & Arminen 2003). A communicator which fits into one person’s pocket does not fit into another’s. For instance, women often need a handbag to carry a communicator. A PDA device which fits into the pocket of a winter jacket, by contrast, is hard to take along in summer, even for men. Thus, these devices afford a carry-with-able relation to the carrier. Or rather, in this case, a relation to the clothes of the carrier. Still, this example shows that focusing only on the device and its physical qualities will not on its own tell us all about the usable features of the device.

3. Norman and affordances

In his book The psychology of everyday things, Donald A. Norman (1988) brought the concept of ‘affordance’ to the field of artefact design. Norman’s impact on the design of information technology has shaped the way affordances are understood inside the human-computer interaction (HCI) community. Norman says that the term ‘affordance’ “refers to the perceived and actual properties of the thing, primarily those fundamental properties that determine just how the thing could possibly be used… When affordances are taken advantage of, the user knows what to do just by looking: no picture, label, or instruction is required.” (Norman 1988, 9)

Norman employs the term affordance, but abandons Gibson’s ecological psychological framework, inside which the term was originally developed. Norman distinguishes between real and perceived affordances, and says that in the design of objects, real affordances are not nearly so important as perceived ones. Perceived affordances tell the user what actions can be performed on an object (1999b, 123). According to Norman (1988, 55), in everyday situations behaviour is determined by a combination of internal knowledge and external information and constraints. He makes a distinction between conventions – or cultural constraints – and affordances. He sees affordances as natural constraints and potentials, and these belong to the sphere of external knowledge. Cultural constraints, instead, are something to be learnt and are in this sense matters of memory. Thus, for him, only those things that can be perceived without learning processes are affordances. He talks about two psychological spheres: the psychology of everyday things (which includes affordances) and the psychology of cognitive processes.

Norman does not approve of Gibson’s idea of direct perception. He considers environmental – physical – affordances as being apart from the mental representations people have of their environment. There is a duality of the worlds inside and outside the head. This leads to a misreading of Gibson’s idea of affordances. For Gibson, an affordance is something that holds the observer and environment together. Affordances are neither subjective nor objective.
From his point of view, there actually is only one acting system and the perceptions are collaborative ends of the action in the different parts of this system. Although Norman later speaks about affordances as some kinds of relationships that hold between the object and the organism that is acting on an object (Norman 1999b, 123), he still uses the term ‘affordance’ together with cognitive psychological assumptions. Affordances “reflect possible relationships among actors and objects: they are properties of the world. Conventions, conversely, are arbitrary, artificial, and learned” (1999a, 42). Cultural constraints and conventions develop, but affordances do not. The socio-cultural world is placed outside the domain of affordances. From Gibson’s viewpoint, this kind of thinking is absurd: he created the term ‘affordance’ precisely to articulate the complementarity of the observer and the environment.

However, a problem with Gibson’s theory is that he mainly speaks about the natural environment when talking about affordances. Thus, he leaves room for an interpretation in which affordances concern only the natural environment and direct perception is only about non-conscious information pick-up. What follows from this kind of interpretation is the idea that the basis of perception is different when perceiving the physical environment from when observing the socio-cultural environment (and that these two can actually be separated), and, further, that learnt skills play no part in the process of perception. Yet, Gibson says that it is a mistake to separate the natural environment from the artificial environment, as if there were two environments. It is also, according to him, a mistake to separate the cultural environment from the natural environment, as if there were a world of mental products distinct from the world of material products (1979, 130). In the process of perception, these two worlds are together.
From the ecological viewpoint, affordances and the socially based abilities belong to the same perceptive system.

After all, learning to perceive the affordances of cultural-historical products is a process that proceeds in principle in the same way as learning to perceive natural affordances, except in some details: the features of artefacts are specifically designed with specific affordances in mind. The perception takes place in social settings and culturally-historically modified environments and includes man-made objects, including symbols. Human beings have the ability to adapt, not only to the natural environment, but also to the culturally changing environment (see Bærentsen & Trettvik 2002).

4. Culturally provided affordances and mobile devices

Mobile devices are anything but parts of the natural environment. They are great examples of complex cultural artefacts through which different kinds of social practices are connected: communication, device and application design, maintenance of mobile networks, etc. Despite this complexity, for users they afford actions.

What, then, are the affordances of mobile devices? As mentioned, affordances are relations between the observer and the environment, and are based on information pick-up. So, for instance, there is something in the situation that affords ‘grasping’, but this something need not be in a material form. Not only can we pick up information from our physical environment, we can pick up information that is based on the social relations of people, for example, from regularities of language use. Affordances that relate to spoken language, texts, drawings etc. are not – from my point of view – more unreal than so-called physical affordances. They are a characteristic part of the human environment. The complex environment of things, artefacts and other persons, located in space and time, is actually the resource that makes actions possible and produces the sense of actions (see Suchman 1987).
A human being and her/his environment form an operative system, and the function of this system means that all senses work together to actively orient to the affordances of the environment. For me to say that there is no mental information processing does not mean that there is no thinking. Thinking belongs to our operative organism; it is our socially based sense and it works together with other senses. At the conceptual level we can distinguish biological-evolutionary and socio-cultural competence to seek information, but in fact those dimensions work together when people act. Things can be, for example, sit-on-able, they can be drinkable and they can be thinkable!

As human beings, we have an ability to recognize not only our own but also other people’s relations to the environment. Human action is the process of intertwining of the body and the environment in co-operation with (possible) other people. Hence, what we concretely and consciously perceive is always related to the possibility of co-operation. All conscious things are common, shared with other people (Järvilehto 2000). That is why we have common affordances. Furthermore, we have the ability to communicate, and all affordances – in principle – can be communicative. They can be discussed with others, although many actions are taken non-consciously without any awareness of the affordances. Information pick-up is not passive processing of environmental information, but active perceiving of the world-for-the-individual-in-a-particular-context.

Gibson was primarily concerned with individual understanding. He asked what it is about things in the perceptible world that enables humans to see them. Although an individual has a unique relationship to the environment, the structure of the environment is shared with other people. In that sense, the possibility of perception can be said to be culturally provided. Seeing is an intersubjective matter. Knowing how to look is like knowing how to speak – knowing the language games, the culture, the practices embedded in any environment (Anderson & Sharrock 1993).

Mobile devices are material artefacts which afford actions. They are consequences of sociocultural practices and it may be difficult to see them as a suitable research topic relating to affordances. However, in HCI studies, Norman’s ideas about affordances have been largely accepted and studied. Although I appreciate Norman’s way of critically examining the design of information technologies, my position is somewhat distinct from his. I do not want to separate the actor and the environment. Affordances are not qualities of objects in isolation. Artefacts, technologies and their users are interdependent. The best way to understand the affordances is to look at the co-existence of the mobile device and its user in the process of situated activity.

Mauri Ylä-Kotola & Mehdi Arai (2000) propose that the theory of the organism-environment system forms a better starting point for the design and study of technological systems than cognitive psychology does. The cognitive assumption of steady mental models with input and output processes makes an artificial separation between the human being and the technical device. For example, a computer user forms a functional system together with the machine. This human-device system changes along with the changes in the purpose of the action. An important part of the system can in the next moment be unimportant. During the ongoing action, there are many possible connecting points in the user interface: the user can interact with a keyboard, a mouse, a screen or the chair s/he is sitting on, etc. A person learns new things all the time when interacting with the computer. Learning makes it possible to perceive new affordances, which again makes new actions possible.
Hence, the designers of technical devices should be interested in the interaction that the operating system affords, rather than consider how these systems fit the cognitive models inside our heads. The borderline between the human and the machine is not clear. The connection is contingent and changing all the time. The user interface will be organised according to the action at a specific moment. Information does not move between the user and the device, but knowledge emerges within the computer while the system is functioning.

5. Conversation analysis and the theory of affordances

The basic research goal for studying the affordances of mobile devices is to explicate the relationship between the structures of actions and the resources and constraints afforded by physical and social circumstances. Particularly when studying new media technologies, one possible way of making affordances visible is to use conversation analysis as a tool for analysing data. Conversation analysis can be used as a method to analyse not only common conversations, but also situations in which the other party to the interaction is linguistically or visually represented on the screen of a device (e.g. Raudaskoski 1999) or in which there is some kind of sound for the user to respond to.

In conversation analysis, the focus is on the description and explication of the competences that ordinary speakers use and rely on in participating in conversation. At its most basic, CA is the study of talk-in-interaction, the systematic analysis of the kinds of talk produced in the everyday situations of social interaction. An assumption throughout CA is that human activities are accomplished as the accountable products of common sets of procedures. In CA, the primary units of analysis are sequences. This is based on the recognition that the production of current conversational actions proposes a here-and-now definition of the situation, to which subsequent talk will be oriented (Atkinson & Heritage 1984).

CA reveals the regular features of actions done in conversations. There are certain systematic rules of interaction. In other words, looking at the sequence organisation reveals what kinds of possibilities for actions (i.e. affordances) there are for participants. Participants pick up information; there are invariants which may not be consciously noticed, but which – by analysing conversations in detail – may become observable. For instance, a question can be an affordance which affords first and foremost answering. Not answering, too, gets its meaning in relation to the common understanding that one should answer a question; it is an act done with respect to this understanding. Which one of these actions (answering, not answering) is realised directs the ongoing activity.

In CA, the main focus is not on the words and sentences as such, but on the actions done by those linguistic tools. We could, of course, think that words especially are affordances to us. They are something concrete, they are ‘material’ properties in our environment. Pronouncing words causes sound waves, which our ears catch. However, pauses, which are silence, do not cause any stimulus to our senses. And yet, in conversation pauses especially offer places for action. This is understandable under the theory of affordances. Because perception does not rest on the reactions that stimuli cause, but is a process of an active perceiver, there is a possibility to perceive the ‘meaningful something’ without any ‘real’ stimulus. The observer is oriented to some (co)behavioural end and the pauses afford actions towards that end.

When people talk, they usually have a shared understanding of what is going on, and this awareness directs the action.
If there are misunderstandings, they are typically corrected during the conversation. However, interacting with a technical device is largely different from ordinary conversation. The user and the device do not necessarily have a shared understanding of ‘what is going on’. Nevertheless, using a device is also interaction, which proceeds in turns. The device reacts somehow to the user’s actions. Even no reaction is a response. The information from the device directs the action, as do other participants’ actions during a conversation.

Looking at the sequence organisation of user-device interaction enables us to interpret what in the device (or in an application) works as an affordance in a certain context. It is possible to do this interpretation by studying the user-device interaction in detail. How does a user act? What seem to be the user’s public interpretations of the affordances, i.e. what actions are taken as ‘responses’ to what is going on in the device?

6. Usability is a multi-level matter

The usability of artefacts is often confused with their functionality. However, functionality refers to what the artefacts can do, whereas usability refers to how people work with the product. Correct functionality is important, but it does not guarantee the success of the product. A product by itself has no value; it has value only insofar as it is used. There can be high functionality, but low usability. An often mentioned example is the VCR. Users often find it hard to program their VCRs to record a certain TV programme. The VCR may have high functionality, i.e. it works as it is designed to work, but it has low usability, i.e. people cannot use it as quickly and easily as they would like (Dumas & Redish 1999, 4-5). This is an example of how the ‘designed affordances’ may not be the same as those which will be perceived during the use of products.

Usability is usually described to mean that people who use the product can do so quickly and easily to accomplish their own tasks. People are seen to consider usability in terms of the time it takes to do what they want, the number of steps they go through, and the success they have in predicting the right action to take (cf. Dumas & Redish 1999, 5).

Let us think about the Short Message Service (SMS) of mobile phones. People often wonder why SMSs are so popular. Text entry using telephone keypads is not quick or easy, at least not for beginners, and one may need to go through several steps before sending the message. The usability of SMS is, however, more than the manageability of a mobile phone. The strength of the service is that it allows one to engage in social communication, e.g. to communicate with an absent other even in circumstances where talk is not allowed.

The most important aspect of usability is that products are used to achieve certain performance goals. People’s actions are always directed towards (co)behavioural ends, even if those ends are not conscious. One could say that basically usability is about affordances: what affordances, i.e. what possibilities for action, does the ‘goal-oriented’ user perceive when interacting with the product? However, usability is not only a matter of physical affordances. To make the analysis of affordances and usability more precise, it may be helpful to make a distinction between the three conceptual dimensions of affordances. (See Table 1)

Table 1: The conceptual levels of affordances (Arminen & Raudaskoski, submitted for publication)

MANAGEABILITY
What do physical forms afford? Often operational, nonconscious actions. Limited number of learnt features. → Problems in use make affordances observable (communicable).

COMPREHENSIBILITY
How can affordances be recognized? The relation between the system image and the user’s conceptual model in situated action. The intuitive understanding of the system image may vary. First learnt features, later become practical → Problems in use make observable.

ALLOW-ABILITY
What do artefacts allow to do? Possible action results → Can be abstracted to affordances and types of affordances. Vary from physical to social, from material to mental.

When acting, the user engages in her environment in many ways. On the one hand, the user has a physical, haptic relationship to the environment. This is the level of manageability. There are operational actions that are basically physical and the user has a practical and non-discursive relation to them. For instance, it is hard to tell others what actually happens when one drives a car or rides a bicycle, even if one does it daily.

On the other hand, there is a level of comprehensibility. Comprehensibility refers to the fact that artefacts should be self-explanatory. This means that their operation should be discoverable without extensive training, from information provided on or through the artefacts themselves. In this view, the degree to which an artefact is self-explanatory is just the extent to which someone utilizing the artefact is able to reconstruct the system image regarding its use (cf. Suchman 1987, 17).

The concepts of ‘system image’ and ‘user’s conceptual model’, which here belong to the level of comprehensibility, are borrowed from Donald A. Norman (1988). The user’s conceptual model is what the user develops to explain the operation of the system. The user and the designer communicate only through the system itself: its physical appearance, its operations, the way it responds etc. The system image is critical because the user acquires knowledge of the system from that image. One can figure out how a device works if the operating parts are visible and the implications are intuitive, i.e. if there is effective use of affordances and constraints. The structure of the device, its system image, must include a visible relationship between the actions and the results.

Thirdly, there is the level of allow-ability. The affordances of artefacts allow different kinds of acts. For instance, altered telephone technologies and services afford communication in different environments, and with the help of these technologies purposeful social actions can be taken.

Let us return to the example of Short Message Services. The poor manageability of mobile phones’ keypads does not mean that SMSs have bad usability in general. SMS has good comprehensibility, i.e. its system image is intuitive. Its logical structure is simple to understand: you just write a message and send it. The special advantage of SMSs, however, is in their allow-ability. SMSs can be used to conduct almost the same communicative functions as phone calls. SMSs allow both sending and receiving messages in circumstances where talking aloud is not allowed or socially acceptable. When examined at the different conceptual levels of affordances (manageability, comprehensibility, allow-ability), the popularity of SMSs, despite their poor manageability, is no longer a mystery (Arminen & Raudaskoski, submitted for publication).
In real action all these levels of human activity work inseparably. Together with the non-discursive practical operations, there are actions that require discursive consciousness, the understanding of common concepts. Discursive consciousness is based on cultural conventions which make affordances thinkable and communicable. At the levels of manageability, comprehensibility and allow-ability, it is possible to study different kinds of affordances without leaving some parts of human conduct out of examination.

When speaking about conceptual models and system images, I do not mean that the mental and observable processes must be researched differently. I argue that the most effective way of studying both intentions and actions is to explore the real activity in detail. The break between the user’s conceptual model and the system image can be seen by looking at the situated user-device interaction. The user’s conceptual model is not something constructed in advance, but is a matter of situated interaction between the user and the device. There is no need to go ‘inside the user’s head’ to see what the user’s intentions are or how she interprets the state of the world. Instead, the way that users work with the product is a basic issue for the study of affordances. Therefore, intentions in this context do not refer to thoughts one might have when acting, but are matters of taking situated actions. In activity, intention is a perceivable effort towards some behavioural result. It can be seen in the sequential progress of human-device interaction, especially when there are breaks, or ‘problems’, in that ongoing interaction.

As Lucy Suchman points out in her study on interaction between novice users and a photocopier (1987), there is always an asymmetry between the situational resources of the user and the machine. The machine behaves on the basis of resources provided by ‘its’ situation, the user in accord with the resources of hers.
The situation of the user comprises preconceptions about the nature of the machine and the operations required to use it, combined with moment-by-moment interpretations of the evidence found in the actual use of it. It is in this interactional situation that, from my viewpoint, the user’s model is established. The situation of the machine, in contrast, is constituted by a plan for the use of the machine. It is written by the designer and implemented as a program that determines the behaviour of the machine (i.e. the system image). The intersection of the situations of the user and the machine is the locus both for successful exploitation of mutually available resources, and for problems of understanding that arise out of the disparity of the respective situations (see Suchman 1987, 118-119).

7. An example of studying the interaction between a user and the mobile application

Let us now examine how the methodological principles of conversation analysis can be used to study mobile application usage. The following example comes from a situation in which an experienced user is asked to show, using his Nokia 8310 mobile phone, what kinds of WAP services he regularly utilizes. In this example, it can be seen that the user’s preconception about the nature of the service, i.e. the user’s conceptual model, is combined with moment-by-moment interpretations of evidence found in the actual use of it. The system image, too, is something which exposes itself through a sequential procedure. Because the person in this excerpt is an experienced user of WAP, there are no problems in the manageability of the mobile phone. Instead, as the excerpt will show, he has some difficulties in understanding the service.

The situation has been video recorded by a researcher who is also interacting with the user. Before this situation, the user has already shown some features of the service to the accompanying researcher. It is a bus timetable service, which tells him the estimated times at which the next buses (on several routes) pass the bus stop near him. For the service, all the stops of the city have been given an identity number which one must know when seeking stop-specific information. The user has contacted the service by using a bookmark that leads him directly to the information about his nearest bus stop. Thus, he did not have to remember the four-digit identifier of the stop. After doing this, he starts to show that instead of looking for bus stop-specific information, one can actually look at all the timetables of a particular route. In this extract, U is the user of the device and C is the person behind the camera. Of course, this kind of situation does not correspond to actual usage because the action happens at the request of the researcher, and there is a discussion between her and the user.
The ‘extra’ person affects the organisation of the user-device interaction. The benefit of this kind of situation is, however, that this kind of ‘thinking aloud’ helps the analysis. It is easier to examine the user’s interpretations about the construct of the service. The speech is transcribed according to standard CA conventions (see Appendix). In addition, the transcription includes the following ways of marking the interaction between the user and the device:

< >  Performing an activity with the device
{ }  Menu (or page or state) that is opened through the activity
(( ))  Comments about ongoing activity

This transcription is a translation from Finnish. Some explications: ‘TKL’ is an abbreviation for the local public transportation firm (Tampere City Transport). ‘Hervanta’ is the name of a district of the town. ‘Kalevantie’ and ‘Messukyläntie’ are the names of the streets on which the stops in question are located.

EXTRACT 1 (Mobile services, bus 23)

1 {TKL-WAP SERVICE/ Start the search: by route number / by stop number}

[ ] Eleven lines omitted 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38

U: C: U: U:

U:

→

U: U: C:

→ U: C: U:

→

C: U: C: U: C: U:

Here: is begin the search by route number by stop number by stop address (1.0) So if [I now want some rou[te number [Yeah:, [ {SELECTIONS} [(0.6) there say (1.0) when do[es number twenty three run [ [{ROUTE NUMBER} from here, (1.0) [I put there that f[ormat [twenty three was [ {SELECTIONS} [ {ROUTE NUMBER} [((TYPES 23)) the route n[umber (1.4) #and well: [let’s fi[nd# (1.5) so this gives [ {TKL-WAP} [ {SELECTIONS} [ a[pparently quite (0.8) well Hervanta Central Square or Central [{TKL-DIRECTIONS: Hervanta-Central Square/Central Square-Hervanta} Square Hervanta so Hervan[ta Cent[ral Squa[re, [Mmm::. [ {SELECTIONS} [ (4.2) {TKL-STOPS: Line 23 page 1/3 / 4007 Messukyläntie} [((RIFFLES THE STOPS)) [And then it well: (0.7) uh it offers different stops. Yes. And then well,=>if I choose from there when it is< at that s[to[p, [ {SELECTIONS} [ (0.6) °Mm° (3.0) {TKL-PAGE 1/5: stop 4001, Kalevantie /line=time} {23=11.18/23=11.38/23=11.58} ((RIFFLES THE MENU)) [So well, (0.6) so it gives n- you see the next buses that g[o th- the next twenty threes £that go by that stop£. (.) [M-ye::. Okay (.) it does it this way. .hhh Mm. Well this is hhhh well (0.7) quite (0.2) an .

The user is oriented to finding the timetable information for the route of bus number 23, and it seems to me that he expects it to be in a form similar to a printed timetable. Firstly, this supposition is backed up by the structure of the menu which is present in the first line. It gives the impression that one can seek information separately about bus routes and bus stops. The same display “{TKL-WAP SERVICE/ Start the search: by route number / by stop number}” is present throughout the omitted lines, until line 2 when the user begins to riffle the menu, reading it aloud at the same time (lines 2 and 3), at which point the third possibility, searching by the address of the bus stop, comes forth. The user chooses the bus route search (lines 3 to 7), seeking information on “when does number twenty-three run from here” (lines 6 and 8). His use of the expression “from here” can be interpreted in several ways. First, it can be read to mean that the user assumes the service to be place sensitive, i.e. that the service would give information in relation to the given place.2 However, I myself interpret the utterance “from here” primarily as expressing the user’s aim to search for paper timetable type information. I happen to know that the video recording was done in Hervanta, where the terminal point of route 23 (between Hervanta and Central Square) lies. So, the utterance “from here” indicates Hervanta.3 This conclusion is also backed up by lines 15 to 20, where the user must choose the direction of the bus. In particular, in the utterance “so Hervanta Central Square” in line 17, the particle “so” presents the choice of direction as self-evident: the direction is “from here”, from Hervanta to Central Square.

In any case, the user assumes that the route-specific search will give him a different kind of information than the stop-specific search. For instance, the user does not say ‘when does number 23 run from the nearest bus stop’. In that case he would probably want the stop-specific information.4 In lines 12 and 15 the user starts to define what kind of service this probably is by saying “so this gives apparently quite”. However, in line 16 the device asks him to choose the direction, and this demand interrupts the user’s talk. He was perhaps going to say something other than ‘direction-specific information’.
In any case, the device’s demand to choose a direction does not halt the user’s action for long: it is not in conflict with his presupposition that the service will give information regarding the whole route of bus number twenty-three. Nevertheless, in line 21 the device starts to give stop-specific information. The user understands this only after riffling the menu, when he sees that the device offers just different stops and that he must choose a stop to be able to go on searching for information. In line 23 the utterance “uh it offers different stops” reveals, first of all, that this is new knowledge to him. In addition, the expression “uh” can be interpreted as expressing disappointment that the search started by bus route also ends up offering bus stop-specific information. This is not what he expected. However, the user continues to demonstrate the service to the person behind the camera and says “when it is at that stop”, choosing a bus stop (line 27) that is entirely irrelevant to the situation at hand. This stop is not in Hervanta but on the other side of the city. So, bus number 23 no longer runs “from here”, from Hervanta. The user has given up the original purpose of finding printed timetable type information and just continues to demonstrate the service to the camera person. In line 31 the information about the stop in question opens up, and when he starts to riffle the menu in line 32, he notices that the service gives him the next number 23 buses that pass stop number 4001. So, he does not get all the timetable information about route 23, not even in relation to that particular stop. “Okay (.) it does it this way” (line 36) is said in such a way that it marks his understanding of the structure of the service as new information to him.

2 This is not a satellite based location service, and an experienced user surely knows that.
3 Of course, the same expression could be used if one assumes to get place specific information.
4 In Finnish there is a difference between the expressions ‘täältä’ and ‘tästä’, which, however, are both translated ‘from here’. In this case the original word was ‘täältä’ and it refers to the larger area than only to a single bus stop. However, the expression ‘täältä’ could also fit to the situation, where one is near the terminal point.

After the device’s announcement in line 21 the user understands that he cannot get general information about bus line 23, only stop-specific information. But it is not until after the device’s turn in line 31 that the user realises that he cannot even get all the information about route 23 regarding that precise stop: he can only see the times of those number 23 buses that pass the stop after the moment of the search. As a matter of fact, through the service one can only get stop-specific information, regardless of whether one seeks it by the stop number, by the route number or by the address of the stop. In line 38 the user is anything but convincing when saying it is an “okay service”. The choice of words, the hesitation, the pauses and the prosody of the last comment all give the impression that he expected the structure of the service to be different from what it turned out to be in use. By studying this situated action in detail, we can come to know what the user’s intent and his understanding of the function of the service were, and see how, during the activity, he has to correct his presuppositions about the service step by step. In the extract, the device in a way ‘corrects’ the misunderstandings of the user. In the transcript, I have marked with an arrow certain key points: to be able to go on with the service, the user must act according to the information and action potentials (i.e. affordances) that the display of the device offers. In lines 16, 21 and 31 the device restricts the possible actions and thus directs the ongoing activity in spite of the suppositions of the user. These seem to be the points where the perceived affordance deviates from the assumed one. Throughout the interaction, the user must update his ‘conceptual model’ against the ‘system image’ to be able to go on using the service.
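As an illustration only, the structure of the service described above can be sketched in a few lines of Python. This is not the implementation of the TKL service: the function names, the data layout and the order of the stops are assumptions made here for illustration, and only route 23, its direction, stops 4001 and 4007, and the three departure times are taken from the extract. The sketch shows the point that the user discovers step by step, namely that the route search ends in the same stop-specific view as the stop search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stop:
    number: str
    street: str


# Hypothetical data layout; the order of the stops is an assumption.
STOPS_ON_ROUTE = {
    ("23", "Hervanta-Central Square"): [
        Stop("4007", "Messukyläntie"),
        Stop("4001", "Kalevantie"),
    ],
}

# Next departures per (stop, route), as shown on the display in line 31.
NEXT_DEPARTURES = {
    ("4001", "23"): ["11.18", "11.38", "11.58"],
}


def search_by_stop(stop_number, route):
    """Stop-specific view: the next departures of one route at one stop."""
    return NEXT_DEPARTURES.get((stop_number, route), [])


def search_by_route(route, direction, chosen_stop):
    """Route search: after route and direction, a stop must still be chosen."""
    stops = STOPS_ON_ROUTE[(route, direction)]
    if chosen_stop not in {stop.number for stop in stops}:
        raise ValueError("stop is not on this route and direction")
    # The route search leads to exactly the same stop-specific view.
    return search_by_stop(chosen_stop, route)
```

In this sketch there is no function that returns a whole-route timetable, which corresponds to the affordance the user assumed but the service did not offer.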
So, the user’s preconceptions about the nature of the service are combined with moment-by-moment interpretations of evidence found in the actual use of it. The system image, too, is something which exposes itself only through a sequential procedure. There are no problems in the manageability of the mobile phone. Instead, the structure of the service is not in every respect intuitive to the user: there are some difficulties in comprehensibility. The poor comprehensibility is in this case closely connected to the allow-ability of the service: the user wants to do something with the procedure which is not possible (e.g. search all the timetables of a bus route).

8. Conclusion

All human actions are situated in particular social and physical circumstances, and that is why the situation is crucial to the interpretation of action. Conversation analysis, based on the analysis of everyday communication, gives methodological tools to analyse the sequential organisation of action. Further, affordances give a chance to connect the analysis of this moment-by-moment ongoing activity to its material and social circumstances. In human-device interaction, it is not possible to consider the user and the device as symmetrical participants in the co-operation. There is always asymmetry between their situational resources. The device is not, however, a passive object of actions, even if it behaves on the basis of resources provided by software designed beforehand. The combination of conversation analysis and the theory of affordances makes apparent that the material, or rather material-semiotic (Raudaskoski 1999, 2002, 2003), features of the device play a significant part in understanding the use of the device (or the use of the application). By looking at the sequential organisation of situated user-device interaction, it is possible to notice the differences between the user’s expectations of the functions of the service and its designed structure.
In action we can perceive affordances that the device does not have, or we might not notice potential affordances it has. The analysis can also reveal our assumptions about the properties of the device, e.g. what its ‘needed’ affordances are. Together with academic knowledge, these analytical observations can be interpreted as a resource for a better design of the artefacts.

References

Arminen, Ilkka & Raudaskoski, Sanna (submitted for publication) Tarjoumat ja tietotekniikan tutkimus. [‘Affordances and the study of information technologies’]
Atkinson, Maxwell & Heritage, John (eds.) (1984) Structures of Social Action. Studies in Conversation Analysis. Cambridge University Press.
Bærentsen, Klaus B. & Trettvik, Johan (2002) An Activity Theory Approach to Affordance. A conference paper: NordiCHI, October 19-23, Århus, Denmark.
Dumas, Joseph S. & Redish, Janice C. (1993) A Practical Guide to Usability Testing.
Gibson, James J. (1979) The Ecological Approach to Visual Perception. Boston: Houghton Mifflin.
Hutchby, Ian (2001) Conversation and Technology from the Telephone to the Internet. Cambridge: Polity Press.
Järvilehto, Timo (1998) The Theory of the Organism-Environment System: I. Description of the Theory. Integrative Physiological and Behavioral Science, 33, 317-330.
Järvilehto, Timo (2000) The Theory of the Organism-Environment System: IV. The Problem of Mental Activity and Consciousness. Integrative Physiological and Behavioral Science, 35, 35-57.
Norman, Donald A. (1988) The Psychology of Everyday Things. New York: Basic Books.
Norman, Donald A. (1999a) Affordance, Conventions, and Design. Interactions, May + June 1999.
Norman, Donald A. (1999b) The Invisible Computer. Why Good Products Can Fail, the Personal Computer Is So Complex, and Information Appliances Are the Solution. London: The MIT Press.
Raudaskoski, Pirkko (1999) The Use of Communicative Resources in Language Technology Environments. A Conversation Analytic Approach to Semiosis at Computer Media. Academic dissertation: Faculty of Humanities, University of Oulu.
Raudaskoski, Pirkko (2002) Miten (digi)tv:n katsomista voi tutkia? Katsomistilanteessa keskustellaan televisiosta ja sen kanssa. [‘How to research (digi)television viewing?’] Mediumi 1.3. (3.12.2002) http://www.m-cult.net/mediumi/
Raudaskoski, Pirkko (2003) User's interpretations at a computer tutorial: Detecting (causes) of misunderstandings. In Prevignano, Carlo L. & Thibault, Paul J. (eds.) Discussing Conversation Analysis. The work of Emanuel A. Schegloff. Philadelphia: Benjamins.
Raudaskoski, Sanna & Arminen, Ilkka (2003) Mobiilipalveluiden näkymät – tapaustutkimus eTampereen mobiiliklusterin asiantuntijoiden näkemyksistä. Tietoyhteiskuntainstituutin raportteja 1/2003. [‘The future of mobile appliances. A case study on experts' future views inside the mobile cluster of Tampere’]
Suchman, Lucy A. (1987) Plans and situated actions. The problem of human-machine communication. Cambridge University Press.
Ylä-Kotola, Mauri & Arai, Mehdi (2000) Uusmediatieteen perusteet. [‘An introduction to new media studies’] Helsinki: Edita.

APPENDIX

Transcription conventions

< >  Performing an activity with the device
{ }  Menu (or page or state) that is opened through the activity
(( ))  Comments about ongoing activity
(0.5)  Numbers in brackets indicate a gap timed in tenths of a second.
(.)  A dot enclosed in brackets indicates a ‘micro pause’ of less than two tenths of a second.
=  Absolute contiguity between utterances.
[ ]  Square brackets indicate the points of overlapping talk or activity.
( )  Unclear utterance or other sound.
.hhh  Inward breathing. The more ‘h’s, the longer the breath.
hhh  ‘h’s with no preceding dot are used to represent outward breathing.
:  Colons indicate the stretching of the sound.
-  A dash indicates a sudden cut-off of the prior sound.
.  A full stop indicates a falling tone.
,  A comma indicates fall-rise or rise-fall (i.e. a continuing tone).
?  A question mark indicates a marked rising tone.
↑↓  Upward and downward arrows are used to mark an overall rise or fall in pitch across a phrase.
Under  Underlining indicates speaker’s emphasis. When connected to the activity of the device, indicates the place of the cursor.
< >  In talk, outward pointing arrow heads indicate noticeably slower talk.
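Because the conventions for timed gaps and micro pauses are purely textual, they can be located mechanically in a transcribed line. The following is a minimal sketch under that assumption; the function name `pause_stats` and the returned dictionary are hypothetical and are not part of any CA tool. It collects the timed gaps, counts the micro pauses and sums the timed gaps for one line of talk.

```python
import re

# "(0.7)" style timed gaps and "(.)" micro pauses, as defined above.
TIMED_GAP = re.compile(r"\((\d+\.\d+)\)")
MICRO_PAUSE = re.compile(r"\(\.\)")


def pause_stats(line):
    """Return the timed gaps, the micro pause count and the summed gap time."""
    timed = [float(value) for value in TIMED_GAP.findall(line)]
    return {
        "timed": timed,
        "micro": len(MICRO_PAUSE.findall(line)),
        "total": round(sum(timed), 1),
    }
```

Applied to the talk in line 38 of Extract 1, for example, such a count makes the hesitation noted in the analysis visible as two timed gaps within a single short utterance.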