Situating conversations within the language/action perspective: the Milan Conversation Model

Giorgio De Michelis, M. Antonietta Grasso
Cooperation Technologies Lab.
University of Milano
Via Comelico, 39
I-20135 Milano ITALY
Tel: (39) 2 55006311 / 313
E-mail: {gdemich, grasso}@hermes.dsi.unimi.it

ABSTRACT

The debate on the language/action perspective has been exciting the CSCW field for almost ten years. In this paper we recall the most relevant issues raised in it and propose a new exploitation of the language/action perspective, considering it from the viewpoint of understanding the complexity of communication within work processes and the situatedness of work practices. On this basis we have defined a new conversation model, the Milan Conversation Model, and we are designing a new conversation handler implementing it.

KEYWORDS

Language/action perspective, conversation, work process, commitment

INTRODUCTION

The language/action perspective [11, 12, 13, 33, 34, 35, 16] is one of the relevant theoretical contributions which have appeared within Computer Supported Cooperative Work. Fernando Flores and his co-workers [13] propose the following claim as the basis for the language/action perspective: "human beings are fundamentally linguistic beings: action happens in language in a world constituted through language". Meanwhile Finn Kensing and Terry Winograd [16] articulate its relevance for the analysis of cooperative work in the following terms: "Cooperative work is coordinated by the performance of language actions, in which parties become mutually committed to the performance of future actions and in which they make declarations creating social structures in which those acts are generated and interpreted."

In Proceedings of the 5th Conference on CSCW, October 22-26, Chapel Hill, North Carolina. ACM, New York, 1994, (to appear).

The language/action perspective has had a significant impact within the computer supported cooperative work field: its relevance has been frequently acknowledged [4, 32, 20, 15]; various researchers claim to have been inspired by it [7, 32, 15]; the package created within it by Action Technologies Inc., The Coordinator, has been installed in various work settings, in many cases with success [14, 10, 6]; the linkage of messages within conversations of The Coordinator has been acknowledged as one of the most valuable features of groupware systems [5]. Nevertheless, both its theoretical foundations and The Coordinator have been subject to criticism in a passionate debate lasting almost ten years. The main arguments against the language/action perspective are the following:

• speech act theory, in particular John Searle's contribution to it (the taxonomy of speech acts from the illocutionary viewpoint) and its embedding into the conversation for action model by Flores and Winograd, is wrong in that it assumes a one-to-one mapping between utterances and illocutionary acts, which is not recognizable in real-life conversations [4];
• the normative use of the illocutionary force of utterances, implicit in the idea of categorizing the latter by their illocutionary points and thus making intentions explicit and transparent, is the basis for developing tools for discipline and control over organization members' actions and not for supporting cooperative work among equals [29];
• the language/action perspective does not recognize that conversations embed a process for negotiating the agreement of meaning [23];
• the language/action perspective misses the locality and situatedness of conversations, because it proposes a set of fixed models of conversations for any group without supporting the group's ability to design its own conversation models [4];
• the language/action perspective offers only a partial insight into the work phenomenology; it has to be integrated with other theories ([20] lists the language/action perspective together with four other approaches as partial contributions to the development of a coordination theory);
• the language/action perspective supports the creation of tools usable only in some organizational settings: generally hierarchical, authoritarian organizations [23];
• The Coordinator, the conversation manager system developed within the language/action perspective, was successful when people used it without accepting the strict rules of behaviour it embeds, forcing it to support their particular behavioural patterns and needs ([6] shows that some users were successfully forcing its use as a teleconferencing system).

Some of the sharper critiques of the language/action perspective have been written by scholars either from the ethnomethodological school (Lucy Suchman) or influenced by it (John Bowers, Simon Kaplan and Mike Robinson). In a certain sense this is surprising, for ethnomethodology shares with the proponents of the language/action perspective the same theoretical grounds: namely a rejection of the Tayloristic approach to work analysis and design; an attention to the complexity (contextuality) of work processes; and a reference to twentieth century European philosophy, in particular to phenomenology and hermeneutics. Perfectly aware of this commonality are Alonso Vera and Herbert Simon [30] who, when discussing the situated action perspective, take as reference texts both the Suchman and the Winograd and Flores books (this citation should not suggest that we agree with the position Vera and Simon propose in that paper; as should be clear, our position is radically different, but a discussion of Vera and Simon's thesis is beyond the scope of this paper). Moreover, many authors (among whom we count ourselves) pay tribute to both the anthropological analysis of work situatedness and to the language/action perspective (e.g. [6, 15]); Lucy Suchman herself wrote in her review [26] of "Understanding computers and cognition" by Terry Winograd and Fernando Flores [33]: "[ethnomethodology], the field of social studies based on a critique of traditional sociology [emerged in the last 20 years], ... has much in common with Winograd and Flores' critique of the logical positivist tradition in philosophy, linguistics and psychology"; Finn Kensing and Terry Winograd recently claimed that they "have been influenced by ethnography" [16].

The apparent contradiction between the severity of the critiques and the common grounds has an impact on the ongoing debate. On the one hand, the debate on the language/action perspective has brought forth some of the crucial issues of computer support for cooperative work, as well as of any tool used within a social context. Among others we recall: the nature of work; the role of communication within work processes; and the role of conceptual categories within the analysis of social processes. On the other, it appears increasingly as a battle between two opposite armies: those who love The Coordinator and the Action Workflow (the workflow management system developed recently within the language/action perspective by Fernando Flores and his co-workers [21]), and those who hate them: you can either be with the language/action perspective or against it. Instead of opening new research directions, the debate freezes the positions as if they were completely defined. We think that this way of conducting the debate is wrong, for computer supported cooperative work is more a research topic than a well developed theory. Moreover, we think that the debate about the language/action perspective offers a lot to those who want to understand more about work and the related support tools, if they try to move forward instead of positioning themselves on the battlefield. It is remarkable that Lucy Suchman seems to share the same understanding and worries when she concludes her review of "Understanding Computers and Cognition" [33] with the following words: "If the discussions reduce to complaints about the particular characterizations offered, the book's value to the community will be minimized. But if the discussions are about not only this book, but what this book is about, then we may in fact gain new orientations toward cognitive science and computer design. And if Winograd and Flores manage to stir up those winds of change, they will have made a contribution indeed." [28].

In this paper, in accordance with Lucy Suchman's caveat, we try to move ahead from a static confrontation and propose some developments to the theoretical grounds of the language/action perspective and to its application in designing computer support for cooperative work. The proposed developments originated from the debate on the language/action perspective and try to show how that debate offers the basis for rethinking and enhancing the perspective itself.
The paper is organized as follows: in the second section we analyze conversations and commitments as part of the context of work processes, offering a more articulated insight into their mutual relations; in the third section we discuss computer-based tools as supports to sustain the complexity of communication; in the fourth section we propose the new model of conversation that derives from the considerations of the two previous sections. In the conclusion we give some hints about our research agenda.

CONVERSATIONS AND COMMITMENTS: A NEW APPROACH

The situated action perspective has taught us that human beings participate during their lifetime in various different social experiences, each one of which is characterized by the group of persons with which it is shared and by its history: our work, our family life, our social and/or political engagements are some examples of these social experiences, each with its own history and its own participants. We claim moreover that any utterance a person directs to other persons in a certain moment is an event (and therefore a speech act [2, 24, 33]) within their common history: the history of our work is made by the communication we perform in it, since any event emerges in our social experience only by our speaking and listening about it.

The history of the common experiences of a group of persons generates the context within which they interpret any new speech act: no speech act can have either a meaning or an illocutionary point [25, 33] out of a context. Any group of persons sharing a social experience shares a language, a physical space, an agenda, ... without which no interpretation is possible. Moreover, the interpretation of a speech act within a given context depends on (and defines) the viewpoint from which it is performed. It is, in our opinion, impossible to assume that two persons interpret a speech act in the same way, because it is impossible to ascertain that they have in mind the same context and that they share the same viewpoint (we also think the multiplicity of viewpoints depends on the multiplicity of contexts, but any discussion of this is beyond the scope of this paper). As linguistics has well explained, the sentence "The soup is too insipid for me" is a negative evaluation within a competitive context and a request for some salt during a family dinner. Speech acts are, by necessity, incomplete and/or ambiguous, both semantically and pragmatically. The image a person has of any one of the many contexts characterizing her social experience is not that context itself, but only a partial and personal account of it: any new event within a group of persons sharing a social experience induces a change of the context within which they interpret their mutual speech acts (the evolution of the context can be considered the learning of the group), as well as of their personal images of it (the learning of the individuals). It is in the interplay between group and individual learning that both misunderstanding and creativity emerge. The complexity of human communication is both the ultimate cause of all the breakdowns occurring within a group of persons and the space where they can face them as opening new possibilities.
From this viewpoint, any speech act a person makes is therefore situated in multiple contexts. Thus attributing to it a single intention is impossible. Whoever performs a speech act, whoever acts in the language, cannot determine how it is listened to; she can only influence it. Within a conversation, meanings, intentions, mutual relations ... evolve and eventually converge on the basis of the listening of the participants with respect to the utterances, and of their ability to show their listening with declarations of satisfaction such as "Yes!", "O.K.", "Thank you!". Communication is therefore determined by listening, not by saying! Within a conversation, whoever performs a speech act makes an attempt whose effectiveness is verified by the answer it receives. As we have said above, whenever a person answers "Yes", "O.K.", or any other (implicit or explicit) declaration of satisfaction, she as a listener exhibits her listening as an agreement with what she is answering to. Her agreement is in any case purely pragmatical: with her declaration of satisfaction she extends the semantical and/or illocutionary space of possibilities of the conversation, but she cannot make explicit the semantical and/or illocutionary content of her agreement.

If we restrict our attention to work practices, we can concentrate on the communities of practices [18, 26] and on the work processes they perform. As we have discussed more extensively in [8, 9], the basic unit of context of work practices is what we call a work process. Work processes, in our understanding, are not input-output transformations, but communicative relations (if necessary, embedding input-output transformations) between those who request a service and those who perform it (for this intuition we pay tribute to Fernando Flores and his co-workers who developed the model of the Action Workflow [21]). The relational model of work process sketched above allows us to ground the full conceptualization of work experiences on human communicative relations: the notion of work process on the one hand rephrases the concepts of activity and of work practice, as they appear respectively in activity theory [17] and in the ethnomethodological studies of work [27]; on the other, it gives clear criteria for distinguishing basic units of work without assuming any univocity of the distinction procedure. A work process therefore characterizes a community of practices by the commitments its participants have taken and possibly fulfilled and/or by the conversations they are performing. Communities of practices are, in any case, also characterized by other factors: for example the roles played by participants, the space they occupy, the artifacts they use, the procedures standardizing complex action executions, the tasks the participants have within the latter, etc. While it could be shown that past and current commitments and conversations generate the organizational structure, at every moment during the life of a community of practices the interplay between conversations and the organizational structure elements is the basis of the ability of its participants to give sense to their behaviour.
Any request for an action makes reference, at last, to something the doer knows how to do, in terms of the tasks assigned to him/her, the available artifacts, the started procedures, etc. When a conversation is opened, its subject is related to the current conversations, but also to the state of the ongoing procedures, to the special configuration of the work setting, to the available artifacts, etc. The multiplicity of work processes (each person may question in which work process she is currently situated) together with their opacity (each person can only be partially aware of what is going on within a work process) does not allow any (internal or external) observer to assume a fixed context for her observation: observing a work process is always on the one hand attempting an interpretation, on the other focusing the attention of other participants on the work process. Listening to an utterance therefore means interpreting its semantic value and/or its illocutionary point in a context: the speaker cannot be sure of the interpretation of the listener because of the multiplicity of possible contexts where the listener is listening. The listener can give multiple interpretations to a single utterance: listening is a (the) creative moment within a conversation, modifying the context where the conversation is situated; listening triggers both group and individual learning.

Reducing a conversation from a chain of related utterances to a chain of moves with respect to a commitment, as Flores and Winograd seem to do with their conversation for action model [33], is therefore negating the basic learning mechanism of a community of practices. This does not negate that communication within a community of practices generates its organizational structure, that utterances are speech acts, that the pragmatical dimension of communication is constituting the community of practices. The ontological nature of the community of practices and of their communication can only be captured through an observation that reflects its multiplicity. If we cannot reduce a conversation to a commitment, a speech act to an illocutionary point, we can in fact consider any commitment as a viewpoint from which a conversation becomes (univocally) a conversation for action in Flores and Winograd terms, from which a speech act assumes (univocally) a given illocutionary point. The creative act of listening can therefore be understood as the creative act of the listener choosing (creating) the commitment(s) through which she enters in relation with the speaker. Making a commitment explicit within a conversation is therefore always possible. And it can be useful if that commitment assumes a relevance in the life of a community of practices, but it does not allow us to reduce the conversation to it. Explicit commitment negotiations are in fact artifacts of a special type (namely procedures) within work processes. They are procedures because they are well defined sequences of steps. They are special procedures because they have a communicative nature that can always emerge in the behaviour of its participants. In our approach the ontological nature of the pragmatical dimension of human communication is strictly linked to the irreducible complexity of human social experience. 
It is the condition of our life to be immersed in networks of commitments constituting the communities of practices within which we live, even if we can never reduce them to the images (of them) we have in mind. Any metaphor, any theory, any set of conceptual categories through which we observe the behaviour of a community of practices also has a performative nature [3] because it can discipline our behaviour. Performing a community of practices from the viewpoint of the language/action perspective opens to its participants the possibility of understanding their cooperation in terms of commitments and conversations, augmenting their relational effectiveness (the recent book of John Whiteside "The Phoenix Agenda" goes in this direction [31]). Recognizing the performative nature of conceptual categories does not mean attributing to them a normative role: performing a community of practices from the viewpoint of the language/action perspective opens new possibilities, while regulating the behaviour of its members from the same viewpoint closes them. Thus the argument of Lucy Suchman against the normative conception of The Coordinator [29] appears in our opinion distinctly justifiable, even if we must keep in mind that successful users of The Coordinator reveal its normative strength to be very weak.

In which direction can we go to design a system best supporting the possibility of its users to perform a community of practices freely? The above discussion offers us some interesting hints. On the one hand, we cannot constrain the communicative behaviour of its participants. Even if the users are allowed to design their own conversation patterns, we are de facto adopting a normative conception of tools. As we have said above, it is not possible for any (internal or external) observer to substitute her image of a work process with the work process itself. Moreover, two observers can agree only pragmatically on their observations without being able to make explicit the semantical and/or illocutionary content of their agreement. Tools are usable to the extent they respect the complexity of human communication. On the other hand, a community of practices uses artifacts of various types, among which are often relevant procedures. Users need to design their procedures in order to shape work processes in which they participate. The nature of commitment negotiations, both procedures and communicative events, encourage us to consider them special procedures that are embedded within conversations. The negotiation in itself exhibits its procedural nature, while the conversation to which it is attached exhibits its communicative nature. This choice seems to us well suited if it is true that users need tools supporting general patterns of behaviour, allowing them to perform freely the community of practices they are part of. From this viewpoint tools are considered enhancers of the possibilities of their users, bringing forth the context where they are both freely communicating and negotiating mutual commitments.

SUPPORTING CONVERSATIONS AND COMMITMENTS

In the previous section we presented a new exploitation of the language/action perspective, where considerable attention is paid to capturing the full complexity of human cooperation within work processes. We do not change the basic assumptions of the language/action perspective as defined by Fernando Flores, Terry Winograd and their co-workers, establishing that human beings cooperate within a work process through their conversations and through the mutual commitments taken within them. Our attempt is to combine them with full recognition of the complexity of human communication as it appears when we look carefully at the communicative behaviour of the persons participating in a community of practices, at the problems they encounter conversing with each other, and in particular at their listening, from the semantical and pragmatical viewpoints, to the speech acts constituting a conversation. The non-unique links between conversations and commitments which emerge when we pay attention to the central role of listening within cooperative processes continuously renew the distance between the community of practices and its participants, generating both obstacles for the flow of the work and new possibilities for it.

Reducing the complexity of a work process and of the communication going on within it is not, in our opinion, what is needed by its participants. Rather, they need to be supported in coping with that complexity, without reducing it. This means that any tool supporting a community of practices must broaden and not restrict the range of the possibilities of its members. The different evaluations that a tool such as The Coordinator has received by different groups of users shows that generally it is not possible to distinguish clearly between the tools opening new possibilities and those restricting them. Tools are in fact not what their designers think they are but what their users learn to do with them. Though we recognize this further ambiguity, we think that if someone designs a new tool on the basis of a theory taking fully into account the complexity of the work practices she wants to support and characterizing the mechanisms generating it, she will reduce the gap between her image of it and that of its users. With this guideline in mind, we will exploit in this section the theoretical framework proposed in the previous one, specifically in terms of the services communities of practices need to communicate effectively. We can distinguish two types of support a person needs within any activity: •the possibility of acting as she thinks best and with minimal effort; •the possibility of situating herself in the best way in the context where she is acting. Any support system must offer both the above services, since the deficiencies of either one may also weaken the other: on the one hand if acting is badly supported, it requires great attention on the how distracting from the what; on the other without understanding the context, it is impossible to understand the effectiveness of any action. In the sequel we discuss a wide range of issues regarding tools supporting communication within communities of practices. 
In the next section we will introduce the Milan Conversation Model and the prototype of the support system we are developing on its basis (which is only a partial answer to the issues raised below). We are at the very beginning of the design of a new family of support systems for work processes. The issues we raise here describe the scenario where each step within it must fit.

Multimedia conversations

As one of the authors and his co-workers have discussed extensively in [1], the observations of Stephen Reder and Schwab on the frequency of switching between different media of communication within any non-trivial interaction sequence [22], together with those of Christine Bullen and John Bennett on the utility that users attribute to message linking mechanisms [5], indicate that the basic unit of communication that should be considered is the multimedia conversation. A multimedia conversation is a sequence of synchronous (face to face meetings, telephone calls, video sharing sessions, videoteleconferences, etc.) and/or asynchronous (email messages, faxes, voice messages, video letters, etc.) communicative events between a group of persons on a certain issue. Supporting multimedia conversations opens a series of new problems requiring effective solutions.

First, a system supporting a multimedia conversation should be usable in any type of communication event and, therefore, should be useful within any part of the conversation. Email systems, even conversation handlers such as The Coordinator or ConversationBuilder, teleconferencing systems, and any system supporting communication through one single medium are all ineffective in this respect, because either they restrict their users to one single communication medium or they support only some communication events and not the whole conversation. In all cases where the workstation is part of the communication channel, or can be part of it (from email to videoteleconferences), this problem has various good solutions available in existing tools and prototypes. The same cannot be said for communicative events such as face to face meetings, where the workstation is not part of the medium: in these cases the tool must be able to offer services within the communicative event, creating the conditions for its use. The UTUCS prototype, described in [1], is a prototype of a system supporting multimedia conversations, where particular attention has been paid to the modules supporting face to face meetings, in line with the system developed at DEC by John Whiteside [31].

Secondly, a system supporting a multimedia conversation should be able to generate a record of any communication event of any type, and to link it to the records of the previous events within a conversation. Any record of a communicative event type should be semi-structured [19], and its structured fields should be compatible with those of the other event types.
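The idea of semi-structured event records linked into a conversation can be illustrated with a minimal sketch. It is not the UTUCS implementation: the class names, the field names, and the lists of media are assumptions made for this example. Every event type shares the same structured fields (medium, participants, issue, timestamp), which is one possible reading of "compatible" fields, and each record is linked to the record of the previous event.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

# Hypothetical media names, taken from the examples given in the text.
SYNCHRONOUS_MEDIA = {"meeting", "telephone", "videoconference"}
ASYNCHRONOUS_MEDIA = {"email", "fax", "voice_message", "video_letter"}


@dataclass
class EventRecord:
    """Semi-structured record of one communicative event."""
    # Structured fields, identical for every event type.
    medium: str
    participants: List[str]
    issue: str
    timestamp: datetime
    # Unstructured part: message text, minutes, a pointer to a voice record...
    content: str = ""
    attachments: List[str] = field(default_factory=list)
    # Link to the record of the previous event in the same conversation.
    previous: Optional["EventRecord"] = None

    @property
    def synchronous(self) -> bool:
        return self.medium in SYNCHRONOUS_MEDIA


@dataclass
class Conversation:
    """A sequence of communicative events on a certain issue."""
    issue: str
    events: List[EventRecord] = field(default_factory=list)

    def record(self, medium, participants, timestamp, content="", attachments=None):
        if medium not in SYNCHRONOUS_MEDIA | ASYNCHRONOUS_MEDIA:
            raise ValueError(f"unknown medium: {medium}")
        event = EventRecord(
            medium=medium,
            participants=list(participants),
            issue=self.issue,  # assumption: the issue is copied from the conversation
            timestamp=timestamp,
            content=content,
            attachments=list(attachments or []),
            previous=self.events[-1] if self.events else None,
        )
        self.events.append(event)
        return event
```

In this sketch an email and a meeting recorded in the same conversation end up with the same structured fields, and following the `previous` links from the last record walks back through the whole conversation regardless of medium.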
We can consider as adequate records, respectively: the message itself (with all its structured fields and possibly its annexes) for email; the minutes (together with the documents distributed at the meeting and some structured fields describing the participants, the various items of the agenda, ...) for the meeting; the voice record (together with some structured fields describing the participants, the issue under discussion, ...) for the telephone call; etc. The integration provided by the links creating the conversation allows us to reduce to a minimum the role of the user in filling the structured fields of the above records and recording the documents. On the one hand, in fact, almost all the structured fields can be filled automatically on the basis of the conversation which the communicative event is part of; on the other, the use of electronic devices for storing documents offers them in electronic format for the multimedia conversation record. We have done some experiments with problems of this type using the UTUCS prototype [1].

Thirdly, systems supporting multimedia conversations should be able to offer their support in different locations (people move during their work time), and possibly to manage networks of heterogeneous workstations, offering different opportunities in terms of access to communication media. As yet we have not faced this problem directly, but we think that the integration of different communication media necessary to link together the records of communicative events of different types represents the natural environment where such a problem may find an elegant and effective solution.

Commitments and conversations

Commitments are negotiated and taken within conversations. On the one hand, as pointed out by John Bowers and John Churcher, the relation between conversations and commitments is not a one-to-one relation [4]; and, as we have discussed in the previous section, reducing a conversation to a commitment can restrict the possibilities open to its participants. On the other, making a commitment explicit is sometimes very useful, in particular when we must be sure that it will be completed satisfactorily. The explicit negotiation of a commitment allows its participants to take it into account in the framework of their activity, as long as they can ensure that work processes will follow the procedures designed in it, if any. An explicit negotiation of a commitment is always a joint choice: it makes clear a point of view, but it does not prohibit other views from being taken into account. We can overcome the apparent contradiction between the two above points if we think of a conversation as a sequence of communicative events (see above) to which can be attached not only documents of any type (as annexes) but also any number of commitment negotiations. From this viewpoint, conversations are just sequences of communicative events involving two or more participants, where each participant is free to be as creative as she wants in her listening. Conversations can be supported by a system making accessible the sequence of records of the communicative events, together with the documents generated and/or exchanged and with the commitment negotiation steps which occurred during them. A support system of this kind in fact helps the user to understand the context where she is acting. Let us distinguish, within the information characterizing the context of a work process, the part related to the temporal connection of events and the part bringing forth the peculiarities of any event, calling them respectively "in the history of" and "in articulation of".
Then the conversation record represents information of the type "in the history of", while all the attached enclosures and commitment negotiations of any message represent information of the type "in articulation of". Within this framework a commitment is defined by the respective negotiation steps performed within a conversation by two persons (playing, with respect to the commitment, the roles of doer and referent respectively) and by the documents that are attached to them. Any negotiation step of a commitment is characterized by its object, its completion time and its state. The state of a step of a commitment negotiation reflects the illocutionary content of the communication event to which it is attached from the viewpoint of that negotiation, without reducing the interpretation possibilities it opens to its participants. In other words, we could say that a commitment negotiation corresponds to the projection of a conversation on the conversation for action relative to it, in accordance with the terminology of Terry Winograd and Fernando Flores [33]. Commitment negotiations can be supported by a system making accessible the sequence of records of their negotiation steps, together with both their annexes and the conversations where they are embedded. A support system of this kind in fact helps the user to understand the context where she is negotiating, as well as the state of the negotiation. Commitment negotiations are therefore fully transparent to their actors within conversations without imposing any normative constraints upon them. The aim of the Milan Conversation Model therefore consists in modelling multimedia conversations embedding commitment negotiations as a basis for the design of a system supporting them.
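The relation between communicative events and commitment negotiations described in this section can be sketched as a small data model. This is only an illustration under stated assumptions: the class and function names are hypothetical, and the set of step states is borrowed loosely from the conversation for action vocabulary rather than taken from the Milan Conversation Model itself. Each event may carry any number of negotiation steps, and a commitment negotiation is obtained by projecting the conversation onto the steps of one commitment.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional


class StepState(Enum):
    # Hypothetical states, loosely inspired by the conversation for action.
    REQUESTED = "requested"
    PROMISED = "promised"
    COUNTERED = "countered"
    DECLINED = "declined"
    REPORTED_DONE = "reported_done"
    SATISFIED = "satisfied"


@dataclass
class Commitment:
    commitment_id: str
    doer: str      # the person who performs the action
    referent: str  # the person the action is performed for


@dataclass
class NegotiationStep:
    """One step of a commitment negotiation: object, completion time, state."""
    commitment_id: str
    obj: str
    completion_time: date
    state: StepState


@dataclass
class Event:
    """A communicative event; it may carry zero or more negotiation steps."""
    summary: str
    steps: List[NegotiationStep] = field(default_factory=list)


def project(events: List[Event], commitment_id: str) -> List[NegotiationStep]:
    """Project a conversation onto the negotiation of one commitment."""
    return [s for e in events for s in e.steps if s.commitment_id == commitment_id]


def current_state(events: List[Event], commitment_id: str) -> Optional[StepState]:
    """State of the latest negotiation step, or None if never negotiated."""
    steps = project(events, commitment_id)
    return steps[-1].state if steps else None
```

The events without steps remain part of the conversation: the projection only selects a viewpoint on the conversation and does not replace it, which is the design choice the text argues for.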

THE MILAN CONVERSATION MODEL

The Milan Conversation Model (MCM), which we are developing both as a theoretical framework for understanding communication within work processes and as a system supporting it, tries to exploit the language/action perspective within the ethnomethodological approach to work. As in the two previous sections, it considers a conversation within a work process as a complex object, made up of a multimedia conversation record and of the commitment negotiations occurring within it. The concept of multimedia conversation, which has already been developed and implemented at the Laboratory for Cooperation Technologies of the University of Milan in the UTUCS (User To User Communication Support) system prototype [1], as highlighted in the previous section, is fully part of the model.

Figure 1: UTUCS architecture (components shown: User Interface; Telephone; Face to Face Couple Colloquies; Electronic Mail; Face to Face Group Meetings; Message Switching System; Mail Box; Talks; Conversations; Minutes; Information Base)

The UTUCS system in fact supports conversations for action performed by means of various media such as

The first provides the user with information on all the ongoing conversations and a way to check which of them contain new messages (possibly with new steps of commitment negotiation). Some filters can be selected by the user to extract subsets of the conversations in accordance with particular keys. The second view allows her to focus on a single conversation (Figure 2).

[Figure 2: single conversation in viewing mode. The screen shows the conversation "Our Participation at CSCW'94", started on 11/09/93, with its participants, its negotiations, its 5 events, and the last event (5).]
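The first view, with its filters, can be illustrated by a short sketch. The names below (`ConversationSummary`, `overview`, `unread`, `with_participant`) are hypothetical and are not taken from the prototype; the sketch only assumes that each conversation knows how many events it contains and how many of them the user has already seen.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ConversationSummary:
    title: str
    participants: List[str]
    event_count: int       # number of events recorded so far
    last_seen_event: int   # number of events the user has already seen

    @property
    def has_new_events(self) -> bool:
        return self.event_count > self.last_seen_event


Filter = Callable[[ConversationSummary], bool]


def overview(conversations: List[ConversationSummary], *filters: Filter) -> List[ConversationSummary]:
    """Return the conversations that satisfy every selected filter."""
    return [c for c in conversations if all(f(c) for f in filters)]


def unread(c: ConversationSummary) -> bool:
    """Filter: conversations containing events the user has not seen yet."""
    return c.has_new_events


def with_participant(name: str) -> Filter:
    """Filter factory: conversations involving a given participant."""
    return lambda c: name in c.participants
```

Calling `overview(conversations)` with no filter lists every ongoing conversation, while `overview(conversations, unread, with_participant("ann"))` narrows the list to the conversations with new events in which a hypothetical user "ann" takes part.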