LANCASTER UNIVERSITY Computing Department

Developing a Methodology for the Evaluation of Cooperative Systems

Magnus Ramage
Cooperative Systems Engineering Group
Technical Report Ref: CSEG/13/1997
http://www.comp.lancs.ac.uk/computing/research/cseg/97_rep.html

Published in Proceedings of IRIS 20 (Information Systems in Scandinavia), Hanko Fjordhotel, Norway, 9-12 August 1997.


CSEG, Computing Department, Lancaster University, Lancaster, LA1 4YR, UK Tel: +44-1524-65201 Ext 93799; Fax: +44-1524-593608; E-Mail: [email protected] http://www.comp.lancs.ac.uk/computing/research/cseg/

Developing a methodology for the evaluation of cooperative systems

Magnus Ramage
CSCW Research Centre, Lancaster University, Lancaster LA1 4YR, UK
[email protected]

Abstract

Cooperative systems are collections of people and organisations in cooperation facilitated by technology. How does one evaluate the effects of such systems at different stages of their development? This paper suggests that such evaluation is rather difficult, and requires an awareness of the viewpoints of many people and groups rather than one view. This leads me to regard evaluation as an ongoing learning process. I discuss the ways I have come to these conclusions both in theoretical understanding and in a practical context of evaluation.

Keywords: CSCW, cooperation, evaluation, stakeholders, organisational learning, systems thinking, multiplicity

On cooperative systems and their evaluation

This is a paper about how to evaluate cooperative systems. This phrase is often used as a synonym for those computer systems which support cooperative work, often called computer-supported cooperative work (CSCW) systems or groupware; but I use it here in a slightly wider sense, to mean: a combination of technology, people and organisations that facilitates the communication and coordination necessary for a group to effectively work together in the pursuit of a shared goal, and to achieve gain for all its members. This definition will be justified below. The complexities of the term ‘evaluation’, and my preferred usage, will also become clearer later in the paper, but a working definition will be the examination of such systems in use (either in a ‘real’ setting or in a laboratory) for potential improvement or other purposes.

Such evaluation is generally regarded as a highly complex task (Ross et al., 1995; Bannon, 1993; Grudin, 1988). It is difficult methodologically (because the familiar methods of usability evaluation have had little to say about use by multiple people), practically (the effects of the change can only be seen in the longer term and often in different places), psychologically (systems designers and evaluators must make a Copernican-style shift from just-the-technology to technology-in-use-in-an-organisation) and politically (for once one begins to look at the evaluation of organisations, many issues of priority and conflicting interests arise).

This is particularly apparent given the social embeddedness of the computer systems: in information systems in general, cooperation takes place using the computer; in cooperative systems, that cooperation takes place via the computer. This means that any organisational conflicts or clashes in individual personalities and cultures will not only be more readily apparent but will directly affect how well the system works. It may well be the case that a computer system is designed perfectly, with all the right sort of software engineering procedures, requirements analysis, and usability testing; but that the system is introduced insensitively, or it cuts across the way people have become used to working, or it changes the power relationships between workers at different levels of the organisation. All of these have been well documented in case studies in the CSCW literature (e.g. Bowers et al., 1995; Grudin, 1988; Ramage, 1994).


In such situations, it may be the case that the system is not used by sufficient people to attain the critical mass Grudin points out is necessary for some systems to be useful; or it may be that people devise work-arounds that bypass the computer and feed it information later, as in Bowers’ study; or it may be that people grumble about the “Big Brother” systems but have no choice but to use them (as in my study).

One of the things that makes CSCW evaluation so fraught is the wide range of different perspectives that need to be brought to bear: usability, individual psychology, group dynamics, the efficiency of communications, the effects of and on organisational structures and cultures, and so on (Ross et al., 1995). All too often, most of these are excluded, and a narrow disciplinary view is taken (according to the backgrounds and skills of the evaluators). This leads to the importance of a multiplicity of theories brought to bear on the evaluation, to counter this and ensure the whole range of experiences with the system is considered. A further problem is the dominance of the views of experts — the archetypal ‘scientists in white coats’ — over those of the people who actually use the systems. In fact, both are necessary: we must be aware of the theoretical, analytic side to the evaluation (based on scientific analysis of users’ experiences) but also of the users’ own perception of their experiences. Our solution in Ross et al. (1995) was to propose a framework based on multiple perspectives on evaluation — in our case study, those of users and evaluators — and methods to allow both to be reflected.

The question of multiple perspectives needs further unpacking, however. The category “user” is one that has become increasingly disputed, as ignoring the reality of people’s work and shoving them into a category whose focus is on the computer (Grudin, 1993). I have been particularly interested in the different needs of the many groups of people with a stake in the nature and effects of a CSCW system, and how these different needs change the way they evaluate it. For the print-room staff in Bowers et al. (1995)’s study, the workflow system imposed upon them interfered with their work, and made it less efficient and interesting; but for their managers, it provided useful data about how they were working. The question of multiple stakeholders in the evaluation thus arises.

A third kind of multiplicity that can be seen to be a problem is the range of activities carried out under the banner of evaluation. For some people, it is a process of making things better during the development of a (software) system; for others, a process of telling whether a piece of research is of sufficiently high quality; for others, a process of deciding what piece of software to buy; and for still others, a process of figuring out whether a change in one’s organisation and/or technology has had an effect, and what that effect is. Few evaluation methodologies seem to be aware of all of these potential meanings of evaluation, rather giving preference to one or another. There is nothing wrong with this in principle, given that most methodologies are limited in their scope; but there is a problem in that they often tend to act as if theirs were the only kind of evaluation that existed, and to ignore the other kinds.
In the rest of this paper, I shall describe my experience in considering how this evaluation takes place, and how it might be done; and eventually (though it pops up in nascent form in various places) I shall describe the methodology which I have developed to do such evaluation, entitled Systemic Evaluation for Stakeholder Learning (SESL). The paper takes the form more of a journey than a standard academic paper, and as such the writing style is more personal than is often the case. I shall not here be comparing the SESL methodology with the methodologies described by other writers within CSCW, due to lack of space — good overviews can be found in a number of the papers mentioned earlier, notably Grudin (1988), Bannon (1993) and Ross et al. (1995), and in the book edited by Thomas (1995).

What are cooperative systems?

I promised above to justify my definition of the term “cooperative systems”. The key point about my definition is an attempt to shift the term away from a purely technical usage, like that of Sommerville et al. (1993), who use it to mean “systems which [are] essentially cooperative in the sense that they [are] team-based”. I wish to do this because of the need described above to look at a wider view, and because of the long history of the words ‘cooperative’ and ‘systems’ in both academic and practical discourse.

Therefore, I shall attempt to discern something of the meaning of the term ‘cooperative systems’ by considering the meaning of its constituent parts. I must stress that I do this for the sake of understanding, appreciating the dangers of pulling things apart to understand them: this is why the key part of this process is the final definition.

The Nature of Cooperation

It was Marx who first used the term “cooperative work” - although, as Hughes et al. (1991) have more recently pointed out, to suggest that there could be such a thing as work that is not cooperative is to misunderstand the nature of the modern organisation and the fundamentally intertwined nature of all work. However, it is possible to make some sensible remarks about cooperation and its nature. Cooperation literally means “working together” (from the Latin “co”, together, and “operare”, to work). We can see this as the root of all modern uses of the term - it denotes some kind of activity conducted between two or more people, to common gain. Much work has gone on in social psychology in studying the subject, and from this Argyle (1991, p.4) offers as a definition of cooperation: “acting together, in a coordinated way at work, leisure or social relationships, in the pursuit of shared goals, the enjoyment of the social activity, or simply furthering the relationship”. In this descriptive sense, then, cooperation refers to any sort of activity that two or more people conduct together. Studies of such situations have been made by sociologists (e.g. Heath and Luff, 1991), by anthropologists (e.g. Hutchins, 1991), and by social psychologists (e.g. Axelrod, 1984). A point that must be made about the last of these three groups is that the studies of cooperation in social psychology have tended to use artificial laboratory situations where only a limited amount of cooperation is permitted (such as the much-used but limited — because of its artificiality — Prisoner’s Dilemma), whereas the sociologists and anthropologists study real situations of cooperation.

A few principles about cooperation arise from this work. It requires communication and awareness of others’ actions, thoughts and feelings. It also requires the establishment of shared understanding and shared (or at least mutually compatible) goals between those cooperating. There is usually some kind of benefit (not necessarily material) for all participants; it can be highly productive and personally satisfying.

This is essentially descriptive. There have also been those keen to prescribe cooperation as a good in itself: this has happened among social justice campaigners, assertiveness and negotiation trainers, and that part of the political left which has remodelled business as a partnership of mutual owners, under the name of the cooperative movement. Cooperation is often contrasted with conflict, as illustrated by a cartoon used by a number of justice groups. Two mules, joined by a rope, are shown, each with a pile of food. The rope is too short to allow them both to eat from their own piles at once. They strain against each other for a while, resulting in neither of them eating anything, then realise their folly and join together: first they both eat from one pile of food, then from the other. The caption reads: “Cooperation is better than conflict”.

This contrast is not always helpful. As Howard (1987:176) has written, “to the degree that the idea of ‘cooperative work’ neglects or underestimates the role of conflict in working life and in the design and use of information technology, it leaves out a major (indeed, central) dimension of work experience”. That is, if we ignore the fact that within systems of work that involve cooperation there will also be conflict between the different people and groups involved, we will be in danger of ignoring the needs of many of those groups.
This was precisely the criticism that Ehn and Kyng (1987) levelled against the socio-technical approach: that in the name of industrial harmony, it perpetuated the existing power relations and imbalances. I have considered this question below in terms of the multiple stakeholders relevant to a cooperative system. We must also be wary of taking cooperation always to be entirely positive. A great deal of cooperation, of precisely the sort I have described, can be found within armies, terrorist groups, the Mafia and other organisations which are at best of questionable value and more often morally wrong.


In these organisations, cooperation takes place within the group, but it is harnessed to ends that are harmful to those outside the group.

Systems and Systems Thinking

It is crucially important to distinguish between two uses of the word “system” when we consider it in this context. The word is in use by computer scientists and organisational theorists to denote a collection of computer hardware, software and networks - one talks of a computer system and means this kind of technological mixture. Indeed, “information systems” is the name under which a large part of those who study computers in use collect themselves. In computer circles this overshadows an older and more general use of the term, which has been current since around 1940 - systems thinking.

Systems thinking is an approach which views the world in terms of models which have the common property that, in Aristotle’s phrase, “the whole is greater than the sum of its parts” - there are properties of the entity viewed as a whole that are not to be found by considering the constituent parts of that entity. A good example is given by Lewis (1994:44), who considers a bicycle. This is composed of a number of pieces - two wheels, frame, handlebars, chain, saddle and so on - but taken separately none of these has any particular meaning. However, by combining the pieces together in the right way, we may create a system that affords transport. That is, the ability of a bicycle to carry me to work (given motor power from my legs) is an emergent property of the complete system. As Senge (1990:68) puts it, “systems thinking is a discipline for seeing wholes ... a framework for seeing interrelationships rather than things, for seeing patterns of change rather than snapshots”.

This older notion of system - from which the computer sense arose - is intended as a general perspective on all kinds of entities. It arose in biology, through the work of Ludwig von Bertalanffy (1969), and while to some extent, as Morgan (1986:45) comments, it can be seen as a “biological metaphor in disguise”, it is taken by its advocates to be considerably more general. Indeed, Kenneth Boulding, one of those working under the banner of “general systems theory”, has written of the world as a “total system” (1985), identifying a hierarchy of eleven levels of system in which we can view the world, ranging from the lowest level of mechanical systems (those governed by Newtonian physics), to social systems (the interaction of human beings, and the use of artefacts) and transcendental systems (which we know through religious experience). While Boulding gives examples of all kinds within the ‘lower’ systems, most of human collective experience can be found in the category of the social systems. Accordingly, these have been the objects of study of most systems theorists. In particular, they have concerned themselves with the study of the interactions of people within organisations, and particularly with companies and public-sector institutions. Thus, systems thinking has been substantially used within business schools.

Putting the words together: what is a cooperative system?

So to summarise the above material: cooperation is a process of two or more people engaging in an activity for shared gain, supported by communication and coordination; and a system is a collection of objects with emergent properties, here involving people and technology. Putting these together gives the definition at the start of the paper. A few comments can be made about this definition:

• technology is seen as one part of the system, but co-equal with the organisational and human structures also necessary for cooperation;

• the role of the system is the facilitation of other activities: it is a means to an end. That end may just be the process of cooperation, but it is not the use of the system;

• for cooperation to take place, there must be some kind of gain for all members.

Of course, this is just one view of what constitutes a cooperative system — other views may emphasise the technology more, or the group dynamics more.


For me, keeping a balance between the different elements of technology, people and organisations is the key to understanding the nature of cooperative systems.

Developing the methodology in theory

It is a common piece of academic finesse to distinguish between theory and practice: between the time one spends in one’s ivory tower reading great works and analysing field-notes, and the time one spends in ‘The Field’ (or, depending on the norms of one’s discipline, ‘The Lab’ or indeed ‘The Computer Room’) getting real things done, or at least watching real people do them. I am more than a little sceptical about this dualism: my experience is that a great deal of thought-work goes on in the field, and that the ivory tower is not nowadays so far removed from the real world as the myth goes (especially in these days of cuts and under-funding of universities). That aside, I shall proceed to use the theory-practice dualism in an entirely conventional way. So: this section is about how I developed my methodology with reference to various theoretical questions, and the following section is about how a particular study led me to various of the issues involved. Of course the two are related, not least because both enterprises took place at the same time — but separating them allows me to consider these questions in slightly different ways.

Systemic Thinking

Let us then start by looking again at systems thinking. What does it mean for something to be systemic? It means that it is based on the whole system, that it takes a holistic view of the situation under study. Of course, one needs to beware of language here (again!). As I have already discussed, the word system has been appropriated by technologists to denote things to do with computers. In a sense this is the same kind of holistic thinking as mentioned above: a whole is identified that has properties that do not exist in the separate parts. However, it does confine the system to only the artificial parts of the mixture. My preference would be to include human beings, collections of human beings, and the procedures and practices by which they combine, in this term.

A second problem of language is that while the principle of looking at wholes rather than parts has been followed by a number of different schools of thought in different ways, all too often the practice has become conflated with a particular form of study. This involves a set of tools from mathematics and electrical engineering: feedback loops, input/output, control centres, and the like (sometimes, but not necessarily, expressed in equations or simplified “lines and boxes” diagrams). Unfortunately, these methods are often held to be identical to the principle of systemic thought, its sole repository. This has been encouraged by the latest business best-seller to use systems thinking, Senge (1990), an excellent book, but one which happens to use that form of systems thinking (and which has encouraged some to confuse the tools with the concepts). So it is worth emphasising here that to be systemic, for me, is to consider the whole system, to look at something overall and be aware that it contains properties that the individual parts do not.

All this might be said to be so vague and worthy that no one could disagree. Two problems have been raised, however. The first is what one might call the shopping-list problem: there are so many perspectives to be taken into account, so many disciplines to choose from, that one ends up with an enormously long list that then needs to be fulfilled on each occasion. Either this takes a massive amount of time or corners are cut. The second problem views systems thinking as hubris, pointing out the impossibility of an individual knowing all corners of a system, especially given that any one view is necessarily biased and incomplete. This is very reasonable scepticism, and it cannot be answered by saying ‘bring lots of perspectives to bear’ (we know from Gödel that such perspectives will still be incomplete).

One answer to this is Checkland’s (1981) view that the systems perspective is epistemology rather than ontology (that is, systems do not exist as such in the real world, but are rather ways of viewing the world). If we act as if we were trying to know the whole system (recognising that it is not possible) then at least our hearts will be in the right place, and the rest will follow. To some extent, this is also the answer to the first concern: being systemic becomes not so much a particular framework or required list of concerns to be brought out on every possible occasion, but rather a way of looking at the world, a lens to guide our vision. As Tom Peters (1992:376) writes in a slightly different context, “it’s one more story about relationships and waves, not charts and particles”.

The traditional model of evaluation

Evaluation is yet another of those words that requires a certain amount of caution, meaning many things to many different people. In computing circles, it tends to be used to mean something like “studying a computer system [in the narrow sense] in use with a view to making it better” or “determining whether a computer system fulfils a certain set of criteria”. The criteria, or the thing being made better, may relate to software engineering questions (efficiency, the fact that the system works, etc. - in which case evaluation is pretty close to testing); or they may relate to usability issues; or they may relate to the meeting of the requirements specification. The way in which the evaluation is performed depends on the tastes of the evaluator, but it often tends to go on in isolation from the ‘real world’ of work, in labs and the like. For example, software companies tend to do their usability evaluation in specialised labs with one-way mirrors, semi-realistic tasks and so on.

A slightly wider use of the term comes from those areas of computing that have a direct concern for the world of work, principally HCI, CSCW and some schools of requirements engineering. In those fields, the term evaluation often refers to the same kinds of goals as those given above, but with criteria that are based on the effects of the computer system on the organisation in which it is used. This gets much closer to the systemic perspective I have discussed above.

A rather earlier use of the term comes from those who evaluate educational and social change programmes. An important distinction here has again been between formative and summative evaluation - making the programme better, or determining whether it has been a success according to some criteria. An enormous amount of debate has gone on among practitioners about the appropriate kinds of methodology (Easterby-Smith, 1994).

I situate my model of evaluation as essentially a formative one. My aim is to contribute to the building of better cooperative systems. It may be the case that the evaluation does not directly contribute to a given system, but I do believe that all evaluation is ultimately formative in that it feeds back into future systems design. Given that my concern is with socio-technical systems, I am clearly at the organisationally-focused end of the evaluation spectrum I mentioned in computing; but I am also seriously concerned with the introduction of ideas from programme evaluation into work in computing. Within this domain, I am closest to the goal-free perspective by choice, although the political aims of the interventionist view are very important to me also.

However, I feel ultimately that these existing methods are limited in their applicability to the CSCW domain. The chief reason - that they all tend to deny the multiplicity of stakeholder viewpoints - will be detailed in the following section. Two other problematic kinds of multiplicity were mentioned in the introduction: the range of different disciplines and techniques that go to make up CSCW (see Ross et al., 1995); and the fact that many different activities go on under the term ‘evaluation’ (Ramage, 1996).

The crisis in evaluation: multiple stakeholders, multiple interests

The term “stakeholder” is one that has gained considerable currency in recent years. It is used to refer to “any identifiable group or individual who can affect ... or is affected by” a system (Freeman and Reed, 1983).

A few typical stakeholder groups in a CSCW setting are those who use the software, their colleagues and managers, the software developers and retailers, the Information Systems department of their organisation (if appropriate), and perhaps the customers of the organisation. Wider groups (such as trade unions, parent companies, employers’ associations, shareholders and governments) may also be stakeholders.

The reason for the popularity of the term, in a variety of domains from management to economics and politics, is that as well as being descriptive of the groups within the system, it is also prescriptive, in that it argues that certain groups and their interests are important. In this way, the stakeholder perspective replaces views such as that which wishes only to focus on the interests of managers and shareholders. It is an attempt to be inclusive in one’s view of the organisation. The reason for doing this varies according to the person using the term, but two reasons are typical: a sense of natural justice, or instrumental benefit. First, the concept is sometimes used by those who believe in workplace democracy, the breaking down of hierarchies, the “quality of working life” and so on. For such people, the consideration of all stakeholders’ interests is an end in itself. On the other hand, there are also people who seek to use the stakeholder concept as a means to a different kind of end, typically the survival of the organisation in an ever-changing world. In the report Tomorrow’s Company (RSA, 1995), for example, the stakeholder model is used to illustrate the way in which companies need a certain coalition of interests to be behind them in order to survive - such as employees, customers, shareholders, managers and governments. If the needs of some of these groups are neglected, that group may, in effect, withdraw its approval for the further survival of the company, leading to its ultimate demise.

The problem of conducting evaluation in a situation where there are multiple stakeholders, of the kinds given above, all with potentially conflicting interests, is then relatively clear: if we are evaluating according to some pre-defined criteria or objectives, then whose criteria? Those of the employees may contradict those of the shareholders, for example. Again, if we are looking at the effects of a particular system, then the effects upon whom? This makes finding a system to be ‘Good’ or ‘Bad’ - forming a single, binary measure of its appropriateness - extremely difficult. If we say a workflow program installed in a particular company (e.g. Bowers et al., 1995) is Good for the managers (because it gives them access to detailed information about what their employees are and have been doing) then it may be Bad for those employees (because they are scrutinised and so feel threatened, or because their work is altered by the overheads necessary to keep the workflow system fed with information). Indeed, it may be Good for some customers - those for whom price is an issue, and so increased financial control may assist - and Bad for others - who have to deal with unhappy staff or who get work from them that is slower or less well done than before.

While this problem has not been addressed explicitly in the CSCW literature in the past, various solutions have been taken to it.
Most of these involve defining it out of existence: pretending it is not a problem and either hoping it will just go away (the strategy of many ethnographers), or explicitly favouring the interests of one stakeholder group or another - typically managers (as in the systems analysis methods) or workers (as in the participatory design methods). The latter solution (privileging one group) seems to me entirely legitimate, as long as it is recognised as a political move. The former solution (ignoring the problem) is a little more problematic: rather in the same way that someone who claims to be acting on common sense is all too often simply acting on the prevailing worldview of society, it tends to favour the status quo - which often has sufficiently strong momentum that some effort is required to push it in a different direction.

A solution often taken by those influenced by HCI is also worth mentioning: the labelling of two stakeholder groups as designers (in which are included evaluators) and users, and then the examination of the relationship between the two. Many discussions have unpicked the concept of user, pointing out that there are many different types of users, but the dualism often remains. In earlier work I was involved in to develop the PETRA framework (Ross et al., 1995), we spent a good deal of time considering the question of multiple perspectives, of evaluators and users, effectively creating a dialectical relationship between those two perspectives. However, as is often a problem with dialectical approaches, in the process we defined these as the only two perspectives and as opposite to each other.


This is clearly wrong: there are many more stakeholder perspectives than these two, and indeed the categories we presented are more complex than possessing just a single perspective: there are many different kinds of user. (The same argument can readily be followed against trying to create a dualistic opposition, whether resolved dialectically or not, between the interests of managers and workers.)

So what solution do I advocate here to this problem? The first answer is not to see it as a problem but rather as part of the nature of a complex cooperative system, and to build one’s considerations around that awareness. Thus any evaluation of a cooperative system should be designed with the multiple, possibly contradictory, interests of different stakeholders in mind. It should early on contain some kind of mapping of the relevant groups, and some attempt to address the issue of who one regards as the key stakeholders. This awareness may itself be enough, but ultimately it cannot solve the problem of there being more than one definition of goodness within and for the system. My solution to this is indeed to side-step it: not pretending it does not exist, but working in a way so as to nullify it, by reframing - changing the definition of evaluation so as to allow for different viewpoints to be heard. This is the subject of the next section.

Evaluation as a process of facilitating stakeholder learning

The view of evaluation that I advocate instead is one that sees it not as a task of performing a study and giving back results, but rather as part of an ongoing process of learning among the stakeholders and the organisations to which they belong. For different groups, this learning will take place in different ways and at different times. So for system developers, the learning that goes on in an evaluation (whenever it takes place) may concern how their system could be changed to better fit the needs of real or hypothetical people; but it may equally well be at the level of transferable information, such as the ‘postmortems’ that Microsoft are reported to hold at the end of a project, to allow lessons learned to be used in the next project (Cusumano and Selby, 1995). For managers, the learning from a system implementation may be technical (“don’t use Appletalk, use Ethernet instead”), strategic (“don’t buy from company X, their products are terrible”), organisational (“groupware will really undermine our hierarchy”) or various others. For evaluators, learning may be in terms of methods that did or didn’t work (what you might call action research). And so on along the list of stakeholders - but for each, the key is that learning takes place.

Why is this a useful perspective on evaluation? In the first place, it gives space for the various stakeholders to have different outcomes from the evaluation at appropriate times. Second, it recognises that much learning goes on simply because of the evaluator’s presence. It is now commonplace for various kinds of social researchers, especially ethnographers, to take a reflexive stance - to be aware of their own effects upon a situation under study (Plowman et al., 1996). It is useful for this to be extended to evaluation. In the Sheffield study, for example (see below), my experience was that the fact that the project team had an evaluator connected to the project, hanging around and coming to meetings and asking questions, was enough to constantly remind them of their goal to be reflective as a team and to learn from their own group processes.

Of course, this is very much a hands-on role for an evaluator to take. One has to give up one’s metaphorical white coat - the symbol of a disinterested scientist-expert, able to stand back and offer dispassionate advice from a position of omniscience - and replace it with a grey suit (the stereotypical clothing of a surveyor — see section 3) or whatever else is required to become part of the team. One’s first allegiance will eventually be to the project rather than to science, although one cannot be a total insider if the evaluator role is to be useful. This might sound like the classic “fly on the wall” descriptions of the role of an ethnographer, and indeed it does resemble that to some degree; except that my presence, and my peripheral membership of the group, was always an issue.

I have discussed above the role of such evaluation-as-learning in the facilitation of individuals’ learning. What is also interesting is that it ties in with a larger process of organisational learning (which I take here - ignoring for now the interesting question of what is an organisation that it can learn - to be the learning that goes on in the organisation as an entity: the changes in shared values, understandings, knowledge, myths, culture and procedures).

If we assume that one is conducting some kind of study of a system in use, then a certain amount of reflexivity within the organisation will have gone on before the evaluator enters the situation - to decide upon the system, to set it up and so on - and that reflexive process will continue once the evaluator has gone, as the organisation continues to learn from its experiences. Thus the learning that the evaluator facilitates by their presence and their actions is part of a longer process.

This focus on process is crucial. Organisational learning (and indeed any learning) has very little to do with being presented with the results of something, but rather with the process that is undertaken to reach that goal. As the Chinese proverb has it: “Tell me and I’ll forget. Show me and I may remember. Involve me and I’ll understand.” So it is that the most important thing in the evaluation is the doing of the evaluation, not the result reached (which should simply be a confirmation of what has already been learned). This means that each stage of the evaluation process becomes a learning experience for those participating in it - and this has much to do with the ability of this evaluation-as-learning to offer a variable pace to the various stakeholder groups.

Much of the organisational learning literature is based on Argyris and Schön’s (1978) concept of double-loop learning, which goes beyond simple first-order learning (a cycle of plan-do-reflect) to build in reflection on the criteria under which one is reflecting (“learning to learn”). That is, organisational learning is not just about changing behaviours - it is about changing the patterns that underlie those behaviours. Such pattern-shifting is notoriously difficult and fraught, particularly if it involves long-held patterns (an example is the difficulty the British railway system is currently having in seeing itself as twenty-five companies, each with different agendas and goals, rather than one company with a common agenda); but the results can be well worth it. This kind of evaluation will tend towards looking at first-order learning, as the aim is usually to improve rather than to alter radically; and most evaluation methods reflect this. However, it is certainly open to the possibility that the learning that is facilitated could be of the second order: the systemic focus of the process, the awareness of all stages of the evaluation as learning opportunities, and the space given to all stakeholders make this an option if it turns out to be necessary.

Developing the methodology in practice

My views on evaluation have also been shaped by a one-year evaluation of the learning processes of a research project team. The project was researching learning processes, and specifically organisational learning, within the surveying profession (funded by the Royal Institution of Chartered Surveyors, the RICS — the professional body for British surveyors). It took place at Sheffield Hallam University (SHU). The project was primarily carried out by four people: a full-time researcher (Fides), and three other staff members acting as advisers to her (Mike, David and Margaret). Around two months into the course of the project, Fides realised that although the project team were looking at surveyors’ learning processes, they had made little consideration of their own. As she knew of my interest in evaluation, she therefore asked me to come in as an outsider to evaluate their work. I started this process in late November 1995, and conducted my final interviews in early November 1996, shortly after the end of the project.

My main methods for conducting the research were a standard set of ethnographic ones: interviews, attending meetings and ‘hanging around’. My role was enhanced slightly, as well as giving me some useful data, by additional activities I carried out: principally the programming of a database/spreadsheet combination to allow for easy entry of the data from a questionnaire, and providing feedback at various stages, especially the summary I prepared of an exercise during one of the steering committee meetings. My role was also frequently to be a “human reminder” of the team’s commitment to the process of change and of conducting their research in a more reflective manner.

My evaluation of this project turned out to be highly influential in the development of my ideas on the nature of evaluation and the main issues I wished to focus on in my methodology. It therefore seems appropriate to describe the main methodological issues that came up during the study, and how they affected my thoughts on the nature of evaluation.

This is not a full case study, therefore, but a description of how aspects of the study affected my thinking.

Expanding notions of types — looking at the whole system

One of the key learning experiences for me during this study has been the expansion of the notion of what the possible types of CSCW evaluation are. I became increasingly aware that what I found interesting in the system had little to do with technology and mostly to do with organisational issues. To some extent, this was clear from my initial remit from the project team: to consider the extent to which the project was a learning organisation (as that was the topic of the research, but it seemed to be the case that, along the lines of the old saying that the cobbler’s children never have any shoes, the team were not looking at their own learning).

In the event, the amount of interesting technology was pretty limited. I had hoped initially that there would be more: the project team members were split across three sites in Sheffield, separated by seven miles, with a further half-member of the team being based in Guildford (200 miles from Sheffield). I was curious to know whether their use of email would serve to bind the team together between meetings. In general, it did not: email was used principally for occasional messages, to disseminate documents more quickly than the quirky SHU internal mail, and to facilitate editing of draft documents. One occasion where it did become more useful was in the selection of which companies to send a questionnaire to: email (via Microsoft Mail attachments) was used to allow the team members to select companies and see each others’ choices. This raised one technical issue, as two of the team were unable to receive such attachments, and so their input to the selection process had to be by faxes to Fides…

However, most of the interesting questions turned out to be concerned with matters other than technology. So the guiding questions and issues in the evaluation changed over the length of the study. Here are snapshots of the issues involved, from the sets of interviews I conducted at three different stages of the project - they clearly show the development in my interests.

• December 95 (near the beginning of the project): key stakeholders; roles of the members of the project team; the centre of the project; which media were being used to communicate, and whether these served as a bridge to keep the team together between meetings; to what extent the project (team) was a learning organisation.

• April 96 (in the middle of the project): how the roles of different team members had changed; effects of communications media; the project team as learning organisation.

• November 96 (shortly after the end of the project): the effect of the project’s focus on organisational learning upon the style of the research, and upon interactions in the project; the changing pattern of relations between stakeholders; effects of communications media; what effect my being there had.

Identifying stakeholders

A question that came up early in the life of the project was that of who were the key stakeholders in the project. In my first round of interviews (see the above section), I was much concerned with knowing what roles the members of the project team were taking on, and how they related to one another. A month after those interviews, I conducted an exercise with the team to identify the main stakeholders. Here is the map we co-created:


Figure 1 - stakeholder map

Before the exercise began, I had prepared a large piece of paper with lots of little pieces of paper representing those I thought were relevant stakeholders (12 in all: some individuals, some organisations and some groups). During the exercise, the project team added 7 more, making the set as in the map above. It seemed that just deciding who should be the stakeholders, and where they should go in relation to one another, was a significant learning experience for the team. This learning was especially exemplified by the drawing of the three circles shown above (the project team, ‘the professional side’ — surveyors — and ‘the organisational learning side’), and the decision as to who should be in which circles. The groups more or less arose spontaneously (they were certainly not my plan before the exercise began) from the way the tokens representing the stakeholders were placed.

Particularly interesting was the placing of the project team circle, which includes all five members of the team, but cuts two of them in half to indicate their less full involvement at the time the map was made. (A third member, David, has a small corner of his token outside the circle, which is not so clear in the version above.) An interesting reflection of this is that the three members of the team who were left doing the exercise with me when the circles were drawn — Fides, Mike and David — are those who feature completely, or almost completely, within the circle. A fourth member of the team, Margaret, had been present earlier, but had to leave part-way through the exercise.

The process of drawing the other two circles raised fewer issues. The chief item of interest regarded the circle indicating the surveying side of the project, which when first drawn excluded Mike (the head of the Surveying Division at SHU!) and Fides (the researcher) completely; the circle was only altered to add them at my prompting. What unconscious motivation that had can only be speculated on… Finally, at two different times the question of the centre of the diagram came up. This was held to be at two different places: about halfway down Mike’s token, and about halfway down Fides’. While these two crosses are rather close together, the tension this reflected (between theory and practice) seemed to speak to the team members.

While this exercise was the main use of stakeholders in this project (although the map created, and the awareness of it, was a constant background feature of the team’s discussions and especially my reflections with them), I became through this exercise very convinced of the usefulness of the technique for the self-understanding of groups. This fits in well with the final major point of learning for me from this study.


Evaluation as learning

I have mentioned in the above two sections that a focus for me in the project that became increasingly pronounced was the notion of evaluation as a process of facilitating organisational learning. I have discussed this theoretically in some detail above, but here I want to mention why it became relevant in this project. Partly this was due to the topic of the project and the fact that I was thus surrounded by issues of organisational learning, books and erudite discussions on the subject, and a basic aim by the team members to be ‘learningful’ in their work; a good deal of this filtered into my thinking. Also, my basic remit was to look at the learning processes of the project team, so that to some extent this focus became inevitable.

There were other reasons, however. The stakeholder mapping exercise described above was fairly clearly a process which led to a lot of learning for the project team about who they regarded as the core members of the team and about the nature of the relationships within the team and with others. The extent to which they had learned things during the hour or so of this exercise was something I noted in my field-notes. Also, reflecting on my continuing presence in the project as it went on, it was clear to me that I was not doing anything much in particular except being there, and occasionally asking questions, and yet my presence seemed to have some kind of effect; so I characterised the role I had in the project as one of facilitating organisational learning, and that seemed to fit for them. A helpful metaphor for this that I have since formulated is to think of the evaluator not as a scientist in a white coat, finding out a clean and objective truth, but as a midwife, assisting at a messy and untidy birth: and this was very much the way it felt in this project.

I did give some comments to the project team along the way, chiefly informal discussions but also paper contributions. For example, when two of the team gave a presentation about the project (in a rather innovative manner involving multiple voices of different people linked to the project), my voice was present through a short, one-and-a-half page paper describing my view of the project to date, considering the particular issues of whether the project was a learning organisation and the different communications media used (a paper which I also circulated around the rest of the project team).

The result: a methodology for evaluation

The following describes a set of steps representing my methodology, Systemic Evaluation for Stakeholder Learning (SESL). This is somewhat of a sleight of hand, as the real point of SESL is the principles indicated in its name and discussed above in theory and practice. These steps are necessary for the conduct of the methodology, but not sufficient: if one were to follow them with the expert-evaluator view of the world, they would have little benefit. They are rather intended as guiding points along the way for someone who is following the principles outlined in section 2, acting as a facilitator of an ongoing organisational learning process and a participant in the system. The diagram summarises these steps, although it is not intended to be especially prescriptive of the order in which things are done. Each step is detailed further below.


Figure 2 - the stages of SESL: 4.1 determine system; 4.2 decide on type; 4.3 identify stakeholders; 4.4 a cycle of study, analysis and key issues; 4.5 feedback results (numbers refer to the sections below)

An alternative way through the methodology is by the use of a pro-forma list. I have found that writing information against the following headings encompasses most of the issues raised here: system; main technology; type; stakeholders; methods; key questions; analytic framework. Such a list can be written in part at the start of the evaluation, as part of negotiating its parameters with the various stakeholders, and then progressively modified during the course of the evaluation. It will then serve as a structuring device both for the conduct of the evaluation and for the discussion of its results.
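As an illustration of how such a pro-forma might be kept alongside an evaluation, here is a minimal sketch in Python. It is my own illustration rather than part of SESL as published; the field names are assumptions, and the example values are drawn loosely from the Sheffield study described above.

```python
from dataclasses import dataclass, field
from typing import List

# An illustrative record of the SESL pro-forma headings: system; main
# technology; type; stakeholders; methods; key questions; analytic
# framework. Field names are my own assumptions.
@dataclass
class SESLProForma:
    system: str                       # the system under study (socio-technical, not just software)
    main_technology: str              # e.g. the groupware involved
    evaluation_type: str              # which of the five types of evaluation is being done
    stakeholders: List[str] = field(default_factory=list)
    methods: List[str] = field(default_factory=list)
    key_questions: List[str] = field(default_factory=list)
    analytic_framework: str = ""

# Written in part at the start of the evaluation, then progressively modified:
proforma = SESLProForma(
    system="research project team and its communications",
    main_technology="email (Microsoft Mail)",
    evaluation_type="effects of the organisational parts of the system",
)
proforma.stakeholders.append("project team")
proforma.methods.append("interviews")
proforma.key_questions.append(
    "To what extent is the project team a learning organisation?")
```

The point of keeping such a record is exactly the structuring role described above: it can be revisited with the stakeholders as the evaluation proceeds, rather than fixed once at the start.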

Determining the nature of “the system”

As SESL is a methodology for evaluating cooperative systems, it is vital to decide what system one is evaluating. We know from many studies that how a groupware program is used depends on the context in which it is used, the organisational and cultural situation and so on. So to study a CSCW system only as technology seems to be folly. However, one could in theory do that; and likewise, one could in theory study only the “social system” - the context within which the technology is embedded, without reference to the technology. This is also foolish, as it ignores the fact that the technology changes the organisational structure and culture (and not least its processes) - and this change is one of the selling points of the technology. However, it is sometimes interesting to make the focus of one’s study primarily the social or the technical system (remembering that this is simply a statement about what one is studying, not about the inherent nature of the system). This is what I mean by determining “the system under study”.

Deciding the type of the evaluation

It is helpful to consider what type of evaluation you are conducting, as it is often not clear to evaluators that there are many different activities that can validly be called evaluation. I have discussed this at some length in Ramage (1996), where I identified five types of evaluation: formative evaluation; evaluating the effects of a system; evaluating solely the effects of the organisational and psychological parts of the system; evaluation of concepts (in research projects); and evaluation for the sake of purchasing new products. Being clear which one or more of these is being undertaken is very useful for clarifying how the evaluation should take place. The types can be usefully categorised by the following questions, and the resulting matrix (which takes the two Effects categories together):

1) Are you looking at the effects of a system (what it does to various groups of people and their work) or its objectives (how things are intended to be)?

2) Are you looking at the way things are now, or at their potential?

Figure 3 - the types of evaluation
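To make the two classifying questions concrete, here is a small sketch, purely my own illustration: it encodes the two axes of the matrix and locates an evaluation in a cell. Which of the five named types occupies which cell is given by Figure 3, which is not reproduced here, so the sketch deliberately stops short of naming the types.

```python
from enum import Enum

class Focus(Enum):
    EFFECTS = "effects of the system on people and their work"     # question 1
    OBJECTIVES = "objectives: how things are intended to be"

class Tense(Enum):
    CURRENT = "the way things are now"                             # question 2
    POTENTIAL = "their potential"

def locate_in_matrix(focus: Focus, tense: Tense) -> tuple:
    """Answering the two questions places an evaluation in one cell of
    the Figure 3 matrix (the two Effects categories are taken together).
    The mapping from cells to the five named types follows the original
    figure and is not reconstructed here."""
    return (focus.name, tense.name)

print(locate_in_matrix(Focus.EFFECTS, Tense.CURRENT))  # ('EFFECTS', 'CURRENT')
```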

Identify stakeholders (and their viewpoints)

The next question to ask is: who are the stakeholders in the system, and what are their interests? A simple rule of thumb is to ask who affects, depends on or can influence the system, and likewise who is affected or influenced by it. Another way, which I find particularly helpful, is to have a representative group collectively construct a stakeholder map (see above). Researchers on the use of stakeholders, especially Mason & Mitroff (1981:94-103), have described several more methods for identifying them, including:

• taking standard demographic groups (sex, age, seniority in company etc.) and considering their relevance;

• asking various people who they think are the key stakeholders;

• studying (or creating) ethnographic accounts to find those who express pertinent interests.

Another significant question (which is assisted by the process of stakeholder mapping) is who are the key stakeholders: whose views most need to be met? Whose involvement is so vital that the system will break down without it? It is important to make it clear who you think these are, as this will alter how much weight to give to different aspects of the evaluation. It is also useful to be clear what the perspectives of these key stakeholders upon the system are.

It must be stressed that the lists of stakeholders, their interests and relationships gained by these methods are subjective: different methods (and different participants) will produce different lists, and they will also change over time. As Korzybski wrote, “the map is not the territory” (quoted by Bateson, 1972:449). For this reason, it is greatly desirable to produce them collaboratively, preferably involving both insiders and outsiders to the organisation.
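A minimal sketch of how the rule of thumb above might be recorded follows. It is purely illustrative (the field names are my assumptions); the example entries are taken from the Bowers et al. (1995) workflow case discussed earlier.

```python
from dataclasses import dataclass

# One stakeholder, recorded against the rule of thumb: who affects,
# depends on or can influence the system, and who is affected or
# influenced by it. Field names are my own assumptions.
@dataclass
class Stakeholder:
    name: str
    kind: str                  # "individual", "group" or "organisation"
    interests: str
    affects_system: bool       # can they affect, depend on or influence it?
    affected_by_system: bool   # are they affected or influenced by it?
    is_key: bool = False       # would the system break down without them?

stakeholders = [
    Stakeholder("print-room staff", "group",
                "work made less efficient and interesting by the workflow system",
                affects_system=True, affected_by_system=True, is_key=True),
    Stakeholder("managers", "group",
                "useful data about how staff are working",
                affects_system=True, affected_by_system=True),
]

# The key stakeholders, whose views most need to be met:
key = [s.name for s in stakeholders if s.is_key]
```

As stressed above, any such list is subjective and will change over time; the value lies in constructing and revisiting it collaboratively, not in the list itself.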

Study & analysis: key questions to ask

Now we move on to the study and analysis. The type of study conducted (ethnography, usability lab experiment, semi-situated study where your colleagues try out the software, etc.) does not affect this methodology directly: use whatever comes to hand, according to your skills, experiences and resources. The analysis of the system will depend upon the exact study method used and the kind of evaluation being conducted. A lab experiment will have a very different relationship between the study and the analysis from an ethnography. Again, it will depend on the theoretical background of the evaluator — functionalist sociology and distributed cognition (say) will provide two very different ways of conducting an ethnography and thinking about the results.

However, in general I assume here that there is some kind of study cycle, where observation and analysis are intertwined: one does not just observe once and then go back to one’s ivory tower to analyse, but observes and analyses back and forth. Clearly the cycle requires some key questions to guide the process: what are the main areas under study? What are the main concerns of the various stakeholders? Relevant questions (which will depend on the type of the evaluation) might include:

• What are the system’s effects upon...
  a) the work of the group using the system?
  b) the life of the group?
  c) the life of the people in the group?
  d) the life and work of the people outside the group?
  e) the organisation(s) of which the group is a part?
  f) society?

• What are the potential effects of the system upon… (same list as above)?

• What are the system’s objectives (from the different perspectives of the various stakeholders)? To what extent are these being met?

• What are the objectives for this new system (from the different perspectives of the various stakeholders)? How well are they likely to be met?
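Since the first two bullets simply cross one list of scopes with actual versus potential effects, the guiding questions can be generated mechanically. A small illustrative sketch (my own, not part of SESL as published):

```python
# Generate the guiding questions by crossing the scopes a)-f) above
# with actual versus potential effects. Purely illustrative.
scopes = [
    "the work of the group using the system",
    "the life of the group",
    "the life of the people in the group",
    "the life and work of the people outside the group",
    "the organisation(s) of which the group is a part",
    "society",
]

questions = []
for mode in ("effects", "potential effects"):
    for scope in scopes:
        questions.append(f"What are the {mode} of the system upon {scope}?")

# The objectives questions are asked per stakeholder perspective:
questions.append("What are the system's objectives, from each stakeholder's "
                 "perspective, and to what extent are these being met?")

for q in questions:
    print(q)
```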

Feedback results

Communication is vital in evaluation: unless one tells people what the evaluation has found, it is of little use to anyone. So the first part of the response process is telling the relevant stakeholders what has been determined in the evaluation. The next part is assisting them in decision-making. It is not the evaluator’s job to make decisions about what should be done with the system, nor is it their job to decide who should make those decisions. It is, however, their job to give sufficient information to all those who might make decisions (not necessarily the same people as those obviously in power) so that they can do so. Finally, perhaps the most important role for an evaluator is to act as a facilitator of organisational learning. This may be implicit in their attitude, or it may be explicit and involve taking the view that evaluation is not so much about making judgements as about assisting in the learning processes of all those involved in a project (I have written elsewhere on this issue). Think of an evaluation not as a process of decision-making, but as a process of growing understanding.

Conclusions

The reader has a right by now to feel slightly dazed by the different perspectives seen in this tour around the theory and practice of the evaluation of cooperative systems. I have tried to argue for the expansion of various concepts beyond their purely technical interpretations: that cooperative systems are more than CSCW systems; that evaluation needs to look at the whole system and the viewpoints of all stakeholders; and that evaluation should be seen as a learning process rather than a process of finding out the objective truth. I have made this argument by discussion, by analysis of the literature and by presentation of a case study.

Some may find the Sheffield study and my methodology problematic when related to their own evaluation practice. “Is it not the case”, they might argue, “that the relation between this and CSCW is pretty limited? I didn’t see much sign of computers in your discussions.” I hope it is clear by now that I do not regard this as a valid criticism: to me, all CSCW systems (and indeed IT in general) have people within organisations as their reason for existence, and will not be taken up by those people if the systems are merely clever technology. Whether we approach the question from the angle of the technology or of the people therefore matters less than it might, as the two are inextricably intertwined: we must eventually consider all angles.

The contrary criticism might also be put: “what is there here that cannot be found in the general management literature?” I think the answer is a respect for technology as a driving agent, though not in itself the sole determinant of the future of organisations. I have all too often noticed a naïve view of the role of technology in organisations that regards it as some kind of unstoppable force (good or bad) over which people have little control. I do not subscribe to this technological determinism, and would suggest again a more complex relationship that interweaves the implementation of the technology with the changes in organisational systems occurring at the same time.


Of course, the biggest flaw in this work is its lack of breadth of experience. For all my rhetoric about the relevance of the case study I have presented, and the importance of the issues involved, I recognise that further studies in a more “orthodox” CSCW context are needed to check that the same issues arise there. But I find the key point of this paper fairly incontrovertible: evaluation is a complicated process that needs to take many viewpoints and perspectives into account to be successful, and it has less to do with objective study than with involved learning. To return to the midwife metaphor: evaluation is a messy, difficult, skilled and potentially dangerous activity; but if it is carried out with respect, it can be a way to assist people in bringing out exciting new ways of being.

Acknowledgements

My thoughts on organisational learning, stakeholders and cooperation, and on how they link to evaluation, have been shaped by many discussions and much joint work with Fides Matzdorf (who also commented on a draft); by working with my other friends in the RICS project; by a workshop on Organisational Learning & CSCW at CSCW 96 (see SIGOIS Bulletin, December 1996); and by meetings of the Learning Individuals in Learning Organisations group. I also received many helpful and challenging comments from Ilfryn Price (especially concerning cooperation), Colston Sanger, and three anonymous IRIS reviewers. My research is funded by the Engineering & Physical Sciences Research Council (UK) and by the Digital Equipment Corporation.

References

Argyle, Michael (1991). Cooperation: The Basis of Sociability. London: Routledge.
Argyris, Chris and Donald Schön (1978). Organizational Learning. Reading, MA, USA: Addison-Wesley.
Axelrod, Robert (1984). The Evolution of Cooperation. New York: Basic Books.
Bannon, Liam (1993). Use, Design, and Evaluation: Steps towards an Integration. Shaerding CSCW Workshop.
Bateson, Gregory (1972). Form, Substance and Difference. In Steps to an Ecology of Mind, San Francisco: Chandler, pp. 448-464.
von Bertalanffy, Ludwig (1969). General System Theory: Foundations, Development, Applications. New York: Braziller.
Boulding, Kenneth (1985). The World as a Total System. Thousand Oaks, CA, USA: Sage.
Bowers, John, Graham Button and Wes Sharrock (1995). Workflow from Within and Without: Technology and Cooperative Work on the Print Industry Shopfloor. Procs. of the Fourth European Conference on Computer-Supported Cooperative Work (ECSCW 95), pp. 51-66.
Checkland, Peter (1981). Systems Thinking, Systems Practice. Chichester: Wiley.
Cusumano, Michael and Richard Selby (1995). Microsoft Secrets. New York: Free Press.
Easterby-Smith, Mark (1994). Evaluating Management Development, Training and Education (2nd edition). Aldershot, UK: Gower.
Ehn, Pelle and Morten Kyng (1987). The Collective Resource Approach to Systems Design. In Gro Bjerknes, Pelle Ehn and Morten Kyng (eds.), Computers and Democracy, Aldershot, UK: Avebury, pp. 17-58.
Freeman, R. Edward and David Reed (1983). Stockholders and Stakeholders: A New Perspective on Corporate Governance. California Management Review, 25(3): 88-106.
Grudin, Jonathan (1988). Why CSCW Applications Fail: Problems in the Design and Evaluation of Organisational Interfaces. Procs. of the Conference on Computer-Supported Cooperative Work (CSCW 88).
Grudin, Jonathan (1993). Interface. Communications of the ACM, 36(4): 112-119.
Heath, Christian and Paul Luff (1991). Collaborative Activity and Technological Design: Task Coordination in London Underground Control Rooms. Procs. of the Second European Conference on Computer-Supported Cooperative Work (ECSCW 91).


Howard, Robert (1987). Systems Design and Social Responsibility: The Political Implications of “Computer-Supported Cooperative Work”. Office: Technology and People, 3: 175-187.
Hughes, John, Dave Randall and Dan Shapiro (1991). CSCW: Discipline or Paradigm? A Sociological Perspective. Procs. of the Second European Conference on Computer-Supported Cooperative Work (ECSCW 91).
Hutchins, Edwin (1991). Organizing Work by Adaptation. Organization Science, 2(1): 14-39.
Lewis, Paul (1994). Information-Systems Development: Systems Thinking in the Field of Information Systems. London: Pitman.
Mason, Robert and Ian Mitroff (1981). Challenging Strategic Planning Assumptions. Chichester: Wiley.
Morgan, Gareth (1986). Images of Organisation. Thousand Oaks, CA, USA: Sage.
Peters, Tom (1992). Liberation Management. London: Macmillan.
Plowman, Lydia, Richard Harper and Yvonne Rogers (eds.) (1996). The ‘Professional Stranger’. CSRP 428, University of Sussex, Brighton, England.
Ramage, Magnus (1994). Engineering a Smooth Flow? A Study of Workflow Software and its Connections with Business Process Reengineering. MSc Dissertation, University of Sussex.
Ramage, Magnus (1996). CSCW Evaluation in Five Types. Report CSEG/17/96, Computing Department, Lancaster University, UK.
Ross, Susi, Magnus Ramage and Yvonne Rogers (1995). PETRA: Participatory Evaluation Through Redesign And Analysis. Interacting With Computers, 7(4): 335-360.
RSA (1995). Tomorrow’s Company. Report of an Inquiry by the Royal Society for the Encouragement of Arts, Manufactures and Commerce, London.
Senge, Peter (1990). The Fifth Discipline: The Art and Practice of the Learning Organization. New York: Doubleday.
Sommerville, Ian, Richard Bentley, Tom Rodden and Peter Sawyer (1993). Cooperative Systems Design. Report CSCW/10/93, Computing Department, Lancaster University, UK.
Thomas, Peter (ed.) (1995). CSCW Requirements and Evaluation. Berlin: Springer-Verlag.

Notes

1. The four parts of this section correspond to the words of the name of my methodology: Systemic Evaluation for Stakeholder Learning.
2. In fact, the way in which this was done (via the RAND Corporation, a mostly-military think-tank in the USA, during the 1950s) gives me considerable cause for concern; but then so much of the history of computing is about the struggle between the military-industrial complex and the human liberation movement.
3. These two uses are often referred to as formative and summative evaluation.
