
Priority topic: Developing Evaluation Capacity

BUILDING EVALUATION CAPACITY: LINKING PROGRAMME THEORY, EVIDENCE, AND LEARNING

Sarah C.E. Batterbury

Evaluation Institute, University of Glamorgan, Pontypridd CF37 1DL
Tel: +44 (0)1443 483693 (direct); Fax: +44 (0)1443 482138
Email: [email protected]


1.0 Introduction

Evaluation capacity is a multi-dimensional phenomenon that involves the ability to apply knowledge to policy as well as the ability to generate useable evaluation findings. Both the policy community and professional evaluators can aspire to evaluation capacity. It operates at a number of different scales: evaluation capacity can be found (and created) at project, programme, organisation, and inter-organisational levels. At each of these levels, the capacity to evaluate relies on the creation of an evaluation output with intrinsic utility and relevance. Evaluation capacity should ideally be understood as a process whereby reflection and learning extend beyond the confines of individual organisations and programmes into the policy system. The development of evaluation capacity is of greatest value where evaluative activity permeates particular policy systems, creating opportunities for learning and intelligent policy formulation. As Howlett and Ramesh have noted, a fundamental purpose of evaluation is to contribute to "effectuating changes to the policy in question" (1995, p.175). The development of evaluation capacity therefore entails both the production of evaluation and its effective use. Evaluation and policy making are intrinsically connected and, in an optimal system, provide twin processes that necessarily inter-relate and inform each other.

The paper argues that learning, theory, and evidence are all integral components of evaluation capacity. The policy cycle (Lasswell 1956, Brewer 1974) can be adapted to demonstrate the linked, cyclical elements of an optimal system-wide evaluation capacity. Learning from evaluation entails the ability to share and disseminate this learning throughout and across policy systems. This recognises the importance of accumulating a useful and accessible shared evidence base that facilitates the process of policy formulation based on workable programme theory. However, this represents an optimal scenario, an arrangement that is difficult to find in practice. Achievement of a culture shift which replaces blame and shame with dialogue, learning and reflection is essential for optimal and relevant evaluation outputs. The paper advocates that evaluation capacity will be enhanced by a process of professionalisation of evaluation, foreseeing a greater role for evaluation societies in helping to mitigate some of the more difficult aspects of political influence on the evaluation process as well as helping to enhance and build training provision in evaluation.


The paper first discusses the ideal-type scenario where evaluation capacity functions optimally. It then examines some of the obstacles to achieving this outcome. Obstacles exist at a number of different levels: inter-organisational (system-wide), organisational, and programmatic. Difficulties with the adequacy of the evidence base can also pose problems for the utility and subsequent utilisation of evaluation. The paper then examines ways of minimising the obstacles to building evaluation capacity.

The question of developing evaluation capacity for regions and across policy systems is a key one in policy circles in Europe; the ability of systems to learn, share, and apply evidence is critical. Reformulation of policy and programme theory in the light of lessons learned through evaluation is (theoretically) integral to the evaluation process. While a driving force behind the desire to evaluate is often that of meeting public accountability requirements, policy makers are also increasingly looking for innovative solutions to build up knowledge systems in a number of different sectors, and evaluation can help contribute to a knowledge-rich policy environment.

The obstacles to effective evaluation are, however, formidable. This is most especially visible in policy domains where the stakes are particularly high. Evaluation is therefore least likely to contribute to policy learning, adaptation of policy theory, or the investment of time in the application and collection of unbiased evidence where there are large financial stakes or where personal or group positions may be threatened. We know that evaluation is an essentially political process and is situated in a hierarchically ordered polity and reality (Palumbo 1987). The politicised environment poses a significant threat to evaluation and skews the development of evaluation capacity in particular ways. For so long as evaluation is commissioned by powerful elites with a stake in programme assessment, this situation is unlikely to change. However, although evaluations that are independently designed and implemented achieve greater distance from these political constraints, they nevertheless encounter difficulties of relevance to policy makers (Batterbury 1998).

In developing evaluation capacity it becomes necessary to find ways to address the problems of politicisation of the evaluation environment, and the quality and relevance of the evaluations and evidence base. Evaluation capacity is a multidimensional phenomenon, and so too are the required solutions for building this capacity. This requires a system-wide vision in which training about evaluation and its utility is available to all actors and where dialogue between different stakeholders and policy transparency strive to minimise institutional and political bias. There is no easy solution, however, and the process of developing effective evaluation capacity remains an on-going, negotiated obstacle course. A number of strategies, including slow-time evolution, dialogue, and the professionalisation and integration of the evaluation community, may help ameliorate some of these obstacles.

2.0 Evaluation Capacity: the Ambition

We can postulate an ideal-type scenario where evaluation capacity may be considered to function optimally. This is an ambitious outcome where different sectors of public policy seek to learn from previous actions in similar and related policy spheres through a process of evaluation, dialogue, and cross-programme and policy reflection, analysis, and evaluation.


The problem of resources is, however, significant in achieving this ambition. We are reminded that 'policy making typically involves a pattern of action extending over time and involving many decisions' (Anderson 1975, p.10, author's italics). Evaluation forms only one component of this process. The policy cycle, first elaborated by Lasswell in 1956 and refined later by Brewer (1974), provides us with a logical and rational model for the policy process. We can adapt this (see Figure 1) to show how evaluation might ideally be situated within the policy system. The model is one governed by reflection about what works and how it works. The sharing of knowledge and information across programmes and policies in a joined-up, integrated system is also critical. Evaluation and learning are therefore at the core of a reflective polity where the goal of policy improvement predominates. A willingness to refine programme theory and to reflect, modify, and implement new policies in the light of a continuous and on-going process of evaluation, learning, and policy change forms an evaluation cycle (or spiral) leading to the emergence of responsive policy rooted in intelligent policy thinking.

Figure 1: Stages in an Ideal System-wide Evaluation Capacity Based on an Evaluation Cycle of Learning and Reflection

[Figure 1 depicts a cycle with the following stages: collection of different kinds of evidence; sharing of evidence; on-going meta-evaluation; accumulation of knowledge about what works; reformulation of policy and consultation; refinement of programme theory for specific programmes; reflect, modify, implement.]


The evaluation cycle provides us with a useful model. It is useful because it exemplifies the close inter-relationship between evaluation and policy formulation. It also shows how evaluation can best be utilised where it is desired and integrated into the policy process. Evaluation works best where it is central to a learning-focused system in which policy development is encouraged and rewarded. The evaluation cycle is not about using the evaluation process merely as an ex-post assessment of performance and impact, nor simply to provide an excuse for public accountability, nor should it be used as a post hoc rationalisation of policy or programme behaviour. The model demonstrates the utility of evaluation, an intrinsic and integral component of evaluation capacity. It relies on the emergence of a receptive environment that welcomes evaluation, using it for positive, learning-focused, and policy development functions.

Evaluation capacity is also about producing evaluations that are actually useable. This implies the need to produce evaluations that are relevant, understandable, appropriate, and timely. It also relies on the capacity to interpret evidence and apply it appropriately to the policy context. The evidence base for evaluation can take many forms. It may entail cross-programmatic comparison, specific data about longitudinal events, or scientific, realist, or constructivist accounts and analysis. Building an evidence base is easier than developing evaluation capacity per se; although it can take time to facilitate the emergence of effective and useful evidence and data, it does not, in itself, have a threatening character. The obstacles to developing a reliable and effective evidence base reside in problems of resourcing and the time and vision needed to collect and record appropriate information and data. The application of evidence is more problematic, however. Utilising and applying evidence may result in decisions that materially alter significant stakes, thereby providing a political rationale for dysfunctional behaviour. This is unfortunately quite common. Procrastination over the dissemination of difficult evaluation findings and decisions can be found in most areas of public policy.

We know that the policy environment seldom allows a smooth passage for evaluation or for any knowledge creation and application. The degree to which evaluation informs policy in the logical and rational way portrayed in Figure 1 is conditioned by the political forces that compete to influence policy. The lack of a performance-optimising policy environment does not necessarily prevent the production and use of evaluation, however, although it can skew the decision-making process. Multiple stakeholders can induce programmes to operate with a degree of transparency and reliability. The emergence of an evaluation culture across Europe is leading to a more demanding stakeholder constituency expecting to find evidence of effective programme and policy operation.

In what follows, a number of obstacles to building evaluation capacity are considered. Examples of issues that have impacted on the production of quality evaluations and their utilisation are discussed. These two elements are the fundamental components of evaluation capacity, and so 'defensive routines' (Argyris 1985) that inhibit these functions inevitably create difficulties for the emergence of a strong evaluation capacity. In the final section we discuss ways forward for minimising these effects.


3.0 Theory and Evaluation Capacity

Theory exists at two levels: there is (normally) a programme or policy theory, and realist evaluators have also advocated that evaluation should be based on an evaluation theory. In the first instance, the process of policy formulation is based on the development of workable programme or policy logic. This logic forms the basis of the underlying programme or policy theory. Effective evaluation capacity implies the ability to identify the programme or policy theory, the intervention logic. This enables evaluations to work with programme and policy theories to enhance implementation, development, and reflection and learning among programme participants. Effective policy performance therefore relies on there being an adequate policy theory.

Secondly, Pawson and Tilley (1997) have argued that realist evaluations themselves additionally need a strong theoretical base. For Pawson and Tilley this theoretical base entails determining the context, mechanisms, and outcomes of a policy or programme; thus the theoretical basis of an evaluation "must be framed in terms of propositions about how mechanisms are fired in contexts to produce outcomes" (1997, pp.84-85). This provides a way of moving away from the now rather sterile epistemological debates surrounding scientific versus pluralistic and constructivist paradigms. Pawson and Tilley (1997) argue that in focusing on theory (context, mechanisms, and outcomes) the realist evaluator will inevitably produce a more useful evaluation output - namely the identification of 'what works for whom in what circumstances' (1997, p.85). The realist evaluation circuit implies a feed-through of evaluation knowledge into refined theory development until an increasingly clear understanding of the likely outcomes of specific programmes is achieved. Pawson and Tilley (1997) have also advocated realist cumulation of evaluations, aimed at moving beyond the single evaluation to a position where lessons are drawn cumulatively across successive programmes and between different policies. This concurs with the argument of this paper that evaluation capacity is best achieved where reflection and learning extend beyond individual organisations and programmes into the policy system. For Pawson and Tilley (1997), cross-evaluation learning is best achieved where individual evaluations give consideration to the interaction between context and mechanism in explaining outcomes.

The realist approach is useful and encourages us to put theory at the centre of evaluation work. The approach is powerfully argued but represents a rather idealised view of the world of evaluation and the context within which evaluations come about. Pawson and Tilley (1997) make only passing reference to the political contexts of evaluations. Their solution to the vexed and often non-rational behaviour of policy makers and evaluation commissioners and users is to advocate the development of a teacher-learner relationship with the policy maker. Change is to be achieved in a slow and incremental fashion where positive evaluation experiences make the approach more acceptable in the future. However, this does not adequately address the multi-dimensional, and often problematic, character of the commissioning environment. Nor does it suggest ways to address dysfunctional bureaucratic behaviour (Merton 1980, Blau 1956) or defensive routines (Argyris 1985) that can stand in the way of rational and logical design and implementation of programmes and policies.


Several problems therefore emerge with the somewhat idealised nature of making theory the central feature of evaluations. The first of these is concerned with the commissioning environment. Commissioners of evaluations often draw up very tight specifications for the kind of evaluation they require. This is seldom couched in terms of theory-based evaluation. It can be difficult, therefore, for evaluators to deviate radically from the framework imposed by the commissioners. This demonstrates the need for evaluation capacity to be developed in the policy world as well as among evaluation professionals.

In addition, the use made of evaluation is often determined by political factors not related to the style or type of evaluation approach. Evaluations of the Structural Funds provide good examples, although they are not unique and many other evaluation domains provide similar experiences. Jansky (2002) has argued that the political rewards associated with the Structural Funds mean that little notice has been taken of evaluation findings. He shows how the importance given to evaluation findings in considering consecutive reforms of the Structural Funds has diminished with each major reform. This was not because of the nature of the evaluation (realist or otherwise), but because the "growing participatory character of EU Regional Policymaking put evaluators into competition with a number of other policy actors" (2002, p.16). An enlarged Structural Fund budget raised the stakes associated with the programme, with the consequence that more actors vied for policy influence, reducing the impact achieved by individual evaluation exercises. There are a few notable exceptions; inevitably, however, the greatest influence was achieved where the policy environment was receptive to change and supportive of the resulting message. This finding does not suggest that theory-based evaluations played a central role in achieving maximum influence. Political obstacles are particularly acute where the stakes are large. These stakes can be concerned with status gains, financial gains, and defence of personal empires (among other things). The influence of politics has the potential to paralyse the logical implementation of evaluation design, impeding evaluation capacity by inhibiting effective use of theory and evidence and the take-up and utilisation of evaluation findings.

The second problem encountered with the use of programme theory as the basis of the evaluation approach lies in the occurrence of inadequate or non-existent programme theory. This occurs very often where programmes have evolved somewhat spontaneously, with no real logic or theory underpinning the design or inception of the programme. While it is possible to re-construct a programme logic in an ad hoc fashion, it is unlikely that officials and policy makers are able to offer any rationale for some programmes that have evolved over slow time. A recent example of this concerns the evaluation of a funding stream provided by a regional government. The programme "just evolved, the projects never applied, they began with a small grant and it just rolled out to get bigger and bigger" (programme official, 2002).


Under-conceptualised, a-theoretical programmes and policies like these pose difficulties for evaluations that place theory centrally. An official associated with this programme recently commented: "we do not want policy recommendations" (local commissioner of evaluation, 2002). The idea that the evaluator should enter into a teacher-learner relationship in this context is a little optimistic.

In spite of the obstacles outlined above, the evaluation theory approach moves us on from the now rather sterile debate about the purpose of evaluation to consider new approaches to system-wide learning as a defining feature of evaluation capacity. Ways of developing this capacity through slow incremental changes are advocated strongly by realist evaluators. This is more likely to achieve a positive outcome in areas where there is the political will to support these shifts in evaluation culture.

4.0 Evidence and Evaluation Capacity

In this section we consider why evidence is intrinsically related to evaluation capacity. Evidence on its own (regardless of its quality) is not especially useful. A perfect evidence base is of no use unless it can be applied to inform and induce appropriate changes to policy, programme, and/or organisational design and performance. It is the application of evidence to policy and programme learning, and to cross-programme change and policy formulation, that matters for the building of an optimal evaluation capacity. For this there needs to be strong, relevant, and reliable evidence. Evaluation capacity entails having access to quality useable data and evidence, and being able to utilise this evidence effectively.

Academics and evaluators have long bemoaned the policy world's ability to ignore research evidence. Doreen Massey famously complained about this at the Institute of British Geographers conference in Brighton in January 2000. Where research does influence policy there is often a policy lag between the publication of evidence and the subsequent feed-through into policy design. A good example of this was the belated wide-scale belief that Small and Medium-sized Enterprises (SMEs) would provide the motor for European regional development. This view prevailed in the European Commission during the mid-1990s; as one official commented, "You have to believe in SMEs here, their importance cannot really be questioned" (Commission official, DG XVI, 1995). This view took hold in the Commission at a time when critiques of the notion of the transferability of the industrial district model had already been developed. It is often difficult for officials to challenge policy thinking and change the status quo in spite of significant evidence being in the public domain that challenges existing practice.

A question mark therefore hangs over the nature and type of evidence most likely to influence policy direction. Published research would not appear to be best suited to bringing about changes in policy in real time. The collection and use of evidence for evaluation purposes may have a greater prospect of success as the result of the evaluation's status as a desired and required input requested by those with influence in the policy world. Work by Carol Weiss (1998), however, would suggest that this relationship is tenuous and pragmatic at best. Her typology of use suggests that evaluations have multiple ways to influence policies and programmes and in some cases serve an effective purpose for the avoidance of change.


Of the different types of use she elaborates, only one, instrumental use, involves direct influence and take-up of evaluation findings by the programme or policy being evaluated. Again, political processes are critical in influencing the readiness of policy makers to create policy on the basis of available evidence. Nutley et al. (2002) point out, for example, that in the UK the use of evidence in policy making has taken a back seat to ideologies in certain periods: "during the 1980s and early 1990s there was a distancing and even dismissal of research in many areas of policy, as the doctrine of 'conviction politics' held sway" (p.2).

Utilisation of evaluation (and evidence) is inherently connected with evaluation capacity. However, so too is the issue of utility. A useable evaluation implies an effective evidence base that sharpens the credibility and rigour of the analysis and enables and facilitates cross-programme and policy comparison in meaningful ways. Pawson and Tilley (1997) would suggest that only realist evaluations fall into this category, but it is my contention that other forms of evaluation exercise also have utility depending on the purpose of the evaluation endeavour. Regardless of the epistemological basis of the evaluation, evaluators often articulate the complaint that there is insufficient evidence to complete specific evaluation tasks. A number of factors lie behind this reported impoverished evidence base. Firstly, the restriction in funding and time-scales for evaluation necessarily impacts on the quality of the result. In addition, where evaluation is a relatively new concept the policy world often fails to build into programmes the requirement to collect data in ways which are comparable between projects and programmes, and which can be meaningfully used for subsequent analysis. Building evaluation capacity is consequently a task which benefits from early collection of appropriate evidence and data.

The capacity to produce useful evaluations therefore relies on a system-wide capability to develop and collect evidence of use in a number of different areas and levels of intervention: project, programme, policy, and organisation. Meta-evaluation is a useful approach as it compares analytical categories across programmes and policy areas, thereby adding value at a systemic level. Pawson and Tilley (1997) dismiss meta-evaluation as being founded on comparison between quasi-experimental evaluations: "we have to admit that we find the meta-evaluation track record to date somewhat disappointing. Our reasoning comes down to the 'raw materials' of such studies, the difficulty being that the quasi-experimental evaluations ordinarily used to drive meta-evaluations are...very poorly placed to provide the vital explanatory clues" (p.149).

However, meta-evaluation need not (and should not) be founded just on quasi-experimental evaluations. Pragmatic considerations alone dictate that evaluations are likely to provide a mix of methodological approaches, ranging from economic evaluation to participatory approaches, depending on the nature of specific evaluation tasks. This should provide a range of evaluation materials for informed meta-evaluation. Martin (2001) has advocated greater use of meta-evaluation to provide enhanced added value and greater economy within the evaluation function.


Rather than endlessly commissioning new and expensive evaluations, he suggests that more effective use and value can be gained from cross-evaluation analysis, which provides the prospect of system-wide learning and rewards. An additional advantage of meta-evaluation is that it helps prevent the 'reinventing the wheel' syndrome, where knowledge about what has been done before is lost to the system, a costly and wasteful phenomenon that evaluation is well placed to minimise. Thematic evaluation is also useful for system-wide learning. A number of meta-evaluations have produced some degree of policy reform and change. This implies that these evaluations have an intrinsic utility. It is difficult to do more than speculate on this, however, as the political context inevitably inhibits or promotes the take-up and use of evaluations. Meta-evaluations commissioned at times of policy change, flux, and uncertainty can be particularly useful in providing evidence to enable or hold back policy change. In addition, evaluations can provide reinforcement of individual positions and bargaining power within institutions, creating the illusion of significant use being made of particular meta-evaluation outputs.

There is a clear causality between utility and utilisation here which is difficult to unpick. Useable evaluations are those which are utilised. This does not necessarily imply that these evaluations are superior to others; they may fortuitously come about in a receptive policy environment rather than achieve impact because of an inherent strength and quality. Political factors constrain and enable the direct take-up of ideas and findings. The Structural Funds provide a good example of multiple stakeholders seeking influence on policy thinking. This has a strong impact on the take-up of evaluation even where evidence is credibly collected and application principles are sound and rigorous (Jansky 2002). Getting evidence right is nevertheless important; this is especially significant for giving evaluations an intrinsic utility. Useful evaluations need to have a strong and, as far as possible, "accurate" evidence base to allow the formulation of knowledge and policy thinking. Evaluations apply as well as create evidence; they also need to be eclectic and look across programmes where this is possible. Evaluation capacity entails the emergence and development of useable evaluation outputs as well as the ability to ensure their subsequent utilisation.

5.0 Learning and Evaluation Capacity

Building evaluation capacity entails learning in two different communities. The first of these concerns evaluation professionals: here learning is about accumulating and internalising information, and learning how to produce effective, relevant, and timely evaluations of use to those commissioning evaluations. The second concerns the policy maker and evaluation commissioner: here learning is gleaned from evaluations and applied for policy, programme, and/or organisational improvement. Here evaluation is by definition concerned with learning; in a sense this is a main purpose of the evaluation activity.

The concept of learning is a difficult one to define, measure, and capture. How do we know if organisations, programmes, or policies experience learning? Work on utilisation has attempted to identify the ways in which ideas and learning from evaluation enter into policy thinking, but policy thinking is clearly influenced by multiple sources.


Learning about and from the evaluation process is manifested by specific actions. In the case of those producing evaluation outputs, evidence of improved evaluations and improved negotiation tactics with clients is suggestive of accumulated experience and learning. Among the users of evaluation, changes in the policies and programmes evaluated, changes in organisational behaviours or in the design of new policies and programmes, and the commissioning of new work to fill existing gaps are all indicative of the use of evaluation as part of a learning-focused polity. These are merely symptoms that suggest that evaluation may have led to some policy-level or organisational learning. Learning can take place at many levels, from the individual to the organisational and beyond that to the inter-organisational and cross-policy levels. Learning about evaluation is gained through experience and contact with the complex world of policy and with different kinds of clients. Learning from evaluation entails the ability to share and disseminate knowledge throughout and across policy systems. Evaluation can play a role as part of a learning polity, enabling intelligent policy formulation and fostering a culture of reflection and learning.

Evaluation capacity therefore needs to be recognised as including the ability to learn to do useable evaluations, and the ability to learn from effective evaluations. In the former case, learning is something that takes time and requires hands-on experience. Undesirable learning outcomes are also possible: shortcuts, fudged outcomes, and inaccurate findings. Achieving successful, useable evaluations also requires some cognisance of political niceties and negotiation behaviour. In the latter case, learning from evaluations needs, ideally, to achieve a broad system-wide capacity. In this ideal scenario evaluation is central to a learning-focused system where policy development is a priority. Where evaluation is central to particular policy systems and creates opportunities for learning and intelligent policy formulation, it exemplifies a desirable use and application of evaluation.

Where evaluation is a relatively new activity the lack of evaluation capacity is often lamented. Policy makers are often among the first to complain that evaluation capacity is 'inferior'. This is a difficult and challenging conclusion to draw. Capacity issues of all kinds emerge where organisations are asked to do tasks they have never previously had to do and where structures, personnel, and experience may not be in place. The EU has seen the emergence and standardisation of evaluation practice over the last decade in an effort to induce some comparability and common standards between different Member States. In this context the notion of evaluation capacity can be a somewhat loaded term, implying questions of superiority which are not necessarily appropriate. Differences in evaluation approaches may make life difficult for comparative use of the findings, but this is not the same as lacking the capability to perform the evaluation function. This issue has become increasingly important with the imminent enlargement of the European Union.

Doing useful evaluation also entails the ability to utilise appropriate evaluation theory effectively, to choose appropriate methodologies, and to use effective evidence. All these things come together to make up evaluation capacity from the point of view of those involved in producing and delivering evaluations. Learning emerges from experience in doing successful (and unsuccessful) evaluations that are used by policy makers and other stakeholders. A number of obstacles lie in the way of learning from evaluation.
Argyris (1985) famously catalogued defensive routines that stand in the way of organisational change and learning.


Merton (1980), Blau (1956), and others describe dysfunctional bureaucratic behaviours which inhibit efficiency. Merton's approach is summarised by Crozier:

Merton contends that the discipline necessary for obtaining the standardized behaviour required in a bureaucratic organization will bring about a displacement of goals. Bureaucrats will show ritualist attitudes that will make them unable to adjust adequately to the problems they must solve. This will entail the development of a strong esprit de corps at a group level and create a gap between the public and the bureaucracy. (Crozier 1964, p.180)

The combination of defensive routines and goal displacement, whereby "an instrumental value becomes a terminal value" (Merton 1980, p.25), can effectively halt positive, desirable learning. Inability to solve problems is symptomatic of a lack of openness to new knowledge and to change. Learning from evaluations is likely to be adversely affected in this context. Evaluation inevitably enters with difficulty into such environments, making learning for effective policy change and development problematic. Indeed, Palumbo points out that, "if evaluations cannot be turned to the advantage of program managers, then it is in their interests to suppress or simply ignore them" (1987, p.23). Or as Cronbach et al. put it, "rarely or never will evaluative work bring about an 180-degree turn in social thought" (1980, p.157).

Obstacles to policy learning from evaluations can also be brought about by ineffective evaluation outputs. Inappropriate ways of conveying a message, failure to accept or recognise political constraints on policy, and inadequate or partial evidence being used to suggest inaccurate findings can all inhibit learning from evaluation. Evaluators also sometimes avoid delivering the kind of evaluation outputs requested because of a lack of specific technical expertise (avoidance of macroeconomic modelling and top-down analysis is quite common in this regard). Evaluation capacity also implies learning by evaluation commissioners about what can realistically be achieved. Current procedures at all levels of governance for the tendering of evaluation contracts would suggest that there is considerable room to enhance learning in this regard. Evaluation and learning should ideally form the basis of a reflective polity focused on continuous policy improvement. At a research level there are difficulties in tracking where and when learning takes place. This does not mean it has not occurred, as evaluation use (and learning from it) can take many forms, as Carol Weiss (1998) has demonstrated.

6.0 Conclusion: Getting round the Obstacles

Theory, evidence, and learning are all closely related components of the evaluation process. They overlap in many respects: applying evidence, for example, involves learning; generating useable evaluations involves accessing reliable and appropriate evidence as well as developing a sound understanding of programme theory. Theory, evidence, and learning are best understood as complementary aspects of evaluation, and intrinsic to the building of evaluation capacity. However, as we have seen, achieving evaluation capacity involves additional activities that contend with the politicised environment that distorts the process of use and commissioning of effective evaluation.


Building evaluation capacity entails learning to do effective and methodologically sound evaluations as well as finding ways to make sure that evaluation is useable in a positive way. A focus on the individual components of evidence and enhanced programme theory may help in developing the capacity for cross-programme systemic learning. Currently, professional training in evaluation is not widely available in Europe, and resource investment in this field would significantly help to enhance the capacity of future evaluators to produce high-quality evaluation outputs. Training can also equip professional evaluators to contend with the political obstacle course within which evaluation is undertaken.

The politicisation of the evaluation and policy-making process constitutes the most difficult obstacle to building evaluation capacity. Other obstacles are more easily addressed with the provision of resources and time to build and develop experience and expertise in the field. Many authors advocate a process of slow change for the evolution of capacity; Pawson and Tilley (1997), for example, see evaluation as part of a process in which knowledge accumulates over time and across successive programmes. This knowledge about what works in specific contexts is supplemented by a teacher-learner relationship in which the evaluator brings together different bits of policy knowledge into a meaningful whole. Thompson (1995), researching institutional capacity in developing countries, has also argued that cultural change is possible in slow time. He writes, "such a transformation is neither easily nor quickly achieved" (1995, p.1544). This confirms existing knowledge about the general pace of evolution of contemporary organisations; as Tömmel has observed in her analysis of the EU, "it can be said, first, that the system has improved in its institutional structure. This did not occur through the implementation of a grand design but through an incremental, piecemeal process of realising minor steps of change" (1997, p.14). The notion of slow evolution and change has an impressive pedigree of advocates. One worry remains, however: the political manipulation of programmes and policies is not diminishing but remains a constant factor in public policy. The evaluation process nevertheless still has to find ways to penetrate this problematic context. More than twenty years ago Alkin et al. wrote with some alarm:

In the graveyard of ignored or disregarded evaluations rest not only those technically inferior studies which earned their consignment to oblivion; there are also many studies seemingly of high quality which somehow failed to move their audiences to action. These latter, "wasted" evaluations disturb evaluators and decision-makers alike because they draw into question whether the evaluation enterprise is, in fact, working. (1979, p.13)

Pawson and Tilley (1997) nonetheless suggest that intellectual rigour should be the key to effective policy evaluation and influence: "we remain wantonly idealist in refusing to accept that the consequence of programs being peopled and political is that researchers become mere palliators and pundits. We believe, in short, that the strengths of evaluation research depends on the perspicacity of its view of explanation" (1997, p.219, authors' italics).


However, the reality of the context within which evaluation is undertaken demonstrates significant distorting influences which do not necessarily reward excellence of explanation, but rather findings which support political will and direction. Nutley et al.'s (2002) observation about the predominance of conviction politics in shaping policy formulation in the UK in the 1980s and 1990s serves as a timely warning about the potential for burying credible evaluation in the pursuit of political objectives. It is not surprising, then, that evaluations often have to enter into the political game in order to ensure they have some utility. As Weiss has observed, "Evaluation reports themselves unavoidably take a political position even if they claim to be objective" (1987, p.18).

Work in other areas suggests some interesting complementary solutions to this slow evolution. Work on ways round political clientelism suggests that the emergence of horizontal (as opposed to hierarchical) interest groups can provide an antidote to clientelistic practices. Graziano alludes to this; writing about the Italian political system, he commented that, "in the absence of any notion of collective interests, the provision of collective goods is unproductive in terms of political influence....the administration of the state became a gigantic spoils system for the benefit of political clientele" (1973, p.15).

If we apply this notion to the current problem, we may find that increasing the activity of evaluation societies and the emergence of a set of common and shared interests within the profession is helpful. This also provides an argument for increased professionalisation of evaluation, leading to the emergence of a powerful lobby which helps reinforce the need for evidence-based policies and the contribution that the evaluation venture can make within this. Work by Haas (1992) on epistemic communities also supports this notion; Jansky summarises the approach thus: "The epistemic communities approach argues that under conditions of uncertainty (e.g. due to the complexity of a policy), a network of professionals with recognised expertise (e.g. evaluators) in a particular policy domain can offer solutions to policy makers (e.g. governments) who become increasingly dependent on such epistemic communities, as policies also become increasingly complex over time. Moreover, the more experts are integrated into the policy process (e.g. by assuming formal positions in institutions) the more influence they can exert on the process of policy evolution." (2002, p.3)

Evaluation capacity is a multidimensional phenomenon. It requires the ability of professional evaluators to produce effective evaluations founded in theory and in strong, appropriate evidence. Evaluation capacity requires that evaluations have utility and relevance, generating learning for the users of evaluation. It also requires that institutions and policy makers have the capacity to utilise evaluations, and that distorting political influences are minimised in favour of evidence-based policy making where evaluation induces system-wide and policy/programme learning. This entails the emergence of reflective practice among policy makers, using evaluations in conjunction with other sources of knowledge and learning (research, practice, lobbying) to build an integrated, learning-focused, knowledge-based system.


Growing experience and learning also requires reflection on the nature of the commissioning culture, and an acceptance that capacity enhancement means allowing new evaluators opportunities to gain experience. Policy makers, evaluation commissioners, and evaluation professionals need to maintain and build a dialogue so that better understandings are reached about professional principles, capacity requirements, and the needs of policy-relevant research.

Evaluation capacity can be enhanced at a professional level by a number of factors. Firstly, slow-time evolution in the policy world, as capacity is enhanced through demonstration of the value of quality outputs. Secondly, the achievement of greater integration and dialogue between the policy world and the evaluation practitioner, which will help eradicate misunderstandings of the utility and utilisation of evaluation. Finally, political obstacles to evaluation utilisation and inadequate professional training provision can both be addressed through the professionalisation process for evaluation. This affords the prospect of greater influence and relevance for evaluation, as well as a greater focus on developing high-quality evaluation outputs following training opportunities. Evaluation societies are well placed to perform this essential function of professionalising the evaluation function and directly helping to build evaluation capacity in the process.

References

Alkin, M.C., R. Daillak, P. White, (1979), Using Evaluations: Does Evaluation Make a Difference?, London: Sage Publications

Anderson, J., (1975), Public Policy Making, Nelson

Argyris, C., (1985), Strategy, Change, and Defensive Routines, London: Pitman

Batterbury, S.C.E., (1998), Top-down meets Bottom-up: Institutional Performance and the Evaluation/Monitoring of the EU's SME Policies in Galicia and Sardinia, Unpublished D.Phil thesis, University of Sussex

Blau, P.M., (1956), Bureaucracy in Modern Society, New York: Random House

Brewer, G.D., (1974), The policy sciences emerge: to nurture and structure a discipline, Policy Sciences, pp.239-44

Cronbach, L.J., S. Robinson Ambron, S.M. Dornbusch, R.D. Hess, R.C. Hornik, D.C. Phillips, D.F. Walker, S.S. Weiner, (1981), Toward Reform of Program Evaluation, London: Jossey-Bass Publishers

Crozier, M., (1964), The Bureaucratic Phenomenon, Chicago: University of Chicago Press

Graziano, L., (1973), Patron-client relationships in southern Italy, European Journal of Political Research, 1, pp.3-34, Amsterdam: Elsevier Scientific Publishing Company

Haas, P.M., (1992), Introduction: epistemic communities and international policy coordination, International Organization, Vol. 46, No.1 (winter), pp.1-35


Howlett, M., M. Ramesh, (1995), Studying Public Policy: Policy Cycles and Policy Subsystems, Oxford: OUP

Jansky, R., (2002), The cognitive dimension of European Union policy making: the role of evaluation in the Structural Funds reforms (1988, 1993 and 1999), paper presented at the Regional Studies Association International Conference, Evaluation and EU Regional Policy: New Questions and New Challenges, Aix en Provence, France, 31st May and 1st June 2002

Lasswell, H.D., (1956), The Decision Process: Seven Categories of Functional Analysis, College Park: University of Maryland

Martin, S., (2001), Devolution and Evaluation, Seminar, 15 October, Welsh Assembly Government

Merton, R.K., (1980), Bureaucratic Structure and Personality, in Etzioni, A. & E.W. Lehman (eds), A Sociological Reader on Complex Organizations, 3rd edition, New York and London: Holt, Rinehart and Winston

Nutley, S., H. Davies, I. Walter, (2002), Evidence based policy and practice: cross sector lessons for the UK, keynote paper for the Social Policy Research and Evaluation Conference, Wellington, New Zealand, http://www.stand.ac.uk/~cppm/home.htm (accessed 02/10/02)

Palumbo, D.J. (ed.), (1987), The Politics of Program Evaluation, London: Sage Publications

Pawson, R., N. Tilley, (1997), Realistic Evaluation, London: Sage Publications

Thompson, J., (1995), Participatory approaches in government bureaucracies: facilitating the process of institutional change, World Development, Vol. 23, No.9, pp.1521-1554

Tömmel, I., (1997), The Political System of the EU, a Democratic System?, mimeo, paper presented at the Symposium of Delphi on Rethinking Democracy and the Welfare State

Weiss, C.H., (1987), Where politics and evaluation research meet, in Palumbo, D.J. (ed.), The Politics of Program Evaluation, London: Sage Publications

Weiss, C.H., (1998), Have we learned anything new about the use of evaluation?, American Journal of Evaluation, Vol. 19, No.1, pp.21-33
