Evaluation and Program Planning, Vol. 17, No. 4, pp. 419-427, 1994
Copyright © 1994 Elsevier Science Ltd. Printed in the USA. All rights reserved.
0149-7189/94 $6.00 + .00


INTEGRATING QUALITATIVE AND QUANTITATIVE EVALUATION METHODS IN SUBSTANCE ABUSE RESEARCH

MICHAEL L. DENNIS

Research Triangle Institute

DAVID M. FETTERMAN

Stanford University

LEE SECHREST

University of Arizona

ABSTRACT

For many years, there has been an ongoing debate about whether we should focus on qualitative or quantitative evaluation. Although prior discussions have often been adversarial (i.e., advancing one to the exclusion of the other or defending one's existence), most practitioners largely consider them to be two sides of the same coin. The need for integration is particularly evident when evaluating substance abuse programs because the individuals involved often have competing contextual demands and multiple problems that require the use of multiple types of treatment, outcomes, and analysis models. Unfortunately, there is little published literature on how to do this. In this paper, some specific opportunities and techniques are identified for combining and integrating qualitative and quantitative methods from the design stage through implementation and reporting.

This paper is based on a workshop held at the 1993 American Evaluation Association (AEA) Conference. The authors thank the conference participants who helped move along our thinking in this area and Lynn Usher for detailed comments on the draft manuscript. They also thank Richard S. Straw for editing the manuscript and Linda B. Barker for typing it. The AEA workshop and preparation of this article were partially supported by National Institute on Drug Abuse Grants No. P50-DA06990 to the Research Triangle Institute and No. DA06918 to Amity and the University of Arizona. Requests for reprints should be sent to Dr. Michael L. Dennis, Research Triangle Institute, 3040 Cornwallis Road, Research Triangle Park, NC 27709-2194.

INTRODUCTION

For many years, there has been an ongoing debate about whether substance abuse research should focus on qualitative or quantitative evaluation. This general debate recently reemerged in a series of presidential addresses at the American Evaluation Association (Lincoln, 1991; Sechrest, 1992) and renewed calls for synthesis (Cordray, 1993; Reichardt & Rallis, 1994). The need to consider qualitative and quantitative methods as two sides of the same coin is particularly obvious when doing formative evaluation or applied clinical research. A laundry list of direct and implied questions often has to be sorted out, with the desire being to develop specific recommendations for program changes. Thus, there is a need to do a comprehensive evaluation that meets both qualitative and quantitative evaluation standards.

Unfortunately, most evaluation techniques have been taught at a methodological or substantive level. Although there are some notable exceptions from applied evaluation (Fairweather & Tornatzky, 1977; National Institute on Drug Abuse [NIDA], 1993a; Posavac & Carey, 1992; Rossi & Freeman, 1987), few textbooks are available for doing a comprehensive evaluation that combines qualitative and quantitative methods. Moreover, several examples are in print in which both have been done, but they are typically presented as two related but separate studies, even to the point of being reported in separate documents.

This paper reviews some of the ways that qualitative and quantitative methods can be integrated when evaluating substance abuse treatment programs. It reviews some of the key issues that have been raised in some of our more recent writings (Dennis, 1994; Fetterman, 1989; Sechrest & Sidani, in press) and relates them to the evaluation of substance abuse interventions. Our paper identifies opportunities and issues related to integrating methods during three phases of an evaluation: design, implementation, and reporting. Some of the key questions that need to be addressed in each of these areas include:

• Design. How do we find the right questions and decide how to prioritize them? How will the answers be used, and how rigorous does the information need to be? What are the measures that are needed, and are they sensitive enough to detect what we are looking for?
• Implementation. How does the evaluation fit into the needs, concerns, and daily lives of the program staff and clients? How can it be cast in a way to meet the ethical, scientific, and logistical constraints of both researchers and practitioners?
• Reporting. Are there ways of providing early feedback and checking the direction the evaluation is going? Can the information be structured to make it more useful immediately for improving the program or for developing subsequent parts (or new grants) for the program?

These discussions apply to virtually any evaluation (be it anthropological or experimental, formative or summative), whether conducted in collaboration with a program or independently. At each stage, we attempt here to identify key qualitative and quantitative issues that should be addressed.

BACKGROUND ON EVALUATING SUBSTANCE ABUSE INTERVENTIONS

The issues under discussion here can arise in any field. They have received increased attention in substance abuse treatment research because of the Alcohol, Drug Abuse, and Mental Health Administration (ADAMHA) Reorganization Act of 1992 (Public Law 102-321), which mandated that NIDA and the National Institute on Alcoholism and Alcohol Abuse (NIAAA) set aside 15% of their funds for "services research." Public Law 102-321 also transformed the Office for Treatment Improvement (OTI) into the Center for Substance Abuse Treatment (CSAT) under the newly created Substance Abuse and Mental Health Services Administration (SAMHSA). Although CSAT continues to focus on block grants and funding demonstrations, it has also been trying to improve the quality of local evaluations of a series of demonstrations to "enhance" treatment.

Many of the issues related to substance abuse treatment evaluations are found in other fields, but several inherent impediments make evaluation difficult in substance abuse research. Some of these include the following:

• Substance abuse is a chronic condition with a high rate of relapse and cooccurring problems, including mental illness, physical abuse, unemployment, inadequate housing, criminal activities, infectious diseases, primary care problems, and interpersonal conflicts.
• Many people face logistical and personal barriers to entering and staying in treatment, including transportation, childcare, paying for treatment, social stigmatization, and conflicts with program ideology or sensitivity.
• People are often in treatment to avoid criminal penalties and may feel coerced into certain terms of treatment (e.g., urine testing); consequently, they may be wary and less than frank with both treatment and evaluation staff.
• Several treatments and services appear to improve outcomes, yet no single treatment is considered a cure or panacea, and the cooccurring problems often require cotreatment for drug treatment to be effective.
• Most evaluations focus on single problems, outcomes, and "black box" treatment models that miss the complexity of the problem, treatment, and outcomes.
• Much of the published literature from related fields fails to deal with the issue of multiple problems or even the issue of multiple drug use, which is increasingly the norm in publicly funded programs.

For these reasons, Federal, State, and local programs have all attempted to enhance treatment and make it more comprehensive. Not only are several national evaluations currently under way, but also interest has grown in improving the quality and applicability of local evaluations and making them a more integral part of program development. Some of the questions that these program evaluations typically ask include the following:

• Who is being served? Who is being missed or needs to be served?
• To what extent are services being delivered to the most needy, appropriate, or targeted group?
• What services are actually being delivered?
• To what extent are some approaches more effective in reaching and keeping the target population involved?
• How does the client case mix affect overall program or staff performance? How can we match services to clients?
• To what extent are different approaches and levels of training and resources associated with changes in the level of services delivered?
• How are changes in the level and appropriateness of services related to changes in client and program performance?
• Are some approaches more cost-effective than others?

Such questions require both normative and empirical judgments. To help programs begin to address them, we need to start thinking about the process by which evaluation can be designed to complement program development. This means less focus on specific methods per se, and more focus on the problem, the program context, and the kind, quality, and timing of information that is needed.

DESIGN ISSUES

Questions, Stakeholders, and Designs

The first step of any evaluation should be to determine (a) what you want to know, (b) who wants to know it, (c) how the information will be used, and (d) what fiscal, logistical, and time constraints must be considered. Each of these questions may affect the kinds of methods that are most appropriate. The need to understand a problem, validate measures, and get feedback all require some level of qualitative methods. The need to generalize, compare, or evaluate outcomes requires some level of quantitative methods. Because most evaluations address several questions and occur over time, they typically require a combination of methods.

In most evaluations, it is also important to consider the likelihood that there are multiple audiences for whom the answers to these questions may be very different. For instance, in estimating drug use in the DC metropolitan area, NIDA found that, although including homeless people increased the total population estimate by only 0.3%, it increased the estimated number of past year needle users by approximately 25% (NIDA, 1993b). For the epidemiologist interested in the household population, the addition of homeless people would change the overall rate of past year needle use from 0.2% to 0.25% and would not even be reflected in the existing reports. For treatment providers, however, it would increase the total number of past year injection drug users potentially in need of treatment by over 25%. Although algebraically equivalent, the context and relative precision of the same information have grossly different implications for the importance of including homeless people in drug use estimates.
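To make the arithmetic behind this example concrete, the short sketch below recomputes both perspectives. Only the 0.3%, 25%, and 0.2% figures come from the text; the base household population is a hypothetical number chosen for illustration.

```python
# Worked version of the NIDA (1993b) example above. The household population
# is hypothetical; the 0.3%, 25%, and 0.2% figures come from the text.
household_pop = 3_000_000        # hypothetical household population
household_rate = 0.002           # 0.2% past-year needle use

household_users = household_pop * household_rate      # 6,000 users
homeless_pop = household_pop * 0.003                   # +0.3% to the population
homeless_users = household_users * 0.25                # +25% to needle users

new_rate = (household_users + homeless_users) / (household_pop + homeless_pop)
print(f"overall rate of past-year needle use: {household_rate:.2%} -> {new_rate:.2%}")
print(f"additional injection drug users potentially needing treatment: {homeless_users:,.0f}")
```

The same numbers, read against different denominators, support the two different conclusions described above.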

When defining the questions, needs, and constraints of substance abuse research, it is essential to consult with the potential stakeholders and end users of the research. To the extent that alternative definitions or issues can be incorporated, the evaluation will be easier to implement and the report will be more useful. Failure to consider their input may, in contrast, lead to roadblocks and subversion. Most evaluators consult with the head of a program or agency they are studying but often leave out several key groups: the program's staff, the program's clients or consumers, and related agencies or services. Some of the ways these groups can be involved include interviews, advisory board meetings, surveys, focus groups, participant observation, or full ethnographic studies of the situation.

Having reached some understanding of the evaluation's purpose, the next important step is to summarize it in writing and operationalize the procedures and expectations for the evaluation. Such designs and workplans dictate the process and decision rules, not the outcome. As such, they are consistent with both qualitative and quantitative evaluation methods.

Two key components of any evaluation design or workplan are the time line and budget. To take advantage of an evaluation, most decisionmakers will require either quick or preliminary feedback on the results. Good evaluations, whether qualitative, quantitative, or both, often take years to complete. It is therefore essential to think of them more as research programs than as single research studies. That is particularly true if they involve demonstration projects that are likely to extend over time but that are also likely to change as they develop. Although we must still be wary of releasing information that could jeopardize the study's implementation or validity, we must also not hold back information that affects its usefulness or ethical decisions about continuing the evaluation. Evaluations can be integral parts of ongoing program development. They can often produce information that would be useful to several stakeholders if provided early but of little or no use if provided 5-10 years later. Examples might include a needs assessment component, preliminary study data to support a funding request, preliminary evidence suggesting that an experiment is very harmful (or more beneficial than expected), the results of an individual's clinical assessment, and input on a staffing problem.

The time line and budget are also major factors in determining the kind of study to conduct. If there is little time, a small budget, and a pressing need, one may consider doing a pilot study, a rapid impact evaluation, or even an evaluability assessment. When more time or resources are available or the problem is more complicated, it becomes increasingly useful and important to flesh out an understanding of the problem, treatment, and outcomes.


This might include record studies, ethnographic auditing (e.g., Fetterman, 1990), and follow-up-only studies. For the large and ongoing problems that warrant the commitment of more resources and time, it then becomes more appropriate to consider full ethnographic evaluations (e.g., Fetterman, 1984), participant observation, longitudinal studies, and experiments. To the extent that evaluations address multiple problems, multiple methods or a series of methods may be more appropriate.

Unlike general research, it is also important to realize that most evaluations are dealing with ongoing entities and people. The first implication of this fact is the need to have formative evaluation components that help identify how to understand and improve the treatment program. The second implication is that, although it may be meaningful to do a summative evaluation on a particular procedure or service, it is often less useful to do this on an entire modality, program, or agency. For instance, early drug treatment evaluations focused on the effectiveness of the main treatment modalities (e.g., outpatient, methadone, short-term residential, long-term residential), whereas subsequent analyses have found considerable variation within modalities that severely limits the usefulness and generalizability of the findings (Condelli & Hubbard, 1994).

Integrating Designs

Substance abuse researchers typically employ qualitative and quantitative investigation methods as fairly separate and orderly processes, but the reality for most evaluations is that an evaluator walks into the middle of a process with many decisions having been made and many constraints already in place. The preceding design decisions should not occur in a vacuum. Either through previous work or during the beginning of an evaluation, an evaluator must become familiar with the terrain and context in which he or she will be working to be effective (Usher, 1993, October). No qualitative or quantitative method offers a panacea for avoiding the need to understand the problem, service, or treatment being evaluated. Having done so, however, an evaluator is then able to better focus on the design, measures, and other issues that are likely to make the evaluation more successful. Through experience or a process of collaboration with those who possess the necessary experience, the evaluation design must be guided by a good understanding of the existing context in establishing the goals and objectives.

Probably the most important way that qualitative and quantitative methods can be integrated in the design phase is in deciding on which of many questions to focus. Both evaluators and stakeholders are likely to generate a laundry list of questions of varying importance and difficulty. Some of the ways that both methods can help contribute to focusing the evaluation (and references with more detailed discussions of them) include the following:

• Prioritization. Identifying the questions in an evaluation agenda that are likely to have the greatest impact on current policy or programmatic debates is of primary importance. The kind and quality of information that will be necessary to satisfy each side of the debate are a fundamental part of regular decisionmaking for most clinicians and practitioners (see Posavac & Carey, 1992; Sechrest & Sidani, in press).
• Hypothesis generation. If clinicians or program managers repeatedly report seeing a problem, their interpretation may not always be right, but there is almost always a real problem there. Trying to operationalize and triangulate various sources of information to evaluate their hypotheses is a way to build applied theory (see Dennis, 1994; Fetterman, 1989).
• Evaluability assessment. Some questions cannot be answered empirically, or the design constraints may prohibit the statistical power necessary for obtaining the answers (see the power sketch after this list). For an outcome or experimental evaluation, it is also essential to evaluate whether clinicians think the proposed treatment contrast is capable of producing the proposed differences in outcomes (see Dennis, 1994; Wholey, 1983).
• Identifying likely criticisms. In a field like evaluation, it is essential to understand the arguments that are likely to be made against a new intervention and address them as part of the implementation process. Similarly, it is important to anticipate the likely methodological criticisms and prioritize their risk or importance (see Campbell & Stanley, 1966; Rossi & Freeman, 1987).
• Balancing internal and external validity. In applied research, it is important to seek a balance between the internal validity/precision of qualitative and quantitative methods and the external validity/generalizability of the results. Part of the latter includes collecting data that allow other audiences to compare the primary program with theirs and including case studies or norms that help them understand or use the information (see Cook & Campbell, 1979; Cronbach, 1983; Fairweather & Tornatzky, 1977).
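As a hypothetical illustration of the statistical power point raised under evaluability assessment, the sketch below uses a standard normal-approximation power formula; the effect size and sample sizes are invented for the example and are not drawn from the paper.

```python
# Approximate power of a two-sided, two-sample comparison of means
# (normal approximation). Effect size and sample sizes are hypothetical.
from scipy import stats

def approx_power(effect_size_d, n_per_group, alpha=0.05):
    z_crit = stats.norm.ppf(1 - alpha / 2)
    noncentrality = effect_size_d * (n_per_group / 2) ** 0.5
    return stats.norm.cdf(noncentrality - z_crit)

# A modest treatment contrast (d = 0.3) with 40 clients per arm is badly
# underpowered; roughly 175 per arm are needed for about 80% power.
print(round(approx_power(0.3, 40), 2))    # ~0.27
print(round(approx_power(0.3, 175), 2))   # ~0.80
```

A check like this, done before data collection, is one way an evaluability assessment can show that a design constraint rules out an empirical answer.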

It is important to realize that it is just as meaningful to talk about the internal and external validity of a "goal-free" evaluation as of an experiment (i.e., "Was it done as proposed?" "To what can it be generalized?"). Conversely, the mere use of random assignment does not make a piece of research timely or relevant to anything. Multisite field experiments, for instance, often produce large differences in the program models implemented that can alternatively be viewed as either uncontrolled variations or as robust tests of concurrent replicability.


IMPLEMENTATION ISSUES

Inquiry Process

To answer a question, Sechrest and Sidani (in press) suggested that we need to seek one or more of three types of evidence:

• Intuitive sources include whether a potential answer seems right, feels good, or seems to make sense.
• Authoritative sources include asking experts, looking in books (e.g., the Physician's Desk Reference) or articles, consulting respected norms (e.g., legislation, regulations, procedure manuals), or seeking spiritual guidance.
• Empirical sources entail actions in which a person analyzes qualitative and quantitative information through either primary or secondary analyses.

Frequently, evaluations use a combination of evidence (both within and across types) to triangulate on a likely answer. In clinical practice for drug treatment, few problems will be resolved by just one test, nor can simple sets of questions be used to formulate conclusive diagnoses or evaluate a client. Similarly, in evaluation, no service or program should be evaluated solely on the basis of a single piece of evidence. Our goal is to look for a larger pattern in which several indicators present consistent evidence that something does (or does not) work.

Adapted from Sechrest and Sidani (in press), Table 1 illustrates some of the qualitative and quantitative methods that can be used to collect, analyze, interpret, and report information in substance abuse evaluations. Unlike the original exhibit, this version has been divided into methods that attempt to be more open ended (intuitive) versus those that attempt to follow strict rules (formulaic). Clinicians, program managers, regulators, and evaluators may follow several paths. Although an evaluator must complete each process before proceeding to the next stage in a meaningful way, he or she may decide to go back to collect more data or reanalyze data in more of an iterative process. Like the analogy of a research program suggested earlier, most long-term evaluations are quite recursive or cyclic, with repeated reversions to unanswered questions or reanalyses based on new understandings or paradigm shifts.

Complementary methods may also be used in answering questions. Fetterman (1989), for example, demonstrated how combining methods can be used for creative problem solving in evaluation. The math scores in one educational program for dropouts were unusually low during one quarter of the year. The ethnographic description of program implementation provided a simple explanation for the low scores during that period: the absence of a mathematics teacher.


The program had difficulty recruiting and retaining mathematics teachers given the competitive market for these individuals. The psychometrician knew this quarter was an aberration but did not know why; the ethnographer sat in the classrooms and described daily events and was thus able to explain the poor scores.

In addition, Fetterman (1993) highlighted the value of analyzing the quality of quantitative information. In the same program for dropouts discussed above, Federal sponsors were considering closing one of the schools because the attendance was only 60-70%. The attendance in the neighboring urban high schools was somewhat higher. The ethnographer reminded the sponsoring agency that these students were the ones who had dropped out of the neighboring urban high school; hence, the baseline was zero attendance. Although this might have eventually been deduced from the data, Federal policymakers would have closed down the program by then. A clinician, ethnographer, or someone who is intimately familiar with the day-to-day situation improves the design, implementation, and analysis, and in some cases may be the difference between the life and death of a program.

Implementation Management and Analysis

Perhaps one of the most important times to use both qualitative and quantitative methods is during the implementation management and analysis of a treatment demonstration or study. In a chapter on doing randomized experiments, Dennis (in press) suggested that clinicians will often agree to a change and then simply not do it. Worse yet, if they are not persuaded that the experiment is ethical, they may actively subvert it. If an evaluation includes the implementation and evaluation of a new treatment component, it is essential that an evaluator present evidence that supports the proposed intervention and design to the practitioners involved. To the extent necessary, an evaluator may then need to revise the intervention or design to meet their ethical and logistical concerns. It may even be necessary to subcontract out an intervention component to another group to avoid ethical or paradigm conflicts.

Having achieved nominal consent, it is often necessary to remember that most experiments require people to change their behavior. Like any individual or organizational change, it is useful to provide feedback to participants on how they are doing. Ideally, this process will be interactive and also provide an evaluator with early anecdotal evaluations of the intervention from the clinicians. If an implementation phase is incorporated in the design, this information can be used to further refine the intervention and evaluation. Not only will this improve the evaluation, but it will also provide an opportunity for the clinicians to "buy in" to the study.

Whether evaluations are conducted internally, in collaboration with a program, or by an independent third party, it is typically important for the evaluator to gain staff cooperation.


TABLE 1
EXAMPLES OF HOW QUALITATIVE AND QUANTITATIVE METHODS CAN BE USED TO COLLECT, ANALYZE, INTERPRET, AND REPORT INFORMATION IN GOAL-FREE AND FORMULAIC WAYS IN SUBSTANCE ABUSE TREATMENT

Collection
  Intuitive: Participant observation with clients waiting for treatment to understand their perspective; open-ended interviews with clients about why they use drugs and how they see the world.
  Formulaic: Urinalysis of random samples to detect drug use; structured questionnaires to determine an individual's history and current treatment needs.

Analysis
  Intuitive: Developing a typology or ethnography to summarize an individual's perceptions and behaviors; reviewing an incident or set of data with a group of staff or individuals to find out what it means to them.
  Formulaic: Summarizing the empirical distribution of individuals with principal components or factor analysis; comparing changes in two or more comparison groups to determine the relative effectiveness of different treatment components.

Interpretation
  Intuitive: Using multiple definitions and criteria (e.g., harm reduction vs. abstinence); using the insider's perspective as a lens in which to describe and interpret events.
  Formulaic: Using statistical testing to decide which findings are reliable; using hypothesis testing to ask questions of the data set.

Reporting
  Intuitive: Reports/articles that try to present multiple perspectives of reality; detailed narratives and ethnographies; many detailed tables.
  Formulaic: Reports/articles that attempt to include background, methods, results, and discussions; executive summaries that attempt to highlight key findings.

Some of the specific strategies Dennis (1994) suggested include the following:

• Explain to the staff why the study is being done and how the information is being used. They need to be persuaded just like everyone else.
• Present the draft instruments and procedures to the staff for critique and input. They may not be able to tell you what to do, but they will almost always be right in what they tell you will cause a problem.
• Be responsive to staff concerns and make some accommodations. Even if you cannot eliminate every problem raised by the staff (e.g., more paperwork), accommodating the staff on some issues will make them more likely to go along in other areas.
• Give the staff feedback on the study's progress and findings. Many experiments take several years. As in any endeavor, staff should be given pep talks. Letting them know that you are actually using all of that extra paperwork is the best way to get it done and done well.
• Be sensitive to internal time horizons and deadlines. Minimize the extent to which your procedures and reporting guidelines conflict with the program's operations. Much of the information you will collect may have alternative uses by the program staff and should be shared where feasible.

Because evaluators often work for the sponsor, one of the simplest things they often forget to do is to send copies of the final report to the program and brief the program staff. When doing formative evaluations or services research, evaluators can develop collaborative relationships with program staff. Fetterman (1993, 1994a, 1994b) has even further advocated the integration of self-evaluation into the planning and management process.

Dennis, Fairbank, Bonito, and Rachal (1991) also suggested three questions that should be answered when evaluating the implementation of a new treatment component:

• To what extent have the experimental and control interventions been implemented as planned?
• To what extent do the experimental and control interventions differ from each other, even in unplanned ways?
• To what extent does the randomized experiment represent a fair or valid test of any observed differences?

Conceptually, the first two questions may be considered a form of treatment validation analysis for the main study and at least partially involve normative clinical judgments about what the objectives should be. Even if one or more interventions are not what was expected, important information about effectiveness can still be learned.

It is not uncommon for a totally unexpected finding to be one of the most important outcomes of an evaluation or study. Anticipating the unexpected is something that both qualitative and quantitative methods can be tailored to do. The last question is traditionally addressed in the context of Cook and Campbell's (1979) threats to internal validity (such as contamination, history, maturation, measurement bias, and selectivity). It is important to note that even when randomization is successfully implemented, some threats to internal validity may still exist; differential attrition and compensatory rivalry are particularly threatening (Dennis, 1990; Fetterman, 1982).

Development of Meaningful Measures

For both the implementation and full analysis, it is important to identify or develop meaningful qualitative and quantitative measures. Many qualitative methods produce an enormous amount of information that has to be synthesized and interpreted. Developing typologies of people and behaviors is an essential step in ethnographic studies and many qualitative procedures. A great deal of emphasis is placed on identifying and comparing the perspective(s) of the people involved in the evaluation. The goal is to develop descriptions that relevant groups could agree on.

In quantitative methods, measures are often evaluated by the extent to which they involve two types of error: random error and bias. Random or uncorrelated error means that, although the measure is not perfect, the mistakes in each direction are roughly equivalent and cancel each other out. Bias means that the measure systematically goes in one direction or the other. Actually, as Sechrest and Sidani (1994) maintained, the same concepts of error are as applicable to qualitative as to quantitative methods. In qualitative evaluations, for instance, the data may not be representative, or too much weight may be given to particularly salient (but atypical) reports. In either case, it is important to know the extent of both types of error, because a small amount of bias with a small amount of random error (i.e., a small interval estimate) may be much preferable to an unbiased estimate with a large amount of random error (i.e., a large interval estimate).
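A small simulation (ours, not the authors'; all numbers hypothetical) can make that trade-off concrete: a slightly biased but precise measure can come closer to the truth, case by case, than an unbiased but noisy one.

```python
# Hypothetical illustration of bias versus random error in a measure of
# monthly drug-use frequency. A small bias with little random error can yield
# a smaller typical (root-mean-square) error than no bias with large random error.
import numpy as np

rng = np.random.default_rng(0)
true_days = 12                                                # "true" days of use per month

biased_precise = true_days - 1 + rng.normal(0, 1, 10_000)     # bias = -1, sd = 1
unbiased_noisy = true_days + rng.normal(0, 6, 10_000)         # bias = 0,  sd = 6

for name, m in [("biased but precise", biased_precise),
                ("unbiased but noisy", unbiased_noisy)]:
    rmse = np.sqrt(np.mean((m - true_days) ** 2))
    print(f"{name}: bias = {m.mean() - true_days:+.2f}, RMSE = {rmse:.2f}")
# The biased measure is typically off by about 1.4 days; the unbiased one by about 6.
```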

In substance abuse treatment, a great deal of focus has been on single-item measures, such as a self-report or urinalysis. Unfortunately, this has considerable limits for reflecting the chronic nature and complexity of substance abuse. Some of the ways that researchers can be more sensitive to these larger issues include the use of:

• Cognitive appraisals to compare the evaluator's and respondent's understanding of both the key questions and responses;
• Time to an event, the frequency of events, or Likert scales rather than yes-no types of variables;
• Multiple items to create a scale or index;
• Multiple items or scales in a multivariate or structural equation approach; and
• The appropriate metrics and formats for presenting the data.

The latter is important if the next stage, reporting, is to be effective. Although tables of log odds ratios may be technically flawless, they may be totally uninterpretable for many clinicians without also having an example or figure. (See Lennox & Dennis in this issue for a more detailed discussion of measurement issues.)
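For instance, a purely illustrative translation of a log odds ratio into a clinician-friendly contrast might look like the sketch below; the 0.7 log odds ratio and the 30% comparison-group abstinence rate are hypothetical numbers, not findings from any study cited here.

```python
# Hypothetical translation of a log odds ratio into concrete abstinence rates.
import math

log_or = 0.7          # hypothetical treatment effect (log odds ratio)
control_rate = 0.30   # hypothetical abstinence rate in the comparison group

control_odds = control_rate / (1 - control_rate)
treated_odds = control_odds * math.exp(log_or)
treated_rate = treated_odds / (1 + treated_odds)

print(f"odds ratio = {math.exp(log_or):.2f}")
print(f"abstinence: {control_rate:.0%} (comparison) vs {treated_rate:.0%} (treated)")
```

A sentence such as "abstinence rose from about 30% to about 46%" is usually far more interpretable for clinicians than the log odds ratio it was derived from.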

REPORTING ISSUES

Current Problems

Generic problems occur in many substance abuse evaluations (and in evaluations more generally), and these problems are related to the reporting and integration of qualitative and quantitative data. Some of the key ones in both qualitative and quantitative reports include the following:

• Salient case studies are often presented that offer insightful information but lack statistical reliability or generalizability. Like a police officer seeking circumstantial evidence, we want to notice other things that at least make the interpretation appear likely.
• Substance abuse evaluations present a sea of data that obscures both problems and successes. Whether it is 50 tables or 300 pages of field notes, the reader is typically looking to the evaluator for synthesis and interpretation.
• Substance abuse evaluators who have used multiple methods may still fail to integrate their qualitative and quantitative findings in the final interpretation and reporting phase. For example, if a report indicates that an intervention's effects were modest and later mentions difficulties in program implementation, the reader needs to know whether the evaluator believes that the latter has any bearing on the former.
• The timing of reports often fails to meet the needs of clients, clinicians, program managers, and policymakers. Although the main findings of an evaluation may need to undergo a thorough review process before publication, in many cases a lot of data could be used much earlier (e.g., implementation and feasibility results).

The failure to address these problems has limited the usefulness of many substance abuse intervention evaluations. It also has led to conflicts when a treatment program realizes that data possessed by the evaluator would greatly help with the treatment of an individual or the submission of a demonstration grant.


Even with a Federal certificate of confidentiality in hand, it is hard to explain to program directors why they cannot be given access to data collected within their own programs. Establishing expectations for access and the timing of reports to meet the needs of the programs and clients is essential to successful utilization of evaluation data and the avoidance of problems.

In actual communication, quantitative and qualitative kinds of information are almost always intertwined. Qualitative reports (e.g., ethnographies) are often too long and too complex to constitute a "final" report, executive summary, or conclusions. Report authors often will resort to such summary statements as "most respondents" or "many informants," which represent quantitative integration of qualitative information. Conversely, large tables of data and many statistical tests will need to be boiled down to such statements as "the intervention appears to be effective with Type Q participants." The more quantitative data that are available, the more an evaluator must rely on qualitative judgments about the practical significance of his or her research focus.

Toward Integrated Reporting

Because applied evaluations of substance abuse interventions typically involve multiple questions and are addressed to multiple audiences, there is almost by definition a need for a variety of reporting mechanisms. It is particularly important that an evaluator realize the impact of interleaving qualitative and quantitative data in memoranda, interim reports, or preliminary reports. Some of the ways this can be done include the following:

• Providing summaries of the initial efforts to define the goals and objectives (because of the difficulty of deciding what qualitative and quantitative data to focus on, it is important to have clinicians check the face validity of the proposed approach);
• Providing feedback to the clinicians to let them know how they are doing relative to each other and the objectives (this keeps them on track and prompts everyone to address problems while there is still time to potentially correct them);
• Getting feedback from the clinicians on the preliminary analyses and interpretation (this is an excellent test of both parsimony and clarity);
• Using qualitative data to illustrate a complex pattern of quantitative data; and
• Using quantitative data to put qualitative data into a broader perspective.

Sequencing the integration can be viewed as either serial or simultaneous. Serial integration means that one type of data is collected in response to another. Examples of this might include the following:

• Quantitative data suggest that a program was not effective, so qualitative data are sought and found to suggest that the program was poorly implemented; or
• Qualitative observation indicates that active participation may be low by some types of persons, so records are checked to confirm different participation rates.

Simultaneous integration means that quantitative and qualitative data are collected simultaneously and used to inform each other. Examples of this include:

• Feeding back qualitative information on implementation while there is still time to intervene and correct problems; or
• Using quantitative data as feedback and to identify subtler problems to improve treatment integrity.

Clearly, many evaluations may do both (i.e., integrate serially and simultaneously). Thus, an experiment with good quality control on treatment may use both qualitative and quantitative methods to control the intervention and flesh out the reporting process. If the analysis shows that the treatment worked particularly well for one subgroup, the evaluators might go back to the qualitative and quantitative data (or collect new data) to try to understand why.

CONCLUSION

In this paper, we have tried to show how qualitative and quantitative methods are both necessary and readily integrated. As evaluators, we see that both methods have much in common and have much to offer each other. In practice, we do not conceive of our own evaluations as being either qualitative or quantitative. Rather, we look for the method or methods that best answer our questions given the constraints of time, budget, and imagination.

Planning for integration should be part of the design phase and clearly reflected in the presentations, memos, and reports that are generated. Ideally, this will be an iterative process in which the methods complement each other and lead to further insight or help others understand the evaluation's main findings.

REFERENCES

CAMPBELL, D.T., & STANLEY, J.C. (1966). Experimental and quasi-experimental designs for research. Chicago: Rand McNally.

CONDELLI, W.S., & HUBBARD, R.L. (1994). Relationship between time spent in treatment and client outcomes from therapeutic communities. Journal of Substance Abuse Treatment, 11, 25-33.

COOK, T.D., & CAMPBELL, D.T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Boston: Houghton Mifflin.

CORDRAY, D.S. (1993). Synthesizing evidence and practices. Evaluation Practice, 14(1), 1-8.

CRONBACH, L.J. (1983). Designing evaluations of educational and social programs. San Francisco: Jossey-Bass.

DENNIS, M.L. (1990). Assessing the validity of randomized field experiments: An example from drug abuse treatment research. Evaluation Review, 14, 347-373.

DENNIS, M.L. (1994). Ethical and practical randomized field experiments. In J.S. Wholey, H. Hatry, & K. Newcomer (Eds.), Handbook of practical program evaluation (pp. 155-197). San Francisco: Jossey-Bass.

DENNIS, M.L., FAIRBANK, J.A., BONITO, A., & RACHAL, J.V. (1991). Treatment process study design (Technical Document No. 7, NIDA Contract No. 271-88-8230). Research Triangle Park, NC: Research Triangle Institute.

FAIRWEATHER, G.W., & TORNATZKY, L.G. (1977). Experimental methods for social policy research. New York: Pergamon Press.

FETTERMAN, D.M. (1982). Ibsen's baths: Reactivity and insensitivity (A misapplication of the treatment-control design in a national evaluation). Educational Evaluation and Policy Analysis, 4(3), 261-279.

FETTERMAN, D.M. (1984). Ethnography in educational research: The dynamics of diffusion. In D.M. Fetterman (Ed.), Ethnography in educational evaluation (pp. 17-29). Beverly Hills, CA: Sage.

FETTERMAN, D.M. (1989). Ethnography: Step by step. Newbury Park, CA: Sage.

FETTERMAN, D.M. (1990). Ethnographic auditing: A new approach to evaluating management. In W.G. Tierney (Ed.), Assessing academic climates and cultures (New Directions for Institutional Research No. 68, pp. 19-34). San Francisco: Jossey-Bass.

FETTERMAN, D.M. (1993). Speaking the language of power: Communication, collaboration, and advocacy (Translating ethnography into action). London: Falmer Press.

FETTERMAN, D.M. (1994a). Empowerment evaluation: AEA presidential address. Evaluation Practice, 15(1), 1-15.

FETTERMAN, D.M. (1994b). Steps of empowerment evaluation: From California to Cape Town. Evaluation and Program Planning, 17, 305-313.

LINCOLN, Y.S. (1991). The arts and sciences of program evaluation. Evaluation Practice, 12(1), 1-7.

NATIONAL INSTITUTE ON DRUG ABUSE (Byrnes, R., Cardenas, E., DeJong, W., Simpson, D.D., & Soncy, G.P.). (1993a). How good is your drug abuse treatment program? A guide to evaluation (NIH Publication No. 93-3609). Rockville, MD: Author.

NATIONAL INSTITUTE ON DRUG ABUSE (Dennis, M.L., Iachan, R., Thornberry, J.P., Bray, R.M., Packer, L.E., & Bieler, G.S.). (1993b). Prevalence of drug use in the Washington, DC, metropolitan area homeless and transient population: 1991 (NIDA Contract No. 271-89-8340, Washington, DC, Metropolitan Area Drug Study). Rockville, MD: Author.

POSAVAC, E.J., & CAREY, R.G. (1992). Program evaluation: Methods and case studies (4th ed.). Englewood Cliffs, NJ: Prentice-Hall.

REICHARDT, C.S., & RALLIS, S.F. (Eds.). (1994). The qualitative-quantitative debate (New Directions for Program Evaluation No. 63). San Francisco: Jossey-Bass.

ROSSI, P.H., & FREEMAN, H.E. (1987). Evaluation: A systematic approach. Beverly Hills, CA: Sage.

SECHREST, L. (1992). Roots: Back to our first generations. Evaluation Practice, 13(1), 1-7.

SECHREST, L., & SIDANI, S. (1994). Measurement. In B. Crabtree, G. Addison, A. Kuzel, & W. Miller (Eds.), Designing multimethod research (pp. 14-24). Newbury Park, CA: Sage.

SECHREST, L., & SIDANI, S. (in press). Quantitative and qualitative methods. Evaluation Practice.

USHER, L. (1993, October). Balancing stakeholder interests in evaluations of innovative programs to serve families and children. Paper presented at the Association for Policy Analysis and Management, Washington, DC.

WHOLEY, J.S. (1983). Evaluation and effective public management. Boston: Little, Brown.