
Journal of Consulting and Clinical Psychology April 1998 Vol. 66, No. 2, 400-410

© 1998 by the American Psychological Association For personal use only--not for distribution.

A Multiperspective, Multivariable Evaluation of Reliable Change

Kirk M. Lunnen
Department of Psychology

Benjamin M. Ogles
Department of Psychology

ABSTRACT

N. S. Jacobson and P. Truax's (1991) method for evaluating the clinical significance of client change has gained some prominence in psychotherapy outcome research. However, little has been done to investigate the validity of this methodology. This study addresses this limitation by comparing (a) the perceived level of change (as subjectively reported from 3 distinct perspectives) across outcome groupings based on Jacobson and Truax's reliable change index (RCI) and (b) subjective reports of therapeutic alliance and satisfaction across outcome groupings. The results of these comparisons indicate that the RCI is effective in identifying those who make reliable improvement in therapy but is less effective in differentiating between no-changers and deteriorators. In addition, the relationship between treatment outcome and satisfaction with service is questioned.

Correspondence may be addressed to Benjamin M. Ogles, Department of Psychology, Ohio University, 241 Porter Hall, Athens, Ohio, 45701. Received: November 27, 1996 Accepted: July 14, 1997

http://spider.apa.org/ftdocs/ccp/1998/april/ccp662400.html

8/30/2000

Therapy researchers have used inferential procedures to compare group means and to examine both within- and between-group variability. If these tests of mean and variance differences are found to be beyond the range of chance, the effects are deemed "statistically significant." However, the use of statistical significance as an index of client change may be hampered on at least two fundamental levels. First, because statistical significance is based on group means and variances, it is difficult, if not impossible, to winnow out useful information regarding a specific client (Barlow, 1981; Garfield, 1981; Hugdahl & Ost, 1981; Kazdin, 1977). Second, how does one interpret statistically significant results (Barlow, 1981)? Does a statistically significant result necessarily imply a clinically significant or clinically meaningful result? Although inferentially based statistical significance clearly provides an effective measure of the magnitude of change, it has no provision for an analysis of the relevance of change (Jacobson, Follette, & Revenstorf, 1984; Jacobson & Revenstorf, 1988; Jacobson & Truax, 1991; Kazdin, 1977; Kendall & Grove, 1988; Wolf, 1978). Other methods, such as social validity (Kazdin, 1977; Wolf, 1978), normative comparison (Kendall & Grove, 1988), and statistically derived clinical significance (Jacobson et al., 1984; Jacobson & Revenstorf, 1988; Jacobson & Truax, 1991), have been developed to ameliorate the interpretive difficulties of reliance on statistical significance alone (cf. Hansen & Lambert, 1996).

Jacobson's clinical significance methodology provides a means whereby individual outcome can be assessed in a manner sensitive to both the magnitude and the relevance of changes made. The rationale is based on two specific statistical indexes (a cutoff point between the normal and dysfunctional distributions and an evaluation of the reliability of the pre- to posttreatment change score) that provide a specific guideline for interpreting the meaningfulness of treatment results.

This approach to clinical significance is not without limitations. First, Jacobson et al. (1984) were remarkably vague in what they meant by a "functional group" (Saunders, Howard, & Newman, 1988). Kazdin's (1977) original concept of social comparison suffered from this same limitation. Tingey, Lambert, Burlingame, and Hansen (1996) argued that this problem will take care of itself if instruments with adequate norms across multiple symptom levels are used. Such well-normed instruments can provide specific operationalizations of functional and dysfunctional populations. However, this possible remedy is still subject to the psychometric sophistication of the reference instrument and to possible floor and ceiling effects in these instruments. Floor and ceiling effects are especially problematic because many instruments are heavily weighted toward pathology, making any administration to a "normal" sample troublesome (Lambert & Hill, 1994). In addition, patients who have the highest level of pathology have the greatest opportunity to show positive changes (Mintz & Keisler, 1982). This introduces the problem that some of the change reflected in pre- minus posttreatment differences is the result of regression to the mean (Speer & Greenbaum, 1995).

Second, the Jacobson and Truax (1991) methodology has been criticized on the basis of its somewhat simplistic comparison of only two sample populations.
Wampold and Jenson (1986) argued that arbitrarily using two distributions assumes that the distribution of the population in question is bimodal, which may or may not be the case. Consequently, they argued that this is not necessarily an accurate index of clinically significant change. Further, Hollon and Flick (1988) argued that if Wampold and Jenson's (1986) criticism is valid, the calculation of cutoff points that will accurately reflect clinically meaningful change and be stable across different samples will be difficult, if not impossible. Kendall and Grove (1988) as well as Blanchard and Schwartze (1988) proposed that a possible remedy of this limitation is to consider symptoms as a dimensional continuum rather than as a pair of bimodal distributions. Tingey et al. (1996) adopted this strategy by establishing a "continuum of dysfunction" as rated by the Symptom Checklist-90-Revised (SCL-90-R; Derogatis, 1983). Tingey et al.'s continuum was separated into four statistically distinct groups: asymptomatic, mildly distressed, moderately distressed, and severely distressed. They argued that these distributions provide more descriptive information about client change. Other researchers (Condon, 1994; C. T. Grundy, 1994; L. M. Grundy, 1994; Segger, 1994) have used this same approach with other popular instruments: the Hamilton Rating Scale for Depression (Hamilton, 1960, 1967), the State-Trait Anxiety Inventory (STAI; Spielberger, 1983; Spielberger, Gorsuch, & Lushene, 1970), the Beck Depression Inventory (BDI; Beck, Ward, Mendelson, Mock, & Erbaugh, 1961), and the Child Behavior Checklist (Achenbach & Edelbrock, 1983).

Although specific methodological issues relating to clinical significance have been questioned, there have been relatively few attempts to validate this approach empirically.
Only two studies (Ankuta & Abeles, 1993; Speer & Greenbaum, 1995) to date have attempted to compare outcome interpretations derived by this methodology with other methodologies. Speer and Greenbaum (1995) compared Jacobson and Truax's (1991) methodology with four other pre- and posttreatment difference methods. Unfortunately, all of these methods are based on similar underlying statistical premises; consequently, although their study provides important results differentiating these similar approaches, it fails to address the more basic questions of the face and construct validity of the methodology itself.

Ankuta and Abeles (1993) addressed the question more directly. They compared clients who demonstrated clinically significant improvement according to Jacobson and Truax's (1991) methodology (as measured on the SCL-90-R; Derogatis, 1983) with the client's own perceived satisfaction with therapy. They operationalized "satisfaction" as extent of self-reported change resulting from therapy (as measured on Strupp, Fox, and Lessler's, 1969, Patient Questionnaire). They found that clients designated as having experienced "clinically significant" improvement did indeed report higher levels of satisfaction than those experiencing "nonclinically significant" change. This provides important initial evidence of the validity of Jacobson and Truax's methodology; however, the study did not address a number of important questions.

First, although the introduction of their report highlighted the importance of multiple perspectives in outcome assessment, they did not actually implement Strupp and Hadley's (1977) recommendations to test Jacobson and Truax's (1991) methodology. Rather, their entire analysis was based on the clients' self-reported perspective. Inclusion of the therapist as well as a significant other would provide a more comprehensive picture. "Measurements from the therapist's and the independent observer's perspectives would increase the generalizability of the findings" (Ankuta & Abeles, 1993, p. 74). Second, Ankuta and Abeles (1993) considered only one possible outcome indicator. They evaluated "satisfaction" (which might be labeled "perceived change" within their study). Although perceived change is an important outcome indicator, there are others. For example, increasing attention is being given to satisfaction with services as an outcome indicator (Attkisson & Zwick, 1982; Ogles, Lambert, & Masters, 1996; Seligman, 1995). Third, Ankuta and Abeles (1993) considered only clinically significant improvement and did not evaluate the possibility of clinical deterioration.
Questions regarding the appropriateness of using the Jacobson and Truax (1991) criteria to detect deterioration have not been addressed to date.

The Jacobson and Truax (1991) method has been questioned as being too conservative (i.e., the amount of change required to meet both the cutoff point requirement and the reliable change index [RCI] requirement is excessive; Ogles, Lambert, & Sawyer, 1995; Tingey et al., 1996). For example, a strict application of the two-requirement methodology (i.e., cutoff and RCI) makes clinically significant improvement impossible for an individual whose pretreatment score is lower than the cutoff between the functional and dysfunctional groups. This implies that a "mildly symptomatic" individual (whose pretreatment assessment is lower than the cutoff between the functional and dysfunctional populations) can never make clinically significant improvement. Likewise, a severely symptomatic individual could experience a huge reduction in symptom distress as reflected in statistically reliable change and yet, no matter how great the magnitude of that change, if "normalization" is not reached, then a conclusion of clinically significant improvement for this individual is not warranted.

Admirable steps have been made to evaluate clinical significance, but a number of important questions persist. For example, are reliable changes, as defined by Jacobson and Truax's (1991) RCI, also noticeable or meaningful to clients, their spouses or significant others, and their therapists? To what degree will changes deemed as meaningful to the individual client be observable to the spouse or significant others and therapists? How do the RCI criteria, based on symptomatic improvement, match up with other important indexes of outcome such as satisfaction with services and therapeutic alliance? Will the global retrospective opinions of clients, therapists, and significant others "agree" with the clients' self-reported symptom change scores?
This would directly evaluate the use of the RCI as a sole indicator of clinical significance.

The present study examined the validity of the RCI component of Jacobson and Truax's (1991) methodology using multiple perspectives and multiple correlates of outcome. Outpatient participants from a community mental health center were categorized (using the RCI) into improved, no change, or deteriorated groups based on their pre- to posttreatment difference scores on a criterion outcome instrument. Clients, their therapists, and their significant others were asked to rate the effects of therapy using multiple variables, including perceived change, therapeutic alliance, and satisfaction with services.
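The two-requirement logic criticized above (a cutoff between the functional and dysfunctional distributions plus a reliable change requirement) can be illustrated with a minimal sketch. It assumes the commonly cited form of Jacobson and Truax's cutoff, in which each group mean is weighted by the other group's standard deviation, and a scale on which higher scores indicate more pathology. The function names and all numeric values are hypothetical and are not taken from the present study.

```python
def cutoff_c(mean_dys: float, sd_dys: float,
             mean_fun: float, sd_fun: float) -> float:
    """Cutoff between the dysfunctional and functional distributions.

    Assumed form: each mean is weighted by the other group's SD.
    """
    return (sd_fun * mean_dys + sd_dys * mean_fun) / (sd_fun + sd_dys)


def clinically_significant_improvement(pre: float, post: float,
                                       cutoff: float,
                                       rc_threshold: float) -> bool:
    """Apply both requirements on a scale where higher = more pathology.

    rc_threshold is the raw-score drop needed for reliable change.
    """
    reliable = (pre - post) >= rc_threshold       # requirement 1: reliable change
    crossed = pre >= cutoff and post < cutoff     # requirement 2: crossed the cutoff
    return reliable and crossed


if __name__ == "__main__":
    # Hypothetical norms and threshold, for illustration only.
    c = cutoff_c(mean_dys=80.0, sd_dys=20.0, mean_fun=45.0, sd_fun=15.0)
    threshold = 20.0
    for label, pre, post in [("moderate", 85, 50),
                             ("mildly symptomatic", 55, 30),
                             ("severe", 150, 90)]:
        print(label, clinically_significant_improvement(pre, post, c, threshold))
```

Under these assumed values, the mildly symptomatic client starts below the cutoff and the severe client never crosses it, so neither can be counted as clinically significantly improved even though both changed by more than the reliable change threshold. This is the conservatism the critics describe.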

Method

Participants

Participants in the study included adult outpatients, their spouses or significant others, and their therapists. The outpatients were drawn from a community mental health center and represented a range of psychological disturbance and severity.

Clients. Fifty-two adult outpatients (35 women, 17 men) from a midwestern community mental health center participated in the study. All of the participants were Caucasian. The clients were primarily of low economic status and received services on an income-adjusted basis. The clients' mean yearly income (based on self-reported monthly income) was less than $8,000 (M = $7,470, SD = $5,350). Their average age was 32.59 years (SD = 11.33). The participants represent a range of Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM-IV; American Psychiatric Association, 1994) diagnoses: 38.7% dysthymia, 22.6% major depression, 16.1% adjustment disorder, 6.5% substance abuse, 6.4% personality disorder, and 9.7% miscellaneous other diagnoses. These percentages represent the primary diagnosis for billing purposes; however, in several cases there was a deferred diagnosis on Axis II. The diagnosticians in the present study were master's-level counselors who were under supervision of a licensed clinical psychologist. Approximately half of the clients (n = 28, or 55%) were taking a psychotropic medication prescribed for their disorder.

Therapists. Eight therapists, four men and four women, participated in the project. The average age of the therapists was 31.25 years (SD = 3.62 years). The therapists represented a range of training, including master's-level counselors, clinical psychology doctoral students, and a doctoral-level clinical psychologist. The average number of years of training was 3.88 (SD = 1.36). The average years of clinical experience, including training practicum, was 4.13 (SD = 1.81).
The number of clients rated by each therapist ranged from 18 (35% of total) to 1 (1.9% of total). Four therapists accounted for 88.9% of the clients rated. A chi-square analysis showed that there were no significant differences in outcome groupings of clients by therapist, χ²(18, N = 52) = 14.95, p < .38 (values were not significant at .05).

Spouse or significant other. Thirty-nine spouses or significant others (18 women and 21 men) participated in the study. Their average age was 34.7 years (SD = 13.5 years).

Instruments

The Outcome Questionnaire (OQ-45; Lambert, Lunnen, Umphress, Hansen, & Burlingame, 1994) was used as the criterion instrument for classification of participants into the respective outcome groups (i.e., improvers, no-changers, or deteriorators). The Patient Questionnaire (Q-P; Strupp et al., 1969) and the Therapist Questionnaire (Q-T; Strupp et al., 1969) were used to measure degree of perceived change from the clients' and therapists' perspectives, respectively. The Helping Alliance Questionnaire (HAq; Alexander & Luborsky, 1986) was used to assess therapeutic alliance for clients and therapists. The Client Satisfaction Questionnaire (CSQ-8; Larsen et al., 1979) was used to rate client and spouse or significant other satisfaction.

Criterion instrument. The OQ-45 (Lambert et al., 1994) was used as the criterion instrument. It was designed to measure patient progress in therapy by repeated administration during the course of treatment and at termination. Patient progress is measured based on Lambert's (1983) conceptualization suggesting that three aspects of the patient's life should be monitored: (a) subjective discomfort (intrapsychic functioning), (b) interpersonal relationships, and (c) social role performance. Items address commonly occurring problems across a wide variety of disorders and tap the symptoms most likely to occur. The items also measure personally and socially relevant characteristics that affect the individual's quality of life. Each item is scored on a 5-point scale (0 = never, 1 = rarely, 2 = sometimes, 3 = frequently, 4 = almost always), yielding a range of possible scores of 0 to 180. The OQ-45 provides a total score (TOT) as well as three subscale scores. The OQ-45 TOT, which provides a global assessment, was used in the present study. Higher values indicate the endorsement of pathology. Lambert et al. (1994) reported adequate internal consistency for the TOT (α = .93). The 3-week test-retest value for the TOT is also satisfactory (r = .84; Lambert et al., 1994; Lambert, Hansen, et al., 1996).
Concurrent validity figures, as estimated by correlating the TOT with the SCL-90-R (Derogatis, 1983), BDI (Beck et al., 1961), Zung Self-Rating Depression Scale (Zung, 1965), Zung Self-Rating Anxiety Scale (Zung, 1971), Taylor Manifest Anxiety Scale (Taylor, 1953), STAI (Spielberger, 1983; Spielberger, Gorsuch, & Lushene, 1970), Inventory of Interpersonal Problems (Horowitz, Rosenberg, Baer, Ureno, & Villesenor, 1988), and the Social Adjustment Scale (Weissman & Bothwell, 1976), were all significant at the .01 level. Normative information based on data collected in a western state for the OQ-45 has been reported (Lambert et al., 1994; Lambert, Burlingame, et al., 1996; Umphress, Lambert, Smart, Barlow, & Clouse, 1997).

Perceived change. Symptomatic change represents perhaps the most obvious dimension of outcome evaluation (Ankuta & Abeles, 1993; Lambert & Hill, 1994; Ogles et al., 1996). Clients usually enter treatment as a result of discomfort (for themselves or others) associated with certain pathological symptoms. Successful treatment by definition should ameliorate these symptoms. Consequently, comparing the extent of change rated by multiple perspectives retrospectively with the RCI-based classification of pre- to postchange would provide a powerful test of the validity of this portion of the Jacobson and Truax (1991) methodology.

The Q-P (Strupp et al., 1969) and the Q-T (Strupp et al., 1969) were chosen to address perceived symptomatic change. The Q-P consists of 89 items designed to provide a comprehensive view of the client's subjective therapeutic experience. The instrument has six usable measurement clusters:

1. Therapist's warmth
2. Amount of change
3. Present adjustment/current status
4. Amount of change apparent to others
5. Therapist interest, integrity, and respect
6. Degree of disturbance before therapy

Cluster 2 was used in the present study. It contains four items (numbers 18, 19, 73, and 79) targeted toward the degree of benefit the clients feel they have received from therapy: (a) "How much have you benefited from your therapy?"; (b) "Everything considered, how satisfied are you with the results of your psychotherapy experience?"; (c) "How much do you feel you have changed as a result of psychotherapy?"; (d) "To what extent have your complaints or symptoms that brought you to therapy changed as a result of your treatment?" This cluster most specifically targets the clients' views on their therapeutic experience and is consequently the most relevant to the present study. This cluster has a range of −10 to 9; higher values indicate greater degree of self-reported change. The item-scale correlations of the items in this cluster range from .48 to .84 (Strupp et al., 1969). The internal consistency for Cluster 2 was .84 (Ankuta & Abeles, 1993). Cluster 2 has been used as a stand-alone assessment of perceived change in several studies (Ankuta & Abeles, 1993; Eaton, Abeles, & Gutfreund, 1988; Lichtenstein, 1985).

Clusters 1 and 5 were excluded because they concern therapist variables, which are either irrelevant to the present study or redundant with the therapeutic alliance instrument. Similarly, Cluster 3 was redundant with the posttherapy administration of the criterion instrument. Cluster 4 was excluded because it required the client to speculate as to the opinions or feelings of individuals other than himself or herself. Finally, Cluster 6 has questionable reliability and validity because it requires individuals to appraise their level of functioning weeks or even months earlier.

A modified Q-P Cluster 2 was used to evaluate spouses' or significant others' views on how much they feel the client has changed through therapy. The modification consisted of altering the items from first to third person.
For example, Item 73 on the client version reads, "How much do you feel you have changed as a result of psychotherapy?"; on the modified version, it was altered to read "How much do you feel your spouse or significant other has changed as a result of psychotherapy?" The internal consistency for the modified scale, calculated from data in the present study, was acceptable (α = .87).

The Q-T (Strupp et al., 1969) evaluated the therapist's impressions of the extent of change the client experienced in treatment. The Q-T is composed of 23 items designed to parallel roughly the Q-P content areas. The Q-T was also subjected to a cluster analysis resulting in five usable clusters:

1. Therapy success (symptomatic improvement)
2. Remaining disturbance
3. Warmth of patient-therapist relationship
4. Patients' capacity for intensive therapy
5. Adjustment before therapy

Cluster 1, composed of Items 12, 13, and 17, was used to measure the therapists' view of symptomological improvement and the overall "success" of treatment. These items yield a range of 0 to 12; higher values indicate a greater degree of observed improvement. No internal consistency coefficient is reported for Cluster 1 in the original report. Cronbach's α based on data in the present study was .85.

Satisfaction. Client satisfaction with services can also serve as an important indication of the effects of psychotherapy (Alexander & Luborsky, 1986; Ciarlo, Edwards, Kiresuk, Newman, & Brown, 1981; Corcoran & Fischer, 1987; Kazdin, 1977; Larsen, Attkisson, Hargreaves, & Nguyen, 1979; Ogles et al., 1996; Wolf, 1978). A common way to measure client satisfaction is to administer postservice questionnaires. Their use is fairly widespread across mental health institutions, businesses, and other service organizations to gauge the extent of consumer satisfaction with the goods or services provided (Ogles et al., 1996). These instruments range from throw-together, homemade tests to lengthy, psychometrically sophisticated devices. Higher levels of satisfaction with therapy are reported for those with "positive" treatment outcomes (Ihilevich & Gleser, 1979; Kazdin, 1977, 1993; Wolf, 1978). Consequently, those classified by Jacobson and Truax's (1991) RCI as having made "reliable improvement" should report a significantly greater degree of satisfaction with treatment than groups classified as "no change" or significantly "deteriorated."

The CSQ-8 (Larsen et al., 1979) assessed satisfaction with services. It is a brief, eight-item instrument designed to assess postservice satisfaction. It has been demonstrated to have adequate psychometric properties and has been reviewed favorably by several independent sources (Ciarlo et al., 1981; Corcoran & Fischer, 1987).
Various tests of its internal consistency have yielded alpha levels ranging from .86 to .94 (Corcoran & Fischer, 1987). As evidence of its concurrent validity, scores on the CSQ-8 were found to be highly correlated with clients' ratings of global improvement of symptomatology and therapists' ratings of clients' progress and likability. Further, CSQ-8 scores are also correlated with dropout rates; clients with lower satisfaction have higher dropout rates. The CSQ-8 is scored by simply summing the individual item scores to produce a range of 8 to 32; higher scores indicate a greater degree of satisfaction. A modified CSQ-8 (items reworded from first to third person) was used to assess satisfaction with treatment from the spouse or significant other's perspective.

Therapeutic alliance. Goldfried, Greenberg, and Marmar (1990) described therapeutic alliance as the "overarching general process variable that relates to outcome" (p. 670). Therapeutic alliance has been measured by examining the client's and therapist's individual contributions to therapy and the dynamics of their interaction. The primary areas of focus are the measurement of the interaction in terms of the therapist-client bond and congruence on the tasks and goals of therapy (Lambert & Hill, 1994). Positive treatment outcomes have been associated with high degrees of therapeutic alliance (Garfield, 1994; Lambert & Hill, 1994; Ogles et al., 1996). Consequently, individuals classified by Jacobson and Truax's (1991) RCI as having made reliable improvement would be expected to report a higher degree of therapeutic alliance than those classified as having made "no change" or having "deteriorated."

The HAq (Alexander & Luborsky, 1986) was chosen to assess both client and therapist views of the therapeutic relationship. It consists of 19 items rated on a 6-point scale (1 = strongly disagree, 2 = disagree, 3 = slightly disagree, 4 = slightly agree, 5 = agree, 6 = strongly agree). The HAq was designed with parallel forms for completion by the client and the therapist. A total degree of alliance from each perspective is derived by summing the ratings. Alexander and Luborsky (1986) reported adequate psychometric properties for the HAq. Investigations of the interrater reliability for the HAq resulted in values of .90 to .94 for the patient version and .91 to .93 for the therapist version (Luborsky, Barber, Siqueland, & Johnson, 1996). Test-retest reliability figures were .79 for the patient version and .57 for the therapist version. Concurrent validity was estimated by correlating HAq scores with the California Therapeutic Alliance Rating System (Marmar, Horowitz, Weiss, & Marziali, 1986; Marmar, Weiss, & Gaston, 1989), yielding values of .51 to .71 for the patient version and .72 to .80 for the therapist version. It was also found that the HAq effectively (r = .51 to .72) predicted outcome in veterans with drug-dependency problems (Alexander & Luborsky, 1986).

Table 1 provides a summary of the instruments used in the present study classified by outcome domain and perspective source.

Procedures

During the initial financial intake at the community mental health center, patients were asked to participate in an ongoing outcome evaluation project by agreeing to complete the OQ-45 before each therapy session. Fewer than half (46.6%) of the potential participants (i.e., individuals who completed the OQ-45 during the financial intake) either did not participate or were excluded from the study because they did not, for one reason or another, complete the OQ-45 before their subsequent therapy sessions. Therapists at the center were asked to participate with the understanding that any data collected would not be used to assess their performance.
As many spouses or significant others of patients as possible were also included (they were contacted and asked to participate via letter). "Significant other" was defined as any adult living in the same residence as the client who knows he or she is in therapy.

Patients for the study were selected from the total case database on the basis of change in the level of symptomatic distress as measured by session-to-session scores on the OQ-45. Clients who demonstrated reliable change (according to Jacobson and Truax's, 1991, RCI criteria) were mailed a packet containing the Q-P, CSQ-8, and HAq for themselves to complete and a packet containing the MQ-P and the MCSQ-8 for their spouse or significant other to complete. At approximately the same time as the mailing, each client's therapist was asked to complete a packet containing the Q-T and the therapist version of the HAq. The reliable changers were separated into two groups: Those whose change represented symptom reduction were classified as the improvers and those whose change represented symptomological increase were classified as the deteriorators. No-changers included clients whose OQ-45 scores did not reliably change on two or more administrations of the instrument. Each no-changer, their therapist, and their spouse or significant other were asked to complete the same instruments, respectively (without knowing the clients' outcome grouping).

In hopes of increasing participation, clients and spouses or significant others were given $5 for completing their respective questionnaires. We mailed 133 packets, of which 8 were returned with no forwarding address; of the remaining 125, 53 (42.4%) of the client forms were returned and 40 (32%) of the spouse or significant other forms were returned. Chi-square analyses were conducted to evaluate possible differences in return rates by outcome group for the client and spouse or significant other. Results indicate no differences in return rate for the three groups: client, χ²(2, N = 52) = 2.30, p < .31; spouse or significant other, χ²(2, N = 39) = 4.47, p < .11 (values were not significant at .05). Data in the present study were collected over a 10-month period during the 1995-1996 fiscal year.

Reliable change was calculated on an individual basis, participant by participant, using the reliable change criteria recommended by Jacobson and Truax (1991). Individuals made a reliable change if they met or exceeded Jacobson and Truax's RCI. For the OQ-45 TOT, a 21-point difference is required to make clinically reliable change. Table 2 provides exact figures for each of the components in the RCI formula. For reliable changers (both improvers and deteriorators), the number of sessions used in the pre- and postdetermination ranged from 2 to 14 (improvers: M = 3.29, SD = 2.87; deteriorators: M = 3.08, SD = 2.22). For no-changers, the number of sessions ranged from 2 to 12 (M = 2.47, SD = 1.61). No significant differences existed between groups in terms of the number of sessions before the mailing (p < .629).
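The classification rule described above can be sketched in a few lines. The sketch assumes the commonly cited form of the Jacobson and Truax (1991) index, RCI = (post − pre) / S_diff, with S_diff = sqrt(2 × S_E²) and S_E = SD × sqrt(1 − r), and a two-tailed criterion of 1.96. The 21-point default is the OQ-45 TOT figure reported in the text; the standard deviation passed to `reliable_change_threshold` below is hypothetical, because the Table 2 components are not reproduced here. Function names are illustrative only.

```python
from math import sqrt


def reliable_change_threshold(sd: float, reliability: float,
                              z: float = 1.96) -> float:
    """Raw-score difference needed for reliable change (assumed formula)."""
    s_e = sd * sqrt(1.0 - reliability)   # standard error of measurement
    s_diff = sqrt(2.0 * s_e ** 2)        # standard error of the difference
    return z * s_diff


def classify(pre: float, post: float, threshold: float = 21.0) -> str:
    """Assign an outcome group; higher OQ-45 scores mean more pathology."""
    change = post - pre
    if change <= -threshold:    # met or exceeded the threshold, symptoms fell
        return "improver"
    if change >= threshold:     # met or exceeded the threshold, symptoms rose
        return "deteriorator"
    return "no-changer"


if __name__ == "__main__":
    # Hypothetical SD; r = .84 is the test-retest value quoted in the text.
    print(round(reliable_change_threshold(sd=20.0, reliability=0.84), 2))
    for pre, post in [(90, 69), (90, 70), (60, 81)]:
        print(pre, post, classify(pre, post))
```

Keeping the threshold as a parameter separates the psychometric inputs (standard deviation and reliability) from the grouping rule, so the same classification can be reused with another instrument.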

Results

The results are divided into four analyses. The first considers the demographic characteristics of the sample and evaluates possible differences across groups. The three subsequent analyses evaluate each of the three individual outcome perspectives (i.e., client, therapist, and spouse or significant other). Because three planned comparisons were conducted for each of the outcome perspectives, a Bonferroni correction is necessary to reduce the likelihood of making a Type I error. With the correction (.05 divided by the three comparisons, rounded), the appropriate alpha level indicating statistical significance is .02. In addition, because the planned analyses were specified a priori and used conceptually independent dependent variables, no aggregate multivariate analyses were conducted.

Demographic Analysis

To minimize possible confounds associated with differences between groups, clients were evaluated across several demographic and treatment dimensions: (a) gender, (b) age, (c) income, (d) DSM-IV diagnosis, (e) therapist, (f) medication use, (g) pretherapy severity of symptoms (as measured by the OQ-45 completed at intake), (h) number of sessions between pre- and postadministration of the OQ-45, and (i) time (in days) between the pre- and postadministration of the OQ-45. Analyses of variance (ANOVAs) were conducted using age, income, pretherapy severity of symptoms, number of sessions between pre- and post-OQ-45, and number of days between pre- and post-OQ-45 as the dependent variables and RCI-based outcome groupings as the independent variable. There were no significant differences between the three groups on any of these dimensions. Table 3 provides the results of the univariate analyses. For the categorical demographic variables (gender, DSM-IV diagnosis, therapist, and medication use), chi-square analyses were conducted. As with the other demographic variables, there were no differences across groups on gender, χ²(2, N = 52) = 1.83, p = .401; DSM-IV diagnosis, χ²(12, N = 52) = 15.84, p = .199; therapist, χ²(18, N = 52) = 17.46, p = .492; or medication use, χ²(2, N = 52) = 1.62, p = .445 (no values significant at .05). These results indicate that there are no substantive differences between the clients in the three outcome groups on these potentially confounding variables.

Evaluation of Client Perspective

To evaluate possible differences in client data across the three outcome groups, three a priori planned comparisons were conducted within three one-way ANOVAs (improvers, no-changers, deteriorators) using the Q-P, HAq, and CSQ-8 as dependent variables. The cell sizes for each outcome group are listed in Table 4. The first planned comparison evaluated the possible differences between the improvers and no-changers.
For this contrast, the two groups differed on perceived change and therapeutic alliance but not on satisfaction. The second contrast compared the improvers group with the deteriorators group. With the adjusted alpha level, the univariate analyses indicate that the two groups differed significantly on the Q-P and HAq but not on the CSQ-8. The third comparison assessed possible differences between the no-changers and the deteriorators. The univariate analyses indicate that the two groups did not differ significantly on any of the dependent variables.

Evaluation of Therapist Perspective

To evaluate possible differences in therapist data across the three outcome groups, three a priori planned comparisons were conducted within two one-way ANOVAs (improvers, no-changers, deteriorators) using the Q-T and the HAq as dependent variables. Univariate analyses contrasting improvers with no-changers indicated that the two groups differed significantly on the Q-T but not on the HAq. When comparing the improvers with the deteriorators, univariate analyses indicated that the two groups differed significantly on both the Q-T and the HAq. In contrast, a comparison of the no-changers with the deteriorators found no significant differences on either of the dependent variables.

Evaluation of Spouse or Significant Other Perspective

The final analysis examined differences in spouse or significant other data across the three outcome groups. Again, three a priori planned comparisons were conducted within two ANOVAs (improvers, no-changers, deteriorators) using the modified Q-P and the modified CSQ-8 as the dependent variables. In this analysis, no significant differences were identified in any of the three contrasts (improvers vs. deteriorators, improvers vs. no-changers, and no-changers vs. deteriorators). Table 5 provides a summary of the means and standard deviations for each of the dependent variables and lists the significance levels of all the contrasts across all perspectives.
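
The alpha adjustment and the chi-square probabilities reported in the demographic analysis can be checked with a few lines of code. The following Python sketch is illustrative only and is not part of the original analyses: the function names are hypothetical, the familywise alpha of .05 is assumed from the reported significance criterion, the familywise error calculation assumes independent tests, and the chi-square tail probability uses a closed-form expression that holds only for even degrees of freedom (which covers all four reported tests).

```python
import math


def bonferroni_alpha(family_alpha: float, n_comparisons: int) -> float:
    """Per-comparison alpha under a Bonferroni correction."""
    return family_alpha / n_comparisons


def familywise_error(alpha: float, n_comparisons: int) -> float:
    """Chance of at least one Type I error across independent tests run at `alpha`."""
    return 1.0 - (1.0 - alpha) ** n_comparisons


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square statistic (closed form, even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum (x/2)^k / k! for k = 0 .. df/2 - 1, building each term from the last.
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Test statistics and degrees of freedom as reported in the text (N = 52).
reported = {
    "gender": (1.83, 2),
    "DSM-IV diagnosis": (15.84, 12),
    "therapist": (17.46, 18),
    "medication use": (1.62, 2),
}
p_values = {name: chi2_sf_even_df(x, df) for name, (x, df) in reported.items()}

if __name__ == "__main__":
    print(f"Bonferroni alpha for 3 contrasts: {bonferroni_alpha(0.05, 3):.4f}")
    print(f"Uncorrected familywise error:     {familywise_error(0.05, 3):.4f}")
    for name, p in p_values.items():
        print(f"{name}: p = {p:.3f}")
```

Under these assumptions, the recomputed probabilities round to the values given in the text (.401, .199, .492, and .445), and .05/3 rounds to the .02 criterion used for the planned comparisons.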

Discussion

The aim of this project was to evaluate the utility of the RCI as an indicator of clinically significant or meaningful change. Outcome groups identified on the basis of reliable changes in session-by-session symptom ratings were compared using a multivariable, multiperspective methodology. There were several interesting findings. Improvers (those who made reliable reductions in symptomatology) were in fact distinguishable from no-changers and deteriorators in terms of client and therapist ratings of perceived change and the helping alliance. On the other hand, deteriorators (those who made reliable increases in symptomatology) were not significantly different from no-changers using any perspective or any measure. The spouse or significant other perspective failed to differentiate between any of the outcome groups on any of the outcome variables. Finally, with the exception of a single analysis of the six conducted (client perspective, improvers vs. deteriorators), satisfaction failed to differentiate between outcome groups.

Across two of the three perspectives and two of the three outcome dimensions, improvers were distinguishable from no-changers and deteriorators. Both clients' and therapists' retrospective evaluations of change and alliance concurred with the reliable change-based groupings. This finding contradicts the purported lack of congruence between prospective and retrospective ratings of change described elsewhere (Howard et al., 1996).
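
For readers who wish to see how RCI-based groupings of this kind are formed, the following Python sketch applies Jacobson and Truax's (1991) formula, in which the pre-post difference is divided by the standard error of the difference between two scores. It is a sketch under stated assumptions rather than the scoring procedure used in this study: the standard deviation (20) and reliability (.90) are hypothetical placeholders, not OQ-45 norms, and the function names are invented for illustration. Higher scores are assumed to indicate greater symptomatology.

```python
import math


def reliable_change_index(pre: float, post: float, sd: float, reliability: float) -> float:
    """RCI = (post - pre) / S_diff, following Jacobson and Truax (1991)."""
    se = sd * math.sqrt(1.0 - reliability)  # standard error of measurement
    s_diff = math.sqrt(2.0 * se ** 2)       # standard error of a difference score
    return (post - pre) / s_diff


def classify(pre: float, post: float, sd: float, reliability: float,
             z_crit: float = 1.96) -> str:
    """Assign an outcome group; higher scores are assumed to mean more symptoms."""
    rci = reliable_change_index(pre, post, sd, reliability)
    if rci < -z_crit:
        return "improver"        # reliable reduction in symptoms
    if rci > z_crit:
        return "deteriorator"    # reliable increase in symptoms
    return "no-changer"


if __name__ == "__main__":
    SD, RELIABILITY = 20.0, 0.90  # hypothetical placeholder values
    for pre, post in [(80, 60), (80, 70), (60, 80)]:
        print(pre, post, classify(pre, post, SD, RELIABILITY))
```

With these placeholder values, a change of about 17.5 points in either direction would be required to count as reliable; actual thresholds depend on the normative standard deviation and reliability of the instrument in question.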


The client-reported change data replicate the findings of Ankuta and Abeles (1993), who examined the perceived level of change (rated retrospectively) for clients grouped using Jacobson and Truax's (1991) methodology. They concluded that clients designated as having experienced "clinically significant" improvement did indeed report more change than those experiencing "nonclinically significant" change. The current replication extends Ankuta and Abeles's (1993) findings, which were based on data collected in a university counseling center, to a more heterogeneous sample from a community mental health center. In addition, the present study suggests that therapists make similar retrospective ratings. Therapists reported greater amounts of change for improvers than for clients who did not change or who deteriorated. Pekarik and Wolff (1996) also found a relationship between therapist ratings of change and outcome classifications.

As with the data regarding perceived change, levels of therapeutic alliance were rated as significantly different for improvers versus no-changers and deteriorators by both clients and therapists. These results replicate the findings of many other studies (Gaston, 1991; Gomes-Schwartz, 1978; Horvath & Greenberg, 1989) that indicate a consistent relationship between alliance and outcome. Studies report moderate to strong correlations between alliance and outcome (Henry, Strupp, Schacht, & Gaston, 1994). Horvath and Symonds (1991) found that clients' perspective of alliance was more highly correlated with outcome than was therapist-rated alliance. Similarly, Henry et al. (1994) proposed that therapists may have a "blind spot" when rating alliance. The data in the present study, however, suggest that therapists' ratings of alliance match categorical outcome groupings, at least when comparing improvers with no-changers and deteriorators.
One of the most interesting findings in this study was that deteriorators were indistinguishable from no-changers. Significant differences between no-changers and deteriorators were found in only one of the seven contrasts (client-rated satisfaction). This suggests that clients who report an increase in session-by-session symptomatology on the reference instrument (OQ-45) retrospectively report levels of change, alliance, and satisfaction comparable to those of clients who report no session-by-session symptom change. This raises the question of whether reductions in symptoms are necessarily the ruler individuals use to make judgments about amount of change, quality of the alliance, and treatment satisfaction. Sloan, Staples, Cristol, Yortson, and Whipple (1975) and Pekarik and Wolff (1996), among others, argued that there are a variety of seemingly innocuous factors by which clients rate outcome independent of problem reduction (e.g., the physical attractiveness of the clinic, the friendliness of the support staff, the availability of parking).

There are several other possible explanations for this failure to differentiate between the no-change and deterioration groups. First, there is some question about the availability of instruments with the sensitivity to measure deterioration. This problem is compounded by the use of global, retrospective devices that inevitably provide a greater variance in favor of positive rather than negative effects (Mohr, 1995). Second, well-documented findings indicate a propensity on the part of clients and therapists to collude in overestimating treatment efficacy. Studies repeatedly suggest that therapists in particular consistently fail to identify individuals who deteriorate (Lambert, Bergin, & Collins, 1977; Mohr, 1995). Consequently, there is a relative dearth of information concerning those who actually experience negative changes during therapy.
Lambert and Bergin (1994), among others, agreed that psychotherapy can potentially be a negative experience for some people and even harmful for others:

"After reviewing the empirical literature and critiques of the evidence accumulated, it is our view that psychotherapy can and does harm a portion of those it is intended to help. The study of negative change has important implications for the selection of clients for treatment, the suitability of specific procedures for some clients, and the selection, training, and monitoring of therapists." (p. 176)

The finding in the present study that approximately 10% of the total number of available participants actually worsened during treatment is consistent with other reported findings. Lieberman, Yalom, and Miles (1973) reported a deterioration rate of 10% among clients. In their reanalysis of the National Institute of Mental Health Collaborative Depression Study, Ogles et al. (1995) found that from 2% to 13% (M = 6%) of participants worsened.

Of course, the finding that some clients report an increase in symptoms while in treatment does not indicate that the deterioration was caused by the treatment. A variety of factors may contribute to an increase in clients' level of distress independent of the therapy. Lambert and Bergin (1994) reported that patient diagnosis and degree of disturbance are significantly related to deterioration. The findings of multiple studies suggest that the more severe the pathology, the greater the likelihood of deterioration (Horowitz, 1974; Lieberman et al., 1973; Weber, Elinson, & Moss, 1965). These studies considered pathologies typically resulting in inpatient services. In contrast, Strupp et al. (1969) examined an outpatient sample and found that deteriorators were not significantly more disturbed at the beginning of therapy than patients who subsequently had more positive outcomes. The same was true in the present study: The deteriorators did not exhibit greater pretherapy pathology (as measured on the OQ-45) than the other two outcome groups.

Deteriorators are an interesting population who warrant further study. Little new information will be gathered until more sophisticated methods of assessing these negative effects are developed. Lambert and Bergin (1994) argued that it is psychotherapists' responsibility to "be sensitive to both the positive and negative effects of therapy and base our treatment efforts on a broad empirical foundation" (p. 182).
Compliance with this duty is impossible without further investigations into the nature, extent, and consequences of these negative effects (Mohr, 1995).

The spouse or significant other perspective has long been neglected in psychotherapy research (Strupp & Hadley, 1977; Lambert & Hill, 1994). For several reasons, the gathering of such data is problematic. From a purely pragmatic standpoint, it is often difficult for researchers to obtain information from the therapeutic dyad itself, let alone from a noninvolved party. Furthermore, if the researcher is able to tap this population, he or she must walk an ethical tightrope involving confidentiality. The information that can be gathered is limited to global queries, because more specific ones could breach client-therapist confidentiality.

Alone among the three perspectives, spouses or significant others reported outcomes inconsistent with the RCI-derived groupings. The analysis revealed no significant differences between the outcome groups on either perceived change or satisfaction ratings of significant others. Several factors undoubtedly contributed to these findings. Among these, the limited sample size certainly played a role. The power of these analyses was quite low (range = .044–.160). Additionally, more stringent controls were needed to ensure that the "significant others" actually had enough knowledge about the clients' status to complete the forms. On the other hand, the lack of findings is a sobering reminder that changes occurring during treatment may not be noticed by individuals who interact daily with the participants. It should be noted, however, that the significant other ratings were uniformly positive. Despite the lack of client self-reported symptomatic change, significant others rated the respective clients as having made moderate improvements while reporting a high degree of satisfaction with the services received.
This may indicate that significant others report a general satisfaction based on the client's involvement in treatment even though they do not notice specific symptom changes.

With the exception of one of the six analyses (client-rated improvers vs. no-changers), satisfaction with services failed to differentiate among the three outcome groups. Rather than impugning the RCI methodology, these data may be a reflection of the relationship between satisfaction and outcome in general. Increasing evidence seems to imply that this relationship between satisfaction and symptom change is not as strong as previously believed. In fact, in a study published while the present project was being conducted, Pekarik and Wolff (1996) compared client satisfaction across significant therapeutic successes and failures for both therapist and client perspectives and found no significant differences:

"The fact that this study used stricter criteria (clinical significance) to define and separate those who were therapeutic successes from those who had failed and still found little in the way of significant relationships between outcome and satisfaction provides stronger evidence than previously available of the negligible relationship between satisfaction and outcome. The clinical-significance definitions clearly produced extreme groups of successes and failures, yet even these were not related to satisfaction. This strongly suggests that satisfaction is not meaningfully related to traditional client measures of outcome." (pp. 205–206)

As in Pekarik and Wolff's (1996) study, the present data suggest that satisfaction is independent of the degree of symptomatic change. "It is conceivable that clients have rated their degree of satisfaction on the basis of something other than their satisfaction with the symptom or presenting problem change" (Pekarik & Wolff, 1996, p. 206). Sloan et al. (1975) and Seligman (1995), among others, reported that several factors independent of symptomatic change, such as likability of the therapist, availability of services, and pleasantness of the clinic environment, may contribute to clients' ratings of satisfaction.
The lack of variability in satisfaction data in the present study is also similar to the results of other research on satisfaction showing that clients tend to report high satisfaction regardless of outcome (Pekarik & Wolff, 1996). This is doubtless a function of a number of factors. For example, in contrast to completing self-report symptom checklists that may require the report of "negatives" about themselves, clients may be reluctant to report dissatisfaction with the service provider. There is also some question as to whether current satisfaction instruments actually provide an appropriate range of items reporting dissatisfaction. Pekarik and Wolff (1996) suggested that "future research could address this by generating satisfaction items that . . . assess a wider range of clinical phenomenon. For example, rather than simply ask about satisfaction with the therapist, items could assess specific aspects of therapist behavior (e.g., therapist advice on how to cope with problems outside the session)" (p. 206).

In contrast with the majority of outcome evaluations based on therapy completion, the present study attempted to evaluate the relevance of changes that occurred while therapy was ongoing. Consequently, it provides some preliminary information regarding the typical number of sessions required to achieve clinically reliable change. The number of sessions clients required to make reliable change ranged from 2 to 12 (M = 3.3, SD = 2.9) for the improvers and from 2 to 9 (M = 3.1, SD = 2.2) for the deteriorators. It should be noted that sessions did not correspond to weeks or days; the number of days to make reliable change ranged from 14 to 270 (M = 65.57, SD = 29.93) for improvers and from 14 to 150 (M = 72.67, SD = 38.77) for the deteriorators. Fourteen percent of the clients made reliable improvement and 10% experienced reliable deterioration.
The improvement figure is noticeably smaller than that obtained by Ankuta and Abeles (1993), who found that approximately 28% of clients in their sample achieved clinically significant improvement. This may be explained by the fact that participants in their study each had a minimum of 10 sessions as opposed to the average of 3 for the present study. In addition, their participants were drawn from a university counseling center as opposed to a community mental health center. Without applying the RCI criteria, Howard, Kopta, Krause, and Orlinsky (1986) determined that by the eighth session approximately 50% of patients had shown some "measurable" improvement. The present findings seem consistent with several other reports (Budman & Gurman, 1988; Howard et al., 1986; Kadera, Lambert, & Andrews, 1996; Orlinsky & Howard, 1986; Smith, Glass, & Miller, 1980) indicating that the major positive impact of psychotherapy occurs in the first 6 to 8 sessions. "Improvement is proportionally greater in the earlier sessions . . . and increases more slowly as the number of sessions grows. . . . This analysis also suggests a course of diminishing returns with more and more effort required to achieve just noticeable difference in patient improvement" (Orlinsky & Howard, 1986, p. 361).

Several characteristics of the current sample are problematic. Power analyses reveal low power for several of the planned comparisons, in part because of small samples (Cohen, 1988). This was particularly true for the spouse or significant other perspective. The selectivity of the sample is also a problem. Nearly half (46.6%) of potential participants were excluded from or dropped out of the study. There were two primary reasons for this: (a) failure of the client to return for treatment and (b) failure of the support staff to consistently provide the client with the instrument. This difficulty was further compounded by the fact that of the 133 packets distributed, only 53 (42.4%) were returned. At the same time, the final sample represents one fifth of all clients seen for outpatient treatment in this small rural community mental health center during the period of data collection. In addition, no apparent systematic confound resulted in differential placement of clients in the outcome-based groups.
Consideration of the therapist ratings should also be tempered by the fact that the majority of ratings (88.9%) were made by four of the eight therapist participants. In fact, a single therapist accounted for approximately 40% of the ratings herself. A chi-square analysis showed that there were no significant differences in RCI-based outcome groupings by therapist. Nevertheless, this overrepresentation presents an obvious problem in terms of generalizing these findings. Although the results of this study should be evaluated within the context of these sampling problems, our participants were recruited as part of routine treatment administered within a rural community mental health center. As a result, the findings may generalize to other similar facilities that have similar dropout rates and a small number of therapists.

There is also some reason to question the range of responses the selected instruments provided. In most cases, responses clustered toward the positive end of the scales. For example, even the deteriorators reported a "high" level of satisfaction with treatment and a "moderate" degree of perceived change. The instruments selected for the study had restricted ratings of negative therapeutic consequences. It is conceivable that, with instruments that provide ratings of finer levels of negative effects, participants may be more likely to endorse negative items. To date, no such instruments have been developed. Future research would be well served by developing these kinds of tools (Mohr, 1995).

Finally, the present study conducted an evaluation of session-by-session self-report information based on retrospective self-report information. This poses a number of obvious problems. For example, when reporting on indexes that require a judgment of the quality or value of services (such as satisfaction or therapeutic benefit), people may tend to be overly optimistic (Mohr, 1995; Pekarik & Wolff, 1996).
Conway and Ross (1984) provided further reason to question these self-report data. They found that not only do people have a tendency to recall the past in ways that are consistent with their present condition, but they exaggerate or even reconstruct memories of past events to support otherwise invalid theories of change. In their study, people who had participated in a bogus study habits improvement program recalled their preprogram evaluations as being worse than they actually were to justify their participation in the program. This variant of cognitive dissonance may explain the uniformly glowing outcome reports across the three groups. The circularity of this self-report evaluation of other self-report-based methodology was somewhat, but by no means totally, ameliorated by the inclusion of multiple perspectives. Future results may be more compelling if other non-self-report-based indexes are used. Specific, objectively based criteria such as number of arrests, days missed at work, money spent annually on therapy, and so on may provide powerful independent arguments for or against this clinical significance methodology.

References

Achenbach, T. M. & Edelbrock, C. S. (1983). Manual for the Child Behavior Checklist and Revised Child Behavior Profile. (Burlington, VT: Department of Psychiatry, University of Vermont)
Alexander, L. B. & Luborsky, L. (1986). The Penn Helping Alliance Scales. (In L. S. Greenberg & W. M. Pinsof (Eds.), The psychotherapeutic process: A research handbook (pp. 325–366). New York: Guilford Press.)
American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). (Washington, DC: Author)
Ankuta, G. Y. & Abeles, N. (1993). Client satisfaction, clinical significance, and meaningful change in psychotherapy. (Professional Psychology: Research and Practice, 24, 70–74.)
Attkisson, C. C. & Zwick, R. (1982). The client satisfaction questionnaire: Psychometric properties and correlations with service utilizations and psychotherapy outcome. (Evaluation and Program Planning, 5, 233–237.)
Barlow, D. H. (1981). On the relation of clinical research and clinical practice: Current issues, new directions. (Journal of Consulting and Clinical Psychology, 49, 147–155.)
Beck, A. T., Ward, C. H., Mendelson, M., Mock, J. & Erbaugh, J. (1961). An inventory for measuring depression. (Archives of General Psychiatry, 4, 561–571.)
Blanchard, E. B. & Schwartze, S. P. (1988). Clinically significant changes in behavioral medicine. (Behavioral Assessment, 10, 171–188.)
Budman, S. H. & Gurman, A. S. (1988). Theory and practice of brief therapy. (New York: Guilford)
Ciarlo, J. A., Edwards, D. W., Kiresuk, T. J., Newman, F. L. & Brown, T. R. (1981). Final report: The assessment of client/patient outcome techniques for use in mental health programs (Contract No. 278-80-0005 DB). (Washington, DC: National Institute of Mental Health)
Cohen, J. (1988). Statistical power analysis for behavioral sciences (2nd ed.). (New York: Academic Press)
Condon, K. (1994). Assessing clinical significance: Application to the State/Trait Anxiety Inventory. (Unpublished doctoral dissertation, Brigham Young University, Provo, UT)
Conway, M. & Ross, M. (1984). Getting what you want by revising what you had. (Journal of Personality and Social Psychology, 47, 738–748.)
Corcoran, K. & Fischer, J. (1987). Measures for clinical practice. (New York: Free Press)
Derogatis, L. R. (1983). SCL-90: Administration, scoring, and procedures manual for the revised version. (Baltimore, MD: Clinical Psychometric Research)
Eaton, T., Abeles, N. & Gutfreund, M. J. (1988). Therapeutic alliance and outcome: Impact of treatment length and pretreatment symptomatology. (Psychotherapy, 25, 536–542.)
Garfield, S. L. (1981). Psychotherapy: A 40-year appraisal. (American Psychologist, 36, 174–183.)
Garfield, S. L. (1994). Research on client variables in psychotherapy. (In A. E. Bergin & S. L. Garfield (Eds.), Handbook of psychotherapy and behavior change (4th ed.) (pp. 72–113). New York: Wiley.)


Gaston, L. (1991). Reliability and criterion-related validity of the California Psychotherapy Alliance Scales. (Psychological Assessment, 3, 68–74.)
Goldfried, M. R., Greenberg, L. S. & Marmar, C. (1990). Individual psychotherapy: Process and outcome. (Annual Review of Psychology, 41, 659–688.)
Gomes-Schwartz, B. (1978). Effective ingredients in psychotherapy: Prediction of outcome from process variables. (Journal of Consulting and Clinical Psychology, 46, 1023–1035.)
Grundy, C. T. (1994). Assessing clinical significance: Application to the Hamilton Rating Scale for Depression. (Unpublished doctoral dissertation, Brigham Young University, Provo, UT)
Grundy, L. M. (1994). Assessing clinical significance: Application to the Personality Inventory for Children. (Unpublished doctoral dissertation, Brigham Young University, Provo, UT)
Hamilton, M. (1960). A rating scale for depression. (Journal of Neurology, Neurosurgery and Psychiatry, 23, 56–62.)
Hamilton, M. (1967). Development of a rating scale for primary depressive illness. (British Journal of Social and Clinical Psychology, 6, 278–296.)
Hansen, N. B. & Lambert, M. J. (1996). Clinical significance: An overview of methods. (Journal of Mental Health, 5, 17–24.)
Henry, W. P., Strupp, H. H., Schacht, T. E. & Gaston, L. (1994). Psychodynamic approaches. (In A. E. Bergin & S. L. Garfield (Eds.), Handbook of psychotherapy and behavior change (4th ed.) (pp. 467–508). New York: Wiley.)
Hollon, S. D. & Flick, S. N. (1988). On the meaning and methods of clinical significance. (Special issue: Defining clinically significant change. Behavioral Assessment, 10, 197–206.)
Horowitz, L. (1974). Clinical prediction in psychotherapy. (New York: Jason Aronson)
Horowitz, L. M., Rosenberg, S. E., Baer, B. A., Ureno, G. & Villesenor, V. S. (1988). Inventory of interpersonal problems: Psychometric properties and clinical applications. (Journal of Consulting and Clinical Psychology, 56, 885–892.)
Horvath, A. O. & Greenberg, L. S. (1989). Development and validation of the Working Alliance Inventory. (Journal of Counseling Psychology, 36, 223–233.)
Horvath, A. O. & Symonds, D. B. (1991). Relationship between working alliance and outcome in psychotherapy: A meta-analysis. (Journal of Counseling Psychology, 38, 139–149.)
Howard, K. I., Kopta, S. M., Krause, M. S. & Orlinsky, D. E. (1986). The dose-effect relationship in psychotherapy. (American Psychologist, 41, 159–164.)
Howard, K. I., Moras, K., Brill, P. L., Martinovich, Z. & Lutz, W. (1996). Evaluation of psychotherapy: Efficacy, effectiveness, and patient progress. (American Psychologist, 51, 1059–1064.)
Hugdahl, K. & Ost, L. (1981). On the difference between statistical and clinical significance. (Behavioral Assessment, 3, 289–295.)
Ihilevich, D. & Gleser, G. C. (1979). A manual for the progress evaluation scales. (Shiawasse, MI: Community Mental Health Services Board)
Ihilevich, D. & Gleser, G. C. (1982). Evaluating mental-health programs: The Progress Evaluation Scales. (Lexington, MA: D.C. Heath)
Jacobson, N. S., Follette, W. C. & Revenstorf, D. (1984). Psychotherapy outcome research: Methods for reporting variability and evaluating clinical significance. (Behavior Therapy, 15, 336–352.)
Jacobson, N. S. & Revenstorf, D. (1988). Statistics for assessing the clinical significance of psychotherapy techniques: Issues, problems, and new developments. (Behavioral Assessment, 10, 133–145.)
Jacobson, N. S. & Truax, P. (1991). Clinical significance: A statistical approach to defining meaningful change in psychotherapy research. (Journal of Consulting and Clinical Psychology, 59, 12–19.)
Kadera, S. W., Lambert, M. J. & Andrews, A. A. (1996). How much therapy is really enough? A session-by-session analysis of the psychotherapy dose-effect relationship. (The Journal of Psychotherapy Practice and Research, 5, 132–151.)


Kazdin, A. E. (1977). Assessing the clinical or applied importance of behavior change through social validation. (Behavior Modification, 1, 427–452.)
Kazdin, A. E. (1993). Evaluation in clinical practice: Clinically sensitive and systematic methods of treatment delivery. (Behavior Therapy, 24, 11–45.)
Kendall, P. C. & Grove, W. M. (1988). Normative comparisons in therapy outcome. (Behavioral Assessment, 10, 147–158.)
Lambert, M. J. (1983). Introduction to assessment of psychotherapy outcome: Historical perspective and current issues. (In M. J. Lambert, E. R. Christensen, & S. S. DeJulio (Eds.), The assessment of psychotherapy outcome (pp. 3–32). New York: Wiley.)
Lambert, M. J. & Bergin, A. E. (1994). The effectiveness of psychotherapy. (In A. E. Bergin & S. L. Garfield (Eds.), Handbook of psychotherapy and behavior change (4th ed.) (pp. 143–189). New York: Wiley.)
Lambert, M. J., Bergin, A. E. & Collins, J. L. (1977). Therapist-induced deterioration in psychotherapy. (In A. S. Gurman & A. M. Razin (Eds.), Effective psychotherapy: A handbook of research (pp. 452–481). New York: Pergamon Press.)
Lambert, M. J., Burlingame, G. M., Umphress, V., Hansen, N. B., Vermeersh, D. A., Clouse, G. C. & Yanchar, S. C. (1996). The reliability and validity of the Outcome Questionnaire. (Clinical Psychology and Psychotherapy, 3, 249–258.)
Lambert, M. J., Hansen, N. B., Umphress, V., Lunnen, K., Okishi, J., Burlingame, G. M., Hefner, J. C. & Reisinger, C. R. (1996). Administration and scoring manual for the Outcome Questionnaire (OQ-45.2). (Wilmington, DE: American Professional Credentialing Services LLC)
Lambert, M. J. & Hill, C. E. (1994). Assessing psychotherapy outcomes and processes. (In A. E. Bergin & S. L. Garfield (Eds.), Handbook of psychotherapy and behavior change (4th ed.) (pp. 72–113). New York: Wiley.)
Lambert, M. J., Lunnen, K., Umphress, V., Hansen, N. B. & Burlingame, G. (1994). Administration and scoring manual for the Outcome Questionnaire (OQ-45). (Salt Lake City, UT: IHC Center for Behavioral Healthcare Efficacy)
Larsen, D. L., Attkisson, C. C., Hargreaves, W. A. & Nguyen, T. D. (1979). Assessment of client/patient satisfaction: Development of a general scale. (Evaluation and Program Planning, 2, 197–207.)
Lichtenstein, A. B. (1985). The effect of client and therapist gender on the outcome and process of psychotherapy. (Dissertation Abstracts International, 45/12B, 3949.)
Lieberman, M. A., Yalom, I. D. & Miles, M. B. (1973). Encounter groups: First facts. (New York: Basic Books)
Luborsky, L., Barber, J. P., Siqueland, L. & Johnson, S. (1996). The revised Helping Alliance questionnaire (HAq-II): Psychometric properties. (Journal of Psychotherapy Practice & Research, 5, 260–271.)
Marmar, C. R., Horowitz, M. J., Weiss, D. S. & Marziali, E. (1986). The development of the therapeutic alliance rating system. (In L. S. Greenberg & W. M. Pinsof (Eds.), The psychotherapeutic process: A research handbook (pp. 367–390). New York: Guilford Press.)
Marmar, C. R., Weiss, D. S. & Gaston, L. (1989). Toward the validation of the California Therapeutic Alliance Rating System. (Psychological Assessment: A Journal of Consulting and Clinical Psychology, 1, 46–52.)
Mintz, J. & Keisler, D. J. (1982). Individualized measures of psychotherapy outcome. (In P. C. Kendall & J. N. Butcher (Eds.), Handbook of research methods in clinical psychology (pp. 491–534). New York: Wiley.)
Mohr, D. C. (1995). Negative outcome in psychotherapy: A critical review. (Clinical Psychology-Science & Practice, 2, 1–27.)
Ogles, B. M., Lambert, M. J. & Masters, K. J. (1996). Assessing outcome in clinical practice. (New York: Allyn & Bacon)

Ogles, B. M., Lambert, M. J. & Sawyer, J. D. (1995). Clinical significance of the National Institute of Mental Health treatment of depression collaborative research program data. Journal of Consulting and Clinical Psychology, 63, 321–326.
Orlinsky, D. E. & Howard, K. I. (1986). Process and outcome in psychotherapy. In S. L. Garfield & A. E. Bergin (Eds.), Handbook of psychotherapy and behavior change (3rd ed.). New York: Wiley.
Pekarik, G. & Wolff, C. B. (1996). Relationship of satisfaction to symptom change, follow-up adjustment, and clinical significance. Professional Psychology: Research and Practice, 27, 202–208.
Saunders, S. M., Howard, K. I. & Newman, F. L. (1988). Evaluating the clinical significance of treatment effects: Norms and normality. Behavioral Assessment, 10, 207–218.
Segger, L. (1994). Assessing clinical significance: Application to the Beck Depression Inventory. Unpublished doctoral dissertation, Brigham Young University, Provo, UT.
Seligman, M. E. P. (1995). The effectiveness of psychotherapy. American Psychologist, 50, 965–974.
Sloan, R. B., Staples, F. F., Cristol, A. H., Yorkston, N. J. & Whipple, K. (1975). Short-term analytically oriented psychotherapy versus behavior therapy. American Journal of Psychiatry, 132, 373–377.
Smith, M. L., Glass, G. V. & Miller, T. I. (1980). The benefits of psychotherapy. Baltimore, MD: Johns Hopkins University Press.
Speer, D. C. & Greenbaum, P. (1995). A comparison of five methods for computing significant individual client change and measurement rates: An individual growth curve approach. Journal of Consulting and Clinical Psychology, 63, 1044–1048.
Spielberger, C. D. (1983). Manual for the State-Trait Anxiety Inventory STAI (Form Y). Palo Alto, CA: Consulting Psychologists Press.
Spielberger, C. D., Gorsuch, R. L. & Lushene, R. E. (1970). The State Trait Anxiety Inventory Self Evaluation Questionnaire. Palo Alto, CA: Consulting Psychologists Press.
Strupp, H. H., Fox, R. & Lessler, K. (1969). Patients view their own psychotherapy. Baltimore, MD: Johns Hopkins University Press.
Strupp, H. H. & Hadley, S. W. (1977). A tripartite model of mental health and therapeutic outcome: With special reference to negative effects in psychotherapy. American Psychologist, 32, 187–196.
Taylor, J. A. (1953). A personality scale of manifest anxiety. Journal of Abnormal and Social Psychology, 48, 285–290.
Tingey, R. C., Lambert, M. J., Burlingame, G. M. & Hansen, N. B. (1996). Assessing clinical significance: Proposed extensions to method. Psychotherapy Research, 6, 109–153.
Umphress, V., Lambert, M. J., Smart, D. W., Barlow, S. H. & Clouse, G. (1997). Concurrent and construct validity of the Outcome Questionnaire. Journal of Psychoeducational Assessment, 15, 40–55.
Wampold, B. E. & Jenson, W. R. (1986). Clinical significance revisited. Behavior Therapy, 17, 302–305.
Weber, J. J., Elinson, J. & Moss, L. M. (1965). The application of ego strength scales to psychoanalytic clinic records. In G. S. Goldman & D. Shapiro (Eds.), Developments in psychoanalysis at Columbia University: Proceedings of the 20th anniversary conference. New York: Columbia Psychoanalytic Clinic for Training and Research.
Weissman, M. M. & Bothwell, S. (1976). The assessment of social adjustment by patients self-report. Archives of General Psychiatry, 33, 1111–1115.
Wolf, M. M. (1978). Social validity: The case for subjective measurement or how applied behavior analysis is finding its heart. Journal of Applied Behavior Analysis, 11, 203–214.
Zung, W. W. K. (1965). A self-rating depression scale. Archives of General Psychiatry, 12, 63–70.
Zung, W. W. K. (1971). A rating instrument for anxiety disorders. Psychosomatics, 12, 371–379.

Table 1.


Table 2.

Table 3.

Table 4.

Table 5.
