JOURNAL OF PERSONALITY ASSESSMENT, 73(3), 359–373 Copyright © 1999, Lawrence Erlbaum Associates, Inc.

Development and Initial Validation of a Brief Mental Health Outcome Measure

Mark A. Blais, William R. Lenderking, Lee Baer, Ashley deLorell, Kathleen Peets, Linda Leahy, and Craig Burns

Department of Psychiatry
Massachusetts General Hospital/Harvard Medical School

Using a combination of classical test theory and Rasch item analysis, we developed a short scale designed to measure the effectiveness of mental health treatment across a wide range of mental health services and populations. Item development for the scale was guided by literature review and interviews with senior clinicians and with patients. Using 3 different samples consisting of inpatients, outpatients, and nonpatients, we reduced our initial item pool from 81 to 10 items. The 10-item scale had an alpha of .96 and showed strong correlations with commonly used measures of psychological well-being and distress. Our results suggest that the scale appears to measure a broad domain of psychological health. The scale appeared to lack ceiling and floor effects, and it discriminated between inpatients, outpatients, and nonpatients, suggesting the scale has excellent potential to be broadly responsive to a variety of treatment effects. In addition, the new scale proved to be sensitive to treatment changes in a sample of 20 psychiatric inpatients. Overall, the initial data suggest that we have developed a brief, sensitive outcome measure designed to have wide application across psychiatric and psychological treatments and populations.

There is growing pressure for mental health care providers and organizations to document the effectiveness of their treatments for both scientific and economic reasons (Joint Commission on Accreditation of Healthcare Organizations, 1997; Lyons, Howard, O’Mahoney, & Lish, 1997; Sederer, Dickey, & Herman, 1996). Maximizing the value of the mental health services provided requires linking improvements in the process of care with measurement of the outcomes of care (Eckert, 1994). The historical approach to outcomes assessment has focused mainly on specific conditions (e.g., anxiety disorders or depression) and has been principally connected with clinical trials research (Lambert, Okiishi, Finch, & Johnson, 1998). Although there has been a long history of outcomes measurement for some mental health services (e.g., psychotherapy), in general the present push to measure outcomes more widely has been resisted by many mental health practitioners and service organizations (Sederer et al., 1996; Talley, Strupp, & Butler, 1994).

The goal of our research was to develop a measure of the effectiveness of treatments provided within the Department of Psychiatry at Massachusetts General Hospital (MGH) by various mental health practitioners, including psychiatrists, psychologists, and social workers. We hoped our measure would not be limited by any one particular theory of psychological functioning, so that it would be meaningful to practitioners of different theoretical persuasions. In addition, we aimed to develop a measure that would be useful across a variety of different treatment modalities, including individual psychotherapy, group psychotherapy, family therapy, psychopharmacology, and electroconvulsive therapy. To accomplish this goal, we set out to develop a measure that did not focus directly on the signs and symptoms of specific disorders.

Although the benefits of systematic outcomes assessment are potentially substantial, large, diverse mental health organizations, such as academic psychiatry departments, face particular difficulties incorporating uniform outcomes assessment into their multiple delivery sites. These difficulties are due to the diversity of services offered, the range of theoretical orientations of treating clinicians, and the variability in the severity of patients being cared for (outpatient, inpatient, or emergency ward). This degree of complexity has led some authors (Lyons et al., 1997) to advocate that organizations use separate outcome measures matched to the patient and service profile of each site, a recommendation clearly in contrast with what we tried to achieve. Whereas the Lyons et al.
approach might capture the unique features of the services of each site with great sensitivity, such a strategy would not provide a common metric for understanding treatment outcomes across the sites of the organization. The MGH Department of Psychiatry desired a brief outcome measure that would provide a way—brief enough to not be a burden on patients or programs—to monitor the effectiveness of treatments throughout its multiple delivery sites. Despite the recent proliferation of outcome instruments (Lyons et al., 1997; Sederer et al., 1996), we failed to identify any existing instrument that would meet the aforementioned needs. Previous attempts to measure outcomes or therapeutic changes in psychiatry have typically focused on single diagnoses or closely related diagnoses. For example, there are numerous scales to measure depression, such as the Zung Depression Scale (Zung, 1965), the Beck Depression Inventory (Beck & Steer, 1987; Beck, Ward, Mendelson, Mock, & Erbaugh, 1961) the Hamilton Rating Scale for Depression (Hamilton, 1960), the Montgomery–Asberg Depression Rating Scale (Montgomery & Asberg, 1979), and the Center for Epidemiologic Studies–Depression Scale (Radloff, 1977), that have also been used to measure

treatment-related changes in depression. Similar, widely used scales exist for the measurement of the effects of treatment on anxiety (i.e., the State–Trait Anxiety Inventory; Spielberger, Gorsuch, & Lushene, 1970) and obsessive–compulsive disorder (Yale–Brown Obsessive–Compulsive Disorder Scale; Goodman et al., 1989). Condition-specific measures are unlikely to provide complete coverage of issues or aspects of functioning for patients with different conditions. In addition, such measures are typically focused on symptoms characteristic of the disorder and do not address the impact of the disorder on other aspects of the individual’s life, such as functioning and work limitations. However, these other areas are also important to examine when considering the impact of treatment. In depressed patients for example, symptomatic improvement may not directly translate into improvement in either work functioning or social activity level (Lenderking et al., 1999). The Behavior and Symptom Identification Scale–32 (BASIS–32; Eisen, Dill, & Grob, 1994) is a broader (multiscaled) and relatively brief (32 items) measure of psychiatric outcome. Unfortunately, the BASIS–32 was developed exclusively on an inpatient sample, possibly limiting its applications. Furthermore, questions regarding the psychometric features of the BASIS–32 exist. For example, a number of studies have shown that three of its five subscales are highly intercorrelated, suggesting that they may be measuring a single dimension rather than separate aspects of psychological functioning (Eisen et al., 1994; Eisen, Wilcox, Schaefer, Culhane, & Leff, 1997). Recently, the Outcome Questionnaire–45 (OQ–45; Lambert et al., 1996) has become available as a mental health outcome measure. However, the OQ–45 was designed primarily to assess psychotherapy outcome, potentially limiting its application to other psychiatric treatments. 
Further research will be needed to see how responsive the OQ–45 is to other forms of mental health treatment. Other instruments, such as the Symptom Checklist–90 (SCL–90; Derogatis, 1983) and Brief Psychiatric Rating Scale (BPRS; Overall & Gorham, 1962), have also commonly been used as outcome measures. However, the SCL–90 with its 90 items might be too burdensome for some clinical units, and the BPRS is a clinician-rated instrument, which requires considerable training to achieve and maintain reliable ratings. On the other hand, widely used outcome measures (e.g., those used to measure quality of life or health status) that are administered to a broad variety of medical patients include domains that may not be directly affected by most mental conditions. The 36-Item Short Form Health Survey (SF–36; Ware, Kosinski, Snow, & Gandek, 1993), the Sickness Impact Profile (Bergner, 1984), and the Nottingham Health Profile (NHP; Hunt, McEwen, & McKenna, 1985) are examples of such instruments. For example, the domains of role–physical (SF–36) or physical mobility (NHP) are not likely to be greatly affected by successful treatment of anxiety. A measure that includes domains not directly relevant to the condition being evaluated may lose sensitivity to important treatment effects.

From the material previously reviewed, there appears to be a need for a brief mental health outcome measure that would be suitable to assessing a wide range of patients receiving a variety of mental health treatments. In the following section, we describe the development of such a scale.

METHOD

To establish a working definition of improvement in mental health treatments, we began with semistructured interviews of senior clinicians. We interviewed four psychologists, seven psychiatrists, and one neurosurgeon and held two patient focus groups. The clinicians were chosen to represent diverse professional orientations and practices, including group therapy; psychodynamic psychotherapy; cognitive–behavioral therapy; psychotherapy researchers; psychopharmacologists specializing in depression, anxiety, obsessive–compulsive disorder, and bipolar illness; and consult-liaison, emergency ward, and inpatient clinicians. The interviews commenced with a standard question: “What do you think changes in a person’s life when the treatment you provide is successful?” Probing questions were asked about how treatment affected each of the following domains: physical, psychological, and social functioning and symptoms. These domains have been identified as core components of well-being (Stewart & Ware, 1992). The interviews were conducted by William R. Lenderking and Ashley deLorell and were taped and transcribed.

Two focus groups of five patients were held; they were asked to respond initially to the following questions: “What has changed in your life as a result of your treatment?” and “What do you hope will change as a result of your treatment?” The subsequent discussion explored the impact of treatment on the patients’ perception of their psychological health. These groups were also taped and transcribed. The tapes were analyzed for important themes.

The interviews with clinicians indicated that there were substantial areas of agreement, in spite of theoretical differences, about what should change when treatment works. Table 1 presents sample descriptions, provided by clinicians from a variety of theoretical perspectives, of the broad domain of psychological functioning we measured.
A pool of 81 test items was developed from the interview transcripts and a literature review. These 81 items were rated on 7-point Likert scales, ranging from 0 (never) to 6 (all of the time), for relevance to patients and likelihood of changing in response to effective treatment (responsiveness) by a group of 15 clinicians (which included some of the clinicians initially interviewed); they were also rated for relevance and comprehensibility by a second group of five patients. Means and standard deviations for relevance, responsiveness, and comprehensibility were calculated for each item and across items. Items were retained if their modal clinician rating for relevance or responsiveness was 5 or 6 (6 being the high point of the response scale) or if the item mean was one standard deviation above the mean of all the items for relevance and responsiveness. The patient ratings were used to confirm the relevance and comprehensibility of the items retained. We then reviewed the list and added several items based on theoretical considerations. This procedure resulted in 47 items being retained in the initial version of the instrument.

TABLE 1
Descriptions of the Broad Domain of Well-Being, Which Changes With Effective Treatment, From Clinicians of Varying Theoretical Perspectives

Type of Clinician: Description

Psychodynamic psychotherapist: “The capacity to manage anxiety and be serene in a silent moment is something that I watch for [in a sicker patient] to see whether they can have these moments of serenity even in very chaotic lives. … Peace of mind for the healthier person involves being able to leave something without it being perfect.”

Psychopharmacologist: “If someone doesn’t have a job and then you treat them and they are working and they are being productive and happy.”

Behavioral psychologist: “Job and social functioning are always important … more sensitive than some of the clinical measures … a sense of well-being while working.”

Eclectic psychiatrist: “Early on in therapy, patients … describe feeling victimized by the helplessness they feel in their lives. … I feel like therapy has worked when they begin to talk about the choices they make to be in the situations they are in. … They feel less victimized by reforming where they are at in their lives as choice rather than victimization.”

Psychodynamic psychiatrist: “Often at the root of their dysfunction is interpersonal relationships and self-esteem. … I would expect to see a person feeling better about themselves … better able to meet their goals.”

Consult-liaison psychiatrist: “I try to find out from that person what turned him on, for example … if I had a picture and showed you that picture, would your heart do that kind of leap … something in your deep inside kind of turns on and comes to life. I … think of it as the ‘Do-it-again center.’”
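The item-retention rule described in the Method section can be illustrated with a minimal Python sketch. It is not the authors' procedure in code form: it handles a single rating dimension at a time, it reads "one standard deviation above all the items" as "at least one standard deviation above the mean of the item means" (an assumption), and the function name `retain_items` and the example ratings are hypothetical.

```python
from statistics import mean, multimode, stdev


def retain_items(ratings, high_modes=(5, 6)):
    """Return the ids of items that pass the retention rule.

    ratings maps an item id to the list of clinician ratings (0-6 scale)
    for one dimension, such as relevance.
    """
    item_means = {item: mean(values) for item, values in ratings.items()}
    grand_mean = mean(list(item_means.values()))
    spread = stdev(list(item_means.values()))
    kept = []
    for item, values in ratings.items():
        # Rule 1: the modal rating is 5 or 6 (any mode counts if there are ties).
        modal_high = any(mode in high_modes for mode in multimode(values))
        # Rule 2 (assumed reading): item mean is one SD above the grand mean.
        mean_high = item_means[item] >= grand_mean + spread
        if modal_high or mean_high:
            kept.append(item)
    return kept


# Hypothetical ratings from five clinicians for three candidate items.
example = {
    "item_a": [6, 6, 5, 6, 4],
    "item_b": [2, 3, 2, 1, 2],
    "item_c": [4, 5, 5, 3, 5],
}
print(retain_items(example))
```

In this made-up example, the first and third items are kept because their modal rating is 5 or 6, and the second is dropped.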

RESULTS

The initial 47-item version of the scale was administered to 112 patients across a variety of hospital sites. These included the outpatient psychotherapy clinic (n = 42) and psychopharmacology clinic (n = 33), an emergency ward acute psychiatry clinic, and a 21-bed inpatient medical psychiatry unit (emergency room patients and inpatients combined, n = 37). The sample consisted of 69 women (62%) and 43 men. The average age was 37 years (SD = 12).

A principal components analysis with varimax rotation resulted in 10 factors with eigenvalues greater than 1. However, the first factor had an eigenvalue of 19.57 and accounted for 42% of the total variance, whereas the second factor had an eigenvalue of 2.2 and accounted for only 5% of the total variance. These results indicated that the scale was essentially unifactorial. Items with factor loadings on the first factor of greater than .60 were retained for further study. This left 31 items. Further item reduction occurred through examination of mean and modal scores for each of the 31 items. The items had been rated on a 7-point scale, ranging from 0 (never) to 6 (all or nearly all of the time). Items having either a mean or modal score of 5 or higher were dropped to eliminate ceiling effects. This resulted in a scale of 20 items.

The 20-item scale had very high internal consistency (Cronbach’s α = .95) and a split-half reliability coefficient of .92. Scores could range from 0 to 120. There were no significant differences for either sex or age. Men (n = 38) had a mean score of 64 (SD = 24), whereas women (n = 61) had a mean score of 66 (SD = 24). Those participants who were 40 years old or younger (n = 43) had a mean score of 66 (SD = 27), whereas those over 40 (n = 60) had a mean score of 65 (SD = 24). The 20-item scale separated the combined inpatient and emergency room patients (n = 21, M = 58, SD = 30) from the combined outpatient groups (n = 95, M = 74, SD = 26) and a group of nonpatients (n = 34, M = 99, SD = 11), overall F(2, 147) = 7.77, p < .001. (High scores on the scale represent better psychological health or functioning.)

A new sample was drawn to further test the 20-item scale. This consisted of 25 community mental health clinic patients and 35 nonpatients (n = 60).
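The eigenvalue percentages and item-reduction cutoffs reported for the first sample can be checked with a short Python sketch. It assumes the principal components analysis was run on the correlation matrix, so that total variance equals the number of items; the text does not state this, although the eigenvalue-greater-than-1 criterion suggests it. The function names, item ids, loadings, and responses below are hypothetical.

```python
from statistics import mean, multimode


def variance_share(eigenvalue, n_items):
    """Share of total variance for one component of a correlation-matrix PCA."""
    return eigenvalue / n_items


# Values reported in the text for the 47-item version.
print(round(100 * variance_share(19.57, 47)))  # first factor, about 42%
print(round(100 * variance_share(2.2, 47)))    # second factor, about 5%


def reduce_items(loadings, responses, min_loading=0.60, ceiling=5):
    """Keep items that load above min_loading and show no ceiling effect."""
    kept = []
    for item, loading in loadings.items():
        scores = responses[item]
        # An item is at ceiling if its mean or any modal score reaches the cutoff.
        at_ceiling = mean(scores) >= ceiling or max(multimode(scores)) >= ceiling
        if loading > min_loading and not at_ceiling:
            kept.append(item)
    return kept


# Hypothetical loadings and 0-6 responses for three items.
example_loadings = {"q1": 0.72, "q2": 0.41, "q3": 0.66}
example_responses = {
    "q1": [3, 4, 2, 5, 3],
    "q2": [1, 2, 2, 3, 1],
    "q3": [6, 5, 6, 5, 6],
}
print(reduce_items(example_loadings, example_responses))
```

Under the same assumption, the second-sample figures (eigenvalues of 12 and 1.19 over 20 items) come out at about 60% and 6%, which matches the values reported for that sample.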
The community clinic participants had a mean score of 75 (SD = 23), whereas the nonpatients had a mean score of 99 (SD = 10), a significant difference, t(59) = –4.15, p < .01. The sample consisted of 39 women (65%) who were aged 34 years on average (SD = 11). Principal components analysis again revealed the scale to be unifactorial. The first factor had an eigenvalue of 12 and accounted for 60% of the total variance, whereas the second had an eigenvalue of 1.19 and accounted for only 6% of the total variance. Cronbach’s alpha was .96, and the split-half reliability coefficient was .92. Test–retest reliability over 1 week was obtained for the nonpatients (n = 32) and was found to be .87. Table 2 presents the corrected item-to-scale correlations and the factor loadings for the 20 items. Table 2 shows that all the corrected item-to-scale correlations were well above the lower bound of .30 (Nunnally & Bernstein, 1994), and all the items loaded substantially onto the first unrotated factor.

TABLE 2
Corrected Item-to-Scale Correlationsᵃ and Loadings on the First Principal Component Factor for the 20 Beta Scale Items

Beta-Test Item Number: 1 2ᵇ 3 4 5 6ᵇ 7ᵇ 8 9 10ᵇ 11 12 13 14ᵇ 15ᵇ 16ᵇ 17ᵇ 18ᵇ 19ᵇ 20
Item-to-Scale Correlations: .81 .60 .68 .85 .79 .50 .68 .52 .89 .85 .64 .76 .88 .73 .72 .67 .78 .74 .81 .85
Factor Loadings: .64 .72 .86 .81 .54 .71 .56 .91 .87 .67 .79 .90 .77 .75 .72 .81 .77 .83 .87 .84

Note. n = 60.
ᵃ Corrected item-to-scale correlations were computed by subtracting the item under study from the total scale score.
ᵇ Item was ultimately retained for the final 10-item version of the scale.

During the next phase of the project, we sought to obtain preliminary data on validity and to potentially reduce the scale further. The preceding analyses indicated that the scale was unifactorial, making it a candidate for Rasch item analysis (Andrich, 1988; for an overview of the Rasch model, see Wright & Stone, 1979). We recruited a third sample of 85 participants made up of patients (n = 57) and nonpatients (n = 28). Patients were again recruited from a variety of clinical sites, including inpatient, outpatient, and emergency room clinics within the hospital. Nonpatient participants were recruited primarily from the hospital staff (nurses, research assistants, and support personnel). All participants were paid $5.00 to complete a lengthy battery of scales, including the current scale; the Beck Hopelessness Scale (Beck, Kovacs, & Weissman, 1975); a measure of self-esteem (Heatherton & Polivy, 1991); the Positive Affect and Negative Affect Scale (Watson, Clark, & Tellegen, 1988); the Survey Form–12 (SF–12; Ware, Kosinski, & Keller, 1995); the Mental Health Index–5 (MHI–5) or Well-Being Scale from the Medical Outcomes Study (Stewart, Sherbourne, Hays, & Ware, 1992), also used in the SF–36 (Ware et al., 1993) and the Functional Status Questionnaire (Jette et al., 1986); the Fatigue Scale from the Medical Outcomes Study and the SF–36; the Sense of Coherence Scale (Antonovsky, 1979, 1987); both the Life Satisfaction

question (Andrews & Withey, 1976) and the Satisfaction With Life scale (Pavot & Diener, 1993); and two scales currently under development, Psychiatric Symptoms (Blais, 1999) and Desire to Live (Lenderking, 1992). These scales were selected because most have been widely used and all relate to the construct we were trying to measure.

The Rasch item analysis was conducted by using the Bigsteps computer program (Linacre & Wright, 1997). Rasch analysis is a form of modern item-response theory (Embretson, 1996), which allows for sample-independent calibrations of item difficulty, as long as the items form a unidimensional scale (cf. Andrich, 1988; for an overview of the Rasch model, see Wright & Stone, 1979). Based on the work of Danish mathematician Georg Rasch, Rasch item analysis produces a conjoint but independent scaling of both person ability and item difficulty (trait loading). When item data fit the Rasch mathematical model, the generality of the findings is assured regardless of sample size or characteristics (Wright & Stone, 1979).

The Rasch analysis of the 20-item scale indicated that three items had poor fit and did not conform to the mathematical model. The remaining 17 items conformed well to the Rasch model and had difficulty scores or trait loadings (presented in logits, log-odds units) ranging from –.32 to .53. We selected 10 items, evenly spaced across the logit difficulty range, to make up the final version of the measure. The Rasch characteristics of the final 10-item scale were good. The sample (participant) separation was 4.2, the sample alpha was .94, the item separation was 3.0, and the item alpha was .90. The traditional psychometric characteristics of the scale were also strong. The scale had a Cronbach’s alpha of .96, and the corrected item-to-scale total correlations ranged from .74 to .90. A principal components analysis indicated one factor, which accounted for 76% of the variance.
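To make the logit scale concrete, the following Python sketch shows the simplest (dichotomous) form of the Rasch model and one possible way of picking items that are evenly spaced in difficulty. Both parts are illustrative assumptions: the scale uses 0 to 6 ratings, which call for a polytomous extension of the model, and the text does not say how the evenly spaced items were chosen. The function names and the difficulty values are hypothetical and are not the published calibrations.

```python
import math


def rasch_probability(ability, difficulty):
    """Chance of endorsing an item under the dichotomous Rasch model (logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def pick_evenly_spaced(difficulties, k):
    """Pick k items whose difficulties sit closest to k evenly spaced targets."""
    if not 2 <= k <= len(difficulties):
        raise ValueError("k must be between 2 and the number of items")
    low, high = min(difficulties.values()), max(difficulties.values())
    step = (high - low) / (k - 1)
    remaining = dict(difficulties)
    chosen = []
    for i in range(k):
        target = low + i * step
        # Greedy choice: the unused item nearest to this target difficulty.
        best = min(remaining, key=lambda item: abs(remaining[item] - target))
        chosen.append(best)
        del remaining[best]
    return chosen


# Hypothetical item difficulties in logits.
example_difficulties = {
    "a": -0.32, "b": -0.20, "c": 0.05, "d": 0.10, "e": 0.40, "f": 0.53,
}

# When ability equals difficulty, the endorsement probability is one half.
print(rasch_probability(0.0, 0.0))
print(pick_evenly_spaced(example_difficulties, 3))
```

The greedy selection is only one of several reasonable strategies; it is used here because it is easy to follow.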
The final version of the scale, with items arranged in increasing order of difficulty (logits), is presented in Appendix A. Table 3 presents the convergent and divergent validity correlations for the final 10-item version of the scale. Strong negative (divergent) correlations were obtained with measures of psychopathology (Psychiatric Symptom Scale), hopelessness (Beck Hopelessness Scale), fatigue (Fatigue Scale), and negative affect. Negative affect is a construct that is broadly associated with depression and anxiety (Negative Affect scale; Watson et al., 1988). Strong positive (convergent) correlations were obtained between our scale and measures of psychological well-being (MHI–5), life satisfaction, desire to live, positive self-esteem, positive affect, the sense of coherence, and the mental component of the SF–12 (Mental Health Component scale). The scale was also moderately correlated with the physical component of the SF–12 (Physical Health Component scale). The construct of positive affect is associated with a sense of well-being, competence, and effective interpersonal engagement (Positive Affect scale; Watson et al., 1988).
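The internal-consistency statistics reported for the scale, Cronbach's alpha and the corrected item-to-scale correlations, can be computed as in the Python sketch below. It uses the standard textbook formulas with sample variances; the response matrix is invented for illustration, and the function names are hypothetical.

```python
import math
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(ssx * ssy)


def corrected_item_total(rows, index):
    """Correlate one item with the total score after removing that item."""
    item = [row[index] for row in rows]
    rest = [sum(row) - row[index] for row in rows]
    return pearson(item, rest)


# Hypothetical 0-6 responses: five respondents by three items.
example_rows = [
    [0, 1, 1],
    [2, 2, 3],
    [3, 3, 3],
    [4, 5, 4],
    [6, 5, 6],
]
print(round(cronbach_alpha(example_rows), 3))
print(round(corrected_item_total(example_rows, 0), 3))
```

Removing the item from the total before correlating avoids inflating the coefficient with the item's own variance.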

TABLE 3
Convergent and Divergent Validity Correlations for the 10-Item Scale

Validity Scale                    Convergent    Divergent
Psychiatric Symptoms Scaleᵃ                       –.66
Beck Hopelessness Scaleᵇ                          –.64
PANAS Negative Affect scaleᶜ                      –.72
Fatigue Scaleᵈ                                    –.75
Well-Being Scaleᵈ                    .86
Desire to Live Scaleᵉ                .86
Satisfaction With Life Scaleᶠ        .78
PANAS Positive Affect scaleᶜ         .67
Self-Esteem Scaleᵍ                   .81
Sense of Coherence Scaleʰ            .81
SF–12 MCSⁱ                           .76
SF–12 PCSⁱ                           .36

Note. PANAS = Positive Affect and Negative Affect Scale; SF–12 = Survey Form–12; MCS = Mental Health Component scale; PCS = Physical Health Component scale.
ᵃ Blais (1999). ᵇ Beck, Kovacs, and Weissman (1975). ᶜ Watson, Clark, and Tellegen (1988). ᵈ Stewart, Sherbourne, Hays, and Ware (1992). ᵉ Lenderking (1992). ᶠ Pavot and Diener (1993). ᵍ Heatherton and Polivy (1991). ʰ Antonovsky (1979). ⁱ Ware, Kosinski, and Keller (1995).

Evaluating Sensitivity to Change

To provide initial sensitivity to change data, the final 10-item version of the scale was administered at admission and discharge to 20 inpatients undergoing treatment on a locked psychiatric unit. The patients had a mean age of 50 years (SD = 16), their average length of treatment was 11 days (SD = 11), and there were 11 women and 9 men in the group. The sample had a mean score at admission of 29 (SD = 13) and a mean discharge score of 42 (SD = 12). A two-tailed, paired t test showed these scores to be significantly different, t(19) = –5.23, p < .001, indicating that the scale is sensitive to treatment changes.
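The paired t test used for the admission and discharge scores can be sketched in Python as follows. The patient-level data are not reported, so the scores below are invented for illustration; the function name `paired_t` is hypothetical. The statistic is computed on admission minus discharge differences, which is why improvement yields a negative t, and the degrees of freedom are the number of pairs minus one (19 for 20 patients).

```python
import math
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t, degrees of freedom) for a paired t test on before - after."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical admission and discharge scores for five patients.
admission = [25, 30, 18, 40, 32]
discharge = [38, 41, 30, 47, 49]

t_value, df = paired_t(admission, discharge)
print(round(t_value, 2), df)
```

A p value would come from the t distribution with the returned degrees of freedom, which this standard-library sketch does not compute.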

DISCUSSION

We have successfully developed a brief, highly reliable measure of psychological health that has good potential as a measure of the effectiveness of psychiatric treatments for diverse patient groups receiving a variety of treatments in diverse settings. Across three separate samples, the Cronbach’s alpha was above .90, indicating a high degree of internal consistency. The scale demonstrated good test–retest reliability over 1 week in nonpatients. The results of factor analysis in two separate samples were remarkably consistent. In addition, the new scale showed no ceiling or floor effects across a wide range of patient and nonpatient groups and was able to discriminate among inpatients, outpatients, and nonpatients. Some of these findings were based on the earlier 20-item version of the scale. However, our use of Rasch analysis to guide the final item selection makes us confident that the shortened test will function in a manner highly similar to the longer scale. Lastly, the scale showed sensitivity to change in a sample of inpatients undergoing brief psychiatric care.

A review of the convergent and divergent validity correlations indicates that the new scale represents a single, broad dimension of psychological health. The new scale showed high positive correlations with positive affect, self-esteem, sense of coherence, and general life satisfaction and had strong negative correlations with psychiatric symptoms, hopelessness, negative affect, and fatigue. This unidimensional view of psychological health is consistent with the findings of Veit and Ware (1983) regarding the structure of psychological well-being and distress in the general population.

The sensitivity to change findings are also interesting and shed additional light on the nature of the scale. The 20 psychiatric inpatients who completed the final version of the scale showed a fairly rapid increase in their psychological health. Their scores increased from a mean of 29 on admission to a mean of 43 on discharge, even though the patients were treated for only 11 days on average. This finding is consistent with Howard’s (Howard, Lueger, Maling, & Martinovich, 1993) phase model of psychotherapy outcome, which describes three stages of improvement from psychological or psychiatric care: restoration of psychological health (which they define as a sense of well-being), symptomatic improvement, and life functioning improvement.
According to this theory, improvement in psychological health is expected to precede symptom reduction and increased life functioning. Howard et al.’s data showed that well-being improved quickly in treatment, with substantial increases in psychological health seen in the first two to four sessions. Furthermore, their data indicated that symptomatic improvement was unlikely to occur in the absence of improved psychological health (Howard et al., 1993). Therefore, markers of psychological health may serve as highly sensitive and early predictors of patient improvement.

There are two primary limitations to our study. The first is that each of our samples was smaller than ideal for the analyses we undertook. In spite of this, the results were remarkably stable across samples. In addition, we had sufficient participants for the Rasch analysis, and those results were consistent with the others as well. A second limitation for a measure of the effectiveness of treatments is that we evaluated only one form of treatment in a sample of participants. Thus we cannot yet be certain of the scale’s responsiveness to change across a variety of treatment methods. This is a matter for further research, which we are in the process of conducting. Furthermore, we specifically wrote items based on our qualitative research that were designed to measure those domains that would change as the result of successful treatment and selected items for the final scales that clinicians rated as most likely to change.

We offer a few words of caution regarding the scale’s use. The scale was developed mainly as a group outcome measure; therefore, the clinical application of the scale with individual patients will require special attention. Although the scale does not directly measure psychiatric symptoms, thereby somewhat disguising the intent of the instrument, the individual items are clearly transparent and all items are scored in the same direction. These characteristics combine to make the test susceptible to distortion from either a response style (socially desirable responding or malingering) or a global response set (all good or all bad responding). Careful attention should be given to the demand characteristics of the assessment setting to minimize the respondent’s motivation to distort his or her results. The availability of standardized norms for the scale (which are currently being developed) will also help decrease this potential liability.

We have named our scale the Schwartz Outcomes Scale (SOS–10), in honor of Kenneth B. Schwartz, a health care lawyer who died in 1995 at MGH from lung cancer. Shortly before his death, he founded the Kenneth B. Schwartz Center in Boston, whose mission is strengthening the relationship between patient and caregiver. In Schwartz’s (1995) description of his experience, he stressed the importance of the care he received from many caregivers (particularly Dr. Edwin Cassem, MGH Chief of Psychiatry), which helped him to maintain his psychological health during his catastrophic illness. We hope the SOS–10 will be used to improve patient–caregiver communication about the impact of treatment (psychiatric or physical) on a patient’s sense of psychological health.
It is also our hope that, by developing a brief scale based on input from clinicians of vastly diverse theoretical persuasions (demonstrating that there are commonalities across treatment approaches), more clinicians will be inclined to measure the outcome of their work and, with measurement of the benefits patients receive, the true value of our services can be empirically demonstrated.

ACKNOWLEDGMENTS

Mark A. Blais and William R. Lenderking are listed alphabetically, as each author contributed equally to this project. We acknowledge the contributions of the following experts in psychiatric treatment who were interviewed and some of whom also participated in the original rating of the items: Robert Abernethy, MD; David Ahern, PhD; Anne Alonso, PhD; Edwin M. Cassem, MD; Rees Cosgrove, MD; Michael Jenike, MD; Michael Jellinek, MD; Jerrold Rosenbaum, MD; Ron Schouten, MD; and Jeffrey Weilburg, MD. We are also grateful to the patients and nonpatients who contributed their time and effort by filling out questionnaires so that the benefits of psychiatric treatments could be more adequately and widely measured.

REFERENCES

Andrews, F. M., & Withey, S. B. (1976). Social indicators of well-being: America’s perception of life quality. New York: Plenum.
Andrich, D. (1988). Rasch models for measurement. Newbury Park, CA: Sage.
Antonovsky, A. (1979). Health, stress, and coping: New perspectives on mental and physical well-being. San Francisco: Jossey-Bass.
Antonovsky, A. (1987). Unraveling the mystery of health: How people manage stress and stay well. San Francisco: Jossey-Bass.
Beck, A. T., Kovacs, M., & Weissman, A. (1975). Hopelessness and suicidal behavior: An overview. Journal of the American Medical Association, 234, 1146–1149.
Beck, A. T., & Steer, R. A. (1987). Manual for the revised Beck Depression Inventory. San Antonio, TX: Psychological Corporation.
Beck, A. T., Ward, C., Mendelson, M., Mock, J., & Erbaugh, J. (1961). An inventory for measuring depression. Archives of General Psychiatry, 4, 561–571.
Bergner, M. (1984). The Sickness Impact Profile (SIP). In N. K. Wenger, M. E. Mattson, C. D. Furberg, & J. Elinson (Eds.), Assessment of quality of life in clinical trials of cardiovascular therapies (pp. 152–159). New York: Le Jacq.
Blais, M. A. (1999). The Amount of Trouble Scale: A brief measure of major psychiatric symptoms. Manuscript in preparation, Harvard Medical School.
Derogatis, L. R. (1983). SCL–90–R: Administration, scoring and procedures manual II. Baltimore: Clinical Psychometrics Research.
Eckert, P. A. (1994). Cost control through quality improvement: The new challenge for psychology. Professional Psychology: Research and Practice, 25, 3–8.
Eisen, S. V., Dill, D. L., & Grob, M. C. (1994). Reliability and validity of a brief patient-report instrument for psychiatric outcome evaluation. Hospital and Community Psychiatry, 45, 242–247.
Eisen, S. V., Wilcox, M., Schaefer, E., Culhane, M., & Leff, H. L. (1997, March). Use of the BASIS–32 for outcome assessment of recipients of outpatient mental health services. Belmont, MA: Substance Abuse and Mental Health Services Administration’s Center for Mental Health Services.
Embretson, S. (1996). The new rules of measurement. Psychological Assessment, 8, 341–349.
Goodman, W. K., Price, L. H., Rasmussen, S. A., Mazure, C., Fleischmann, R. L., Hill, C. L., Heninger, G. R., & Charney, D. S. (1989). The Yale–Brown Obsessive–Compulsive Scale: Development, use and reliability. Archives of General Psychiatry, 46, 1006–1011.
Hamilton, M. (1960). A rating scale for depression. Journal of Neurology, Neurosurgery, and Psychiatry, 23, 56–62.
Heatherton, T., & Polivy, J. (1991). Development and validation of a scale for measuring state self-esteem. Journal of Personality and Social Psychology, 60, 895–910.
Howard, K. I., Lueger, R. J., Maling, M. S., & Martinovich, Z. (1993). A phase model of psychotherapy outcome: Causal mediation of change. Journal of Consulting and Clinical Psychology, 61, 678–685.
Hunt, S., McEwen, J., & McKenna, S. P. (1985). Measuring health status: A new tool for clinicians and epidemiologists. Journal of the Royal College of General Practice, 35, 185–188.
Jette, A. M., Davies, A. R., Cleary, P. D., Calkins, D. R., Rubenstein, L. V., Fink, A., Kosecoff, J., Young, R. T., Brook, R. H., & Delbanco, T. L. (1986). The Functional Status Questionnaire: Reliability and validity when used in primary care. Journal of General Internal Medicine, 1, 143–149.


Joint Commission on Accreditation of Healthcare Organizations (1997). ORYX outcomes: The next evolution in accreditation: Performance measurement systems: Evaluation and selection. Oakbrook, IL: Author.
Lambert, M. J., Hansen, N. B., Umphress, V. J., Lunnen, K., Okiishi, J., Burlingame, G. M., & Reisinger, C. W. (1996). Administration and scoring manual for the Outcome Questionnaire (OQ–45.2). Stevenson, MD: American Professional Credentialing Services.
Lambert, M., Okiishi, J., Finch, A., & Johnson, L. (1998). Outcome assessment: From conceptualization to implementation. Professional Psychology: Research and Practice, 29, 63–70.
Lenderking, W. R. (1992, April). Initial work with the desire to live construct: A pilot study. Paper presented at the Seminar on the Sense of Coherence at the New England Medical Center, Boston.
Lenderking, W. R., Tennen, H., Nackley, J. F., Hale, M. S., Turner, R. R., & Testa, M. A. (1999). The effects of venlafaxine on social activity level in depressed outpatients. Journal of Clinical Psychiatry, 60, 157–163.
Linacre, J., & Wright, B. (1997). A user’s guide to Bigsteps: Rasch-model computer program (Version 2.7) [Computer software]. Chicago: MESA.
Lyons, J., Howard, K., O’Mahoney, M., & Lish, J. (1997). The measurement and management of outcomes in mental health. New York: Wiley.
Montgomery, S. A., & Asberg, M. (1979). A new depression scale designed to be sensitive to change. British Journal of Psychiatry, 134, 382–389.
Nunnally, J., & Bernstein, I. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.
Overall, J. E., & Gorham, D. R. (1962). The brief psychiatric rating scale. Psychological Reports, 10, 799–812.
Pavot, W., & Diener, E. (1993). Review of the satisfaction with life scale. Psychological Assessment, 5, 164–172.
Radloff, L. S. (1977). The CES–D scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1, 385–401.
Schwartz, K. B. (1995, July 16). A patient’s story. The Boston Globe Magazine, pp. 21–58.
Sederer, L., Dickey, B., & Herman, R. (1996). The imperative of outcomes assessment in psychiatry. In L. Sederer & B. Dickey (Eds.), Outcome assessment in clinical practice (pp. 1–7). Baltimore: Williams & Wilkins.
Spielberger, C., Gorsuch, R., & Lushene, R. (1970). STAI: Manual for the State–Trait Anxiety Inventory. Palo Alto, CA: Consulting Psychologists Press.
Stewart, A. L., Sherbourne, C. D., Hays, R. D., & Ware, J. E. (1992). Summary and discussion of MOS measures. In A. L. Stewart & J. E. Ware (Eds.), Measuring functioning and well-being: The medical outcomes study approach (p. 345). Durham, NC: Duke University Press.
Stewart, A. L., & Ware, J. E. (1992). Measuring functioning and well-being: The medical outcomes study approach. Durham, NC: Duke University Press.
Talley, P., Strupp, H., & Butler, S. (1994). Psychotherapy research and practice: Bridging the gap. New York: Basic Books.
Veit, C., & Ware, J. (1983). The structure of psychological distress and well-being in general populations. Journal of Consulting and Clinical Psychology, 51, 730–742.
Ware, J. E., Kosinski, M., & Keller, S. D. (1995). SF–12: How to score the SF–12 Physical and Mental Health Summary scales (2nd ed.). Boston: The Health Institute, New England Medical Center.
Ware, J. E., Kosinski, M., Snow, K. K., & Gandek, B. (1993). SF–36 Health Survey manual and interpretation guide. Boston: The Health Institute, New England Medical Center.
Watson, D., Clark, L., & Tellegen, A. (1988). Development and validation of a brief measure of positive and negative affects: The PANAS. Journal of Personality and Social Psychology, 54, 1063–1070.
Wright, B., & Stone, M. (1979). Best test design: Rasch measurement. Chicago: MESA.
Zung, W. W. K. (1965). A self-rating depression scale. Archives of General Psychiatry, 12, 63–70.


APPENDIX A

SCHWARTZ OUTCOMES SCALE–10 ©

Instructions: Below are 10 statements about you and your life that help us see how you feel you are doing. Please respond to each statement by circling the response number that best fits how you have generally been over the last seven days (1 week). There are no right or wrong responses, and it is important that your responses reflect how you feel you are doing. Often the first answer that comes to mind is best. Thank you for your thoughtful effort. Please be sure to respond to each statement.

1) (–.32) Given my current physical condition, I am satisfied with what I can do.
Never   0   1   2   3   4   5   6   All of the time or nearly all of the time

2) (–.27) I have confidence in my ability to sustain important relationships.
Never   0   1   2   3   4   5   6   All of the time or nearly all of the time

3) (–.19) I feel hopeful about my future.
Never   0   1   2   3   4   5   6   All of the time or nearly all of the time

4) (–.12) I am often interested and excited about things in my life.
Never   0   1   2   3   4   5   6   All of the time or nearly all of the time

5) (–.05) I am able to have fun.
Never   0   1   2   3   4   5   6   All of the time or nearly all of the time

6) (+.04) I am generally satisfied with my psychological health.
Never   0   1   2   3   4   5   6   All of the time or nearly all of the time

7) (+.20) I am able to forgive myself for my failures.
Never   0   1   2   3   4   5   6   All of the time or nearly all of the time

8) (+.34) My life is progressing according to my expectations.
Never   0   1   2   3   4   5   6   All of the time or nearly all of the time

9) (+.53) I am able to handle conflicts with others.
Never   0   1   2   3   4   5   6   All of the time or nearly all of the time

10) (+.53) I have peace of mind.
Never   0   1   2   3   4   5   6   All of the time or nearly all of the time
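The response format of the scale (10 statements, each rated from 0 to 6) lends itself to a simple tally. The following Python fragment is a minimal sketch only: it assumes that a total score is formed by summing the 10 responses, giving a range of 0 to 60, a rule that this appendix itself does not state. The function name total_score and the constants are hypothetical and are introduced purely for illustration. The sketch rejects incomplete or out-of-range response sets, in keeping with the instruction to respond to each statement.

```python
from typing import Sequence

N_ITEMS = 10       # the appendix lists 10 statements
MIN_RESPONSE = 0   # anchor label: "Never"
MAX_RESPONSE = 6   # anchor label: "All of the time or nearly all of the time"


def total_score(responses: Sequence[int]) -> int:
    """Return the sum of the 10 item responses.

    Summation is an ASSUMED scoring rule (range 0-60); it is not
    specified in the appendix.
    """
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    for position, value in enumerate(responses, start=1):
        # bool is a subclass of int in Python, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"item {position}: response must be an integer")
        if not MIN_RESPONSE <= value <= MAX_RESPONSE:
            raise ValueError(
                f"item {position}: response {value} is outside "
                f"{MIN_RESPONSE}-{MAX_RESPONSE}"
            )
    return sum(responses)


if __name__ == "__main__":
    # Hypothetical response set, for illustration only.
    print(total_score([3, 4, 2, 5, 3, 1, 0, 6, 4, 2]))
```

Under this assumed rule, higher totals correspond to more frequent endorsement of the positively worded statements.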

Note. The Rasch Logit scores are provided for each item in the parentheses. Schwartz Outcomes Scales–10 (SOS–10) copyright © 1997 by The Massachusetts General Hospital Department of Psychiatry. Reprinted with permission.

Mark A. Blais
Blake–11, Massachusetts General Hospital
55 Fruit Street
Boston, MA 02114
E-mail: [email protected]

Received April 5, 1999
Revised July 26, 1999
