Copyright 1989 by the American Psychological Association, Inc. 1040-3590/89/$00.75

Psychological Assessment: A Journal of Consulting and Clinical Psychology 1989, Vol. 1, No. 3, 238-241

BRIEF REPORTS

A Structured Interview Version of the Hamilton Rating Scale for Depression: Reliability and Validity

Mark A. Whisman, Kirk Strosahl, Alan E. Fruzzetti, Karen B. Schmaling, Neil S. Jacobson, and Donna M. Miller
University of Washington

Reliability and validity data are provided for pre- and posttreatment administrations of a structured interview version of the Hamilton Rating Scale for Depression (HRSD) integrated with the National Institute of Mental Health Diagnostic Interview Schedule (DIS). Ss were 70 adult patients requesting therapy for depression. Results indicate excellent agreement between DIS-HRSD ratings made by graduate students and psychiatrist-administered HRSD ratings. The DIS-HRSD exhibited a pattern of correlation with other scales of depression similar to that of the HRSD, thus supporting the validity of the new scale. Intraclass correlations and concurrent validity estimates obtained from analyzing data separately for pre- and posttest administrations were consistently lower than those obtained from the whole sample, suggesting that methodological shortcomings in prior psychometric studies of the HRSD may have spuriously inflated the obtained results.

In clinical research settings the Hamilton Rating Scale for Depression (HRSD; Hamilton, 1960, 1967) has become the standard clinical scale to measure depression severity. The emergence of the HRSD is no surprise given its ease of administration and high interrater reliability (Hedlund & Vieweg, 1979). Despite its widespread use, the HRSD has several limitations that have recently been addressed (e.g., Miller, Bishop, Norman, & Maddever, 1985; Rush et al., 1986; Williams, 1988). First, the unstructured nature of the interview procedure has resulted in ambiguity of research findings across settings. Because there are no empirically validated, behaviorally specific anchor points for ratings, documented high interrater reliability within one facility may be partly a function of the shared background and training of raters within that setting. Consequently, reliability between raters from different settings may be lower. Furthermore, because most methodologically sound research projects use more general interview measures of psychopathology such as the Diagnostic Interview Schedule (Robins, Helzer, Croughan, & Ratcliff, 1981) in addition to specific measures of depression, the need to administer the HRSD results in an additional, often redundant, research procedure.

In order to address these limitations while retaining the many strengths of the HRSD, we rewrote the rating scale as a structured interview and integrated it with the National Institute of Mental Health (NIMH) Diagnostic Interview Schedule (DIS; Robins et al., 1981). Endicott, Cohen, Nee, Fleiss, and Sarantakos (1981) have already provided a version of the HRSD extracted from the Schedule for Affective Disorders and Schizophrenia (SADS; Endicott & Spitzer, 1978); given the widespread use of the DIS and the HRSD in investigations of depression, the DIS-HRSD should be a welcome addition to the literature. Furthermore, whereas the SADS requires a clinical interviewer, the DIS (and thus the DIS-HRSD) does not.

We conducted the present study to assess (a) interrater reliability between the DIS-HRSD (administered by clinical psychology graduate students) and standard HRSD ratings (administered by an experienced psychiatrist), and (b) concurrent validity between the DIS-HRSD and other measures of depression.

As Cicchetti and Prusoff (1983) noted, most studies that have examined the reliability and validity of the HRSD have suffered from methodological deficiencies that have distorted the psychometric properties of the scale (cf. Hedlund & Vieweg, 1979). These deficiencies include using joint rather than independent interviews, testing a heterogeneous group of subjects, and combining pre- and posttreatment ratings. Our second purpose in the current study was to examine interrater reliability and concurrent validity estimates obtained separately from pre- and posttreatment evaluations of depressed subjects and then to compare these estimates with those obtained using traditional methods of combining ratings from more than one time period.
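The concern about combining pre- and posttreatment ratings can be illustrated with a small, entirely synthetic example. The scores below are invented for illustration and are not data from this study, and the helper name `pearson_r` is an assumption of the sketch. Two hypothetical raters show zero correlation within each assessment period, yet pooling the two periods yields a correlation above .90, because most of the pooled variance reflects the pre-to-post drop in severity rather than agreement between raters. A Pearson correlation is used here only for simplicity.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Synthetic severity scores from two hypothetical raters (A and B).
# Within each period the raters' scores are uncorrelated by construction.
pre_a, pre_b = [19, 21, 19, 21], [19, 19, 21, 21]   # high severity before treatment
post_a, post_b = [4, 6, 4, 6], [4, 4, 6, 6]         # low severity after treatment

r_pre = pearson_r(pre_a, pre_b)
r_post = pearson_r(post_a, post_b)
r_pooled = pearson_r(pre_a + post_a, pre_b + post_b)

print(f"pre: {r_pre:.2f}  post: {r_post:.2f}  pooled: {r_pooled:.2f}")
```

The pooled coefficient is driven almost entirely by the difference between the pre and post means, which is why estimates computed separately for each period are the more informative check on rater agreement.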

Preparation of this article was supported by National Institute of Mental Health Grant 5 RO1 MH33838-08 awarded to Neil S. Jacobson. Portions of this article were presented at the annual meeting of the Western Psychological Association, San Francisco, April 1988. We wish to thank Michelle DeKlyen, Victoria Follette, Amy Holtzworth-Munroe, Sheppard Salusky, and Elizabeth Wasson for their help in collecting portions of the data presented. We would also like to thank the reviewers of an earlier draft of this manuscript for their helpful comments. Correspondence concerning this article should be addressed to Neil S. Jacobson, Department of Psychology, NI-25, University of Washington, Seattle, Washington 98195.

Method

Subjects were 70 female outpatients requesting psychotherapy for depression as part of a larger outcome study. All subjects were married, and they had a mean age of 39.0 years (range = 23 to 59). Most subjects (78%) had some college-level education. Subjects either were self-referred, were referred by community agencies, or were responding to newspaper or radio advertisements. Of the initial 70 subjects who entered the study, 30 were reevaluated following treatment.

HRSD. The HRSD consists of 17 items designed to measure severity of depression: 9 are rated on 5-point (0-4) scales, and 8 are rated on 3-point (0-2) scales. Total scores range from 0 to 52. The HRSD ratings were made by an experienced psychiatrist who was blind to the subjects' DIS-HRSD scores.

DIS-HRSD. Graduate research assistants administered the DIS-HRSD, which was the HRSD integrated with the DIS's Anxiety, Affective, and Paranoia sections.¹ Most of the DIS-HRSD items were written in the same structured format as the other DIS items, allowing interviewers to assign an exact HRSD symptom rating with a minimum of subjective interpretation. The DIS-HRSD yields item scores parallel to those of the HRSD.

Seven graduate student research assistants received didactic instructions from, and practiced role playing with, professional staff regarding administration and scoring of the DIS-HRSD. In addition to this training, they each watched and practiced coding live and taped interviews conducted by the professional staff. Raters were observed during their first two to four interviews to ensure compliance with interview rating procedures.

Subjects requesting psychotherapy were scheduled for a pretreatment evaluation, which consisted of completing the Beck Depression Inventory (BDI; Beck, Rush, Shaw, & Emery, 1979) and the DIS-HRSD as part of a battery of clinical measures. Subjects who were not screened out were given a packet of questionnaires that included the Minnesota Multiphasic Personality Inventory (MMPI; Hathaway & McKinley, 1983) and the Carroll Rating Scale for Depression (CRS; Carroll, Feinberg, Smouse, Rawson, & Greden, 1981) to complete at home. They were subsequently scheduled for an unstructured evaluation with the staff psychiatrist, from which the HRSD ratings were derived. Following therapy, subjects were readministered the self-report questionnaires and were scheduled for two posttreatment evaluations identical to the pretreatment evaluations. For both assessment periods, the DIS-HRSD and HRSD evaluations occurred approximately 2 weeks apart (M = 12.1 days, SD = 10.1 days).
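The scoring arithmetic given for the HRSD (9 items rated 0-4 and 8 items rated 0-2, for a total of 0 to 52) can be checked with a minimal sketch. The names `ITEM_MAXIMA` and `hrsd_total` are hypothetical, and because the text does not say which items use which scale, the ordering of the maxima below is arbitrary.

```python
# Item maxima as described in the text: 9 items on 0-4 scales, 8 on 0-2 scales.
# The text does not state which item uses which scale, so this order is arbitrary.
ITEM_MAXIMA = [4] * 9 + [2] * 8


def hrsd_total(ratings, item_maxima=ITEM_MAXIMA):
    """Sum 17 item ratings after checking each one against its scale maximum."""
    if len(ratings) != len(item_maxima):
        raise ValueError(f"expected {len(item_maxima)} ratings, got {len(ratings)}")
    for position, (rating, maximum) in enumerate(zip(ratings, item_maxima), start=1):
        if not 0 <= rating <= maximum:
            raise ValueError(f"rating {rating} at position {position} is outside 0-{maximum}")
    return sum(ratings)


lowest = hrsd_total([0] * 17)       # every item rated at its minimum
highest = hrsd_total(ITEM_MAXIMA)   # every item rated at its maximum
print(lowest, highest)
```

Validating each rating against its own maximum catches the most likely data-entry error, a 3 or 4 recorded on one of the 0-2 items.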

Results

To measure the interrater reliability for the DIS-HRSD and the psychiatrist-administered HRSD, intraclass correlation coefficients (Bartko, 1966) for total scores and for individual items were computed for the total sample and then separately for pre- and posttreatment administrations of the two scales. These results are presented in Table 1. For the whole sample, the intraclass correlation between the total scores of the two versions was .85 (p < .001). For the two versions, 64% of the protocols were within 3 points of each other, 72% were within 4 points, and 82% were within 5 points. There was no evidence of systematic disparity between the two scales: When the two scales were not in exact agreement, the DIS-HRSD ratings were greater than the HRSD ratings exactly 50% of the time. Intraclass correlations for individual items ranged from -.24 to .96, with a median of .89; 12 items (71%) yielded significant correlations (i.e., ps < .001). We used the guidelines proposed by Cicchetti and Sparrow (1981) for the clinical evaluation of the intraclass correlations of individual scale items: 11 (65%) of the items had excellent reliability (i.e., correlations of .75 to 1.00), 4 items (24%) had good reliability (i.e., correlations of .60 to .74), and 2 items (12%) had poor reliability (i.e., correlations of less than .40).

Table 1
Intraclass Correlations Between the DIS-HRSD and the HRSD

Item                                 Total     Pre      Post
Depressed mood                        .86*    -.26     -.09
Feelings of guilt                     .91*     .15      .82*
Suicide                               .91*     .74*     .65
Insomnia early                        .96*     .94*    -.47
Insomnia middle                       .94*     .87*     .84*
Insomnia late                         .89*     .82*    -.24
Work and activities                   .91*     .49     1.00*
Psychomotor retardation               .02     -.35       a
Psychomotor agitation                 .69      .54       a
Anxiety psychic                       .90*    -.79      .64
Anxiety somatic                       .73*     .22      .71
Somatic symptoms gastrointestinal     .64      .33      .01
Somatic symptoms general              .94*     .77*     .76
Genital symptoms                      .84*     .39      .37
Hypochondriasis                       .69      .28      .81*
Loss of weight                        .89*     .88*    -.94
Insight                              -.24      .15     -.09
Total score                           .85*     .51*     .55

a Intraclass correlation could not be computed because symptom was never rated as present by both raters.

When the pre- and posttest ratings were analyzed separately, interrater reliability estimates were consistently lower than when the whole sample was analyzed. For pretest ratings, the intraclass correlation for the total score was .51; 59% of the ratings were within 3 points of one another, 64% were within 4 points, and 79% were within 5 points. Intraclass correlations for individual items ranged from -.79 to .94, with a median of .39; six items (35%) yielded significant correlations. Five items (29%) had excellent reliability, one item (6%) had good reliability, two items (12%) had fair reliability (i.e., correlations of .40 to .59), and nine items (53%) had poor reliability.

Similar results were found using posttest ratings. The intraclass correlation for the total score was .55; 77% of the ratings were within 3 points of one another, and 90% were within 4 points. Intraclass correlations for individual items ranged from -.94 to 1.00, with a median of .64; four items (27%) yielded significant correlations. Five items (33%) had excellent reliability, three items (20%) had good reliability, and seven items (47%) had poor reliability.

In many psychotherapy and drug-therapy outcome studies, HRSD scores of 14 or greater at pretest are required for subjects to be included in the study, whereas scores of 6 or lower at posttest are necessary for subjects to be classified as recovered. With a cutoff of 14 or greater on pretreatment ratings, there was an 87% correspondence between the DIS-HRSD and the HRSD for treatment inclusion. Similarly, with a cutoff of 6 or lower on posttreatment ratings, 80% agreement as to who was recovered was observed between the two ratings. When the two ratings were not in agreement, the DIS-HRSD ratings tended to be more conservative estimates of depression severity: With the psychiatrist-administered HRSD rating as the criterion, 78% of the cases that were not classified the same by both ratings were falsely

¹ Copies of the DIS-HRSD questions are available upon request from Neil S. Jacobson.
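The agreement statistics reported in the Results can be sketched in a few functions. This is an illustration under stated assumptions, not the authors' analysis: the text cites Bartko (1966) without naming the model, so the sketch assumes a one-way random-effects intraclass correlation for two raters. The band boundaries follow the Cicchetti and Sparrow (1981) guidelines as quoted in the text. All function names and the sample scores are hypothetical.

```python
def icc_oneway(pairs):
    """One-way random-effects ICC for two ratings per subject (assumed model)."""
    n, k = len(pairs), 2
    grand_mean = sum(a + b for a, b in pairs) / (n * k)
    subject_means = [(a + b) / k for a, b in pairs]
    # Between-subjects and within-subjects mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for (a, b), m in zip(pairs, subject_means)
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_between - ms_within) / denominator


def reliability_band(icc):
    """Label an ICC using the cutoffs quoted in the text."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


def proportion_within(pairs, points):
    """Share of subjects whose two total scores differ by at most `points`."""
    return sum(abs(a - b) <= points for a, b in pairs) / len(pairs)


def classification_agreement(pairs, is_case):
    """Share of subjects given the same classification by both ratings."""
    return sum(is_case(a) == is_case(b) for a, b in pairs) / len(pairs)


# Hypothetical (DIS-HRSD, HRSD) total scores, invented for illustration only.
scores = [(15, 16), (13, 14), (20, 10), (10, 8)]

icc = icc_oneway(scores)
print(reliability_band(icc))
print(proportion_within(scores, 3))
print(classification_agreement(scores, lambda total: total >= 14))
```

The same `classification_agreement` helper covers the recovery criterion by passing `lambda total: total <= 6` with posttreatment scores.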


Table 2
Correlations Between the Two Versions of the HRSD and Other Measures of Depression

HRSD version      BDI       MMPI-D    CRS
DIS-HRSD
  Total           .78***    .50***    7^***
  Pre             .26*      .02       .42***
  Post            .50**     .20       .72***
HRSD
  Total           .85***    .62***    .78***
  Pre             .27*      .20       .41***
  Post            .67***    .50**     .68***

Note. BDI = Beck Depression Inventory; MMPI-D = Minnesota Multiphasic Personality Inventory-Depression scale; CRS = Carroll Rating Scale for Depression. *p
