
Am J Psychiatry 154:11, November 1997

Comparability of Telephone and Face-to-Face Interviews in Assessing Axis I and II Disorders

Paul Rohde, Ph.D., Peter M. Lewinsohn, Ph.D., and John R. Seeley, M.S.

Objective: The present study examined the comparability of data obtained by telephone and face-to-face interviews for diagnosing axis I and II disorders. Method: Sixty young adults from the community were interviewed face-to-face and over the telephone regarding axis I disorders; another 60 subjects were interviewed twice regarding axis II disorders. The order of interviews was counterbalanced, and subjects with a history of disorder were oversampled. Agreement between telephone and face-to-face interviews was contrasted with interrater values, which were obtained by having a second interviewer rate a recording of the original interview. Results: Interrater reliability was excellent. Agreement between telephone and face-to-face assessment was excellent for anxiety disorders and very good for major depressive disorder and alcohol and substance use disorders; agreement was problematic, however, for adjustment disorder with depressed mood. Strong support was shown for the validity of the axis II telephone assessment format. Small but consistent trends were noted for lower rates of psychopathology reported in the second interview. Conclusions: This is the first study in which telephone and face-to-face assessments of axis I and II psychopathology were conducted with the same subjects assigned to conditions in a counterbalanced manner. The present findings provide qualified justification for the use of telephone interviews to collect axis I and II data. The apparent concerns do not appear sufficient to override the economic and logistic advantages of telephone interviewing. (Am J Psychiatry 1997; 154:1593–1598)

Received Oct. 30, 1996; revision received March 12, 1997; accepted May 16, 1997. From the Oregon Research Institute. Address reprint requests to Dr. Rohde, Oregon Research Institute, 1715 Franklin Blvd., Eugene, OR 97403-1983. Supported in part by NIMH award MH-50522. The authors thank Helen Orvaschel, Ph.D., for her assistance.

The reliability of psychiatric diagnosis has been markedly enhanced through the use of standardized interview schedules (e.g., the Schedule for Affective Disorders and Schizophrenia [1], the Diagnostic Interview Schedule [2], and the Structured Clinical Interview for DSM-III-R [3]). Although these interviews were developed for use in a face-to-face format, interviews that assess axis I and II disorders in research settings are often done by telephone (J. Endicott, D. Kilpatrick, and R. Kessler, personal communications, 1996). The major advantage of telephone interviews over face-to-face interviews is cost efficiency (4). Telephone interviewing is also logistically simpler, especially if the participant resides in a geographically distant location. The extensive use of telephone interviews in research presupposes that the obtained diagnostic information is as valid as that obtained in person. The goal of the present study is to examine this assumption with regard to axis I and II diagnoses obtained in a group of young adults from the community.

In addition to the obvious research implications, understanding the adequacy of data obtained by telephone has clinical importance. Telephone-based programs have been used to screen for psychiatric difficulties such as depression and obsessive-compulsive disorder (5, 6), administer smoking-cessation programs (7), conduct psychotherapy (8), and provide expert consultation to underserved populations, such as individuals in rural settings (9, 10).

Most previous studies that examined the comparability of telephone and face-to-face interviewing have contrasted the relative prevalence rates of disorder associated with the two assessment procedures. Given similar rates of disorder for the two methods, it has been concluded that the methods are comparable (11–13). A more rigorous test of the comparability of the two assessment methods is to repeat the interview by using both telephone and face-to-face procedures. A few studies have adopted this approach. Wells et al. (14) reinterviewed over the telephone 230 adults who had been interviewed face-to-face 3 months earlier as part of the Epidemiologic Catchment Area study. Diagnostic agreement for depression (major depression and dysthymia) was fair (kappa=0.57). The authors concluded
that the telephone interview had acceptable agreement with the original “gold standard” face-to-face interview and that no evidence suggested that one method resulted in more positive reports regarding depression. Paulsen et al. (15) compared the reliability of lifetime anxiety disorders in 39 probands who initially had been interviewed face-to-face and were reinterviewed by telephone 12–19 months later. Interrater agreement ranged from good to excellent (kappa=0.69–0.84). Given the long interval between interviews, reliability may have been attenuated by change in the subjects’ clinical status. In the present study, the interval between test-retest assessments was evaluated as one of the measures that may affect agreement.

A second limitation of previous studies is that the telephone interview always followed the face-to-face interview. The commonly noted finding of less psychopathology being reported on the second interview (16, 17) makes it impossible to determine if systematic differences were due to the method of assessment or the order of interviews. In the present study, the order of the two assessment methods was counterbalanced.

Finally, gender may influence the degree to which individuals reveal certain kinds of information in the context of different interview formats. There is some indication that men show a greater discrepancy than women between face-to-face and telephone interviews in the reporting of information, at least for sensitive material (18). Gender differences in comparability across assessment methods were examined in the present study.

METHOD

Subjects and Procedures

The current study was conducted in the context of an ongoing follow-up of participants from the Oregon Adolescent Depression Project. Extensive data have been collected previously from these individuals on two separate occasions while they were in high school (14–18 years of age). A detailed description of the Oregon Adolescent Depression Project is provided elsewhere (19).
Subjects from the Oregon Adolescent Depression Project were invited to participate in a third wave of diagnostic assessments at age 24, which included structured psychiatric interviews to assess axis I and II disorders. Written informed consent was obtained after the procedures had been fully described. For the current study, 60 subjects who were residing in the local area were chosen to be interviewed both face-to-face and over the telephone regarding axis I disorders; an additional 60 subjects were chosen to be interviewed twice regarding axis II disorders. To guarantee an adequate representation of various psychiatric disorders, in each group of 60 participants, 20 were selected on the basis of a prior diagnosis of major depression, 20 were selected on the basis of a prior psychiatric disorder other than depression, and 20 had no diagnosed psychopathology. Of the 40 subjects selected because of a psychiatric disorder other than major depressive disorder, rates of disorder were as follows: substance use disorder (42.5%, N=17), adjustment disorder (32.5%, N=13), anxiety disorder (22.5%, N=9), disruptive behavior disorder (12.5%, N=5), dysthymia (7.5%, N=3), and an eating disorder (2.5%, N=1). There was no overlap between subjects who repeated the axis I interview and those who repeated the axis II interview. Fifty percent of the subjects participated in the face-to-face interview first, and 50% were interviewed by telephone
first. The median duration between the two axis I interviews was 14 days (mean=22.5, range=2–92). Median duration between axis II interviews was 12 days (mean=16.8, range=1–55). Seventy (58.3%) of the subjects were young women. The mean age at the time of interview was 24.4 years (SD=0.3). The vast majority (95.8%, N=115) identified themselves as Caucasian. Over two-thirds had either a high school diploma (67.5%, N=81) or a General Equivalency Diploma (2.6%, N=3), and 21.7% (N=26) had gone on to receive a bachelor’s degree. Eighty percent (N=96) were working, 10.0% (N=12) were homemakers, and 3.3% (N=4) were students. The majority (55.0%, N=66) were single, 40.8% (N=49) were married, and 4.2% (N=5) were separated or divorced. The median household income was between $10,000 and $14,999.
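The counterbalanced design described above (half of each group of 60 interviewed face-to-face first, half by telephone first) can be illustrated with a short sketch. This is not the authors' procedure: the source states only that the order was counterbalanced, so the random split, the function name `assign_interview_order`, the seed, and the subject IDs are all assumptions made for illustration.

```python
import random


def assign_interview_order(subject_ids, seed=0):
    """Split subjects so that half receive each interview order.

    Hypothetical helper: the random split is an assumption; the source
    says only that interview order was counterbalanced 50/50.
    """
    ids = list(subject_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    half = len(ids) // 2
    order = {}
    for sid in ids[:half]:
        order[sid] = ("face-to-face", "telephone")
    for sid in ids[half:]:
        order[sid] = ("telephone", "face-to-face")
    return order


# Hypothetical IDs 1..60 standing in for one group of 60 subjects.
assignments = assign_interview_order(range(1, 61))
face_first = sum(1 for first, _ in assignments.values() if first == "face-to-face")
print(face_first, len(assignments) - face_first)  # 30 30
```

Balancing the two orders in this way is what allows an effect of interview order to be separated from an effect of assessment method.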

Diagnostic Interviews

Assessment of axis I disorders. To cover the period between the previous assessment and the current study, the Longitudinal Interval Follow-Up Evaluation (20) was administered to each participant. This methodology provided detailed information about the longitudinal course of all disorders that were present at the previous assessment. The Longitudinal Interval Follow-Up Evaluation also probed for the occurrence of new disorders since the previous assessment. To maintain diagnostic continuity with the first two assessments, a modified version of the Schedule for Affective Disorders and Schizophrenia for School-Age Children (KIDDIE-SADS) (21) that combined features of the epidemiologic and present episode versions was used to assess axis I disorders that began in the interval before the current study. Additional questions were added to 1) assess disorders that were not previously examined (e.g., posttraumatic stress disorder [PTSD] and somatoform disorders), 2) reflect the adult presentations of disorders, and 3) incorporate changes associated with DSM-IV. In the present study, we focus on DSM-IV disorders or disorder categories that had prevalence rates greater than 5%. These were major depressive disorder, anxiety disorders (most often PTSD or panic disorder), alcohol use disorders (dependence and abuse), substance use disorders (most often cannabis abuse or dependence), and adjustment disorder with depressed mood.

Assessment of axis II disorders. The Personality Disorder Examination (22) was used to assess all axis II disorders. The Personality Disorder Examination, which is organized according to issues (work, self, interpersonal relations, affect, reality testing, impulse control), usually took 1–2 hours to administer. When an item was endorsed, specific examples were solicited to rate the trait (0=behavior or trait absent or normal, 1=exaggerated or accentuated, 2=criterion level or pathological). In general, traits needed to be present for 5 years to meet diagnostic criteria. A detailed scoring manual was available that defined the scope and meaning of each item. The Personality Disorder Examination has been found to be among the most reliable assessments of axis II disorders and to be less influenced by concurrent depression or anxiety levels than other assessment methods (23). Items that assessed the provisional disorders of self-defeating and sadistic personality disorder were not included. With the assistance of one of the interview’s authors (Dr. Loranger), 20 items were added to the interview, which was originally developed to assess DSM-III-R criteria, to assess changes in criteria associated with DSM-IV (all forms used in this study are available from Dr. Rohde upon request).
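Reliability and agreement in this report are expressed as kappa, for example in the interviewer training criterion. As a minimal sketch, assuming the usual two-rater Cohen's kappa (the source does not name the variant), chance-corrected agreement between two sets of present/absent diagnoses can be computed as follows. The function name and the example ratings are hypothetical and are not data from the study.

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: proportion of subjects given the same rating twice.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal proportions, summed over categories.
    categories = set(ratings_a) | set(ratings_b)
    expected = sum(
        (list(ratings_a).count(c) / n) * (list(ratings_b).count(c) / n)
        for c in categories
    )
    if expected == 1.0:
        return 1.0  # both sequences use one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings for 10 subjects (1 = disorder present, 0 = absent).
first_interview = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
second_interview = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
print(round(cohens_kappa(first_interview, second_interview), 2))  # 0.52
```

In this made-up example the raw agreement is 80%, but kappa is only about 0.52 because most of that agreement would be expected by chance when the disorder is uncommon.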

Interviewer Training

Six diagnostic interviewers (five women, one man; five with master’s or doctoral degrees in clinical or counseling psychology) were carefully trained in an extensive didactic and experiential course in interviewing. Before collecting data, interviewers were required to demonstrate a minimum kappa of 0.80 across all symptoms for at least two consecutive training interviews. In the first interview, the trainee scored diagnostic data collected by an experienced interviewer (either live or from recording). In the second training interview, the trainee conducted a live interview, while an experienced interviewer independently observed and scored diagnostic data. During data collection, biweekly discussion sessions were held between the supervisor and the interviewers to review interview pro-


TABLE 1. Prevalence of Axis I Disorders in 60 Young Adults From the Community, by Assessment Method and Interview Order

                                           Prevalence of Disorder
                                           Assessment Method         Interview Order
Disorder                                   N    %       N    %       N    %       N    %       Interrater  Repeated
Major depressive disorder                  16   26.7    18   30.0    19   31.7    15   25.0    0.96        0.67a
Anxiety disorders                          7    11.7    7    11.7    8    13.3    6    10.0    0.87        0.84
Alcohol use disorders                      18   30.0    13   21.7    16   26.7    15   25.0    1.00        0.70b
Substance use disorders                    14   23.3    9    15.0c   12   20.0    11   18.3    1.00        0.73d
Adjustment disorder with depressed mood    9    15.0    10   16.7    11   18.3    8    13.3    0.74        0.31

[Of the sub-column labels under Assessment Method and Interview Order, only "Telephone" is present in the source; its position and the remaining labels are missing.]

a Significantly lower than interrater reliability (z=2.50, df=1, p