Psychotherapy Research 7(2) 155-171, 1997
DIMENSIONS OF CLIENTS’ INITIAL PRESENTATION OF PROBLEMS IN PSYCHOTHERAPY: THE EARLY ASSIMILATION RESEARCH SCALE
William B. Stiles, Miami University
Marie Claire Shankland and John Wright, Sheffield Consulting and Clinical Psychologists and Community Health Sheffield NHS Trust
Susan D. Field, University of Leeds
We report the development of the Early Assimilation Research Scale (EARS), a measure of clients’ presentation of their problems during the first 20 minutes of their first psychotherapy session. Using an iterative group-based process in which we cycled between listening to first-session tape recordings and discussing our understandings, we identified the following dimensions of clients’ presentations: specificity, internality, in-session distress, reported distress, richness of understanding, and openness of the negotiation. An application of the EARS to data from a large comparative psychotherapy research project yielded acceptable interrater reliability and preliminary but promising evidence of construct validity, based on correlations of EARS scales with measures of symptom intensity at intake and impact of the first session.
This research was done at the Medical Research Council/Economic and Social Research Council Social and Applied Psychology Unit at the University of Sheffield, United Kingdom. It was supported in part by Senior International Fellowship number 1 F06 TW01808-01 awarded to William B. Stiles by the Fogarty International Center of the National Institutes of Health. Portions of this study were reported at the meeting of the North American chapter of the Society for Psychotherapy Research, Santa Fe, New Mexico, USA, in February 1994, and at the meeting of the Society for Psychotherapy Research, York, England, in June 1994. We thank Michelle Atherton, Jane Edmonds, Sarah Flowers, and Dave Newman for assistance in rating, and we thank Gillian E. Hardy, Lara Honos-Webb, and Crystal L. Park for helpful discussions and comments on earlier versions. Correspondence regarding this article should be addressed to William B. Stiles, Department of Psychology, Miami University, Oxford, OH 45056, USA.
This paper describes an instrument for rating how clients present their problems at the beginning of therapy. The instrument’s content and structure were informed by a model of clients’ assimilation of their problematic experiences (Stiles et al., 1990), and we call the instrument the Early Assimilation Research Scale (EARS). We explain how we identified a set of conceptual dimensions and scales that characterize presenting problems, and we report the interrater reliability and construct validity of the EARS as applied to the first sessions of clients in the Second Sheffield Psychotherapy Project (Shapiro et al.,
1994).
Clinically, clients’ first communications have long been accorded special significance (e.g., Coltart, 1987; Goldman & Milman, 1978; Malan, 1982; Wolberg, 1977). Alvarez (1992) commented, “everything we need to know about the patient is contained in the first session, if only we had the wit and understanding to see it. . . . [I]t is nevertheless difficult to be a microscope and a telescope at one and the same time” (p. 15).
Most research on first sessions has concerned drop-out populations and focused on clients’ demographic characteristics, distance from the clinic, evaluations of their therapists, and level of personal adjustment, rather than the content or style of clients’ presentations (Cross & Warren, 1984; Gaines & Stedman, 1981; Lowman, DeLange, Roberts, & Brady, 1984; Shapiro & Budman, 1973). Investigators studying the therapeutic relationship or working alliance have agreed that early sessions are crucial (Gelso & Carter, 1985; Horvath & Greenberg, 1986; Luborsky, Crits-Christoph, Alexander, Margolis, & Cohen, 1983), but have generally assumed that assessment of the alliance prior to the third session would not be valid (Hartley & Strupp, 1983; Horvath & Greenberg, 1986; Morgan, Luborsky, Crits-Christoph, Curtis, & Solomon, 1982), reasoning that the alliance is unstable in its formative stages. However, as noted by Gelso and Carter (1985), no one has examined the extent to which this is true, and important aspects of the alliance may be established within the first session (Morgan et al., 1982; Kokotovic & Tracey, 1990).
The EARS was meant primarily to assess current cognitive and affective characteristics of the client. However, we recognized that the dialogue of first sessions is jointly constructed, so that ratings of a client’s presentation in part reflect the therapist and the therapeutic relationship.
THE ASSIMILATION MODEL
The assimilation model is an evolving conceptualization of clients’ internal processes (Stiles et al., 1990). It proposes that in successful psychotherapy, the client’s problematic experiences (perceptions, memories) are assimilated into schemata (scripts, ways of thinking) that are developed during treatment. In assimilating an experience, a schema integrates it into its system of associations. Once assimilated, the formerly problematic experience becomes part of the schema.
According to the model, assimilation proceeds through a series of 8 predictable stages, which are identified by cognitive and affective markers and are described in the Assimilation of Problematic Experiences Scale (APES; Stiles et al., 1991). Our names for these stages, in sequence, are: warded off, unwanted thoughts, vague awareness/emergence, problem statement/clarification, understanding/insight, application/working through, problem solution, and mastery. As a problematic experience passes through these stages, the salience of the problematic content in the client’s awareness is assumed to increase until understanding/insight is achieved, and then to decrease until the content is fully assimilated and dealt with automatically. In concert with these cognitive changes, the client is hypothesized to have a parallel sequence of emotional reactions, from being oblivious or feeling only vaguely disturbed when the experience is warded off or unwanted, to experiencing the content as
acutely painful as it emerges into awareness. The distress then decreases as the experience is formulated as a problem and then understood. Then, the affect may turn positive as the formerly problematic experience is worked through and solved and, finally, neutral when the problem is mastered and no longer an issue.
Clients seem most likely to enter treatment with problems in the stages of unwanted thoughts, vague awareness/emergence, or problem statement/clarification. If their problems were warded off, clients would probably not be motivated to present for therapy. If, on the other hand, they had already achieved a satisfactory understanding/insight, they would probably not feel a need for professional help.
The assimilation model offers an integrative approach to understanding change in different types of psychotherapy (cf. Prochaska & DiClemente, 1984; Ryle, 1990). At a practical, clinical level, the APES stages suggest a series of treatment subgoals, which consist in advancing each problematic experience from one APES stage to the next. The transitions between different APES stages may require different sorts of therapeutic interventions. For example, unwanted thoughts may be best evoked with supportive, exploratory interventions, whereas stated problems may respond to more direct interpretive, restructuring interventions. Although most therapies can probably work at many levels, the APES stages suggest a coherent basis for assigning problems to treatments (Reynolds et al., 1996; Stiles, Barkham, Shapiro, & Firth-Cozens, 1992; Stiles et al., 1990). Psychodynamic, experiential, and interpersonal psychotherapies seem to focus on unassimilated experiences and work toward understanding. By contrast, cognitive and behavioral therapies seem to focus on known problems and apply rational solutions in practical situations.
Thus, clients presenting problems at the problem statement/clarification stage might logically be assigned to cognitive or behavioral treatments, whereas clients presenting poorly assimilated problems might be assigned to a psychodynamic, experiential, or interpersonal approach. Research on these sorts of issues, however, will require systematic measurement of clients’ presenting problems, including their degree of assimilation.
Designed to address these issues, the EARS includes the APES but also new scales meant to measure assimilation indirectly. The early stages of assimilation are difficult to measure because by definition the client’s problematic experiences cannot be clearly expressed. For example, problematic experiences at the vague awareness/emergence stage should tend to be cognitively vague and diffuse but affectively intense and distressing. Thus, clients who describe their problems in vague terms but with intense distress are likely to have problematic experiences at the vague awareness/emergence stage, even though neither we nor they could say just what the problems are.
Our goals in this paper, then, were to describe the development of the EARS and to evaluate its interrater reliability and construct validity.
DEVELOPMENT OF THE EARS
The APES stages suggested three heuristics that differentiate among the levels of assimilation: affect, cognition, and ease of articulation of the problematic experience. Guided by these heuristics, we sought to identify and describe distinct, salient, observable features and dimensions that appeared to differentiate among clients’ presentations of their problematic experiences at the outset of therapy. We were guided particularly by whether the dimensions appeared relevant to the assimilation model, but also by whether they captured other clinically salient aspects of the clients’ initial presentation. Thus, we cast a net that was somewhat wider than the assimilation model, encompassing some variation among clients that seemed important to us on other clinical or conceptual grounds.
GROUP-BASED PROCEDURE FOR DRAFTING SCALES
We (all four authors participated) studied audiotape recordings of first sessions from pilot cases from the Second Sheffield Psychotherapy Project (Shapiro et al., 1994; described later). In all, approximately 30 first-session tapes were used in this process. The observable features and dimensions were identified, elaborated, and incorporated into rating scales using a group-based systematic iterative procedure adapted from one developed by Ward (1987):
1. We began by discussing our general goals and initial understandings.
2. We independently listened to tapes of first sessions and drafted scales and rating procedures.
3. We circulated our drafts and listed the strengths of each other’s drafts.
4. We met to discuss and consolidate the list of strengths.
5. We listened to further tapes, tried to apply the scales, and revised our drafts, borrowing freely from each other and attempting to incorporate all of the identified strengths in the revisions.
We then repeated steps 3-5 several times. After about three iterations, we found that our versions converged. Each of us had contributed to shaping every scale, and we felt that we had synthesized each collaborator’s clinical observations, theoretical understandings, and creative insights, rather than settling for only those understandings that we had initially held in common. We applied a consolidated version to further tapes and made further revisions to remove ambiguities and refine our explanation of what each scale measured.
The results of our iterative, qualitative work included (a) a set of dimensions that seemed to underlie and inform the more complex assimilation rating represented by the APES, (b) a set of 5-point scales for rating these dimensions as manifested early in therapy, and (c) systematic procedures for applying the scales. The EARS dimensions, scales, and procedures were documented in a rating manual (Stiles, Field, Shankland, & Wright, 1993). We note that each of the dimensions we identified bears a similarity to dimensions identified previously by others.
EARS DIMENSIONS AND SCALES
Specificity concerns the degree to which the client’s presenting problematic experience is specific, precise, and concrete as opposed to being vague, general, or abstract. Specificity ranges from global characterizations of problems (e.g., “I seem to miss the point all my life”) to descriptions of unique instances (specific targets and occasions, e.g., “When I was 7 years old, I ran out of school assembly. I recall that day clearly. It was my first panic attack.”). Theoretically,
specificity should increase over the APES range from vague awareness to problem solution. At very low levels of assimilation, however, problem descriptions may have a specific quality but be perceived as originating outside of the self. Evidence and theory from other approaches also link lack of specificity with vulnerability to psychopathology. Over-generality in retrieval of personal memories is a significant predictor of suicidality and degree and chronicity of emotional disturbance (Williams & Dritschel, 1988; Williams et al., 1996). According to the reformulated learned helplessness model, over-global attributions are associated with depression (Abramson, Seligman, & Teasdale, 1978).
Internality concerns the client’s perception of self-in-relation-to-problem. Does the client consider the problem as “out there” or “in here”? Low ratings describe problems that the client attributes mainly to excessive external stressors, such as job, family, or physical conditions or illness. High ratings are given to problems that describe personal failures to cope, such as inability to perform up to expectations or emotional overreactions. Heider (1958) distinguished causes that reside within the person (internal) from causes concerning other people, luck, or circumstances (external). Theoretically, problematic experiences that are unassimilated (warded off) might be associated with externalized attributions for psychological problems, whereas problematic experiences in the process of painfully emerging might be associated with exaggerated internalization. Clinically, others have linked the concept of internality to esteem-related emotions such as pride and guilt and interpersonal emotions such as pity and anger (Brewin, 1988). Attributing personal problems to internal causes has also been linked to depression (Abramson et al., 1978; Peterson et al., 1982).
Importantly, however, the EARS internality and specificity scales seek to measure dimensions of clients’ initial presentation, which may or may not reflect traitlike cognitive styles.
Distress concerns the emotional valence and intensity of the problematic experience. We distinguished and constructed separate scales for (a) distress displayed in-session, and (b) distress reported outside the session in the client’s current life. Theoretically, distress may be low at very low levels of assimilation reflecting successful warding off, highest in the vague awareness/emergence stage, and then declining until the problem is worked through and solved. Clinically, distress is a salient feature of clients’ presentation of their problematic experiences and often forms the basis for diagnosis. Although distress is negative by definition, affective experiencing, catharsis, and working with emotions in therapy may make positive contributions to outcome in many types of psychotherapy (Karasu, 1986).
Richness of understanding concerns the content of clients’ schemata for understanding their presenting problematic experiences. It represents an assessment of a client’s cognitive resources, ranging from impoverished or rigid understanding, through a moderate level of “wondering” or acknowledging a lack of understanding or insight, to a rich flexible understanding that incorporates a coherent story with alternative perspectives. Watts (1992) described this “higher-order” level of understanding as the interpretational level of cognitive processing, which can be contrasted with more basic emotion-related thoughts and memories. Theoretically, cognitive resources within the domain of a particular problematic experience should promote assimilation of the experience.
Openness of the negotiation concerns clients’ apparent willingness or ability to consider alternative views of their problematic experience raised by the therapist. This scale therefore reflects interpersonal engagement, an important quality of the alliance in its very early stages. Either one-sided extreme (refusing to listen to the therapist or expecting the therapist to provide a full understanding) is given a low rating. Insofar as openness reflects a greater flexibility and gives access to a wider range of alternative schemata, it should promote assimilation and be promoted by assimilation. Gomes-Schwartz (1978) found that patients who were not hostile or mistrustful and who actively contributed to the sessions had better outcomes than clients who were withdrawn, defensive, and apparently unwilling to engage. Assessing the openness of the negotiation seems particularly relevant in light of evidence that therapists may not adequately judge the collaborative aspects of psychotherapy (Horvath & Symonds, 1991).
Nominal Scales. In addition to the foregoing 5-point rating scales, the EARS includes several nominal scales. These were meant primarily as preparation for rating (described later) and are not analyzed in this paper. They included classifications of the content of the main presenting problem and of the predominant type of stress experienced by the client.
Summary APES rating. After applying all of the other scales, raters are instructed to give a summary APES rating, taking into account all of the other information. The APES was rated on an 8-point (0-7) scale using descriptions of the scale points developed previously (Stiles et al., 1991).
EARS RATING PROCEDURE
The EARS provides conceptual preparation for rating by systematically directing raters’ attention to crucial issues and features underlying the scales. As with other systems for measuring complex psychological phenomena (e.g., adult attachment classifications system, Main & Goldwyn, in press), EARS ratings require consideration of multiple factors and integration of multiple impressions. Rating specificity, for example, requires integration of data across an extended stretch of dialogue. Similarly, the summary APES rating is not merely a mechanical combination of the contributing dimensional ratings but may depend on emphasizing one dimension, de-emphasizing another, using the inevitably partial information available, and meeting criteria not represented in the other EARS dimensions.
The EARS rating form and manual (Stiles et al., 1993) guide raters towards complex judgments by drawing their attention to relevant aspects of the material in each session. First, the manual instructs raters to briefly “quote or paraphrase the main problem for which the client is seeking help . . . from the client’s point of view.” (Separate forms may be used in case of multiple main problems.) Then raters classify the problem as to whether it concerned primarily symptoms, mood, self-esteem, relationship difficulties, or specific (work) performance issues (these categories are defined in the manual). Later, in preparation for rating internality, raters are asked to report the presence and type of excessive stress and of coping failure, as viewed by the client. Finally, the whole series of dimensional ratings contributes to preparation for the summary APES rating.
Our iterative procedure also yielded general instructions:
a. Ratings should be based on the first 20 minutes of the first therapy session, starting from the point where the therapist first allows or invites the client to state or discuss the problems or reasons for coming. In our pilot work, we found that this interval afforded sufficient information for raters while remaining relatively free from approach-specific interventions.
b. Numerical ratings, except for the summary APES, are on 5-point (1-5) scales; the manual provides full descriptions for low (1), moderate (3), and high (5) anchor points, and briefer descriptions for intermediate points (2 and 4).
c. Where ratings are on nominal (category) scales, raters may assign more than one category if necessary, listing the most prominent or appropriate category first.
d. Insofar as possible, ratings should reflect what the client actually presents, not what the rater or the therapist interprets or infers about what the client may think.
e. As they listen, raters note (transcribe, paraphrase, or describe) key passages that illustrate the highest level of each scale that the client has demonstrated up to that point in the session.
f. Thus, the numeric ratings should reflect the highest degree of specificity, internality, distress, richness of understanding, and openness of the negotiation demonstrated spontaneously by the client during the initial 20 minutes. However, ratings may be lowered by one scale point for qualities demonstrated very briefly or demonstrated only after persistent prompting by the therapist.
g. Raters indicate their confidence in each rating on a scale from 0 to 100%.
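The rule in instructions e and f can be illustrated with a short sketch. It is not part of the EARS manual: the `Passage` record, its `brief` and `prompted` flags, and the decision to apply the one-point reduction to each qualifying passage before taking the maximum are assumptions made for the example. The manual leaves the reduction to the rater's judgment ("may be lowered").

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Passage:
    """One key passage noted by a rater (hypothetical record, not from the manual)."""
    level: int              # scale point (1-5) that the passage illustrates
    brief: bool = False     # quality was shown only very briefly
    prompted: bool = False  # shown only after persistent therapist prompting


def scale_rating(passages: list[Passage]) -> int:
    """Return the highest level demonstrated, after the optional reduction.

    Assumption: each brief or prompted passage is lowered by one point
    before the maximum is taken, and a rating never drops below 1.
    """
    if not passages:
        raise ValueError("at least one noted passage is required")
    adjusted = [
        max(1, p.level - 1) if (p.brief or p.prompted) else p.level
        for p in passages
    ]
    return max(adjusted)


# A level-5 passage that appeared only after prompting is treated as 4,
# which still exceeds the spontaneous level-3 passage.
notes = [Passage(level=3), Passage(level=5, prompted=True)]
print(scale_rating(notes))
```

A rater who chose not to apply the reduction would simply take the maximum of the unadjusted levels.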
METHOD
To assess the interrater reliability and construct validity of the EARS, we drew data from the Second Sheffield Psychotherapy Project, a comparison of two time-limited treatments for depression, a psychodynamic-interpersonal (PI) therapy and a cognitive-behavioral (CB) therapy (Shapiro et al., 1994). In overview, clients were assessed with a battery of standard self-report and structured interview measures. Those who met the criteria were stratified for severity of depression and then randomly assigned to either 8 or 16 sessions of one of the treatments. Clients were assessed again at the end of treatment and at 3-month and 1-year follow-up, to estimate the extent of improvement or deterioration. Clients and therapists also evaluated the impact of each of their sessions (Reynolds et al., 1996; Stiles et al., 1994). In all treatment conditions, clients averaged substantial improvement from the beginning to the end of treatment. On a few of the measures, clients in the CB treatment averaged a little more improvement than did clients in the PI treatment. Clients assigned to the 8- and 16-session versions of the treatment averaged similar levels of improvement, though among the most severely depressed clients, somewhat greater gains were shown in the longer treatments (Shapiro et al., 1994). Thus, we were confident before beginning our EARS ratings that most of tin this study of the EARS. Their average age at intake was
40.6 years; 52% were women; 60% had a university education or professional qualification; 65% were married or cohabitating. The therapists were five clinical psychologists (three men, two women), who were also investigators on the project. Each of the therapists saw clients using each treatment approach and duration. (See Shapiro et al., 1994, for details.)
TREATMENTS
The objective, strategy, and techniques of each treatment followed project manuals, summarized by Shapiro and Firth (1987). The PI therapy was based on Hobson’s (1985) Conversational Model. Using psychodynamic, interpersonal, and experiential concepts, it focuses on the therapist-client relationship as a vehicle for revealing and resolving interpersonal difficulties viewed as primary in the origins of depression. The CB therapy was a multimodal approach incorporating cognitive and behavioral strategies. A wide range of techniques is available to the therapist, including anxiety-control training, self-management procedures, cognitive restructuring, and a job-strain package. Therapists participated in weekly peer supervision to maintain the clinical quality of treatment and adherence to the CB and PI treatment protocols. Adherence ratings were obtained on 220 sessions from the Sheffield project. These demonstrated pure delivery of the two treatments, despite the use of the same therapists for both (Startup & Shapiro, 1993).
MEASURES
Symptom Intensity. The Beck Depression Inventory (BDI; Beck, Ward, Mendelson, Mock, & Erbaugh, 1961) was used to measure depressive symptoms. The Global Severity Index of the Symptom Checklist-90, Revised (SCL-90R; Derogatis, 1983) measured general symptomatology. Intrapersonal aspects of depression were assessed by a measure of self-esteem (SE; O’Malley & Bachman, 1979). Interpersonal difficulties were measured by the Inventory of Interpersonal Problems (IIP; Horowitz, Rosenberg, Baer, Ureno, & Villasenor, 1988). These are all standard, widely used instruments of acceptable reliability and validity (details are available in the references cited for each scale and in Shapiro et al., 1994).
Session Impact. The Session Evaluation Questionnaire (SEQ; Stiles, 1980; Stiles et al., 1994), completed immediately after sessions, measures the impact of the session on participants. From the SEQ, we used the two orthogonal session evaluation dimensions, depth and smoothness, each of which is measured as the mean rating on 5 bipolar adjective scales, rated from 1 to 7. Depth concerns the session’s perceived power and value. Smoothness concerns the session’s perceived comfort and pleasantness. The SEQ’s psychometric characteristics in the Sheffield sample have been reported elsewhere (Stiles et al., 1994); internal consistency was high for both depth (alpha = .90) and smoothness (alpha = .92).
SHEFFIELD PROJECT PROCEDURE
Prior to the first session, each client went through a two-session assessment procedure with a trained assessor, which included the BDI, SCL-90R, SE, and
IIP. These measures were repeated at the end of treatment and at three-month follow-up to assess treatment outcomes (Shapiro et al., 1994). At the intake assessment, clients were randomly assigned to a therapist and treatment condition. Clients did not meet their therapist until the first session. The therapists were not shown the results of the initial research assessments, except for a list of target complaints, so the first session was the therapist’s introduction to the client and the problems. All sessions, including the first, were tape recorded for the research, with clients’ written permission. After each session, including the first, clients and therapists completed SEQ forms and returned these to the clinic secretary. Clients were informed that their therapists would not see these forms until after the end of their treatment. For comparison with EARS ratings, we considered clients’ and therapists’ ratings of the depth and smoothness of the first session only.
EARS RATINGS OF THE SHEFFIELD PROJECT TAPES
Seven psychology graduates (5 women, 2 men), who were either in clinical training or had previous experience in psychotherapy process research, served as EARS raters. Before beginning, they read the EARS manual and rating form (Stiles et al., 1993) along with descriptions of the assimilation model (e.g., Stiles et al., 1990, 1991; Stiles, Meshot, Anderson, & Sloan, 1992). Then, they were given extensive practice applying the EARS to pilot cases, with feedback from experienced raters and discussion of difficult points. The EARS was applied to all available audiotaped recordings of first sessions of project cases from the Second Sheffield Psychotherapy Project (N = 112). Each tape was rated three times independently. The tapes were randomly assigned to the raters and rated in random order. Weekly project meetings, attended by all raters, aimed to maintain accuracy and minimize rater drift. The raters were unaware of clients’ assessment and session impact scores. Because of their differing availability, the 7 raters rated different numbers of the tapes, ranging from 14 to 74 (median = 50).
RESULTS
Table 1. Means, Standard Deviations, and Interrater Reliabilities of EARS Numerical Scales
                                                       Interrater reliability
EARS scale                 Mean   Standard deviation   Unadjusted ratings   Standardized ratings
Specificity                3.39   .94                  .72                  .77
Internality                3.34   .93                  .73                  .73
In-session distress        2.11   .94                  .79                  .81
Reported distress          3.22   .80                  .55                  .67
Rich understanding         2.81   .76                  .52                  .61
Openness of negotiation    3.06   .67                  .50                  .62
APES                       2.27   .49                  .29                  .49
Note: N = 112 first sessions, each rated three times. EARS = Early Assimilation Research Scale; APES = Assimilation of Problematic Experiences Scale. Each scale could range from 1 to 5 except the APES, which could range from 0 to 7. Seven raters participated. Ratings were first averaged across the three independent ratings of each session and then across the 112 clients; standard deviations are for the three-rater mean ratings of each session. Interrater reliability = intraclass correlation coefficient designated ICC(1,3) by Shrout & Fleiss (1979), which gives the reliability of the mean of three independent ratings, where raters are treated as a random effect. Reliabilities in the right column were based on ratings that were standardized within raters before averaging.
Table 1 shows the means, standard deviations, and interrater reliabilities of the seven EARS numerical scales: specificity, internality, in-session distress, reported distress, richness of understanding, openness of the negotiation, and the summary APES. Means were calculated by first averaging across the three independent ratings and then across the 112 clients; the standard deviations in Table 1 are for the three-rater mean ratings. As expected, although the APES could be scored from 0 to 7, most clients’ problematic experiences at intake
were rated in the narrow range of unwanted thoughts (scored as 1) to problem statement (scored as 3), with an average just above the vague awareness/emergence level (scored as 2).

None of the EARS scales was significantly correlated with client age. Women (n = 58) tended to receive higher ratings than did men (n = 54) on in-session distress, M = 2.47 versus 1.73, t(110) = 4.44, p < .001, and on reported distress, M = 3.38 versus 3.05, t(110) = 2.23, p = .028. Men and women had similar mean ratings on the other EARS scales.

RELIABILITY OF THE EARS

Interrater reliability of the EARS scales was measured by the intraclass correlation coefficient designated ICC(1,3) by Shrout and Fleiss (1979), which gives the reliability of the three-rater mean rating. ICC(1,3) is a conservative index, which treats rater as a random effect. Because some of the reliabilities of the unadjusted ratings were lower than we had hoped (Table 1), we tried three approaches to improving reliability:

a. Reasoning that part of the unreliability might reflect raters setting different thresholds for scale points, we standardized ratings within each rater (i.e., so that each rater had the same mean and standard deviation on each scale). For most of the EARS scales, the reliability of these standardized ratings was higher than the reliability of the unadjusted ratings (Table 1), so we used mean standardized ratings (i.e., standardized within raters and then averaged across raters) for subsequent analyses.

b. Reasoning that part of the unreliability might reflect raters’ uncertainties regarding some difficult subset of the session tapes, we used raters’ confidence judgments to select a subset of the ratings. For example, in one analysis, we included only ratings of which raters were “50%” or more confident. Overall, the improvement in reliability was negligible. Accordingly, we did not delete low-confidence ratings, though we remain puzzled that confidence seemed so little related to interrater agreement.

c. Reasoning that part of the unreliability might reflect different raters targeting different problematic experiences, we attempted to assess target similarity across raters using the initial open-ended statements of the presenting problem. The three independent raters’ statements were typed and presented to two people, who independently judged the similarity of each pair on a 5-point scale (1 = clearly different to 5 = alike). Reliability of this similarity judgment was ICC(1,2) = .66, sufficient to indicate the potential value of this approach. When EARS ratings were divided into those ostensibly aimed at similar targets (judged at least 4 = “more alike than different”) and those ostensibly aimed at different targets (less than 4), the interrater reliabilities of the EARS ratings were very similar in both groups (and similar to the overall reliabilities shown in Table 1). Accordingly, we did not delete ratings based on low target similarity.

The mean standardized ratings for the specificity, internality, in-session distress, reported distress, richness of understanding, and openness of the negotiation scales were acceptably reliable, though the latter two were only minimally so. The figures in Table 1 represent the mean of three raters’ work; scores based on a single rater’s work would not have been acceptably reliable.

Even after standardizing and averaging, the summary APES rating’s interrater reliability was weak (Table 1), and we dropped it from further analyses in this paper (but see Stiles, Shankland, Wright, & Field, in press). In part, this low reliability may have reflected the theoretically expected restricted range of the APES among presenting problems. Its reliability might have been improved by better conceptual preparation of raters, including more discussion of categorical distinctions on the APES.
However, previous work has suggested that reliable APES ratings may require much more familiarity with each case than was available to these raters (Field, Barkham, Shapiro, & Stiles, 1994).

EARS INTER-SCALE RELATIONSHIPS
To assess the interrelationships among the six numerical rating scales of the EARS, we conducted a principal components analysis of the mean standardized ratings, followed by varimax rotation. Three factors were extracted (all with eigenvalues greater than 1.0), accounting for 77.5% of the total variance. The varimax-rotated factor matrix (Table 2) showed that the two scales representing cognitive and interpersonal resources (richness of understanding and openness of the negotiation), along with the specificity scale, loaded highly on the first factor. In-session and reported distress loaded highly on the second factor, while internality loaded only on a single-item third factor.

Based on the factors shown in Table 2, we combined the in-session and reported distress scales into a single distress scale, and we combined the richness of understanding and openness of the negotiation scales into a single resources scale. Each combined scale was the mean of the two constituent standardized ratings. Internal consistencies (alpha) were .64 and .65, respectively; interrater reliabilities, ICC(1,3), were .79 and .61, respectively. We retained internality as a separate scale. We also retained specificity as a separate scale, even though
Table 2. Varimax Rotated Factor Matrix of EARS Scales
Factor name: Resources, Distress, Internality
EARS scale: Specificity, Internality, In-session distress, Reported distress, Rich understanding, Openness of negotiation
.73 .04 .06
.25 .03
.10
.87* -.06 .16
-.40 .94* .07 -.05 .01 .34
.88*
.86* .75*
Note: Based on ratings of first sessions (N = 112). EARS = Early Assimilation Research Scale. Ratings on the EARS scales were standardized within raters and then averaged across raters (n = 3 per session) before they were intercorrelated. *Scale included in factor index (see text).
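For readers who want to see what the rotation step behind Table 2 does, here is a minimal Python sketch of raw varimax rotation using pairwise planar rotations of the factors. It assumes an unrotated loading matrix is already available from a principal components analysis. The 6 x 3 matrix below is made up for illustration and is not the study's solution, the function names are hypothetical, and the sketch omits the row normalization that some statistical packages apply.

```python
import math


def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    p = len(loadings)
    total = 0.0
    for j in range(len(loadings[0])):
        sq = [row[j] ** 2 for row in loadings]
        m = sum(sq) / p
        total += sum((s - m) ** 2 for s in sq) / p
    return total


def varimax(loadings, max_sweeps=50, tol=1e-10):
    """Raw varimax by repeated rotation of factor pairs."""
    L = [list(row) for row in loadings]  # work on a copy
    p, k = len(L), len(L[0])
    for _ in range(max_sweeps):
        largest = 0.0
        for a in range(k - 1):
            for b in range(a + 1, k):
                u = [row[a] ** 2 - row[b] ** 2 for row in L]
                v = [2 * row[a] * row[b] for row in L]
                A, B = sum(u), sum(v)
                C = sum(ui ** 2 - vi ** 2 for ui, vi in zip(u, v))
                D = 2 * sum(ui * vi for ui, vi in zip(u, v))
                num = D - 2 * A * B / p
                den = C - (A ** 2 - B ** 2) / p
                # Angle that maximizes the criterion for this pair
                phi = math.atan2(num, den) / 4
                c, s = math.cos(phi), math.sin(phi)
                for row in L:
                    x, y = row[a], row[b]
                    row[a], row[b] = x * c + y * s, -x * s + y * c
                largest = max(largest, abs(phi))
        if largest < tol:  # no pair needed further rotation
            break
    return L


# Hypothetical unrotated loadings: six scales (rows) by three components.
unrotated = [
    [0.60, -0.30, -0.45],
    [0.20, 0.10, 0.90],
    [0.45, 0.75, -0.05],
    [0.55, 0.65, -0.15],
    [0.70, -0.45, 0.15],
    [0.65, -0.30, 0.35],
]
rotated = varimax(unrotated)
for row in rotated:
    print(" ".join(f"{value:6.2f}" for value in row))
```

Because the rotation is orthogonal, each scale's communality (the sum of its squared loadings) is unchanged; only the distribution of loadings across factors shifts toward a simpler pattern.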
it loaded highly on the resources factor. As described earlier, specificity is conceptually distinct: it is a characteristic of a particular problematic experience, rather than a broader characteristic of the client. Although it may have been correlated with the resources scales over the range observed in this study, it is theoretically expected to diverge over other parts of the range (e.g., at very low levels of assimilation, specificity may be high when resources are low).

EARS RELATIONS WITH SYMPTOM INTENSITY AND SESSION IMPACT
In comparisons of the mean standardized EARS ratings with the intake BDI, SCL-90R, SE, and IIP, the internality scale showed significant positive correlations with all four assessment measures (Table 3). That is, over the range

Table 3. Correlations of EARS Scales with Assessment Scores at Intake and with Impact of First Session
EARS scale
Specificity, Internality, Distress, Resources
Assessment or impact scale: Intake assessment (BDI, SCL-90R, SE, IIP); Impact on client (Depth, Smoothness); Impact on therapist (Depth, Smoothness)
-.04 -.05 -.09 .05
.23* .29** -.30*** .36**
.24*
-.30** .07
.03 .07 -.25** .16
.02 -.21*
-.02 -.29**
-.05
.03
.00 .10 .06
.30* .09
.26**
.11
-.22*
.09
.21* .21*
Note: N = 107 to 112 because of missing data on some measures. EARS = Early Assimilation Research Scale. Ratings on the EARS scales were standardized within raters and then averaged across raters (n = 3 per session). Distress = mean of In-Session Distress and Reported Distress. Resources = mean of Richness of Understanding and Openness of Negotiation. BDI = Beck Depression Inventory. SCL-90R = Symptom Checklist-90, Revised. SE = Rosenberg Self-Esteem Scale. IIP = Inventory of Interpersonal Problems (total score). Impact dimensions measured by the Session Evaluation Questionnaire, completed by clients and therapists immediately after the first session. *p < .05; **p < .01; ***p < .001.
represented by these clients, greater internality was associated with more problems and greater symptom intensity. EARS distress was significantly correlated with the BDI, SCL-90R, and SE, but not with the IIP. Somewhat paradoxically, the EARS scale representing cognitive and interpersonal resources (richness of understanding and openness of the negotiation) showed a modest but significant negative correlation with SE. The specificity scale was not significantly correlated with any of these intake assessment measures. Correlations of the EARS scales with measures of the first session’s impact (Table 3) showed, plausibly, that clients rated high on distress had sessions that both they and their therapists rated as relatively rough (uncomfortable, tense; shown by negative correlations with SEQ smoothness). Clients whose problems were rated high in internality also tended to judge their sessions as relatively rough; however, therapists tended to rate first sessions with these clients as relatively deep (powerful and valuable). The EARS resources scale was also modestly but significantly correlated with therapists’ SEQ depth and smoothness. In considering the modest size of these correlations, it should be recalled that the EARS ratings covered only the first 20 minutes, whereas the SEQ was completed at the end of the hour-long session.
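The composite scales and the modest correlations discussed above can be illustrated with a short Python sketch. It forms a two-item composite as the mean of standardized ratings, computes Cronbach's alpha for that composite, and converts a Pearson correlation into the t statistic used to test it against zero. All function names and ratings are hypothetical; only the formulas are standard.

```python
import math
from statistics import mean, pstdev, pvariance


def zscores(values):
    """Standardize a list of values to mean 0 and (population) SD 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def pearson_r(x, y):
    """Pearson correlation as the mean product of z-scores."""
    return mean(a * b for a, b in zip(zscores(x), zscores(y)))


def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of equally long score lists."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var = sum(pvariance(item) for item in items)
    return k / (k - 1) * (1 - item_var / pvariance(totals))


def t_from_r(r, n):
    """t statistic (df = n - 2) for testing a Pearson r against zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Hypothetical ratings of eight sessions on two scales.
in_session = [1.0, 2.0, 2.5, 3.0, 1.5, 4.0, 3.5, 2.0]
reported = [2.5, 3.0, 3.0, 4.0, 3.5, 4.5, 3.0, 2.5]

z1, z2 = zscores(in_session), zscores(reported)
# Composite scale = mean of the two standardized ratings
distress = [(a + b) / 2 for a, b in zip(z1, z2)]

r = pearson_r(in_session, reported)
alpha = cronbach_alpha([z1, z2])
print(round(r, 2), round(alpha, 2))
print(round(t_from_r(0.21, 112), 2))
```

For two standardized items, alpha reduces to 2r / (1 + r), so an alpha of .64 corresponds to an inter-item correlation of roughly .47. With n = 112, a correlation of .21 gives t(110) of about 2.25, which exceeds the two-tailed .05 critical value of about 1.98, whereas a correlation of .16 (t of about 1.70) does not.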
DISCUSSION
The EARS offers a quantification of major dimensions of psychotherapy clients’ first communications with their therapists. Our qualitative, iterative procedure for identifying dimensions and constructing scales yielded an instrument that appears promising both psychometrically and conceptually. Although they were developed based on our interest in assimilation of the client’s problematic experiences, the identified dimensions deal with more general issues of affect, cognitive resources, and causal attribution. Thus, the EARS may have application beyond research on the assimilation model.

Specificity was associated with the resources factor (Table 2), but we retained it as separate for reasons noted earlier. It had no significant correlations with the symptom intensity or session impact measures (Table 3). Nevertheless, we thought that its good interrater reliability (Table 1) and its prior links to psychopathological mechanisms (Abramson et al., 1978; Williams & Dritschel, 1988; Williams et al., 1996) made it worth retaining for future research.

Internality also showed good reliability (Table 1), and it was largely independent of other EARS scales (Table 2). Its correlations with measures of problems and symptom intensity at intake (Table 3) and with clients’ experience of sessions as low in smoothness (i.e., rough and difficult) are consistent with the cognitive theories linking attribution of problems to internal causes with depression and interpersonal difficulties (Abramson et al., 1978; Brewin, 1988). On the other hand, internality’s association with the therapists’ ratings of session depth (power, value; Table 3) could suggest that the therapists considered internal attributions to be therapeutically productive. The failure to find parallel associations with clients’ ratings is consistent with a well-documented lack of correspondence between client and therapist ratings of session depth and value (Dill-Standiford, Stiles, & Rorer, 1988; Stiles, 1980).
The distress scale’s significant, though modest, correlations with three of
the four measures of symptom intensity at intake and the negative correlation of EARS in-session distress with SEQ smoothness as rated from both client and therapist perspectives (Table 3) support the distress subscales’ construct validity. Separately, the reliability of in-session distress was good, whereas the reliability of reported distress was weaker (Table 1), perhaps because the former relied on direct judgments of tape-recorded events, whereas the latter depended on inferences from clients’ descriptions. The finding that most of the high-distress clients were women raised the possibility that gender-related norms tended to suppress men’s expressions of current distress in therapy, at least during the first 20 minutes.

Curiously, the resources scale was modestly but significantly negatively correlated with SE, suggesting that overt indications of these resources in the first session were slightly linked with a poorer self-concept, though not with the more direct symptoms of depression measured by the BDI and SCL-90R (Table 3). Perhaps the greater psychological mindedness implicit in the EARS ratings of richness of understanding and openness to negotiation was reflected in a slight tendency toward self-criticism that lowered SE scores. More straightforwardly, the correlations with therapists’ ratings of session depth and smoothness (Table 3) suggest that therapists responded positively to sessions in which clients revealed high levels of these resources.

Clinical implications. The EARS appears promising for elucidating the initial stages of therapy, and it could have practical implications for choosing therapeutic strategies. The dimensions of specificity, internality, distress, and resources appear salient and discriminative and might usefully be included in therapists’ conceptual repertoire for understanding their clients’ initial presentations.
Eventually, according to the assimilation model, an assessment of the level of assimilation of clients’ presenting problems might usefully guide clients with poorly assimilated problems toward exploratory (e.g., experiential or psychodynamic) treatments and clients with more formulated problems toward prescriptive (e.g., cognitive or behavioral) treatments (Reynolds et al., 1996; Stiles, Barkham, et al., 1992; Stiles et al., in press).

Limitations. The EARS is currently a research technique, not a clinical assessment instrument. Much further work will be required before it can be used for guiding clients to treatments. In addition, our results’ external validity is limited by the demographic and diagnostic restrictions of the sample (professional and managerial workers diagnosed with major depression).

As a final caution, one should not expect EARS scales to be significant linear predictors of treatment outcome (EARS-outcome correlations, not reported, were consistent with this null expectation). In the first place, as described in the Measure Development section, the expected relations of EARS dimensions with degree of assimilation are not linear. For example, distress is expected to be most intense at the stage of vague awareness/emergence and to be less intense at both lower and higher stages of assimilation. More subtly, but perhaps more importantly, therapist responsiveness to the qualities measured by the EARS can be expected to defeat linear outcome predictions. To the extent that therapists respond differentially and appropriately to clients’ specificity, internality, distress, and resources, they neutralize any predictive power that the dimensions might have held in an imaginary world where all clients were treated identically (or randomly) regardless of their presenting problems (Stiles, 1988; Stiles, Honos-Webb, & Surko, 1996).
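A toy calculation may make the nonlinearity caution concrete. In the Python sketch below, distress is assumed, purely for illustration, to peak at the vague awareness/emergence level (scored 2) and to fall off symmetrically on either side. The stage values, the distress function, and the names are hypothetical and are not estimates from the study.

```python
from statistics import mean, pstdev


def pearson_r(x, y):
    """Pearson correlation from the population covariance and SDs."""
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (pstdev(x) * pstdev(y))


# Hypothetical assimilation levels and an inverted-U distress profile.
stages = [0, 1, 2, 3, 4]
distress = [5 - (s - 2) ** 2 for s in stages]  # peaks at level 2

# Linear association between level and distress
linear_r = pearson_r(stages, distress)
# Association with a curved term centered on the peak
curved_r = pearson_r([-(s - 2) ** 2 for s in stages], distress)

print(linear_r, curved_r)
```

Under these assumptions the linear correlation is zero even though distress is completely determined by assimilation level, while the correlation with the curved term is 1. A null linear correlation with outcome therefore would not by itself show that a dimension is unrelated to assimilation.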
REFERENCES

Abramson, L. Y., Seligman, M. E. P., & Teasdale, J. D. (1978). Learned helplessness in humans: Critique and reformulation. Journal of Abnormal Psychology, 87, 49-74.
Alvarez, A. (1992). Live company: Psychoanalytic psychotherapy with autistic, borderline, deprived and abused children. London: Routledge.
Beck, A. T., Ward, C. H., Mendelson, M., Mock, J., & Erbaugh, J. (1961). An inventory for measuring depression. Archives of General Psychiatry, 4, 561-571.
Brewin, C. R. (1988). Cognitive foundations of clinical psychology. London: Lawrence Erlbaum Associates.
Coltart, N. (1987). Diagnosis and assessment for suitability for psychoanalytical psychotherapy. British Journal of Psychotherapy, 4, 127-134.
Cross, D. G., & Warren, C. E. (1984). Environmental factors associated with continuers and terminators in adult out-patient psychotherapy. British Journal of Medical Psychology, 57, 363-369.
Derogatis, L. R. (1983). SCL-90R: Administration, scoring and procedures manual-II for the revised version. Towson, MD: Clinical Psychometric Research Inc.
Dill-Standiford, T. J., Stiles, W. B., & Rorer, L. G. (1988). Counselor-client agreement on session impact. Journal of Counseling Psychology, 35, 47-55.
Field, S. D., Barkham, M., Shapiro, D. A., & Stiles, W. B. (1994). Assessment of assimilation in psychotherapy: A quantitative case study of problematic experiences with a significant other. Journal of Counseling Psychology, 41, 397-406.
Gaines, T., & Stedman, J. M. (1981). Factors associated with dropping out of child and family treatment. American Journal of Family Therapy, 9, 45-51.
Gelso, C. J., & Carter, J. A. (1985). The relationship in counseling and psychotherapy. The Counseling Psychologist, 13, 155-244.
Goldman, G. D., & Milman, D. S. (1978). Psychoanalytic psychotherapy. Reading, MA: Addison-Wesley.
Gomes-Schwartz, B. (1978). Effective ingredients in psychotherapy: Prediction of outcome from process variables. Journal of Consulting and Clinical Psychology, 46, 1023-1035.
Hartley, D. E., & Strupp, H. H. (1983). The therapeutic alliance: Its relationship to outcome in brief psychotherapy. In J. Masling (Ed.), Empirical studies in analytic theories, vol. 1 (pp. 1-37). Hillsdale, NJ: Erlbaum.
Heider, F. (1958). The psychology of interpersonal relations. New York: Wiley.
Hobson, R. F. (1985). Forms of feeling: The heart of psychotherapy. London: Tavistock Press.
Horowitz, L. M., Rosenberg, S. E., Baer, B. A., Ureno, G., & Villasenor, V. S. (1988). Inventory of Interpersonal Problems: Psychometric properties and clinical applications. Journal of Consulting and Clinical Psychology, 56, 885-892.
Horvath, A. O., & Greenberg, L. (1986). The development of the Working Alliance Inventory. In L. Greenberg & W. Pinsof (Eds.), The psychotherapeutic process: A resource handbook (pp. 529-556). New York: Guilford.
Horvath, A. O., & Symonds, B. D. (1991). Relationship between working alliance and outcome in psychotherapy: A meta-analysis. Journal of Counseling Psychology, 38, 139-149.
Karasu, T. B. (1986). Specificity versus nonspecificity. American Journal of Psychiatry, 143, 687-695.
Kokotovic, A. A., & Tracey, T. J. (1990). Working alliance in the early phase of counseling. Journal of Counseling Psychology, 37, 16-21.
Lowman, R. L., DeLange, W. H., Roberts, T. K., & Brady, C. P. (1984). Users and “teasers”: Failure to follow through with initial mental health services inquiries in child and family treatment center. Journal of Community Psychology, 12, 253-262.
Luborsky, L., Crits-Christoph, P., Alexander, L., Margolis, M., & Cohen, M. (1983). Two helping alliance methods for predicting outcomes of psychotherapy: A counting signs vs. a global rating method. Journal of Nervous and Mental Disease, 171, 480-491.
Main, M., & Goldwyn, R. (in press). Adult attachment scoring and classification system. In M. Main (Ed.), Assessing attachment through discourse, drawing, and reunion situations. New York: Cambridge University Press.
Malan, D. H. (1982). Individual psychotherapy and the science of psychodynamics. London: Butterworths.
Morgan, R., Luborsky, L., Crits-Christoph, P., Curtis, H., & Solomon, J. (1982). Predicting the outcome of psychotherapy by the Penn Helping Alliance Rating Method. Archives of General Psychiatry, 39, 397-402.
O’Malley, P. M., & Bachman, J. G. (1979). Self-esteem and education: Sex and cohort comparisons among high school seniors. Journal of Personality and Social Psychology, 37, 1153-1159.
Peterson, C., Semmel, A., von Baeyer, C., Abramson, L. Y., Metalsky, G. I., & Seligman, M. E. P. (1982). The attributional style questionnaire. Cognitive Therapy and Research, 6, 287-300.
Prochaska, J. O., & DiClemente, C. C. (1984). The transtheoretical approach: Crossing the boundaries of therapy. Homewood, IL: Dow Jones-Irwin.
Reynolds, S., Stiles, W. B., Barkham, M., Shapiro, D. A., Hardy, G. E., & Rees, A. (1996). Acceleration of changes in session impact during contrasting time-limited psychotherapies. Journal of Consulting and Clinical Psychology, 64, 577-586.
Ryle, A. (1990). Cognitive-analytic therapy: Active participation in change. A new integration in brief psychotherapy. Chichester, UK: John Wiley & Sons.
Shapiro, D. A., Barkham, M., Rees, A., Hardy, G. E., Reynolds, S., & Startup, M. J. (1994). Effects of treatment duration and severity of depression on the effectiveness of cognitive-behavioral and psychodynamic-interpersonal psychotherapy. Journal of Consulting and Clinical Psychology, 62, 522-534.
Shapiro, D. A., & Firth, J. (1987). Prescriptive vs. Exploratory psychotherapy: Outcome of the Sheffield Psychotherapy Project. British Journal of Psychiatry, 151, 790-799.
Shapiro, R. J., & Budman, S. H. (1973). Defection, termination, and continuation in family and individual therapy. Family Process, 12, 55-67.
Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86, 420-428.
Startup, M. J., & Shapiro, D. A. (1993). Therapist treatment fidelity in Prescriptive vs. Exploratory psychotherapy. British Journal of Clinical Psychology, 32, 443-456.
Stiles, W. B. (1980). Measurement of the impact of psychotherapy sessions. Journal of Consulting and Clinical Psychology, 48, 176-185.
Stiles, W. B. (1988). Psychotherapy process-outcome correlations may be misleading. Psychotherapy, 25, 27-35.
Stiles, W. B., Barkham, M., Shapiro, D. A., & Firth-Cozens, J. (1992). Treatment order and thematic continuity between contrasting psychotherapies: Exploring an implication of the assimilation model. Psychotherapy Research, 2, 112-124.
Stiles, W. B., Elliott, R., Llewelyn, S. P., Firth-Cozens, J. A., Margison, F. R., Shapiro, D. A., & Hardy, G. (1990). Assimilation of problematic experiences by clients in psychotherapy. Psychotherapy, 27, 411-420.
Stiles, W. B., Field, S., Shankland, M. C., & Wright, J. (1993). Early assimilation research scales: Definitions and descriptions. PTRC Memo No. 191. Psychological Therapies Research Centre, University of Leeds, United Kingdom.
Stiles, W. B., Honos-Webb, L., & Surko, M. (1996). Responsiveness in psychotherapy. Manuscript submitted for publication.
Stiles, W. B., Meshot, C. M., Anderson, T. M., & Sloan, W. W., Jr. (1992). Assimilation of problematic experiences: The case of John Jones. Psychotherapy Research, 2, 81-101.
Stiles, W. B., Morrison, L. A., Haw, S. K., Harper, H., Shapiro, D. A., & Firth-Cozens, J. (1991). Longitudinal study of assimilation in exploratory psychotherapy. Psychotherapy, 28, 195-206.
Stiles, W. B., Reynolds, S., Hardy, G. E., Rees, A., Barkham, M., & Shapiro, D. A. (1994). Evaluation and description of psychotherapy sessions by clients using the Session Evaluation Questionnaire and the Session Impacts Scale. Journal of Counseling Psychology, 41, 175-185.
Stiles, W. B., Shankland, M. C., Wright, J., & Field, S. D. (in press). Aptitude-treatment interactions based on clients’ assimilation of their presenting problems. Journal of Consulting and Clinical Psychology.
Ward, A. (1987). Design archetypes from group processes. Design Studies, 8, 157-169.
Watts, F. N. (1992). Applications of current cognitive theories of the emotions to the conceptualization of emotional disorders. British Journal of Clinical Psychology, 31, 153-168.
Williams, J. M. G., & Dritschel, B. (1988). Emotional disturbance and the specificity of autobiographical memory. Cognition and Emotion, 2, 221-234.
Williams, J. M. G., Ellis, N. C., Tyers, C., Healy, H., Rose, G., & Macleod, A. K. (1996). The specificity of autobiographical memory and imageability of the future. Memory and Cognition, 24, 116-125.
Wolberg, L. (1977). The technique of psychotherapy. New York: Grune & Stratton.
Summary (Zusammenfassung)

This article presents the development of the Early Assimilation Research Scale (EARS), which captures clients’ presentation of their problems during the first 20 minutes of their first therapy session. Using an iterative, group-based procedure that alternated between listening to recordings of first therapy sessions and discussion among the listeners, the following dimensions of problem presentation were identified: specificity, internality, in-session distress, reported distress, richness of understanding, and openness of the negotiation. Applying the scale to data from a large comparative therapy study yielded acceptable interrater agreement and initial, promising indications of construct validity. The latter rest on correlations of the EARS subscales with measures of symptom severity and of the impact of the first therapy session.
Summary (Résumé)

We report the development of the Early Assimilation Research Scale (EARS), a measure of clients’ presentation of their problems during the first 20 minutes of their first psychotherapy session. Using an iterative, group-based procedure in which we moved between listening to first-session recordings and discussing our understandings, we identified the following dimensions of clients’ presentations: specificity, internality, in-session distress, reported distress, richness of understanding, and openness of the negotiation. An application of the EARS to data from a large comparative psychotherapy research project yielded acceptable interrater reliability and promising evidence of construct validity, based on correlations of the EARS with measures of symptom intensity and the impact of the first session.
Received November 15, 1995
Revisions Received May 10, 1996
Accepted May 28, 1996