The Ecological Validity of Clinical Tests of Memory ... - Semantic Scholar

16 downloads 0 Views 138KB Size Report
Subtests from the TEA (Robertson et al., 1994) that were used are: Elevator. Counting with Distraction, Visual Elevator, and Elevator Counting with Reversal. In.
Archives of Clinical Neuropsychology, Vol. 15, No. 3, pp. 185–204, 2000 Copyright © 2000 National Academy of Neuropsychology Printed in the USA. All rights reserved 0887-6177/00 $–see front matter

PII S0887-6177(99)00004-9

The Ecological Validity of Clinical Tests of Memory and Attention in Multiple Sclerosis Christopher I. Higginson, Peter A. Arnett, and William D. Voss Washington State University

Ecological validity—the degree to which clinical tests of cognitive functioning predict functional impairment—has recently become an area of interest in neuropsychology. The current study used a sample of 31 cognitively and functionally impaired multiple sclerosis (MS) patients to determine if tests commonly used to assess memory and attentional functioning in MS are ecologically valid. Two methods of improving the ecological validity of cognitive testing were employed. Stepwise multiple regression analyses suggested that tests of memory and attention more analogous to everyday tasks are better predictors of functional impairment in MS than both standard clinical tests of memory and attention, and memory questionnaires completed by the patient or a significant other. Nonetheless, both standard clinical tests and more ecologically valid tests significantly predicted functional impairment. Importantly, they were not significantly correlated with one another, suggesting that the inclusion of both types of tests in evaluating MS patients is warranted. © 2000 National Academy of Neuropsychology. Published by Elsevier Science Ltd Keywords: ecological validity, multiple sclerosis, memory, attention

The ability of clinical tests of cognitive functioning to predict impairment in everyday living, or functional impairment, has recently become an area of interest in neuropsychology (Sbordone & Long, 1996). This characteristic of tests is referred to as ecological validity, and is defined as the “. . . functional and predictive relationship between the client’s performance on a set of neuropsychological tests and the client’s behavior in a This study was part of a doctoral dissertation completed by Christopher I. Higginson at Washington State University under the supervision of Peter A. Arnett, PhD. The research was also presented in part at the 26th Annual Meeting of the International Neuropsychological Society, Honolulu, HI, February, 1998. Special thanks to the many neurologists in the Inland Northwest who contributed their time to verifying MS diagnoses and ratings of course for the MS participants in the project. We would especially like to thank Drs. William Bender, John Wurst, and Jon Tippin, who were instrumental in helping us recruit participants for the project. We would also like to thank Doreen Evans, Diane Wicks, and Lorri Bays for their help with the project in the Spokane area. Additionally, we thank John Randolph, William Porter, Jennifer Geier, Eman Ziada, and Cindy Eason for their help with various aspects of the project. Finally, we express our gratitude to the MS participants and their significant others who generously contributed their time to helping us better understand the nature of multiple sclerosis. Address correspondence to: Christopher Higginson, Department of Neurology, University of Kansas Medical Center, 3901 Rainbow Boulevard, Kansas City, KS 66160-7314; E-mail: [email protected]

185

186

C. I. Higginson, P. A. Arnett, and W. D. Voss

variety of real-world settings (e.g., at home, work, school, community, etc.)” (Ginsberg, Kibby, & Long, 1995). Recent interest in ecological validity is due, in part, to the changing role of the clinical neuropsychologist, which now involves greater emphasis on prediction of functional deficits in social and occupational settings. Surprisingly, little research has addressed the degree to which neuropsychological testing is ecologically valid (Nussbaum, Goreczny, & Haddad, 1995), partially because of a widely held assumption that test scores can predict real-world functioning (Sbordone, 1996). Wilson (1993) states “. . . it remains true that these indexes do not give us sufficient detail to be able to predict what kinds of everyday problems are likely to be faced, nor do they tell us much about the nature and frequency of the problems” (p. 209). Heinrichs (1990) suggests three possible explanations for these limitations (p. 173): (a) the tests commonly used in clinical settings do not access the knowledge specific to home and work activities because they are too abstract and/or general; (b) the tests are sampling an incorrect or incomplete subset of skills on which everyday tasks depend; and (c) insufficient attention is paid to the role of the environment in the expression of disability. Another explanation is that the testing situation, which is designed to optimize performance and eliminate confounding variables (e.g. noise, methods of compensation), is unlike the real world, where multiple aspects of the environment influence performance (Acker, 1990; Franzen & Wilhelm, 1996; Sbordone, 1996). Multiple sclerosis (MS) is a disease for which ecological validity may be especially pertinent. Because of the relatively early onset in the life span and long duration of the course of MS, patients typically experience many years of functional disability. Therefore, any way in which neuropsychologists can enhance their ability to predict functional disability in MS patients is likely to be beneficial for treatment planning and rehabilitation. Also, MS involves multiple neurological, psychological, and neuropsychological symptoms, which are extremely variable both within and between patients. Given the heterogeneity and unpredictability of MS symptomatology, thorough assessment to aid in the prediction of functional impairment in a particular patient is of utmost importance. Disease-related factors that have been associated with functional impairment in MS are: Longer duration of illness, greater level of physical disability, fewer years of education (LaRocca, Kalb, Scheinberg, & Kendall, 1985), greater depression (Voss, Arnett, Stevens, Rao, & Bernardin, 1995), and greater cognitive dysfunction (Amato et al., 1995; Beatty, Blanco, Wilbanks, Paul, & Hames, 1995; Kessler, Cohen, Lauer, & Kausch, 1992; Rao, Leo, Ellington, et al., 1991). Kessler et al. (1992) concluded that previous studies that failed to find an association between cognitive dysfunction and functional disability used inadequate measures of functional disability, such as the Kurtzke Expanded Disability Status Scale (EDSS; Kurtzke, 1983), which are not associated with higher cortical functioning (e.g., Iwasaki, Kinoshita, Ikeda, & Takamiya, 1989). Fatigue is also identified by MS patients as one of the most disruptive symptoms on a daily basis (Antonak & Livneh, 1995; Cella et al., 1996; Murray, 1995), but has yet to be associated with functional impairment empirically. Although a number of studies suggest that cognitive impairment is associated with functional disability in MS, the majority have used global indices of cognitive impairment that combine the results of multiple specific tests. Thus, these studies fail to speak to the ecological validity of specific, individual measures. Two exceptions to this are Kessler et al. (1992), who addressed memory and not other cognitive domains, and Beatty et al. (1995), who looked specifically at occupational status and not functional ability generally. Summarizing the current situation in a recent review of the literature, Fischer et al. (1994) concluded that we do not know for certain the degree to which cognitive dysfunction in MS impacts everyday functioning, and suggested more research is needed.

Ecological Validity of Clinical Tests in MS

187

Two approaches have been described as methods of improving the ecological validity of neuropsychological tests: (a) combining neuropsychological measures with information from behavioral observations, rating scales, and self-report measures; and (b) developing new tests more analogous to everyday behaviors (Franzen & Wilhelm, 1996; Ruisel, 1991; Wilson, 1993). The most commonly used and highly developed examples of the first approach to improve ecological validity are memory questionnaires/rating scales, such as Sunderland, Harris, and Baddeley’s (1983) Everyday Memory Questionnaire, which rate the frequency and/or severity of real-life memory problems. Ruisel (1991), in summarizing the literature, describes memory rating scales as having satisfactory reliability but relatively low validity (i.e., weak correlations with performance measures) when completed by the patient. Larrabee and Crook (1996) agree that self-rating memory scales are not powerful predictors of actual memory performance, but add that newer scales have shown better validity. The lack of association with performance may be due to the fact that the abilities assessed by the performance measures are unlike those used in everyday life. Indeed, Sunderland et al. (1983) cited studies that found greater correlations when the rating scales were compared to more ecologically valid memory tests (Hermann, 1979; Shlechter, Hermann, Sronach, Rubenfeld, & Zenker, 1982). Interestingly, correlations with performance measures significantly increase when the questionnaires are completed by significant others (Broadbent, Cooper, FitzGerald, & Parkes, 1982; Sunderland et al., 1983). A review of the literature revealed only three studies that used memory questionnaires in an MS sample. McIntosh-Michaelis et al. (1991) found that 44% of their MS sample self-reported memory impairment, compared to 54% of relatives who reported observing memory impairment in the patients. Similarly, Beatty and Monson (1991) found that many MS patients underestimated their memory difficulties on a memory questionnaire. Taylor (1990) found an insignificant correlation between cognitive performance and a patient-completed memory questionnaire that became significant when completed by an informant. Two examples of the second approach to increasing ecological validity (i.e., by developing new performance tests) are the Rivermead Behavioral Memory Test (RBMT; Wilson, Cockburn, & Baddeley, 1985) and the Test of Everyday Attention (TEA; Robertson, Ward, Ridgeway, & Nimmo-Smith, 1994). The RBMT provides analogues of everyday memory situations that were chosen on the basis of head-injured patients’ reported and observed memory problems (Sunderland et al., 1983, in Wilson et al., 1985). Performance on the RBMT correlated with therapist/observer memory failures and predicted independence after 5 years, whereas the Wechsler Memory Scale-Revised did not (Wilson, 1993). The TEA includes normed measures of selective attention, sustained attention, attentional switching, and divided attention that utilize “familiar everyday materials” (Robertson et al., 1994, p. 4). The predictive validity of the TEA has yet to be established, but it has been found to have high test-retest reliability and to correlate with other measures of attention (Robertson, Ward, Ridgeway, & Nimmo-Smith, 1996). Although no studies were found that used the TEA with an MS sample, one was found that used the RBMT; MS participants had the most difficulty on subtests involving remembering a person’s name after a delay and remembering to ask for a personal item previously given to the test administrator (McIntosh-Michaelis et al., 1991). The purpose of the current study was to explore empirically whether memory questionnaires and tests of memory and attention developed with ecological validity in mind, such as the RBMT and TEA, are indeed better predictors of functional disability than standard clinical measures of memory and attention commonly used in assessing MS patients. Memory and attention will be addressed because these are the areas of cognitive functioning most frequently impaired in MS, and because tests of these areas have been

188

C. I. Higginson, P. A. Arnett, and W. D. Voss

developed with ecological validity in mind. Unlike previous studies that addressed cognitive dysfunction in general, and its association with functional disability (e.g., Amato et al., 1995; Beatty et al., 1995; Kessler et al., 1992; Rao, Leo, Ellington, et al., 1991), individual measures of cognitive function and their association with functional disability were compared. The study by Kessler et al. (1992) will be extended by using measures of attention as well as memory, and the study of Beatty et al. (1995) will be extended by measuring multiple areas of functional ability, not just occupational status. Also, our study was designed to improve on past research by using a measure of functional disability that addresses activities of daily living (ADLs) one would expect to be influenced by cognitive dysfunction specifically, rather than those that focus on basic motor and sensory functions. It is hypothesized that: (a) a global index of cognitive functioning based on the TEA and RBMT, the measures developed with ecological validity as a paramount goal, will be a better predictor of functional status than a global measure of cognitive functioning based on commonly used clinical measures of memory and attention; (b) summary scores from the RBMT and TEA will each predict functional status better than summary scores from standard clinical memory and attentional tests, respectively; (c) a memory questionnaire completed by a significant other will predict RBMT scores but not standard clinical memory test scores; (d) the patient memory questionnaire will not correlate with any of the memory tests; and (e) the RBMT will predict functional status better than either memory questionnaire.

METHOD Participants Thirty-one cognitively and functionally impaired Caucasian MS patients who met Poser et al. (1983) criteria for definite or probable MS (determined by board-certified neurologists) were recruited from outpatient neurology clinics and a local MS Society in the Northwestern United States. Letters were sent to neurologists’ MS patients and an ad was placed in the local MS Society newsletter informing MS patients of the opportunity to participate in a longitudinal study of the relationship between cognitive, psychological, and physical symptoms of MS. Those who expressed an interest in the study and who met inclusionary criteria were administered a structured telephone interview and scheduled for testing. Participants were excluded if they: (a) had a history of drug or alcohol abuse or nervous system disorder other than MS, (b) had severe motor or visual impairment that might interfere with cognitive testing, or (c) could not be evaluated on an outpatient basis. Furthermore, participants were excluded from data analysis if they did not exhibit some degree of cognitive impairment (defined as performance below the 16th percentile of the sample on more than one of the neuropsychological measures described below) and functional disability (defined as Environmental Status Scale (Holland, Francabandera, & Wiesel-Levison, 1986) score greater than 0). No participant was experiencing an exacerbation of symptoms at the time of testing. All participants were provided with a written neuropsychological screening evaluation and verbal feedback in return for their participation, gave informed consent according to institutional guidelines, and were treated in accordance with the ethical standards of the American Psychological Association. Subject and illness characteristics are summarized in Table 1. Estimated IQ is based on Shipley Institute of Living Scale total score (Shipley, 1940). Board-certified neurologists rated certainty of diagnosis according to Poser et al. (1983) criteria, and diagnostic type according to recently published Neurology criteria (Lublin & Reingold, 1996). Seventeen participants had a relapsing-remitting course, 11 had a secondary-progressive

Ecological Validity of Clinical Tests in MS

189

course, 1 had a primary-progressive course, and 2 had a progressive-relapsing course. Patients were also rated on the EDSS (Kurtzke, 1983). Materials Neuropsychological Tests The California Verbal Learning Test. The California Verbal Learning Test (CVLT; Delis, Kramer, Kaplan, & Ober, 1987) is a measure of verbal learning and memory. A shopping list of 16 items from four categories (spices and herbs, fruits, clothing, tools) is presented five times, each time followed by testing of immediate free recall. After the five trials, a second list is presented followed by immediate free recall testing. Then, free and cued recall of the first list is tested, followed by a 20-minute delay, free and cued recall, and recognition testing. Because recall appears to be more impaired than recognition in MS (Thornton & Raz, 1997), and because Kessler et al. (1992) found total correct for the first five trials (List A Total), free recall after the interference trial (Short-Delay Free Recall), and free recall after the 20-minute delay (Long-Delay Free Recall) to be most predictive of functional status, these indices were analyzed. The 7/24 Spatial Recall Test. The 7/24 Spatial Recall Test (Rao, Hammeke, McQuillen, Khatri, & Lloyd, 1984) is a nonverbal analogue of the CVLT. The examinee is presented a design of seven black circles on a checkerboard with 24 spaces (6 ⫻ 4) for 10 seconds and is required to arrange a set of black checkers in the same pattern as the design shown to them. After five trials are presented, or the occurrence of two consecutively correct trials, a second design is presented and immediate free recall for it tested. Then free recall testing for the first pattern is conducted followed by delayed free recall testing for the first design after a 30-minute delay. The indices analogous to those analyzed in the CVLT were included. The Symbol Digit Modalities Test. The Symbol Digit Modalities Test (Smith, 1968) is similar to Wechsler’s Digit Symbol subtest, except that the participant responds with numbers instead of symbols so that a verbal response is possible (thus eliminating the significant motor-writing component present when subjects are asked to write responses). The subject is provided with a key consisting of nine simple symbols, each of which is paired with a single digit ranging from one to nine. Below the key is a random list of the symbols without the numbers. The subject is instructed to respond verbally to each of the symbols, in order, with the correct number paired with it in the key. The test is timed TABLE 1 Sample Demographic and Clinical Characteristics (N = 31) M (SD)

Age (years) Education (years) Estimated IQ EDSS Disease duration (years since diagnosis) BDI Fatigue Impact Scale Environmental Status Scale

46.7 (7.6) 15.0 (2.4) 107.5 (10.7) 5.0 (1.5) 8.7 (6.0) 13.4 (8.4) 76.0 (36.0) 9.4 (5.5)

Note. Estimated IQ based on Shipley Institute of Living Scale total score. EDSS = Expanded Disability Status Scale; BDI = Beck Depression Inventory.

190

C. I. Higginson, P. A. Arnett, and W. D. Voss

and the key is kept in sight. Total correct responses in 90 seconds was the dependent variable. Paced Auditory Serial Addition Test. In the version of the Paced Auditory Serial Addition Test (PASAT; Gronwall, 1977) that we used (Rao, Leo, Haughton, St. Aubin-Faubert, & Bernardin, 1989), two strings of single-digit numbers were presented and participants were required to add each successive digit to the one presented immediately prior to it. Digits were presented every 3 seconds with the first string (easy) and every 2 seconds with the second string (hard). The task was administered using a sound-card– equipped computer, whereby the stimuli were presented and the task was described via the ForThought computer program (PASAT: User and Technical Manual, 1994) using digit strings from Rao et al.’s (1989) study. The dependent variable for each string was total number of correct additions out of 60 possible. Because of the high correlation between scores on the two presentation rates (r ⫽ .89, p ⬍ .0001), they were combined into a single index by summing the raw scores for each string. RBMT. Subtests from the RBMT (Wilson et al., 1985) that were included in the analyses were First and Second Name (Name), Belonging, Appointment, Story-Immediate and Story-Delayed. Name involves remembering the name of a person presented in a photograph at a short delay of approximately 20 minutes. Belonging is a measure of prospective memory where participants must remember to ask for an object of theirs that was taken from them earlier. They must also state the location of the object. Appointment is also a prospective memory test; the participant must ask a certain question (“When are you going to call me about the test results?”) when a buzzer set for 20 minutes rings. Story involves memory for a brief prose passage tested in a free recall format immediately after presentation (Story-Immediate) and at a 30-minute delay (Story-Delayed). Three subtests of the RBMT involving orientation and short-term memory were excluded from data analysis based on early findings of lack of variability due to ceiling effects. Additionally, two subtests were excluded because they required participants to walk around the testing room, a difficult task for nonambulatory patients. Conceivably, this task could have been accomplished in nonambulatory patients based on the observation that half of the patients in the original RBMT validation study were wheelchairbound; however, we decided not to include it anyway because of the difficulty the task would have entailed to the patients and our fear that requiring them to engage in such a physically difficult task would have undermined rapport for the rest of the extensive test battery administered. A composite score (RBMT-TOTAL) was calculated as the sum of the “Standardized Profile Score” (SPS) for each subtest included. The SPS was developed by the RBMT’s authors to equate the subtests, allow for comparisons between subtests, and provide a total score. It ranges from 0 to 2. TEA. Subtests from the TEA (Robertson et al., 1994) that were used are: Elevator Counting with Distraction, Visual Elevator, and Elevator Counting with Reversal. In the Elevator Counting with Distraction subtest, participants are asked to pretend that they are in an elevator that has an inoperative floor indicator. By counting a series of taperecorded tones that have a low pitch, and ignoring higher-pitched tones, participants can determine the floor on which they end up. It was designed as a test of auditory selective attention. Visual Elevator is a timed test involving participants determining the floor on which a visually presented elevator is located. The elevator starts on the first floor and moves one floor each time it is presented. Intermittently presented arrows indicate a change in direction. All floors and arrows are presented on a single page for each of

Ecological Validity of Clinical Tests in MS

191

10 trials. It is thought to be a measure of attentional switching, or cognitive flexibility. Visual Elevator provides two scores: a raw total accuracy score and a timing score, which is equivalent to time per switch for correct items. Elevator Counting with Reversal is an auditory analogue of Visual Elevator, which is audiotape-presented at a fixed speed, and thus is not timed. It loads on an auditory-verbal working memory factor. Three subtests of the TEA requiring fine visual acuity were eliminated due to MS patients’ frequent difficulty with vision. Another two, Elevator Counting and Lottery, which involve simple sustained attention, were excluded based on early findings of lack of variability due to ceiling effects. A composite score (TEA-TOTAL) of the four indices included was calculated by summing the age-scaled scores (range, ⫽ 1–19; M ⫽ 10; SD ⫽ 3) for each.

Questionnaires Memory Rating Scale. The Memory Rating Scale (Rao et al., 1984) is an adaptation by Rao of a measure developed by Sunderland et al. (1983), the Everyday Memory Questionnaire. The measure has two parts and was administered to the participant and, where possible, a significant other familiar with the participant’s daily behavior and memory. Part One asks about the patient’s ability to remember day-to-day information, such as appointments, names, bus schedules, etc. Part Two asks about the frequency of certain memory problems, such as forgetting directions and misplacing things. Both parts utilize a 6-point Likert-type scale. Of the 53 items on the Memory Rating Scale, 25 are unchanged from the Everyday memory questionnaire. The Memory Rating Scale was scored so that lower scores are indicative of memory impairment. Beck Depression Inventory. The Beck Depression Inventory (BDI; Beck & Steer, 1987) is a frequently used measure of depression that assesses cognitive/affective and vegetative/somatic symptoms. It includes 21 items, each consisting of four statements reflecting a certain symptom. Patients pick one of the statements from each item that best describes the way that they have been feeling the past week, including the present day. The statements are arranged within the items in order of increasing severity and are assigned point values between 0 and 3, accordingly. The BDI was included as a measure of depression because of the high prevalence of depression in MS (Beason-Hazen, Lynn, Rammohan, & Bornstein, 1995; Gilchrist & Creed, 1994) and because depression has been associated with functional disability (Voss et al., 1995). Fatigue Impact Scale. The Fatigue Impact Scale (Fisk, Pontefract, Ritvo, Archibald, & Murray, 1994) is a 40-item self-report questionnaire assessing the fatigue characteristic of MS. Participants rate the degree to which the items pertain to themselves according to a 5-point Likert-type scale ranging from 0 (no problem) to 4 (extreme problem). Examples of items include “Because of my fatigue I feel less alert” and “Because of my fatigue I feel unable to meet the demands that people place on me.” A measure of fatigue was included because MS patients report fatigue as one of their most disruptive symptoms on a daily basis (Antonak & Livneh, 1995; Cella et al., 1996; Murray, 1995). Other Measures EDSS. The EDSS (Kurtzke, 1983) is the standard measure used to evaluate neurological disability in MS. It was developed to quantify the neurologic exam and uses a scale

192

C. I. Higginson, P. A. Arnett, and W. D. Voss

ranging from 0, indicating a normal neurological exam, to 10, indicating death due to MS. A score of 5.0 is described as “Ambulatory without aid or rest for about 200 meters; disability severe enough to impair full daily activities (e.g., to work a full day without special provisions).” Scores for patients in our study were based on ratings of degree of difficulty (on a 4-point Likert-type scale) experienced in eight neurological categories (limb weakness; tremor and balance; double vision, slurred speech or difficulty swallowing; numbness; elimination; blurred vision; memory, calculation, and reasoning; and muscle stiffness and jerking), ability to walk, and day-to-day functioning. Scale increments are half-points. Ambulation and motor function weigh heavily in the total score. Environmental Status Scale. The Environmental Status Scale (Holland et al., 1986) is a broad measure of functional disability based on an interview with the patient (and, when available, a significant other). It uses a 6-point (0–5) Likert-type scale to assess MS patients’ difficulty performing everyday tasks. Higher scores reflect greater impairment. The scale is composed of seven items: Employment status, financial/economic status, modifications to residence, personal assistance required, ability to use modes of transportation, community assistance required, and social activity. The Environmental Status Scale was chosen because it samples daily activities most likely to be impacted by the cognitive as well as the physical disability of MS. Global indices of cognitive functioning were computed based on all neuropsychological measures included (COGIMP), the measures developed with ecological validity in mind (EV-COG), and the standard clinical measures (SC-COG). Scores on the global indices reflect the number of individual neuropsychological measures “failed,” where failure is defined as performance below the 16th percentile of the sample (about 1 SD below the mean) on the measure. The method used to calculate these indices was much like the procedure followed by Amato et al. (1995), Beatty et al. (1995), and Rao, Leo, Bernardin, and Unverzagt (1991), except in these other studies failure was defined as performance below the fifth percentile of a control group.

Procedure After participants provided informed consent, they were administered a structured psychosocial interview followed by the neuropsychological test battery. Interviewing and testing were conducted by graduate students extensively trained in interviewing and test administration. Approximately half of the participants were administered the RBMT, Symbol Digit Modalities Test, and 7/24 Spatial Recall Test in the morning and CVLT, PASAT, and TEA in the afternoon, whereas the other half of the sample had this order reversed. In this way we attempted to address the possible confounding of performance with order of administration, given the potential role of fatigue in a lengthy battery, as well as any idiosyncratic test order effects. Participants took breaks as necessary to reduce fatigue. To reduce the length of the testing session, the Memory Rating Scale was mailed to participants and their significant other 1 week prior to testing for home completion the day before testing. The data analytic strategy involved a regression approach, where neuropsychological indices, memory rating, depression, fatigue, physical disability, course, and disease duration were used to predict functional status in a series of stepwise regression analyses. The predictive ability of the various measures was compared by performing R2-change analyses, which indicate the variation in functional status accounted for by each predictor, independent of the other predictors. To test for the presence of multicollinearity,

Ecological Validity of Clinical Tests in MS

193

the tolerance of independent/predictor variables was evaluated. An alpha level of .05 was used for all a priori predictions.

RESULTS Four participants could not identify a significant other familiar enough with their dayto-day memory to complete the Memory Rating Scale. Five participants were unable to complete the PASAT, one participant was unable to complete the Elevator Counting with Distraction subtest of the TEA, two participants were unable to complete the Elevator Counting with Reversal subtest of the TEA, and one was unable to complete the Visual Elevator subtest of the TEA. Following the procedure of Dikmen, Donovan, Loberg, Machamer, and Temkin (1993), the participants unable to complete tests were assigned a score one point lower than the lowest score of the tested participants. If the lowest score of the tested participants was zero, an equivalent score was assigned. Table 2 provides the means and standard deviations for the sample on the neuropsychological indices and memory questionnaires. Mean performance on the cognitive tests was genTABLE 2 Descriptive Statistics on Cognitive Indices and Memory Rating Scales

CLVT List A total Short-Delay Free Recall Long-Delay Free Recall 7/24 Spatial Recall Test: Pattern A Total Short-Delay Free Recall Long-Delay Free Recall RBMT: Total Name Belonging Appointment Story - Immediate Story - Delayed PASAT Symbol Digit Modalities Test TEA: Total Elevator Counting with Distraction Visual Elevator Elevator Counting with Reversal Time per Switch Memory Rating Scale - Patient Memory Rating Scale - Significant Other COGIMP SC-COG EV-COG

M

SD

z

52.1 10.7 11.0

10.3 3.6 3.5

⫺0.90a 0a ⫺1.0a

28.9 5.1 5.0

6.2 1.9 2.1

⫺0.63b – ⫺0.32b

8.3 3.3 3.3 1.5 9.3 8.1 75.1 46.9

1.5 1.1 0.8 0.8 3.5 3.5 29.1 11.3

– ⫺0.43a ⫺0.98a ⫺0.59a ⫺0.12a ⫺0.12a ⫺1.46b ⫺1.16b

34.7 7.2 7.1 4.4 5.1 163.7 174.7 3.7 1.7 2.0

8.9 3.1 2.6 3.0 1.7 40.5 43.6 2.5 1.7 1.6

– ⫺0.35a ⫺0.45a ⫺0.16a ⫺0.82a – – – – –

CVLT ⫽ California Verbal Learning Test; RBMT ⫽ Rivermead Behavioural Memory Test; PASAT ⫽ Paced Auditory Serial Addition Test; TEA ⫽ Test of Everyday Attention; COGIMP ⫽ number of scores on all cognitive indices below the 16th percentile; SC-COG ⫽ number of scores on standard clinical indices below the 16th percentile; EV-COG ⫽ number of scores on ecological indices below the 16th percentile. aWhere available, z-scores are based on published norms. bz-scores are based on 99 normal volunteers from Rao, Leo, Bernardin, and Unverzagt (1991) (mean age and education ⫽ 46 and 13 years, respectively).

194

C. I. Higginson, P. A. Arnett, and W. D. Voss

erally in the low average and average ranges, based on available normative data. Thus, defining impairment on a test as performance below the 16th percentile of this sample, where the sample’s performance is slightly below that of a normative control group, is similar to defining impairment based on a lower percentile of a control group, as done previously (e.g., Amato et al., 1995; Beatty et al., 1995; Rao, Leo, Bernardin, et al., 1991). Examination of the tolerance of predictor variables suggested multicollinearity was not significantly present in any of the regression analyses. Table 3 illustrates intercorrelations between functional status and disease-related variables. As expected, based on previous research, the Environmental Status Scale was associated with COGIMP, EDSS, and the BDI. The correlations between EDSS and Environmental Status Scale scores, and cognitive functioning and Environmental Status Scale scores were similar in magnitude. Counter to expectations, the Environmental Status Scale did not correlate with years of education, disease duration, or the Fatigue Impact Scale. Interestingly, the Fatigue Impact Scale was highly associated with BDI scores (r ⫽ .66, p ⬍ .0001). The correlations between the Environmental Status Scale and the memory and attention indices are listed in Table 4. While three of eight indices (38%) from the standard clinical tests of memory and attention correlated with the Environmental Status Scale (CVLT-Long-Delay Free Recall, PASAT, Symbol Digit Modalities Test), six of nine subtests (67%; excluding composite/summary scores) from the tests developed with ecological validity in mind correlated with Environmental Status Scale score (RBMTName, Belonging, Story-Delayed; TEA- Elevator Counting with Distraction, Elevator Counting with Reversal, Visual Elevator-Time per Switch). Elevator Counting with Distraction from the TEA was the index with the strongest correlation with the Environmental Status Scale (r ⫽ ⫺.58, p ⬍ .0001). Also, the composite score from the TEA correlated with the Environmental Status Scale. Two other results are noteworthy: (a) five of six attention indices (83%; excluding the sum of age-scaled scores for the TEA subtests) correlated with the Environmental Status Scale, compared with 4 of 11 memory indices (36%; excluding the sum of Standardized Profile Scores for the RBMT subtests), and (b) although global indices based on standard clinical tests and the tests developed

TABLE 3 Intercorrelations Between Functional Status and Disease-Related Variables Environmental Status Scale

Age (years) Education (years) Estimated IQ Duration (years since diagnosis) EDSS BDI Fatigue Impact Scale COGIMP

0.11 ⫺0.26 ⫺0.28 0.25 0.57**** 0.41* 0.22 0.53***

Note. Estimated IQ is based on Shipley Institute of Living Scale total score. EDSS ⫽ Expanded Disability Status Scale; BDI ⫽ Beck Depression Inventory; COGIMP ⫽ number of scores on cognitive indices below the 16th percentile. *p < .05. ***p < .001. ****p < .0001.

Ecological Validity of Clinical Tests in MS

195

TABLE 4 Intercorrelations of Functional Status with Memory and Attention Indices Environmental Status Scale

SC-COG EV-COG CVLT List A Total Short-Delay Free Recall Long-Delay Free Recall 7/24 Spatial Recall Test: Pattern A Total Short-Delay Free Recall Long-Delay Free Recall PASAT Symbol Digit Modalities Test RBMT Total Name Belonging Appointment Story-Immediate Story-Delayed TEA Total Elevator Counting with Distraction Visual Elevator Elevator Counting with Reversal Time per Switch

0.40* 0.42** ⫺0.24 ⫺0.14 ⫺0.31* ⫺0.16 0.10 0.10 ⫺0.30* ⫺0.41* ⫺0.27 ⫺0.35* ⫺0.35* 0.15 ⫺0.18 ⫺0.30* ⫺0.47** ⫺0.58**** 0.13 ⫺0.38* 0.36*

SC-COG ⫽ number of scores on standard clinical indices below the 16th percentile; EV-COG ⫽ number of scores on ecological indices below the 16th percentile; CVLT ⫽ California Verbal Learning Test; PASAT ⫽ Paced Auditory Serial Addition Test; RBMT ⫽ Rivermead Behavioural Memory Test; TEA ⫽ Test of Everyday Attention, *p < .05. **p < .01. ****p < .0001.

with ecological validity in mind both correlated with the Environmental Status Scale, they did not correlate with each other (r ⫽ .18, p ⫽ ns). Results of a stepwise multiple regression comparing the ability of EV-COG, SCCOG and disease-related variables (EDSS, BDI, Fatigue Impact Scale, course, and duration) in predicting Environmental Status Scale score are listed in Table 5. EDSS was the best predictor of Environmental Status Scale score, predicting 33% of its variance. The second and only other significant predictor of Environmental Status Scale score in this analysis was EV-COG, predicting an additional 10% of the variance in Environmental Status Scale score. None of the indices from the standard clinical memory tests or the summary score from the RBMT, RBMT-total, significantly predicted Environmental Status Scale score in a stepwise regression comparing their ability, along with disease-related variables, in predicting Environmental Status Scale score. In an attempt to eliminate the possibility that the summary score from the RBMT was masking significant predictive relationships between individual subtests and functional status, a second regression analysis was performed using the RBMT subtest scores as independent variables. This approach seemed justified in light of the significant zero-order correlations between some of the RBMT subtests and the Environmental Status Scale (see Table 4). The results of this analysis are displayed in Table 6. As in the regression described in Table 5, EDSS was the best

196

C. I. Higginson, P. A. Arnett, and W. D. Voss TABLE 5 Summary of Stepwise Regression Analysis for Global Cognitive Indices and Disease-Related Variables Predicting Functional Status Variable

Step 1a EDSS Step 2b EV-COG

B

SE B



2.09

.56

.57**

1.13

.52

.33*

EDSS ⫽ Expanded Disability Status Scale; EV-COG ⫽ number of scores on ecological indices below the 16th percentile. aR2 ⫽ .33, p < .01. bdelta R2 ⫽ .10, p < .05. *p < .05. **p < .01.

predictor of Environmental Status Scale score. The second significant predictor of Environmental Status Scale score was the Story-Delayed subtest from the RBMT. StoryDelayed predicted an additional 11% of the variance in Environmental Status Scale score after that accounted for by the EDSS. On a third and final step, BDI predicted an additional 8% of the remaining variance in Environmental Status Scale score. Table 7 summarizes the results of a similar stepwise regression analysis that compared tests of attention rather than tests of memory: the sum of age-scaled scores the TEA subtests was compared to the standard clinical tests of attention, PASAT and Symbol Digit Modalities Test, and disease-related variables in their ability to predict Environmental Status Scale score. The EDSS was the best predictor of Environmental Status Scale score and was followed by the second and only other significant predictor, the sum of age-scaled scores for the TEA subtests. This variable predicted an additional 9% of the variance in the Environmental Status Scale. In an attempt to compare the two methods of improving the ecological validity of neuropsychological testing, a stepwise regression analysis was performed with the sum of Standardized Profile Scores for the RBMT subtests, patient-completed Memory RatTABLE 6 Summary of Stepwise Regression Analysis for Memory Indices and Disease-Related Variables Predicting Functional Status Variable

Step 1a EDSS Step2b RBMT Story-Delayed Step 3c BDI



B

SE B

2.09

.56

⫺.05

.02

⫺.34*

.19

.09

.29*

.57**

EDSS ⫽ Expanded Disability Status Scale; RBMT ⫽ Rivermead Behavioural Memory Test; BDI ⫽ Beck Depression Inventory. aR2 ⫽ .33, p < .01. bdelta R2 ⫽ .11, p < .05. cdelta R2 ⫽ .08, p < .05. *p < .05. **p < .01.

Ecological Validity of Clinical Tests in MS

197

TABLE 7 Summary of Stepwise Regression Analysis for Attention Indices and Disease-Related Variables Predicting Functional Status Variable

Step 1a EDSS Step 2b TEA-TOTAL

B

SE B

2.09

.56

⫺.20

.09



.57** ⫺.32*

EDSS ⫽ Expanded Disability Status Scale; TEA ⫽ Test of Everyday Attention. aR2 ⫽ .33, p < .01. bdelta R2 ⫽ .09, p < .05. *p < .05. **p < .01.

ing Scale, significant other completed Memory Rating Scale, and disease-related variables as predictors of Environmental Status Scale score. Neither of the memory questionnaires nor the sum of Standardized Profile Scores for the RBMT subtests significantly predicted Environmental Status Scale score. As expected, the patient completed Memory Rating Scale did not correlate with any of the performance measures of memory from the RBMT, CVLT, and 7/24. Unexpectedly, the significant other completed Memory Rating Scale correlated with Short-Delay Free Recall and Long-Delay Free Recall from the 7/24 Spatial Recall Test (r ⫽ ⫺.56, p ⬍ .001 and r ⫽ ⫺.57, p ⬍ .001, respectively). Note that the direction of these relationships indicate that better performance is associated with lower memory ratings. Consistent with predictions, the significant other completed Memory Rating Scale correlated with Belonging (r ⫽ .37, p ⬍ .05), Story-Immediate (r ⫽ .34, p ⬍ .05), and Story-Delayed (r ⫽ .36, p ⬍ .05) from the RBMT, but did not correlate with any of the CVLT indices included in analyses; the correlation with the sum of Standardized Profile Scores for RBMT subtests was marginally significant (r ⫽ .32, p ⫽ .051). Finally, the patient-completed and significant othercompleted memory questionnaires did not correlate with one another.

DISCUSSION The present study was designed to examine the relative efficacy of tests specifically designed with ecological validity in mind versus standard clinical tests of memory and attention in predicting functional disability in MS patients. Our results indicate that the primary hypotheses were supported; tests of memory and attention developed with ecological validity in mind were better predictors of functional disability than both memory questionnaires and specific neuropsychological tests of memory and attention commonly used in assessing the MS patient. We predicted with our first hypothesis that a global index of cognitive functioning based on the measures developed with ecological validity in mind (i.e., the RBMT and TEA) would be a better predictor of functional status than a similar measure based on commonly used tests of memory and attention (i.e., the CVLT, 7/24 Spatial Recall Test, PASAT, and Symbol Digit Modalities Test). We took this approach in an attempt to provide a general impression of whether the newer measures, when used in combination (much like they would be used in a neuropsychological test battery), predicted patients’ functional status better than those measures currently in use. The initial stepwise multiple regression (see Table 5) supported this hy-

198

C. I. Higginson, P. A. Arnett, and W. D. Voss

pothesis: The composite of indices from the two newer measures, the RBMT and TEA, was a better predictor of functional status, predicting 10% of the variance remaining after that predicted by neurological disability was removed. Although the correlation between the global measure of cognitive functioning based on the standard clinical indices and functional status was nearly as strong as that between the global measure based on the “ecologically valid” indices, two points support the inclusion of more “ecologically valid” tests in the neuropsychological evaluation of MS patients: (a) the greater percentage of indices from the “ecologically valid” tests correlating with functional status (67% vs. 38%), and (b) the failure of the global indices to correlate significantly. This latter finding suggests that the tests developed with ecological validity in mind are measuring something different than what is measured by the standard clinical tests. It also minimizes the possibility that the composite score based on the standard clinical tests did not correlate with functional status because the variance it predicts was removed at the second step by the composite based on the tests developed with ecological validity in mind. We conducted additional stepwise regression analyses to address the second hypothesis that summary scores from the RBMT and TEA will each predict functional status better than summary scores from their standard clinical memory and attention test counterparts. In other words, these analyses involved the questions: (a) is the new, supposedly more ecologically valid, memory test a better predictor of MS patients’ functional status than those memory measures commonly in use; and (b) is the new, supposedly more ecologically valid, test of attention a better predictor of MS patients’ functional status than those attention measures commonly in use? The second stepwise regression addressed the memory tests involved in the hypothesis and compared total score from RBMT with the chosen indices from the standard clinical memory measures (CVLT and 7/24) in their ability to predict functional status. Unexpectedly, none of the memory indices in the initial stepwise regression significantly predicted functional status. Because only those indices from the standard clinical memory tests that proved to have the highest correlation with functional status were chosen as predictors, and because the use of a summary score for the RBMT may have concealed significant predictors among the subtests that comprise the RBMT, another stepwise regression was performed using the subtests from the RBMT as predictors, rather than the composite score. In this analysis (see Table 6), the only significant cognitive predictor was the Story-Delayed subtest from the RBMT. Long-term memory (LTM), assessed via delayed recall tasks, is one of the most frequently cited cognitive deficits in MS (Fischer et al., 1994; Rao, 1986). Most would agree that having to recall ideas from a story after a brief delay is a memory task commonly performed by people in their everyday lives. Intuitively, it seems that this task would be a better predictor of daily functioning than having to recall, without any cues, a grocery list made by someone else, or having to recall a pattern of seven checkers on a grid. This finding is similar to that of Kessler et al. (1992) who found that Logical Memory-Delayed from the WMS-R was significantly correlated with functional disability in an MS sample; Logical MemoryDelayed is essentially the same task as Story-Delayed, but with two different stories. Thus, a standard clinical memory test already widely used, the WMS-R, has a subtest analogous to the Story-Delayed subtest we found to be highly related to MS patients’ functional status. The fourth regression analysis addressed the attention tests involved in the second hypothesis by comparing total score from the TEA with the chosen indices from the standard clinical attention measures (Symbol Digit Modalities Test and PASAT) in their ability to predict functional status. Consistent with this hypothesis, total score from

Ecological Validity of Clinical Tests in MS

199

the TEA entered the regression model on the second step behind physical disability (see Table 7), suggesting that it is a better predictor of MS patients’ everyday functioning than the standard clinical attention measures composite. We predicted with our third hypothesis that a memory questionnaire completed by a significant other would predict RBMT scores, but not standard clinical memory test scores. This hypothesis was supported in that significant-other ratings correlated with the majority of subtests from the RBMT. This outcome is consistent with the results of Hermann (1979), and Shlechter et al. (1982), who found that significant others’ ratings of patients’ memory are better correlated with performance measures more analogous to everyday behaviors. Counter to our hypothesis, significant others’ memory ratings also correlated with two indices from the 7/24 Spatial Recall. Surprisingly, better memory ratings were related to poorer actual performance. Although it is possible that these correlations are spurious, they are not completely unintelligible, given that the 7/24 Spatial Recall Test is a nonverbal memory test and the questionnaire mostly assesses verbal memory. Nonetheless, the significance of this finding is unclear. Sbordone (1996) states that the best estimate of ecological validity is the degree of consistency between test scores and other sources of data. The strong association between the RBMT subtests and both functional status and significant others’ memory ratings (compared to those involving the CVLT and 7/24) provides converging evidence of the ecological validity of the RBMT. The indirect relationship between the 7/24 Spatial Recall Test indices and significant other ratings are not inconsistent with this thesis given the counterintuitive nature of the relationships. As predicted in the fourth hypothesis and consistent with previous research (Beatty & Monson, 1991; Broadbent et al., 1982; McIntosh-Michaelis et al., 1991; Ruisel, 1991; Sunderland et al., 1983; Taylor, 1990), there was no association between the patientcompleted memory questionnaire and any of the performance memory tests. Thus, the significant others’ questionnaire ratings of the patients’ memory were more highly related to patient verbal memory test performance than the patients’ questionnaire ratings of their own memory. This result, along with the lack of association between the patientcompleted and significant other–completed memory questionnaire, suggests that the MS patients in this sample may have poor insight into their memory performance, similar to the MS participants of Beatty and Monson (1991), McIntosh-Michaelis et al. (1991), and Taylor (1990). These results suggest that MS patients themselves may not provide useful information in terms of their memory impairment and its impact on their everyday lives. It remains possible though, that the problem lies with the questionnaire rather than the responses provided: The questionnaire may poorly assess those aspects of everyday memory that impact on MS patients’ functioning. The last regression analysis addressed the last hypothesis, that the RBMT would predict functional status better than the patient- or significant other-completed memory questionnaire. Neither of the memory questionnaires, significant other-completed and patient-completed, nor the total score from the RBMT significantly predicted functional status in the stepwise multiple regression analysis. This may suggest that not even significant others familiar with patients’ everyday functioning can provide valid estimates of the degree to which MS patients’ memory difficulties affect their ability to perform daily activities. Alternatively, it may suggest that the items included on the questionnaire are not part of MS patients’ everyday lives (i.e., social validity; Franzen & Wilhelm, 1996). Regardless, the ability of three subtests from the RBMT to correlate with functional status indicates that the performance measure developed to be more ecologically valid used here is indeed more ecologically valid than the memory rating scale, supporting the contention of our hypothesis.

200

C. I. Higginson, P. A. Arnett, and W. D. Voss

Some of our other findings are noteworthy. Our correlations between the Name and Belonging RBMT subtests and functional status were not unexpected, given that McIntosh-Michaelis et al. (1991) found them to be the most frequently impaired subtests from the RBMT in their MS sample. Story-Delayed entered the predictive model but Name and Belonging did not, even though the strength of the relationship between them and functional status was similar to that between Story-Delayed and functional status. This result is likely caused by shared variance between the subtests and physical disability. In other words, some of the variance in functional status predicted by Name and Belonging was most likely removed by EDSS on the first step, allowing it to predict less of the remaining variance than Story-Delayed. These results imply that remembering a person’s name and having to remember to ask for a personal object after a verbal cue are tasks that are more ecologically valid than having to recall a story immediately after it was told (Story-Immediate) or remembering to ask a specific question when a bell rings (Appointment). Belonging and Appointment are very similar tasks. Belonging may have significantly correlated with functional status, while Appointment did not, because it is more difficult in that it involves more things to remember (both place and item hidden), has a greater range of potential scores, involves a more salient object (a personal belonging), or has a better retrieval cue, involving a verbal statement by the test administrator rather than a bell ringing. Unlike Kessler et al. (1992), we failed to find significant correlations between List A Total and Short-Delay Free Recall from the CVLT and functional status and visual memory indices with functional status. A potential explanation for our null visual memory findings is the fact that Kessler et al. (1992) used Visual Reproduction from the WMS-R rather than the 7/24 Spatial Recall Test. The 7/24 Spatial Recall Test appears to be limited by ceiling effects and restricted range of potential scores in our current sample of MS patients, something that most likely reduced its ability to correlate significantly with functional status. Alternatively, Visual Reproduction, which involves drawing a geometric design that was presented for 10 seconds immediately and after a 30-minute delay, may be more ecologically valid than the 7/24 Spatial Recall. There are three potential explanations for Kessler et al.’s (1992) significant correlations between List A Total and ShortDelay Free Recall from the CVLT and functional status (which were not obtained here): (a) the greater power afforded by their sample size, (b) the use of a different measure of functional status, the Activities of Daily Living Scale, and (c) the characteristics of their sample, which was less educated, had a longer disease duration, and a much greater proportion of patients with a chronic-progressive course than our sample. Interestingly, all the memory measures that significantly correlated with functional status involved free recall after a delay of at least 20 minutes, and none of the immediate memory measures correlated with functional status. This outcome is not surprising for two reasons: (a) Impaired delayed recall is one of the most frequently cited cognitive difficulties in MS (Rao, 1986), and (b) one could argue that having to recall information after a delay is a relatively frequent event in an individual’s daily life, while having to do so immediately after the information was presented, with limited retrieval cues, is not. Some subtests of the TEA appear to be more highly correlated with functional status than others. This is not surprising given that the individual subtests are thought to tax specific types of attention. Our results agree with those of Thornton and Raz (1997) who suggest complex attention (working memory) is more impaired than simple attention (span memory) in MS, and the prediction of Kerns and Mateer (1996) that simple attention measures will correlate minimally with real-life functioning. In fact, two measures of simple, sustained attention from the TEA were not included due to ceiling effects and resultant restricted range of scores. A task loading on a factor involving selective atten-

Ecological Validity of Clinical Tests in MS

201

tion, Elevator Counting with Distraction, exhibited the highest correlation with functional status. Although it could be argued that the Elevator Counting with Distraction subtest has a working memory component, it certainly is not as large as that of Elevator Counting with Reversal, which also correlated with functional status, but not as strongly as Elevator Counting with Distraction. This outcome suggests that this sample of MS patients’ ability to ignore distractions may be an important factor in successfully carrying out their daily activities. It is also consistent with the contentions of Kerns and Mateer (1996), who have suggested that selective attention is an important aspect of everyday cognitive function, and that the highly controlled nature of the testing situation, which essentially eliminates distractors, is a potential cause of the poor ability of neuropsychological testing to predict impairment in everyday life. The measure on which participants had the lowest mean score compared with the TEA’s standardization sample, was the timing score from the Visual Elevator subtest, Time per Switch. Time per Switch also significantly correlated with functional status. This may suggest that MS patients’ attention-related difficulties in their daily lives are most pronounced when performing tasks that require speeded processing of information. Interestingly, processing speed has also been identified as an important variable in everyday cognition (Sbordone & Long, 1996). As anticipated, based on previous research, functional impairment was associated with greater neurological disability and depressive symptomatology, as well as cognitive impairment. In fact, neurological disability was the best predictor of functional impairment. Because of the significant motor and sensory impairment in MS, this is not surprising. Also, similar outcomes are typical in other neurological disorders, such as stroke and closed head injury where indices of basic neurological status (e.g., level of consciousness, and continence) are good predictors of outcome. Surprisingly, functional impairment was not associated with more years of education or longer disease duration, as has been reported previously (LaRocca et al., 1985; Rao, Leo, Ellington, et al., 1991). The lack of association with education in our study may have been due to a restricted range among participants in regard to years of education; on average, our sample was highly educated. Disease duration may not have been related to functional status in our sample because of the preponderance of participants with a relapsing-remitting course; although the functional status of MS patients with this course type tends to decline over time, the decline is more variable than the decline typical of other courses. A number of limitations in the present study suggest possibilities for future research. First, although the neuropsychological battery used in the current study focused on the two areas of cognitive functioning most frequently impaired in MS, memory and attention, it did not include specific measures of executive function, conceptual reasoning, and visuospatial perception. Thus, the impact of deficits in these areas on functional status was not ascertained. It is reasonable to expect that deficits in these areas of cognitive functioning could impair the MS patient’s ability to carry out daily activities, but this is especially the case for executive functioning, something required for the successful planning and completion of most tasks. Second, it is important not to generalize these findings beyond the specific measures included in the study. Had other clinical measures or other measures developed with ecological validity in mind been included, the results may have been different. Third, because some of the RBMT subtests were not administered due to ceiling effects and some patients’ difficulty ambulating, the ecological validity of the entire test was not truly assessed. Although the validity of considering a subset of the total RBMT subtests as we did is unknown because the subtests were not designed to stand alone, the subtests did seem to show some validity within our sample in that some did predict functional ability. Nonetheless, one of the reasons that the RBMT total

202

C. I. Higginson, P. A. Arnett, and W. D. Voss

score did not significantly predict functional ability may have been due to our failure to include all of the subtests in the measure in the summary index. Fourth, in some of the regression analyses, global measures composed of multiple subtests were compared to individual indices. Because the composite scores may have greater stability, they may be more likely to correlate with functional status, and thus, a bias may exist where they have a better chance of significantly predicting the criterion. However, in several cases individual subtests predicted the criterion better than composite scores, making this interpretation less likely. Fifth, the number of statistical analyses conducted provides a potential for Type I error. Alpha was not reduced because all analyses were planned a priori and the sample size afforded limited power to detect significant findings. Thus, the ability to detect relationships even at p ⬍ .05 suggests reasonably strong effects. Also, the many variables which influence functional status (Franzen & Wilhelm, 1996), reduces the variance accounted for by each. Another statistical consideration is that the relatively large number of predictors in comparison to the sample size creates a potential for instability of the predictive model. This is unlikely to be a concern here because the majority of predictors were not linearly related to the criterion; therefore, the models were simple (involving only two or three variables), leaving few possibilities for the order in which the variables could enter the equation. Finally, given the heterogeneity of MS, it is important to note that these results may not generalize to MS patients similar to those included in our sample: MS outpatients with moderate physical disability, the majority of whom were college-educated females with a relapsing-remitting course. In summary, our results suggest that if clinical neuropsychology wishes to fulfill its changing role, which now places a greater emphasis on predicting functional deficits, the development and adoption of measures which take into consideration the degree to which tasks simulate real-world behaviors should be considered. This is at least the case for patients who are generally high-functioning and have only mild to moderate cognitive dysfunction, like our sample of MS patients. Better prediction of limitations in MS patients’ everyday functioning should make neuropsychological evaluations more useful in treatment planning and patient management and, as such, result in better care for individuals suffering from the potentially devastating effects of MS.

REFERENCES Acker, M. B. (1990). A review of the ecological validity of neuropsychological tests. In D. E. Tupper & K. D. Cicerone (Eds.), The neuropsychology of everyday life: assessment and basic competencies (pp. 19–55). Boston: Kluwer Academic Publishers. Amato, M. P., Ponziani, G., Pracucci, G., Bracco, L., Siracusa, G., & Amaducci, L. (1995). Cognitive impairment in early-onset multiple sclerosis. Archives of Neurology, 52, 168–172. Antonak, R. F., & Livneh, H. (1995). Psychosocial adaptation to disability and its investigation among persons with multiple sclerosis. Social Science and Medicine, 40, 1099–1108. Beason-Hazen, S., Lynn, J., Rammohan, K. W., & Bornstein, R. A. (1995). Depressive symptoms and neuropsychological performance in chronic progressive and relapsing remitting multiple sclerosis. Unpublished manuscript. Beatty, W. W., Blanco, C. R., Wilbanks, S. L., Paul, R. H., & Hames, K. A. (1995). Demographic, clinical, and cognitive characteristics of multiple sclerosis patients who continue to work. Journal of Neurological Rehabilitation, 9, 167–173. Beatty, W. W., & Monson, N. (1991). Metamemory in multiple sclerosis. Journal of Clinical and Experimental Neuropsychology, 13, 309–327. Beck, A. T., & Steer, R. A. (1987). BDI: Beck Depression Inventory manual. New York: The Psychological Corporation. Broadbent, D. E., Cooper, P. F., FitzGerald, P, & Parkes, K. R. (1982). The Cognitive Failures Questionnaire (CFQ) and its correlates. British Journal of Clinical Psychology, 21, 1–16.

Ecological Validity of Clinical Tests in MS

203

Cella, D. F., Dineen, K., Arnason, B., Reder, A., Webster, K. A., Karabatsos, G., Chang, C., Lloyd, S., Mo, F., Stewart, J., & Stefoski, D. (1996). Validation of the functional assessment of multiple sclerosis quality of life instrument. Neurology, 47(7), 129–139. Delis, D. C., Kramer, J. H., Kaplan, E., & Ober, B. A. (1987). California Verbal Learning Test: Adult version. San Antonio, TX: The Psychological Corporation. Dikmen, S. S., Donovan, D. M., Loberg, T., Machamer, J. E., & Temkin, N. R. (1993). Alcohol use and its effects on neuropsychological outcome in head injury. Neuropsychology, 7, 296–305. Fischer, J. S., Foley, F. W., Aikens, J. E., Ericson, G. D., Rao, S. M., & Shindell, S. (1994). What do we really know about cognitive dysfunction, affective disorders, and stress in multiple sclerosis. Journal of Neurological Rehabilitation, 8, 151–164. Fisk, J. D., Pontefract, A., Ritvo, P. G., Archibald, C. J. & Murray, T. J. (1994). The impact of fatigue on patients with multiple sclerosis. Canadian Journal of Neurological Science, 21, 9–14. Franzen, M. D., & Wilhelm, S. (1996). Conceptual foundations of ecological validity in neuropsychological assessment. In R. J. Sbordone, & C. J. Long (Eds.), Ecological validity of neuropsychological testing (pp. 91–112). Delray Beach, FL: GR Press/St. Lucie Press. Gilchrist, A. C., & Creed, F. H. (1994). Depression, cognitive impairment and social stress in multiple sclerosis. Journal of Psychosomatic Research, 38(3), 193–201. Ginsberg, J. P., Kibby, M. Y., & Long, C. J. (1995). Ecological validity of neuropsychological data as indicated by interrelationships with vocational data. Poster session presented at the Annual Meeting of the National Academy of Neuropsychology, San Francisco, CA. Gronwall, D. M. A. (1977). Paced auditory serial-addition task: A measure of recovery from concussion. Perceptual and Motor Skills, 44, 367–373. Heinrichs, R. W. (1990). Current and emergent applications of neuropsychological assessment: Problems of validity and utility. Professional Psychology: Research and Practice, 21(3), 171–176. Hermann, D. J. (1979, December). The validity of memory questionnaires as related to a theory of memory introspection. Paper presented at the meeting of the British Psychological Society, London, UK. Holland, N. J., Francabandera, F., & Wiesel-Levison, P. (1986). International scale for assessment of disability in multiple sclerosis. Journal of Neuroscience Nursing, 18(1), 19–44. Iwasaki, Y., Kinoshita, M., Ikeda, K., & Takamiya, K. (1989). Cognitive function in multiple sclerosis and its relation to functional abilities. International Journal of Neuroscience, 48, 219–223. Kerns, K. A., & Mateer, C. A. (1996). Walking and chewing gum: The impact of attentional capacity on everyday activities. In R. J. Sbordone, & C. J. Long (Eds.), Ecological validity of neuropsychological testing (pp. 147–169). Delray Beach, FL: GR Press/St. Lucie Press. Kessler, H. R., Cohen, R. A., Lauer, K., & Kausch, D. F. (1992). The relationship between disability and memory dysfunction in multiple sclerosis. International Journal of Neuroscience, 62, 17–34. Kurtzke, J. F. (1983). Rating neurological impairment in multiple sclerosis: An expanded disability status scale (EDSS). Neurology, 33, 1444–1452. LaRocca, N., Kalb, R., Scheinberg, L., & Kendall, P. (1985). Factors associated with unemployment of patients with multiple sclerosis. Journal of Chronic Disease, 38, 203–210. Larrabee, G. J., & Crook, T. H. (1996). The ecological validity of memory testing procedures: Developments in the assessment of everyday memory. In R. J. Sbordone, & C. J. Long (Eds.), Ecological validity of neuropsychological testing (pp. 225–242). Delray Beach, FL: GR Press/St. Lucie Press. Lublin, F. D., & Reingold, S. C. (1996). Defining the clinical course of multiple sclerosis: Results of an international survey. Neurology, 46, 907–911. McIntosh-Michaelis, S. A., Roberts, M. H., Wilkinson, S. M., Diamond, I. D., McLellan, D. L., Martin, J. P., & Spackman, A. J. (1991). The prevalence of cognitive impairment in a community survey of multiple sclerosis. British Journal of Clinical Psychology, 30, 333–348. Murray, T. J. (1995). The psychosocial aspects of multiple sclerosis. Neurologic Clinics, 13(1), 197–223. Nussbaum, P. D., Goreczny, A., & Haddad, L. (1995). Cognitive correlates of functional capacity in elderly depressed versus patients with probable Alzheimer’s disease. Neuropsychological Rehabilitation, 5, 333–340. PASAT: User and technical manual. (1994). ForThought, Ltd. Poser, C. M., Paty, D. W., Scheinberg, L., McDonald, W. I., Davis, F. A., Ebers, G. C., Johnson, K. P., Sibley, W. A., Silberberg, D. H., & Tourtellotte, W. W. (1983). New diagnostic criteria for multiple sclerosis: Guidelines for research protocols. Annals of Neurology, 13, 227–231. Rao, S. M. (1986). Neuropsychology of multiple sclerosis: A critical review. Journal of Clinical and Experimental Neuropsychology, 8, 503–552. Rao, S. M., Hammeke, T. A., McQuillen, M. P., Khatri, B. O., & Lloyd, D. (1984). Memory disturbance in chronic progressive multiple sclerosis. Archives of Neurology, 41, 625–631. Rao, S. M., Leo, G. J., Bernardin, L., & Unverzagt, F. (1991). Cognitive dysfunction in multiple sclerosis. I. Frequency, patterns, and prediction. Neurology, 41, 685–691.

204

C. I. Higginson, P. A. Arnett, and W. D. Voss

Rao, S. M., Leo, G. J., Ellington, L., Nauertz, T., Bernardin, L., & Unverzagt, F. (1991). Cognitive dysfunction in multiple sclerosis. II. Impact on employment and social functioning. Neurology, 41, 692–696. Rao, S. M., Leo, G. J., Haughton, V. M., St. Aubin-Faubert, P., & Bernardin, L. (1989). Correlations of magnetic resonance imaging with neuropsychological testing in multiple sclerosis. Neurology, 39, 161–166. Robertson, I. H., Ward, T., Ridgeway, V., & Nimmo-Smith, I. (1994). The Test of Everyday Attention. Suffolk, UK: Thames Valley Test Co./Gaylord, MI: National Rehabilitation Services. Robertson, I. H., Ward, T., Ridgeway, V., & Nimmo-Smith, I. (1996). The structure of normal human attention: The test of everyday attention. Journal of the International Neuropsychological Society, 2, 525–534. Ruisel, I. (1991). Memory rating. Studia Psychologica, 33, 71–77. Sbordone, R. J. (1996). Ecological validity: Some critical issues for the neuropsychologist. In R. J. Sbordone, & C. J. Long (Eds.), Ecological validity of neuropsychological testing (pp. 15–41). Delray Beach, FL: GR Press/St. Lucie Press. Sbordone, R. J., & Long, C. J. (1996). Ecological Validity of neuropsychological testing. Delray Beach, FL: GR Press/St. Lucie Press. Shipley, W. C. (1940). A self-administering scale for measuring intellectual impairment and deterioration. Journal of Psychology, 9, 371–377. Shlechter, T. M., Hermann, D. J., Sronach, P., Rubenfeld, L., & Zenker, S. (1982, March). Metamemory questionnaires. Paper presented at the American Educational Research Meeting, New York, NY. Smith, A. (1968). The symbol-digit modalities test: A neuropsychologic test for economic screening of learning and other cerebral disorders. Learning Disorders, 3, 83–91. Sunderland, A., Harris, J. E., & Baddeley, A. D. (1983). Do laboratory tests predict everyday memory? A neuropsychological study. Journal of Verbal Learning and Verbal Behavior, 22, 341–357. Taylor, R. (1990). Relationships between cognitive test performance and everyday cognitive difficulties in multiple sclerosis. British Journal of Clinical Psychology, 29, 251–252. Thornton, A. E., & Raz, N. (1997). Memory impairment in multiple sclerosis: A quantitative review. Journal of the International Neuropsychological Society, 3(1), 18–29. Voss, W. D., Arnett, P. A., Stevens, K. L., Rao, S. M., & Bernardin, L. (1995, November). Effects of depression on functional ability in multiple sclerosis. Poster Session presented at the Annual Meeting of the National Academy of Neuropsychology, San Francisco, CA. Wilson, B. A. (1993). Ecological validity of neuropsychological assessment: Do neuropsychological indexes predict performance in everyday activities? Applied and Preventive Psychology, 2, 209–215. Wilson, B. A., Cockburn, J., & Baddeley, A. (1985). The Rivermead Behavioral Memory Test. Reading, England: Thames Valley Test Co./Gaylord, MI: National Rehabilitation Services.