Archives of Clinical Neuropsychology 16 (2001) 141–149
Development and initial validation of an Arabic version of the Expanded Trail Making Test: Implications for cross-cultural assessment

Daniel E. Stanczak a,*, Elizabeth M. Stanczak b, Abdel W. Awadalla b

a Wilford Hall Air Force Medical Center, Lackland Air Force Base, TX, USA
b Children's Hospital of Philadelphia, Philadelphia, PA, USA
Abstract

The performances of Sudanese subjects, both normal and brain damaged, on an Arabic version of the Expanded Trail Making Test were compared to those of normal and brain-damaged subjects from the United States, who completed the standard English version of this test. Preliminary psychometric properties of the Arabic version of the Expanded Trail Making Test were defined. Significant intergroup differences in performance were observed. Interestingly, the performances of Sudanese normals were found to be similar to those of U.S. brain-damaged subjects. The results are discussed in terms of reducing neuropsychological diagnostic errors attributable to ethnocultural factors. © 2001 National Academy of Neuropsychology. Published by Elsevier Science Ltd.
The views and opinions expressed herein are solely those of the authors, and endorsement by the Department of the Air Force, Department of Defense, or any other governmental body should not be inferred.
* Corresponding author. Daniel E. Stanczak, Psychology Research Service, 59th Medical Wing/MMCNB, 2200 Bergquist Drive, Suite 1, Lackland Air Force Base, TX, 73236. E-mail address: [email protected] (D.E. Stanczak).
0887-6177/01/$ – see front matter © 2001 National Academy of Neuropsychology. PII: S0887-6177(99)00060-8

It is clear from decades of research that a properly conducted neuropsychological examination is sine qua non for detecting cerebral dysfunction. In addition to measuring behavioral responses in fine detail during the course of such examinations, neuropsychologists also attend to variables such as handedness, gender, age, education, medications, and so forth, with which neuropsychological test scores are known to covary. Deviations from empirically or rationally established norms are then noted and interpreted as indices of dysfunction. Unfortunately, most neuropsychological test norms are derived from, and are thus primarily applicable to, the majority cultural groups in the United States, Canada, and other
English-speaking countries. Typically, when ethnic minorities are included in standardization samples, their performance data are pooled with those of the majority group. The application of majority group or mixed norms to ethnic subcultures may introduce systematic bias into the examination process. Despite this problem, few data exist regarding ethnic or cultural differences in neuropsychological test performance. Indeed, in neuropsychological research, the ethnic composition of samples is rarely identified. In reviewing research articles published in the Archives of Clinical Neuropsychology and the Journal of Clinical and Experimental Neuropsychology for the years 1988 through 1994, the authors found that the ethnic composition of experimental and control subjects was presented in only 83 (14.6%) of the 567 studies reviewed. Of those 83 studies, 15 (18%) listed the ethnic composition of samples merely as White versus non-White. In only 6 (7%) of the 83 studies was any type of analysis using ethnicity as a variable performed. In the 4,000 case studies collected by Ward Halstead, ethnicity is recorded for only 1,011 subjects, all of whom were Caucasian business executives (Egel & Hughes, 1989). Also of interest is the finding that of the 14 predictions Rourke (1991) made regarding changes in neuropsychological research and service delivery in the 1990s, none addressed the issue of cultural diversity. Training in neuropsychological service delivery with diverse populations is not mentioned in the Guidelines for Doctoral Training Programs in Clinical Neuropsychology (International Neuropsychological Society and the American Psychological Association-Division 40, 1987). Indeed, it was not until the recent, controversial Houston Conference (Reynolds, Bigler, & Horton, 1998) that issues of cultural diversity were first formally proposed as an essential component of neuropsychological training.
This historical and generally continuing lack of sensitivity to issues associated with ethnic diversity probably stems from the assumption, among neuropsychologists, that central nervous system functioning does not vary as a consequence of ethnocultural variables. Indeed, this assumption may be true, for we have no data suggesting that the nervous systems (or the gastric systems or the circulatory systems) of individuals from varying ethnocultural backgrounds function differently. It is clear, however, that the measures used by neuropsychologists to assess central nervous system functioning (scores obtained from neuropsychological tests, either quantitative or qualitative) are subject to variation attributable to random error and systematic error such as ethnocultural diversity (Ardila, Rosselli, & Puente, 1994; Ardila, Rosselli, & Rosas, 1989; Escobar et al., 1986; Puente, 1990). Thus, it is important for neuropsychologists to identify, and control for, systematic error that may be associated with cultural diversity.

Toward this end, the present study examines the cross-cultural generalizability of the Expanded Trail Making Test (ETMT; Stanczak, Lynch, McNeil, & Brown, 1998) to an Arabic population. Arabic is spoken by almost 200 million people in a geographic area ranging from Morocco to Iraq and as far south as Somalia and the Sudan. Much of this population lives in underdeveloped countries where neuropsychological services are largely absent. Though there are many dialects of modern Arabic, the classical written form of the language has remained essentially unchanged over several centuries.
1. Method

1.1. Subjects

Subjects were selected according to the recommendations of Stanczak, Stanczak, and Templer (in press). The normal control groups included 497 U.S. citizens (U.S. Normals) and 77 Sudanese (Sudanese Normals) with no history of psycho- or neuropathology. The brain-damaged groups consisted of 53 U.S. citizens (U.S. Brain Damaged) and 28 Sudanese (Sudanese Brain Damaged) with neurodiagnostically confirmed brain lesions. The gender compositions of the groups were (a) U.S. Normals: 188 males, 307 females; (b) U.S. Brain Damaged: 39 males, 14 females; (c) Sudanese Normals: 33 males, 44 females; and (d) Sudanese Brain Damaged: 22 males, 6 females. A neuropsychologist or a neuropsychology doctoral candidate tested all U.S. subjects. A Sudanese neuropsychology doctoral candidate tested Sudanese subjects.

One-way analyses of variance (ANOVAs) were performed to determine if the groups differed significantly in terms of age or education. Both ANOVAs were significant (age: MS = 7501.23, df = 3, 389, F = 33.20, P < .0001; education: MS = 229.55, df = 3, 389, F = 27.24, P < .0001). Post hoc Scheffé tests revealed significant differences in education between the U.S. Normals and Sudanese Normals, U.S. Normals and Sudanese Brain-Damaged, U.S. Normals and U.S. Brain-Damaged, U.S. Brain-Damaged and Sudanese Brain-Damaged, and Sudanese Normals and Sudanese Brain-Damaged groups. In terms of age, significant differences existed between the U.S. Normals and Sudanese Normals, U.S. Normals and Sudanese Brain-Damaged, and U.S. Brain-Damaged and Sudanese Normals groups (Table 1).

Both lesion groups were heterogeneous in terms of lesion type, selected so as to optimize the external validity of the study (Gemmell & Stanczak, 1996). The most common lesion types were closed head injury, cerebrovascular accidents, and neoplasia.

Table 1
Sample demographics

                          Age                 Education
Sample                    M       (SD)        M       (SD)
U.S. normals              42.78   (16.43)     14.79   (2.47)
U.S. brain damaged        36.02   (17.48)     12.37   (3.14)
Sudanese normals          24.84   (7.88)      13.19   (3.43)
Sudanese brain damaged    26.61   (12.69)     10.36   (4.16)

Fig. 1. Sample items for Forms A and B of the Arabic ETMT.

1.2. Procedure

Five forms of the Expanded Trail Making Test were administered in the order A, B, X, Y, and Z to all subjects. The first four forms are described in Stanczak et al. (1998). Form Z requires the subject to alternate between clock faces and dots of increasing diameter. A 180-second time limit was imposed for completion of each form. Scores were prorated for subjects not completing a form within the 3-minute time limit. Prorated scores were calculated as
follows: (a) 180 seconds was divided by the number of successfully connected stimuli (counting the "begin" stimulus as number one), and (b) the result was then multiplied by the total number of stimuli on the form (Forms A, B, X, and Z = 25; Form Y = 24). The total time, actual or prorated, served as the dependent measure for each form.

For all Sudanese subjects, Arabic versions of the ETMT were administered. These versions were constructed to be identical to the English versions with the exception that appropriate Arabic numbers and letters replaced the respective English symbols. In addition, the words "begin" and "end" were translated into Arabic. A native Sudanese neuropsychology doctoral candidate, who also tested all Sudanese participants, translated the test instructions into Arabic. Back translation of instructions was not attempted. The sample items for Forms A and B of the Arabic ETMT are shown in Fig. 1.

1.3. Design

A preliminary review of the data revealed leptokurtic, positively skewed distributions. Thus, natural logarithmic transformations were conducted on the raw ETMT scores. Although this procedure yielded distributions that more closely approximated the normal curve, extreme scores were still noted. To produce more robust estimates of population means, mean logarithmic scores for all ETMT variables except Form A, which did not produce outliers, were subjected to one to two percent Winsorizing (Table 2), depending on the particular variable. Group differences were then examined using univariate (Group × ETMT form) ANOVAs and Scheffé post hoc tests.

Table 2
Mean logarithmic scores by group and ETMT form

                          Form A         Form B         Form X         Form Y         Form Z
Group                     M     (SD)     M     (SD)     M     (SD)     M     (SD)     M     (SD)
U.S. normals              3.25  (.37)    4.06  (.39)    4.85  (.63)    4.12  (.45)    4.50  (.39)
U.S. brain damaged        3.71  (.48)    4.75  (.54)    5.42  (.61)    4.64  (.48)    4.99  (.35)
Sudanese normals          4.01  (.51)    4.80  (.58)    5.03  (.57)    4.65  (.39)    5.00  (.43)
Sudanese brain damaged    4.41  (.51)    5.02  (.69)    5.52  (.61)    4.91  (.42)    5.29  (.47)

2. Results

Results of the univariate ANOVAs are summarized in Table 3. As can be seen, all analyses were significant at the P < .001 level, even after Bonferroni corrections for multiple comparisons. Follow-up Scheffé tests revealed significant group differences, which are summarized in Table 4. Group means are graphically displayed in Fig. 2.
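The scoring steps described under Procedure and Design (prorating forms that timed out, taking natural logarithms, and Winsorizing) can be illustrated with a minimal sketch. The function names, the symmetric two-tailed Winsorizing rule, and the example values are assumptions made for illustration only; the article does not specify how the one to two percent Winsorizing was applied to each variable.

```python
import math

TIME_LIMIT_S = 180.0  # per-form time limit stated in the Procedure


def prorated_time(n_connected, n_stimuli, time_limit=TIME_LIMIT_S):
    """Prorated completion time for a form that was not finished in time.

    n_connected counts the "begin" stimulus as number one;
    n_stimuli is 25 for Forms A, B, X, and Z and 24 for Form Y.
    """
    if not 1 <= n_connected <= n_stimuli:
        raise ValueError("n_connected must lie between 1 and n_stimuli")
    # (a) seconds per connected stimulus, (b) scaled to the whole form
    return time_limit / n_connected * n_stimuli


def log_scores(times):
    """Natural-log transform of raw (actual or prorated) times."""
    return [math.log(t) for t in times]


def winsorize(values, fraction):
    """Symmetric Winsorizing (an assumed variant).

    The k lowest and k highest values, with k = round(fraction * n),
    are replaced by the nearest value that is retained.
    """
    ordered = sorted(values)
    k = round(fraction * len(ordered))
    if k == 0:
        return list(values)
    low, high = ordered[k], ordered[-k - 1]
    return [min(max(v, low), high) for v in values]


if __name__ == "__main__":
    # Hypothetical subject: 20 of 25 stimuli connected when time ran out.
    t = prorated_time(20, 25)
    print(t, math.log(t))
```

For the hypothetical subject above, the prorated time is 180 / 20 × 25 = 225 seconds, which then enters the logarithmic transformation like any actual completion time.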
Table 3
Results of univariate ANOVAs

ETMT form    MS       df        F         P
A            24.78    3, 651    151.73    < .001
B            23.88    3, 651    121.91    < .001
X            22.18    3, 651    22.18     < .001
Y            13.74    3, 651    68.29     < .001
Z            12.50    3, 651    79.59     < .001
Table 4
Results of Scheffé post hoc analyses

Group    USN      USBD    SN
USBD     ABXYZ
SN       ABYZ     AX
SBD      ABXYZ    AZ      AXZ

Note. USN = U.S. Normals; USBD = U.S. Brain Damaged; SN = Sudanese Normals; SBD = Sudanese Brain Damaged; A = Expanded Trail Making Test (ETMT) Form A; B = ETMT Form B; X = ETMT Form X; Y = ETMT Form Y; Z = ETMT Form Z.
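The univariate analyses summarized in Tables 3 and 4 follow the usual one-way ANOVA logic: F is the ratio of the between-group to the within-group mean square, and with 655 subjects in four groups the degrees of freedom are 3 and 651. The sketch below is illustrative only; the function names and the toy data are assumptions, not the study's data or software. The Scheffé statistic it returns must still be compared with a critical F value for (k − 1, N − k) degrees of freedom taken from a table or a statistics package.

```python
def one_way_anova(groups):
    """Return (ms_between, ms_within, df_between, df_within, F)."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, df_between, df_within, ms_between / ms_within


def scheffe_statistic(mean_i, mean_j, n_i, n_j, ms_within, k):
    """Scheffe statistic for one pairwise contrast among k groups.

    Significant if it exceeds the critical F for (k - 1, N - k) df.
    """
    contrast_f = (mean_i - mean_j) ** 2 / (ms_within * (1 / n_i + 1 / n_j))
    return contrast_f / (k - 1)


if __name__ == "__main__":
    # Toy data (hypothetical): three small groups of log scores.
    toy = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
    print(one_way_anova(toy))
```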
To determine the diagnostic equivalence of the English and Arabic versions of the ETMT, separate direct discriminant analyses were conducted for each ethnocultural group, using diagnosis (normal vs. brain damaged) as the dependent variable. The results of these analyses are summarized in Table 5. Overall, these results indicate roughly comparable diagnostic utility for the English and Arabic versions of the ETMT, with the exception that the English version had somewhat better negative predictive power and, as a consequence, a somewhat higher overall correct classification rate.

Because significant between-group differences existed in terms of age, education, and gender, multiple regression was used to decompose the between-group (U.S. vs. Sudanese) variance, thereby providing estimates (squared part correlations) of the unique variance explained by each ETMT form and demographic variables. The results of this decomposition are summarized in Table 6. As indicated by this table, the results of the present study are potentially confounded by large sample differences in demographics. Thus, to determine whether or not the unique variance explained by each of the ETMT forms was significant, an analysis of covariance was performed with group (brain damaged vs. normal) and country (United States vs. Sudan) as the fixed effects, the five ETMT forms as the dependent variables, and age, gender, and education as covariates. This analysis suggested that even when the variance attributable to age and education is partialled out, the unique amount of between-group variance explained by each ETMT form is still significant (Table 7). Therefore, although significant sample differences exist, in terms of demographic variables, these differences do not appear to have confounded the obtained results.

Fig. 2. Mean logarithmic scores: ETMT form by subject group.

Table 5
Comparison of the psychometric properties of the English and Arabic forms of the Expanded Trail Making Test

Index                        U.S. sample    Sudanese sample
Sensitivity                  .19            .28
Specificity                  .98            .92
Positive predictive power    .45            .57
Negative predictive power    .92            .78
False positive rate          .02            .08
False negative rate          .81            .71
Overall hit rate             .90            .76
Chance agreement             .87            .68
Kappa                        .22            .25
Standard error of kappa      .04            .09
Z of Kappa                   5.86           2.83
Probability                  < .00001       < .005

Table 6
Decomposition of between-group variance

Variable            % unique variance explained
ETMT Form A(Log)    2.71
ETMT Form B(Log)    1.97
ETMT Form X(Log)    2.01
ETMT Form Y(Log)    .29
ETMT Form Z(Log)    .16
Age                 12.31
Education           1.65
Gender              4.62
All variables       25.72

Note. ETMT = Expanded Trail Making Test.

3. Discussion

Besides providing preliminary data regarding the validity of an Arabic form of the ETMT, the results suggest that ethnocultural variables can indeed impact performance on neuropsychological tests. Of particular interest was the finding that, on the respective versions of the ETMT, the scores obtained by Sudanese normals were generally similar to those obtained by U.S. brain-damaged subjects. The application of U.S. norms to Sudanese subjects would, therefore, yield an unacceptable false positive diagnostic rate. Thus, the assumption that neuropsychological tests measure central nervous system function and that such function is invariant across cultures is not supported. Accordingly, it is suggested that to minimize Type I and II diagnostic errors, it is important that ethnocultural variables be considered during the neuropsychological assessment process.
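The diagnostic indices listed in Table 5 can all be derived from a 2 × 2 table that crosses actual diagnosis with predicted diagnosis. The following sketch shows one conventional way to compute them. The function name and the cell counts are hypothetical and are not the study's classification data; the standard error and Z of kappa are omitted because the article does not state which formula was used.

```python
def diagnostic_indices(tp, fn, fp, tn):
    """Indices from a 2 x 2 table (brain damaged = positive class).

    tp: brain damaged classified as brain damaged
    fn: brain damaged classified as normal
    fp: normal classified as brain damaged
    tn: normal classified as normal
    """
    n = tp + fn + fp + tn
    hit_rate = (tp + tn) / n
    # Agreement expected by chance from the marginal totals
    chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive_predictive_power": tp / (tp + fp),
        "negative_predictive_power": tn / (tn + fn),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        "overall_hit_rate": hit_rate,
        "chance_agreement": chance,
        "kappa": (hit_rate - chance) / (1 - chance),
    }


if __name__ == "__main__":
    # Hypothetical counts chosen for easy arithmetic, not taken from the study.
    for name, value in diagnostic_indices(tp=10, fn=40, fp=10, tn=440).items():
        print(f"{name}: {value:.3f}")
```

With a rare positive class, as in this hypothetical example, the overall hit rate can be high while sensitivity and kappa remain low, which is the pattern visible in both columns of Table 5.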
Table 7
Results of the multivariate analysis of covariance

ETMT Form    SS        df        F         P
A(Log)       104.65    5, 531    133.17    < .001
B(Log)       112.58    5, 531    103.56    < .001
X(Log)       77.74     5, 531    33.01     < .001
Y(Log)       55.33     5, 531    44.37     < .001
Z(Log)       54.96     5, 531    51.25     < .001

Note. ETMT = Expanded Trail Making Test.
There are basically two approaches to controlling for variance attributable to ethnocultural differences. The first approach, and the approach taken in the present study, is to examine the performance of major ethnocultural groups on currently available neuropsychological tests, adapting such tests when necessary for linguistic differences. The second approach is illustrated by the work of Maj et al. (1993), who attempted to construct a single instrument (the Color Trails Test) for universal cross-cultural use. Although this is an equally viable and logical approach, it is also a problematic one. For instance, in adapting a test such as the Trail Making Test for universal cross-cultural use, modifications of the original test stimuli can alter the instrument's construct validity. Indeed, as is the case with the Color Trails Test, when a test represents a significant modification of the original instrument, it is difficult to determine if differences in variance between the original and modified form of the test are attributable to reduced cultural bias or to other, more nonspecific factors. In the case of the Color Trails Test, such differences in variance might be due to a number of factors, unrelated to cultural variation, such as the introduction of color, the elimination of alphabetic characters, potential differences in stimulus complexity or spatial arrangement, or the relative difficulty of the tasks (Stanczak et al., 1998).

Although the present study reveals significant differences in the ETMT performance of U.S. and Sudanese subjects, this finding is potentially confounded by significant sample differences in terms of age, education, and gender. Nevertheless, a decomposition of the between-group variance and a subsequent analysis of covariance demonstrated that the ETMT still explains a significant unique proportion of this variance when the effects of demographic variables are partialled out.
It is also probable that the estimates of variance explained by demographic variables will prove to be conservative and, with more evenly matched samples, these estimates will shrink. Nevertheless, the present findings should be considered preliminary until more matched samples can be examined. It should also be noted that the mean age of the U.S. samples was significantly higher than that of the Sudanese samples. Given that increasing age is generally associated with longer ETMT completion times (McNeil, 1995), it would be expected that the U.S. sample, in the absence of a significant main effect for ethnocultural factors, would have produced higher ETMT scores. Indeed, the opposite effect was found in the current study, with the U.S. sample producing markedly lower ETMT completion times. This observation would also argue against the potentially confounding effect of sample differences in age. The present study does not provide a rationale for the finding of significant differences in ETMT performance between U.S. and Sudanese subjects. Indeed, many hypotheses could be
constructed. One could reasonably argue that the observed differences in ETMT performance are attributable to actual differences in cognitive styles which, in turn, are influenced by sociocultural factors. In contrast, one could also argue that developmental factors such as differences in nutrition, health care, educational opportunities, and so forth, explain the observed differences. However, it should be noted that the Sudanese subjects in the present study were relatively well-educated and healthy, and most were functioning in a professional capacity. Because psychological tests are not widely employed in the Sudan, it is also possible that the observed group differences are attributable, at least in part, to the relative novelty of the testing process for the Sudanese subjects. Whatever the reason or reasons, it is clear that our understanding of ethnocultural differences in neuropsychological test performance, and the bases for such differences, is only in its infancy and that considerable research will be required before the variance in neuropsychological test performance attributable to ethnocultural factors can be adequately explained.

References

Ardila, A., Rosselli, M., & Puente, A. E. (1994). Neuropsychological evaluation of the Spanish speaker. New York: Plenum Press.
Ardila, A., Rosselli, M., & Rosas, P. (1989). Neuropsychological assessment in illiterates: Visuospatial and memory abilities. Brain and Cognition, 11, 147–166.
Egel, L., & Hughes, H. (1989). Halstead's clinical legacy. Archives of Clinical Neuropsychology, 4, 175–196.
Escobar, J. I., Burman, A., Karno, M., Forsythe, A., Landsverk, J., & Golding, J. M. (1986). Use of the Mini-Mental State Examination (MMSE) in a community population of mixed ethnicity. Journal of Nervous and Mental Disease, 174, 607–614.
Gemmell, S. B., & Stanczak, D. E. (1996). Subject selection in neuropsychological research: The use of homogeneous versus heterogeneous samples. Paper presented at the 17th Annual Central California Research Symposium, California State University, Fresno, CA, May.
International Neuropsychological Society and the American Psychological Association-Division 40. (1987). Reports of the INS-Division 40 task force on education, accreditation, and credentialing. The Clinical Neuropsychologist, 1, 29–34.
Maj, M., D'Elia, L., Satz, P., Janssen, R., Zaudig, M., Uchiyama, C., Starace, F., Galderisi, S., & Chervinsky, A. (1993). Evaluation of two new neuropsychological tests designed to minimize cultural bias in the assessment of HIV-1 seropositive persons: A WHO study. Archives of Clinical Neuropsychology, 8, 123–136.
McNeil, C. K. (1995). The psychometric properties of the Trail Making Test Forms X, Y, and Z: Geropsychological implications. Unpublished doctoral dissertation, California School of Professional Psychology, Fresno.
Puente, A. E. (1990). Psychological assessment of minority group members. In G. Goldstein & M. Hersen (Eds.), Handbook of psychological assessment. New York: Pergamon Press.
Reynolds, C. R., Bigler, E. D., & Horton, A. M. Jr. (1998). Proceedings of the Houston conference on specialty education and training in clinical neuropsychology. Archives of Clinical Neuropsychology, 13(2) (Special issue).
Rourke, B. P. (1991). Human neuropsychology in the 1990s. Archives of Clinical Neuropsychology, 6, 1–14.
Stanczak, D. E., Lynch, M. D., McNeil, C. K., & Brown, B. (1998). The Expanded Trail Making Test: Rationale, development, and psychometric properties. Archives of Clinical Neuropsychology, 13, 473–487.
Stanczak, E. M., Stanczak, D. E., & Templer, D. I. (in press). Subject selection procedures in neuropsychological research: A meta-analysis and prospective study. Archives of Clinical Neuropsychology.