Neuropsychology 1996, Vol. 10, No. 1, 120-124
In the public domain
An Empirical Approach to Determining Criteria for Abnormality in Test Batteries With Multiple Measures
Loring J. Ingraham and Christopher B. Aiken
National Institute of Mental Health
Investigators and clinicians are sometimes confronted with the task of determining whether a pattern of scores on a variety of measures reflects impairment. As the number of measures increases, so does the probability that scores on some of the measures will be in the abnormal range. A simple method for setting an overall criterion for abnormality when using multiple measures is presented, and an approach for setting criteria when some of the measures are correlated is suggested. The approach is applied to a study of the neuropsychological manifestations of HIV-1 infection, and the model is found to agree with observed results.
For a variety of reasons, investigators and clinicians often select a number of measures to assess a sample of interest or to screen individual patients. In some cases, there are several domains that may be affected by a pharmacological agent or a given form of psychopathology, and evaluation of each is considered necessary. For example, in studies of schizophrenia, evaluation of affect, cognition, and perception is relevant. Each of these domains has several processes, such as the cognitive processes of memory, attention, and language. Within each process, there may be several components or factors that merit evaluation, such as the ability to focus, maintain, and shift attention (Mirsky, Anthony, Duncan, Ahearn, & Kellam, 1991). A final source of the proliferation of measures is that tests may have multiple subscores or scales. When examining the results of multiple tests, the clinician confronts the problem of determining how many deviant scores are necessary to diagnose a patient as abnormal or whether the configuration of scores is significantly different from an expected pattern. Investigators face a similar dilemma in deciding whether a population is abnormal when studying the number of deviant scores in a sample. In both of these situations, the probability that individuals will have deviant test scores rises as the number of tests in the battery is increased. Attempts to calculate this probability, particularly in relation to blood test batteries, are constrained by the assumptions of independence among tests and of a Gaussian distribution of the results (Phillips & Thompson, 1980; Schoen & Brooks, 1970).
Here we present one approach to the development of empirically determined criteria for abnormality in test batteries composed of multiple tests and suggest an approach to compensating for nonindependence of tests within a battery. In addition, predictions from this approach are then tested in comparison to observed empirical results from a study of the neuropsychological manifestations of human immunodeficiency virus type 1 (HIV-1) infection.
Method
The binomial probability distribution was applied to hypothesized independent, normally distributed test scores to produce families of curves representing the probability of exceeding cut-off criteria by chance when an a priori hypothesis predicts a decrement in performance (and thus reflects one tail of the probability distribution). A formula is presented in the Appendix for creating such curves for other sets of criteria.
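The calculation described above can be sketched in a few lines of Python. This is a minimal illustration under the stated assumptions (independent, normally distributed scores, one-tailed cut-off), not the Appendix formula itself; the function names `tail_prob` and `prob_at_least` are hypothetical and chosen only for this example.

```python
from math import comb, erfc, sqrt


def tail_prob(z):
    """One-tailed probability that a standard normal score lies more than
    z standard deviations beyond the mean (assumes Gaussian scores)."""
    return 0.5 * erfc(z / sqrt(2))


def prob_at_least(k, n, z):
    """Probability that at least k of n independent tests exceed a cut-off
    of z standard deviations by chance (binomial upper tail)."""
    p = tail_prob(z)
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


if __name__ == "__main__":
    # Chance probability of at least one "impaired" score at a 2.0 SD cut-off
    # for batteries of increasing size.
    for n in (1, 5, 10, 20):
        print(n, round(prob_at_least(1, n, 2.0), 4))
```

Varying `k`, `n`, and `z` traces out one family of curves per criterion, for example at least two scores beyond 1.5 standard deviations in a battery of `n` tests.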
Results
There is a greater probability of exceeding cut-off criteria as the number of tests in a battery increases, as the distance from the mean (in standard deviations) that is required to be classified as abnormal on each test decreases, and as the number of tests in the battery that require such a deviation decreases.
We have approached our model from the perspective of a one-tailed distribution, such that deviations in only one direction are considered abnormal. In some situations, a two-tailed model might be more appropriate, and Figure 1 includes legends for both one- and two-tailed tests. To create the legend for two-tailed tests, we assumed that one would effectively be giving twice as many tests, one for each tail of the distribution. This assumption was found to approximate the two-tailed probabilities at best by ±0.0001 and at worst by ±0.034; a curve of the exact probabilities can be generated with the formula in the Appendix.
Figure 1a plots the probability of exceeding 1.0, 1.5, or 2.0 standard deviations on at least one test of a battery of n tests. Figure 1a indicates that in almost all cases using a cut-off criterion based on only one impaired score results in exceeding the criterion more than 5% of the time.
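The two-tailed shortcut described above (treating a two-tailed battery of n tests as a one-tailed battery of 2n tests) can be compared with a direct two-tailed calculation. The sketch below is illustrative only: it assumes independent Gaussian scores, and its notion of "exact" (a binomial with the per-test probability doubled) is an assumption of this example that may differ from the Appendix formula, so the gaps it prints are not expected to reproduce the reported figures. All function names are hypothetical.

```python
from math import comb, erfc, sqrt


def one_tail(z):
    """One-tailed probability of a standard normal score beyond z SD."""
    return 0.5 * erfc(z / sqrt(2))


def binom_at_least(k, n, p):
    """Probability of at least k successes in n independent trials."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def two_tailed_direct(k, n, z):
    """n tests, each abnormal if it deviates by more than z SD in either
    direction (per-test probability is twice the one-tailed value)."""
    return binom_at_least(k, n, 2 * one_tail(z))


def two_tailed_doubled(k, n, z):
    """Shortcut from the text: treat the battery as 2n one-tailed tests."""
    return binom_at_least(k, 2 * n, one_tail(z))


def approximation_gap(k, n, z):
    """Direct two-tailed probability minus the doubled-battery shortcut."""
    return two_tailed_direct(k, n, z) - two_tailed_doubled(k, n, z)


if __name__ == "__main__":
    for z in (1.0, 1.5, 2.0):
        for n in (1, 5, 10):
            print(z, n, round(approximation_gap(1, n, z), 4))
```

For a criterion of at least one deviant score, the shortcut replaces the per-test probability 2p with 2p - p^2, so under these assumptions it slightly understates the direct two-tailed probability, and the discrepancy shrinks as the cut-off becomes more extreme.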
Loring J. Ingraham and Christopher B. Aiken, Laboratory of Psychology and Psychopathology, National Institute of Mental Health, Bethesda, Maryland. This work was supported in part by a Stanley Scholars Award from the Theodore and Vada Stanley Foundation. We thank T. Peter Bridge, who was instrumental in encouraging the development of an empirical approach for studies in which multiple measures are used. Portions of this article were presented at the Neurological and Neuropsychological Complications of HIV Infection Update, June 1&-19, 1990, Monterey, California. Correspondence concerning this article should be addressed to Loring J. Ingraham, National Institute of Mental Health, Building 10, Room 4C110, Bethesda, Maryland 20892. Electronic mail may be sent via Internet to
[email protected].
[Figure 1a: Probability when criterion is at least 1 impaired score. Curves: 1@1SD, 1@1.5SD, 1@2SD. Vertical axis: probability (0 to 1.00, with a reference line at p = .05). Horizontal axis: number of tests in battery (bottom axis is one-tailed, top axis is two-tailed).]
[Figure 1b: Probability when criterion is at least 2 impaired scores. Curves: 2@1SD, 2@1.5SD, 2@2SD. Axes as in Figure 1a.]
[Figure 1c: Probability when criterion is at least 3 impaired scores.]