Extracting Principal Components and Determining Critical Questions in Conners ADHD Questionnaire
Farnaz Ghassemi, Student Member, IEEE, Mohammad Hasan Moradi, Member, IEEE, Mahdi Tehrani Doust, and Vahid Abootalebi, Member, IEEE
Abstract— The Conners Adult ADHD Rating Scale (CAARS) is one of the reliable questionnaires for evaluating ADHD in adults. The goal of this study is to extract the principal components of the screening version of the questionnaire and to evaluate the validity of estimating missed answers. The study is performed on 380 volunteers. Eight principal components are extracted by means of PCA. In the next step, for each question, an estimate of the answer is calculated (assuming that this answer was missed), and a Kruskal-Wallis test is then performed to evaluate the difference between the original answer and its estimate. The results indicate that for some particular questions there is a significant difference between the original and estimated answers. However, the result of the Multiple Comparison Procedure shows that, when evaluated over the whole group, the estimate does not differ significantly from the original value in any of the four subscales of the questionnaire. Nevertheless, if the missed question is a critical one, more care should be taken.
I. INTRODUCTION
Attention Deficit/Hyperactivity Disorder (ADHD) was originally thought to be primarily a pediatric condition. However, in recent years, researchers have consistently demonstrated that ADHD is often a chronic condition that persists into adulthood [1], [2]. The available data suggest that between 30 and 70 percent of children with ADHD continue to manifest symptoms in adulthood [2]-[6]. In addition to the core symptoms of ADHD, which involve problems in attention, hyperactivity, and impulsivity, adults subject to this disorder have been found to be at risk for a variety of other problems and conditions. For example, ADHD adults have been found to be at risk for lower levels of educational and occupational attainment, employment instability, substance abuse, and antisocial behavior [1].
Manuscript received February 14, 2009. F. Ghassemi is with the Biomedical Engineering Faculty of Amirkabir University of Technology, Tehran, Iran (corresponding author, phone: (+98)912-326-0661; fax: (+98)21-6642-0672; e-mail:
[email protected]). M. Moradi is with the Biomedical Engineering Faculty of Amirkabir University of Technology, Tehran, Iran (e-mail:
[email protected]). M. Tehrani Doust is with the Psychology Department, University of Tehran, Tehran, Iran, and Institute for Cognitive Science Studies (ICSS), Tehran, Iran (e-mail:
[email protected]). V. Abootalebi is with the Electrical Engineering Faculty of Yazd University, Yazd, Iran (e-mail:
[email protected]).
It is estimated that between 1 and 7 percent of the adult population experiences ADHD symptoms [2]-[4].
Up to now, different studies have been carried out on various types of attention and their relations to brain activity. There are activation states of the cerebral cortex that affect the ability to process information, where the activation itself contains no specific information. These activation states can be tonic or phasic and may be relatively global or more localized. Terms that have been used to describe these states include arousal, alertness, vigilance, and attention. Unfortunately, no term is ideal for describing these states of cortical activation, since most terms are in broad use with varied associations and there are no perfect physiological markers [7]-[9].
The four most commonly used self-report measures for ADHD are the Conners adult ADHD scale [1], [10], the Wender rating scale [11], the Copeland symptom checklist [12], and the Brown scale [13], [3]. There is a growing number of brief self-administered screens, including the WHO Adult Self-Report-Scale-V1.1 screener [14]. These screening tools should not be used for diagnostic purposes, as inattention, impulsivity, and volatile moods are features of several other psychiatric conditions. The screening forms are useful when a quick screen for DSM-IV ADHD symptoms is required.
The Conners Adult ADHD Rating Scale (CAARS) is a set of easily administered self-report and observer-rated instruments. It is designed to assess symptoms and behaviors related to ADHD in adults, and it represents an advance in the assessment of the psychopathology and problem behaviors associated with adult ADHD.
A main problem with self-report measures is the high probability that patients suffering from this disorder will miss questions, which can disturb the evaluation of the questionnaire. In this case, to evaluate the questionnaire, the operator should complete the missed answer(s).
Therefore, the missed answer should be estimated by averaging the answers to the other questions in that subscale and inserting the result in place of the missed answer. The goal of this study is to extract the principal components of the questionnaire and to evaluate the validity of estimating missed or scratched answers in the screening version of the CAARS questionnaire.
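The averaging rule described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `estimate_missed_answer` and the example ratings are hypothetical, and the estimate is left unrounded because the text does not say whether the inserted value is rounded to the 0-3 scale.

```python
from statistics import mean


def estimate_missed_answer(answers, missed_index):
    """Estimate one missed item as the mean of the other items in its subscale.

    `answers` holds the 0-3 Likert ratings of a single subscale and
    `missed_index` marks the item that is treated as missing.
    """
    others = [a for i, a in enumerate(answers) if i != missed_index]
    if not others:
        raise ValueError("need at least one answered item in the subscale")
    return mean(others)


# Hypothetical ratings for a 9-item subscale (made-up values, not study data).
subscale = [2, 1, 3, 0, 2, 1, 1, 2, 3]

# Treat the first item as missed and average the remaining eight.
estimate = estimate_missed_answer(subscale, missed_index=0)
print(estimate)
```

Here the first rating is ignored and the remaining eight ratings are averaged, which mirrors the replacement step applied to one question at a time.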
978-1-4244-3316-2/09/$25.00 ©2009 IEEE
Authorized licensed use limited to: Victoria Univ of Wellington. Downloaded on January 14, 2010 at 06:33 from IEEE Xplore. Restrictions apply.
II. METHODS AND MATERIALS
A. Conners Questionnaire
The CAARS is a suitable instrument for reporting symptoms and behaviors related to ADHD in adults aged 18 and up. Both the self-report and observer forms use a 4-point Likert-style format (0 = Not at all, never; 1 = Just a little, once in a while; 2 = Pretty much, often; 3 = Very much, very frequently) in which respondents are asked to rate items pertaining to their behavior/problems. The self-report screening form (CAARS-S:SV), which is used in this study, has 30 items and 3 DSM-IV ADHD symptom measures that assess ADHD symptoms according to the criteria outlined in the DSM-IV [3]: a 9-item inattentive symptoms subscale (A group), a 9-item hyperactive-impulsive symptoms subscale (B group), and a total ADHD symptoms subscale (C group). The 12-item ADHD index is also included on the form (D group).
Interpretation of the CAARS requires a general understanding of the nature of ADHD symptoms across the life span. The CAARS should be interpreted by analyzing where a particular individual's scores fall with respect to the CAARS population norms. For example, an individual with a T-score above 70 on the ADHD index is likely to have significant levels of symptoms that may meet diagnostic criteria such as those in the DSM-IV [1]. The T-score is a standardized score with the useful feature that each subscale has the same mean and standard deviation. This feature allows direct comparison of the scores on one subscale with the scores on another, which is not possible with the raw scores. For the CAARS, high T-scores represent a problem and low T-scores suggest that the individual does not present particular symptoms. The minimum possible T-score for each group based on age and gender is indicated in Table I. The maximum T-score in all cases is 90%.
B. Subjects
380 volunteers with an average age of 22.5 ± 2.96 years participated in the experiment.
The characteristics of the subjects’ distribution in age, gender, and education are shown in Fig. 1. In this population, 55 percent of the subjects were female, 51 percent were graduates, and 33 percent were undergraduates. Since financial incentives affect the
TABLE I THE MINIMUM POSSIBLE VALUE FOR T-SCORE
Gender   Age    Group A  Group B  Group C  Group D
Males    18-28  36%      33%      31%      30%
Males    28-38  36%      33%      31%      33%
Males    38 <   28%      32%      28%      34%
Females  18-28  35%      31%      30%      32%
Females  28-38  35%      31%      30%      31%
Females  38 <   29%      29%      26%      33%
The minimum possible T-score for each group based on age and gender is indicated. For the CAARS, high T-scores represent a problem and low T-scores suggest that the individual does not present particular symptoms.
attention level [7], subjects were not paid to participate in the experiment.
C. Statistical Analysis and Validity Evaluation
A software version of the questionnaire was prepared and used. Its benefits are fast evaluation of the questionnaire, increased accuracy, simple dispatch to each subject by e-mail, and simple construction of the database for statistical examination. All statistical tests and calculations are performed with "Matlab 2008" software.
In some situations, the dimension of the input vector is large, but the components of the vectors are highly correlated (redundant). It is useful in this situation to reduce the dimension of the input vectors. An effective procedure for performing this operation is Principal Component Analysis (PCA). This technique orthogonalizes the components of the input vectors (so that they are uncorrelated with each other). It also orders the resulting principal components so that those with the largest variation come first, and eliminates those components that contribute the least to the variation in the data set. This method is used for extracting the principal components of the CAARS questionnaire.
Fig. 2 shows the distribution of the subscales. The Lilliefors test is used to evaluate whether the samples come from a normal distribution. The Lilliefors test is a two-sided goodness-of-fit test suitable when a fully specified null distribution is unknown and its parameters must be estimated, i.e., it is a Kolmogorov-Smirnov test in which the parameters of the null distribution are estimated from the data. The T-scores used with CAARS are linear ones, which do not transform the actual distributions of the variables in any way; hence, variables
Fig. 1. The characteristics of subjects’ distribution in age, gender, and education.
that are not normally distributed in the raw data will continue to be non-normally distributed after the transformation. Fig. 2 demonstrates that the distribution of the T-scores, and consequently of the raw scores, is not normal; thus non-parametric tests are used for statistical evaluations.
To evaluate the validity of estimating missed or scratched answers in the screening version of the CAARS questionnaire, each of the 30 questions is assumed to be missed individually, and its estimate is calculated as the average of the other questions in that group. For example, the first question belongs to group A, which corresponds to the inattention subscale and contains 9 questions; the average of the other 8 questions in this group is taken as the estimate of the answer to the first question. The Kruskal-Wallis test (a non-parametric version of ANOVA) is used to investigate the validity of this approach. One test per question is performed to evaluate the difference between the original answer and its estimate (30 tests). In the next step, considering that any of the questions may be missed, the significance of the difference should be evaluated in the whole group, so the "Multiple Comparison Procedure" is used to investigate the significance of the differences between original and estimated answers in each of the groups.
III. RESULTS
The distribution of T-scores in the statistical population for each subscale is demonstrated in Fig. 3. The T-scores used with CAARS have a mean of 50 and a standard deviation of 10. Values around 50 indicate that the subject is in the average range in that subscale, whereas higher T-scores represent a problem.
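The per-question validity check described in the Methods can be sketched in Python as follows. This is a minimal sketch and not the authors' code (the study used Matlab): it assumes that the original answers and their subscale-mean estimates are compared as two groups, it implements the tie-corrected Kruskal-Wallis H statistic directly, and it uses the chi-square approximation with one degree of freedom for the P-value. The function names and the response matrix are hypothetical.

```python
import math
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def kruskal_wallis_two_groups(x, y):
    """Tie-corrected Kruskal-Wallis H and its P-value for two samples."""
    pooled = list(x) + list(y)
    n = len(pooled)
    ranks = average_ranks(pooled)
    sum_x = sum(ranks[:len(x)])
    sum_y = sum(ranks[len(x):])
    h = 12 / (n * (n + 1)) * (sum_x ** 2 / len(x) + sum_y ** 2 / len(y)) - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical")
    h /= correction
    # Chi-square survival function with 1 degree of freedom (two groups).
    p = math.erfc(math.sqrt(max(h, 0.0) / 2))
    return h, p


def leave_one_out_test(responses, question):
    """Compare original answers to `question` with their subscale-mean estimates."""
    original = [row[question] for row in responses]
    estimated = [
        sum(a for i, a in enumerate(row) if i != question) / (len(row) - 1)
        for row in responses
    ]
    return kruskal_wallis_two_groups(original, estimated)


# Hypothetical ratings: one row per subject, one column per item of a subscale.
responses = [
    [2, 1, 3, 0, 2, 1, 1, 2, 3],
    [0, 0, 1, 1, 0, 2, 1, 0, 1],
    [3, 2, 2, 3, 1, 2, 3, 2, 2],
    [1, 1, 0, 2, 1, 0, 1, 1, 0],
]
h_stat, p_value = leave_one_out_test(responses, question=0)
print(h_stat, p_value)
```

Repeating `leave_one_out_test` for every item would give one P-value per question, in the spirit of the 30 tests described above. The multiple comparison step is not covered by this sketch.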
Fig. 2. Distribution of subscales in the statistical population. The solid blue line indicates the distribution of the “Inattention” subscale, the dotted red line shows the distribution of the “Hyperactivity” subscale, and the dashed green and dash-dotted pink lines show the distributions of the “ADHD” and “ADHD Index” subscales, respectively.
Table II presents the results of PCA, which transforms a number of possibly correlated variables into a smaller number of uncorrelated variables. This table lists the obtained eigenvalues, percentages of variance, and cumulative percentages of the components. The eigenvalue of each component is plotted in Fig. 4. As can be seen, eight components have eigenvalues greater than one and can be extracted. These components account for 60 percent of the variability of all items. The component matrix is given in Table III, which shows the effect of every question on each of the eight obtained components. For example, the first component is mostly affected by questions 9, 19, 26, 29, and 30. The results of the Kruskal-Wallis tests are given in Table IV. This table indicates which group each question belongs to and gives the P-value obtained for this
Fig. 3. The distribution of subjects’ T-scores in four subscales (inattention, hyperactivity, ADHD and ADHD index). The T-scores used with CAARS have a mean of 50 and a standard deviation of 10. Values around 50 indicate that the subject is in the average range in that subscale, whereas higher T-scores represent a problem; lower T-scores suggest that the subject does not present particular symptoms.
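The component-retention rule applied to the PCA results (keep components whose eigenvalue exceeds one, often called the Kaiser criterion) can be checked with a few lines of Python. The sketch below uses the first ten eigenvalues reported in Table II. It assumes the PCA was run on the correlation matrix of the 30 items, so that the total variance equals 30; the text does not state this explicitly, but it is consistent with the percentages in the table. The function name is hypothetical.

```python
# First ten eigenvalues as reported in Table II.
eigenvalues = [6.765, 3.156, 2.044, 1.438, 1.286,
               1.19, 1.164, 1.014, 0.981, 0.851]

# Assumption: PCA on the correlation matrix, so total variance = number of items.
N_ITEMS = 30


def retained_components(values, threshold=1.0):
    """Return the eigenvalues that exceed the retention threshold."""
    return [v for v in values if v > threshold]


retained = retained_components(eigenvalues)

# Percentage of total variance carried by each retained component.
explained = [100 * v / N_ITEMS for v in retained]
cumulative = sum(explained)

print(len(retained))         # number of retained components
print(round(cumulative, 2))  # cumulative percentage of variance
```

Under this assumption the retained count and the cumulative percentage agree with the eight components and roughly 60 percent of variability reported in the text.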
TABLE II THE RESULTS OF PRINCIPAL COMPONENT ANALYSIS
Component  Eigenvalue  % of Variance  Cumulative %
1          6.765       22.549         22.549
2          3.156       10.521         33.070
3          2.044       6.814          39.884
4          1.438       4.794          44.678
5          1.286       4.288          48.966
6          1.19        3.968          52.934
7          1.164       3.879          56.813
8          1.014       3.381          60.194
9          0.981       3.270          63.464
10         0.851       2.835          66.299
11         0.78        2.602          68.901
12         0.764       2.546          71.447
13         0.749       2.495          73.943
14         0.718       2.394          76.337
15         0.699       2.392          78.666
16         0.672       2.392          80.905
17         0.629       2.097          83.002
18         0.565       1.882          84.884
19         0.547       1.823          86.807
20         0.515       1.716          88.423
21         0.464       1.547          89.971
22         0.445       1.483          91.453
23         0.416       1.377          92.830
24         0.318       1.270          94.100
25         0.371       1.237          95.338
26         0.354       1.178          96.516
27         0.285       0.950          97.466
28         0.275       0.915          98.382
29         0.253       0.842          99.224
30         0.233       0.776          100
The obtained eigenvalues, percentages of variance, and cumulative percentages of the components are listed. The eigenvalues of the first eight components are greater than one, so these components can be extracted. These eight components account for 60% of the variability of all components.
TABLE III THE COMPONENT MATRIX
substitution. Outcomes indicate that for some particular questions there is a significant difference (P-value