This article was downloaded by: [University College London] On: 25 April 2013, At: 07:43 Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
Neuropsychological Rehabilitation: An International Journal Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/pnrh20
Use of the Multiple Errands Test – Simplified Version in the assessment of suboptimal effort
Marcia Castiel a, Nick Alderman b c, Keith Jenkins b c, Caroline Knight b c & Paul Burgess d
a Older People's Mental Health Services, Surrey and Borders Partnership NHS Foundation Trust, UK
b Neuropsychiatry Service, St Andrew's Healthcare, Northampton, UK
c Kings College London, and St Andrew's Academic Centre, Northampton, UK
d Institute of Cognitive Neuroscience, University College London, UK
Version of record first published: 06 Jun 2012.
To cite this article: Marcia Castiel , Nick Alderman , Keith Jenkins , Caroline Knight & Paul Burgess (2012): Use of the Multiple Errands Test – Simplified Version in the assessment of suboptimal effort, Neuropsychological Rehabilitation: An International Journal, 22:5, 734-751 To link to this article: http://dx.doi.org/10.1080/09602011.2012.686884
Full terms and conditions of use: http://www.tandfonline.com/page/termsand-conditions
This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden.
The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae, and drug doses should be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims, proceedings, demand, or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material.
NEUROPSYCHOLOGICAL REHABILITATION 2012, 22 (5), 734–751
Use of the Multiple Errands Test – Simplified Version in the assessment of suboptimal effort
Marcia Castiel1, Nick Alderman2,3, Keith Jenkins2,3, Caroline Knight2,3, and Paul Burgess4
1 Older People’s Mental Health Services, Surrey and Borders Partnership NHS Foundation Trust, UK
2 Neuropsychiatry Service, St Andrew’s Healthcare, Northampton, UK
3 Kings College London, and St Andrew’s Academic Centre, Northampton, UK
4 Institute of Cognitive Neuroscience, University College London, UK
Most measures of suboptimal effort focus on short-term learning; fewer studies have considered non-memory feigned cognitive impairment. This study investigated the utility of the Multiple Errands Test – Simplified Version (MET-SV) in the detection of feigned executive functioning impairment. Performance of simulating malingerers (N = 47) was compared to acquired brain injury (N = 50) and neurologically healthy control groups (N = 46). Although simulating malingerers were successful at feigning a realistic level of impairment compared to the brain injury group, there were significant differences regarding pattern of performance. A logistic regression model successfully classified 74.5% of simulating malingerers and 84% of brain injured individuals. Receiver Operating Characteristic (ROC) analysis supported the discriminatory power of the model. The current study is unique in yielding some understanding of the real-life observation of suspected malingerers compared to individuals with genuine cognitive difficulties. Results suggest the MET-SV can contribute to the clinical assessment of individuals suspected of suboptimal effort in the domain of executive functioning. Further research is needed to establish whether the MET-SV can be reliably used in medico-legal settings.
Keywords: Malingering; Suboptimal effort; Executive function; MET-SV.
Correspondence should be addressed to Nick Alderman, National Brain Injury Centre, Kemsley Unit, St Andrew’s Healthcare, Billing Road, Northampton, NN1 5DG, UK. E-mail: [email protected]
© 2012 Psychology Press, an imprint of the Taylor & Francis Group, an Informa business
http://www.psypress.com/neurorehab http://dx.doi.org/10.1080/09602011.2012.686884
MEASURING EFFORT WITH THE MET-SV
INTRODUCTION
Neuropsychological assessment has a key role in establishing the presence, nature and extent of cognitive impairment among people who have acquired brain injury (ABI). In order to be valid and useful, performance on neuropsychological assessment measures is dependent on application of full effort during testing. However, in a litigation context, where results of neuropsychological investigations may influence claims for financial compensation, possible exaggeration or fabrication of cognitive impairment may be of concern. A landmark study by Heaton, Smith, Lehman, and Vogt (1978) demonstrated that clinical judgement is insufficient to identify suboptimal effort during testing. A position paper by the American National Academy of Neuropsychology suggests that: “When the potential for secondary gain increases the incentive for symptom exaggeration or fabrication and/or when neuropsychologists become suspicious of insufficient effort or inaccurate or incomplete reporting, neuropsychologists can, and must, utilise symptom validity tests and procedures to assist in the determination of the validity of the information and test data obtained” (Bush et al., 2005, pp. 425–426).
A frequently endorsed method of effort testing is forced-choice recognition memory tasks (McCarter, Walton, Brooks, & Powell, 2009). Symptom Validity Tests (SVT) developed for this purpose include the Test of Memory Malingering (TOMM; Tombaugh, 1996) and the Word Memory Test (WMT; Green, 2005). Suboptimal effort is suspected when performance falls below a cut-off, which represents the lowest score achieved by individuals with ABI (Rogers, 1997). The Rey 15-item test is another commonly used measure (McCarter et al., 2009). It is presented as “a difficult memory test”; however, it is actually very simple, even for people with significant cognitive impairment. Its popularity is attributable to ease of administration, simple scoring and short administration time (Nitch & Glassmire, 2007).
However, a review of studies concerning the Rey-15 Item Test revealed that although its “specificity” (proportion of non-malingerers correctly classified) is typically above 90%, its “sensitivity” (proportion of malingerers correctly classified) is much lower, around 50% (Nitch & Glassmire, 2007). Despite their widespread use, SVTs have significant shortcomings (Ashendorf, O’Bryant, & McCaffrey, 2003). For example, sensitivity can be influenced when examinees receive coaching about tests (Essig, Mittenberg, Peterson, Strauman, & Cooper, 2001). The internet is also a source of easily available information regarding the nature of effort tests (Ashendorf et al., 2003; Bauer & McCaffrey, 2006). Furthermore, individuals may apply suboptimal effort in domains other than memory functioning. Rogers (2007) noted that research has focused largely on detection strategies based on short-term learning and consequently neglected other areas of cognitive functioning.
CASTIEL, ALDERMAN, JENKINS, KNIGHT, AND BURGESS
Given the range of potential vulnerabilities to SVTs, neuropsychological tests used in routine clinical practice have also been used to detect suboptimal effort and are known as “embedded measures”. Examples include Reliable Digit Span from the Wechsler Adult Intelligence Scale (3rd edition) and the Rarely Missed Index from the Wechsler Memory Scale (3rd edition) (Killgore & DellaPietra, 2000; Iverson & Tulsky, 2003). Embedded measures have appeal because they enable simultaneous assessment of neuropsychological functions whilst yielding important information regarding effort. They can also be used to make judgements about effort retrospectively where SVTs were not administered. They are less transparent than SVTs, providing some protection against coaching, and they do not lengthen assessment. An area of need for embedded measures of effort is executive functioning. Executive functions underlie many cognitive, emotional and social skills; impairment of these functions can be seriously disabling and consequently potentially highly financially “compensable”. Litigants may be motivated to fabricate or exaggerate executive functioning difficulties (Greve & Bianchini, 2007). Most research in this area has focused on the Wisconsin Card Sorting Test (WCST) and has attempted to identify atypical patterns of performance indicative of suboptimal effort. However, WCST cognitive effort indices have proved variable. Greve and Bianchini (2007) suggest that whilst these can be used to detect suboptimal effort in mild traumatic brain injury (TBI), caution is required with older or more severe TBI cases because of a higher likelihood of false positives. Traditionally, assessment of executive function involves administering tests within the controlled environment of the consulting room. However, many of these were not devised for clinical application and their relevance to performance in the “real-world” is obscure (Burgess et al., 2006). 
Furthermore, symptoms of impairment are not necessarily demonstrable in the highly structured context of consulting room assessments (Burgess, Alderman, Volle, Benoit, & Gilbert, 2009). The Multiple Errands Test (MET), originally described by Shallice and Burgess (1991), was created in response to these dilemmas. Unlike most tests of executive functioning it is an “ill-structured” measure (Goel, Grafman, Tajick, Gana, & Danto, 1997) of multitasking carried out in a shopping centre, designed to mimic an everyday activity. The individual is required to achieve a number of simple tasks (buying items, recording information and meeting the examiner at a set time) without breaking a series of arbitrary rules which increase the planning, monitoring and prospective memory demands of the test. Shallice and Burgess (1991) found that people with ABI, who performed normally or near-normally on neuropsychological tests including those concerned with executive functioning, demonstrated severe impairment on the MET. It captured these individuals’ difficulties because a context is created in which symptoms of executive disorder are elicited.
The MET was developed to detect executive deficits in patients with preserved high IQ. Alderman, Burgess, Knight, and Henman (2003) created the Multiple Errands Test – Simplified Version (MET-SV) to use with the broader population more typically seen in routine clinical practice. Individuals with ABI tended to make more errors than neurologically healthy controls; furthermore, many types of error were unique to ABI participants. A scoring method reflecting the “normality” of errors correctly identified 82% of people with ABI whilst only misclassifying 5% of controls. Research has identified how individuals may go about faking and exaggerating deficits on tests of cognitive function administered in the consulting room. However, evidence is lacking on how potential malingerers would approach assessment procedures that formalise “real-world” situations in which difficulties with executive function are most readily apparent. This study investigated the utility of the MET-SV in the detection of suboptimal effort by comparing performance of people with ABI, neurologically healthy controls and neurologically healthy people simulating effects of cognitive impairment. Previous investigations typically employed college students instructed to fake cognitive impairment on particular tests. However, students typically lack incentives regarding how well they do this, and may have limited knowledge regarding the actual effects of ABI. This study aimed to avoid this problem by recruiting clinicians from a neurorehabilitation unit. It was predicted their knowledge and experience would result in sophisticated simulations that would be more compatible with suboptimal effort following coaching.
METHOD
Design
A three-group simulation quasi-experimental design was employed to compare MET-SV performance of simulating malingerers, people with confirmed ABI and neurologically healthy controls.
Participants
MET-SV error scores for 143 people were examined. Existing data from the original MET-SV study (Alderman et al., 2003) were used for the ABI and control groups. Additional neurologically healthy participants were recruited for the simulating malingerer group.
Neurologically healthy controls. This group comprised 46 people who had no history of neurological disease. Ages ranged from 21–58 years (M = 29.2, SD = 8.5). Estimated IQ using the National Adult Reading Test – Revised (NART-R; Nelson, 1991) ranged from 85–124 (M = 107.6, SD = 9.1). Gender was equally represented.
ABI group. ABI participants comprised 41 males and 9 females. Age ranged from 18–59 years (M = 34.6, SD = 12.7). Mean premorbid IQ was 99.9 (SD = 12.9). Seventy eight percent had sustained TBI; the remainder had CVAs and surgery for cerebral tumours. With regard to severity, 75% were classified as “very severe”, four “severe”, four “moderate” and one “mild”.
Simulating malingerers. This group consisted of 47 participants, who were all clinical staff employed at the National Brain Injury Centre, St Andrew’s Healthcare, Northampton, UK. None had a history of neurological problems. Furthermore, no members of this group had participated in the Alderman et al. (2003) MET-SV validation study, or undertaken the test previously. Age ranged from 22–55 years (M = 34, SD = 9.5). NART-R estimated IQ ranged from 105–124 (M = 115, SD = 5.31). The group was composed of 30 females and 17 males. Mean length of employment was 57 months (SD = 64, range = 4–300 months).
Procedure The procedure used for administering the MET-SV for all participants is described by Alderman and colleagues (2003). The test was conducted in the Weston Favell Shopping Centre, which was in the vicinity of the rehabilitation unit in Northampton. Simulating malingerers were given the following additional directions prior to beginning the MET-SV (adapted from Kelly, Baker, van den Broek, Jackson, & Humphries, 2005): I want you to imagine that you are claiming compensation following an accident in which you suffered a head injury and you now have a solicitor who is dealing with the claim. People who have had an injury commonly suffer from psychological problems such as forgetfulness, loss of concentration, and difficulty in reasoning and thinking clearly. They may also suffer from emotional problems, such as depression and anxiety, as well as medical problems, for example, affecting their eyesight and hearing. You know that the more affected you appear to be, the more financial compensation you are likely to receive. It is therefore in your best interests to magnify your symptoms but not so that it is too obvious to the person testing you. You have been referred by your solicitor to a psychologist who will attempt to evaluate the effect that the
head injury has had upon you. As part of the assessment you are asked to carry out the following test.
Participants were then briefed using the standard protocol for the test. They were given the exercise sheet on a clipboard, a pen, a carrier bag, a £10 note, and (if necessary) a wrist watch. The examiner then read the instructions. Participants were informed that they would be followed by the assessor during the task for the purpose of monitoring and recording their performance. The examiner also explained that they should not be spoken to unless it was part of the exercise. Next, participants were given the opportunity to ask any questions. They were then asked to summarise what they had to do to ensure they understood the requirements of the task. Participants were asked to rate themselves regarding the statement, “How efficient would you say you were with tasks like shopping?” using a 10-point Likert-type scale with weighted end points (“0 = hopeless”, “10 = excellent”). It was made clear that this was part of the simulating task. Finally, the start of the test was signalled by the examiner with the instruction, “Begin the exercise”. Participants were followed discreetly by the examiner, who made written notes regarding their performance. At the end of the task, whilst still “in role”, they were asked to rate the question, “How well do you think you did with the shopping task?” using a 10-point scale with “0” being “hopeless” and “10” being “excellent”. Finally, participants were asked to describe what strategies they had adopted to comply with the instructions of “magnifying their symptoms” to degrade MET-SV performance.
Data analysis
The MET-SV was scored independently by the first and second authors, using information from the examiner’s written notes and the exercise sheet completed by the participant. Errors were categorised by type using the criteria defined by Shallice and Burgess (1991). These were: rule breaks – where a specific rule (either social or explicitly mentioned in the task) was broken; task failure – a task not completed satisfactorily; inefficiencies – where a more effective strategy could be applied; and interpretation failure – where the requirements of a particular task were misunderstood. Each error was weighted following the method provided by Alderman and colleagues (2003), which reflects how characteristic it was of neurologically healthy participants. An error made by up to 95% of controls received a score of “1”; an error type demonstrated by 5% or less of controls was assigned a score of “2”; and an error type that was not observed amongst controls was given a score of “3”. The cut-off score for abnormal performance on the MET-SV is 12 or more. The two authors then agreed error scores for each participant. Intraclass correlation coefficients of .95 to .97 confirmed good inter-rater reliability prior to this agreement of the final scores. These new data were then compared to those of the control and ABI groups described by Alderman et al. (2003).

TABLE 1
Error scores by group

                           Neurologically       Brain injured         Simulating
                           healthy controls     individuals (ABI)     malingerers
                           M        SD          M        SD           M        SD
Rule breaks                2.50     3.02        8.64     8.84         7.60     8.30
Task failures              1.59     1.07        11.62    7.16         9.43     5.70
Inefficiencies             .48      .89         1.12     2.06         2.89     3.07
Interpretation failures    .20      .45         .28      1.20         1.00     1.30
Total errors               4.76     3.71        21.70    11.51        21.04    12.02
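The weighting scheme described under Data analysis can be sketched in a few lines. This is a minimal illustration only, not the authors' scoring software: the function names, the reading of the thresholds (an error type never seen in controls scores 3, one seen in 5% or fewer scores 2, anything more common scores 1) and the example prevalence values are assumptions made for the sketch.

```python
def error_weight(control_prevalence: float) -> int:
    """Weight one error by the proportion of controls (0.0 to 1.0) who made it."""
    if control_prevalence == 0.0:
        return 3  # error type not observed amongst controls
    if control_prevalence <= 0.05:
        return 2  # error type shown by 5% or less of controls
    return 1      # error type more common amongst controls


def total_error_score(prevalences):
    """Sum the weights of every error an examinee made."""
    return sum(error_weight(p) for p in prevalences)


CUT_OFF = 12  # abnormal performance is a total error score of 12 or more

# Hypothetical examinee: one control-prevalence value per observed error.
observed = [0.40, 0.40, 0.03, 0.0, 0.0, 0.02]
score = total_error_score(observed)   # 1 + 1 + 2 + 3 + 3 + 2
is_abnormal = score >= CUT_OFF
print(score, is_abnormal)
```

The point of weighting by normality is that a handful of errors never seen in healthy controls can push a score over the cut-off as readily as many common ones.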
RESULTS
Between-group comparisons of MET-SV errors
Summary statistics regarding the different error categories are shown in Table 1. Kruskal-Wallis analyses were carried out to compare between-group differences. Where appropriate, post-hoc pairwise comparisons were subsequently made using Mann-Whitney U tests (with an adjusted minimum p value of .017, see Table 2).

TABLE 2
Level of significance and effect sizes across the three contrasts for the MET-SV error categories

                           Controls vs. ABI      Controls vs.              ABI vs.
                                                 simulating malingerers    simulating malingerers
Rule breaks                p < .001 (r = .5)     p < .001 (r = .5)         n.s.
Task failures              p < .001 (r = .8)     p < .001 (r = .8)         n.s.
Inefficiencies             n.s.                  p < .001 (r = .4)         p < .01 (r = .3)
Interpretation failures    n.s.                  p < .001 (r = .4)         p < .001 (r = .5)
Total errors               p < .001 (r = .8)     p < .001 (r = .8)         n.s.

Although the total error scores of the simulating malingering and ABI groups were comparable, and there were no differences regarding rule breaks and task failures, malingerers made significantly more interpretation failures and inefficiencies. In addition, simulating malingerers demonstrated
more errors across the range of available categories: for example, 34% made errors across all four error categories, whilst only 6% of ABI participants did so. The majority of ABI participants (54%) made two different types of errors on the MET-SV, whilst the most common number of error categories for the simulating malingerers was three (48.9%). Simulating malingerers also made 30 unique errors not previously seen in either of the other groups (see Table 3). Thus, whilst there were no between-group differences with regard to error scores for rule breaks and task failures, qualitative distinctions regarding the types of errors observed within all categories were evident when compared to those previously reported for the ABI and control groups by Alderman and colleagues (2003). For example, 23% of simulators wrote down an incorrect newspaper headline, and 11% attempted to complete the test in as little time as possible despite the instruction they were not to do so by “. . . rushing excessively” (rule break). Whilst simulators had higher error scores for both interpretation failures and inefficiencies, 13 of these were also unique to this group. A notable finding was that 21% of simulators made the inefficiency of buying a required item twice in separate shops.

TABLE 3
Unique simulating malingerers’ errors on the MET-SV

                                                                              N     Percent
Task failures
  Recorded incorrect newspaper headline                                       11    23%
  Purchased fabric strip instead of plasters                                  6     13%
  Wrote down “Tesco” for the number of shops selling televisions              4     9%
  Purchased other confectionery items (e.g., biscuits, cookies, mints,
    box of chocolates) instead of chocolate bar                               4     9%
  Purchased teacake instead of small brown loaf                               1     2%
  Recorded price of potatoes instead of tomatoes                              1     2%
  Wrote down “lots/none” for number of shops selling televisions              1     2%
Rule breaks – actual and social
  Rushed excessively                                                          5     11%
  Agreed to a three minute nail product demonstration                         1     2%
  Gawped at members of the public                                             1     2%
  Sat on shopping centre floor                                                1     2%
  Overtly invaded member of the public’s personal space                       1     2%
  Left superstore through “no exit” barrier                                   1     2%
  Placed unpaid item inside bag                                               1     2%
  Asked shop staff a maths question and left without waiting for the answer   1     2%
  Threw unpaid-for food item into an open freezer                             1     2%
Inefficiencies
  Purchased required item twice in separate shops                             10    21%
  Walked away from till leaving purchase or change behind                     3     6%
  Left till queue to obtain another item or a piece of information            3     6%
  Weighed tomatoes to establish price                                         2     4%
  Left bag in shop                                                            1     2%
  Dropped money on the floor and walked away                                  1     2%
  Selected wrong envelope for birthday card                                   1     2%
  Recorded several prices of tomatoes                                         1     2%
  Recorded newspaper headline and sub-headline                                1     2%
Interpretation failures
  Believed necessary to finish task under clock                               2     4%
  Recorded two library closing times                                          2     4%
  Wrote down opening time of the library instead of closing time              1     2%
  Recorded number of TVs for sale instead of number of shops selling TVs      1     2%
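The adjusted p value of .017 used for the pairwise Mann-Whitney comparisons is what a Bonferroni correction of .05 across three contrasts would give. The sketch below shows that arithmetic, together with one common way of deriving an effect size r from a Mann-Whitney Z statistic. The article does not state which formula produced its r values, so the formula, the function names and the example Z value are all assumptions.

```python
import math


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold under a Bonferroni correction."""
    return alpha / n_comparisons


def effect_size_r(z: float, n_total: int) -> float:
    """Effect size r = |Z| / sqrt(N), a common convention (assumed here)."""
    return abs(z) / math.sqrt(n_total)


# Three pairwise contrasts: controls vs. ABI, controls vs. simulators, ABI vs. simulators.
alpha_adjusted = bonferroni_alpha(0.05, 3)
print(round(alpha_adjusted, 3))

# Hypothetical Z statistic for an ABI (n = 50) vs. simulator (n = 47) contrast.
r_example = effect_size_r(-3.0, 50 + 47)
print(round(r_example, 2))
```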
Between-group comparison of requests for help, ratings of efficiency and task success
Group comparisons regarding these variables are summarised in Tables 4 and 5. Table 5 confirms that controls and simulating malingerers made significantly fewer requests for help than the ABI group. A notable finding was that malingerers rated efficiency with shopping in general significantly lower than both other groups. There were no significant between-group differences regarding ratings of how successful participants indicated they had been in undertaking the test.
TABLE 4
Requests for help, rating of efficiency and rating of success by group

                              Neurologically       ABI participants      Simulating
                              healthy controls                           malingerers
                              M        SD          M        SD           M        SD
Requests for help             1.24     1.78        4.90     5.73         1.87     2.92
Shopping efficiency rating    6.56     1.83        6.82     2.44         5.06     2.36
Task success rating           7.27     1.81        6.44     2.50         6.17     2.83

Strategies used by simulators
Strategies used by simulating malingerers to adversely impact MET-SV performance are summarised in Table 6 and fall into two categories: feigned symptoms of ABI and specific task-related strategies. Cognitive impairment featured predominantly, particularly difficulties with attention, memory, inhibition and planning. However, the most prevalent strategies were deliberately breaking the rules of the test and failing to complete the individual tasks as required.
TABLE 5
Level of significance and effect sizes across the three contrasts for the remaining MET-SV variables

                              Controls vs. ABI      Controls vs. malingerers   ABI vs. malingerers
Requests for help             p < .001 (r = .4)     n.s.                       p < .001 (r = .3)
Shopping efficiency rating    n.s.                  p < .001 (r = .4)          p < .001 (r = .4)
Task success rating           n.s.                  n.s.                       n.s.
TABLE 6
Simulating malingerers’ feigning strategies

                                              Frequency in simulating
                                              malingering sample (N = 47)
Feigned symptoms of brain injury
  Distractibility/poor concentration          17 (36%)
  Memory difficulties                         16 (34%)
  Impulsivity                                 13 (28%)
  Disorientation/confusion                    13 (28%)
  Poor planning/organisational abilities      12 (26%)
  Slowness                                    10 (21%)
  Poor social skills/disinhibition            6 (13%)
  Poor decision-making                        6 (13%)
  Tiredness                                   6 (13%)
  Concrete thinking                           4 (9%)
  Asking for help/accepting help              4 (9%)
  Poor multitasking                           4 (9%)
  Poor self-monitoring/perseveration          3 (6%)
  Emotionality/annoyance                      3 (6%)
  Visual difficulties                         1 (2%)
  Catastrophic thinking                       1 (2%)
  Poor motivation                             1 (2%)
Specific task-related feigning strategies
  Spoiling tasks                              31 (70%)
  Breaking rules                              26 (55%)
  Not monitoring time                         15 (32%)
  Not monitoring money                        15 (32%)
  Trying to malinger subtly                   10 (21%)
  Not completing tasks                        10 (21%)
  Ignoring environmental cues (e.g., signs)   6 (13%)
  Using inefficient strategies                4 (9%)

Classification accuracy statistics
Logistic regression models were fitted to assess which variables best predicted membership of the ABI and simulated malingerers groups. The independent variables entered into the initial model were rule breaks, task failures, inefficiencies, interpretation failures, self-rating of shopping efficiency, self-rating of success with task and number of requests for help. Rule breaks and self-ratings of task success did not explain a significant amount of variation (p > .05) and were removed. A second logistic regression was fitted using the five remaining variables. The coefficients of the resultant model are shown in Table 7.

TABLE 7
Optimal logistic regression model

                              B        SE      Wald     df    Sig.    Odds ratio
Task failures                 –0.12    0.05    6.79     1     .01     0.89
Inefficiencies                0.25     0.13    3.80     1     .05     1.28
Interpretation failures       0.61     0.27    5.07     1     .02     1.84
Requests for help             –0.28    0.09    10.62    1     .00     0.76
Shopping efficiency rating    –0.28    0.11    6.00     1     .01     0.76
Constant                      2.88     1.04    7.59     1     .01     17.78

The logistic regression model is based on a mathematical equation which uses the regression coefficients to calculate the probability that an individual is or is not applying suboptimal effort on the MET-SV. The equation results in a probability score ranging from zero to one, with a default probability cut-off of 0.5. Participants with probability values at or below the 0.5 cut-off are therefore considered not to be simulating malingerers, whilst those with a probability above the cut-off are classed as simulating malingerers. Using the default cut-off of 0.5 to classify participants based on the predicted values from the model, 79.2% of ABI participants and simulating malingerers were correctly classified; 74.5% of simulators were assigned to the malingerers group and 84% of ABI participants were accurately identified.
A Receiver Operating Characteristic (ROC) curve was plotted to further assess the model’s level of discrimination and to identify the optimal cut-off score for classifying simulating malingerers. The curve is a plot of true positive rates (sensitivity) against false positive rates (the complement of specificity rates). The greater the area under the curve (AUC), which ranges from zero to one, the better the overall diagnostic power of the model. An AUC of 0.5 indicates no discriminatory power (no better than chance), whilst values greater than 0.5 reflect increasing levels of discriminatory power. AUCs of 0.7 to < 0.8 are classed as “acceptable”, 0.8 to < 0.9 “excellent” and values of 0.9 or more “outstanding” (Hosmer & Lemeshow, 2000).
The AUC based on probability values resulting from the logistic regression model is 0.87 (p < .001, 95% CI .80 to .94), indicating “excellent” discrimination. A score of 0.87 indicates that in approximately 87% of all possible pairs of simulator–ABI participants, the logistic regression model correctly assigned a higher probability of being a malingerer to the simulator.
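The pairwise reading of the AUC can be made concrete with a small sketch. It counts, over every simulator–ABI pair, how often the simulator received the higher predicted probability (ties count as half). The probabilities below are invented for illustration and are not study data; the function name is likewise hypothetical.

```python
def pairwise_auc(positive_scores, negative_scores):
    """AUC as the share of positive-negative pairs ranked correctly."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0   # positive case ranked above the negative case
            elif p == n:
                wins += 0.5   # ties contribute half a win
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical model probabilities for four simulators and four ABI participants.
simulators = [0.9, 0.8, 0.6, 0.4]
abi = [0.7, 0.3, 0.2, 0.1]

auc = pairwise_auc(simulators, abi)
print(auc)
```

Here 14 of the 16 possible pairs are ordered correctly, so the function returns 0.875. This pair-counting view is equivalent to the area under the ROC curve and is why an AUC can be read as a probability.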
TABLE 8
Co-ordinates of the ROC curve

Logistic regression              Sensitivity    1 – Specificity
probability cut-off value                       (false positive rate)
.502                             .745           .160
.520                             .745           .140
.551                             .723           .140
.571                             .702           .140
.580                             .681           .140
.591                             .681           .120
.598                             .681           .100
.600                             .660           .100
.620                             .660           .080
.655                             .638           .080

Extreme values for the curve were omitted for the sake of clarity.
In most tests of effort, high specificity rates are desirable to minimise false-positive errors (i.e., misdiagnosing an individual with genuine cognitive deficits) (Larrabee & Berry, 2007). In order to reach a specificity of 90%, the various coordinates of the ROC curve were inspected (see Table 8). A probability cut-off score of 0.598, with an associated false positive rate of 10% and a sensitivity of 68.1%, was therefore selected. Although this new cut-off reduces sensitivity from 74.5% to 68.1%, it increases specificity from 84% to 90%.
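The cut-off selection can be reproduced from the Table 8 coordinates. The sketch below keeps only the cut-offs whose false positive rate is at most 10% (specificity of at least 90%) and picks the one with the highest sensitivity. The selection rule and function name are assumptions about how such an inspection might be automated; the article describes inspecting the table by eye.

```python
# (cut-off, sensitivity, false positive rate) rows as reported in Table 8
ROC_COORDINATES = [
    (0.502, 0.745, 0.160),
    (0.520, 0.745, 0.140),
    (0.551, 0.723, 0.140),
    (0.571, 0.702, 0.140),
    (0.580, 0.681, 0.140),
    (0.591, 0.681, 0.120),
    (0.598, 0.681, 0.100),
    (0.600, 0.660, 0.100),
    (0.620, 0.660, 0.080),
    (0.655, 0.638, 0.080),
]


def pick_cutoff(coordinates, max_false_positive_rate):
    """Among cut-offs meeting the false positive limit, take the most sensitive."""
    eligible = [row for row in coordinates if row[2] <= max_false_positive_rate]
    # Prefer higher sensitivity; break ties with the lower cut-off.
    return max(eligible, key=lambda row: (row[1], -row[0]))


cutoff, sensitivity, false_positive_rate = pick_cutoff(ROC_COORDINATES, 0.10)
print(cutoff, sensitivity, round(1 - false_positive_rate, 2))
```

With a 10% false positive limit this rule returns the 0.598 cut-off, with sensitivity .681 and specificity .90.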
Clinical implications
This model might be useful in helping to determine whether MET-SV performance is indicative of suboptimal effort, using the procedure detailed below. However, it is presented for illustrative purposes only and should not be relied upon to draw conclusions until validated by further research.
• Multiply each error score/rating by its respective regression coefficient (see Table 7) and sum the products: (B task failures × patient score) + (B inefficiencies × patient score) + (B interpretation failures × patient score) + (B requests for help × patient score) + (B shopping efficiency rating × patient score).
• Add this sum to the constant coefficient (see Table 7) to obtain the linear composite (logit).
• Exponentiate the linear composite (logit) to yield the probability that an individual may be applying suboptimal effort:

$$P(\text{suboptimal effort}) = \frac{e^{\text{logit}}}{1 + e^{\text{logit}}}$$

Here e is the mathematical constant with a value of approximately 2.718 (Millis, 2008). The formula results in a probability score ranging from 0 to 1, which is multiplied by 100 to obtain the percentage probability that an individual is applying suboptimal effort. A worked example illustrates how this might be used. Consider an examinee with the following error scores/ratings: task failures = 2; inefficiencies = 7; interpretation failures = 5; requests for help = 1; efficiency rating = 3.
• –.12(2) + .25(7) + .61(5) + –.28(1) + –.28(3) = 3.44
• 3.44 + 2.88 = 6.32
• P = 2.718^6.32 / (1 + 2.718^6.32) = 0.998
Multiplying P by 100 gives a 99.8% probability of this MET-SV presentation being characteristic of suboptimal performance.
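The three-step procedure can be sketched as a short function using the Table 7 coefficients. The dictionary keys and function names are hypothetical labels chosen for this illustration, and, as the article stresses, the model itself is illustrative and unvalidated.

```python
import math

# Regression coefficients (B) and constant as reported in Table 7.
COEFFICIENTS = {
    "task_failures": -0.12,
    "inefficiencies": 0.25,
    "interpretation_failures": 0.61,
    "requests_for_help": -0.28,
    "shopping_efficiency_rating": -0.28,
}
CONSTANT = 2.88


def logit(scores):
    """Steps 1 and 2: weighted sum of the scores plus the constant."""
    return CONSTANT + sum(COEFFICIENTS[name] * scores[name] for name in COEFFICIENTS)


def probability_suboptimal_effort(scores):
    """Step 3: convert the logit into a probability between 0 and 1."""
    z = logit(scores)
    return math.exp(z) / (1.0 + math.exp(z))


# The examinee from the worked example.
example = {
    "task_failures": 2,
    "inefficiencies": 7,
    "interpretation_failures": 5,
    "requests_for_help": 1,
    "shopping_efficiency_rating": 3,
}

z = logit(example)                             # about 6.32
p = probability_suboptimal_effort(example)     # about 0.998
print(round(z, 2), round(p, 3))
```

Evaluating the formula directly for these inputs gives a logit of about 6.32 and a probability of roughly 0.998. Because the logistic function saturates quickly, any logit above about 3 already corresponds to a probability above 0.95.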
DISCUSSION

Bespoke measures are available to help clinicians detect feigned or exaggerated cognitive impairment. However, the validity of these tools can be compromised through coaching and information from the internet. It is therefore crucial that research continues “to develop novel and creative methods for capturing suspect effort and non-credible symptoms” (Boone & Lu, 2007, p. 40). Most such measures have focused on the domain of memory. This study is unique in that it has evaluated the utility of an ecologically-valid test of executive functioning in the assessment of simulated cognitive deficits.
Quantitative differences in MET-SV performance

Rogers (2007) divides malingering detection strategies into unlikely presentations and excessive impairment. Excessive impairment reflects a level of performance significantly lower than expected from people with ABI. Studies have reported that simulated and probable malingerers often overestimate the magnitude of deficits arising from ABI and consequently feign test performances which are more severe than those obtained from actual patients (for example, van Gorp et al., 1999). However, this was not demonstrated in the current study. Simulating malingerers were successful in replicating a similar level of impairment on the MET-SV to the ABI group, as there was no significant difference between them regarding the total number of errors made (81% and 82%, respectively, achieved total error scores above the cut-off). Simulators were all clinicians working in neurorehabilitation. Their specialised knowledge and experience of ABI patients may explain this: individuals who are well informed about the impact of ABI may be more able to replicate a realistic degree of cognitive impairment.
Whilst the overall error score of simulating malingerers was comparable with that of people with ABI, performance patterns differed. Simulating malingerers made significantly more inefficiencies and interpretation failures, and the distribution of errors across the four categories also differed. Whilst only 6% of the ABI group made errors across all four error categories, 34% of simulators did so. People with ABI characteristically made errors that fell into only two of these categories (rule breaks and task failures), as reported before in Alderman et al. (2003); however, the tendency was for simulating malingerers to make errors across three categories. That ABI individuals and non-simulating controls generally make few inefficiencies and interpretation failures is attributable to the design of the MET-SV. In the process of developing a simplified version, Alderman et al. (2003) made three key changes: provision of more concrete rules to enhance task clarity; less complex task demands; and provision of an exercise sheet explicitly requesting participants to record required information. Very few interpretation failures and inefficiencies were observed and these authors attributed this to simplification of the measure. Consequently, high numbers of both these error types, and the presence of errors from all four categories, may raise doubts regarding the validity of an individual’s performance.

Another finding of note concerned requests for help. This strategy was a striking characteristic of people with ABI, who made on average nearly five times more requests for help than controls. In contrast, only 4% of simulating malingerers sought assistance. Similarly, there was also an anomaly regarding ratings of efficiency with tasks like shopping. Non-simulating controls and people with ABI did not differ (both groups had a median score of 7 with “0” being “hopeless” and “10” being “excellent”). However, the median rating for the simulating malingerers was 4.
As the ABI group performed significantly worse on the MET-SV than controls, their inflated ratings probably reflect a lack of insight (Alderman et al., 2003). Simulators, in contrast, did not replicate this pattern.
Qualitative differences in MET-SV performance

In addition to over-representation of inefficiencies and interpretation failures, simulators also exhibited 30 errors not observed in either the control or ABI groups. These were spread across all four error categories. For example, 23% recorded the incorrect newspaper headline. Some errors made by simulating malingerers involved spoiling the task so it constituted a “near-miss”. For instance, 13% purchased fabric strip instead of plasters, 9% recorded the name of the supermarket instead of the number of shops selling televisions, and 9% purchased other confectionery items instead of a chocolate bar. This near-miss
strategy perhaps reflected attempts by simulators to undermine performance without appearing too obvious. Observing these errors may raise concerns about suboptimal effort. Because of the impossibility of generating an exhaustive list of all possible MET-SV errors associated with genuine cognitive impairment, clinicians would need to exercise caution when drawing conclusions based on atypical errors alone. However, their presence may provoke further investigation.
Diagnostic validity of the MET-SV

Although theoretically interesting, differences in group medians of overall performance tell us little about the ability of the test to discriminate suboptimal effort. However, the pattern of performance is noteworthy. An optimal logistic regression model indicated that the probability of being a simulating malingerer increases with fewer task failures, a greater number of inefficiencies, a greater number of interpretation failures, fewer requests for help and a lower rating of shopping efficiency. Using the default probability cut-off of 0.5, the logistic regression model assigns a positive score (indicating the presence of suboptimal effort) to 74.5% of simulating malingerers and a negative score (indicating the absence of suboptimal effort) to 84% of people with ABI.

Previously, Heaton and colleagues (1978) demonstrated that experienced clinicians could not reliably distinguish between simulating malingerers and patients with severe ABI by analysing neuropsychological test scores alone. However, discriminant function analysis was able to reliably distinguish simulating malingerers from genuine ABI patients in that study. Consequently, statistical formulae may help identify atypical response patterns characteristic of suboptimal effort.

Distinguishing between simulating malingerers and people with ABI was not possible using the total MET-SV error score. However, logistic regression produced a predictive model of simulating malingering that correctly classified the majority of participants. Millis (2008) argues that a key advantage of using logistic regression models to identify suboptimal effort is that they free the clinician from “all or nothing” thinking. Rather than yielding dichotomous judgements regarding an examinee’s level of effort (normal versus suboptimal), logistic regression models generate a probability of suboptimal effort.
The resulting information is rich and informative: for example, a probability of 51% will have a different weight than a probability of 91%. In addition, Millis (2008) notes that logistic regression models give the clinician the flexibility to modify the probability threshold of suboptimal effort depending on associated consequences of making a false positive versus a false negative error. Millis (2008) also states that multivariable effort composites derived from standard neuropsychological tests, such as that described here, may be more resistant to
coaching. Multivariable effort indices are less sensitive to some of the key limitations associated with SVTs, such as transparency and greater vulnerability to coaching.
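The point about modifying the probability threshold according to the consequences of each type of error can be made concrete with a small Python sketch. It uses a standard decision-theoretic rule that the study itself does not report: flag suboptimal effort when the expected cost of a missed case exceeds the expected cost of a false accusation, which gives a cut-off of C_FP / (C_FP + C_FN). The cost values and the function names `cost_based_cutoff` and `flag_suboptimal_effort` are hypothetical.

```python
def cost_based_cutoff(cost_false_positive: float, cost_false_negative: float) -> float:
    """Probability above which flagging has the lower expected cost.

    Flagging costs (1 - p) * cost_false_positive on average, and not flagging
    costs p * cost_false_negative, so flagging is preferable when
    p > cost_false_positive / (cost_false_positive + cost_false_negative).
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)


def flag_suboptimal_effort(probability: float, cutoff: float) -> bool:
    """True when the model probability exceeds the chosen cut-off."""
    return probability > cutoff


# Equal costs reproduce the default cut-off of 0.5.
default_cutoff = cost_based_cutoff(1.0, 1.0)

# Assumed example: a false positive judged nine times as costly as a false negative.
strict_cutoff = cost_based_cutoff(9.0, 1.0)

for p in (0.51, 0.91):
    print(p, flag_suboptimal_effort(p, default_cutoff), flag_suboptimal_effort(p, strict_cutoff))
```

Under these assumed costs a probability of 51% is flagged only at the default cut-off, whereas a probability of 91% is flagged at both, which illustrates why the two values carry different weight.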
Future research

This study used performance pattern analysis to identify simulated malingering on the MET-SV using logistic regression. The aim of logistic regression is to differentiate test score performance profiles of neurologically healthy individuals instructed to feign impairment from those of ABI participants. At this stage we only know how good the proposed logistic regression model is in relation to the current sample. To establish the generalisability of these findings the model would need to be replicated in other studies.

The simulating sample used here was unique in consisting of participants who had ABI knowledge and expertise. Accordingly, it follows they would be more likely to mimic the impact of cognitive and other difficulties on the MET-SV. Composition of this group was a deliberate choice. The degree of sophistication brought to the experiment by such a group is likely to be greater than that resulting from coaching and therefore constitutes a severe test of the null hypothesis. Despite this, it is still unclear how comparable the performance of ABI professionals simulating suboptimal effort is in relation to actual malingerers. Another potential confounding issue is that the simulating malingerers were of above average intellectual ability whilst the ABI group was of average premorbid ability. Future research including a litigant group who have shown suboptimal effort on other effort tests would help to clarify this issue and further verify the diagnostic validity of the logistic model presented here. The addition of such measures would also help determine if the MET-SV adds unique variance or incremental validity to detection of suboptimal effort. Likewise, inclusion of the WCST would also be helpful in determining if the MET-SV potentially adds incremental validity to detection of suboptimal effort regarding tests of executive functioning.
Whilst the current study provides useful information about how the MET-SV could be utilised in investigating suboptimal effort, replication and clarification of findings are required. Practical barriers, such as administration cost, may also limit the extent to which clinicians use the MET-SV. Nevertheless it is argued that this study is unique in yielding greater insight into the real-life observation of suspected malingerers compared to individuals with genuine cognitive difficulties. Whilst further research is needed to establish whether the MET-SV can be reliably used in medico-legal contexts, results of this study suggest the MET-SV might be used in clinical settings alongside other measures in investigations of suboptimal effort.
REFERENCES

Alderman, N., Burgess, P. W., Knight, C., & Henman, C. (2003). Ecological validity of a simplified version of the multiple errands shopping test. Journal of the International Neuropsychological Society, 9, 31–44.
Ashendorf, L., O’Bryant, S. E., & McCaffrey, R. J. (2003). Specificity of malingering detection strategies in older adults using the CVLT and WCST. The Clinical Neuropsychologist, 17, 255–262.
Bauer, L., & McCaffrey, R. J. (2006). Coverage of the Test of Memory Malingering, Victoria Symptom Validity Test, and Word Memory Test on the Internet: Is test security threatened? Archives of Clinical Neuropsychology, 21, 121–126.
Boone, K. B., & Lu, P. H. (2007). Non-forced-choice effort measures. In G. J. Larrabee (Ed.), Assessment of malingered neuropsychological deficits (pp. 27–43). Oxford, UK: Oxford University Press.
Burgess, P. W., Alderman, N., Forbes, C., Costello, A., Coates, L. M. A., Dawson, D. R., et al. (2006). The case for the development and use of “ecologically valid” measures of executive function in experimental and clinical neuropsychology. Journal of the International Neuropsychological Society, 12, 194–209.
Burgess, P. W., Alderman, N., Volle, E., Benoit, R. G., & Gilbert, S. J. (2009). Mesulam’s frontal lobe mystery re-examined. Restorative Neurology and Neuroscience, 27, 493–506.
Bush, S. S., Ruff, R. M., Tröster, A. I., Barth, J. T., Koffler, S. P., Pliskin, N. H., et al. (2005). Symptom validity assessment: Practice issues and medical necessity. NAN Policy & Planning Committee. Archives of Clinical Neuropsychology, 20, 419–426.
Essig, S. M., Mittenberg, W., Peterson, R. S., Strauman, S., & Cooper, J. T. (2001). Practices in forensic neuropsychology: Perspectives of neuropsychologists and trial attorneys. Archives of Clinical Neuropsychology, 16, 271–291.
Goel, V., Grafman, J., Tajick, J., Gana, S., & Danto, D. (1997). A study of the performance of patients with frontal lobe lesions in a financial planning task. Brain, 120, 1805–1822.
Green, P. (2005). Green’s Word Memory Test User’s Manual. Edmonton: Green Publishing Inc.
Greve, K. W., & Bianchini, K. J. (2007). Detection of cognitive malingering with tests of executive function. In G. J. Larrabee (Ed.), Assessment of malingered neuropsychological deficits (pp. 171–225). Oxford, UK: Oxford University Press.
Heaton, R. K., Smith, H. H., Lehman, R. A. W., & Vogt, A. (1978). Prospect of faking believable deficits on neuropsychological testing. Journal of Consulting and Clinical Psychology, 46, 892–900.
Hosmer, D. W., & Lemeshow, S. (2000). Applied logistic regression. New York: John Wiley and Sons.
Iverson, G. L., & Tulsky, D. S. (2003). Detecting malingering on the WAIS-III: Unusual digit span performance patterns in the normal population and in clinical groups. Archives of Clinical Neuropsychology, 18, 1–9.
Kelly, P. J., Baker, G. A., van den Broek, M. D., Jackson, H., & Humphries, G. (2005). The detection of malingering in memory performance: The sensitivity and specificity of four measures in a UK population. British Journal of Clinical Psychology, 44, 333–341.
Killgore, W. D. S., & DellaPietra, L. (2000). Using the WMS-III to detect malingering: Empirical validation of the Rarely Missed Index (RMI). Journal of Clinical and Experimental Neuropsychology, 22, 761–771.
Larrabee, G. J., & Berry, D. T. (2007). Diagnostic classification statistics and diagnostic validity of malingering assessment. In G. J. Larrabee (Ed.), Assessment of malingered neuropsychological deficits (pp. 14–26). Oxford, UK: Oxford University Press.
McCarter, R. J., Walton, N. H., Brooks, D. N., & Powell, G. E. (2009). Effort testing in contemporary neuropsychological practice. The Clinical Neuropsychologist, 20, 1–17.
Millis, S. R. (2008). What clinicians really need to know about symptom exaggeration, insufficient effort, and malingering: Statistical and measurement matters. In J. E. Morgan & J. J. Sweet (Eds.), Neuropsychology of malingering casebook (pp. 21–37). Hove, UK: Psychology Press.
Nelson, H. E. (1991). The National Adult Reading Test (2nd ed.). Windsor, Berkshire, UK: NFER Nelson.
Nitch, S. R., & Glassmire, D. M. (2007). Non-forced-choice measures to detect noncredible cognitive performance. In K. B. Boone (Ed.), Assessment of feigned cognitive impairment: A neuropsychological perspective (pp. 78–102). London, UK: The Guilford Press.
Rogers, R. (1997). Clinical assessment of malingering and deception (2nd ed.). New York: Guilford Press.
Rogers, R. (2007). Clinical assessment of malingering and deception (3rd ed.). New York: Guilford Press.
Shallice, T., & Burgess, P. W. (1991). Deficits in strategy application following frontal lobe damage in man. Brain, 114, 727–741.
Tombaugh, T. N. (1996). Test of Memory Malingering. New York: Multi-Health Systems Inc.
van Gorp, W. G., Humphrey, L. A., Kalechstein, A., Brumm, V. L., McMullen, W. J., Stoddard, M., et al. (1999). How well do standard clinical neuropsychological tests identify malingering? A preliminary analysis. Journal of Clinical and Experimental Neuropsychology, 21, 245–250.

Manuscript received October 2011
Revised manuscript received April 2012
First published online June 2012