Psychology, Crime & Law, March 2005, Vol. 11(1), pp. 99-122
THE DETECTION OF DECEPTION WITH THE REALITY MONITORING APPROACH: A REVIEW OF THE EMPIRICAL EVIDENCE

JAUME MASIP(a,*), SIEGFRIED L. SPORER(b), EUGENIO GARRIDO(a) and CARMEN HERRERO(a)

(a) Department of Social Psychology and Anthropology, University of Salamanca, Facultad de Psicología, Avda de la Merced, 109-131, 37005 Salamanca, Spain; (b) Department of Psychology and Sports Science, University of Giessen, Germany
(Received 11 December 2002; in final form 8 August 2003)

One of the verbal approaches to the detection of deceit is based on research on human memory that tries to identify the characteristics that differentiate between internal and external memories (reality monitoring). This approach has attempted to extrapolate the contributions of reality monitoring (RM) research to the deception area. In this paper, we have attempted to review all available studies conducted in several countries in order to yield some general conclusions concerning the discriminative power of this approach. Regarding individual criteria, the empirical results are not very encouraging: few criteria discriminate significantly across studies, and there are several variables that moderate their effect. Some of the contradictory findings may have emerged because of differences in the operationalizations and procedures used across individual studies. However, more promising results have been reported in recent studies, and the approach as a whole appears to discriminate above chance level, reaching accuracy rates that are similar to those of criteria-based content analysis (CBCA). Some suggestions for future research are made.

Keywords: Detection of Deception; Lie Detection; Reality Monitoring (RM); Criteria-based Content Analysis (CBCA); Statement Validity Analysis (SVA); Credibility Assessment; (Child) Sexual Abuse; Content Cues; Verbal Cues
INTRODUCTION

The detection of deception has fascinated humanity throughout history (Trovillo, 1939; Larson, 1969; Kerr, 1990; Masip et al., submitted a, submitted b) and has played a particular role in the history of eyewitness testimony (see Sporer, 1982). This should not be surprising, given the advantage one can gain from knowing information that an opponent wishes to conceal. Therefore, it is no wonder that a myriad of procedures aimed at the detection of deceit and the identification of liars, which have been used institutionally in order to administer justice and preserve the social order, has flourished throughout history.

Currently there are three major approaches available to detect deception (Yuille, 1989; Alonso-Quecuty, 1994a; Sporer, 1997; Masip and Garrido, 2000; Vrij, 2000; Vrij et al., 2001). The first encompasses several procedures based upon the measurement, recording and
*Corresponding author. E-mail: [email protected].
ISSN 1068-316X print/ISSN 1477-2744 online © 2005 Taylor & Francis Ltd
DOI: 10.1080/10683160410001726356
J. MASIP et al.
analysis of the psychophysiological activity of the subject being examined. The most widely known of such procedures is the polygraph test (e.g. Gale, 1988; Ben-Shakhar and Furedy, 1990; Lykken, 1998; Kleiner, 2002). The second approach focuses on nonverbal and paraverbal correlates of deception (for reviews, see Zuckerman and Driver, 1985; Masip and Garrido, 2000; Vrij, 2000; Sporer and Schwandt, 2002, 2003; DePaulo et al., 2003), and the third one focuses upon the analysis of the verbal content of the witness’s speech.

The origins of this third approach to the detection of deceit are diverse, because despite some similarities among the several (not many, actually) existing verbal approaches, these emerged in different contexts as well as at different historical times. One of these procedures is the currently well-known criteria-based content analysis (CBCA) (see Sporer, 1983, 1997; Steller and Köhnken, 1989; Steller and Boychuk, 1992; Honts, 1994; Lamb et al., 1997; Ruby and Brigham, 1997; Vrij and Akehurst, 1998; Garrido and Masip, 2001; Vrij, 2000, 2002; Raskin and Esplin, 1991a,b). Another comprises the extrapolations that have been made from the reality monitoring (RM) approach (Johnson and Raye, 1981) to the detection of deceit (e.g. Alonso-Quecuty, 1992, 1995b; Sporer, 1997, in press; Vrij, 2000). Other verbal lie detection procedures are Sapir’s SCAN (Scientific Content Analysis) technique (Lesce, 1990; Driscoll, 1994; Adams, 1996; Smith, 2001; for a review, see Masip et al., 2002) and Wiener and Mehrabian’s (1968) verbal non-immediacy (see also Kuiken, 1981; Buller et al., 1996). Among all these verbal procedures, the best known and the most extensively researched are the CBCA and the RM approaches. The main aim of the present paper is to comprehensively review the findings of the scientific literature concerning the second of these verbal procedures, that is, the RM approach.
The theoretical foundations upon which this approach rests are briefly outlined. Subsequently, all studies known to the authors examining the usefulness of the RM model to discriminate between truthful and deceptive accounts are described. Our literature review is based on both a search of relevant databases (including the Social Sciences Citation Index) and the reference lists of previous studies. Several of these studies have been neglected in the English language literature, because they were published in languages other than English (i.e. Spanish, German or French). This has resulted in an incomplete picture of the RM approach research. In this review, we especially draw the researchers’ attention to these non-English language reports, without disregarding other more available and, therefore, well-known research. Next, some general thoughts and conclusions are drawn from that comprehensive review, concerning both the discriminative power of individual content criteria and the overall validity of the approach, together with a discussion of the effects that certain moderator variables appear to have upon the criteria, and some suggestions for future research are outlined. Finally, a comparison between the RM and CBCA approaches, in terms of their overall accuracy, the instructions or training needed by the raters, their theoretical foundations, etc. is briefly made.
THE RM APPROACH: THEORETICAL FOUNDATIONS AND EMPIRICAL RESEARCH

Theoretical Foundations: Cognitive Research into the Origin of Memories

Statement validity analysis (SVA) originated from expert testimony in Germany (see Sporer, 1982, 1983; Steller and Köhnken, 1989; Undeutsch, 1989) and Sweden (Trankell, 1971), primarily in cases of (child) sexual abuse. A central part of SVA is CBCA, which focuses on
DETECTING DECEPTION WITH THE RM APPROACH
various content aspects of a given statement. In contrast, the RM approach originated from basic research on memory conducted by Johnson and her collaborators in the US (Johnson and Raye, 1981; Schooler et al., 1986, 1988; Johnson et al., 1988, 1993; Johnson and Suengas, 1989; Suengas, 1991). However, these two research orientations are conceptually similar, as Schooler et al. (1986) rightly noted, since in both cases the underlying idea is that there will be certain features that will differentiate between the recollections of events actually experienced by the witness and those of episodes that the individual has not been involved in or has not witnessed (for a more thorough discussion of the underlying rationale, see Sporer, in press).

In their classic 1981 paper, Johnson and Raye proposed that the origin of someone’s memories may be known based on the characteristics of those memories. The authors differentiated between two possible origins of memories: an external origin, based on perceptual processes (memories of experienced events), and an internal origin, based on reasoning, imagination and thought processes (Johnson and Raye, 1981). The strategies used by individuals to differentiate one type of memory from the other were labeled “reality monitoring” by these authors. Johnson and Raye argue that there are four different attributes or kinds of information which may be present in someone’s memories: contextual information (concerning space and time), sensory information (shapes, colors, etc.), semantic information, and cognitive operations. They suggest that external memories (i.e. those of perceived or experienced events) will contain more contextual, sensory, and semantic details than internally generated memories (imagined events). The latter, on the other hand, would contain more references to cognitive processes at the time of encoding.
In terms of the RM approach, the “truth” is the recollection of something done or witnessed, and a lie is an internally generated memory. In the words of Alonso-Quecuty (1994a):

Now let us think of two statements: a truthful one and a deceptive one. When a witness tells the truth, she is remembering things she actually perceived, but if she lies her statement is based on events which did not happen and, therefore, exist only in her imagination: imagined events (p. 146).
Therefore, it is reasonable to suggest that the presence of sensory, contextual and semantic information, and the mention of cognitive processes by the witness in his or her testimony, may be useful to discriminate between truthful and deceptive statements (see, e.g. Alonso-Quecuty, 1992, 1994a,b, 1995b; Hernández-Fernaud and Alonso-Quecuty, 1997a). With this idea in mind, the Spanish researcher María Luisa Alonso-Quecuty has been conducting a series of experiments since the late 1980s (Alonso-Quecuty, 1990a,b, 1992, 1993, 1995a; Alonso-Quecuty and Hernández-Fernaud, 1997; Alonso-Quecuty et al., 1997; Hernández-Fernaud and Alonso-Quecuty, 1997b). Later on, a number of researchers from other countries also began to do research on this topic: Siegfried L. Sporer (Küpper and Sporer, 1995; Sporer and Küpper, 1995; Sporer et al., 1995; Sporer and Hamilton, 1996; Sporer, 1997, in press) in Germany, Stephen Porter in Canada (Porter and Yuille, 1996), Pekka Santtila, Heli Roppola and Pekka Niemi in Finland (Santtila et al., 1999), Claudine Biland and her colleagues in France (Biland et al., 1999) and, more recently, Pär Anders Granhag’s research team in Sweden (Granhag et al., 2001) and Aldert Vrij’s in the UK (Vrij et al., 2000, 2001). As this latter author observed in his recent review on deception and its detection (Vrij, 2000), the relative novelty of research in this area has not allowed a standard set of criteria (such as the 19 CBCA criteria that Steller and Köhnken described in 1989) to be created
yet. Therefore, different research groups used different definitions and operationalizations of their own. We need to take these differences into account when comparing different studies.
Empirical Research

In a pilot study, Alonso-Quecuty (1992)1 examined whether the features that differentiate between externally and internally generated memories made it possible to differentiate between truthful and intentionally distorted statements. She showed a video tape depicting a criminal act to 22 participants. Later on, these participants had to make two statements, a truthful one and a deceptive one, about the events they had witnessed on that video tape. Half of them made the second statement (which was truthful in 50% of the cases and deceptive in the remaining 50%) immediately after the first (immediate condition); the other half was allowed 10 minutes to prepare the second statement (delayed condition). When the statements had been made immediately, there was more sensory and contextual information in truthful than in deceptive accounts, and there was more idiosyncratic information in the latter than in the former. On the contrary, in the delayed condition only the results concerning the idiosyncratic information remained the same, while the effects for sensory and contextual information were reversed: both were more frequent in the deceptive statements than in the truthful ones (see Table 1).

TABLE 1 Results of studies by Alonso-Quecuty and her colleagues.

                                                               RM criteria(a)
Study                                             N            Sensory  Contextual  Semantic  Internal or idiosyncratic
Alonso-Quecuty (1992)(b)                          22
  Immediate                                       11           T        T                     D
  Delayed (10 min)                                11           D        D                     D
Alonso-Quecuty (1993)(c)                          68           T        T                     NS
Alonso-Quecuty (1995a)                            45
  Live                                            22 or 23(d)
    Adults                                        10(d)        NS       T           T         NS
    Children                                      12 or 13(d)  T        D           D         NS
  Laboratory (video)                              22 or 23(d)
    Adults                                        10(d)        NS       NS          NS        NS
    Children                                      12 or 13(d)  NS       NS          NS        NS
Alonso-Quecuty et al. (1997)(e)                   50
  Adults                                          25           T        T                     NS
  Children                                        25           T        T                     NS
Alonso-Quecuty and Hernández-Fernaud (1997)(f)    32           T        T                     NS
Hernández-Fernaud and Alonso-Quecuty (1997b)(g)   73           T        T                     NS

Notes. T: the criterion is significantly (p < 0.05) associated with truthfulness; D: the criterion is significantly (p < 0.05) associated with deceptiveness; NS: the criterion has no significant associations with either truthfulness or deceptiveness. Unless indicated otherwise, participants were adults.
(a) Empty cells indicate that in that study the number of semantic details was not examined.
(b) Laboratory context (videos).
(c) Laboratory context (videos). Both delay and elaboration conditions taken together.
(d) Cell frequencies could not be exactly determined from report.
(e) Laboratory context (audio tapes).
(f) Laboratory context (videos). All three repeated statements taken together. The differences were stronger in the third statement.
(g) Laboratory context (videos). Both kinds of interview taken together. The effects were significantly stronger using the CI than using the STI.
In addition to the three dependent variables mentioned above, the length of the statements (number of words) and the number of pauses were measured. These variables were not derived from Johnson and Raye’s (1981) theory, but have traditionally been studied within the nonverbal approach to the detection of deceit (see DePaulo et al., 2003; Sporer and Schwandt, 2003). When the statements in Alonso-Quecuty’s study were made immediately after having watched the video they were longer and contained more pauses in the truthful version than in the deceptive version; just the opposite happened in the delayed condition: the deceptive statements contained more words and more pauses than the truthful ones. Thus, it is possible that the differences in sensory and contextual information merely reflect differences in the length of accounts.

Alonso-Quecuty (1993) investigated the influence of truth status (truthful vs deceptive), the delay in making the statement (same day on which the stimulus video had been watched vs the next day) and the elaboration of the memory (15 min vs no preparation time before being interviewed) upon the criteria of sensory, contextual and internal information. Truthful statements contained more contextual and sensory information than deceptive ones, but there were no differences between them concerning idiosyncratic information (see Table 1). Preparation increased the number of contextual and sensory details both in truthful and deceptive accounts, as well as the amount of idiosyncratic information in the deceptive ones. The delay in making the statement increased the number of contextual details and decreased the sensory information provided in the statements. The joint effect of delay and elaboration increased the number of contextual details more than any of these factors alone, and increased the sensory details especially in the truthful statements.
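The possibility raised above for the 1992 pilot study, that differences in sensory and contextual information merely reflect differences in the length of accounts, can be illustrated by expressing detail counts as a rate per 100 words. The following is a minimal sketch with made-up numbers; the function name `details_per_100_words` and all counts are illustrative assumptions and are offered only as an illustration, not as the procedure of any study reviewed here.

```python
def details_per_100_words(detail_count: int, word_count: int) -> float:
    """Express a raw detail count as a rate per 100 words of statement."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 100.0 * detail_count / word_count


# Hypothetical counts, invented for illustration (not data from any study).
truthful = {"sensory_details": 12, "words": 300}
deceptive = {"sensory_details": 8, "words": 200}

# The raw counts suggest that the truthful statement is richer in detail ...
raw_difference = truthful["sensory_details"] - deceptive["sensory_details"]

# ... but once length is taken into account, the two rates are identical.
rate_truthful = details_per_100_words(truthful["sensory_details"], truthful["words"])
rate_deceptive = details_per_100_words(deceptive["sensory_details"], deceptive["words"])

print(raw_difference, rate_truthful, rate_deceptive)
```

In this constructed case the longer statement contains more details in absolute terms, yet both statements carry the same density of detail, which is the kind of confound the text describes.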
Alonso-Quecuty (1995a) examined the impact of the age of the witnesses (8- to 10-year-old children vs 18- to 21-year-old adults), the truth status (truthful vs deceptive) and the type of experimental context (live, in which the participants witnessed a live staged criminal act, vs video, in which participants saw the same events recorded on video tape) upon four RM criteria: contextual, sensory, semantic and internal (feelings, thoughts and opinions) information. As can be observed in Table 1, significant effects were apparent only in the live condition, but not in the video condition. Alonso-Quecuty explains this in terms of the stronger personal involvement of the witnesses in the critical event when they are in a real situation rather than when they merely watch a video tape showing that event. In addition, the witness’s age was a critical factor: in the live context, the RM hypotheses regarding contextual details and semantic information were supported for adult witnesses, but when the witnesses were children the results were contrary to what was expected; for child witnesses, only the prediction concerning sensory information was supported (see Table 1).

In the experiments reported so far, Alonso-Quecuty had used video recordings. She wondered what would happen if audio tapes were used instead of video tapes, and examined this question using child and adult witnesses (Alonso-Quecuty et al., 1997). In the auditory condition, adults provided more contextual, sensory and internal information than children. In addition, there was more contextual and sensory information in truthful than in deceptive accounts, whereas for internal information there was no difference (Table 1).
Surprisingly, the authors found that the statements of the audio and video conditions contained more contextual and sensory details than the statements when watching the event live, and that more internal information was contained in the statements of the video and real context conditions than in those of the auditory condition.

Alonso-Quecuty and Hernández-Fernaud (1997) studied the effects of repeating statements upon contextual, sensory and internal information. A sample of undergraduates
watched a 15-minute video tape of an event in which people attempted to take something from a car, which looked like a theft. Later on, they had to give testimony three times about the events witnessed. Half of the witnesses had to lie, the remaining half had to tell the truth. There was not much delay between the three statements: the second was made immediately after the first, and the third just after the second. Truthful statements contained more contextual information than false statements, the third statement contained more information of this kind than the first and, in addition, both variables interacted: the maximum number of contextual details was contained in the third truthful statement, and the minimum in the first deceptive statement; the contextual information increased from the first through the third statement for both truthful and deceptive accounts, but that increase was more pronounced for truthful statements. Sensory details were more numerous in truthful than in deceptive statements; in addition, in truthful statements they increased across the series of repetitions, while in the deceptive ones they remained the same. Finally, internal information showed similar levels in truthful as in deceptive statements, but increased across the three repetitions for both types of statements. Unfortunately, the authors did not include post hoc tests or report effect sizes of the differences between truthful and deceptive accounts separately for the first, the second, and the third statement; therefore, information about the interaction is not included in Table 1. In any case, since the sensory and contextual details increased across the series of statements more in the truthful accounts than in the deceptive ones, we understand that the discrimination was higher in the third statement than in the first. This has important practical implications.
A final study by Hernández-Fernaud and Alonso-Quecuty (1997b) examined, among other things, whether truthful and deceptive statements obtained using either the original cognitive interview (CI) of Geiselman et al. (1984; see also Fisher and Geiselman, 1992) or the kind of interview normally used by the Spanish police (STI) differed in terms of their contextual, sensory and internal (thoughts and feelings) information. Seventy-three participants watched a video tape showing a simulated incident of an apparent theft from a car and then they were interviewed. The independent variables were truth status (35 participants lied and 38 told the truth) and the kind of interview used (CI vs STI). The statements obtained using the CI contained more contextual information than those obtained using the STI; truthful statements contained more contextual information than false statements; and the interaction between these two variables was also significant: the differences between truthful and deceptive statements in terms of contextual information were much more pronounced when the CI had been used than with the STI. Similar results were found concerning sensory information: it was greater in the statements obtained using the CI than in those obtained using the STI, greater in truthful accounts than in deceptive accounts, and a significant interaction showed that the latter effect was stronger when the CI had been used than when the STI had been used. Regarding internal information, neither of the two independent variables was significant, nor was their interaction. The authors did not report post hoc tests to examine whether the differences between truthful and deceptive accounts in terms of their contextual and sensory information were significant within the STI and the CI conditions; therefore, only the main effects are included in Table 1.
In any case, the differences in contextual and sensory information between truthful and deceptive accounts increased dramatically if the CI was used instead of a standard police interview.

In summary, Alonso-Quecuty and her colleagues have presented a series of studies attempting to employ the RM approach to the detection of deception. However, it should be noted that many of the studies are based on small sample sizes and used (somewhat
artificial) repeated-measures designs. Also, the method employed, including the operational definitions of the RM criteria used, is not described in detail, making it difficult to relate the outcomes of their studies to other people’s work. None of the studies contained information about inter-coder reliability.

Whereas in the studies by Alonso-Quecuty and her colleagues participants in the deceptive condition had to falsify the report of an event (witnessed via videotape or live), in the studies by Sporer and his colleagues described below reports of truthful autobiographical events were compared with freely invented events of the same event category (e.g. an accident, an operation, a vacation experience, etc.).

In the first of these studies, Sporer and Küpper (1995) had 100 participants write a report of a personally significant self-experienced event and of a freely invented event, in counterbalanced order. In between reports there was a 1-week interval during which participants were supposed to think about (prepare) their report for the second session. A trained rater blind to the experimental conditions evaluated all 200 accounts with the Judgments of Memory Characteristics Questionnaire (JMCQ), a modified version of Johnson et al.’s (1988) Memory Characteristics Questionnaire (MCQ) adapted for evaluating other people’s reports.2 Based on factor analyses and on theoretical considerations, 35 of the 39 items3 were grouped into eight scales referred to as RM criteria (see Table 2). The RM criteria were clarity/vividness, sensory experiences, spatial information, time information, emotions and feelings, reconstructability of the story (despite complexity of the actions and an unstructured reporting style), realism (appearing plausible according to one’s own experiences in similar situations), and cognitive operations4 (see Table 2).
The first seven criteria were hypothesized to be more pronounced in truthful accounts, the last one in deceptive accounts (analogous to “idiosyncratic information” in Alonso-Quecuty’s studies). Sporer and Küpper (1995) found that the narratives of experienced events were rated significantly higher in realism and time information, but contained fewer sensory experiences. In addition, some differences between truthful and deceptive accounts, such as those concerning clarity and vividness (p < 0.005) and spatial information (p < 0.10), were found only in the delayed condition (after 1 week’s time) but not in the immediate condition. Based on a multiple discriminant analysis with the eight RM criteria, 68% of the self-experienced and 70% of the invented accounts were classified correctly (overall: 69%). The rater who judged each account after rating it on all 39 items reached an overall accuracy rate of 69% (which is significantly above chance).

In a follow-up study, Sporer et al. (1995) compared the accuracy rates of the rater in the previous study with those of two additional “naive” raters who also evaluated all 200 accounts intuitively, that is without the RM criteria, and two additional groups of 20 male and 20 female students each of whom assessed 20 of the 200 accounts intuitively. The two naive raters reached accuracy rates of 53% each, and the other raters of 53% and 59%, respectively, all performing significantly worse than the rater with the RM criteria (69%). Thus, knowledge of the RM criteria does appear to help to evaluate reports of events above chance level.

Sporer (1997) had two raters evaluate 80 reports (transcripts of videotaped accounts) both with 13 CBCA and the eight RM criteria used in the Sporer and Küpper study. There were 40 truthful accounts (descriptions of events that the participants had experienced) and 40 deceptive (i.e. freely invented) accounts provided by 40 psychology undergraduates.
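As an aside on the accuracy rates reported above for Sporer and Küpper (1995) and Sporer et al. (1995): whether a rate such as 69% of 200 accounts exceeds the 50% chance level can be checked with a one-sided exact binomial test. The sketch below is an illustration only. The source does not state which test the original authors used, the helper `binomial_upper_tail` is a hypothetical name, and the test assumes independent judgments, which is a simplification because each participant contributed two accounts.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int) -> float:
    """P(X >= successes) for X ~ Binomial(trials, 0.5), computed exactly."""
    tail = sum(comb(trials, k) for k in range(successes, trials + 1))
    return tail / 2**trials


n_accounts = 200  # 100 participants, two accounts each (from the text)

# 69% correct for the rater using the RM criteria, 53% for a naive rater.
correct_rm_rater = round(0.69 * n_accounts)     # 138 correct judgments
correct_naive_rater = round(0.53 * n_accounts)  # 106 correct judgments

p_rm_rater = binomial_upper_tail(correct_rm_rater, n_accounts)
p_naive_rater = binomial_upper_tail(correct_naive_rater, n_accounts)

print(f"RM rater:    p = {p_rm_rater:.2e}")
print(f"Naive rater: p = {p_naive_rater:.3f}")
```

Under these assumptions the 69% rate lies far in the upper tail of the chance distribution, whereas 53% does not, which is consistent with the pattern described in the text.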
Each participant gave two statements, one of which was truthful while the other one was deceptive (in counterbalanced order). One half of the statements were in the immediate condition where participants were given a brief period up to a maximum of 2 minutes to prepare their
TABLE 2 Differences in RM criteria (Sporer and Küpper’s Judgment of Memory Characteristics Questionnaire [JMCQ]) between truthful and deceptive accounts.

                        Sporer and Küpper (1995)            Sporer (1997)                       Sporer and        Höfer et al.      Santtila et al.
                                                                                                Hamilton (1996)   (1996)            (1999)
Criterion               Immediate   1 week      Overall     Immediate   Delayed     Overall                       Adults/children   Children
                        (n = 100)   (n = 100)   (N = 200)   (n = 40)    (n = 40)    (N = 80)    (N = 240)         (N = 66)          (N = 68)
Clarity/Vividness       NS          T           NS          NS          NS          NS          T                 T                 NS
Sensory information     D(a)        NS          D           NS          NS          NS          NS                NS(d)             T(a)
Spatial information     NS          T(a)        NS          NS          T           NS          NS                NS(d)             NS
Time information        NS          T           T           NS          T           NS          T                 T                 T
Reconstructability      NS          NS          NS          NS          NS          NS          -(b)              T                 NS
Emotions/Feelings       NS          T(a)        NS          NS          T           T(a)        NS(c)             NS(d)             D
Realism                 T           T           T           T           T           T           T                 T                 NS
Cognitive operations    NS          NS          NS          NS          NS          NS          -(b)              T                 NS

Notes. T: the criterion is significantly associated with truthfulness; D: the criterion is significantly associated with deceptiveness; NS: the criterion has no significant associations with either truthfulness or deceptiveness.
(a) p < 0.10.
(b) Not investigated.
(c) Defined differently from other studies (see text).
(d) Not reported (presumably not significant).
first account. The other half of the statements were given in the delayed condition, although the delay consisted only of the time required to produce the first statement (before giving the second statement, they were again given a short period to prepare). A MANOVA where the scores of these criteria were taken as dependent variables yielded a multivariate main effect of truth status, that is, it was possible to discriminate between truthful and deceptive accounts on the basis of the RM criteria scores. At a univariate level, realism and emotions scores were higher in the truthful than in the deceptive narratives. Three variables were useful to discriminate only in the delayed condition but not in the immediate condition: spatial information, time information, and emotions/feelings; all three scored higher in truthful accounts than in the invented ones (Table 2). The poor utility of some of the individual criteria is somewhat surprising, as is the effect of the apparently mild manipulation of the delay variable. A multiple discriminant analysis using the eight RM criteria of Table 2 yielded 75.0% accurate classifications of the truthful statements and 67.5% of the deceptive ones. Taking together the RM and CBCA criteria, correct classifications increased up to 82.5% for the truthful accounts and 75.0% for the deceptive ones. At their final evaluation of each story, the two raters achieved 67.5% and 75.0% correct classifications for the self-experienced, and 77.5% and 55.0% for the deceptive accounts, respectively, that is, an average of 68.8% correct classifications.

In a study by Sporer and Hamilton (1996; for a more detailed description, see Sporer, in press) the effects of truthfulness (invented vs self-experienced) and life period (under 15 vs over 15 years of age) in which the events took place were investigated with n = 240 adults (160 females, 80 males).
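The averages reported above for Sporer (1997) can be retraced with simple arithmetic. Because there were 40 truthful and 40 deceptive accounts, the overall rate is the unweighted mean of the per-class rates. The short sketch below recomputes the 68.8% average for the two raters and, as a derived figure that is not stated in the source, the overall rates implied by the two discriminant analyses. The variable names are illustrative, and the pairing of each rater with one truthful and one deceptive rate assumes that “respectively” lists the raters in the same order both times.

```python
def mean(values):
    """Unweighted arithmetic mean (valid here because class sizes are equal)."""
    values = list(values)
    return sum(values) / len(values)


# Final judgments of the two raters (percent correct, from the text).
rater_accuracy = {
    "rater_1": {"truthful": 67.5, "deceptive": 77.5},
    "rater_2": {"truthful": 75.0, "deceptive": 55.0},
}

per_rater = {name: mean(rates.values()) for name, rates in rater_accuracy.items()}
overall_raters = mean(per_rater.values())  # reported in the text as 68.8%

# Discriminant analyses: (truthful, deceptive) percent correct, from the text.
# The overall figures are derived here, not reported in the source.
rm_only_overall = mean([75.0, 67.5])
rm_plus_cbca_overall = mean([82.5, 75.0])

print(per_rater, overall_raters, rm_only_overall, rm_plus_cbca_overall)
```

The mean of the four rater figures is 68.75%, which matches the reported 68.8% after rounding.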
All participants wrote either an account of a self-experienced or an invented personally significant experience. Based on factor analyses of self-ratings of memory characteristics, seven RM scales were constructed (clarity/vividness, sensory information, spatial information, time information, memory quality/rehearsal, feelings/significance of the event, and realism). The results revealed higher ratings for self-experienced accounts than for invented accounts (see Table 2) for clarity/vividness, time information, and realism. Contrary to expectation, invented stories showed a non-significant tendency to be rated as containing more sensory information and more feelings, and as being more significant in the life of the storyteller.

Sporer and his colleagues (e.g. Küpper and Sporer, 1995; Sporer, 1997) noted that it is difficult to obtain satisfactory inter-rater reliability for some of the criteria used, in particular for sensory information, emotions, and realism. They also noted that sensory information had shown a floor effect in all of their studies, which may have been responsible for low intercoder reliabilities but should also call for caution when using this criterion to interpret differences between invented and self-experienced accounts.

Porter and Yuille (1996) requested some of their participants to commit a simulated crime. They interviewed both the perpetrators and innocent “suspects”, and their statements were analyzed upon the basis of 10 CBCA criteria, three criteria based on the so-called SCAN technique (see Driscoll, 1994), and four criteria the authors considered related to the RM approach: frequency of verbal hedges (“I believe”, “It seems to me”, “I think”, “I figure”, etc.), number of self-references (“I”, “me”, “my”, “mine”, etc.), number of words, and frequency of “filled pauses” (“umm”, “uhh”, “hmm”, etc.).
According to the authors,5 these characteristics were expected to be more strongly present in deceptive accounts (internal memories) than in truthful ones (external memories). Significant results were obtained for only three of the 17 criteria examined; all three were CBCA criteria (number of details, logical structure and admitting lack of memory).
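The four cues that Porter and Yuille (1996) related to the RM approach are simple frequency counts, so their scoring can be sketched in a few lines. The example below is a hedged illustration, not the authors’ coding procedure: the cue lists contain only the examples quoted in the text (the original lists ended in “etc.” and were presumably longer), and the function name `count_cues` and the tokenization rule are assumptions made for this sketch.

```python
import re

# Cue lists limited to the examples quoted in the text; illustrative only.
VERBAL_HEDGES = ("i believe", "it seems to me", "i think", "i figure")
SELF_REFERENCES = {"i", "me", "my", "mine"}
FILLED_PAUSES = {"umm", "uhh", "hmm"}


def count_cues(statement: str) -> dict:
    """Count the four cues in a transcribed statement (hypothetical helper)."""
    text = statement.lower()
    # Tokenize on letters and straight apostrophes; punctuation is dropped.
    words = re.findall(r"[a-z']+", text)
    hedges = sum(
        len(re.findall(r"\b" + re.escape(phrase) + r"\b", text))
        for phrase in VERBAL_HEDGES
    )
    return {
        "verbal_hedges": hedges,
        "self_references": sum(word in SELF_REFERENCES for word in words),
        "words": len(words),
        "filled_pauses": sum(word in FILLED_PAUSES for word in words),
    }


example = "Umm, I think I left my bag in the car. It seems to me it was red."
cues = count_cues(example)
print(cues)
```

Note that in this sketch a pronoun inside a hedge (the “me” in “it seems to me”) is also counted as a self-reference; whether the original coding scheme did the same is not stated in the source.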
Since Porter and Yuille (1996) had included the number of words in the statement as an indicator of its internality,6 it may be relevant to mention that Sporer (1997), who did not link any of these variables with the RM approach, found that there was no relationship between truthfulness of the narratives and the number of words, number of sentences, or number of words per sentence.

Santtila and his colleagues (1999) investigated the usefulness of the RM procedure to distinguish between truthful and deceptive accounts of 68 children using a story-telling paradigm. Three groups of children participated in the experiment: 7-8, 10-11 and 13-14-year-olds. Santtila et al. asked the children to tell both a truthful and a deceptive story concerning an event: (a) in which they had been directly involved, (b) which had had a negative emotional tone, and (c) in which they had experienced an extensive loss of control, conditions that, according to Steller et al. (1988), characterize the experience of sexual abuse. Santtila et al. (1999) used the same JMCQ scales as Sporer and Küpper (1995). They performed a MANOVA on these scale scores, entering truth status of the statement, the age group, and the child’s gender as independent variables. A significant multivariate effect of truth status was found. This was due to the univariate significant effects of time information in the predicted direction and emotions/feelings in the direction opposite to what was expected (more information about feelings was provided in false accounts than in truthful ones) (see Table 2). Sensory information discriminated in the predicted direction, but the significance of its effect was only marginal (p = 0.061).
The authors entered these three criteria simultaneously as predictors in a logistic regression analysis, obtaining a significant model that accounted for 9% of the variance and reached a correct classification rate of 64.0% (66.2% for the deceptive statements and 61.7% for the truthful ones). Santtila et al. indicated that an initial analysis in which they entered all the RM criteria yielded similar results. When using an identical strategy to test the discriminatory power of some of the CBCA criteria7 with the same stimulus material, the overall classification rate attained was 66.9% (63.8% of the deceptive and 69.1% of the truthful statements; Santtila et al., 2000). The children’s age also had a significant multivariate effect in the MANOVA with the RM criteria mentioned above. Indeed, its univariate effect was significant for all the scales except emotions/feelings. In general, the children in the two older groups scored higher on the scales than the youngest children. Neither the gender factor nor the interactions between the different variables were statistically significant. However, it was found that the children’s language skills (as measured with the vocabulary sub-scale of the WISC-R test) correlated significantly with the ratings on the scales of clarity, spatial information, time information, reconstructability and cognitive operations both when the statements were deceptive and when they were truthful. The language skills of the children also correlated significantly with the emotion information of the truthful statements, and with the realism scores of the false accounts.

Biland et al. (1999) described two experiments (n = 44 and 60, respectively) in which several types of nonverbal behavior, 15 CBCA criteria and six RM criteria were used to detect deceit. In both experiments, a sample of participants watched a film and another sample was given a synopsis of it.
All participants had to convince the interviewer that they had watched the film. Interviews were conducted by individuals trained with a kind of interview protocol very similar to the one Raskin and Esplin (1991a) described as a component of the SVA procedure. In experiment 2, the motivation of the participants was also manipulated but was dropped from the analyses as it showed no effect. The results concerning the RM approach are summarized in Table 3. Of the four RM criteria investigated (sensory, contextual, semantic information and thoughts), only the recollection of thoughts discriminated significantly in the first experiment. In experiment 2, contrary to expectation, there was more semantic information in deceptive than in truthful statements. In addition, two criteria derived from Schooler et al.’s (1988) work on suggestion, verbal hedges and self-references, were also investigated.8 Contrary to the authors’ expectation, in both experiments verbal hedges and self-references were significantly more frequently present in truthful accounts than in deceptive accounts. However, based on other research on the detection of deception (Wiener and Mehrabian, 1968; Kuiken, 1981; Miller and Stiff, 1993; Buller and Burgoon, 1994, 1996; Buller et al., 1996) one could argue that self-references should be expected to be more strongly present in reports of self-experienced events. Nonetheless, the results of Biland et al.’s experiments are discouraging.

TABLE 3 Hypotheses and results of Biland et al.’s (1999) study.

Criteria                    Hypothesis    Experiment 1 (n = 44)    Experiment 2 (n = 60)
Sensory information         T             NS                       NS
Contextual information      T             NS                       NS
Semantic information        T             NS                       D
Recollection of thoughts    D             D                        NS
Verbal hedges               D             T                        T
Self-references             D (a)         T                        T

Note. T: the criterion is significantly associated with truthfulness; D: the criterion is significantly associated with deceptiveness; NS: the criterion has no significant associations with either truthfulness or deceptiveness. (a) Other theorists have predicted the opposite (see text).

Vrij et al. (2000) conducted a study where, as in Biland et al.’s, nonverbal, CBCA and RM indicators were examined. They used truthful (n = 34) and deceptive (n = 39) accounts of 73 nursing students who answered a series of three questions about a video they had watched showing the theft of a handbag of a patient in a hospital by a visitor. Six RM criteria were used: perceptual-visual, perceptual-auditive, spatial information, temporal information, affective9 information, and cognitive operations (see Table 4). Five of the six criteria (all except affective information) differed significantly between truthful and deceptive statements, although the differences in cognitive operations were in the direction opposite to expectation. That is to say, all those criteria were more strongly present in truthful than in deceptive accounts. The RM summative score10 was significantly higher in truthful than in deceptive accounts. In addition, the authors conducted a discriminant analysis that yielded 70.6% correct classifications of truthful statements and 64.1% of false statements.

TABLE 4 Results of Vrij et al.’s (2000, 2001) and Granhag et al.’s (2001) studies.

                                                                     Granhag et al. (2001) (n = 44) (a)
RM criteria              Vrij et al. (2000)    Vrij et al. (2001)    Raw            Controlling for
                         (n = 73)              (n = 73)              frequencies    length of account
Visual details           T                     T                     T              D
Sound details            T                     T                     T              T
Spatial information      T                     T                     -              -
Temporal information     T                     T                     T              T
Affective information    NS                    T                     T              T
Cognitive operations     T                     -                     T              NS

Note. T: the criterion is significantly associated with truthfulness; D: the criterion is significantly associated with deceptiveness; NS: the criterion has no significant associations with either truthfulness or deceptiveness; -: not investigated. (a) Participants were children.

In a related paper by Vrij et al. (2001) using a within-subjects design,11 very similar results were found regarding the individual criteria (except for cognitive operations, which were disregarded in this paper; see Table 4). The overall discrimination of the RM approach (i.e. when taking together the rates for all individual criteria) was also significant. In addition, when the overall RM scores (which could range from 0 to 5) for the deceptive interviews were subtracted from those corresponding to the truthful interviews, 60% of the participants reached the highest RM score possible when telling the truth, and only 12% reached this maximum score when lying. This is further evidence of the discriminative power of the RM procedure. Similar results were found using the CBCA approach (i.e. 64% and 15% correct classifications, respectively).

On the negative side, Vrij et al. (2001) noted that the RM summative scores correlated with certain personality factors such as public self-consciousness (Fenigstein et al., 1975) and acting ability (see e.g., Briggs et al., 1980). Although the differences between RM scores of the deceptive and the truthful accounts were in the hypothesized direction (i.e. higher ratings when telling the truth than when lying), the differences were smaller among respondents with high ratings in public self-consciousness and acting ability than among those with low ratings on these personality traits.
Therefore, the RM approach would be potentially more useful to assess the credibility of the latter than to assess that of the former. Vrij et al. (2001) argued that the reason for this is that those witnesses who believe they are the focus of attention (i.e. high public self-consciousness) try to control their behavior to avoid detection while lying. If, in addition, they possess the skills necessary to make an honest impression (something reasonable to expect among people with high acting ability, see Vrij et al., 2001, p. 901), then they will be successful in manipulating their verbal behavior, increasing at will the occurrence of some RM criteria when lying:

Basically, people who are good at imagining themselves in another situation will be able to produce a statement that contains many RM criteria, because in such a situation, many perceptual, contextual, and affective details are likely to be mentioned. . . . good actors and people high in public self-consciousness may well be better at imagining themselves in another situation. . . (Vrij et al., 2001, p. 902).
In addition, the gender of the witness affected the RM ratings: females obtained lower RM scores than males (Vrij et al., 2001; see note 6 at p. 908). Unfortunately, the authors did not report which specific RM criteria were particularly affected by personality differences and gender.

Finally, Granhag et al. (2001) presented a magician’s show to 22 11-year-old children and asked another 22 to imagine themselves watching that show. Later on, all children were interviewed twice: the same day and a week later. A MANOVA with visual, auditory, sensory, spatial, temporal, and affective information, as well as cognitive operations, as dependent variables showed a significant multivariate effect of truth status. Using the raw frequencies, truthful accounts contained more visual, auditory, temporal, and affective information as well as more cognitive operations than deceptive ones (see Table 4). However, after correcting for the number of words contained in each story, results changed somewhat. Truthful accounts still contained more auditory, temporal, and affective information than deceptive ones (see Table 4). However, contrary to expectations, truthful accounts also contained less visual information. The differences for cognitive operations were no longer significant. It is interesting that neither the interview factor nor its interaction with truth status was significant, which means that the 1-week delay of the second interview did not affect the RM ratings. In addition, Granhag et al. (2001) calculated two discriminant analyses, one for each interview. The RM criteria were entered as predictors, and truth status of the statements was entered as the classifying variable. The analyses yielded an 85.4% overall classification rate for the first interview (91% of the truthful and 81% of the deceptive accounts), and a 79.2% classification rate for the second interview (73% and 85%, respectively).

TABLE 5 Percentage correct classifications of truthful and deceptive statements using discriminant analyses or logistic regression analyses.

                                      Truthful statements       Deceptive statements
Study                                 Judged      Judged        Judged       Judged      Overall
                                      truthful    deceptive     deceptive    truthful    accuracy
Sporer and Küpper (1995) (a)
  RM                                  68.0        32.0          70.0         30.0        69
Höfer et al. (1996) (a)
  CBCA                                                                                   71
  RM                                  61.0        39.0          70.0         30.0        66
Sporer (1997) (a)
  CBCA                                70.0        30.0          60.0         40.0        65
  RM                                  75.0        25.0          67.5         32.5        71
Santtila et al. (1999, 2000) (b)
  CBCA                                69.1        30.9          63.8         36.2        67
  RM                                  61.7        38.3          66.2         33.8        64
Vrij et al. (2000) (a)
  CBCA                                64.7        35.3          79.5         20.5        73
  RM                                  70.6        29.4          61.4         38.6        67
Granhag et al. (2001) (a)
  First interview                     91          9             81           19          85
  Second interview                    73          27            85           15          79

(a) Discriminant analyses were used in these studies. (b) Santtila et al. (1999, 2000) used logistic regression analyses. Only the most discriminative criteria were entered.

In an unpublished study by Höfer et al. (1996), the eight RM scales from Sporer and Küpper’s (1995) JMCQ were used. The witnesses either participated (truth-tellers) or did not (liars) in a photographic session. Those who did not participate were given some information concerning what had happened in the session. Then all the witnesses had to convince the interviewer that they had participated. Truthful accounts had higher scores in clarity, time information, reconstructability, realism and, unexpectedly (according to the RM theory, but not so in view of the results described above), cognitive operations. Sensory experiences, spatial information, and emotions/feelings showed no differences (see Table 2). Using a multiple discriminant analysis, Höfer and his collaborators could accurately classify 61% of the truthful accounts and 70% of the deceptive ones.

In the preceding section, the empirical research with the RM approach to detect deception has been described. In the next section, we provide a summative evaluation of the discriminative power of the RM approach and point out some important moderator variables that have to be considered. At the end, suggestions for future research are given.
PUTTING IT ALL TOGETHER: THE DISCRIMINATIVE POWER OF THE RM APPROACH

One of the weaknesses that some authors have noted with respect to the SVA/CBCA approach is its poor theoretical foundation (Köhnken, 1990; Sporer, 1997). The so-called Undeutsch hypothesis upon which it is based merely states that the descriptions of events that a witness has experienced him- or herself will differ in content, quality and expression from the descriptions of events which are a product of the imagination. However, Sporer (1997), with reference to Köhnken (1990), maintains that this is merely a working hypothesis that postulates that certain differences should appear but does not specify why these differences are to be expected. In other words, neither the psychological processes why such differences are likely to be found nor the boundary conditions that specify when and when not such differences may be observed are specified (p. 375, emphasis in the original).
Probably, this weakness is rooted in the way the CBCA content criteria emerged within the SVA approach: they were not derived from an existing theory (top-down); rather, they progressively emerged from German psychologists’ practice in interviewing children (see Sporer, 1982, 1983; Arntzen, 1993). That is, these criteria were created in a somewhat ad hoc, intuitive manner, following an inductive (bottom-up) strategy. On the other hand, the theoretical foundations underlying the RM procedure are solid and well articulated: the RM procedure is rooted in a clearly formulated theory (Johnson and Raye, 1981; Johnson et al., 1993) that has been supported by data in a variety of domains. Within the eyewitness domain, the RM approach and its extension, the source monitoring framework, have been successfully employed to explain misinformation and suggestibility effects (see Johnson et al., 1993). However, the application to the detection of deception is more problematic. Assuming that internally generated memories are equivalent to intentional distortions of testimony (i.e. lies) involves an inferential leap which may be questionable (for a thorough discussion, see Sporer, in press). Also, despite the strong theoretical foundation, the RM predictions concerning the discrimination between truthful statements and lies have not always been supported by the data, and appear to be influenced by many variables. In addition, the diversity of experimental paradigms, and the differences in RM criteria, samples, etc. used in the different experiments, make it difficult to compare the results from different studies and paradigms. In the following, we highlight some of the moderator variables that have emerged in past studies.
Moderator Variables

Mode of Presentation

One of the variables that may have an influence on the validity of the RM criteria is the mode in which the event to be reported has been presented to participants. We would expect stronger differences when events were experienced personally than when the events were only witnessed on video or via audio tapes. In general, studies that had participants engage in an event or report on autobiographical memories showed stronger support for the RM approach (see Table 2). Also, significant effects occurred in the live condition of Alonso-Quecuty’s (1995a) study (although in opposite directions for adults vs children). However, in the video condition, none of the RM criteria was helpful in trying to discriminate between truthful and deceptive narratives. This is somewhat surprising, because in all of Alonso-Quecuty’s remaining studies (in which, with no exception, video or audio recordings were used) there were always at least two criteria that discriminated significantly (see Table 1). The same occurred in the studies conducted by Vrij et al. (2000, 2001), who also used videos. Similarly, Biland et al. (1999) also found significant effects after showing a film, although most of these effects were not in the predicted direction.

Preparation and Delay

The manipulation of the delay of the statement, and the opportunity to prepare or plan an account, has yielded contradictory results. Alonso-Quecuty (1990a,b, 1992, 1993) used a between-subjects design with regard to this variable: some of her witnesses were interviewed immediately after watching the critical event, and the rest of them were interviewed after 10 minutes (Alonso-Quecuty, 1990a,b, 1992) or the next day (Alonso-Quecuty, 1993). Witnesses provided both truthful and deceptive accounts in counterbalanced order.
Sporer (1997) used a within-subjects design, wherein the same individuals had to make all the statements, with a very short delay (the second interview followed just a few minutes after completion of the first). While Alonso-Quecuty (1990a,b, 1992, 1993) found that sensory and contextual details were more strongly present in deceptive than in truthful statements in the delayed condition (but not in the immediate, in which they were more strongly present in the truthful statements), Sporer found that in the delayed condition contextual (spatial and temporal) information was more strongly present in truthful accounts (with no significant effects for these variables in the immediate condition), while sensory information was not associated with either truthfulness or deception in either condition. Finally, although Alonso-Quecuty (1993) found that delay had an influence on the occurrence of contextual and sensory information, it did not interact with truth status. It is apparent from the above that the way in which the delay of statements was manipulated differed greatly across the three studies in which that variable was examined. This might provide a likely explanation for the contradictory findings.

Repeated Statements

There are also contradictory results concerning repetitions. Repeating the account did not affect any RM criteria in Granhag et al.’s (2001) study, but increased the presence of sensory, contextual and internal information in Alonso-Quecuty and Hernández-Fernaud’s (1997) experiment, where repetition also interacted with truth status regarding the presence of contextual and sensory details. In order to account for these conflicting results, it is important to note that Granhag et al.’s manipulation of the repetition variable involved a 1-week delay between the first and the second interview, whilst Alonso-Quecuty and Hernández-Fernaud asked their participants to make each statement immediately upon completion of the previous one. They obtained three statements per sender, whereas Granhag et al. obtained only two. The latter authors used a semi-structured interview which comprised several questions, while Alonso-Quecuty and Hernández-Fernaud used a one-question interview. These or many other differences between these studies (e.g. the witnesses’ age, the definitions of the criteria scored, etc.) may account for the contradictory findings.

Age of Witnesses

Age of witness participants also appears to moderate the discriminative utility of the RM criteria, similarly to research with CBCA criteria. Santtila et al. (1999) suggested that as verbal skills increase with age, the presence of some criteria may also increase, irrespective of truth status.
In line with their predictions, research has shown that the witness’s age has an impact on the presence of certain content criteria (Alonso-Quecuty et al., 1997; Santtila et al., 1999). Different subsets of RM criteria may be valid for children vs adults (Sporer, 1997). In addition, under certain circumstances the discriminatory power of some criteria does not appear to be the same in children as in adult witnesses (Alonso-Quecuty, 1995a). These age effects must be taken into account when considering the extant RM research. In most studies the experimental participants acting as witnesses have been adults; in some others they have been children (Granhag et al., 2001; Santtila et al., 1999).

Individual Differences

Personality variables such as public self-consciousness and acting ability (Vrij et al., 2001), and perhaps self-monitoring and gender of story-tellers, have to be taken into consideration. Just as German court experts have always emphasized that the SVA approach is not a simple application of a CBCA criteria checklist, applying the RM approach also has to take individual differences into consideration.
Methodological Considerations

As noted above, many contradictory findings have emerged in past deception studies based on the RM approach. The reason for these contradictory findings may lie in the fact that the individual studies differ from each other in numerous features, such as the way the delay of the statements or their repetitions have been manipulated. In this section, other relevant differences between particular RM studies are discussed.

Experimental Paradigm

While most researchers have taken victims or witnesses as participants, Porter and Yuille (1996) used suspects. This may make a difference. Indeed, the question of whether the RM technique is useful in identifying perpetrators’ false denials merits careful attention, but Porter and Yuille’s study cannot be compared with others which were conducted using victims or witnesses, because one cannot tell whether any potential difference between their study and the others is due to their having used suspects. The appropriate way to study whether there is a difference between suspects’ and victims’ or witnesses’ accounts is to have a sample of each in the same study. Even in those cases in which victims or witnesses were used, there were differences in the kind of events the participants had to report on. They range from innocuous and even enjoyable performances (Granhag et al.’s magician’s show), to criminal acts (Alonso-Quecuty’s experiments) and negative, emotionally toned events (Santtila et al., 1999). There are reasons (e.g. the witnesses’ emotional involvement) to think that these differences may affect the presence of the criteria or their discriminative power. Future research should examine this.

Sample Size

There are large differences in sample sizes across studies (see Tables 1-4) which affect the reliability of findings (particularly of interactions with truth status) but also the effect sizes of individual criteria.

Operationalizations of RM Criteria

In general, different authors have used different criteria, and have defined similar criteria differently; surprisingly, detailed descriptions of the criteria are not provided in the relevant literature (see Sporer and Küpper, 1995, and Vrij, 2000, for exceptions).
There is a need, therefore, to establish and define in detail a standard set of criteria which should be used by researchers in future studies. In particular, operational definitions of “internal states” and “cognitive operations” have to become more precise, which in turn may resolve the contradictory findings.

Scoring of Criteria

Different research groups also used different ways to score RM criteria: while some groups used frequency counts (e.g. Alonso-Quecuty and her colleagues), others used rating scales (e.g. Sporer and his colleagues). From a methodological point of view, it is noteworthy that Granhag et al. (2001) adjusted their frequency counts of the RM criteria by the number of words of each account because there were more words in truthful than in fabricated statements. Before this standardization, truthful accounts contained more visual details, just the opposite from the adjusted data. Truthful accounts also contained more affective information than fabricated accounts, but this difference disappeared after standardization. Thus, differences in account length (e.g. in Alonso-Quecuty’s, 1992, study) may be confounded with some of the effects reported in some studies.

In addition to providing a likely explanation for the contradictory results that have emerged across different studies, all these differences between the individual experiments make it difficult to make appropriate cross-study comparisons. On the other hand, the consistent results that have emerged despite the differences between experiments do indeed support the robustness of the RM theoretical approach.
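The length adjustment discussed under Scoring of Criteria can be illustrated with a small sketch. The source does not state how Granhag et al. (2001) corrected for account length, so the rate per 100 words used here is only one simple possibility, and the counts and word totals are invented for illustration (they are not data from any of the reviewed studies). The function name `per_100_words` is likewise hypothetical.

```python
def per_100_words(count, n_words):
    """Express a raw frequency count as a rate per 100 words of the account."""
    if n_words <= 0:
        raise ValueError("an account must contain at least one word")
    return 100.0 * count / n_words


# Hypothetical accounts (illustrative numbers only): the truthful account
# is twice as long as the fabricated one.
truthful = {"visual_details": 12, "words": 400}
fabricated = {"visual_details": 8, "words": 200}

# Raw counts favor the truthful account.
raw_difference = truthful["visual_details"] - fabricated["visual_details"]

# Length-adjusted rates reverse the direction of the difference.
rate_truthful = per_100_words(truthful["visual_details"], truthful["words"])
rate_fabricated = per_100_words(fabricated["visual_details"], fabricated["words"])

print("raw difference (truthful - fabricated):", raw_difference)
print("visual details per 100 words:", rate_truthful, "vs", rate_fabricated)
```

With these made-up numbers the longer truthful account has more visual details in absolute terms but fewer per 100 words, which is the kind of reversal that makes account length a plausible confound when only raw frequencies are reported.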
Individual RM Criteria: Summary of Results and Considerations

While some RM criteria do not seem to discriminate between truthful and deceptive accounts (e.g. internal information or cognitive operations), others seem quite promising (e.g. contextual, spatial and time information, as well as realism). For sensory information the evidence is mixed, which may be a function of different definitions used by different authors as well as of floor effects in some studies. Also, many differences in results are apparent between individual studies in terms of the usefulness of particular RM criteria for discriminating accurately. The multiple differences across the individual experiments in design, witness samples, manipulation of relevant variables, type of event the witnesses have to testify about, how the RM criteria are defined and scored, etc. may account for some of these discrepancies. Thus, it is necessary to systematize the RM research. First, establishing and describing in detail a standard set of criteria seems to be an essential task. Second, the influence of several variables likely to influence the resulting RM criteria should be examined in detail under controlled conditions, taking care not to introduce variables in the studies that may affect the results or may blur the actual influence of the relevant manipulations upon the dependent measures. Efforts should also be taken to study the validity of the RM approach under ecologically valid conditions. Using only videotaped materials which are not likely to involve participants may not tap the processes postulated by the RM approach adequately and may underestimate the discriminative validity of some criteria.
Overall Validity of the RM Approach

Despite some confusing and at times disappointing results concerning the individual criteria, it has already been shown above that the overall RM system discriminates fairly well (Sporer and Küpper, 1995; Höfer et al., 1996; Sporer and Hamilton, 1996; Sporer, 1997; Santtila et al., 1999; Vrij et al., 2000, 2001; Granhag et al., 2001). For instance, in all cases in which MANOVAs or multiple discriminant analyses were used, the multivariate effect of truth status upon the set of RM criteria was significant, although the criteria used in different studies were not always the same. In Table 5, the percentages of correct classifications obtained using discriminant analyses (Sporer and Küpper, 1995; Höfer et al., 1996; Sporer and Hamilton, 1996; Sporer, 1997; Vrij et al., 2000; Granhag et al., 2001) or logistic regression analysis (Santtila et al., 1999), in which the RM criteria were entered as predictors, are shown. Classification rates obtained using the CBCA criteria are also included, for those cases when they were available, for comparison.12 Although no statistical tests to assess the comparative validity of the RM approach with the CBCA approach are available (nor desirable), it is apparent from the information provided in Table 5 that the percentage of correct classifications obtained by both of these methods is similar. However, the discrimination rates are too poor for the procedure to be used in actual criminal cases. Note that the probability of mis-classifying an innocent person as guilty would range between 9% (in Granhag et al.’s study) and 39% (in Höfer et al.’s), and the probability of considering that a deceptive person is telling the truth would range between 15% (an extremely low rate found by Granhag and his colleagues in their second interview) and 39% (Vrij et al.). However, we should take into consideration that the classification rates based on multiple discriminant analyses always depend on the number of predictors used as well as on the overall sample size. The number of statements analyzed should be at least 5-10 times as large as the number of predictors (cf. Tabachnick and Fidell, 1996). Notwithstanding, the finding that discrimination is above chance is encouraging, and should stimulate researchers’ efforts towards refining the criteria in order to increase the discrimination rates even further, thus creating a useful instrument. There is also one study that shows that a rater who evaluated a large number of accounts using the RM approach was significantly better at classifications than two additional raters examining the same 200 stories, and also better than two control groups of naive raters (Sporer et al., 1995; see also Sporer, in press).
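The arithmetic behind the classification and error rates quoted above can be made explicit with a short sketch. It is an illustration under stated assumptions, not a reanalysis: the function names are hypothetical, the Santtila et al. (1999) example assumes that all 68 children contributed one truthful and one deceptive statement, and the figure of six predictors in the last lines is purely illustrative.

```python
def overall_accuracy(hit_truthful, hit_deceptive, n_truthful, n_deceptive):
    """Overall percentage correct, weighting each class by its number of statements."""
    correct = hit_truthful * n_truthful + hit_deceptive * n_deceptive
    return correct / (n_truthful + n_deceptive)


def error_rates(hit_truthful, hit_deceptive):
    """Return (truthful judged deceptive, deceptive judged truthful), in percent."""
    return 100.0 - hit_truthful, 100.0 - hit_deceptive


def enough_cases(n_statements, n_predictors, ratio=5):
    """Rule of thumb from the text: statements >= ratio * predictors (ratio 5 to 10)."""
    return n_statements >= ratio * n_predictors


# Santtila et al. (1999): 61.7% of truthful and 66.2% of deceptive statements
# correct; equal group sizes of 68 are an assumption made for this sketch.
print(overall_accuracy(61.7, 66.2, 68, 68))

# Granhag et al. (2001), first interview: 91% and 81% correct.
print(error_rates(91, 81))

# Illustrative check: 44 statements with a hypothetical six predictors.
print(enough_cases(44, 6), enough_cases(44, 6, ratio=10))
```

Under the equal-group assumption the first call reproduces, to rounding, the 64.0% overall rate reported for Santtila et al., and the second returns the 9% and 19% error rates that complement the hit rates in Table 5. The last line shows how a sample can satisfy the lenient end of the 5-10 rule while failing the strict end.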
CONCLUSIONS

More than a decade has elapsed since the first study examining whether the principles derived from basic memory research on RM would be useful to discriminate between truthful and deceptive accounts. At this point, a comprehensive and thorough analysis of the existing evidence had to be done in order to reach some general conclusions and provide research orientations. In this paper, all available separate studies have been reviewed, including several unpublished reports as well as some papers published in Spanish, French, and German which, as a consequence, are often not taken into account in the relevant literature in English. The conclusions that may be drawn from this review can be summarized as follows.

First, the usefulness of some of the individual criteria is at best limited and unclear. In many cases only a minority of the criteria examined in each individual study has discriminated adequately: null results abound and, at times, differences contrary to the ones predicted have been found. Amongst the most discriminative criteria, visual and auditory details, contextual information, time information and realism stand out. While some studies have reported more cognitive or “idiosyncratic” information in deceptive reports, in other studies, contrary to expectations, cognitive operations have been repeatedly found more frequently in truthful accounts. Some of the criteria are influenced by moderator variables like delay and the opportunity to prepare an account, which have to be taken into consideration. Moreover, the lack of clear definitions in some studies and the different operationalizations of the criteria used by different authors make it difficult to compare results across studies and might be responsible for the contradictory findings that have emerged. It is necessary for researchers to agree upon what criteria should be used and how they are to be defined, to standardize the RM system. Also, some authors used rating scales (e.g. for clarity/vividness) while others used frequency counts (e.g. of visual details). Although frequencies can be coded more reliably, most studies have not controlled for the number of words contained in truthful vs deceptive accounts (which often differ). A noteworthy exception is the study by Granhag et al. (2001), who noted that results may differ when such a correction is employed. Further, the influence of moderator variables upon the occurrence of the RM criteria should be examined in more detail.

Despite these reservations, there is some evidence that the RM approach taken as a whole discriminates with an accuracy rate above chance level, but comparisons with the CBCA procedure are difficult for a variety of methodological reasons. Yet the relatively high risk of mis-classifying witnesses’ accounts advises against its use in applied forensic settings.
Systematic research conducted on the hypothetical standard RM system mentioned in the previous paragraph would be helpful in improving the system by keeping only those criteria that appear most discriminative and least sensitive to the influence of moderator variables. This would raise the validity of the overall RM approach even further, and would turn it into a potentially useful instrument in a variety of forensic situations. In this regard, it is important to keep in mind that the RM approach has some advantages over other lie detection procedures. For example, Sporer (1997) points out that with detailed instructions it is possible to teach raters to use the RM criteria reliably, although not for all criteria (see Sporer, in press). Among the possible advantages of the RM approach, according to Vrij (2000), are that (a) it is easier to use because it contains fewer criteria than the CBCA, (b) the operationalizations of most of the criteria are more precise (at least in the JMCQ), thus probably making the raters’ training easier than that required by the CBCA approach, and (c) unlike SVA/CBCA, the RM procedure has a solid theoretical basis. Further topics awaiting research are: (a) the validity of the procedure in field situations where real statements by real witnesses are analyzed, since to our knowledge no actual field study on the topic has been published so far, and (b) its reliability, for only a few studies have reported data on inter-rater reliability (e.g. Küpper and Sporer, 1995; Sporer and Küpper, 1995; Sporer, 1997; Vrij et al., 2000, 2001). Also, efforts should be made to integrate the CBCA and RM approaches. Sporer (1998) has made some efforts in this direction by proposing the Aberdeen Report Judgment Scales, for which first data on reliability and validity are available (Sporer, 1998; Sporer et al., 2000). Also, more effort should be devoted to investigating moderator variables more closely, such as the opportunity to plan and/or rehearse an account.

ACKNOWLEDGEMENTS

Writing this paper has been possible thanks to the financial support of the Junta de Castilla y León, Programa de Apoyo a Proyectos de Investigación (Ref. SA52/00B), and to a grant from the Deutsche Forschungsgemeinschaft (DFG 262/3-2) to the second author. We are very grateful to several anonymous reviewers who helped to improve the manuscript by their constructive criticisms.
Notes

1. The same data contained in this English book chapter have previously been published in Spanish (Alonso-Quecuty, 1990a,b). As the data reported are identical, we only refer to the English version.
2. An English translation of the JMCQ, including an answer sheet for recording responses, is available from [email protected].
3. The remaining four items (22, 23, 28, 39) are only used for describing the nature of the event (e.g. emotional tone, date of the incident).
4. Küpper and Sporer (1995) examined inter-rater reliabilities of the criteria used in Sporer and Küpper’s study, noting satisfactory inter-rater agreement for sensory, spatial and time information as well as emotions, while inter-rater agreement for clarity/vividness, reconstructability, realism and cognitive operations was in need of improvement. They also noted a clear improvement in inter-rater reliability after a brief training period.
5. On the basis of other people’s work, one should actually expect more self-references and more words to be contained in self-experienced than in deceptive accounts (see Wiener and Mehrabian, 1968; Kuiken, 1981; Miller and Stiff, 1993; Buller and Burgoon, 1994, 1996; Buller et al., 1996).
6. Based on research on CBCA criteria and other research on deception (see Miller and Stiff, 1993; DePaulo et al., 2003), we would predict the opposite: truthful accounts should be longer than deceptive ones.
7. Only those CBCA criteria for which significant or marginal differences in means between truthful and deceptive statements were found were entered as predictors in the logistic regression analysis computed to assess the usefulness of the CBCA procedure to correctly classify statements as truthful or deceptive. These criteria were: unstructured production, quantity of details, reproduction of conversations, unusual details, attribution of perpetrators’ mental state, unexpected complications, and superfluous details (Santtila et al., 2000).
8. See also the research on non-immediacy by Buller et al. (1996), Kuiken (1981) and Wiener and Mehrabian (1968).
9. The authors used CBCA criterion 12 (“Accounts of one’s mental state”) as part of the RM criteria here.
10. The calculation of a summative score by adding up individual RM criteria (recoded as binary variables for the presence/absence of each criterion) is very problematic unless it is shown first that the scales represent a uni-dimensional construct (either by factor analyses; see Sporer and Küpper, 1995), or at least by high inter-correlations between individual items, and in combination with calculation of internal consistency (Cronbach’s alpha).
11. Each participant told both a truthful account and a lie, in counterbalanced order. The first of these was also reported in the Vrij et al. (2000) study, so the results cannot be considered independent.
12. Since the purpose of Table 5 is to compare the usefulness of the RM approach with that of the CBCA, only results from studies in which the same statements were analyzed with the CBCA criteria after being analyzed with the RM technique are summarized. Other authors have used discriminant analyses to examine the CBCA correct classification rates, but they did not code the occurrence of RM criteria. These studies have nothing to do with the RM procedure and, therefore, are not described here. The correct classification scores ranged between 57% and 88% for truthful statements, and between 62% and 100% for deceptive statements (Ruby and Brigham, 1997; see also Vrij, 2000).
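The internal-consistency check called for in the note on summative scores can be sketched as follows. This is only an illustration using the standard Cronbach's alpha formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores); the function name and the presence/absence codings are hypothetical and are not taken from any reviewed study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per statement,
    one column per criterion), using sample variances."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are needed")
    columns = list(zip(*scores))  # one tuple of scores per criterion
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical binary codings (1 = criterion present, 0 = absent)
# for five statements and three criteria; illustration only.
codings = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(codings)
print(round(alpha, 3))
```

A high alpha alone does not establish uni-dimensionality, so a check of this kind would complement, not replace, the factor-analytic evidence the note refers to.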
References

Adams, S. H. (1996). Statement analysis: what do suspects’ words really reveal? FBI Law Enforcement Bulletin, 12–20 October 1996 (retrieved 3 July 2003, from http://www.fbi.gov/publications/leb/1996/oct964.txt).
Alonso-Quecuty, M. L. (1990a). Recuerdo de la realidad percibida vs. imaginada. Buscando la mentira [Memories of perceived vs. imagined events. In search of deception]. Boletín de Psicología, 29, 73–86.
Alonso-Quecuty, M. L. (1990b). Memorias de origen interno vs. externo: una alternativa en la detección de la mentira [Memories of internal vs. external origin: an alternative way of detecting deception]. Libro de Comunicaciones del Congreso Nacional de Psicología Social, vol. 2 (pp. 17–23). Santiago: Tórculo.
Alonso-Quecuty, M. L. (1992). Deception detection and reality monitoring: a new answer to an old question? In F. Lösel, D. Bender and T. Bliesener (Eds.), Psychology and Law: International Perspectives (pp. 328–332). Berlin: Walter de Gruyter.
Alonso-Quecuty, M. L. (1993). Psicología forense experimental: el efecto de la demora en la toma de declaración y el grado de elaboración de la misma sobre los testimonios verdaderos y falsos [Experimental forensic psychology: the effect of delay and preparation of a statement upon truthful and deceptive testimonies]. In M. García (Ed.), Psicología Social Aplicada en los Procesos Jurídicos y Políticos (pp. 81–88). Seville: Eudema.
Alonso-Quecuty, M. L. (1994a). Psicología forense experimental: el testigo deshonesto [Experimental forensic psychology: the dishonest witness]. In J. Sobral, R. Arce and A. Prieto (Eds.), Manual de Psicología Jurídica (pp. 139–153). Barcelona: Paidós.
Alonso-Quecuty, M. L. (1994b). Psicología forense experimental: testigos y testimonios [Experimental forensic psychology: witnesses and testimonies]. In S. Delgado (Ed.), Psiquiatría Legal y Forense, vol. 1 (pp. 469–479). Madrid: Colex.
Alonso-Quecuty, M. L. (1995a). Detecting fact from fallacy in child and adult witness accounts. In G. Davies, S. Lloyd-Bostock, M. McMurran and C. Wilson (Eds.), Psychology, Law and Criminal Justice. International Developments in Research and Practice (pp. 74–80). Berlin: Walter de Gruyter.
Alonso-Quecuty, M. L. (1995b). Psicología y testimonio [Psychology and testimony]. In M. Clemente (Ed.), Fundamentos de Psicología Jurídica (pp. 171–184). Madrid: Pirámide.
Alonso-Quecuty, M. L. and Hernández-Fernaud, E. (1997). Tócala otra vez Sam: repitiendo las mentiras [Play it again Sam: retelling a lie]. Estudios de Psicología, 57, 29–37.
Alonso-Quecuty, M. L., Hernández-Fernaud, E. and Campos, L. (1997). Child witnesses: lying about something heard. In S. Redondo, V. Garrido, J. Pérez and R. Barberet (Eds.), Advances in Psychology and Law: International Contributions (pp. 129–135). Berlin: Walter de Gruyter.
Arntzen, F. (1993). Psychologie der Zeugenaussage. Systematik der Glaubwürdigkeitsmerkmale [Psychology of Eyewitness Testimony]. Munich: Beck.
Ben-Shakhar, G. and Furedy, J. J. (1990). Theories and Applications in the Detection of Deception. New York: Springer-Verlag.
Biland, C., Py, J. and Rimboud, S. (1999). Evaluer la sincérité d’un témoin grâce à trois techniques d’analyse, verbales et non verbale [Evaluating a witness’ sincerity with three verbal and nonverbal techniques]. Revue Européenne de Psychologie Appliquée, 49, 115–121.
Briggs, S. R., Cheek, J. M. and Buss, A. H. (1980). An analysis of the Self-Monitoring Scale. Journal of Personality and Social Psychology, 38, 679–686.
Buller, D. B. and Burgoon, J. K. (1994). Deception: strategic and nonstrategic communication. In J. A. Daly and J. M. Wiemann (Eds.), Strategic Interpersonal Communication (pp. 191–223). Hillsdale, NJ: Erlbaum.
Buller, D. B. and Burgoon, J. K. (1996). Interpersonal deception theory. Communication Theory, 6, 203–242.
Buller, D. B., Burgoon, J. K., Busling, A. and Roiger, J. (1996). Testing interpersonal deception theory: the language of interpersonal deception. Communication Theory, 6, 268–289.
DePaulo, B. M., Lindsay, J. J., Malone, B. E., Muhlenbruck, L., Charlton, K. and Cooper, H. (2003). Cues to deception. Psychological Bulletin, 129, 74–112.
Driscoll, L. N. (1994). A validity assessment of written statements from suspects in criminal investigations using the Scan Technique. Police Studies, 17, 77–88.
Fenigstein, A., Scheier, M. F. and Buss, A. H. (1975). Public and private self-consciousness: assessment and theory. Journal of Consulting and Clinical Psychology, 43, 522–527.
Fisher, R. P. and Geiselman, R. E. (1992). Memory Enhancing Techniques for Investigative Interviewing: the Cognitive Interview. Springfield, IL: Charles C. Thomas.
Gale, A. (Ed.) (1988). The Polygraph Test. Lies, Truth and Science. London: Sage.
Garrido, E. and Masip, J. (2001). La evaluación psicológica en los supuestos de abusos sexuales [Psychological assessment in sexual abuse cases]. In F. Jiménez (Ed.), Evaluación Psicológica Forense 1: Fuentes de Información, Abusos Sexuales, Testimonio, Peligrosidad y Reincidencia (pp. 25–140). Salamanca: Amarú.
Geiselman, R. E., Fisher, R. P., Firstenberg, I., Hutton, L. A., Sullivan, S. J., Avettissian, I. and Prosk, A. (1984). Enhancement of eyewitness memory: an empirical evaluation of the cognitive interview. Journal of Police Science and Administration, 12, 74–80.
Granhag, P. A., Strömwall, L. and Olsson, C. (2001). Fact or fiction? Adults’ ability to assess children’s veracity. Paper presented at the 11th European Conference on Psychology and Law, Lisbon, Portugal, June 2001.
Hernández-Fernaud, E. and Alonso-Quecuty, M. L. (1997a). La conducta engañosa: el riesgo de identificarla con la mentira en el contexto legal [Deceptive behavior: the risk of mistaking it as deception in legal contexts]. In F. Fariña and R. Arce (Eds.), Psicología e Investigación Judicial (pp. 41–62). Madrid: Fundación Universidad-Empresa.
Hernández-Fernaud, E. and Alonso-Quecuty, M. L. (1997b). The cognitive interview and lie detection: a new magnifying glass for Sherlock Holmes? Applied Cognitive Psychology, 11, 55–68.
Höfer, E., Akehurst, L. and Metzger, G. (1996). Reality monitoring: a chance for further development of the CBCA? Paper presented at the 6th European Conference on Psychology and Law, Sienna, Italy, August 1996.
Honts, C. R. (1994). Assessing children’s credibility: scientific and legal issues in 1994. North Dakota Law Review, 70, 879–903.
Johnson, M. K. and Raye, C. L. (1981). Reality monitoring. Psychological Review, 88, 67–85.
Johnson, M. K. and Suengas, A. (1989). Reality monitoring judgments of other people’s memories. Bulletin of the Psychonomic Society, 27, 107–110.
Johnson, M. K., Foley, M. A., Suengas, A. and Raye, C. L. (1988). Phenomenal characteristics of memories for perceived and imagined autobiographical events. Journal of Experimental Psychology: General, 117, 371–376.
Johnson, M. K., Hashtroudi, S. and Lindsay, D. S. (1993). Source monitoring. Psychological Bulletin, 114, 3–29.
Kerr, P. (1990). The Penguin Book of Lies. New York: Penguin.
Kleiner, M. (Ed.) (2002). Handbook of Polygraph Testing. San Diego, CA: Academic Press.
Köhnken, G. (1990). Glaubwürdigkeit [Credibility]. Munich: Psychologie Verlags Union.
Kuiken, D. (1981). Nonimmediate language style and inconsistency between private and expressed evaluations. Journal of Experimental Social Psychology, 17, 183–196.
Küpper, B. and Sporer, S. L. (1995). Beurteilerübereinstimmung bei Glaubwürdigkeitsmerkmalen: Eine empirische Studie [Inter-rater agreement for credibility criteria: an empirical study]. In G. Bierbrauer, W. Gottwald and B. Birnbreier-Stahlberger (Eds.), Verfahrensgerechtigkeit – Rechtspsychologische Forschungsbeiträge für die Justizpraxis (pp. 187–213). Köln: Otto Schmidt Verlag.
Lamb, M. E., Sternberg, K. J., Esplin, P. W., Hershkowitz, I. and Orbach, Y. (1997). Assessing the credibility of children’s allegations of sexual abuse: a survey of recent research. Learning and Individual Differences, 9, 175–194.
Larson, J. A. (1969). Lying and Its Detection. A Study of Deception and Deception Tests. Montclair, NJ: Patterson Smith.
Lesce, A. (1990). SCAN: Deception detection by Scientific Content Analysis. Law and Order, 8 (retrieved 3 July 2003, from http://www.lsiscan.com/id37_m.htm).
Lykken, D. T. (1998). A Tremor in the Blood. Uses and Abuses of the Lie Detector. New York: Plenum.
Masip, J. and Garrido, E. (2000). La evaluación de la credibilidad del testimonio en contextos judiciales a partir de indicadores conductuales [Credibility assessment of testimony in judicial contexts from behavioral indicators]. Anuario de Psicología Jurídica, 10, 93–131.
Masip, J., Garrido, E. and Herrero, C. (2002). La detección de la mentira mediante la técnica SCAN [The detection of deception with the SCAN technique]. Revista de Psicopatología Clínica, Legal y Forense, 2, 39–62.
Masip, J., Garrido, E. and Herrero, C. (submitted a). The history of lie detection, I. Early unscientific procedures.
Masip, J., Garrido, E. and Herrero, C. (submitted b). The history of lie detection, II. Early and current scientifically-based procedures.
Miller, G. R. and Stiff, J. B. (1993). Deceptive Communication. Newbury Park, CA: Sage.
Porter, S. and Yuille, J. C. (1996). The language of deceit: an investigation of the verbal clues to deception in the interrogation context. Law and Human Behavior, 20, 443–458.
Raskin, D. C. and Esplin, P. W. (1991a). Statement Validity Assessment: interview procedures and content analysis of children’s statements of sexual abuse. Behavioral Assessment, 13, 265–291.
Raskin, D. C. and Esplin, P. W. (1991b). Assessment of children’s statements of sexual abuse. In J. Doris (Ed.), The Suggestibility of Children’s Recollections (pp. 153–164). Washington, DC: American Psychological Association.
Ruby, C. L. and Brigham, J. C. (1997). The usefulness of the Criteria-Based Content Analysis technique in distinguishing between truthful and fabricated allegations. A critical review. Psychology, Public Policy and Law, 3, 705–737.
Santtila, P., Roppola, H. and Niemi, P. (1999). Assessing the truthfulness of witness statements made by children (aged 7–8, 10–11, and 13–14) employing scales derived from Johnson and Raye’s model of Reality Monitoring. Expert Evidence, 6, 273–289.
Santtila, P., Roppola, H., Runtti, M. and Niemi, P. (2000). Assessment of child witness statements using Criteria-based Content Analysis (CBCA): the effects of age, verbal ability, and interviewer’s emotional style. Psychology, Crime & Law, 6, 159–179.
Schooler, J. W., Gerhard, D. and Loftus, E. F. (1986). Qualities of the unreal. Journal of Experimental Psychology: Learning, Memory and Cognition, 12, 171–181.
Schooler, J. W., Clark, C. A. and Loftus, E. F. (1988). Knowing when memory is real. In M. M. Gruneberg, P. E. Morris and R. N. Sykes (Eds.), Practical Aspects of Memory: Current Research and Issues, Vol. 1: Memory in Everyday Life (pp. 83–88). Chichester: Wiley.
Smith, N. (2001). Reading Between the Lines: an Evaluation of the Scientific Content Analysis Technique (SCAN). London: Home Office – Policing and Reducing Crime Unit (retrieved 3 July 2003, from http://www.homeoffice.gov.uk/rds/prgpdfs/prs135.pdf).
Sporer, S. L. (1982). A brief history of the psychology of testimony. Current Psychological Reviews, 2, 323–339.
Sporer, S. L. (1983). Content criteria of credibility: the German approach to eyewitness testimony. Paper presented in G. S. Goodman (Chair), The Child Witness: Psychological and Legal Issues. Symposium presented at the 91st Annual Convention of the American Psychological Association in Anaheim, California, August 1983.
Sporer, S. L. (1997). The less travelled road to truth: verbal cues in deception detection in accounts of fabricated and self-experienced events. Applied Cognitive Psychology, 11, 373–397.
Sporer, S. L. (1998). Detecting deception with the Aberdeen Report Judgment Scales (ARJS): Theoretical development, reliability and validity. Paper presented at the Biennial Meeting of the American Psychology–Law Society, Redondo Beach, CA, March 1998.
Sporer, S. L. (in press). Reality monitoring and the detection of deception. In P.-A. Granhag and L. Strömwall (Eds.), Deception Detection in Forensic Contexts. Cambridge: Cambridge University Press.
Sporer, S. L. and Hamilton, S. C. (1996). Should I believe this? Reality monitoring of invented and self-experienced events from early and late teenage years. Poster presented at the NATO Advanced Study Institute. Port de Bourgenay, France, June 1996.
Sporer, S. L. and Küpper, B. (1995). Realitätsüberwachung und die Beurteilung des Wahrheitsgehaltes von Erzählungen: Eine experimentelle Studie [Reality monitoring and the judgment of credibility of stories: an experimental study]. Zeitschrift für Sozialpsychologie, 26, 173–193.
Sporer, S. L. and Schwandt, B. (2002). Nonverbal indicators of deception: a meta-analytic synthesis. Paper presented at the Biennial Meeting of the American Psychology–Law Society in Austin, Texas, March 2002.
Sporer, S. L. and Schwandt, B. (2003). Paraverbal indicators of deception: a meta-analysis. Paper presented at the Joint Meeting of the American and European Psychology–Law Society in Edinburgh, Scotland, July 2003.
Sporer, S. L., Küpper, B. and Bursch, S. E. (1995). Hilft Wissen über Realitätsüberwachung, um zwischen wahren und erfundenen Geschichten zu unterscheiden? [Does knowledge about reality monitoring help to discriminate between true and invented stories?]. Paper presented at the 37th Tagung experimentell arbeitender Psychologen in Bochum, April 1995.
Sporer, S. L., Bursch, S. E., Schreiber, N., Weiss, P. E., Höfer, E., Sievers, K. and Köhnken, G. (2000). Detecting deception with the Aberdeen Report Judgement Scales (ARJS): Inter-rater reliability. In A. Czerederecka, T. Jaskiewicz-Obydzinska and J. Wojcikiewicz (Eds.), Forensic Psychology and Law. Traditional Questions and New Ideas (pp. 197–204). Cracow: Institute of Forensic Research Publishers.
Steller, M. and Boychuk, T. (1992). Children as witnesses in sexual abuse cases: investigative interview and assessment techniques. In H. Dent and R. Flin (Eds.), Children as Witnesses (pp. 47–71). Chichester: Wiley.
Steller, M. and Köhnken, G. (1989). Criteria-based statement analysis. In D. C. Raskin (Ed.), Psychological Methods in Criminal Investigation and Evidence (pp. 217–245). New York: Springer.
Steller, M., Wellershaus, P. and Wolf, T. (1988). Empirical validation of Criteria-Based Content Analysis. Paper presented at the NATO Advanced Institute on Credibility Assessment, Maratea, Italy, June 1988.
Suengas, A. (1991). El origen de los recuerdos [The origin of memories]. In J. M. Ruiz-Vargas (Ed.), Psicología de la Memoria (pp. 407–427). Madrid: Alianza.
Tabachnick, B. G. and Fidell, L. S. (1996). Using Multivariate Statistics, 3rd edn. New York: Harper Collins.
Trankell, A. (1971/1972). Reliability of Evidence. Stockholm: Rotobeckmann (translation of German edn, 1971; orig. Swedish edn 1963/1971).
Trovillo, P. V. (1939). A history of lie detection. Journal of Criminal Law and Criminology, 29, 848–881.
Undeutsch, U. (1989). The development of statement reality analysis. In J. C. Yuille (Ed.), Credibility Assessment (pp. 101–119). Dordrecht: Kluwer Academic Publishers.
Vrij, A. (2000). Detecting Lies and Deceit. The Psychology of Lying and the Implications for Professional Practice. Chichester: Wiley.
Vrij, A. (2002). Criteria-Based Content Analysis: a qualitative review of the first 37 studies. Paper presented at the 12th European Conference on Psychology and Law, Leuven, Belgium, September 2002.
Vrij, A. and Akehurst, L. (1998). Verbal communication and credibility: Statement Validity Assessment. In A. Memon, A. Vrij and R. Bull (Eds.), Psychology and Law. Truthfulness, Accuracy and Credibility (pp. 3–31). New York: McGraw Hill.
Vrij, A., Edward, K., Roberts, K. and Bull, R. (2000). Detecting deceit via analysis of verbal and nonverbal behavior. Journal of Nonverbal Behavior, 24, 239–263.
Vrij, A., Edward, K. and Bull, R. (2001). Stereotypical verbal and nonverbal responses while deceiving others. Personality and Social Psychology Bulletin, 27, 899–909.
Wiener, M. and Mehrabian, A. (1968). Language within Language: Immediacy, a Channel in Verbal Communication. New York: Appleton Century Crofts.
Yuille, J. C. (1989). Preface. In J. C. Yuille (Ed.), Credibility Assessment (pp. vii–xii). Dordrecht: Kluwer Academic.
Zuckerman, M. and Driver, R. E. (1985). Telling lies: verbal and nonverbal correlates of deception. In A. W. Siegman and S. Feldstein (Eds.), Multichannel Integrations of Nonverbal Behaviors (pp. 129–147). Hillsdale, NJ: Erlbaum.