Detecting Deception in Person-of-Interest Statements

Christie Fuller, David P. Biros, Mark Adkins, Judee K. Burgoon, Jay F. Nunamaker Jr., and Steven Coulon

Spears School of Business, Oklahoma State University, Stillwater, OK 74078, USA
{Christie.Fuller, David.Biros}@okstate.edu, {madkins, jburgoon, jnunamaker, scoulon}@cmi.arizona.edu

Abstract. Most humans cannot detect lies at a rate better than chance. Alternative methods of deception detection may increase accuracy, but they are intrusive, do not offer immediate feedback, or may not be useful in all situations. Automated classification methods have been suggested as an alternative to address these issues, but few studies have tested their utility with real-world, high-stakes statements. The current paper reports preliminary results from the classification of statements from actual security police investigations, collected under high-stakes conditions, and proposes stages for conducting future analyses.

1 Introduction

Deception has previously been defined as “a message knowingly transmitted by a sender to foster a false belief or conclusion by the receiver” [1]. Despite thousands of years of attempts to detect deception [2], humans have not proven to be very capable lie detectors: most cannot detect lies at a rate better than chance [3]. Only a few groups of professionals, such as secret service agents, exceed chance levels, reaching accuracy as high as 73% [4]. Several alternative methods exist for deception detection, including the polygraph, Statement Validity Analysis, and Reality Monitoring [3]. However, these methods are intrusive or fail to provide immediate feedback. Accurate, non-invasive methods are needed to address these shortcomings. Automated classification methods have been introduced into deception research as a possible alternative [5, 6]. The focus of the current research is to develop a classifier for the analysis of written statements, trained and tested on a corpus of actual statements concerning transgressions.

2 Background

A number of deception theories can be used to guide a systematic analysis of linguistic information as it relates to deception [6]. Interpersonal deception theory (IDT) [1] focuses on the interaction between participants in a communicative act. Although it was intended to apply more to oral deception than to written statements, its emphasis on the strategic nature of communication underscores the importance of considering how and why people manage the information in the messages they produce.


A component of IDT, information management [7], proposes dimensions of message manipulation similar to those of information manipulation theory (IMT) [8]. Relevant features of these perspectives that lend themselves to automated analysis include quantity of message units, such as words or sentences; qualitative content features, such as amount of detail; clarity, including specificity of language or presence of uncertainty terms; and personalization, such as use of personal pronouns. IDT has been applied in multiple text-based studies. For example, Zhou, Burgoon, Twitchell, Qin and Nunamaker [5] argued that IDT can be used in leaner mediated channels and applied it in an analysis of four methods for classifying deception in written statements. They organized linguistic indicators into the following categories: quantity, specificity, affect, expressivity, diversity, complexity, uncertainty, informality, and nonimmediacy. In [9], IDT was used as the framework to guide an examination of dynamic changes in text-based deception. Based on these previous investigations, potential deception indicators from the Zhou et al. categorization scheme were selected to classify written statements from criminal investigations as truthful or deceptive.
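To make these dimensions concrete, the sketch below computes a handful of such cues (quantity counts, lexical diversity, and non-self references) using deliberately simplified tokenization and word lists; it illustrates the kind of feature computation involved, not the GATE-based extraction described in the next section.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified cue extraction: illustrative definitions only, not the
// processing resources used in the actual A99A pipeline.
public class CueSketch {

    public static void main(String[] args) {
        String statement = "I was not there. You saw someone else near the office.";

        // Quantity: token and sentence counts from naive splitting.
        List<String> tokens = Arrays.asList(statement.toLowerCase()
                .replaceAll("[^a-z ]", " ").trim().split("\\s+"));
        int wordCount = tokens.size();
        int sentenceCount = statement.split("[.!?]+").length;

        // Diversity: lexical diversity = unique tokens / total tokens.
        Set<String> types = new HashSet<>(tokens);
        double lexicalDiversity = (double) types.size() / wordCount;

        // Personalization: count of non-self (2nd/3rd person) pronouns.
        Set<String> nonSelf = new HashSet<>(Arrays.asList(
                "you", "your", "he", "she", "they", "them"));
        long nonSelfRefs = tokens.stream().filter(nonSelf::contains).count();

        System.out.printf("words=%d sentences=%d diversity=%.2f nonSelf=%d%n",
                wordCount, sentenceCount, lexicalDiversity, nonSelfRefs);
    }
}
```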

3 Method

An important issue in text-based message analysis is the extent to which results generated from laboratory data collected under low-stakes conditions generalize to actual deceptive messages produced under high-stakes conditions. The current investigation addressed this issue by analyzing person-of-interest statements related to criminal incidents.

3.1 Message Feature Mining

Message feature mining [10] was used to classify the documents as truthful or deceptive. This process has two main steps: feature extraction and classification. Key aspects of the feature extraction phase are choosing appropriate features, or cues, and calculating those features over the desired text portions. Key components of the classification phase are data preparation, choosing an appropriate classifier, and training and testing the model. General Architecture for Text Engineering (GATE) [11] was the primary tool used for the feature extraction step. Two main facilities of GATE are language resources, such as corpora and ontologies, and processing resources, such as tokenizers and part-of-speech taggers. The Waikato Environment for Knowledge Analysis (WEKA) [12] was used for the classification step. GATE and WEKA are both open-source, Java-based programs. While GATE already includes processing resources that accommodate some cues to deception, it can be modified to incorporate additional cues. Further, GATE includes a WEKA wrapper class, which facilitates using the two programs in combination to conduct message feature mining. We combined WEKA and GATE into a program called the Agent99 Analyzer (A99A).
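As a minimal sketch of the classification step, the following assumes the extracted cue values have already been exported to a WEKA-readable ARFF file (the file name "cues.arff" and a class attribute in the last position are assumptions); the paper does not specify A99A's actual classifier configuration, so a stock WEKA neural network with cross-validation stands in here.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.functions.MultilayerPerceptron;
import weka.core.Instances;

// Sketch of the WEKA side of message feature mining: load per-statement
// cue vectors (hypothetical file "cues.arff", class = truthful/deceptive),
// then train and evaluate a neural network classifier.
public class ClassifySketch {
    public static void main(String[] args) throws Exception {
        Instances data = new Instances(
                new BufferedReader(new FileReader("cues.arff")));
        data.setClassIndex(data.numAttributes() - 1); // class is last attribute

        MultilayerPerceptron nn = new MultilayerPerceptron();

        // 10-fold cross-validation rather than a single train/test split.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(nn, data, 10, new Random(1));
        System.out.println(eval.toSummaryString());
        System.out.printf("Overall accuracy: %.2f%%%n", eval.pctCorrect());
    }
}
```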


3.2 Data Collection

A pilot study was conducted on a set of 18 criminal incident statements provided by an Air Force investigative service. The statements were taken from cases that occurred within the last few years and were determined to be truthful or deceptive by the polygraph division of the investigative service. Each statement was written by a subject in the course of a criminal investigation. For deceptive statements, the investigators found additional evidence to suggest the subject was lying and ordered a polygraph. Under polygraph, the deceivers recanted their statements, confirming that the original accounts of the incidents were lies. The polygraph also confirmed the veracity of the truthful statements. A classification model using those criminal incident statements achieved an accuracy of 72% [13]. A99A was deemed feasible and effective, and some minor adjustments were made to the process before the multi-phased main experiment reported here commenced.

This study is currently in Phase One of three planned phases. In Phase One, 307 known truthful and deceptive statements have been collected and codified to train A99A. The data for this study are criminal incident statements from 2002 to the present, collected in cooperation with Security Forces Squadron personnel from two military bases. Criminal incident statements are official reports written by a subject or witness in an investigation. In some cases, individuals involved in incidents have lied in their statements in an attempt to avoid prosecution; when the attempted deception is discovered by security forces personnel during their investigation, such statements are known as “false official statements.” False official statements provide invaluable data for this study. Unlike mock lies, they are confirmed deceptive statements in which the deceiver attempted to avoid prosecution in a criminal incident (or was aiding and abetting another’s attempt to avoid prosecution). Ground truth for truthful statements is established where the evidence or eventual outcome of the case supports the person’s statement. Standardized procedures were used to transcribe the statements in preparation for automatic classification.

Text-based modeling needs a large sample size relative to the number of indicators; it is therefore prudent to prune the possible set of predictors to a smaller “best” set. A series of t-tests identified the 17 discriminating variables noted in Table 1 below. Though the data set currently has many more truthful than deceptive statements, these results show that the predictors are largely consistent for both the full sample (N = 307) of statements and a reduced sample (N = 82) consisting of equal numbers of truthful and deceptive statements.

To analyze the balanced set of 41 truthful and 41 deceptive statements with A99A, a neural network model was implemented using WEKA to classify the statements as truthful or deceptive. On this balanced data set of 82 statements, A99A had a 68.3% success rate at identifying deceptive statements and a 75.6% success rate at identifying truthful statements, for an overall accuracy of 71.95%.
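The overall figure follows directly from the per-class rates on the balanced set: 68.3% of 41 deceptive statements is 28 correct classifications, and 75.6% of 41 truthful statements is 31, so

\[
\text{accuracy} = \frac{28 + 31}{82} = \frac{59}{82} \approx 71.95\%.
\]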


Table 1. Significant Discriminating Variables

CATEGORY         VARIABLE                 SIGNIFICANT IN        SIGNIFICANT IN
                                          BALANCED SAMPLE       FULL SAMPLE
                                          (N = 82)              (N = 307)
------------------------------------------------------------------------------
Quantity         Word count                      *                    *
                 Verb count                      *                    *
                 Sentence count                  *                    *
                 Modifier count                  *                    *
Specificity      Affect ratio                    *                    *
                 Sensory ratio                   *                    *
Diversity        Lexical diversity               *                    *
                 Redundancy                                           *
                 Content word diversity                               *
Personalization  Non-self references             *                    *
                 2nd person pronouns             *                    *
Nonimmediacy     Other references                *                    *
                 Group pronouns                  *                    *
                 Immediacy terms                 *                    *
                 Spatial far terms                                    *
                 Temporal nonimmediacy                                *
                 Passive voice                   *                    *
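A sketch of the t-test pruning step that produced Table 1 is shown below; the statistics library (Apache Commons Math) and the cue values are stand-ins, since the paper does not say how the tests were computed.

```java
import org.apache.commons.math3.stat.inference.TTest;

// Illustrative predictor pruning: keep a cue only if a two-sample t-test
// finds a significant difference between truthful and deceptive groups.
// The cue values below are placeholder numbers, not real extractions.
public class PruneSketch {
    public static void main(String[] args) {
        TTest tTest = new TTest();

        double[] truthfulWordCounts  = {210, 185, 240, 198, 260, 175};
        double[] deceptiveWordCounts = {120, 140, 100, 160, 130, 115};

        double p = tTest.tTest(truthfulWordCounts, deceptiveWordCounts);
        boolean keep = p < 0.05; // retain cue as a discriminating variable
        System.out.printf("word count: p=%.4f -> %s%n",
                p, keep ? "keep" : "drop");
    }
}
```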

Phase One, the calibration phase, will continue until a balanced data set of 200 statements has been assessed by A99A. This sample size is based on neural network heuristics for achieving generalizable results [14]: five to ten statements are needed per weight in the neural network, and based on the estimated network size, a balanced sample of 200 statements has been deemed sufficient.

Once Phase One is complete, Phase Two, the testing phase, will begin. In this phase A99A will run parallel to the security force incident investigations. A99A will review statements and predict their veracity, but as a blind study: the security forces will not receive the A99A predictions, nor will A99A receive the outcomes of the security force investigations. At the end of this period, the two methods will be compared.

Phase Three, the final phase, will see A99A used in a predictive mode. Incident statements will be processed through A99A before security forces personnel investigate the incident. It is hoped that the analyzer will decrease the amount of time the SF personnel must spend on each incident.
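As a rough illustration of the Phase One sample-size heuristic (the architecture here is an assumption, since the paper does not report the network's size): with 17 input cues, a single hidden layer of two units, and one output, a fully connected network has

\[
(17 + 1) \times 2 + (2 + 1) \times 1 = 39 \text{ weights},
\]

so at five statements per weight the heuristic calls for roughly \(5 \times 39 = 195 \approx 200\) statements.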

4 Expected Results and Implications of Study

Previous studies in automated detection of ‘mock lies’ have achieved accuracy levels of approximately 80% [5]. A pilot study using criminal incident statements achieved an accuracy of 72% [13], although its sample size was only 18 statements. The current study has matched that accuracy rate. As the sample size increases and the set of included cues is refined, it is anticipated that accuracy will improve. The cues found to be significant in this study are drawn from a larger set developed in previous research and may not be significant in all contexts; however, if their significance can be replicated and verified within the domain of criminal interrogations, this would be an important finding [8].

It is anticipated that the classifier developed by this study may eventually be used by military personnel to augment current investigative tools, such as the polygraph. The tool might be useful when an investigator would like to quickly determine which statements in a group might be deceptive. Another suggested use is for secondary screening interviews within transportation systems [15]. Given the cost, intrusiveness, and lengthy process associated with other forms of deception detection, we believe our approach will provide military and civilian law enforcement entities a useful tool to facilitate their investigations.

References

1. Buller, D.B., Burgoon, J.K.: Interpersonal Deception Theory. Communication Theory 6 (1996) 203-242
2. Trovillo, P.V.: A History of Lie Detection. Journal of Criminal Law & Criminology 29 (1939) 848-881
3. Vrij, A., Edward, K., Roberts, K.P., Bull, R.: Detecting Deceit Via Analysis of Verbal and Nonverbal Behavior. Journal of Nonverbal Behavior 24 (2000) 239-263
4. Ekman, P., O'Sullivan, M., Frank, M.G.: A Few Can Catch a Liar. Psychological Science 10 (1999) 263-265
5. Zhou, L., Burgoon, J.K., Twitchell, D.P., Qin, T., Nunamaker Jr., J.F.: A Comparison of Classification Methods for Predicting Deception in Computer-Mediated Communication. Journal of Management Information Systems 20 (2004) 139-163
6. Zhou, L., Burgoon, J.K., Nunamaker, J.F., Twitchell, D.: Automating Linguistics-Based Cues for Detecting Deception in Text-Based Asynchronous Computer-Mediated Communications. Group Decision and Negotiation 13 (2004) 81-106
7. Zuckerman, M., DePaulo, B.M., Rosenthal, R.: Verbal and Nonverbal Communication of Deception. In: Berkowitz, L. (ed.): Advances in Experimental Social Psychology. Academic Press, New York (1981) 1-59
8. DePaulo, B.M., Lindsay, J.J., Malone, B.E., Muhlenbruck, L., Charlton, K., Cooper, H.: Cues to Deception. Psychological Bulletin 129 (2003) 74-118
9. Zuckerman, M., Driver, R.E.: Telling Lies: Verbal and Nonverbal Correlates of Deception. In: Multichannel Integration of Nonverbal Behavior. Erlbaum, Hillsdale, NJ (1987)
10. Adkins, M., Twitchell, D.P., Burgoon, J.K., Nunamaker Jr., J.F.: Advances in Automated Deception Detection in Text-Based Computer-Mediated Communication. In: Enabling Technologies for Simulation Science VIII (2004)
11. Cunningham, H.: GATE, a General Architecture for Text Engineering. Computers and the Humanities 36 (2002) 223-254
12. Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques with Java. Morgan Kaufmann, San Francisco (2000)
13. Twitchell, D., Biros, D.P., Forsgren, N., Burgoon, J.K., Nunamaker Jr., J.F.: Assessing the Veracity of Criminal and Detainee Statements: A Study of Real-World Data. In: 2005 International Conference on Intelligence Analysis (2005)
14. Sarle, W.: What are Cross-Validation and Bootstrapping? (2004) Available from: http://www.faqs.org/faqs/aifaq/neural-nets/part3/section-12.html
15. Twitchell, D., Jensen, M.L., Burgoon, J.K., Nunamaker Jr., J.F.: Detecting Deception in Secondary Screening Interviews Using Linguistic Analysis. In: Proceedings of the 7th International IEEE Conference on Intelligent Transportation Systems (2004) 118-123


8. DePaulo, B.M., Lindsay, J.J., Malone, B.E., Muhlenbruck, L., Charlton, K., Cooper, H.: Cues to Deception. Psychological Bulletin. 129 (2003) 74-118 9. Zuckerman, M., Driver, R.E.: Telling Lies: Verbal and Nonverbal Correlates of Deception. In Multichannel Intergration of Nonverbal Behavior. 1987, Erlbaum: Hillsdale, NJ. 10. Adkins, M. Twitchell, D.P., Burgoon, J.K., Nunamaker Jr, J.F.: Advances in Automated Deception Detection in Text-based Computer-mediated Communication. In: Enabling Technologies for Simulation Science VIII. (2004) 11. Cunningham, H.: GATE, a General Architecture for Text Engineering. Computers and the Humanities. 36 (2002) 223-254 12. Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques with Java. Morgan Kaufman, San Francisco (2000) 13. Twitchell, D., Biros, D.P., Forsgren, N., Burgoon, J.K., Nunamaker, Jr, J.F.: Assessing the Veracity of Criminal and Detainee Statements: A Study of Real-World Data. In: 2005 International Conference on Intelligence Analysis (2005) 14. Sarle, W.: What are Cross-Validation and Bootstrapping? [cited 2005; Available from: http://www.faqs.org/faqs/aifaq/neural-nets/part3/section-12.html. (2004) 15. Twitchell, D., Jensen, M.L., Burgoon, J.K., Nunamaker, Jr, J.F.: Detecting deception insecondary screening interviews using linguistic analysis. In Proceedings of The 7th International IEEE Conference on Intelligent Transportation Systems. (2004) 118-123
