Journal of Investigative Psychology and Offender Profiling J. Investig. Psych. Offender Profil. (2012) Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/jip.1366
The Effects of Coding Bias on Estimates of Behavioural Similarity in Crime Linking Research of Homicides
TOM PAKKANEN1,*, ANGELO ZAPPALÀ1,2, CAROLINE GRÖNROOS1 and PEKKA SANTTILA1
1 Åbo Akademi University, Åbo (Turku), Finland
2 Centre of Forensic Science, Turin, Italy
Abstract
This study explored whether a coding bias due to knowledge of which crimes have been committed by the same offender exists when behavioural variables are coded in serial murder cases. The study used an experimental approach where the information given to the participants (N = 60) concerning correct linkages between a number of murder series was manipulated. The participants were divided into three different groups (n = 20 in each). These three groups received correct, incorrect, or no information about the linked series prior to the coding. The results showed that there is no clear evidence to support the hypothesis of a bias in the coding. The risk of expectancy effects and suggestions on how to minimise them in behavioural crime linking research were discussed, and suggestions on how to improve the validity of possible future replications of the experiment were given. The practical implications of expectancy effects on behavioural crime linking decisions for the justice system were also discussed. Copyright © 2012 John Wiley & Sons, Ltd.
Key words: crime linking; expectancy effect; coding bias; serial homicide; offender profiling
*Correspondence to: Tom Pakkanen, Åbo Akademi University, Åbo (Turku), Finland. E-mail: tom.pakkanen@abo.fi
INTRODUCTION
In police investigations, behavioural similarity of crimes is sometimes used to identify a crime series suspected to have been committed by the same offender (Woodhams, Hollin, & Bull, 2007). Correctly identifying series of crimes is an effective investigative strategy because a minority of perpetrators commit the majority of crimes. Methods used to identify crimes committed by the same offender based on the analysis of behavioural similarity are referred to as behavioural crime linking (Grubin, Kelly, & Brunsdon, 2001; Santtila, Pakkanen, Zappalà, Bosco, Valkama, & Mokros, 2008; Woodhams et al., 2007). Research carried out in behavioural crime linking, where coders of the data use coding booklets to identify behavioural crime scene variables, has the potential problem of giving biased results, as the coders are most commonly aware of which crimes have been committed by the same offender. As a result, the research might overestimate behavioural similarity in serial crime, distorting behavioural science experts’ conclusions regarding the potential of crime linking. The present experiment aimed at testing whether prior knowledge of which crimes have been committed by the same offender affects the coders’ decisions when coding individual behavioural variables in favour of increased perceived (i.e. coded) behavioural similarity between the linked murders.
Behavioural crime linking
When utilising crime linking, one tries to draw conclusions about whether several crimes have been committed by the same perpetrator (Grubin et al., 2001). The advantage of behavioural crime linking over more traditional methods, such as DNA analyses, is that the latter are highly expensive and time-consuming (Craik & Patrick, 1994), and less readily available at most crime scenes (Grubin, Kelly, & Ayis, 1997). Behavioural crime linking, like offender profiling, has its roots in theories of personality (Woodhams et al., 2007). Two critical assumptions underlie behavioural crime linking. First, that offenders’ behaviour across crimes is consistent, and second, that there is variation between individual offenders; in other words, they behave differently from each other (Alison, Bennell, Mokros, & Ormerod, 2002; Canter, 1995). The central hypothesis in behavioural crime linking is that such consistency and variability exist in the behaviour of offenders within a particular crime type (Bennell & Canter, 2002; Crabbé, Decoene, & Vertommen, 2008).
A number of studies have looked at behavioural crime linking in several types of crimes, using a variety of different methodologies. For example, Santtila, Fritzon, and Tamelander (2004) found that 33% of cases of arson were correctly linked on the basis of behaviour.
Behavioural crime linking has also been studied in cases of rape (Bennell, Jones, & Melnyk, 2009; Canter, 1995; Grubin et al., 1997; Grubin et al., 2001; Santtila, Junkkila, & Sandnabba, 2005) and burglary (Bennell & Canter, 2002; Bennell & Jones, 2005; Goodwill & Alison, 2006; Green, Booth, & Biderman, 1976). Some studies have also successfully linked together cases of murder, analysing crime scene behaviour. Salfati and Bateman (2005) were able to classify serial homicides committed in the US into expressive and instrumental themes, and found that the offenders were consistent with regard to this classification in their first three known offences. In their study of Italian serial murders (N = 116), Santtila et al. (2008) managed to correctly assign 63% of the cases to their correct series, using seven identified dimensions of variation in the offenders’ crime scene behaviour. Analysing the same data further with Bayesian reasoning, Salo and his research team (2012) correctly linked 84% of the cases. These results lend support to the notion that behavioural crime linking is possible and that it would make crime investigation more effective if used appropriately (Santtila et al., 2005).
The expectancy effect: A potential problem
The expectancy effect, also known as the observer effect, experimenter effect, and experimenter bias, refers to situations where the expectations of the researchers lead to biased conclusions about their findings in favour of their expectations (Rosenthal, 1966, 1994). Although the discussion about the development of behavioural crime linking methodology has been mostly concerned with different theoretical models and statistical methods (e.g. Alison et al., 2002; Crabbé et al., 2008; Woodhams et al., 2007), the issue of the experimenter effect has yet to receive attention. In a landmark study on the experimenter expectancy effect, Rosenthal and Fode (1963) demonstrated that experimenters, given manipulated prior knowledge of how well rats would perform in a maze, influenced their results in favour of the experimenters’ expectations. In another study, Rosenthal and Jacobson (1968) showed that teachers’ expectations influence the academic and classroom behaviour of their students. Known as the Pygmalion effect, it holds that the greater the expectation placed on the student, the better they perform. The pervasiveness of observer effects has since been demonstrated across a wide range of experiments and conditions (Risinger, Saks, Thompson, & Rosenthal, 2002).
In behavioural crime linking research, this could mean that the coders’ prior knowledge of which crimes belong to the same series might affect the behavioural similarity they perceive in the cases they code, thus introducing a systematic bias that weakens the validity and reliability of the research and subsequently the conclusions drawn from it. Sheldrake (1998) reviewed 72 scientific publications of experiments in psychology and found that only a small minority of them (6.9%) had used blind or double-blind methods to guard against the experimenter effect. The authors of the present study found no studies of behavioural crime linking where the researchers report blind methods with regard to their coding procedures.
This suggests that the expectancy effect has not been taken into consideration and that researchers (or, in studies where researchers have utilised pre-existing data, those who coded it) commonly know which crimes are linked when coding the behavioural variables. On the basis of systematic interviews, Sheldrake (1998) concluded that researchers tend to think of blind techniques as guarding mainly against biases introduced by human subjects rather than the experimenters themselves and that there often is a tacit assumption that the experimenter effects are negligible. Even with stringent coding schemes, where the behavioural analysis of the crime scene concentrates on observable behaviour that is coded using dichotomous variables (e.g. Salfati, 1998; Salfati & Bateman, 2005; Santtila et al., 2005; Santtila et al., 2008), there is still room for interpretation. It may, for example, be difficult to make decisions about variables such as ‘the victim was found naked’ and ‘the body of the victim was covered’: how many pieces of clothing have to be removed for the victim to be considered naked, or what percentage of the body has to be covered for the body to be considered covered? Having already coded homicides committed by the same offender, the coder, perhaps expecting consistency in the offender’s behaviour, may be prone to code certain behaviours in the same way they presented in the offender’s previous homicides.
The expectancy effect also poses a problem on a more practical level, when crime linking research is applied in the justice system and forensic experts are asked to comment on whether a particular crime is linked to one or several other crimes. In Daubert v. Merrell Dow Pharmaceuticals, Inc. (1993), the US Supreme Court stipulated that to be admissible, the testimony of an expert witness has to be the product of reliable scientific principles and methods.
The Daubert decision, along with a few subsequent rulings, brought about a substantial amendment to the Federal Rules of Evidence 702, which sets the standard for the admission of testimony by expert witnesses in federal courts in the US (FRE 702, 2011). Bosco, Zappalà, and Santtila (2010) reviewed court cases where expert testimony in the field of linkage analysis had been given and concluded that the most common reason for excluding expert testimony in these cases was the lack of demonstrable reliability of the methods used. In one particular ruling where the experts’ testimony had been excluded, the court noted that the expert had ignored the many differences between the two cases, concentrated on the similarities, and thus overestimated them. In their review of observer effects in forensic science, Risinger et al. (2002) proposed blind testing as the principal method of preventing distortions caused by expectation.
Aim and hypotheses
Because information about case linkage and perceived behavioural similarity lie at the very core of behavioural crime linking, and because the authors know of no study investigating this so far, an experiment was devised to test for experimenter bias. To test whether coders’ prior knowledge of which crimes had been committed by the same offender affected the similarity these coders perceived in the behaviour of the offender in linked crimes, 60 Italian university students were randomly assigned into three groups of 20 persons each. Each group was then given 10 cases of Italian serial murders, where five offenders had committed two offences each, and a list of behavioural crime scene variables to code from the cases. Prior to the coding task, the information about linkage status was manipulated between the groups. The first group, the Correctly Informed Group, was told which offenders had committed each murder. The second group, the Incorrectly Informed Group, was given false linkage information, and the third group, the Not Informed Group, was not given any information regarding linkage status.
The experiment was set up to test two hypotheses. The first hypothesis was that the Correctly Informed Group would code more similarity in the murders they knew belonged together than the Not Informed Group.
This would provide evidence for a coding bias, as the only difference between the groups was the fact that the Correctly Informed Group knows the linkages. Any additional similarity coded by them would, therefore, be due to a bias. The second hypothesis was that the Incorrectly Informed Group would code less similarity in the two murders that were actually linked, than the Not Informed Group would. This would also provide evidence for a coding bias, as the incorrect information would have led to more perceived similarity between actually unlinked offences. In other words, the perceived similarity would be smaller in the actual series for the Incorrectly Informed Group than the Not Informed Group.
METHOD
Participants
The participants in the crime linking experiment consisted of 60 students from three different areas in Italy. The age of the participants ranged from 23 to 55 years (M = 28 years). One-quarter (25%) of the participants were first year students from Reggio Emilia, studying child and adolescent psychotherapy. Another quarter (25%) were master’s degree students from the La Sapienza University in Rome, taking part in a course on theories and methods in crime investigation. Half of the participants (50%) were from the University of Pontificia Salesiana Roma Torino, Turin, who participated in an extension course on psychology in criminology and criminal investigation. All participants were students of Dr Angelo Zappalà, and none of them had prior knowledge of the aim or hypotheses of the experiment. None of the participants were familiar with the concept of behavioural consistency, and only a few had heard of the idea of linking crimes utilising the offenders’ observed crime scene behaviour (these concepts were taught in the courses after the completion of the experiment).
Materials
Each of the participants was given information on 10 murders in the form of a brief description: approximately 15 lines of text extracted, edited, and summarised from court transcripts of Italian cases of serial murder. The vignettes included information about the victim (age and gender), the modus operandi of the offender, the weapon used in the killing, wound pattern(s) on the victims, and the place where the body was found (Table 1). The cases were chosen from a larger set of Italian serial murders used by Santtila et al. (2008). The data consisted of murders committed by five offenders: two offences were chosen randomly from each offender’s series. All data were edited for maximum uniformity, and all unique information that could identify a case, such as dates, names of persons, and places, was removed. The participants were also given a coding scheme with 92 dichotomous variables (available on request from the authors) including situational variables (pertaining to the when and where of the homicide), behaviour observable at the crime scene (the use of weapons, binds, and gags; injuries of the victim; post-mortem activity such as moving, hiding, or destroying the body; etc.), and victim characteristics (age, gender, marital status, employment, known health issues, etc.). The coding scheme was based on research by Salfati (1998) and subsequently developed by Pakkanen, Santtila, Mokros, and Sandnabba (2006) and Santtila et al. (2008).
Table 1. Three examples of case vignettes excerpted from the court transcripts
Case 1: According to the reconstruction, the offender had approached his victim, a 64-year-old prostitute, and together they had gone to the aggressor’s home. After having had sexual intercourse, the offender strangled his victim with a rope, at a moment when the victim had turned her back to the offender. He then put the body in a bag, carried it to his car, and drove to the river XXX, a place not far from his home. Standing on the riverbank, the offender threw the body into the water.
Case 2: After getting the woman, a 44-year-old prostitute, into the car, the murderer had asked her to come home with him. She refused and he had agreed to stay with her in the car. Before the intercourse even started, while she was still undressing, the offender took advantage of the situation and shot a single shot that hit her in the head, killing her instantaneously. After the murder, he had carried the body to a nearby river. There was a cabin by the river, where he laid the body on a sofa and set it on fire. Everything was destroyed: the cabin, the sofa, the victim’s personal belongings and the body of the victim.
Case 3: In the afternoon of Sunday the 12th of April, a body was found in a train toilet by an employee of the railways. The victim was a 32-year-old, married, Italian nurse. The body was found fully clothed and the victim’s head was covered with a jacket. The offender had shot the victim in the head after covering her with a blanket. The toilet door was locked from the inside, and the victim was taken away by the train. The autopsy reported that the external examination of the body immediately revealed a lesion to the left retro auricular region of the head. It showed the typical features of a gunshot, a single bullet shot into the skin. The autopsy did not find any other signs of violence on the victim’s body. The corpse, however, still had a full bladder of urine which indicates that the attack took place immediately after the victim entered the toilet.
The first two homicides were committed by the same offender, whereas the third was committed by a second offender. The vignettes have been translated from Italian into English.
Procedure
All the participants (N = 60) were randomly assigned to three different groups, with 20 students in each group. Two of the groups were told that the offences had been committed by five offenders, and the groups were given information on which murderer had committed which offence. The first group got correct information about which pairs of murders belonged together (Correctly Informed Group), and the second group got incorrect information about which pairs of murders had been committed by the same offender (Incorrectly Informed Group). All participants in the latter group received the same erroneous information about linkage status; in other words, the pairs of murders they were given were actually not committed by the same perpetrator. The third and final group did not get any information about which murders were linked (Not Informed Group) and, presumably, thought that all the murders were single murders. The hypotheses were that by manipulating the information given to the groups about linkage status (independent variable), there would be a measurable difference in the groups’ perceived (i.e. coded) behavioural similarity of the cases (dependent variable).
Next, all the groups were asked to read the descriptions of one murder at a time and to code the variables in the provided scheme as present (1), absent (0), or missing (99) for each of the 10 cases. After the coding task was completed, the third group (Not Informed Group) was told that the murders had been committed by only five murderers and asked to identify the five pairs of homicides. This was done to check how easily the series could be identified and whether behavioural similarity was used intuitively as a clue for linking crimes.
The students from Rome and Turin got their instructions verbally from the leader of the experiment and filled out the forms in about 4 hours. These participants did not have a possibility to discuss the task with each other. The students from Reggio Emilia received their instructions and returned their results by e-mail. Although it is impossible to prove that these participants did not discuss the task with each other, on the basis of discussions with a subset of the students, the experimenters assumed they did not.
Statistical analyses
The coded behavioural variables were analysed with regard to behavioural similarity in the cases that were actually linked and compared between the groups. To calculate the behavioural similarity in the linked cases, the phi coefficient was used. Correlations were calculated separately for each subject and case in each experiment group, and the groups were compared pairwise to check for any differences between the groups. This was done using a generalised linear model repeated over subjects, checking if group allocation would predict differences in coded behavioural similarity. Inter-rater reliabilities were calculated separately for each group and case, using the Kuder–Richardson Formula 20, to check for variance in the coding. The correct linking decisions made by the Not Informed Group were calculated and compared with coded behavioural similarity, again using a generalised linear model, repeated over subjects, to check if coded behavioural similarity predicted whether a series was correctly linked or not.
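The phi coefficient mentioned under Statistical analyses can be illustrated with a minimal sketch. It compares the 0/1 codes that one coder gave to two cases. The function name `phi_coefficient`, the toy data, and the choice to skip variables coded as missing (99) are assumptions made for illustration; the source does not describe how missing codes were treated.

```python
import math

MISSING = 99  # code for "missing" in the coding scheme


def phi_coefficient(case_a, case_b):
    """Phi coefficient between two equally long lists of 0/1 codes.

    Positions coded MISSING in either case are skipped (an assumption;
    the source does not describe this step). Returns None when a
    marginal total is zero, because phi is undefined in that situation.
    """
    if len(case_a) != len(case_b):
        raise ValueError("both cases must code the same variables")
    n11 = n10 = n01 = n00 = 0
    for a, b in zip(case_a, case_b):
        if a == MISSING or b == MISSING:
            continue
        if a == 1 and b == 1:
            n11 += 1
        elif a == 1 and b == 0:
            n10 += 1
        elif a == 0 and b == 1:
            n01 += 1
        elif a == 0 and b == 0:
            n00 += 1
        else:
            raise ValueError("codes must be 0, 1 or 99")
    # Product of the four marginal totals of the 2 x 2 table.
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:
        return None
    return (n11 * n00 - n10 * n01) / math.sqrt(denom)


# Hypothetical codes for two cases on six variables (not study data).
first_case = [1, 1, 0, 0, 1, 99]
second_case = [1, 0, 0, 0, 1, 1]
print(phi_coefficient(first_case, second_case))
```

In the design described above, such a value would be obtained per participant for each actually linked pair and then averaged; that aggregation step is not shown here.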
RESULTS
There were five series of two murders for each of the groups to code. The mean coded behavioural similarity over all the five series was the variable studied. The means, measured using the phi coefficient, and standard errors of the mean for behavioural similarity are displayed in Table 2 and Figure 1. The results show no significant difference in the coded behavioural similarity in the series between the groups (Wald χ² = 4.13, p = .127). Contrary to the hypothesis, the Correctly Informed Group (M = .33, SE = .02) did not code more similarity in the series compared with the group with no information (M = .37, SE = .02). The highest perceived similarity within series was found in the Not Informed Group, but as the pairwise comparison between this group and the Correctly Informed Group shows (Table 3), the difference was not significant (Wald χ² = 2.94, p = .087). The Correctly Informed Group had a slightly higher perceived similarity than the Incorrectly Informed Group (M = .32, SE = .02), but the difference was small and also not statistically significant (Wald χ² = .07, p = .797). The biggest difference in perceived similarity between two groups was found in the comparison of the Incorrectly Informed Group and the Not Informed Group. Although the difference approached statistical significance (Wald χ² = 3.34, p = .068), it remained a tendency. Hence, no evidence for the experimenter bias could be found.
Inter-rater reliability was calculated separately for each group and case, and ranged from .74 (acceptable) to .99 (excellent). A summary of the inter-rater reliabilities can be seen in Table 4. The very high overall inter-rater reliability (M = .93) would suggest that the case excerpts were (too) easy to code.
Linking decisions were available for 17 participants of the Not Informed Group. A total of 52 correct linking decisions were made, and 33 series were linked erroneously. The mean number of correctly linked series per participant was three.
The hardest series to link seemed to be the second one, with only 41% of the participants linking the two murders correctly, whereas series 4 was the easiest with roughly three-quarters (76%) of the group getting it correct. See Table 5 for an overview. When comparing the linking decisions of the different series to the coded similarity of the same, there was a statistically significant relationship (B = −.9, SE = .04, p = .041), meaning that the participants’ linking decision outcome was dependent on the perceived behavioural similarity of the series. In other words, the cases with higher perceived similarity were the ones that were easier to link together.
Table 2. Means and standard errors of the mean for coded behavioural similarity, measured using the phi coefficient, for each pair of two murders in the three experiment groups
Group                         M     SE
Correctly Informed Group      .33   .02
Incorrectly Informed Group    .32   .02
Not Informed Group            .37   .02
Wald χ² = 4.13, p = .127.
Figure 1. Mean behavioural similarity in the linked offences for the three groups (Correctly Informed Group, Incorrectly Informed Group, and Not Informed Group). Error bars represent the standard errors of the mean. [Plot not reproduced; the vertical axis shows mean behavioural similarity in the linked offences, from 0.20 to 0.40.]
Table 3. Pairwise comparisons of the differences of the means of the phi coefficients for the experiment groups
Compared groups                                        Differences in means   SD     Wald χ²   p
Correctly Informed Group versus Not Informed Group     .047                   .027   2.94      .087
Incorrectly Informed Group versus Not Informed Group   .054                   .030   3.34      .068
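As a rough plausibility check of the reported test statistics, the following sketch converts a Wald chi-square value into a p value using only the Python standard library. The degrees of freedom are an assumption (2 for the overall comparison of three groups, 1 for each pairwise comparison), because the source does not state them, and the helper name `chi2_sf` is hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, for df = 1 or 2 only."""
    if x < 0:
        raise ValueError("the statistic must be non-negative")
    if df == 1:
        # For df = 1 the statistic is a squared standard normal variable.
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        # For df = 2 the chi-square distribution is exponential with mean 2.
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or df = 2 are handled in this sketch")


# (label, Wald chi-square, assumed df, p value as reported in the text)
reported = [
    ("Overall, three groups", 4.13, 2, 0.127),
    ("Correctly vs Not Informed", 2.94, 1, 0.087),
    ("Incorrectly vs Not Informed", 3.34, 1, 0.068),
]

for label, wald, df, p_reported in reported:
    p_computed = chi2_sf(wald, df)
    print(f"{label}: computed p = {p_computed:.3f}, reported p = {p_reported:.3f}")
```

Under these assumed degrees of freedom, the computed values agree with the reported p values to within about .001, which is the size of discrepancy one would expect from rounding of the published statistics.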
Table 4. Mean and range of the inter-rater reliabilities (Kuder–Richardson Formula 20) in the three experiment groups
Group                         Min   Max   M
Correctly Informed Group      .81   .92   .87
Incorrectly Informed Group    .74   .98   .95
Not Informed Group            .97   .99   .98
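For readers unfamiliar with the Kuder–Richardson Formula 20, a minimal sketch of the textbook formula for a table of 0/1 scores follows. How the table was arranged in the study is not stated in the source; one possible arrangement, assumed here purely for illustration, is one row per coded variable and one column per coder within a group and case. The function name `kr20` and the toy data are hypothetical, population variances are used throughout, and missing codes are not handled.

```python
def kr20(scores):
    """Kuder-Richardson Formula 20 for a rectangular table of 0/1 scores.

    scores: list of rows, one row per rated object, one column per item.
    Returns None when the row totals do not vary, because the
    coefficient is undefined in that case.
    """
    if not scores:
        raise ValueError("the table must contain at least one row")
    n_rows = len(scores)
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a rectangular table with at least two columns")
    # Sum over columns of p * (1 - p), where p is the share of 1s in a column.
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in scores) / n_rows
        pq_sum += p * (1.0 - p)
    # Population variance of the row totals.
    totals = [sum(row) for row in scores]
    mean_total = sum(totals) / n_rows
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_rows
    if var_total == 0:
        return None
    return (k / (k - 1)) * (1.0 - pq_sum / var_total)


# Hypothetical table: four coded variables (rows) by three coders (columns).
toy_table = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(toy_table))
```

With complete agreement between the columns the coefficient reaches 1, which is the direction in which the values in Table 4 point.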
Table 5. The amount and percentage of correct linking decisions in the Not Informed Group (n = 17)
Murder series   Amount of correct linking decisions   Percentage of correct linking decisions
Series 1        11                                    65
Series 2        7                                     41
Series 3        9                                     53
Series 4        13                                    76
Series 5        12                                    71
Series 1–5      52                                    61
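The percentages in Table 5 follow directly from the counts and the 17 participants for whom linking decisions were available. The short sketch below reproduces that arithmetic; the variable names are illustrative only.

```python
# Counts of correct linking decisions per series, as listed in Table 5.
correct = {
    "Series 1": 11,
    "Series 2": 7,
    "Series 3": 9,
    "Series 4": 13,
    "Series 5": 12,
}
n_participants = 17  # Not Informed Group members with linking decisions

# Percentage of participants who linked each series correctly.
percentages = {
    series: round(100 * count / n_participants)
    for series, count in correct.items()
}

total_correct = sum(correct.values())
total_decisions = n_participants * len(correct)
total_incorrect = total_decisions - total_correct
overall_percentage = round(100 * total_correct / total_decisions)
mean_correct_per_participant = total_correct / n_participants

print(percentages)
print(total_correct, total_incorrect, overall_percentage)
print(round(mean_correct_per_participant, 2))
```

The totals match the text: 52 correct and 33 erroneous decisions out of 85, an overall rate of 61%, and about three correctly linked series per participant.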
DISCUSSION
Evaluating the hypotheses and the results
The aim of the present study was to explore whether a coding bias exists using an experimental approach where the information given to the participants concerning correct linkages between a number of murder series was manipulated. Table 6 compares the results to the hypotheses of the present study. The first hypothesis was that the correct information about the linked murders would result in more perceived similarity for the series compared with having no such information. In contrast to this hypothesis, the Not Informed Group coded more similarity between actually linked murders than the Correctly Informed Group. Although not formally reaching statistical significance, there was a tendency towards this counter-intuitive finding. On the basis of Rosenthal’s (1966, 1994) description and findings of the experimenter effect, one could assume that prior knowledge about which crimes had been committed by the same offender in the studied murder excerpts would distort participants’ perceived (i.e. coded) behavioural similarity in favour of their expectations. The results of testing the first hypothesis contradict this assumption, thus providing no evidence for the presence of a coding bias.
The second hypothesis, that the Incorrectly Informed Group would code less similarity for linked murders compared with the Not Informed Group, seems to gain some support from the results. The Incorrectly Informed Group did indeed code less similarity for the linked cases than the Not Informed Group; however, the difference did not reach formal levels of statistical significance. This finding contradicts the conclusion of the first hypothesis, as it provides some evidence for a coding bias. It would seem that the Incorrectly Informed Group coded more similarity in the series they incorrectly thought were linked on the basis of the false information they received, thus leading to lower similarities in the actually linked murders. Taken together, the results indicated that there is no unequivocal evidence for a coding bias.
Table 6. Main results compared with the hypotheses of the study
Hypothesis   Result (hypothesis)                                    Significance
1            Correctly Informed Group < (>) Not Informed Group      Wald χ² = 2.94, p = .087
2            Incorrectly Informed Group < (<) Not Informed Group    Wald χ² = 3.34, p = .068
The symbols ‘<’ and ‘>’ indicate the level of coded behavioural similarity between the experiment groups; ‘>’ stands for more coded similarity. The hypotheses of the study are shown in parentheses. The results of the hypotheses and their significance levels are displayed in the last column.
When considering the coding results of the three experiment groups, the odd finding seems to be that the Not Informed Group coded more similarity (although not significantly so) in the linked series than both the other groups. One possible explanation could be that not having any prior information about case linkage, and thus not spending any mental effort on the question ‘who committed which murder?’, enabled them to better concentrate on the task of coding the behavioural variables, resulting in a more accurate coding. The higher inter-rater reliabilities within this group would seem to support this idea. Although contradicting the expectancy effect, perhaps this finding, in line with the recommendations of Risinger et al. (2002), Sheldrake (1998), and Wilkinson (1999), shows that no expectation (blind testing) is the most efficient way to go.
The percentage of correct linking decisions for the Not Informed Group varied from 41% for the second series to 76% for the fourth series. This finding is in line with previous research on offenders’ behavioural consistency, where consistency has been found to vary both between offenders and within the series of one offender. Because the present study used two cases, randomly picked from the five offenders’ series, a certain variance in behavioural consistency (and perceived similarity) between the crimes was to be expected. Another question is how conscious a lay person is about the fact that they ‘should’ be looking for behavioural similarities in crimes that they know are linked. The statistically significant relationship between the linking decisions of the Not Informed Group and their perceived behavioural similarity would suggest that the participants, who could be considered laymen in terms of forensic expertise, did intuitively make their decisions concerning linkage on the basis of their perceived similarity of the cases.
Santtila et al. (2008) found that their statistical model was able to correctly link 63% of the same Italian serial murders (N = 116) when analysing the whole set of 23 serial killers with 2 to 17 victims each. Using a Bayesian method for their analysis, Salo et al. (2012) reached an even higher proportion of 84% with the same sample. When only one prior murder was known, as in the present study, Salo et al.’s model correctly linked 59% of the cases. The overall percentage of correct linking decisions in the present study was 61%. Because the excerpts of the present study were clearly easier to code, thus making the linking task easier, this comparison could be taken to indicate that the aforementioned statistical models are more efficient at linking crimes than university students.
Limitations of the study
The biggest limitation of the present study lies in the validity of the used murder excerpts. Using the same coding scheme, but with complete pre-trial investigation protocols rather than excerpts, Pakkanen et al. (2006) had a significantly lower inter-rater reliability (.72) than the present study. The high inter-rater reliability in the present study (.93) would suggest that the murder excerpts were easy to code, thus reducing the variance of the perceived similarity in the participants and groups.
The heavily edited and summarised vignettes of the court transcripts might have limited the participants to code in a more reliable manner, making it less probable for any bias to have noticeable effects. More thorough documentation, better reflecting the operational situation of the police handling comprehensive amounts of pre-trial investigation data, could have made the coding task more challenging, leaving more room for variation and for bias to emerge. In making the material more realistic for the participants, and hence more ecologically valid, the trade-off would be less uniformity and a significantly more laborious experiment to conduct.
Bennell and Jones (2005) pointed out that solved cases might show higher levels of consistency and inter-individual variation than unsolved cases. It might be partly because of these characteristics that the cases are easier to solve. There might also be a deeper inherent problem in using court transcripts in linkage research: they might overestimate behavioural similarity, as court clerks summarise solved cases that are known to be linked. Thus, the expectancy effect could already have taken place in court, when the vast data of the pre-trial investigation protocols presented during the trial have been compiled and summarised, possibly inflating the similarity of the offences in the courts’ transcripts.
Much conscious effort has also gone into developing the coding schemes. One possibility is that the variables are defined well enough for there not to be enough variation in the coding for the experimenter effect to take place. The ideal situation would perhaps be where the crime investigators tick off the coding scheme right at the beginning of the investigation, before any information of the suspect is even available, making the coding blind with regard to prior knowledge of case linkage. It is also worth noting that the present study was carried out using only cases of serial murder.
Different types of crimes utilise different coding schemes, and the question posed in the present study concerning a possible bias in the coding would, therefore, have to be tested separately for other crimes as well.
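The inter-rater reliabilities compared in this section (.72 and .93) are reported without naming the statistic behind them. As a minimal sketch, assuming dichotomous coding (a behaviour is either present or absent) and assuming Cohen's kappa as the agreement measure, neither of which is confirmed by the text, agreement between two coders could be computed along the following lines. The function name and the example codings are hypothetical and serve illustration only.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who rated the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items given the same code by both coders.
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal proportions, summed over codes.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    codes = freq_a.keys() | freq_b.keys()
    p_chance = sum(freq_a[c] * freq_b[c] for c in codes) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical data: presence (1) or absence (0) of ten behavioural variables in one case.
coder_1 = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
coder_2 = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]
kappa = cohens_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.2f}")
```

Because kappa corrects raw agreement for the agreement expected by chance, it is lower than the simple percentage of matching codes whenever the coders disagree at all. If the reliabilities reported here were computed with a different statistic, the values would not be directly comparable with the output of this sketch.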
Bosco et al.’s (2010) recommendation is to increase the reliability of forensic research in order to improve the chance of expert testimony being admitted in the courts. The more specific recommendation of Wilkinson (1999) is for researchers to describe the specific methods used to deal with experimenter bias, especially if the researchers have gathered their data themselves. Sheldrake (1998) went on to state that there is plenty of evidence for the experimenter effect, but scarce evidence for the lack of it, and therefore proposed testing for possible experimenter effects by comparing the results of an experiment under both open and blind conditions, as the present study has done. Risinger et al. (2002) agreed, proposing blind testing as the principal method of preventing distortions caused by expectation in forensic science.

Conclusion and suggestions for future research

In police investigations, behavioural similarity of crimes is used to identify series suspected of having been committed by the same offender (Woodhams et al., 2007). Studies of crime linking, where the coders of the data use coding schemes, have the potential problem of giving biased results, as the coders are usually aware of which crimes were committed by the same offender. The studies might thereby overestimate behavioural similarity in serial crime, distorting the conclusions drawn by behavioural science experts and automated computer systems about crime linkage. The lack of blind testing is an even more acute problem when forensic experts are asked to give testimony about whether two or more crimes have been committed by the same offender. When the expert knows that the police suspect the same offender of committing the crimes and that behavioural similarity is the key to linking them, the risk of overestimating similarity and making a false linking decision is considerable.
This issue, however, needs to be studied separately using police pre-trial investigation reports, in order to grasp the extent of a possible expectancy effect on estimates of behavioural similarity in testimonies given by forensic experts.

The aim of the present study was to explore whether a coding bias exists when using a coding scheme to record crime scene behaviour in serial murder cases. It seems that there is no clear evidence to support the hypothesis of such a bias in the coding; none of the results of the tested hypotheses were significant. The present study would have the power to confirm a strong bias but not to exclude a weak one; the results would suggest the lack of a confounding bias, but the experiment might not be sensitive enough to detect a smaller one. Replications of the experiment are needed, with specific consideration given to the validity of the material used, as discussed above, especially the length and complexity of the vignettes and the source of the data (court transcripts versus pre-trial investigation protocols). Also, the present study was carried out using only cases of serial murder. Replications of the experiment could benefit from studying other types of crimes as well. It is the view of the authors that the issue of expectancy effects in behavioural crime linking needs to be studied further and addressed more systematically by reporting inter-rater reliabilities and by favouring blind methods both in data coding and in giving expert testimony to the courts on the issue.

REFERENCES

Alison, L., Bennell, C., Mokros, A., & Ormerod, D. (2002). The personality paradox in offender profiling: A theoretical review of the processes involved in deriving background characteristics from crime scene actions. Psychology, Public Policy, and Law, 8, 115–135.
Bennell, C., & Canter, D. (2002). Linking commercial burglaries by modus operandi: Tests using regression and ROC analysis. Science & Justice, 42, 153–164.
Bennell, C., & Jones, N. (2005). Between a ROC and a hard place: A method for linking serial burglaries by modus operandi. Journal of Investigative Psychology and Offender Profiling, 2, 23–41.
Bennell, C., Jones, N., & Melnyk, T. (2009). Addressing problems with traditional crime linking methods using receiver operating characteristic analysis. Legal and Criminological Psychology, 14, 293–310.
Bosco, A., Zappalà, A., & Santtila, P. (2010). The admissibility of offender profiling in courtroom: A review of legal issues and court opinions. International Journal of Law and Psychiatry, 33, 184–191.
Canter, D. (1995). Psychology of offender profiling. In R. Bull, & D. Carson (Eds.), Handbook of psychology in legal contexts (pp. 343–355). New York: John Wiley & Sons Ltd.
Crabbé, A., Decoene, S., & Vertommen, H. (2008). Profiling homicide offenders: A review of the assumptions and theories. Aggression and Violent Behavior, 13, 88–106.
Craik, M., & Patrick, A. (1994). Linking serial offences. Policing, 10, 181–187.
Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993).
Federal Rules of Evidence 702. (2011). Testimony by expert witnesses. Retrieved May 13, 2012 from http://www.law.cornell.edu/rules/fre/rule_702
Goodwill, A., & Alison, L. (2006). The development of a filter model for prioritizing suspects in burglary offences. Psychology, Crime & Law, 12, 395–416.
Green, E., Booth, C., & Biderman, M. (1976). Cluster analysis of burglary M/Os. Journal of Police Science and Administration, 4, 382–388.
Grubin, D., Kelly, P., & Ayis, S. (1997). Linking serious sexual assault. London: Home Office.
Grubin, D., Kelly, P., & Brunsdon, C. (2001). Linking serious sexual assaults through behavior. London: Home Office.
Pakkanen, T., Santtila, P., Mokros, A., & Sandnabba, K. (2006). Profiling hard-to-solve homicides. Identifying dimensions of offending and associating them with situational variables and offender characteristics. Unpublished manuscript.
Risinger, M., Saks, M., Thompson, W., & Rosenthal, R. (2002). The Daubert/Kumho implications of observer effects in forensic science: Hidden problems of expectation and suggestion. California Law Review, 90, 1–56.
Rosenthal, R. (1966). Experimenter effects in behavioral research. New York: Appleton-Century-Crofts.
Rosenthal, R. (1994). Interpersonal expectancy effect: A 30-year perspective. Current Directions in Psychological Science, 3, 176–179.
Rosenthal, R., & Fode, K. (1963). The effect of experimenter bias on performance of the albino rat. Behavioral Science, 8, 183–189.
Rosenthal, R., & Jacobson, L. (1968). Pygmalion in the classroom. The Urban Review, 3, 16–20.
Salfati, G. (1998). Homicide: A behavioural analysis of crime scene actions and associated offender characteristics. Unpublished doctoral dissertation. University of Liverpool, UK.
Salfati, G., & Bateman, A. (2005). Serial homicide: An investigation of behavioral consistency. Journal of Investigative Psychology and Offender Profiling, 2, 121–144.
Salo, B., Sirén, J., Corander, J., Zappalà, A., Bosco, D., Mokros, A., & Santtila, P. (2012). Using Bayes’ theorem in behavioral crime linking of serial homicide. Legal and Criminological Psychology. DOI: 10.1111/j.2044-8333.2011.02043.x
Santtila, P., Fritzon, K., & Tamelander, A. (2004). Linking arson incidents on the basis of crime scene behavior. Journal of Police and Criminal Psychology, 19, 1–16.
Santtila, P., Junkkila, J., & Sandnabba, K. (2005). Behavioural linking of stranger rapes. Journal of Investigative Psychology and Offender Profiling, 2, 87–103.
Santtila, P., Pakkanen, T., Zappalà, A., Bosco, D., Valkama, M., & Mokros, A. (2008). Behavioral crime linking in serial homicide. Psychology, Crime & Law, 14, 245–265.
Sheldrake, R. (1998). Experimenter effects in scientific research: How widely are they neglected? Journal of Scientific Exploration, 12, 73–78.
Wilkinson, L. (Task Force on Statistical Inference, Board of Scientific Affairs, APA) (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594–604.
Woodhams, J., Hollin, C., & Bull, R. (2007). The psychology of linking crimes: A review of the evidence. Legal and Criminological Psychology, 12, 233–249.