International Journal of Offender Therapy and Comparative Criminology http://ijo.sagepub.com/
Are the Major Risk/Need Factors Predictive of Both Female and Male Reoffending?: A Test With the Eight Domains of the Level of Service/Case Management Inventory Donald A. Andrews, Lina Guzzo, Peter Raynor, Robert C. Rowe, L. Jill Rettinger, Albert Brews and J. Stephen Wormith Int J Offender Ther Comp Criminol 2012 56: 113 originally published online 13 February 2011 DOI: 10.1177/0306624X10395716 The online version of this article can be found at: http://ijo.sagepub.com/content/56/1/113
Published by: http://www.sagepublications.com
Version of Record - Jan 23, 2012. OnlineFirst Version of Record - Feb 13, 2011.
Are the Major Risk/Need Factors Predictive of Both Female and Male Reoffending? A Test With the Eight Domains of the Level of Service/Case Management Inventory
International Journal of Offender Therapy and Comparative Criminology 56(1) 113–133 © The Author(s) 2012 Reprints and permission: sagepub.com/journalsPermissions.nav DOI: 10.1177/0306624X10395716 http://ijo.sagepub.com
Donald A. Andrews1, Lina Guzzo2, Peter Raynor3, Robert C. Rowe4, L. Jill Rettinger5, Albert Brews6, and J. Stephen Wormith6
Abstract
The Level of Service/Case Management Inventory (LS/CMI) and the Youth version (YLS/CMI) generate an assessment of risk/need across eight domains that are considered to be relevant for girls and boys and for women and men. Aggregated across five data sets, the predictive validity of each of the eight domains was gender-neutral. The composite total score (LS/CMI total risk/need) was strongly associated with the recidivism of males (mean r = .39, mean AUC = .746) and very strongly associated with the recidivism of females (mean r = .53, mean AUC = .827). The enhanced validity of LS total risk/need with females was traced to the exceptional validity of Substance Abuse with females. The intra–data set conclusions survived the introduction of two very large samples composed of female offenders exclusively. Finally, the mean incremental contributions of gender and the gender–by–risk level interactions in the prediction of criminal recidivism were minimal compared to the relatively strong validity of the LS/CMI risk level. Although the variance explained by gender was minimal and although high-risk cases were high-risk cases regardless of gender, the recidivism rates of lower risk females were lower than the recidivism rates of lower risk males, suggesting possible implications for test interpretation and policy.
Keywords
level of service, risk assessment, risk–needs–responsivity (RNR), gender neutral
1 Carleton University, Ottawa, Ontario, Canada
2 Ministry of Community Service and Correctional Services, North Bay, Ontario, Canada
3 Swansea University, Wales, United Kingdom
4 Queen’s University, Kingston, Ontario, Canada
5 University Partnership Centre, Georgian College, Barrie, Ontario, Canada
6 University of Saskatchewan, Saskatoon, Saskatchewan, Canada
Corresponding Author: J. Stephen Wormith, Department of Psychology, 9 Campus Drive, University of Saskatchewan, Saskatoon, SK S7N 5A5, Canada
Email: [email protected]
The purpose of the present report is to explore the gender neutrality of a real-world practical assessment of eight of the best-established risk/need factors for criminal behavior in the psychological and criminological literature. The assessment instruments are known as the Youth Level of Service/Case Management Inventory (YLS/CMI; Hoge & Andrews, 2002); the adult Level of Service/Case Management Inventory (LS/CMI; Andrews, Bonta, & Wormith, 2004); and the Level of Service/Risk, Needs and Responsivity (LS/RNR; Andrews, Bonta, & Wormith, 2009), which is the LS/CMI without the case management protocol. The LS/CMIs are gender- and culturally informed updates of the Level of Service Inventory–Revised (LSI-R; Andrews & Bonta, 1995). The LSI-R risk/need tool is one of the most frequently used and best-reviewed correctional assessment instruments in the world (Smith, Cullen, & Latessa, 2009). Smith and her colleagues reported positively on the Level of Service (LS) instruments for several reasons, but two particularly important reasons include the strong ties of LS instruments to the risk–needs–responsivity (RNR; Andrews & Bonta, 2006) model of assessment and crime prevention and the fact that LS total risk/need scores and the principles of RNR are valid with both female and male offenders. However, a distinct concern has been that the predictive validity of risk level varies with gender. The gender-neutral prediction is that a higher risk case is a higher risk case and a lower risk case is a lower risk case. The oft-feared but relatively unexplored possibility is that the actual recidivism rates associated with risk levels may be different for females and males and that the recidivism of girls and women may be lower than that suggested by their risk levels (Blanchette & Brown, 2006; Hannah-Moffat, 2009; Holtfreter & Cupp, 2007).
Gender Terminology, RNR, and the Central Eight
This introduction includes a brief summary of the RNR model of correctional assessment and rehabilitation and a review of the existing evidence in regard to gender differences in the predictive validity of risk factors. We begin, however, with the meaning of the phrases gender-responsive and gender-neutrality. Gender-responsive assessment and programming is now highly valued in many correctional agencies (Hannah-Moffat, 2009). Being highly valued suggests that gender responsiveness is considered important just
as is programming that delivers services in ethical, legal, decent, and culturally sensitive and cost-efficient ways. It is quite possible for gender-responsive programming to be expected on normative grounds even though adherence with that value does not necessarily enhance the effectiveness of a crime prevention program. For example, availability of trauma-related services may be a highly valued aspect of gender responsiveness without requiring that availability of those services contributes to crime prevention through reduced reoffending. Still other gender-responsive issues may be very strongly related to the effectiveness of crime prevention activities. For example, the RNR model of correctional assessment and rehabilitation goes well beyond normative considerations and identifies several factors purported to possess causal significance in the analysis of criminal conduct and to be causally significant in the determination of the effectiveness of crime prevention activities. One such consideration has to do with the risk factors for criminal activity. Are the characteristics of people and their circumstances that predict criminal behavior the same for females and for males? Gender-neutral risk factors predict the criminal conduct of females and of males. Gender-specific risk factors on the other hand “work” for females but not males (female-specific) or they predict for males but not for females (male-specific). Some gender-neutral risk factors may be more strongly predictive with females (female-salience) whereas other gender-neutral factors may be more strongly predictive of the criminal activity of males (male-salience). The issues of gender differences also apply to dynamic risk factors. Dynamic risk factors are called criminogenic needs in the RNR model. The language of neutrality, specificity, and salience applies to considerations of gender differences in the validity of assessments of dynamic risk factors just as it does with more static risk factors. 
The RNR model is strongly tied to general personality and cognitive social learning perspectives (GPCSL) on human behavior and thus the model is explicit about the crime prevention implications of adherence with the clinical principles of RNR. In applications of RNR, assessments of risk/need factors help to identify the best candidates for higher levels of supervision (the higher risk cases), the best candidates for participation in correctional treatment programs (i.e., moderate and higher risk cases), and the most appropriate intermediate targets of change for particular cases (i.e., dynamic risk factors or criminogenic needs). The evidence in support of the crime reduction potential of adherence with the human service principles of risk and need is reasonably strong (Andrews & Bonta, 2010; Andrews & Dowden, 2007; Lipsey, 2009; Lowenkamp, Latessa, & Smith, 2006; McGuire, 2004). It is now clear that programming delivered to low-risk cases and/or programming that fails to target the major criminogenic needs flirts with increasing recidivism, whereas programming in adherence with the risk and need principles reduces recidivism rates. Drawing on GPCSL and the now massive meta-analytic literature on risk/need, RNR also identifies the major risk and need factors. Andrews and Bonta (2010, table 2.5) summarized the best-established risk/need factors in the prediction of criminal conduct by listing the “central eight” factors. They acknowledged that other categorizations of risk/need are possible and that general risk/need factors cannot necessarily capture all of
the nuances of individual histories or case-specific etiologies. However, the domains represented within the central eight make sense theoretically and in the practice of correctional assessment and in the design and delivery of effective correctional treatment. The eight domains make sense to many practitioners and clinicians who seek the reliable and valid assessment of risk of reoffending along with the identification of the individualized dynamic risk factors (criminogenic needs) that may be targeted in crime prevention programming. The eight domains are also readily understood by offenders, which greatly facilitates their involvement in service planning. The central eight is composed of the “big four” and the “modest four.” The big four includes a history of criminal behavior, antisocial personality pattern, antisocial attitudes, values, beliefs and cognitive-emotional states, and antisocial associates. These are the personal and interpersonal factors most proximate to the occurrence of criminal activity in immediate situations of action. The big four are intercorrelated in theory and in fact but their separate assessment allows the specification of individualized criminogenic need areas for purposes of program planning. Factorial purity has not been a major issue for LS creators, LS users, or the offenders whose period of supervision and/or treatment is being planned. The modest four include other well-established risk/need factors that are theoretically less proximate to the occurrence of criminal activity. The milder set includes three major settings of human interaction, including the home (family/marital), school/work, and leisure/recreation. Low levels of rewards and satisfactions in those domains are associated with at least mild increases in risk of offending. The fourth of the modest four is substance abuse.
Andrews and Bonta (2010) summarized eight meta-analytic investigations of the predictive criterion validity of assessments of the central eight risk/need factors in the prediction of criminal recidivism. The grand mean r value for the big four was .26 (95% CI of .22/.39, k = 24) compared to .17 (.13/.20, k = 23) for the more modest four. The risk/need assessment provided by the YLS/CMI, the adult LS/CMI, and LS/RNR is a composite of the eight central domains of risk/need. The mean value of .17 may seem very low but a set of minor risk/need factors yielded a grand mean validity estimate of only .03 (–.02/.08, k = 16). The latter set included assessments of the validity of some of the most frequently cited risk/need factors in traditional criminology such as lower class origins, fear of official punishment, personal distress/psychopathology, and low verbal intelligence. The three 95% confidence intervals (CIs) were nonoverlapping and thereby supported the existence of statistically significant differences in the magnitude of the predictive validity of the different classes of variables. As a variation of Cohen’s (1988) recommendation, we consider mean validity estimates of .30 and greater as large, those between .20 and .29 as modest, those between .10 and .19 as mild, and those lower than .10 as minimal. Indeed if the value is less than .09, it may be of nil practical significance for prediction purposes. In sum, the gender neutrality of the central eight as assessed with the LS instruments is of considerable theoretical and applied significance. If the major risk/need factors are
different for females and males, assessments of different risk/need factors would be required. Our introduction now locates this particular investigation of gender and gender differences in the predictors of criminal recidivism in the broader context of studies of gender and crime and the findings of extant studies of gender and risk of reoffending.
The Evidence on Gender Differences in the Risk Factors for Criminal Recidivism
Exploring whether the same risk factors are relevant to prediction of reoffending among both women and men is not the same as arguing that they are present to the same extent, are similarly distributed, or are present for the same reasons in both groups. In other words, evidence that the same risk factors are relevant is not evidence that both genders have the same life-chances and opportunities in current societies, or that inequalities, discrimination, and states of disadvantage do not exist. It would be absurd to deny the existence of gendered social disadvantage or the importance of efforts to reduce it. However, these broader issues are likely to have more impact on the distribution and acquisition of criminogenic needs than on the extent to which, once acquired, they predict offending behavior. It is also clear that there is nothing in the evidence of similar risk factors that requires that women and men should offend in exactly the same way, from the same motivation or for the same reasons. Rather, it is quite likely that opportunities will be different for men and women. It has often been argued that much offending by women is motivated by economic pressures, or committed in the context of oppressive and coercive relationships with criminal men, rather than being simply antisocial. In other words, observing the similarity of the major risk factors is not to deny differences and inequalities. On the contrary, it begins to create the basis for a more empirically accurate understanding of differences. In our efforts to examine the applicability of the RNR model of correctional assessment and treatment, and the LS risk/need assessment instruments, we have considered the oft-repeated “reasons” for supporting gender specificity in risk/need.
In the process, feminist theoretical accounts of female crime were reviewed, with particular attention paid to the work of Blanchette and Brown (2006), Hannah-Moffat (2009), and Holtfreter and Cupp (2007). The most frequently cited “reasons” are that criminology was traditionally male-centric and hence female understandings were underrepresented, that gender differences in the extent and seriousness of criminal behavior are well established, and that gender differences exist in problems such as emotional distress, a history of being abused, and poverty. All of the latter problematic factors are often said to be more evident and/or more important among girls and women than among boys and men. However, not one of these considerations actually establishes gender differences in the predictive validity of the major risk/need factors. In particular, it is quite notable that the three “issues” are often cited without provision of empirical
examples of gender differences in predictive validity. The simple fact appears to be that the evidence is not reported because evidence in support of feminist approaches to assessment is “absent” (Hannah-Moffat, 2009, p. 216). It has also been established meta-analytically that LS total risk/need scores are gender-neutral in their predictive validity in that they are at least as valid with girls and women as they are with boys and men (Andrews et al., 2011; Olver, Stockdale, & Wormith, 2009; Schwalbe, 2008; Smith et al., 2009). Most recently, Van Voorhis, Wright, Salisbury, and Bauman (2010) agreed with proponents of the LSI-R that “gender-neutral variables and their compilation into a total risk scale LSI-R powerfully predict offense-related outcomes for women” (p. 281). However, as noted, the gender-neutrality of the subcomponent scores has not been explored across a number of studies. Indeed, Van Voorhis et al. (2010) suggested that certain factors within the “big four” (antisocial attitudes and antisocial associates) were not very important in predicting the criminal recidivism of women. In addition, and highly relevant to the present project, Brown (2009) reported that the antisocial attitude domain in an early version of the Youth LS/CMI was only predictive with young male offenders. Intrastudy investigations of gender differences in validity estimates are valued because the mean validity estimates for LS total risk/need range from modest to very large, depending on involvement of LS authors in the validation study, length of the follow-up period, and whether the study was conducted in Canada (Andrews et al., 2011). The first of the nonregional factors is considered to be a major indicator of integrity in correctional treatment by many investigators, including Petrosino and Soydan (2005), Lipsey (2009), and Andrews and Bonta (2010). Even the regional effect may reflect integrity issues.
For example, the superior validity estimates generated in Canadian studies are not limited to investigations of LS risk/need but have also been found in evaluations of two other risk assessment instruments (Olver, Stockdale, & Wormith, 2009; Yang, Wong, & Coid, 2010). Possibly the Canadian effect reflects integrity issues such as the quality of the training and supervision of assessors in Canada along with the quality of the follow-ups on recidivism in Canada. Many Canadian studies of recidivism use a comprehensive national police database in the measurement of officially recorded recidivism. That source may be supplemented by systemwide files in provincial departments of corrections. Whatever might account for variability in estimates, intrastudy investigations are likely more sensitive to gender differences than are investigations in which study-based factors are not controlled. The evidence just referred to regarding the sources of variability in validity estimates does not mean that the validity of LS total risk/need has not been established outside of Canada. Nor does it mean that the validity of LS total risk/need was nonsignificant when LS authors were not involved or when short follow-up periods were used. The mean levels of validity were simply not as great as those found in Canada or those found in longer follow-up studies with LS author involvement (Andrews et al., 2011). Indeed, LSI-R total risk/need was clearly gender neutral in the 16 intrastudy tests of gender differences conducted by Smith et al. (2009). Canadian studies were
underrepresented in that set of 16 and the mean r values were attenuated (with mean r values just under .30), but the r values were still significant with females and with males. In the total of 27 investigations of the predictive validity, the mean validity estimate was .35 with female offenders. The predictive validities of many of the risk/need factors that are declared to be gender-informed factors by some theorists have also been explored. At best, the mean validity estimates for factors such as poverty and victimization are minimal to mild (below .19). Gender neutrality is the rule and gender specificity is the exception, even among the risk/need factors suggested by feminist theories of crime. Once controls for LS total risk/need were introduced, the predictive validity of the gender-informed risk/need factors was reduced to nil levels with female offenders (Andrews et al., 2011). Van Voorhis et al. (2010) drew much more positive conclusions regarding the incremental validity of gender-informed factors but, as they acknowledge, they selectively reported on the cumulative effects of the best performing of the GI factors in relation to the most “representative” of the measures of recidivism on a sample-specific and site-specific basis. In the prediction of criminal recidivism, one site was not reported upon by Van Voorhis et al. (2010) and significant incremental validity was found in only one of the three other sites reported on in 2010. Therefore, including the study not reported by Van Voorhis et al. (2010), only one of a total of four studies of criminal recidivism supported the incremental validity of gender-informed factors, even after those factors were selectively chosen to maximize incremental validity. The findings with LS total risk/need and with the gender-informed factors do not stand alone in regard to the gender neutrality of the predictive validity of risk/need.
The first two meta-analyses of gender differences found no evidence whatsoever of gender specificity in predictive validity (Gendreau, Andrews, Goggin, & Chanteloupe, 1992; Simourd & Andrews, 1994). Likewise, studies from the developmental criminological literature reveal that gender neutrality is the rule whereas evidence of gender specificity is scattered and minimal to minor in magnitude. These conclusions are found in important reviews of the literature by Leschied, Cummings, Van Brunschot, and Saunders (2002) and Zahn, Hawkins, Chiancone, and Whitworth (2008). Similar findings exist in the developmental classics such as the Cambridge studies (Farrington & Painter, 2004) and the New Zealand studies (Moffitt, Caspi, Rutter, & Silva, 2001). The conclusion also fits with the findings of studies of large representative samples of young people in the United States (Daigle, Cullen, & Wright, 2007). In direct opposition to the conclusion by Holtfreter and Cupp (2007), developmental trajectories have been found to be predominately gender-neutral in their origins and behavioral consequences (Johansson & Kempf-Leonard, 2009; Odgers et al., 2008; Samuelson, Hodgins, Larsson, Larm, & Tengstrom, 2010). Although the current article is limited to the predictive validity of risk/need factors, it is relevant that gender neutrality is also the dominant conclusion in reviews of effective correctional treatment (Andrews & Bonta, 2010; Dowden & Andrews, 1999; Lipsey, 2009).
Interestingly, a major proponent of gender responsiveness and a prolific critic of the RNR model of correctional assessment and crime prevention reached a similar conclusion in regard to both assessment and treatment. Recently, Hannah-Moffat (2009) described the research literature on risk and correctional treatment as “expansive” (p. 216). In contrast, she referred to “the absence of empirical evidence to support the effectiveness of alternative feminist approaches to risk assessment, treatment and programming” (p. 216). “Expansive” support for gender neutrality and an “absence” of empirical support for gender specificity fits very neatly with the findings of our reviews. The single greatest exception to the pattern of results just reviewed is the large-scale study conducted by Brown and Motiuk (2008). Working at the item level (and with many items that overlap considerably with LS items), they reported evidence of gender specificity in 53% of the items that actually displayed any significant predictive validity. Gender neutrality was found in only 47% of the items. According to our reviews of the literature, no other study, with large or small samples of women and men, has revealed a pattern of findings so strongly supportive of gender specificity. Finally, one should consider the possibility that gender carries predictive information on its own. Our reading of the evidence regarding gender and criminal behavior does not deny that males are drastically overrepresented in samples of frequent, serious, and officially processed offenders. But that undisputed fact says nothing at all about the correlation between gender and criminal recidivism. In fact, the correlation between being male and recidivism is minimal to mild in magnitude (from a mean correlation of .06 to .17 with a grand mean of .12 across four meta-analyses according to Andrews [2009]).
In fact, the mean predictive estimate for being male was reduced from .13 to .09 once statistical controls for LS total risk/need were introduced (Andrews et al., 2009). The percentage of variance in recidivism that is explained by gender typically is nil to minimal in magnitude, and particularly once risk/need scores are considered. For example, a correlation of .09 accounts for less than 1% of the variance in recidivism.
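The arithmetic behind the last statement can be retraced with a short Python sketch. It squares a correlation to obtain the proportion of variance explained and applies the magnitude labels adopted earlier in this introduction (.30 and greater as large, .20 to .29 as modest, .10 to .19 as mild, lower than .10 as minimal). The function names are hypothetical, and treating the cut points as lower bounds for unrounded values is an assumption made only for illustration.

```python
def variance_explained(r):
    """Proportion of variance accounted for by a correlation r (r squared)."""
    return r * r


def magnitude_label(r):
    """Label |r| with the cut points stated in the text.

    Assumption: each cut point is read as an inclusive lower bound, so a
    value such as .295 falls in the 'modest' band.
    """
    size = abs(r)
    if size >= 0.30:
        return "large"
    if size >= 0.20:
        return "modest"
    if size >= 0.10:
        return "mild"
    return "minimal"


if __name__ == "__main__":
    # .13 and .09 are the estimates for being male before and after
    # controlling for LS total risk/need; .39 and .53 are the mean r values
    # for LS/CMI total risk/need reported in the abstract.
    for r in (0.13, 0.09, 0.39, 0.53):
        pct = 100 * variance_explained(r)
        print(f"r = {r:.2f}: {pct:.2f}% of variance, {magnitude_label(r)}")
```

Under these assumptions a correlation of .09 corresponds to well under 1% of the variance, in line with the statement above.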
The Current Study
The present article describes a multistudy exploration of the gender neutrality of the predictive validity of the eight subcomponents of LS/CMI total risk/need. It is the first meta-analysis of the predictive validity of LS subcomponent scores. The LS risk/need instruments are among the leading correctional assessment approaches internationally, and the results of LS assessments touch the lives of thousands of offenders with measurable impact on their probability of reoffending. Indeed, LS risk/need is at the center of many discussions of gender specificity (Andrews et al., 2011; Blanchette & Brown, 2006; Hannah-Moffat, 2009; Holtfreter & Cupp, 2007; Olver et al., 2009; Schwalbe, 2008; Smith et al., 2009; Van Voorhis et al., 2010; Van Voorhis, Salisbury, Wright, & Bauman, 2008). In addition, conventional tests of gender differences in predictive validity are supplemented, for the first time, by an integrated examination of the relative contributions of LS risk level, gender, and their interaction to the prediction of criminal recidivism.
In other words, how important is gender in the task of accounting for variability in criminal behavior when gender is entered into the prediction formula along with the interaction of gender and risk? In practice, the total risk scores of offenders are typically sorted into three to five levels of risk from lower risk categories to higher risk categories. This approach allows an exploration of the actual variation in recidivism rates by risk level and gender. The current study is not simply a conventional exploration of gender differences in the predictive validity of risk/need factors. Rather, it explores the issues of gender and recidivism with attention to risk, gender, and their interaction as three sources of variability in criminal recidivism. Without knowing exactly what to make of the findings of Brown and Motiuk (2008), without denying some gender specificity in risk/need, without denial of discrimination and disadvantage, and without denying myriad differences between females and males, the weight of the evidence in support of the gender neutrality of risk/need is becoming overwhelming. Thus, our hypotheses are as follows. One, each of the central eight risk/need factors as assessed with LS/CMI is predictive of the criminal recidivism of females and of males. Two, the predictive validity of LS/CMI risk level greatly exceeds that of gender and of any gender–by–risk level interaction. The “explained variance” approach is expanded on by directly examining the recidivism rates associated with risk level by gender. It is one thing to explore gender differences in the predictive validity of risk/need factors as is required for tests of the first hypothesis. That is the dominant approach in the field of gender and crime. It is another matter to explore the relative and independent contributions of risk level, gender, and their interaction to the prediction of recidivism (as in our test of the second hypothesis).
Method
The Intrastudy Data Sets
The five intrastudy data sets used in the current study are noted in Table 1. The fact that the meta-analysis entails the pooling of intrastudy tests of gender differences reduces the very serious threat of interstudy variation in validity estimates (Andrews et al., 2011). Inspection of Table 1 reveals that four of the five data sets were created in Canadian corrections, had an LS author involved in the evaluation study, and the follow-up periods greatly exceeded 1 year. As noted in the introduction, each of these indicators is associated with the production of large to very large validity estimates for LS total risk/need (Andrews et al., 2011), but they have had no impact on gender differences in the predictive validity of LS total risk/need.
Representativeness of the Samples
The representativeness of the samples studied was explored through considerations of certain key descriptive statistics. First, and as expected because of the general overrepresentation of males in correctional settings, in every data set the number of male offenders greatly exceeded the number of female offenders. The number of male offenders (2,069) was nearly 6 times (5.84) the number of female offenders (354). Second, although gender differences in official processing may sometimes change the outcome, the general expectation is that female offenders will recidivate at lower rates than male offenders. As would normally be expected, the mean base rate of recidivism was substantially lower for female offenders (.29) than for male offenders (.50), paired t(4) = 4.04, p < .02. Third, consistent with normative data, the mean LS/CMI total score was in the moderate range of risk for both females and males but was slightly higher for male offenders (15.95) than for female offenders (13.98). The paired t was 1.78, p < .15. Fourth, the grand mean standard deviations of LS/CMI scores were normative (in the area of 8.0) and statistically indistinguishable for females (8.15) and males (8.11), paired t = 0.16, p < .88.

Table 1. Five Data Sets Allowing Intrastudy Testing of Gender Differences in the Predictive Validity of LS/CMI Domain Scores and LS/CMI Total Risk/Need

Study                               Female n   Male n   Follow-up years   Recidivism        Base rate (female)   Base rate (male)
Andrews & Robinson (1984) a,b
  Young offenders (16-17 years)     27         113      3                 Any new           .26                  .48
  Adult offenders (18 plus years)   70         351      3                 Any new           .20                  .28
Girard & Wormith (2004) b
  Adult offenders                   43         655      2.5               Any new           .19                  .55
Raynor (2007) a
  Adult offenders                   133        623      1                 Reconviction      .31                  .43
Rowe (2002) b
  Young offenders (YLS/CMI)         81         327      4.2               Any postprogram   .51                  .76
Total (k = 5)                       354        2,069

Note: LS/CMI = Level of Service/Case Management Inventory; LSI-R = Level of Service Inventory–Revised.
a. LS/CMI Risk/Need scored from LSI-R records.
b. An LS author was the principal investigator or a supervisor of the principal investigator.
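The paired comparison of female and male base rates can be retraced from the rounded values listed in Table 1. The sketch below is illustrative only: it assumes that the male base rates pair with the studies in the order in which they are listed, and the helper name paired_t is hypothetical. Because the inputs are rounded to two decimals, the resulting t (a little above 4) is close to, but not identical with, the reported value of 4.04.

```python
from math import sqrt
from statistics import mean, stdev

# Base rates of recidivism from Table 1, rounded to two decimals.
# Assumption: male rates are paired with studies in listed order.
female_rates = [0.26, 0.20, 0.19, 0.31, 0.51]
male_rates = [0.48, 0.28, 0.55, 0.43, 0.76]

# Sample sizes per data set, also from Table 1.
female_n = [27, 70, 43, 133, 81]
male_n = [113, 351, 655, 623, 327]


def paired_t(x, y):
    """Return (t, df) for the paired-sample t test of mean(x - y) = 0."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # stdev uses the n - 1 divisor
    return mean(diffs) / standard_error, n - 1


t_value, df = paired_t(male_rates, female_rates)
ratio = sum(male_n) / sum(female_n)

if __name__ == "__main__":
    print(f"mean base rate, females: {mean(female_rates):.2f}")
    print(f"mean base rate, males:   {mean(male_rates):.2f}")
    print(f"paired t({df}) = {t_value:.2f}")
    print(f"male-to-female sample ratio: {ratio:.2f}")
```

With only five pairs, the test has four degrees of freedom, which is why a fairly large t is needed to reach conventional significance levels.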
Two Large Studies of Female Offenders
Although normative in terms of the representation of females and males in correctional systems, the underrepresentation of female offenders in the intrastudy comparisons suggests that the female validity estimates are based on numbers so small that confidence in the representativeness of the female validity estimates is less than that for the male validity estimates. Fortunately, two large studies of female offenders have been conducted. Rettinger and Andrews (2010) reported on 411 female offenders from Ontario whereas Brews (2009) reported on another 2,832 female offenders from Ontario. The latter sample is composed of all female offenders released from correctional supervision
in the 12-month period between April 1, 2002, and March 31, 2003. For comparison purposes, the two large studies are Canadian, LS authors were involved as research supervisors, and the follow-up periods were much greater than a year. The mean validity estimates generated in the two large studies will be compared with the mean of the five intrastudy estimates. This provides another check on the representativeness of the validity estimates with female offenders.
The Measures of Validity

Consistent with the state of the available evidence on LS risk/need, the Pearson’s correlation coefficient (r) was the primary measure of predictive validity used. The appropriateness of r as a measure of validity is thought to decline as the base rate of recidivism deviates from 50%. Thus, the analyses of the first set of hypotheses were repeated with the area under the curve (AUC) of the receiver operating characteristic (ROC; Hanley & McNeil, 1983) as the measure of predictive validity. AUC is less sensitive to variations in the base rate of recidivism (Rice & Harris, 2005).
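The two validity measures described above can be illustrated with a minimal sketch. It assumes a list of risk/need scores and a 0/1 recidivism outcome per offender; the function names and the eight example cases are hypothetical and are not drawn from any of the data sets analyzed here. AUC is computed in its pairwise form: the proportion of recidivist/non-recidivist pairs in which the recidivist has the higher score, with ties counted as one half.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation; with a 0/1 outcome this is the point-biserial r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def auc(scores, outcomes):
    """Share of (recidivist, non-recidivist) pairs ranked correctly; ties count 0.5."""
    recidivists = [s for s, o in zip(scores, outcomes) if o == 1]
    others = [s for s, o in zip(scores, outcomes) if o == 0]
    wins = 0.0
    for p in recidivists:
        for q in others:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(recidivists) * len(others))


# Hypothetical illustration only: eight made-up cases, not study data.
scores = [4, 9, 12, 15, 18, 22, 27, 31]
recidivated = [0, 0, 0, 1, 0, 1, 1, 1]

validity_r = pearson_r(scores, recidivated)
validity_auc = auc(scores, recidivated)
```

Because the pairwise form compares each recidivist only with non-recidivists, the AUC value does not depend directly on how many cases fall in each outcome group.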
Testing of Hypothesis 1

Because of the intrastudy nature of the comparisons, gender differences in validity could be tested with the more sensitive paired-sample t tests. Less sensitive conventional F tests were used for the comparison of the mean of the five validity estimates for males with the seven validity estimates for females that included the two large sample studies of female offenders.
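A paired-sample t test on k = 5 intrastudy estimates can be sketched as follows. The per-study validity values below are hypothetical placeholders, not the values from the five data sets (only the means are reported in this article), and the helper name paired_t is an assumption made for illustration. The statistic is the mean within-study difference divided by its standard error, with k − 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for a paired-sample t test on matched estimates."""
    diffs = [a - b for a, b in zip(first, second)]
    k = len(diffs)
    standard_error = stdev(diffs) / sqrt(k)  # stdev uses the n - 1 denominator
    return mean(diffs) / standard_error, k - 1


# Hypothetical per-study validity estimates (NOT the article's data).
female_r = [0.50, 0.45, 0.60, 0.55, 0.52]
male_r = [0.40, 0.38, 0.42, 0.41, 0.36]

t_value, df = paired_t(female_r, male_r)
```

Pairing removes between-study variation (setting, outcome definition, follow-up length) from the error term, which is why the paired test is the more sensitive of the two approaches.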
Testing Hypothesis 2

This test involved the computation of partial eta values in five ANOVAs with LS/CMI risk level, gender, and the interaction of gender and risk level as potential sources of variation in recidivism rates. ANOVAs were computed separately with each of the five data sets. Thereby, each data set yielded its own partial r square values. The means of the five measures of percentage of total variance explained for risk level, gender, and their interaction were then compared with paired t tests.
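The percentage-of-explained-variance measure can be sketched for a single data set as below. The sketch assumes the standard definition of partial eta squared (effect sum of squares divided by effect plus error sum of squares); the function names and the three input values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Standard partial eta squared for one ANOVA effect."""
    return ss_effect / (ss_effect + ss_error)


def relative_contributions(partial_values):
    """Express each source as a percentage of the summed partial values."""
    total = sum(partial_values.values())
    return {source: 100.0 * value / total for source, value in partial_values.items()}


# Hypothetical squared partial values for one data set (NOT study results).
one_data_set = {"risk_level": 0.170, "gender": 0.015, "interaction": 0.015}

percentages = relative_contributions(one_data_set)
```

In the analysis described above, this computation would be repeated for each of the five data sets, and the five percentages for each source would then be averaged and compared.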
Results

Hypothesis 1

Inspection of Table 2 reveals that each of the eight domains of the LS/CMI was predictive of the criminal recidivism of female offenders and of male offenders. Gender neutrality was the case with each of the domains of the LS/CMI just as it was with LS/CMI total risk/need. The predictive validity of each domain was greater for females than for males but the effect of gender was statistically significant only on the substance
Table 2. Mean Predictive Criterion Validity Estimates (r) by LS/CMI Subdomain and LS/CMI Total Risk/Need by Gender

| LS/CMI subdomain | Female mean r | Female 95% CI | Male mean r | Male 95% CI | Paired t | p | Gender neutral |
|---|---|---|---|---|---|---|---|
| Criminal history | .41 | 30/52 | .30 | 17/42 | 2.11 | .10 | Yes |
| Companions | .39 | 19/59 | .32 | 22/41 | 1.02 | ns | Yes |
| Procriminal attitude | .35 | 15/54 | .26 | 07/46 | 1.76 | .16 | Yes |
| Antisocial pattern | .36 | 26/47 | .32 | 27/38 | 1.50 | ns | Yes |
| Education/employment | .35 | 25/44 | .28 | 22/33 | 1.90 | .13 | Yes |
| Family/marital | .20 | 04/36 | .18 | 11/25 | 0.52 | ns | Yes |
| Leisure/recreation | .30 | 13/48 | .23 | 21/24 | 1.21 | ns | Yes |
| Substance abuse | .46 | 25/66 | .17 | 10/25 | 5.06 | .001 | Yes, F salient |
| LS/CMI total risk/need: composite unadjusted | .53 | 41/64 | .39 | 30/49 | 3.79 | .02 | Yes, F salient |
| LS/CMI total risk/need: composite adjusted for substance abuse^a | .47 | 37/58 | .45 | 34/55 | 0.11^b | ns | Yes |

Note: k = 5 intrasample estimates; n = 354 females and 2,069 males. LS/CMI = Level of Service/Case Management Inventory; CI = confidence interval (decimals omitted).
a. ANOVA with gender as a factor and substance abuse as a covariate.
b. F(1, 7).

abuse domain and with LS/CMI total score. In brief, gender neutrality was found in all nine tests of the mean validity estimates and female salience was evident on substance abuse and LS/CMI total risk/need. The gender effect on LS/CMI total risk/need was traced to the exceptional predictive validity of substance abuse with female offenders. Once statistically adjusted for substance abuse, the mean validity of LS total risk/need with female offenders (.47) was virtually identical to the validity with male offenders (.45). The effect of gender on the validity of substance abuse was highly significant statistically (a mean of .46 with female offenders compared with .17 for males, p < .001). Yet inspection of the 95% CI reveals a very large interval of 41 points (.25-.66) around the mean validity estimate for substance abuse with female offenders. This reduces the confidence that can be placed in that mean value. There is a considerable and troubling distance between a mean estimate in the .20s and one in the .60s. Fortunately, two large-sample female-only studies permit exploration of the representativeness of the validity estimates for female offenders. The mean validity estimates from the two large-sample female-only investigations were .50, .36, .40, and .42 for Criminal History, Companions, Procriminal Attitude, and
Antisocial Pattern, respectively. Each estimate fell within the corresponding 95% CIs outlined in Table 2 for the intrastudy estimates. The mean validity estimates were .33, .18, .27, and .41, respectively, for Education/Employment, Family/Marital, Leisure/Recreation, and Substance Abuse. These values also fall within the corresponding 95% CIs presented in Table 2. The mean validity of LS total risk/need was .54 in the two large female studies compared with the .53 value reported in Table 2 for the intra–data set studies. All in all, the mean validity estimates generated by the five studies with only 354 female offenders were quite consistent with the mean validity estimates derived from two studies involving a total of 3,243 female offenders.

Including the two large sample mean estimates in the computation of the mean validity of Substance Abuse with female offenders yielded a mean validity estimate of .44, 95% CI = .31 to .56, k = 7, n = 3,597. That mean was significantly greater than the mean validity estimate for male offenders listed in Table 2 (.17, 95% CI = .10 to .25, k = 5, n = 2,069) with F(1, 10) = 15.08, p < .003. The female salience of substance abuse appears to be robust.

The basic findings with the r measure of predictive validity were replicated with AUC as the measure of validity (see Table 3). The predictive validity of each of the eight domains and of the total risk/need score was gender-neutral with the AUC measure of validity. The female salience of the Criminal History domain reached a statistically significant level with AUC as the measure of predictive validity. The correlation between validity as assessed with r and validity as assessed with AUC was .91 for the LS/CMI total score. Once again, the female salience of LS/CMI total risk/need was eliminated with statistical controls for substance abuse.
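The representativeness check reported above can be retraced with a short script. The confidence intervals are the female 95% CIs from Table 2 and the point estimates are the large-sample means quoted in the text; the variable and function names are illustrative assumptions.

```python
# Female 95% CIs for the mean r, as reported in Table 2 (k = 5 intrastudy estimates).
table2_female_ci = {
    "Criminal History": (0.30, 0.52),
    "Companions": (0.19, 0.59),
    "Procriminal Attitude": (0.15, 0.54),
    "Antisocial Pattern": (0.26, 0.47),
    "Education/Employment": (0.25, 0.44),
    "Family/Marital": (0.04, 0.36),
    "Leisure/Recreation": (0.13, 0.48),
    "Substance Abuse": (0.25, 0.66),
    "Total risk/need": (0.41, 0.64),
}

# Mean validity estimates from the two large female-only studies, as quoted in the text.
large_sample_mean_r = {
    "Criminal History": 0.50,
    "Companions": 0.36,
    "Procriminal Attitude": 0.40,
    "Antisocial Pattern": 0.42,
    "Education/Employment": 0.33,
    "Family/Marital": 0.18,
    "Leisure/Recreation": 0.27,
    "Substance Abuse": 0.41,
    "Total risk/need": 0.54,
}


def falls_within(value, interval):
    """True when value lies inside the closed interval (low, high)."""
    low, high = interval
    return low <= value <= high


within_ci = {
    domain: falls_within(large_sample_mean_r[domain], table2_female_ci[domain])
    for domain in table2_female_ci
}

# Sample sizes quoted in the text.
n_large_studies = 411 + 2832            # Rettinger and Andrews (2010) plus Brews (2009)
n_all_female = 354 + n_large_studies    # intrastudy females plus the two large studies
```

Every entry of within_ci is True, and the two sums reproduce the totals of 3,243 and 3,597 female offenders given in the text.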
Hypothesis 2

Sometimes, in applications of the risk principle from the RNR model, the total LS risk/need score is used to assign offenders to categories based on level of risk from very low through very high. Four levels are used herein because of the small number of cases in the very high risk category. Analyses of variance in recidivism were conducted within each data set. The three potential sources of variance in recidivism were LS/CMI risk level, gender, and the interaction of risk level and gender. The test of significance of the interaction term is another way of testing for gender differences in the predictive validity of LS/CMI risk. The sum of the three partial eta values squared was used as a measure of the total variance explained by risk level, gender, and the interaction term. The partial correlation values reflect the independent contribution of each of the three factors. The relative contribution of each of the three sources was assessed as the percentage of the total explained variance attributable to each of the three sources. For example, the relative incremental validity of risk level is as follows: 100 × (the partial correlation squared for risk level divided by the sum of the partial correlations squared for risk level, gender, and the interaction term).

Risk level was by far the major source of explained variance in recidivism rates and the mean contribution of risk level (84.9%) greatly exceeded the mean contributions
Table 3. Mean Predictive Criterion Validity Estimates (AUC) by LS/CMI Subdomain and LS/CMI Total Risk/Need by Gender

| LS/CMI subdomain | Female mean AUC | Male mean AUC | Paired t | p | Gender neutral |
|---|---|---|---|---|---|
| Criminal history | .754 | .677 | 3.06 | .04 | Yes, F salient |
| Companions | .745 | .685 | 2.00 | .12 | Yes |
| Procriminal attitude | .673 | .639 | 1.58 | .19 | Yes |
| Antisocial pattern | .713 | .709 | 1.50 | .20 | Yes |
| Education/employment | .729 | .690 | 1.99 | .12 | Yes |
| Family/marital | .630 | .609 | 0.60 | ns | Yes |
| Leisure/recreation | .682 | .638 | 1.63 | .18 | Yes |
| Substance abuse | .771 | .611 | 3.78 | .02 | Yes, F salient |
| LS/CMI total risk/need: composite unadjusted | .827 | .746 | 3.39 | .02 | Yes, F salient |
| LS/CMI total risk/need: composite adjusted for substance abuse^a | .790 | .783 | 0.02^b | ns | Yes |

Note: k = 5 intrasample estimates; n = 354 females and 2,069 males. LS/CMI = Level of Service/Case Management Inventory; AUC = area under the curve.
a. ANOVA with gender as a factor and substance abuse as a covariate.
b. F(1, 7).

of gender (7.6%) and of the interaction term (7.9%). The gender effect just missed being statistically greater than zero whereas the interaction term was significantly different from zero (p < .05). Of course, the effect of risk level was significantly greater than the gender effect, the interaction effect, and the combined contribution of gender and the interaction. Rounding off the percentage estimates, LS risk level could account for 85% of the explained variance whereas the pooled contribution of gender and the interaction term was 15%. Gender just failed to reach statistical significance while the interaction term reached conventional levels of significance with p