Estimating the effect of perceived risk of crime on social trust in the presence of endogeneity bias

Luca Zanin
Prometeia
G. Marconi, Bologna 40122, Italy
[email protected]

Rosalba Radice
Department of Economics, Mathematics and Statistics
Birkbeck, University of London
Malet Street, London WC1E 7HX, U.K.

Giampiero Marra
Department of Statistical Science
University College London
Gower Street, London WC1E 6BT, U.K.
Abstract

This study aims to estimate the effect of perceived risk of crime on the social trust probability for Italian men and women, accounting for both observed and unobserved confounding. We use microdata collected by the Italian National Statistical Office for the year 2010 during a multi-scope survey of Italian households. The relationship under investigation is first estimated after controlling for observed confounding by using a propensity score weighting approach. To control for both observed and unobserved confounding (better known as endogeneity), a semiparametric recursive bivariate probit approach is then employed. Our findings show that the perceived risk of crime has a significant negative effect on the social trust probability regardless of gender and that endogeneity seems to be present for both genders. The paper represents the first application in which this effect is estimated accounting for the presence of endogeneity.

Key Words: average treatment effect on the treated; endogeneity; confounding; perceived risk of crime; propensity score weighting; semiparametric recursive bivariate probit; social trust.

JEL Classification: C1; C3; C52; Z13; Z18
1 Introduction
A number of studies in economics and social sciences have focused on generalised or social trust (which is defined as trust in strangers) as a fundamental pillar of social capital (e.g., Putnam, 2000; Welch et al., 2005; Dearmon and Grier, 2011; Torpe and Lolle, 2011). Torpe and Lolle (2011) noted that there are two main schools of thought on the process of trust formation in the relevant literature. On the one hand, social trust can be linked to a rational evaluation of the reliability of people through concrete experience or information; on the other hand, trust can be based on the moral predisposition of the individual. However, a combination of the two perspectives (both rational and moral) cannot be excluded.

Scholars agree that social trust is fundamental for a nation because it stimulates human interactions and cooperation between citizens (who sometimes differ socially and culturally), and hence, social trust represents an important component of the social life of people, of an economic system, and of a democratic polity (e.g., Alesina and La Ferrara, 2002; Hardin, 2002; Welch et al., 2005; Herreros and Criado, 2008; Torpe and Lolle, 2011). In this regard, some empirical studies indicate that economic development is more likely in the presence of high social trust (e.g., Helliwell and Putnam, 1995; Knack and Keefer, 1997; Zak and Knack, 2001). Studies using individual-level data have found social trust to be highest among people who are more educated and have higher incomes, and lowest among people who are unemployed, in poor health and live in areas with high degrees of income inequality, immigration, ethnic diversity, or corruption (e.g., Putnam, 2000; Alesina and La Ferrara, 2002; Putnam, 2007;
Gustavsson and Jordahl, 2008; Herreros and Criado, 2008; Rothstein and Eek, 2009; Richey, 2010).

Torpe and Lolle (2011) noted that countries such as Norway, Sweden and Finland rank highest in Western Europe in social trust. The opposite situation has been observed for Italy. A number of studies have focused on the case of Italy to explain the reasons for low social trust. Almond and Verba (1963) observed that the determinants of social distrust in the country include the political culture of suspicion and low cooperation. Other scholars have highlighted the existence of lower social trust in regions of southern Italy than in northern Italy (e.g., Fukuyama, 1995; Helliwell and Putnam, 1995; Misztal, 1996). The social distrust and the dichotomy between these two areas of the country have deep historical roots, dating back to the collapse of the Roman Empire (In-Young, 2008). Following this historical period, the Italian territory was conquered, divided and ruled by several foreign forces that contributed over the centuries to the culture of social distrust and to spatial differences that continue to be evident today, even though more than 150 years have passed since the unification of Italy in 1861.

Among the reasons for low social trust, it is plausible to include factors of social degradation such as crime. Although official statistics show that crime rates in Italy have decreased in recent years (by about 8% from 2006 to 2010) and that they are below the European average (see, e.g., Maffei and Merzagora Betsos, 2007), a high number of crimes continue to be recorded (in 2010, more than 4000 events per 100000 inhabitants). Interestingly, approximately 50% of the offences in Italy were thefts.
It is reasonable that people who directly or indirectly experience this type of crime (which includes bag-snatching, domestic theft, and car theft), possibly in combination with other offences (e.g., robberies, murders, extortion, vandalism), may have reduced perceptions of personal security. Thus, crime (or people’s perceptions of the risk of crime in their neighbourhood) is among the factors that negatively affect social trust. In other words, the risk of crime events in a community represents a ‘corrosive’ factor as well as a serious obstacle to the development and maintenance of high social trust. It should not be surprising that people who perceive a high risk of crime in their neighbourhood fear being victimised inside or outside their own homes. For example, people may be afraid of experiencing a burglary or of leaving home, especially
alone, at certain hours of the day. In such situations, a sense of poor personal security arises that may negatively affect quality of life and interpersonal relationships (e.g., Lewis and Salem, 1986; Michalos and Zumbo, 2000; Ross and Jang, 2000; Zanin, 2011). Hence, we expect that in a neighbourhood in which people perceive social and physical disorder (such as the presence of people who use drugs and alcohol in public, or vandalism) and in which major crimes (such as burglary, robberies, murders, and rape) occur, a deterioration of social trust is likely. The aim of our study is to examine the effect of perceived risk of crime on the probability of social trust for both men and women in Italy. Our investigation should contribute to understanding the mechanisms that govern this relationship in a country where social trust is historically compromised. This study should provide useful and informative support for policy-makers who are called upon to implement initiatives to reduce crime events. In the literature, the most influential observational studies that have explored a similar relationship have mainly been conducted for the United States. The findings reveal that individuals who perceive neighbourhood disorder show significantly lower levels of trust in strangers than those who perceive themselves to live in neighbourhoods characterised by social control and order (e.g., Ross and Jang, 2000; Ross et al., 2001 and 2002). Moreover, Salmi et al. (2007) demonstrated that experiences of victimisation, fear of crime, and viewing television crime reality programmes can be further elements that contribute to distrust in strangers. Conversely, the crime rate (computed as the number of registered crimes per inhabitant in an area) did not significantly affect social trust (e.g., Alesina and La Ferrara, 2002; Gustavsson and Jordahl, 2008). 
However, Alesina and La Ferrara (2002) noted that these results may be due to poor data quality with regard to crime (i.e., under-registration of crimes by the judicial authorities or police). From a methodological point of view, the effects of such relationships were mainly estimated using classic linear or logit/probit models. We focus on obtaining a more refined estimate of the effect of perceived risk of crime on social trust in these kinds of studies. In particular, obtaining a realistic estimate of the effect of perceived risk of crime on social trust is a difficult task, especially because of the presence of unbalanced observed and unobserved characteristics (confounders). Accounting
for covariate imbalance is important to obtain unbiased estimates of the treatment of interest (the perceived risk of crime). Many statistical methods are available to address the imbalance induced by observed confounders. However, matters are not so simple when confounders are not observable. In this study, for example, life experiences and the psychology and behaviour of individuals can be considered unobservable variables affecting social trust that are also associated with the perceived risk of crime in the neighbourhood of residence. Because we cannot directly account for these characteristics, we would expect biased parameter estimates and hence biased interpretations. In the literature, this bias is known as the endogeneity problem. Our empirical analysis is based on the Aspects of Daily Life dataset, Italian microdata collected by the Italian National Statistical Office (ISTAT) for the year 2010 during a multiscope survey of Italian households. We carried out our empirical analysis in two phases. First, we assume that unobservable confounding is not present, and control for observed confounders with a weighted propensity score approach. Then, we account for both observed confounders and the possible presence of endogeneity with a flexible system of equations. With regard to the first phase, because classic regression techniques may be sensitive to model misspecification (Rubin, 1997) and to reduce reliance on modelling assumptions, a propensity score technique has been used to estimate the treatment effect (e.g. Austin, 2008; Shah et al., 2005). The propensity score is the probability of receiving a treatment conditional on observed covariates. By conditioning on this quantity, it is possible to obtain an unbiased estimate of such an effect (Rosenbaum and Rubin, 1983). There are several methods for using propensity scores. These include regression adjustment, stratification, matching, and weighting (e.g. Austin, 2007; Stuart and Rubin, 2007). 
Some evidence suggests that weighting may be preferable to the first three alternatives (Ukoumunne et al., 2010; Harder et al., 2006; Hirano et al., 2003). However, weighting approaches may yield biased and inefficient estimates when the propensity score model is misspecified (Kang and Schafer, 2007). This problem can be overcome using a boosted classification and regression trees approach (boosted CART; McCaffrey et al., 2004), which is a machine learning
technique capable of producing very accurate estimated propensity scores (Lee et al., 2010, 2011). However, no propensity score technique can account for the possible presence of unobserved confounders, which may be crucial in the current study.

As for the second phase, the semiparametric recursive bivariate probit (SRBP) recently introduced by Marra and Radice (2011) has been employed. This approach involves estimating a simultaneous system of two flexible binary regressions, which accounts for unobserved confounding by modelling the dependence between the equations. The advantage of this approach over conventional instrumental variable (IV) methods (e.g. Angrist et al., 1996; Wooldridge, 2010), such as the generalised method of moments (GMM; Amemiya, 1974; Johnston et al., 2008), classic maximum likelihood approaches (ML; Heckman, 1979; Greene, 2007), and structural mean models (SMM; Robins, 1994; Vansteelandt and Goetghebeur, 2003), is twofold. First, SRBP allows for flexible functional dependence of the response variables on continuous covariates via the use of penalised regression splines. Unlike the classic parametric approach typically employed in these kinds of studies, a semiparametric specification allows us to flexibly model the effect of continuous covariates (for example, the age of individuals) without making a priori assumptions (e.g. linearity, or nonlinearity specified using quadratic or cubic polynomials). This reduces the risk of model misspecification due to undetected nonlinearity, which can have severe consequences for the estimation of the parameter of interest (Chib et al., 2009; Marra and Radice, 2011). Second, provided that the model assumptions are met, identification of the treatment effect is theoretically achieved even if an instrument is not included in the model (Wilde, 2000). In practice, however, model assumptions are difficult to satisfy and, as a result, empirical identification is better achieved if an IV is available.
To the best of our knowledge, the present study constitutes the first application in which the effect of perceived risk of crime on social trust is estimated accounting for the presence of endogeneity.
2 The data
We investigated the impact of perceived risk of crime on social trust in Italy using the microdata collected by ISTAT for the year 2010 as part of the Aspects of Daily Life survey, a multi-scope survey of Italian households. The survey employed the following sampling design:

• For self-representative municipalities (municipalities with large populations), each municipality was treated as a specific stratum. Cluster sampling was used for these municipalities. The primary units are households, extracted in a systematic way using municipal registers; data were collected for each member of the households included in the sample.

• For non-self-representative municipalities (the remaining municipalities), the study adopted a two-stage sampling technique, with primary unit stratification. The primary units are the municipalities; the secondary units are the households. Data were collected for all members of each household included in the sample. Municipalities were selected with probability proportional to their population size and without replacement, while households were extracted with equal probability and without replacement.

For our analysis, we have considered individuals between 18 and 80 years old, because the number of sample units is considerably reduced outside of this age range. The sample contains 33150 individuals, of whom 16055 are men and 17095 are women. The advantage of using a national survey is that it represents the population as a whole. The disadvantage is that, given the generality of the survey, some information such as the size of the municipality in which each individual lives and the citizenship of the individuals may not be available.
2.1 Social trust
In economics and social sciences, generalised or social trust (trust) is understood as trust in strangers (e.g., Torpe and Lolle, 2011). The questionnaire used for the survey contains a section devoted to collecting this information for each individual in the sample. Specifically, the question asks: ‘Generally speaking, would you say that most people can be trusted or that you need to be very careful in dealing with people?’. We identify social trust in those people who say that ‘most people can be trusted’, and distrust in those who say that ‘you need to be very careful in dealing with people’. In other words, the response is binary and takes the value of one if the respondent is trusting, and zero otherwise. We recognise that a dichotomous response forces the interviewee to choose between trust and distrust. However, this type of response allowed us to avoid subjective choices on how to classify possible intermediate and ambiguous responses, such as ‘it depends’.

Table 1 shows that in our sample, 24.5% of men and 21.6% of women trusted strangers. The highest and lowest levels of social trust were registered in Trentino Alto Adige (44.1% for men and 37.0% for women) and Campania (16.5% for men and 13.4% for women), respectively (see Table 4 in Appendix A). These percentages are low compared with the highest levels of social trust registered for Norway and Sweden (about 70%; Torpe and Lolle, 2011). In other words, an important pillar of social capital is ‘weaker’ in Italy than in other European countries. This finding matters because social distrust hinders human interaction and cooperation between individuals, as well as economic development.
2.2 The perceived risk of neighbourhood crime
A section of the questionnaire was devoted to collecting a set of qualitative information on the respondent’s neighbourhood of residence (information such as air pollution, traffic, parking problems, and the risk of crime). Specifically, we are interested in the question that asks: ‘Is your neighbourhood of residence at risk of crime?’. The possible answers
are as follows: ‘very at risk’, ‘fairly at risk’, ‘a little at risk’, or ‘not at risk’. Two people living in the same neighbourhood might answer the question differently, although both are describing the same area. This variability is because each individual can have a different scheme of evaluation regarding the presence/absence or the degree of risk of crime. Here, we are interested in investigating how the individual probability of trusting strangers can change in the presence of perceived risk of crime (i.e., a perception of the neighbourhood of residence as very or fairly at risk of crime) as compared with individuals who do not perceive a severe risk of crime (i.e., the neighbourhood of residence is perceived to be a little or not at all at risk of crime). Thus, we construct a binary variable (crime) that assumes the value of one in the presence of perceived risk of crime and zero otherwise. Table 2 shows that in 2010, the highest percentages of respondents who had a perceived risk of crime were concentrated in Campania (14.1% of men and 14.3% of women), Lombardy (12.4% of men and 11.9% of women), Piedmont and Valle Aosta (9.0% of men and 9.3% of women), Lazio (8.8% of men and 8.3% of women), and Sicily (7.2% of men and 6.9% of women). In contrast, the lowest percentages were found in Basilicata (0.7% of men and 0.8% of women), Molise (1.3% of men and 1.4% of women) and Sardinia (1.8% for both men and women). As in Ross et al. (2002), we have considered the perceived risk of crime rather than the crime rate. This is because we expect that it is the subjective perception of threat that is likely to have a greater detrimental effect on social trust. Moreover, we use the concept of crime in a general way to identify events that include both social and physical disorder1 as well as major crimes, such as burglary, robberies, murders, and rape (Skogan, 1990; Ross and Jang, 2000). 
Footnote 1: Specifically, social disorder refers to people (such as the presence of scuffles or people who use drugs and alcohol in public) while physical disorder refers to the external appearance of the neighbourhood (such as vandalism). For further details, please refer to Ross and Jang (2000).

We agree with scholars who argue that it can be difficult for people to distinguish between disorder and major crimes. This difficulty arises because we are referring to events that in one way or another represent a potential threat (physical or psychological) for individuals (e.g., Bursik and Grasmick, 1993; Sampson and Raudenbush, 1999). The presence of risk of crime can be reported both by people who have
actually been victimised and by those who have not been victimised but are exposed to visible cues of crime (for example, when individuals walk down the street and are aware of thefts in neighbouring households). In any case, it is reasonable for people who perceive the risk of crime in their neighbourhood of residence to fear being victimised inside or outside their own dwelling, thus reinforcing a sense of poor personal security, powerlessness, and the idea that life is characterised by external uncontrollable threats (e.g., Geis and Ross, 1998). A number of studies have indicated that poor social and economic conditions in a neighbourhood (such as income inequality, the absence of opportunity in the labour market, and the lack of social integration, just to name a few) can lead to the breakdown of social cohesion and normlessness (or, in other words, a lack of respect for other people and their goods and property) and hence the likely presence of crime (e.g., Wilson, 1996; Sampson and Groves, 1989; Sampson et al. 1997; Ross et al. 2002; Gustavsson and Jordahl 2008; Elgar and Aitken 2010). People who live in such neighbourhoods are likely to view unknown people around them with suspicion. Additionally, they might believe that local and central government, as well as the local police force in charge of maintaining social order, are not able to adequately prevent crime (e.g., Lewis and Salem, 1986; Skogan, 1990; Ross and Jang, 2000; Ross et al., 2002). In light of this discussion, it is reasonable for people who perceive the risk of crime in their neighbourhood to manifest distrust in strangers to a greater degree than people who perceive themselves to live in the presence of order and social control. Accordingly, trust and trustworthiness may be affected by the environment in which people interact. Table 1 reports the proportion of the sample that did or did not perceive the risk (or severe risk) of crime in the neighbourhood of residence. 
Notably, we found that social distrust in Italy is particularly accentuated in the presence of perceived risk of crime, for both men and women. We also investigate possible differences between respondents who perceived a risk of neighbourhood crime and those who did not (by gender) with respect to a number of spatial and socio-demographic variables. These variables are also included as observed confounders in the models presented in Section 3. Specifically, we consider the region of residence, marital
                                        Men                            Women
Category                                Sample (%)  Social trust (%)   Sample (%)  Social trust (%)
Perceived risk of crime                 24.2        19.4               25.1        15.9
Non-perceived a severe risk of crime    75.8        26.1               74.9        23.5
Total                                   100.00      24.5               100.00      21.6

Table 1: Percentage of respondents reporting social trust among those who perceived a risk of crime and those who did not perceive a severe risk of crime in the neighbourhood of residence (by gender).
Column key: P = perceived risk of crime; NP = non-perceived a severe risk of crime.

                                      Men                       Women
Variables                             P            NP           P            NP

Region of residence (region)
  Piedmont and Valle Aosta            9.0          9.4          9.3          8.8
  Lombardy                            12.4         7.5          11.9         7.2
  Trentino Alto Adige                 2.3          6.9          2.5          6.7
  Veneto                              6.2          6.1          5.8          5.6
  Friuli Venezia Giulia               2.1          4.0          2.1          4.2
  Liguria                             3.7          4.2          3.6          4.3
  Emilia-Romagna                      5.2          5.3          6.1          5.1
  Tuscany                             5.0          5.6          5.4          5.7
  Umbria                              2.7          3.1          2.7          3.2
  Marche                              2.9          4.6          3.0          5.0
  Lazio                               8.8          4.8          8.3          4.9
  Abruzzo                             3.7          4.0          3.4          4.1
  Molise                              1.3          3.3          1.4          3.5
  Campania                            14.1         6.0          14.3         6.1
  Puglia                              6.5          5.4          6.4          5.7
  Basilicata                          0.7          3.6          0.8          3.4
  Calabria                            4.5          5.2          4.6          5.3
  Sicily                              7.2          6.3          6.9          6.6
  Sardinia                            1.8          4.8          1.8          4.7

Marital status (marit)
  Single                              30.4         32.2         24.3         24.4
  Married                             61.1         59.5         56.3         57.7
  Divorce/Separated                   6.2          6.1          8.3          7.5
  Widowed                             2.4          2.1          11.2         10.4

Limits in daily activities (limit)
  No limitation                       77.1         79.3         73.5         76.6
  Limitation not serious              17.4         15.9         20.3         18.5
  Severe limitation                   5.5          4.8          6.2          5.0

Education level (educ)
  Illiterate or primary school        15.9         17.0         21.4         23.4
  First-stage secondary school        32.6         33.6         28.4         27.6
  Secondary school education          38.4         38.0         37.4         35.7
  University degree                   13.1         11.4         12.9         13.4

Professional status (prof status)
  Employed                            55.0         57.5         36.3         37.2
  Unemployed                          11.0         9.8          8.5          8.0
  Retired                             28.8         27.4         17.8         17.8
  Student                             5.2          5.3          5.5          6.1
  Housewife                           -            -            31.9         30.9

Average age in years (age)            48.2 (16.4)  47.7 (16.6)  48.8 (16.6)  48.7 (16.8)
Observations                          3890         12165        4295         12800
Sample (%)                            24.2         75.8         25.1         74.9

Table 2: Descriptive statistics reported in terms of percentage values of the categorical variables considered here. For the continuous variable (age), the mean is reported with the standard deviation in parentheses.
status, educational level achieved, professional status, the presence of limitations (lasting at least six months) in the activities that people perform daily, and age. Overall, the results show that the distribution of some covariates varies between those who perceived the risk of crime and those who did not. For example, if we consider the region of residence, the proportion of men and women living in Lombardy, Lazio, Campania, Puglia and Sicily is higher among those who perceived the risk of crime than among those who did not. The opposite was found for Trentino Alto Adige, Marche, Molise, Basilicata, and Sardinia. As for limits in daily activities, the proportion of men and women whose activities are limited is higher among those who perceived the risk of crime. This result may be due to a sense of vulnerability to crime events in the presence of limits in the ability to protect themselves (see also Stiles et al., 2003). This descriptive analysis suggests that covariate imbalance is present in some observed confounders. This imbalance might be problematic because it could lead to biased parameter estimates.
3 Methods
The aim of this section is to describe briefly how the effect of crime on trust can be quantified in a meaningful manner, and provide a description of the statistical approaches used to estimate such an effect. In doing that, we have used a notation and terminology which is consistent with our case study.
3.1 Average treatment effect on the treated
The effect of the treatment (in our case crime) on the outcome of interest (trust) can be defined following the counterfactual framework of Rosenbaum and Rubin (1983) and Holland (1986). Specifically, each individual in the population has two potential values for the outcome, $\mathrm{trust}_{\mathrm{crime}=1}$ and $\mathrm{trust}_{\mathrm{crime}=0}$. The first refers to the response obtained when the individual perceives the risk of crime, the second when the individual does not. Only one of these values is observed for each individual; the other outcome is the counterfactual.
The treatment effect is therefore defined as

$$E(\mathrm{trust}_{\mathrm{crime}=1}) - E(\mathrm{trust}_{\mathrm{crime}=0}),$$

where the expectation is over the entire population. The effect of interest is, however, typically calculated considering only the individuals who received the treatment (in this case, those who perceive the risk of crime), hence giving rise to the so-called average treatment effect on the treated (ATT; e.g. Wooldridge, 2010). That is, let $E(\mathrm{trust}_{\mathrm{crime}=1} \mid \mathrm{crime} = 1)$ be the average outcome of individuals who perceive the risk of crime, and $E(\mathrm{trust}_{\mathrm{crime}=0} \mid \mathrm{crime} = 1)$ the average outcome that these same individuals would have if they did not perceive it; then the ATT is defined as

$$\mathrm{ATT} = E(\mathrm{trust}_{\mathrm{crime}=1} \mid \mathrm{crime} = 1) - E(\mathrm{trust}_{\mathrm{crime}=0} \mid \mathrm{crime} = 1).$$
Since $\mathrm{trust}_{\mathrm{crime}=0}$ cannot be observed for individuals who perceive such a crime risk, $E(\mathrm{trust}_{\mathrm{crime}=0} \mid \mathrm{crime} = 1)$ must be estimated from those who do not perceive it. Because of the potentially differing observed characteristics between people who perceive the risk of crime and people who do not (see, e.g., Section 2.2, Table 2), the average outcome from the group of individuals who do not perceive the risk of crime will not generally yield an unbiased estimate of $E(\mathrm{trust}_{\mathrm{crime}=0} \mid \mathrm{crime} = 1)$ (for a more general and detailed discussion of the issue see, e.g., Rosenbaum and Rubin, 1983). In the next two sections, we describe two approaches which can help to obtain a realistic estimate of the ATT.
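To make the distinction between these quantities concrete, the following minimal Python sketch uses a small hypothetical population in which both potential outcomes are known for every individual, which is never the case with real data. The records and function names are illustrative assumptions, not data or code from the study. The sketch shows that the naive difference in observed means need not equal the ATT when the two groups differ in their untreated outcomes.

```python
from statistics import mean

# Hypothetical individuals: (trust if crime=1, trust if crime=0, observed crime).
# Both potential outcomes are listed for illustration only; real data reveal one.
people = [
    (0, 1, 1), (0, 0, 1), (1, 1, 1), (0, 1, 1),   # perceive a risk of crime
    (1, 1, 0), (0, 1, 0), (1, 1, 0), (1, 1, 0),   # do not perceive a risk
]

def average_treatment_effect(pop):
    """E(trust_{crime=1}) - E(trust_{crime=0}) over the whole population."""
    return mean(t1 for t1, _, _ in pop) - mean(t0 for _, t0, _ in pop)

def att(pop):
    """Same contrast, restricted to individuals with crime = 1."""
    treated = [(t1, t0) for t1, t0, c in pop if c == 1]
    return mean(t1 for t1, _ in treated) - mean(t0 for _, t0 in treated)

def naive_difference(pop):
    """Difference in observed means: treated show trust_1, controls show trust_0."""
    observed_treated = [t1 for t1, _, c in pop if c == 1]
    observed_control = [t0 for _, t0, c in pop if c == 0]
    return mean(observed_treated) - mean(observed_control)

print(average_treatment_effect(people))  # -0.375
print(att(people))                       # -0.5
print(naive_difference(people))          # -0.75
```

In this made-up population the naive contrast overstates the ATT because the untreated group has a higher level of trust in the absence of treatment (1.0) than the treated group would have had (0.75).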
3.2 Propensity score weighting
Propensity score methods can adjust for observable differences between people who do and do not perceive the risk of crime. The propensity score is the probability of perceiving the risk of crime conditional on observed covariates. Specifically, if $X$ denotes an $n \times k$ matrix of $n$ observations for $k$ predictors, the propensity score vector, $p(X)$, is equal to $P(\mathrm{crime} = 1 \mid X)$.
Rosenbaum and Rubin (1983) showed that, conditional on $p(X)$, the distribution of the covariates in $X$ does not depend on the treatment (here, crime). That is, conditioning on the propensity score would result in the two groups of individuals having similar distributions of all observed covariates, as in a random assignment design. Following again Rosenbaum and Rubin (1983), if $\mathrm{trust}_{\mathrm{crime}=1}$ and $\mathrm{trust}_{\mathrm{crime}=0}$ are independent of crime conditional on $X$, then they are also independent of crime conditional on $p(X)$. This result can therefore be used to estimate $E(\mathrm{trust}_{\mathrm{crime}=0} \mid \mathrm{crime} = 1)$ from the group of individuals who do not perceive the risk of crime. In order to estimate the ATT, we employ a propensity score weighting (PSW) approach where $p(X)$ is used to weight the outcomes of individuals who do not perceive the risk (Harder et al., 2006; Hirano et al., 2003; Rosenbaum, 1987). Specifically, let $x_i$ be the $i$th row vector of $X$, and let the $i$th individual have weight $w_i = 1$ if the individual perceives the risk of crime and $w_i = p(x_i)/\{1 - p(x_i)\}$ otherwise. Provided a set of estimated weights is available, $E(\mathrm{trust}_{\mathrm{crime}=0} \mid \mathrm{crime} = 1)$ can be estimated via the weighted mean of the observed outcomes for the group who does not perceive the risk of crime, i.e.

$$\hat{E}(\mathrm{trust}_{\mathrm{crime}=0} \mid \mathrm{crime} = 1) = \frac{\sum_{i \in \text{non-crime}} \hat{w}_i \, \mathrm{trust}_i}{\sum_{i \in \text{non-crime}} \hat{w}_i}, \qquad (1)$$
where $i \in \text{non-crime}$ denotes the $i$th observation in the group who does not perceive the risk of crime and summation is over the set of observations in this group. Recall that the estimate obtained using (1) is unbiased only if $\mathrm{trust}_{\mathrm{crime}=0}$ is independent of crime given $X$, that is, if no unobserved confounding is present. Now, let $n_{\mathrm{crime}}$ denote the number of individuals in the risk group and $i \in \text{crime}$ the $i$th observation in this group. Then $E(\mathrm{trust}_{\mathrm{crime}=1} \mid \mathrm{crime} = 1)$ can simply be estimated using

$$\hat{E}(\mathrm{trust}_{\mathrm{crime}=1} \mid \mathrm{crime} = 1) = \frac{\sum_{i \in \text{crime}} \mathrm{trust}_i}{n_{\mathrm{crime}}}.$$

Therefore,

$$\widehat{\mathrm{ATT}} = \hat{E}(\mathrm{trust}_{\mathrm{crime}=1} \mid \mathrm{crime} = 1) - \hat{E}(\mathrm{trust}_{\mathrm{crime}=0} \mid \mathrm{crime} = 1).$$

3.2.1 Estimating p(X) via boosted classification and regression trees
The propensity score vector, and hence the vector of weights needed to calculate (1), is unknown and must be estimated from the data. The degree of accuracy in estimating the treatment effect will depend on the capability of producing accurate estimated propensity scores (e.g. Drake, 1993; Zhao, 2004). We use boosted classification and regression trees (boosted CART; McCaffrey et al., 2004) which, unlike other implementations of boosting (Freund and Schapire, 1997; Ridgeway, 1999; Friedman et al., 2000; Friedman, 2001), can produce estimated probabilities leading to unbiased estimates of the ATT (Lee et al., 2010, 2011). This approach is based on an automated, data adaptive algorithm that can be used with a large number of covariates to fit a nonlinear surface and ultimately predict crime. The key aspects of the algorithm are reported in Appendix B; the reader is referred to McCaffrey et al. (2004) and Ridgeway et al. (2010) for more details.
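The weighting estimator of Section 3.2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the propensity score is estimated by simple within-stratum frequencies of a single hypothetical binary covariate `x` (a deliberately crude stand-in for the boosted CART fit), and the records are made up so that the arithmetic can be checked by hand. The names `estimate_pscore`, `att_psw` and `make_group` are hypothetical.

```python
from collections import defaultdict

def estimate_pscore(x, crime):
    """Estimate P(crime=1 | x) by the share of crime=1 within each value of x.

    Stand-in for boosted CART: only sensible for a few discrete strata.
    """
    n, n_treated = defaultdict(int), defaultdict(int)
    for xi, ci in zip(x, crime):
        n[xi] += 1
        n_treated[xi] += ci
    return [n_treated[xi] / n[xi] for xi in x]

def att_psw(trust, crime, pscore):
    """ATT by propensity score weighting: the treated get weight 1,
    the untreated get weight p / (1 - p), as in equation (1)."""
    treated = [t for t, c in zip(trust, crime) if c == 1]
    num = sum(p / (1 - p) * t for t, c, p in zip(trust, crime, pscore) if c == 0)
    den = sum(p / (1 - p) for c, p in zip(crime, pscore) if c == 0)
    return sum(treated) / len(treated) - num / den

def make_group(x, crime, n, n_trusting):
    """n hypothetical records (x, crime, trust), n_trusting of which have trust=1."""
    return [(x, crime, 1 if i < n_trusting else 0) for i in range(n)]

# Made-up data: x raises the chance of perceiving risk and lowers baseline trust.
records = (
    make_group(0, 1, 4, 2) + make_group(0, 0, 16, 12)    # stratum x = 0
    + make_group(1, 1, 12, 3) + make_group(1, 0, 8, 4)   # stratum x = 1
)
x, crime, trust = zip(*records)

pscore = estimate_pscore(x, crime)   # 0.2 in stratum x=0, 0.6 in stratum x=1
naive = (sum(t for t, c in zip(trust, crime) if c == 1) / crime.count(1)
         - sum(t for t, c in zip(trust, crime) if c == 0) / crime.count(0))

print(round(naive, 4))                           # -0.3542
print(round(att_psw(trust, crime, pscore), 4))   # -0.25
```

In this constructed example the difference in trust rates between treated and untreated is -0.25 within each stratum, so the weighted estimate recovers that value while the unweighted contrast does not. With unobserved confounders, no choice of weights based on `x` alone would remove the remaining bias.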
3.3 Semiparametric recursive bivariate probit modelling
The issue with propensity score techniques is that the possible presence of unobserved confounding/endogeneity cannot be controlled for. To this end, we propose using a semiparametric recursive bivariate probit (SRBP) technique (Marra and Radice, 2011). A possible downside is that estimation results may be sensitive to model misspecification, although such a risk is reduced because the approach allows for flexible functional dependence of the response variables on continuous covariates. The model can be written as

$$\begin{aligned}
\mathrm{crime}^*_i &= x^+_{1i}\alpha_1 + s_1(\mathrm{age}_i) + \varepsilon_{1i} \\
\mathrm{trust}^*_i &= \beta\,\mathrm{crime}_i + x^+_{2i}\alpha_2 + s_2(\mathrm{age}_i) + \varepsilon_{2i}
\end{aligned}
\qquad i = 1, \ldots, n, \qquad (2)$$
where crime∗i and trust∗i are continuous latent variables determining the observed binary outcomes crimei and trusti through the rules 1(crime∗i > 0) and 1(trust∗i > 0), x+ 1i th and x+ row vectors containing the parametric model components described in 2i are the i
Section 2, with corresponding parameter vectors $\alpha_1$ and $\alpha_2$, and $s_1$ and $s_2$ are unknown one-dimensional smooth functions of the continuous covariate age, represented using thin plate regression splines (Wood, 2006). In short, the generic smooth function of age is given as a linear combination of known thin plate regression spline bases, $b_j(age)$, and unknown regression parameters, $\delta_j$. That is, $s(age) = \sum_{j=1}^{J} \delta_j b_j(age)$, where $J$ is the number of bases. Calculating $b_j(age_i)$ for each $j$ and $i$ yields $J$ curves encompassing different degrees of complexity which, multiplied by some real valued parameters $\delta_j$ and then summed, give an estimated curve for $s(age)$ (see, e.g., Marra and Radice (2010) for a more detailed introduction). Smooth components are typically subject to some identifiability constraints such as $\sum_i s(age_i) = 0$. The errors $(\varepsilon_{1i}, \varepsilon_{2i})$ are assumed to follow the bivariate distribution
$$
\begin{pmatrix} \varepsilon_{1i} \\ \varepsilon_{2i} \end{pmatrix} \overset{iid}{\sim} N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right),
$$
where $\rho$ is the correlation coefficient and the error variances are normalized to unity, which is a conventional normalization required to identify the parameters in the model. The parameter of interest is $\beta$, through which it is possible to estimate the ATT. In order to identify this parameter, it is typically assumed that the exclusion restriction (ER) on the covariates holds (e.g. Maddala, 1983, p. 122). That is, the set of regressors in the first equation of (2) contains at least one regressor that is not included in the second equation. Such regressors are regarded as instrumental variables (IVs), which, in the current context, would have to be associated with crime, independent of trust conditional on the observed and unobserved confounders, and independent of the unobserved confounders (hence, independent of the errors $(\varepsilon_{1i}, \varepsilon_{2i})$; e.g. Marra and Radice, 2011). However, as demonstrated, e.g., in Wilde (2000) and Marra and Radice (2011), in recursive bivariate probit models, identification can be achieved even if the same regressors appear in both equations. In particular,
let us consider the linear combination $\psi_1 crime^*_i + \psi_2 trust^*_i$, where $\psi_1$ and $\psi_2$ are generic coefficients. Solving for $trust^*_i$ will yield an expression which differs structurally from the second equation in (2) by the term $(\psi_1/\psi_2)\, crime^*_i$. Hence, theoretical identification does not require the availability of any IV under correct model specification. In practice, however, both functional form and model errors are likely to be misspecified to some degree. In this case empirical identification is better achieved if the exclusion restriction on the covariates in the two equations holds (e.g. Little, 1985). For the current case study, because it has not been possible to identify a variable satisfying the three aforementioned core conditions for a valid instrument, the SRBP approach without ER has been employed to reduce the risk of obtaining biased estimates resulting from the use of a flawed instrument. Although we could check the assumption of normality (see Section 4.2), we are aware that the model could still be subject to misspecification of functional form. Provided estimates for the parametric and smooth function components are available, the quantities needed for the ATT can be estimated as
$$
\hat{E}(trust_{crime=1} \mid crime = 1) = \frac{\sum_{i \in crime} \Phi\left( \hat{\beta} + x^+_{2i}\hat{\alpha}_2 + \hat{s}_2(age_i) \right)}{n_{crime}},
$$
and
$$
\hat{E}(trust_{crime=0} \mid crime = 1) = \frac{\sum_{i \in crime} \Phi\left( x^+_{2i}\hat{\alpha}_2 + \hat{s}_2(age_i) \right)}{n_{crime}},
$$
where $\Phi$ is the distribution function of a standardized normal.
3.3.1
Parameter estimation and inference
Because the (ε1i , ε2i ) are assumed to be correlated, simultaneous parameter estimation is advisable. Letting θ be a parameter vector containing all parametric and smooth function regression coefficients as well as ρ, it is not difficult to write down the log-likelihood function, ℓ(θ), associated with model (2) (Green, 2007, pp. 738-741; Marra and Radice, 2011). In principle, such a model can be estimated by maximization of ℓ(θ). However, given the flexible model specification considered here, unpenalized parameter estimation would result
in smooth term estimates that are too wiggly. Although the primary interest is not in the smooth functions of age, if the estimated curves are exceedingly “wiggly” then this can lead to a biased estimate of $\beta$ and hence of the ATT, because of overfitting (Chib and Greenberg, 2007; Marra and Radice, 2011). This issue can be overcome by penalized likelihood maximization by recalling that the generic smooth term $s(age)$ has an associated penalty, $\delta^T S \delta$, where $\delta$ is a parameter vector containing all spline regression coefficients, and $S$ is a known positive semi-definite matrix measuring the roughness of the smooth component (for instance, the second-order roughness measure for a univariate spline penalty evaluates $\int s''(age)^2 \, d\,age$). By using penalties during the model fitting process, it is possible to suppress that part of smooth term complexity which has no support from the data (e.g. Marra and Radice, 2010). The model is therefore fitted by maximization of the penalized log-likelihood
$$
\ell_p(\theta) = \ell(\theta) - \frac{1}{2}\left( \lambda_1 \delta_1^T S_1 \delta_1 + \lambda_2 \delta_2^T S_2 \delta_2 \right), \qquad (3)
$$
where the two terms within brackets represent the penalties associated with s1 (age) and s2 (age), and λ1 and λ2 are smoothing parameters controlling the trade-off between fit and smoothness. Given values for λ1 and λ2 , maximization of (3) is straightforward. However, the smoothing parameters themselves have to be estimated in practice. This usually involves the use of specialized numerical routines minimizing, for instance, a prediction error criterion so that the estimated smooth functions are as close as possible to the true functions. In the current case, multiple smoothing parameter estimation is achieved by minimization of the approximate unbiased risk estimator (UBRE), which can also be thought of as an approximate rescaled Akaike information criterion (Craven and Wahba, 1979). Full computational details can be found in Marra and Radice (2011). The inferential theory for penalized spline models is complicated by the presence of penalties, which undermines the use of classic asymptotic likelihood results for practical modelling. As explained in Marra and Radice (2011), CIs for the components in the semiparametric bivariate probit model can be constructed using the results for the well known Bayesian
‘confidence’ intervals typically employed in a generalized additive model context (e.g. Gu, 2002). One of the reasons for using such results is that the resulting intervals include both a bias and variance component, a fact that makes such intervals have good observed frequentist coverage probabilities across the function (Marra and Wood, 2012). Interval calculations are therefore based on the result $\theta \mid y \,\dot{\sim}\, N(\hat{\theta}, V_\theta)$, where $y$ contains the response vectors, $\hat{\theta}$ is the estimate of $\theta$, and $V_\theta$ represents the inverse of the penalized Fisher information matrix obtained at convergence of the algorithm used to fit the model (see Marra and Radice (2011) for further details). Given this result, CIs for linear functions of the model parameters can be easily obtained. Furthermore, CIs for nonlinear functions such as the estimated ATT can be conveniently obtained by simulation from the posterior distribution of $\theta$. Note that, for any strictly parametric model component, using such intervals is equivalent to using classic likelihood results. This is because parametric model terms are not penalized. Also, there is no contradiction in fitting model (2) by penalized log-likelihood estimation and then constructing confidence intervals following a Bayesian approach, and such a procedure has been employed many times in the literature (e.g. Gu, 2002; Wood, 2006).
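The simulation-based interval for the ATT described above can be sketched as follows. This is an illustration under stated assumptions rather than the SemiParBIVProbit implementation: the names `att_from_theta` and `att_interval` are hypothetical, θ is reduced to the coefficients entering the trust equation (β first, followed by the coefficients of the parametric and spline columns), and the design matrix, estimate and covariance matrix are made-up toy values.

```python
import numpy as np
from math import erf, sqrt

def norm_cdf(z):
    # Standard normal distribution function, applied elementwise
    return 0.5 * (1.0 + np.vectorize(erf)(np.asarray(z, dtype=float) / sqrt(2.0)))

def att_from_theta(theta, X2_treated):
    """Plug-in ATT: mean over treated units of Phi(beta + eta) - Phi(eta).

    theta[0] is beta; theta[1:] multiplies the trust-equation design matrix
    (parametric and spline basis columns) of individuals with crime = 1.
    """
    eta0 = X2_treated @ theta[1:]
    return float(np.mean(norm_cdf(theta[0] + eta0) - norm_cdf(eta0)))

def att_interval(theta_hat, V_theta, X2_treated, n_sim=1000, level=0.95, seed=0):
    """Interval for the ATT from draws of theta ~ N(theta_hat, V_theta)."""
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(theta_hat, V_theta, size=n_sim)
    atts = np.array([att_from_theta(d, X2_treated) for d in draws])
    alpha = 1.0 - level
    lower, upper = np.quantile(atts, [alpha / 2.0, 1.0 - alpha / 2.0])
    return att_from_theta(theta_hat, X2_treated), (float(lower), float(upper))

# Hypothetical inputs: 200 treated individuals, an intercept and one covariate
rng = np.random.default_rng(1)
X2_treated = np.column_stack([np.ones(200), rng.normal(size=200)])
theta_hat = np.array([-0.5, -0.8, 0.3])   # beta, intercept, slope (toy values)
V_theta = np.diag([0.01, 0.01, 0.01])     # toy posterior covariance
att_hat, (lower, upper) = att_interval(theta_hat, V_theta, X2_treated)
print(att_hat, lower, upper)
```

Because the ATT is a nonlinear function of θ, taking quantiles of the simulated ATT values avoids a delta-method approximation; the number of draws (1000 here) is a tuning choice.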
4
Modelling the trust-crime relationship
The empirical analysis was carried out in the R environment (R Development Core Team, 2011). All computations were performed using the packages twang (Ridgeway et al., 2010) in combination with survey (Lumley, 2011) for PSW, and SemiParBIVProbit for SRBP (Marra and Radice, 2012). These implement the methods discussed in Sections 3.2 and 3.3.
4.1
Model fitting details
The parameter settings to estimate the propensity score models via boosted CART were as follows: maximum number of iterations = 20000, number of splits = 4, shrinkage coefficient (γ) = 0.005. The measure for stopping the algorithm and assessing covariate balance was the largest of the KS statistics among the covariates in the model. P-values for the KS
statistics were calculated using 500 Monte Carlo trials. Increasing the maximum number of iterations, the number of splits and the number of Monte Carlo trials, and decreasing the shrinkage coefficient, did not change the final results, which are reported in the next section. Semiparametric recursive bivariate probit models without ER were fitted using smooth components represented by penalized thin plate regression splines with basis dimensions equal to 10 and penalties based on second-order derivatives. Multiple smoothing parameter estimation was achieved by using the approximate UBRE, and the tolerance used to judge the algorithm convergence was set to 1e-06. 95% CIs for the ATT were obtained using 1000 simulated draws from the posterior distribution of the estimated model parameters. Increasing the basis dimensions and the number of simulated draws, and decreasing the tolerance, did not change the final results, which are reported below. For comparison with PSW, classic semiparametric univariate probit models (henceforth Probit) were also fitted using the same thin plate regression spline settings as discussed above.
4.2
Results
We begin by discussing the evidence found using the PSW method. However, we must first check covariate balance as well as the common support of the propensity score. As for the balance, one of the most used statistics is the absolute standardised mean difference (ASMD), defined as the distance between the weighted treatment and control group means divided by the weighted treatment group standard deviation (Lee et al., 2010). There is no consensus on the size of the standardised difference that indicates imbalance. Some researchers propose that a value of 0.2 or greater denotes meaningful imbalance (Stuart and Rubin, 2007); others suggest that balance should be maximised without limit (Imai et al., 2008). Figure 1a illustrates the ASMD for each covariate for the sample of men. The circles on the left represent the ASMDs before weighting whereas those on the right represent the ASMDs after weighting; substantial reductions in the weighted ASMDs can be observed for all covariates (grey lines). Furthermore, filled circles indicate statistically significant mean
differences between the groups who perceive and do not perceive the risk of crime calculated using either t-tests or a χ2 statistic, depending on whether the variable is continuous or categorical; some of the variables are significant before weighting while none of the variables is significant after weighting. Similar conclusions can be drawn for the sample of women (Figure 1b). Overall, these results indicate that the balance of the covariates is clearly improved when using the PSW method. Notice that the ASMDs do not address imbalance beyond differences in means; non-parametric tests (such as KS statistics) could be used instead (Stuart, 2010). However, further work is required to provide weighted versions of these statistics. As for the common support, Figure 2 shows that the overlap of the estimated propensity score between the treatment and comparison groups is reasonable for both men and women.
[Figure 1 appears here: panels (a) Men and (b) Women; vertical axis: ASMD; horizontal axis: Unweighted, Weighted.]
Figure 1: Absolute standardised mean differences (ASMDs) of the observed confounders before and after weighting in the sample of men and women. The filled circles indicate a statistically significant mean difference between those who perceived a risk of crime and those who did not, and the grey lines represent substantial reductions in the ASMDs after weighting. The covariates considered were: region of residence, marital status, educational level achieved, the presence of limits in daily activities, the professional status, and each individual’s age.
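The ASMD defined above can be computed as in the following sketch. The function name `asmd`, the use of the population (divide-by-n) form of the weighted standard deviation, and the toy ages and weights are assumptions made for illustration; they are not taken from the twang or survey packages.

```python
import numpy as np

def asmd(x, treat, w=None):
    """Absolute standardised mean difference of covariate x between groups.

    The difference in (weighted) group means is divided by the (weighted)
    standard deviation of the treatment group, as in the definition above.
    """
    x = np.asarray(x, dtype=float)
    treat = np.asarray(treat).astype(bool)
    w = np.ones_like(x) if w is None else np.asarray(w, dtype=float)

    def wmean(values, weights):
        return np.sum(weights * values) / np.sum(weights)

    m1 = wmean(x[treat], w[treat])
    m0 = wmean(x[~treat], w[~treat])
    # Assumption: population form of the weighted variance in the treated group
    sd1 = np.sqrt(wmean((x[treat] - m1) ** 2, w[treat]))
    return abs(m1 - m0) / sd1

# Hypothetical ages: three individuals with crime = 1 and three with crime = 0
age = [30, 40, 50, 20, 30, 60]
crime = [1, 1, 1, 0, 0, 0]
weights = [1, 1, 1, 1, 1, 1.5]   # toy ATT-style weights (treated units keep weight 1)

asmd_before = asmd(age, crime)
asmd_after = asmd(age, crime, weights)
print(asmd_before, asmd_after)
```

In this toy example the weights were chosen so that the weighted comparison mean equals the treated mean, which is the kind of reduction the grey lines in Figure 1 depict.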
[Figure 2 appears here: panels Men and Women; horizontal axis: estimated propensity score; vertical axis: density.]
Figure 2: Density of the estimated propensity scores of both groups, those who perceive risk of crime (black line) and those who do not (gray line), for men and women.
The ATT results are reported in Table 3. For both genders, the ATT is statistically significant and negative. This result means that the perception of the risk of crime reduces the probability of social trust. Moreover, the magnitude of the effect is slightly higher for women than for men (-0.070 and -0.061, respectively). The ATTs computed using PSW and Probit are virtually identical for both men and women. This suggests that, although regression is more prone to functional form misspecification, the flexibility of the semiparametric univariate probit may have allowed us to overcome such a problem. In other words, this result suggests that Probit can account for observed confounders in the same way as a method such as PSW does, which supports the use of regression-based models for the current case study. However, as stressed in the previous sections, these two approaches cannot control for the endogeneity of perceived risk of crime. This issue can be addressed using the SRBP model described in Section 3.3. One important implication of the results described in this paragraph is that SRBP will be able to account for observed confounders in the same way that Probit did. As explained in Section 3.3, because parameter estimates are inconsistent when the model error distribution is misspecified, and because identification without an instrument requires the model assumptions to be met, it is important to check the empirical validity of the normality assumption in model (2). This was checked using a score test of bivariate
normality where the density of the errors under the alternative hypothesis is based on a type AA bivariate Gram Charlier series with 9 additional parameters (Chiburis, 2010; Lee, 1984; Murphy, 2007). Under the null, the distribution of the test statistic for the hypothesis that all 9 terms are zero can be approximated fairly well by a chi-squared distribution with 9 degrees of freedom, given standard regularity conditions. However, as demonstrated by Murphy (2007), bootstrapped critical values lead to a better (higher order) asymptotic approximation. Since such a test is currently available for classic bivariate probit models only, we implemented it using the result that a penalised regression spline is approximately equivalent to a pure regression spline with degrees of freedom close to that of the penalised spline (e.g. Wood, 2006, pp. 210-212). This means that the appropriate use of pure splines can yield estimated smooth functions that are very similar to those obtained when employing penalised splines. In this way, a semiparametric bivariate probit model with smooth terms represented using pure splines is simply a parametric bivariate probit model for which the test described above can be employed. P-values based on 4999 bootstrap replications were in the range (0.69,0.81) for all bivariate models considered here, which supports the hypothesis of bivariate normality. Although the assumption of normality seems to be met, identification without ER relies on correct functional form of the model, which cannot be checked. For this reason the results from this analysis should be regarded with caution and as complementary to those obtained with Probit and PSW. The ATT estimated using SRBP is equal to -0.144 for men and -0.124 for women; that is, the probability that men and women who perceive the risk of crime have trust in people is 0.144 and 0.124 lower, respectively, than that of men and women who do not perceive the risk of crime.
Method    $\widehat{ATT}_{men}$       $\hat{\rho}_{men}$       $\widehat{ATT}_{women}$     $\hat{\rho}_{women}$
PSW       −0.061 (−0.077, −0.046)     -                        −0.070 (−0.084, −0.046)     -
Probit    −0.061 (−0.074, −0.046)     -                        −0.069 (−0.082, −0.056)     -
SRBP      −0.144 (−0.157, −0.130)     0.190 (0.161, 0.217)     −0.124 (−0.137, −0.112)     0.137 (0.108, 0.165)
Table 3: Estimates of the ATT and ρ by men and women in the relationship between social trust and perceived risk of crime. These were obtained applying the Probit, propensity score weighting (PSW) and semiparametric recursive bivariate probit (SRBP) approaches on the data described in Section 2. 95% Bayesian ‘confidence’ intervals for the average treatment effect and correlation coefficient of all cases, except for PSW, were obtained using 1000 coefficient vectors simulated from the posterior distribution of the estimated model parameters. Confidence intervals for the estimated ATT of PSW were obtained using robust standard errors generated by the weighted analytic model (see Appendix B).
CIs indicate that the estimated ATT is statistically significant for both men and women. The estimated correlation coefficient ($\hat{\rho}$) is positive for both men and women (see Table 3). This result suggests that the unobservables affecting the perceived risk of crime and trust are positively correlated. However, the endogeneity hypothesis has to be tested empirically. The null and alternative hypotheses of interest are H0 : ρ = 0 and H1 : ρ ≠ 0, where H0 corresponds to the absence of endogeneity of crime. Given the large datasets available here and that the assumption of bivariate normality holds, estimated correlation coefficients with corresponding CIs can be used for reliably testing H0 (Monfardini and Radice, 2008). The results in Table 3 support the presence of endogeneity for both men and women. The presence of such unmeasured confounders suggests that both Probit and PSW might yield biased results. Finally, Table 6 in Appendix C reports the estimates of β and of the coefficients of the other observed confounders obtained employing Probit and SRBP. For all estimated models, β is negative and statistically significant. Specifically, we found a strong increase in the absolute z-statistic values of β (for both genders) when observed and unobserved confounders are accounted for. Some interesting evidence was found for the observed confounders. Specifically, signs and magnitudes of the estimated coefficients for the spatial variable (region) indicate that social trust is higher in northern regions (which are also characterised by a higher economic attractiveness; Marra et al., 2012) than in the southern regions of Italy. This duality in social capital between northern and southern Italy has also been found by previous studies (e.g., Helliwell and Putnam 1995). Moreover, we found that the presence of limits in daily activities can have a detrimental impact on social trust. This finding is interesting because it can be interpreted to suggest that a sense of vulnerability can arise in
people with some limits in their ability to protect themselves. We also report the estimated smooth component for the continuous variable age. The plots in Figure 3 in Appendix C support the presence of non-linear patterns in the relationship between trust and age. Specifically, such a relationship follows a more marked inverted U-shape for men than for women. The reasons underlying these patterns are not entirely clear and deserve further investigation. Methodologically speaking, however, these shapes cannot be assumed a priori and might well be different for other countries. Hence the need for a more flexible method than the classic approach, which accounts for non-linearities by making a priori assumptions (for instance, with quadratic or cubic polynomial functions). The remaining estimated coefficients for the observed confounders were consistent with those reported in the literature (e.g., Alesina and La Ferrara, 2002).
5
Discussion
Motivated by an interest in understanding the reasons for social distrust in Italy, this study aimed to investigate the effect of perceived risk of crime on the probability of social trust among Italian men and women while accounting for both observed confounders and endogeneity. Using SRBP, Probit and PSW, we have found different results which should not be regarded as conflicting but rather as complementary. The bottom line message is that the perceived risk of crime has a negative and significant impact on the probability of social trust and that the endogeneity issue seems to be a concern for both men and women. Importantly, the SRBP allows us to relax assumptions related to specific functional forms for continuous covariates, thereby reducing specification errors by allowing the data to determine the appropriate relationships (i.e., linear or non-linear). Hence, the SRBP can be applied as a replacement for the fully parametric specification. As a future extension of our study, it would be interesting to investigate the relationship between the perceived risk of crime and social trust using longitudinal data, which are not currently available. Given the need and importance of providing initial evidence on the effect of perceived risk of crime on social trust in Italy, we conducted this analysis using the available cross-sectional data. Moreover, the ATT estimated for Italy should be compared with the results of other European countries, especially with countries that record the highest levels of social trust (e.g., Norway and Sweden). Although our study has focused primarily on the estimation of the effect of perceived risk of crime on social trust, the findings should be of interest to sociologists, economists, and policymakers because social distrust can spread easily and contributes to making social relations more complex and onerous and to reducing the quality of life. We must not forget that the loss of social trust cannot be ‘recovered or re-built’ easily or quickly. This is because people can view the environment around them as characterised by external and uncontrollable threats, and they may relate to unknown individuals with suspicion. We have observed that spatial differences in the percentage of respondents who perceived a risk of crime (Table 2) are reflected in some of the official statistics on crime events (Table 5 in Appendix A). For example, the highest incidences of bag-snatching registered in 2010 were located in Campania, Sicily, Puglia, and Lazio. The regions of Lombardy and Piedmont are characterised by the highest incidence of domestic theft (418 and 415 per 100000 inhabitants, respectively). Analysing the spatial frequency of robberies, we note that the largest numbers of these events were recorded in Campania, Lazio, Sicily, Piedmont and Lombardy, whereas Basilicata is the Italian region that registered the lowest number of these offences. Overall, the regions with the highest percentage of respondents who perceived a risk of crime are also those that registered the highest number of thefts (that is, the most frequent crime event registered in Italy) and robberies.
We have reported only the statistics for some types of offences that may be barriers to better quality of life and perceptions of personal security. However, the perception of the risk of crime is determined by a mix of factors, including direct or indirect experiences of crime in one’s residential area, watching crime reality programmes on television, and the subjective characteristics of individuals, such as psychological and health status. We recognise that further research is needed to better understand the mechanisms that regulate the relationship between perceptions of risk and crime events recorded by official statistics. For policy-makers, our analysis suggests that reducing the perceived risk of crime should be one of the challenges to address in the coming years to improve social trust in Italy. Hence, in defining the country’s political agenda, policymakers should include a number of initiatives for the prevention and suppression of disorder and major crimes. To best pursue this aim, we believe that policies should not only seek to improve the coordination and efficiency of the police force, public video surveillance, and the justice system but also consider initiatives aimed at improving social and economic conditions (for example, reducing income inequality2 and poverty; see also Blau and Blau, 1986; McIntyre and Lacombe, 2012). Because a full discussion on policies and best practices for the reduction of disorder and major crimes is beyond the scope of this paper, we refer the reader to Witte (1996), Weisburd and Eck (2004), Wilson and Petersilia (2011) and Vollaard (2012). Furthermore, we must not forget that a loss of social trust can impose an economic burden on individuals and on society as a whole. The literature is rich with studies that discuss and propose approaches for estimating costs (direct/indirect and tangible/intangible) of crime, fear of crime, and the perceived risk of crime (e.g., Becker, 1968; Dolan et al. 2005; Dolan and Peasgood, 2007; Loomes, 2007; Czabanski, 2008; Detotto and Vannini, 2010), but little or no attention has been focused on estimating costs (direct/indirect and tangible/intangible) linked with the ‘erosion’ of social trust due to the perceived risk of crime. In this regard, our empirical contribution should represent a useful step towards this important goal. However, future research is needed in order to propose robust approaches for converting the estimated ATT into monetary terms.
ACKNOWLEDGEMENT We would like to thank one anonymous reviewer for many suggestions, which have helped to improve the presentation and quality of the article.
2
For an extensive discussion of growing income inequality in Italy over time, please refer to OECD (2011, 2012).
Appendix A
                             Social trust
Region of residence          Men     Women
Piedmont and Valle Aosta     26.7    23.7
Lombardy                     28.5    24.5
Trentino Alto Adige          44.1    37.0
Veneto                       24.6    22.3
Friuli Venezia Giulia        28.8    27.3
Liguria                      29.9    27.0
Emilia-Romagna               26.5    21.2
Tuscany                      25.6    23.9
Umbria                       24.9    20.9
Marche                       25.0    21.2
Lazio                        25.0    23.7
Abruzzo                      21.3    19.8
Molise                       20.2    18.2
Campania                     16.5    13.4
Puglia                       18.7    18.6
Basilicata                   18.2    14.4
Calabria                     17.2    13.8
Sicily                       17.6    16.2
Sardinia                     21.5    21.4
Italy                        24.5    21.6
Table 4: Proportion, in percentages, of social trust among respondents by region of residence and gender. The typical geographic areas are North West (consisting of the following regions: Piedmont, Valle Aosta, Lombardy, and Liguria), North East (Trentino Alto-Adige, Veneto, Friuli-Venezia Giulia, and Emilia Romagna), Centre (Tuscany, Umbria, Marche, and Lazio), South and Islands (Abruzzo, Molise, Campania, Puglia, Basilicata, Calabria, Sicily and Sardinia).
                         Theft                                                  Robbery
Region                   Total     Bag-snatching   Domestic theft   Car theft   Total    Domestic robbery   Total crimes
Piedmont                 2337.4    20.5            414.7            174.4       62.5     3.6                5173.9
Valle Aosta              1384.6    3.1             250.7            25.8        10.9     0.8                3734.5
Lombardy                 2861.1    20.5            417.6            228.4       56.4     3.4                5185.7
Trentino Alto Adige      1414.0    3.7             110.9            15.0        15.9     0.8                2971.4
Veneto                   1983.5    7.5             266.2            60.5        24.4     2.3                3690.8
Friuli Venezia Giulia    1549.5    4.0             234.3            31.2        14.9     2.7                3110.5
Liguria                  2632.4    31.4            276.5            102.5       44.4     3.1                5630.4
Emilia-Romagna           2778.8    16.6            320.5            98.1        40.6     3.0                5122.2
Tuscany                  2314.1    20.1            340.4            58.9        37.0     3.5                4691.5
Umbria                   1900.4    11.3            307.9            59.6        26.0     3.4                3825.4
Marche                   1600.3    9.9             233.2            55.2        20.7     2.2                3463.8
Lazio                    2955.5    32.3            296.2            388.2       75.1     4.5                5100.9
Abruzzo                  1757.7    13.7            234.9            136.3       25.6     3.0                3863.0
Molise                   1184.7    4.7             134.1            125.0       10.3     0.0                2833.7
Campania                 1589.9    51.8            131.2            347.9       143.3    3.7                3557.9
Puglia                   1935.0    33.0            232.4            445.5       50.9     3.7                3794.5
Basilicata               768.1     2.7             112.2            67.8        9.2      1.9                2398.7
Calabria                 1195.1    10.5            121.6            220.7       30.3     4.3                3342.1
Sicily                   1881.6    36.6            219.3            295.9       64.4     5.0                3785.2
Sardinia                 1202.1    6.9             145.6            105.0       22.5     4.1                3293.9
Italy                    2190.7    23.5            279.7            211.4       55.8     3.5                4333.5
Table 5: Some official statistics on crime events registered in Italy in 2010 (by region). The crimes are those reported by the police to the judicial authority. Sources: ISTAT and Ministry of Interior. Total crimes include massacre, homicide, infanticide, beatings, intentional injury, threats, seizures in person, insults, sexual violence, corruption, exploitation and abetting prostitution, child pornography and possession of child pornography, theft, robbery, extortion, fraud and computer fraud, counterfeiting trademarks and industrial products, violations of intellectual property, stolen goods, recycling and use of money, goods or assets of illicit origin, damage, fire, drugs, attacks, criminal association, and smuggling. All values reported in the table are per 100000 inhabitants.
Appendix B
Boosted classification and regression trees
To simplify matters, boosted CART models the log-odds of crime, $g(X) = \log[p(X)/\{1 - p(X)\}]$, rather than $p(X)$. In the first step of the procedure, $g(X)$ is set to $\log\{\overline{crime}/(1 - \overline{crime})\}$, where $\overline{crime}$ is the average value of crime for the entire sample. In the next step, the algorithm searches for a small adjustment, $h(X)$, to add to the initial estimate so as to improve the fit of the model to the data. Model fit is measured by the following Bernoulli log-likelihood
$$
\ell\{g(X)\} = \sum_{i=1}^{n} \left( crime_i\, g(x_i) - \log[1 + \exp\{g(x_i)\}] \right), \qquad (4)
$$
with larger values implying a better fit. If the algorithm finds an adjustment which improves $\ell\{g(X)\}$, then $g(X) \leftarrow g(X) + \gamma h(X)$, where $\gamma \in (0, 1]$ is a shrinkage coefficient. Small values for $\gamma$ allow the procedure to make fine adjustments. This will certainly increase the number of iterations needed to produce good propensity score estimates, but will result in better model fits. $h(X)$ is a regression tree which models the residuals from the current model fit, where the $i$th residual is defined as
$$
r_i = crime_i - \frac{1}{1 + \exp\{-\hat{g}(x_i)\}}.
$$
Note that using h(X) to model the residuals is equivalent to estimating the derivative of the log-likelihood function, hence boosted CART can find the maximum likelihood estimate of g(X). In order to produce well-calibrated propensity score estimates, which are crucial to obtain an unbiased estimate of the ATT, the function describing the relationship between the predictors and crime is estimated using a regression tree as it can allow for very complex covariate-response relationships. Generally speaking, a tree-fitting algorithm starts by splitting the dataset into two regions on the basis of any pair of observed values
of any of the covariates. Among all the possible splits, the algorithm selects the one that minimizes a given prediction error. Splitting continues recursively until the tree includes the maximum number of splits, which is crucial to determine the tree complexity in that each additional split allows for additional interactions among covariates. Full details can be found in Breiman et al. (1984). Since each iteration will produce a new estimate of $g(X)$ which increases the log-likelihood, if a high number of iterations are performed then $\hat{g}(X)$ will eventually overfit the data, hence not providing a meaningful estimate of the propensity score. Because obtaining balance of the observed covariates between the two groups of individuals is the primary goal, the algorithm is iterated until the best matching between the covariates of the two groups, measured using the largest of Kolmogorov-Smirnov (KS) statistics among the predictors, is achieved. For categorical variables this is just the χ2 test. Confidence intervals (CIs) for the estimated ATT are calculated using robust standard errors generated by the weighted analytic model (Lee et al., 2010; Lumley, 2011). As an alternative, standard errors can be obtained using nonparametric bootstrap (Efron and Tibshirani, 1993), as suggested by Hernán et al. (2000).
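The boosting steps described in this appendix can be sketched in a few lines of Python. This is a simplified illustration, not the twang implementation: it assumes a single covariate, uses one-split trees (stumps) instead of trees with four splits, and stops after a fixed number of iterations rather than at the iteration giving the best KS balance. The function names and the simulated data are hypothetical.

```python
import numpy as np

def fit_stump(x, r):
    """One-split regression tree for residuals r on a single covariate x."""
    best = None
    for cut in np.unique(x)[:-1]:            # candidate split points
        left = x <= cut
        mean_left, mean_right = r[left].mean(), r[~left].mean()
        sse = np.sum((r - np.where(left, mean_left, mean_right)) ** 2)
        if best is None or sse < best[0]:    # keep the split with smallest error
            best = (sse, cut, mean_left, mean_right)
    _, cut, mean_left, mean_right = best
    return lambda z: np.where(z <= cut, mean_left, mean_right)

def log_lik(y, g):
    # Bernoulli log-likelihood of equation (4), with g the log-odds
    return float(np.sum(y * g - np.log1p(np.exp(g))))

def boost_log_odds(x, y, n_iter=100, shrinkage=0.05):
    """Boosted estimate of g(x) = log[p(x) / {1 - p(x)}]."""
    g = np.full(len(y), np.log(y.mean() / (1.0 - y.mean())))  # initial value
    for _ in range(n_iter):
        r = y - 1.0 / (1.0 + np.exp(-g))     # residuals from the current fit
        h = fit_stump(x, r)                  # tree fitted to the residuals
        g = g + shrinkage * h(x)             # g <- g + gamma * h
    return g

# Simulated data (hypothetical): perceived risk of crime increasing with age
rng = np.random.default_rng(1)
age = rng.uniform(20, 80, size=300)
p_true = 1.0 / (1.0 + np.exp(-(age - 50.0) / 10.0))
crime = (rng.uniform(size=300) < p_true).astype(float)

g_hat = boost_log_odds(age, crime)
pscore = 1.0 / (1.0 + np.exp(-g_hat))        # estimated propensity scores
g_init = np.full(len(crime), np.log(crime.mean() / (1.0 - crime.mean())))
print(log_lik(crime, g_init), log_lik(crime, g_hat))
```

A smaller shrinkage value makes each update more cautious and therefore calls for more iterations, which mirrors the trade-off described above.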
Appendix C
[Figure 3 appears here: panels Men and Women; horizontal axis: age; vertical-axis labels: f(age; 2.888) − eq. 2 and f(age; 4.113) − eq. 2.]
Figure 3: Estimated smooth components for the continuous variable age for men and women in the relationship between social trust and perceived risk of crime. These were obtained by applying the semiparametric recursive bivariate probit without exclusion restriction to the data described in Section 2. Results are on the scale of the respective linear predictors. The estimated degrees of freedom of the smooth curves are reported on the y-axis of each graph. Dashed lines represent 95% Bayesian ‘confidence’ intervals and the ‘rug plot’, at the bottom of each graph, shows the covariate values. eq. 2 refers to the impact of age on trust.
                                    Men                               Women
Variables                           Probit           SRBP             Probit           SRBP
Intercept                           -0.787 (-14.3)   -0.720 (-13.1)   -0.854 (-15.2)   -0.799 (-14.2)
Perceived risk of crime             -0.212 (-7.8)    -0.530 (-18.4)   -0.264 (-9.7)    -0.493 (-17.3)
Region of residence (region)
  Piedmont and Valle Aosta
  Lombardy                          0.067 (1.3)      0.104 (2.0)      0.040 (0.8)      0.062 (1.2)
  Trentino Alto Adige               0.461 (8.4)      0.413 (7.5)      0.353 (6.4)      0.316 (5.7)
  Veneto                            -0.064 (-1.1)    -0.060 (-1.1)    -0.038 (-0.7)    -0.039 (-0.7)
  Friuli Venezia Giulia             0.019 (0.3)      -0.011 (-0.2)    0.079 (1.2)      0.051 (0.8)
  Liguria                           0.064 (1.0)      0.059 (0.9)      0.082 (1.3)      0.071 (1.1)
  Emilia-Romagna                    -0.019 (-0.3)    -0.017 (-0.3)    -0.102 (-1.7)    -0.096 (-1.6)
  Tuscany                           -0.023 (-0.4)    -0.026 (-0.4)    -0.0002 (-0.0)   -0.004 (-0.1)
  Umbria                            -0.052 (-0.7)    -0.058 (-0.8)    -0.102 (-1.4)    -0.113 (-1.6)
  Marche                            -0.034 (-0.5)    -0.050 (-0.9)    -0.113 (-1.8)    -0.134 (-2.1)
  Lazio                             -0.069 (-1.2)    -0.026 (-0.4)    0.009 (0.2)      0.032 (0.5)
  Abruzzo                           -0.153 (-2.3)    -0.155 (-2.3)    -0.117 (-1.7)    -0.126 (-1.9)
  Molise                            -0.202 (-2.6)    -0.241 (-3.1)    -0.169 (-2.2)    -0.202 (-2.7)
  Campania                          -0.280 (-5.0)    -0.214 (-3.8)    -0.275 (-4.9)    -0.232 (-4.1)
  Puglia                            -0.219 (-3.6)    -0.202 (-3.3)    -0.089 (-1.5)    -0.085 (-1.4)
  Basilicata                        -0.289 (-3.7)    -0.347 (-4.5)    -0.321 (-3.9)    -0.365 (-4.5)
  Calabria                          -0.296 (-4.6)    -0.301 (-4.7)    -0.322 (-5.0)    -0.329 (-5.1)
  Sicily                            -0.269 (-4.6)    -0.255 (-4.4)    -0.217 (-3.7)    -0.217 (-3.7)
  Sardinia                          -0.135 (-2.0)    -0.179 (-2.7)    -0.054 (-0.8)    -0.089 (-1.4)
Marital status (marit)
  Single
  Married                           -0.101 (-3.1)    -0.096 (-2.9)    0.024 (0.7)      0.022 (0.6)
  Divorced/Separated                -0.073 (-1.4)    -0.072 (-1.4)    0.037 (0.8)      0.040 (0.8)
  Widowed                           -0.152 (-1.7)    -0.142 (-1.6)    -0.032 (-0.6)    -0.027 (-0.5)
Limits in daily activities (limit)
  No limitation
  Limitation not serious            -0.086 (-2.7)    -0.078 (-2.4)    -0.105 (-3.4)    -0.097 (-3.2)
  Severe limitation                 -0.194 (-3.5)    -0.180 (-3.2)    -0.182 (-3.2)    -0.164 (-2.9)
Education level (educ)
  Illiterate or primary school
  First-stage secondary school      0.110 (2.8)      0.115 (2.9)      0.150 (3.9)      0.159 (4.2)
  Secondary school education        0.349 (8.7)      0.356 (8.9)      0.369 (9.5)      0.379 (9.7)
  University degree                 0.699 (15.1)     0.708 (15.2)     0.620 (13.7)     0.625 (13.8)
Professional status (prof status)
  Employed
  Unemployed                        -0.113 (-2.7)    -0.104 (-2.5)    -0.214 (-4.6)    -0.211 (-4.5)
  Retired                           0.081 (1.9)      0.091 (2.1)      -0.131 (-3.1)    -0.130 (-3.1)
  Student                           0.228 (3.7)      0.231 (3.7)      0.106 (1.7)      0.102 (1.6)
  Housewife                                                           -0.156 (-4.8)    -0.154 (-4.7)
Observations                        16055            16055            17095            17095
Table 6: Estimates of β and of the coefficients of the observed confounders for men and women in the relationship between social trust and perceived risk of crime. These were obtained by applying the probit and semiparametric recursive bivariate probit without exclusion restriction (SRBP) approaches to the data described in Section 2. z-statistics are reported in parentheses.
References
[1] Alesina A and La Ferrara E (2002). Who trusts others? Journal of Public Economics, 85, 207–234.
[2] Almond G and Verba S (1963). The Civic Culture. Princeton: Princeton University Press.
[3] Amemiya T (1974). The nonlinear two-stage least-squares estimator. Journal of Econometrics, 2, 105–110.
[4] Angrist JD, Imbens GW, and Rubin DB (1996). Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 91, 444–472.
[5] Austin PC (2007). The performance of different propensity score methods for estimating marginal odds ratios. Statistics in Medicine, 26, 3078–3094.
[6] Austin PC (2008). A critical appraisal of propensity-score matching in the medical literature between 1996 and 2003. Statistics in Medicine, 27, 2037–2049.
[7] Becker GS (1968). Crime and punishment: an economic approach. Journal of Political Economy, 76, 169–217.
[8] Berggren N, Elinder M, and Jordahl H (2008). Trust and growth: a shaky relationship. Empirical Economics, 35, 251–274.
[9] Breiman L, Friedman JH, Olshen RA, and Stone CJ (1984). Classification and regression trees. Belmont, CA: Wadsworth International Group.
[10] Blau JR and Blau PM (1982). The cost of inequality: metropolitan structure and criminal violence. Sociological Quarterly, 27, 15–26.
[11] Bursik RJ Jr and Grasmick HG (1993). Neighborhood and crime. NY: Lexington Books.
[12] Chib S and Greenberg E (2007). Semiparametric modeling and estimation of instrumental variable models. Journal of Computational and Graphical Statistics, 16, 86–114.
[13] Chib S, Greenberg E, and Jeliazkov I (2009). Estimation of semiparametric models in the presence of endogeneity and sample selection. Journal of Computational and Graphical Statistics, 18, 321–348.
[14] Chiburis RC (2010). Score tests of normality in bivariate probit models: comment. Working paper. Available at https://webspace.utexas.edu/rcc485/www/research.html.
[15] Craven P and Wahba G (1979). Smoothing noisy data with spline functions. Numerische Mathematik, 31, 377–403.
[16] Czabanski J (2008). Estimates of cost of crime. History, methodologies, and implications. Springer Berlin Heidelberg.
[17] Dearmon J and Grier R (2011). Trust and the accumulation of physical and human capital. European Journal of Political Economy, 27, 507–519.
[18] Dolan P, Loomes G, Peasgood T, and Tsuchiya A (2005). Estimating the intangible victim costs of violent crime. British Journal of Criminology, 45, 958–976.
[19] Dolan P and Peasgood T (2007). Estimating the economic and social costs of the fear of crime. British Journal of Criminology, 47, 121–132.
[20] Drake C (1993). Effects of misspecification of the propensity score on estimators of treatment effect. Biometrics, 49, 1231–1236.
[21] Efron B and Tibshirani R (1993). An introduction to the bootstrap. New York: Chapman & Hall.
[22] Elgar FJ and Aitken N (2010). Income inequality, trust and homicide in 33 countries. European Journal of Public Health, 21, 241–246.
[23] Freund Y and Schapire R (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55, 119–139.
[24] Friedman JH, Hastie T, and Tibshirani R (2000). Additive logistic regression: A statistical view of boosting. Annals of Statistics, 28, 337–374.
[25] Friedman JH (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29, 1189–1232.
[26] Fukuyama F (1995). Trust: The Social Virtues and the Creation of Prosperity. New York: Free Press.
[27] Geis KJ and Ross CE (1998). New look at urban alienation: the effect of neighborhood disorder on perceived powerlessness. Social Psychology Quarterly, 61, 232–246.
[28] Greene WH (2007). Econometric Analysis. New York: Prentice Hall.
[29] Gu C (2002). Smoothing Spline ANOVA Models. Springer-Verlag: London.
[30] Gustavsson M and Jordahl H (2008). Inequality and trust in Sweden: some inequalities are more harmful than others. Journal of Public Economics, 92, 348–365.
[31] Harder VS, Morral AR, and Arkes J (2006). Marijuana use and depression among adults: Testing for causal associations. Addiction, 101, 1463–1472.
[32] Hardin R (2002). Trust and trustworthiness. New York: Russell Sage Foundation.
[33] Heckman J (1979). Dummy endogenous variables in a simultaneous equation system. Econometrica, 46, 931–959.
[34] Helliwell JF and Putnam RD (1995). Economic growth and social capital in Italy. Eastern Economic Journal, 21, 295–307.
[35] Herreros F and Criado H (2008). The state and the development of social trust. International Political Science Review, 29, 53–71.
[36] Hernán MA, Brumback B, and Robins JM (2000). Marginal structural models to estimate the causal effect of zidovudine on the survival of HIV-positive men. Epidemiology, 11, 561–570.
[37] Hirano K, Imbens GW, and Ridder G (2003). Efficient estimation of average treatment effects using the estimated propensity score. Econometrica, 71, 1161–1189.
[38] Holland PW (1986). Statistics and causal inference. Journal of the American Statistical Association, 81, 945–960.
[39] Imai K, King G, and Stuart EA (2008). Misunderstandings between experimentalists and observationalists about causal inference. Journal of the Royal Statistical Society Series A, 171, 481–502.
[40] In-Young K (2008). A historical and social interpretation of low trust in Italy and Korea. The Review of Korean Studies, 11, 149–168.
[41] Johnston KM, Gustafson P, Levy AR, and Grootendorst P (2008). Use of instrumental variables in the analysis of generalized linear models in the presence of unmeasured confounding with applications to epidemiological research. Statistics in Medicine, 27, 1539–1556.
[42] Jones AM (2007). Applied econometrics for health economists: a practical guide. Radcliffe Medical Publishing: London.
[43] Lewis DA and Salem G (1986). Fear of crime: incivility and the production of a social problem. New Brunswick, NJ: Transaction Books.
[44] Lee LF (1984). Tests for the bivariate normal distribution in econometric models with selectivity. Econometrica, 52, 843–863.
[45] Lee BK, Lessler J, and Stuart EA (2010). Improving propensity score weighting using machine learning. Statistics in Medicine, 29, 337–346.
[46] Lee BK, Lessler J, and Stuart EA (2011). Weight trimming and propensity score weighting. Public Library of Science, 6, e18174.
[47] Little R (1985). A note about models for selectivity bias. Econometrica, 53, 1469–1474.
[48] Loomes G (2007). Valuing reductions in the risk of being a victim of crime: the ‘willingness to pay’ approach to valuing the ‘intangible’ consequences of crime. International Review of Victimology, 14, 237–251.
[49] Lumley T (2011). Survey: Analysis of complex survey samples. R package version 3.24.
[50] McIntyre SG and Lacombe DJ (2012). Personal indebtedness, spatial effects and crime. Economics Letters, 117, 455–459.
[51] Maddala GS (1983). Limited Dependent and Qualitative Variables in Econometrics. Cambridge University Press: Cambridge.
[52] Maffei S and Merzagora Betsos I (2007). Crime and criminal policy in Italy: tradition and modernity in a troubled country. European Journal of Criminology, 4, 461–482.
[53] Marra G and Radice R (2010). Penalised regression splines: theory and application to medical research. Statistical Methods in Medical Research, 19, 107–125.
[54] Marra G and Radice R (2011). Estimation of a semiparametric recursive bivariate probit model in the presence of endogeneity. Canadian Journal of Statistics, 39, 259–279.
[55] Marra G, Miller DL, and Zanin L (2012). Modelling the spatiotemporal distribution of the incidence of resident foreign population. Statistica Neerlandica, 66, 133–160.
[56] Marra G and Radice R (2012). SemiParBIVProbit: semiparametric bivariate probit modelling. R package version 3.0.
[57] Marra G and Wood S (2012). Coverage properties of confidence intervals for generalized additive model components. Scandinavian Journal of Statistics, 39, 53–74.
[58] McCaffrey DF, Ridgeway G, and Morral AR (2004). Propensity score estimation with boosted regression for evaluating causal effects in observational studies. Psychological Methods, 9, 403–425.
[59] Michalos AC and Zumbo BD (2000). Criminal victimization and the quality of life. Social Indicators Research, 50, 245–295.
[60] Misztal B (1996). Trust in Modern Societies: The Search for the Bases of Social Order. New York: Polity Press.
[61] Monfardini C and Radice R (2008). Testing exogeneity in the bivariate probit model: A Monte Carlo study. Oxford Bulletin of Economics and Statistics, 70, 271–282.
[62] Murphy A (2007). Score tests of normality in bivariate probit models. Economics Letters, 95, 374–379.
[63] Kang J and Schafer J (2007). Demystifying double robustness: a comparison of alternative strategies for estimating a population mean from incomplete data. Statistical Science, 22, 523–580.
[64] Knack S and Keefer P (1997). Does social capital have an economic payoff? A cross-country investigation. The Quarterly Journal of Economics, 112, 1251–1288.
[65] Knack S (2003). Groups, growth and trust: cross-country evidence on the Olson and Putnam hypothesis. Public Choice, 117, 341–355.
[66] OECD (2011). Divided we stand: why inequality keeps rising. OECD Publishing, Paris.
[67] OECD (2012). Economic Policy Reforms: Going for Growth 2012. OECD Publishing, Paris.
[68] Putnam RD (2000). Bowling alone: the collapse and revival of American community. New York: Simon and Schuster.
[69] Putnam RD (2007). E Pluribus Unum: diversity and community in the twenty-first century. The 2006 Johan Skytte prize lecture. Scandinavian Political Studies, 30, 137–174.
[70] Richey S (2010). The impact of corruption on social trust. American Politics Research, 38, 676–690.
[71] Ridgeway G (1999). The state of boosting. Computing Science and Statistics, 31, 172–181.
[72] Ridgeway G, McCaffrey DF, and Morral AR (2010). Twang: Toolkit for Weighting and Analysis of Nonequivalent Groups. R package version 1.0-2.
[73] Robins JM (1994). Correcting for non-compliance in randomized trials using structural nested mean models. Communications in Statistics: Theory and Methods, 23, 2379–2412.
[74] Rosenbaum P and Rubin D (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70, 41–55.
[75] Rosenbaum P (1987). Model-based direct adjustment. Journal of the American Statistical Association, 82, 387–394.
[76] Ross CE and Jang SJ (2000). Neighborhood disorder, fear, and mistrust: the buffering role of social ties with neighbors. American Journal of Community Psychology, 28, 401–420.
[77] Ross CE, Mirowsky J, and Pribesh S (2001). Powerlessness and the amplification of threat: neighborhood disadvantage, disorder, and mistrust. American Sociological Review, 66, 568–591.
[78] Ross CE, Mirowsky J, and Pribesh S (2002). Disadvantage, disorder, and urban mistrust. City & Community, 1, 59–82.
[79] Rothstein B and Eek D (2009). Political corruption and social trust: an experimental approach. Rationality and Society, 21, 81–112.
[80] Rubin DB (1997). Estimating causal effects from large data sets using propensity scores. Annals of Internal Medicine, 127, 757–763.
[81] Salmi V, Smolej M, and Kivivuori J (2007). Crime victimization, exposure to crime news and social trust among adolescents. Young, 15, 255–272.
[82] Sampson RJ and Groves WB (1989). Community structure and crime: testing social-disorganization theory. American Journal of Sociology, 94, 774–802.
[83] Sampson RJ, Raudenbush SW, and Earls F (1997). Neighborhoods and violent crime: a multilevel study of collective efficacy. Science, 277, 918–924.
[84] Sampson RJ and Raudenbush SW (1999). Systematic social observation of public spaces: a new look at disorder in urban neighborhoods. American Journal of Sociology, 105, 603–651.
[85] Shah BR, Laupacis A, Hux JE, and Austin PC (2005). Propensity score methods give similar results to traditional regression modelling in observational studies: a systematic review. Journal of Clinical Epidemiology, 58, 550–559.
[86] Skogan WG (1990). Disorder and decline. Berkeley: University of California Press.
[87] Stiles LB, Halim S, and Kaplan BH (2003). Fear of crime among individuals with physical limitations. Criminal Justice Review, 28, 232–253.
[88] Stuart EA and Rubin DB (2007). Best practices in quasi-experimental designs: matching methods for causal inference. Best Practices in Quantitative Social Science. Sage Publications: New York, 573–176.
[89] Stuart EA (2010). Matching methods for causal inference: A review and a look forward. Statistical Science, 25, 1–21.
[90] Torpe L and Lolle H (2011). Identifying social trust in cross-country analysis: do we really measure the same? Social Indicators Research, 103, 481–500.
[91] Ukoumunne OC, Williamson E, Forbes AB, Gulliford MC, and Carlin JB (2010). Confounder-adjusted estimates of the risk difference using propensity score-based weighting. Statistics in Medicine, 29, 3126–3136.
[92] Vansteelandt S and Goetghebeur E (2003). Causal inference with generalized structural mean models. Journal of the Royal Statistical Society Series B, 65, 817–835.
[93] Vollaard B (2012). Preventing crime through selective incapacitation. The Economic Journal. DOI: 10.1111/j.1468-0297.2012.02522.x
[94] Weisburd D and Eck JE (2004). What can police do to reduce crime, disorder, and fear? The Annals of the American Academy of Political and Social Science, 593, 42–65.
[95] Welch MR, Rivera REN, Conway BP, Yonkoski J, Lupton PM, and Giancola R (2005). Determinants and consequences of social trust. Sociological Inquiry, 75, 453–473.
[96] Wilde J (2000). Identification of multiple equation probit models with endogenous dummy regressors. Economics Letters, 69, 309–312.
[97] Wilson WJ (1996). When work disappears. The world of the new urban poor. New York: Alfred A. Knopf.
[98] Wilson JQ and Petersilia J (2011). Crime and public policy. Oxford University Press, New York.
[99] Wood SN (2006). Generalized additive models: An Introduction with R. London: Chapman & Hall.
[100] Witte AD (1996). Urban crime: issues and policies. Housing Policy Debate, 7, 731–748.
[101] Wooldridge JM (2010). Econometric analysis of cross section and panel data. MIT Press, Cambridge.
[102] Zanin L (2011). Detecting unobserved heterogeneity in the relationship between subjective well-being and satisfaction in various domains of life using the REBUS-PLS path modelling approach: a case study. Social Indicators Research, DOI: 10.1007/s11205-011-9931-5
[103] Zak PJ and Knack S (2001). Trust and growth. The Economic Journal, 111, 295–321.
[104] Zhao Z (2004). Using matching to estimate treatment effects: data requirements, matching metrics, and Monte Carlo evidence. Review of Economics and Statistics, 86, 91–107.