Article
The Effect of Survey Mode on Socially Undesirable Responses to Open-ended Questions: A Mixed Methods Approach
Field Methods 2018, Vol. 30(2) 105–123. © The Author(s) 2018. DOI: 10.1177/1525822X18766284. journals.sagepub.com/home/fmx
Danielle Wallace1, Gabriel Cesar2, and E. C. Hedberg3

1 School of Criminology and Criminal Justice, Arizona State University, Phoenix, AZ, USA
2 School of Criminology and Criminal Justice, Florida Atlantic University, Boca Raton, FL, USA
3 NORC at the University of Chicago, Chicago, IL, USA

Corresponding Author: Danielle Wallace, School of Criminology and Criminal Justice, Arizona State University, 411 N. Central Ave., Phoenix, AZ 85004, USA. Email: [email protected]

Abstract
We estimate the difference in the chance of giving a socially undesirable response, one that violates social norms, by administration mode (online or in person) for a self-administered questionnaire (SAQ). We do this by evaluating open-ended responses to a photographic stimulus designed to generate undesirable responses. We test five variables: socially undesirable responses generally, blatant stereotyping, distaste, inappropriate descriptors, and the number of words in the response. The results show that, contrary to expectations, online SAQs are not more likely to produce socially
undesirable responses generally, but online responses are less likely to contain blatant stereotyping. Responses from the online SAQs were longer, suggesting that respondents from the paper-and-pencil SAQs may have used stereotypes to bypass writing lengthy responses. The longer responses of online respondents enabled them to avoid stereotyping, thereby biasing their responses in an alternate way.
Introduction

Social desirability is understudied in open-ended questions. Only a few studies exist that compare social desirability in open-ended questions to other question types (Allison et al. 2002; Ivis et al. 1997), and, while important, these studies do not consider various modes of administration, such as online versus paper-and-pencil surveys, as potentially impacting social desirability. Furthermore, to the best of our knowledge, there are no studies that address the existence of nonfiltered or politically incorrect responses in open-ended questions. This article addresses this issue using a mixed method approach to examine social desirability across two modes of data collection: online and paper-and-pencil self-administered questionnaires (SAQs).

Attention has been paid to socially desirable responses, or responses where the respondent is not truthful so as not to be seen in a negative light (Holbrook et al. 2003; Kuran 1995; Tourangeau and Smith 1996; Tourangeau and Yan 2007). However, there may be variability in response social desirability by mode of administration. Online SAQs or surveys, as opposed to surveys delivered in person, offer the ability to answer questions without face-to-face interaction. This allows respondents to express opinions or feelings they may not feel are socially desirable or that make them uncomfortable when in person. Thus, online surveys may be less likely to produce socially desirable (i.e., filtered or politically correct) responses on prosocial topics such as church attendance (Presser and Stinson 1998; Richman et al. 1999; Tourangeau and Yan 2007).

Without the face-to-face cues individuals rely on to guide social interactions, online surveys give respondents the ability to answer survey questions without feeling the social presence of the researcher, enhancing the sense of privacy individuals feel when taking the survey. Thus, people may respond in socially unacceptable ways (in this study, we call this social undesirability) that would typically not occur in some face-to-face settings. Further, evidence exists showing that the Internet disinhibits individuals
(Cassidy et al. 2009; Herring et al. 2002; Phillips 2011); thus, there is reason to suspect that this disinhibition may find its way into responses (Christopherson 2007). In short, the lack of social contact coupled with feelings of enhanced privacy may lead individuals to give more candid (Kreuter et al. 2008) and perhaps socially inappropriate responses to survey questions when taking the survey online.

Conversely, online surveys may elicit more socially desirable responses from individuals, or at least less informative responses. Due to increased concern over privacy on the Internet, online surveys may make subjects more leery about the confidentiality and anonymity that surveys typically offer (Fogel and Nehmad 2009; Young and Quan-Haase 2009). As such, their responses might reflect elements of social desirability to control personal information.

To reconcile this paradox, we assess whether taking an SAQ online influences the likelihood of giving a socially undesirable response. We explore this issue through open-ended responses to an SAQ regarding neighborhood disorder, defined here as visual cues in a neighborhood's physical environment that convey the potential for crime and safety issues (Skogan 1990; Wallace 2015). We present respondents with a photographic stimulus expected to generate socially undesirable responses. Respondents are asked to freewrite a response to the photographic stimulus. Using a mixed method approach, we then code the responses for elements of social undesirability. Because the SAQ has a mixed administration mode (paper and online), it provides an opportunity to measure whether the level of social undesirability differs by mode and, if it does, in what aspects of undesirability.

While there has been a wealth of research on social desirability in surveys, most of that work is quantitative. Scholars have a well-defined sense of when social desirability appears in surveys. Similarly, the moments where, in qualitative research, the researcher potentially influences data, such as during interviews or even through the researcher's presence, have also been well documented. However, to our knowledge, there has been limited work addressing the presence of social desirability in open-ended questions. Here lies the major contribution of our study.

Mixed method approaches are excellent for adding insight into complex research problems (Branigan 2002; Creswell 2013; Stewart et al. 2008), and the issue of social desirability is one of these problems, given the increasing number of sensitive topics being broached in surveys (Tourangeau and Smith 1996; Tourangeau and Yan 2007). Without the mixed method approach of this study, or specifically examining the open-ended answers to a photographic stimulus,
the element of social undesirability we found would not be readily observable. This study capitalizes on one of the major benefits of mixed method research: allowing one method to connect with another for a more nuanced and powerful exploration of a common research dilemma such as social desirability (Mayoh and Onwuegbuzie 2013).
Social Desirability and Survey Mode

Survey mode may have an impact on how individuals respond to sensitive questions (Christopherson 2007; Tourangeau and Smith 1996). Of particular interest here is social undesirability in responses, or when responses to survey questions violate the social norms that surround the context of the question. Consider a question that asks respondents to describe the ideal racial composition of their neighborhood; a socially undesirable response to this question might include racial slang, stereotyping, or expressions of disgust. While the respondents could have expressed their view without their underlying racial attitudes being known, they chose not to. Below, we discuss why online surveys may have an impact on the prevalence of socially undesirable responses.
Web Surveys Generate Socially Undesirable Responses

Scholarship shows that the online environment can be rife with bullying, name calling, and harassment (Cassidy et al. 2009; Herring et al. 2002; Phillips 2011). Because of the enhanced sense of privacy people may feel when using the Internet, individuals demonstrate little restraint when discussing opinions, attitudes, or behaviors that would be unacceptable in face-to-face situations. As such, online surveys offer social anonymity, or "the perception of others and/or one's self as unidentifiable because of a lack of cues to use to attribute an identity to that individual" (Christopherson 2007:3040). For instance, Coffey and Woolworth (2004) found that when comparing a town hall meeting concerning a local problem to an anonymous online forum about the same local problem, the forum contained overtly racist and violence-laden comments that were not expressed in the town hall. Coupled with the many ways in which the web impacts social interactions, the online SAQ format may impact responses to socially delicate topics. Particularly with open-ended responses, the online mode may present respondents with an opportunity to feel less pressure to respond in a prosocial way.
Additionally, some research suggests that respondents are more open to responding to sensitive questions when taking surveys on the web than in other, non-Internet-based settings (Kreuter et al. 2008; Kuran and McCaffery 2004). When presented with questions regarding discrimination over the phone and online, Kuran and McCaffery (2004) found that people were more willing to discuss discrimination when taking an online survey. Similarly, Kreuter et al. (2008) found that respondents were more likely to report sensitive information when taking an online survey. Thus, if the subject is controversial or if the respondent wants to discuss the topic in a controversial manner, respondents may be more willing to engage the topic in an online survey format.
Web Surveys Generate Fewer Socially Undesirable Responses

Individuals are becoming savvier about privacy on the Internet. Scholarship demonstrates individuals' growing concerns about privacy, particularly in regard to social networking websites such as Facebook (Fogel and Nehmad 2009; Young and Quan-Haase 2009). Online respondents may be more concerned about the disclosure of both personal information and their responses to sensitive questions than individuals taking paper surveys, regardless of whether the survey offers confidentiality and/or anonymity. Research on web-based surveys shows that respondents' perceived sensitivity of some items is more extreme when taking online surveys (Kreuter et al. 2008), perhaps due to an increased realization that the Internet is not private. Individuals' answers to sensitive questions may thus be less accurate and more socially desirable in order to control content and privacy.
Current Study

This study examines whether mode of administration (online versus paper) for an SAQ has an impact on the likelihood of a respondent delivering a socially undesirable response. Given research showing that the online environment is conducive to violating social norms (Cassidy et al. 2009; Herring et al. 2002; Phillips 2011), we hypothesize that respondents taking the online SAQ will be more likely to give socially undesirable responses than those who took the SAQ on paper. To do this, we use a photographic stimulus of neighborhood disorder that is expected to generate socially undesirable responses, given disorder's ties to racial/ethnic stereotypes (Sampson and Raudenbush 2004; Wacquant 1993). We then test whether subjects' responses to an open-ended question in the online version of the
SAQ were more likely to result in a socially undesirable response than subjects' responses from the paper version. It is important to note that while we examine the likelihood of socially undesirable responses across modes, we are not able to ascertain why individuals respond in certain ways.

The well-established link between disorder, race, and crime is the primary reason behind the specific formulation of our hypothesis. While space limitations prohibit a lengthy examination of individuals' perceptions of disorder (see Wallace et al. [2014] for a discussion), we briefly discuss why individuals' disorder perceptions may be tied to socially undesirable responses. Disorder is most prominent in crime-ridden, minority, and disadvantaged neighborhoods (Sampson and Raudenbush 2004). Due to the strong and durable link between race, poverty, crime, and racial segregation in the United States (Wacquant 1993), research demonstrates that individuals view the racial composition of a neighborhood as a primary indicator of disorder and the potential for crime and victimization (Sampson and Raudenbush 2004). This link ties disorderly neighborhoods to ideas of race, racism, and the ghetto or "ghettoness" (Anderson 2012). Through the process of ecological contamination, residents of disorderly neighborhoods are assumed to be minority, criminal, poor, and without morals (Werthman and Piliavin 1967). Given that our stimulus hints at these processes, our hypothesis speculates that respondents from the online SAQ will be more likely to give socially undesirable responses than their non-online counterparts, where responses reference many of the above themes.
Method

Data

Our data are from the Perceptions of Neighborhood Disorder and Interpersonal Conflict Project (PNDICP), a study that investigates individuals' perceptions and interpretations of neighborhood contexts and interpersonal conflicts. Students from a large southwestern university in the United States were recruited for the study. Because the university from which the sample was drawn conducts many classes online, two modes of administration were created: paper SAQ and online SAQ. Some students were enrolled in more than one class recruited for the study. The total sample included 1,371 currently enrolled, nonreplicated students. The respondents totaled 1,056, yielding a response rate of 77%. The mode-specific response rates differed, with an 86% response rate for the paper SAQ and a 67% response rate for the online SAQ. We calculated the response rate for each course surveyed using
the number of students currently enrolled in the course as the denominator; thus, our response rates do not distinguish between not attending the class on the day of the administration and refusal. The lowest response rate, 42%, was for an online course; the highest, 98%, was for an on-campus class. All instructors who participated in the administration of the survey were given the option to offer extra credit to their students for participation. The web survey was administered using Qualtrics, which allows for complex randomization designs and contingency questions. Nearly 55% of respondents took the online survey.
Respondents

Nearly 83% of respondents reported being between the ages of 20 and 29, while 86% reported being single. Variation in race and ethnicity was more pronounced: 53% of the respondents were white, about 30% were Hispanic, and 7% were black. Other races in the sample included Native American, Asian, and Other. While a student sample lacks generalizability, the admission rate of the university where the study was conducted was approximately 89% (Reisig and Pratt 2011); thus, the sample was heterogeneous enough to generate differences across respondents. The final sample size of open-ended responses to the stimulus photograph in the PNDICP was 1,019.
Stimulus

Our stimulus, as seen in Figure 1, is a high-resolution, color photograph of a western Chicago neighborhood that is low- to lower-middle class and largely Hispanic. The photograph contained a religious mural, street art, and graffiti. The murals tacitly cued the ethnic composition of the neighborhood. Upon seeing the photograph, respondents were asked to give their immediate or gut reaction and finish the following statement: "Based on this photograph, I get the sense that this neighborhood is . . . ." Given that individuals' disorder perceptions do vary (Sampson and Raudenbush 2004; Wallace 2015; Wallace et al. 2014), the photographic experiments1 in the PNDICP were designed to hold the neighborhood constant to inhibit bias coming from one's personal neighborhood context. Consequently, interpretations of the photograph are attributable to the individual and not to the neighborhood context in which they currently live.
Figure 1. The photographic stimulus.
Coding Social Undesirability

The coding process began with coders identifying and discussing common themes emerging from the open-ended responses. The coding team immersed themselves in the responses by reading and entering the data.
Coding meetings involved an iterative process in which each coder suggested themes while examining the data (Maxwell 2005; Saldaña 2009). Understanding that the operationalization of social undesirability is subjective (Crowne and Marlowe 1960; Edwards 1957), initial research meetings focused on flagging responses as socially undesirable and then determining the emergent themes of social undesirability in the responses. Once these types of undesirability were established, MAXQDA version 10.0 was used to organize the data and attach codes to responses.

Responses were coded as socially undesirable when they appeared to violate some social norm surrounding the context of the stimulus photograph (i.e., the responses violated acceptable ways to describe or discuss a neighborhood). These responses tended to be culturally or racially insensitive, include slang or swearing, or demonstrate the respondent's dislike of the neighborhood. Approximately 20% of responses contained some element of social undesirability. Undesirable responses were categorized into three types: inappropriate descriptors, blatant stereotyping, and express distaste.

Inappropriate descriptors include responses containing profanity such as "shit hole" and "gun-toting assholes." Responses that contained culturally insensitive terms, like ghetto, were also coded as inappropriate descriptors.2 Blatant stereotypes included responses that overtly link certain types of people to the places in the photographs. Responses like "lazy people live here" and "dominated by guilt-ridden Hispanic Catholics" typify the presence of blatant stereotyping in the responses. This coding practice is strengthened by the photograph itself: The stimulus does not portray individuals; thus, any reference to people in the responses was derived only from the physical cues in the photograph. Finally, express distaste responses included expressions of distaste, disapproval, or a disinclination to participate in a community context. "OH, NO!" is a typical example of express distaste in the current sample; the capitalization and exclamation point also drive the tenor of the response.

The reliability of the coding was assessed using Cohen's κ statistic, which equaled .68 and demonstrated 88.6% agreement among coders.
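To make the reliability check concrete, the sketch below shows one way agreement statistics like those reported above (Cohen's κ and raw percentage agreement) could be computed from two coders' binary flags. The coder arrays and the use of scikit-learn are illustrative assumptions, not part of the study's actual pipeline.

```python
# Illustrative sketch (hypothetical codes, not the study's data): computing
# Cohen's kappa and raw percentage agreement between two coders' binary
# "socially undesirable" flags for the same set of responses.
from sklearn.metrics import cohen_kappa_score

coder_a = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]  # 1 = flagged as socially undesirable
coder_b = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]

kappa = cohen_kappa_score(coder_a, coder_b)  # chance-corrected agreement
agreement = sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)  # raw agreement

print(f"Cohen's kappa: {kappa:.2f}")
print(f"Percent agreement: {agreement:.1%}")
```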
Analysis Strategy

Our analysis strategy explicitly seeks to compensate for the fact that online classes, and thus online SAQs, were not assigned randomly. This required us to consider the propensity of taking the SAQ online (i.e., treatment) in our estimate of the mean difference between online and paper SAQs. To estimate the propensity weights, we fit a logit model predicting the likelihood
of taking the SAQ online using demographic variables, specifically whether the student lived in the same metropolitan area as the university, and their race, age, and gender. When age was missing (7.5% of the sample), the missing value was replaced with the mean (24 years old). When race or gender values were missing, a nonresponse category was added to the factor variable used in the analysis procedures. The location indicator was an important measure because some online students lived in a different region of the country. We then predicted the individual's propensity to take an online SAQ (Klausch et al. 2015; Vannieuwenhuyze and Loosveldt 2013) and created weights equal to:

$$w_i = \frac{X_i}{e_i} + \frac{1 - X_i}{1 - e_i}, \qquad X_i = \begin{cases} 1 & \text{if respondent } i \text{ took the SAQ online} \\ 0 & \text{otherwise,} \end{cases}$$

where $X_i$ is a dichotomous variable indicating that the respondent took the SAQ online, $e_i$ is the predicted probability (propensity), based on the logistic regression, that the respondent took the SAQ online, and $w_i$ is the weight for respondent $i$.

To assess the effectiveness of our weights in balancing the covariates between online and paper SAQs, we examined the standardized difference (Rosenbaum and Rubin 1985) in the covariates between the groups (online vs. paper) without weights (the raw difference) and while employing weights (the weighted difference). We also specified models with the number of words included in the logistic model used to derive the weights. We found that the effects reported were very similar, except that our significant effect for blatant stereotyping was somewhat stronger and more significant. To be conservative, we report the findings without this adjustment.

Next, to quantify the plausible effects of administration mode on the nature of the responses, we used the derived weights to estimate two quantities. The first is the average treatment effect (ATE), an estimate of the plausible average difference between online and paper SAQs generically for all individuals. To examine the robustness of these results, we also estimate the average treatment effect on the treated (ATET), a more nuanced estimate of how the treatment (online SAQs) impacted those who received the online SAQ. A difference between these two quantities speaks to the possibility that the treatment effects are moderated by characteristics specific to the treated. These effects were estimated using a doubly robust method (Bang and Robins 2005); not only were the weights based on the set of covariates, but the covariates were also used for regression adjustment.
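A minimal sketch of the weighting and balance-checking steps just described, assuming a hypothetical respondent-level data frame with columns online, local_zip, race, age, and gender (the study itself ran the models in Stata): fit a logit for the propensity of taking the SAQ online, form the weights from the formula above, and compare raw versus weighted standardized differences.

```python
# Sketch of the propensity-weighting step under assumed column names.
# The data frame, file name, and variables are hypothetical stand-ins for the
# PNDICP data; the original analysis was carried out in Stata 14.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("pndicp_respondents.csv")  # hypothetical file

# Logit for the probability of taking the SAQ online, using the demographics
# described above (local residence, race, age, gender).
logit = smf.logit("online ~ local_zip + C(race) + age + C(gender)", data=df).fit()
df["e"] = logit.predict(df)                                            # propensity e_i
df["w"] = df["online"] / df["e"] + (1 - df["online"]) / (1 - df["e"])  # weight w_i

def std_diff(x, t, w=None):
    """Standardized mean difference between treated (t == 1) and control (t == 0)."""
    w = np.ones(len(x)) if w is None else w
    m1 = np.average(x[t == 1], weights=w[t == 1])
    m0 = np.average(x[t == 0], weights=w[t == 0])
    pooled_sd = np.sqrt((x[t == 1].var() + x[t == 0].var()) / 2)
    return (m1 - m0) / pooled_sd

age, online, w = df["age"].to_numpy(), df["online"].to_numpy(), df["w"].to_numpy()
print("raw difference:     ", std_diff(age, online))     # before weighting
print("weighted difference:", std_diff(age, online, w))  # after weighting
```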
Robust standard errors were also estimated to quantify the uncertainty of the estimated mean difference. The entire analysis procedure was carried out using Stata version 14's "teffects ipwra" routine. Statistical testing was accomplished by constructing the ratio of the estimated difference to its standard error and then comparing this ratio to the normal distribution.

Our analysis focused on five outcomes. The first was a dichotomous outcome measuring whether a respondent's answer was categorized as socially undesirable. The next three outcomes, which were also dichotomous, are the types of socially undesirable responses: inappropriate descriptors, blatant stereotyping, and express distaste. Finally, to test whether the results for these four outcomes were an artifact of the method of inputting the responses (i.e., typing the responses for the online SAQ or using a writing instrument to write them down on paper for the paper SAQ), we also tested the treatment effects on the length of the responses, calculated as the number of words in the response (not characters). The dichotomous outcomes were analyzed using logistic regression, but the results are presented as differences in probability. The number-of-words outcome was analyzed using a linear estimator.
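For concreteness, the following is a rough Python analogue, continuing the hypothetical data frame and weights from the previous sketch, of an inverse-probability-weighted, regression-adjusted (doubly robust) point estimate such as the one the authors obtained via Stata's teffects ipwra. It is shown for the word-count outcome (an assumed column n_words) and omits the robust standard errors the paper reports.

```python
# Rough analogue of an IPW-with-regression-adjustment point estimate.
# Continues the previous sketch: df is the hypothetical PNDICP data frame with
# inverse-probability weights in column "w" and word counts in "n_words".
import statsmodels.formula.api as smf

covars = "local_zip + C(race) + age + C(gender)"

# Outcome models fit separately for online and paper respondents,
# each weighted by the inverse-probability weights.
m_online = smf.wls(f"n_words ~ {covars}", data=df[df.online == 1],
                   weights=df.loc[df.online == 1, "w"]).fit()
m_paper = smf.wls(f"n_words ~ {covars}", data=df[df.online == 0],
                  weights=df.loc[df.online == 0, "w"]).fit()

# ATE: average difference in predicted potential outcomes over the full sample.
ate = (m_online.predict(df) - m_paper.predict(df)).mean()

# ATET: the same contrast averaged over online (treated) respondents only.
treated = df[df.online == 1]
atet = (m_online.predict(treated) - m_paper.predict(treated)).mean()

print(f"ATE  (online - paper), words: {ate:.2f}")
print(f"ATET (online - paper), words: {atet:.2f}")
```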
Results

Table 1 presents the characteristics of the sample, the standardized raw mean differences on the set of demographic covariates, and the standardized mean differences after the weighting adjustment. The majority (93%) of the sample lives in a zip code in the university's metropolitan area; respondents who took the online survey were less likely to live in a local zip code. About half of the respondents are white and almost a third are Hispanic. The remaining sample is primarily black, Asian, or Other. The average age is about 24, with the sample almost evenly split between males and females.

Next in Table 1, the weighting procedure reduced the magnitude of the online versus paper difference in living locally by two-thirds. Hispanics were less likely to participate in the SAQ online, while females were more likely to take the SAQ online, though these imbalances were removed by the weighting procedure. For the other demographic measures, the weighting procedure eliminated a large portion of the standardized difference between modes. Exceptions to this include the American Indian and race/ethnicity nonresponse indicators (which had small proportions, so achieving balance is difficult) and age, for which the difference was reduced by only 60%.
Table 1. Descriptive Statistics of Sample, Raw Standardized Differences between Online and Paper SAQs, and Weighted Standardized Differences between Online and Paper SAQs.

Demographics             Sample Mean   Raw Standardized Difference   Weighted Standardized Difference
Local zip code               0.934              .169                          .053
Race
  White (reference)          0.522               —                             —
  Black                      0.070              .020                          .004
  Hispanic                   0.285              .293                          .001
  Asian                      0.026              .056                          .001
  American Indian            0.017              .022                          .038
  Other                      0.063              .001                          .014
  Nonresponse                0.018              .005                          .008
Age                         24.033              0.374                         0.150
Gender
  Male (reference)           0.468               —                             —
  Female                     0.525              .280                          .005
  Nonresponse                0.007              .009                          .004

Note: N = 1,019 respondents: 556 online SAQs, 463 paper SAQs. SAQs = self-administered questionnaires.
Table 2 presents the estimated treatment effects using the weighting/regression procedure for all five outcomes. The estimated differences between online and paper SAQs are not statistically significant for most of the socially undesirable outcomes. The exception is blatant stereotyping, which was less likely to occur in online responses by about 5%, using the ATE estimator. This effect was only marginally significant using the average treatment effect on the treated estimator but was substantively the same. The other significant finding pertains to the model for the number of words used: online responses to the SAQ were about five words longer. This difference was significant for both the ATE and ATET estimators.
Discussion

Since the upsurge in the use of online surveys, survey researchers have been concerned with whether online SAQs generate different responses than paper SAQs. The online environment may potentially generate malicious responses because of the unique social environment of the Internet,
Table 2. Estimated Treatment Effects.

Outcomes        Any Response   Inappropriate Descriptors   Blatant Stereotyping   Express Distaste   Number of Words
Sample mean        .147               .063                       .097                  .038              13.742
Online mean        .128               .063                       .07                   .036              16.23
Paper mean         .171               .063                       .13                   .041              10.754
ATE                .035               .01                        .054*                 .003               5.430***
                  (.024)             (.014)                     (.022)                (.011)             (0.731)
ATET               .025               .019                       .052†                 .002               5.483***
                  (.029)             (.013)                     (.026)                (.011)             (0.786)

Note: N = 1,019 respondents: 556 surveys online, 463 surveys paper. Robust standard errors are in parentheses. ATE = average treatment effect; ATET = average treatment effect on the treated.
†p < .10. *p < .05. ***p < .001.
where individuals can hide their identity and be dishonest without many consequences (de Leeuw 2012; Miller et al. 2002; Richman et al. 1999). Thus, understanding if and when social (un)desirability in responses occurs is a concern (Christopherson 2007). We address the issue of social undesirability across administration mode by using a mixed method design involving qualitative coding and a photographic stimulus. Our primary findings are discussed below.

To begin with, we found that the online SAQ respondents were less likely to express socially undesirable responses in the form of blatant stereotyping than our respondents who took the paper SAQ. Thus, our hypothesis is unsupported. Evidence from online communities and forums would suggest more socially undesirable responses would come from the online SAQ respondents (Blevins and Holt 2009; Christopherson 2007; Hardaker 2010; Herring et al. 2002; Phillips 2011); however, we found the opposite. Thus, for stimuli aimed at eliciting a potentially sensitive and socially undesirable response, online SAQs are unique in that respondents offered more socially acceptable responses.

These results lend support to research suggesting Internet consumers are growing more aware of online privacy issues (Fogel and Nehmad 2009; Young and Quan-Haase 2009) and may be more likely than previously to think that online surveys are not fully anonymous or confidential. Research needs to be conducted to ascertain whether this is indeed the case. This
finding also supports work demonstrating that respondents have heightened concern over sensitive questions when fielded in an online survey (Kreuter et al. 2008). While online SAQs enable the creation of cheap and easy-to-conduct surveys, a consequence of this mode of administration may be that respondents are less trusting of our assurances of anonymity and confidentiality. For paper SAQs, claims of privacy have face validity, whereas respondents may not perceive online SAQs as anonymous and confidential, despite researchers' assurances.

Next, respondents for the paper SAQ were far more likely to offer blatant stereotyping; this speaks to the mode of the SAQ influencing the level of detail in responses (Kiesler and Sproull 1986; Wright 2005). Distasteful language, slang, and stereotyping are all examples of complex ideas or processes that have been condensed. In the paper SAQ, there was finite space to handwrite responses, which may have promoted socially undesirable "shorthand." Further, since the online SAQs were taken at the time of the respondent's choosing, online respondents may have been less likely to rush through the survey. Furthermore, typing is more comfortable than writing for many individuals since typing is now common in many individuals' lives, especially students in an online class. In sum, despite research finding online SAQs to be associated with shorter responses, the increasing ubiquity of mobile computing devices (Couper 2013) may have led to the longer responses we found for online SAQs. Our second finding, that online SAQs elicited more words, supports these explanations.

The nonsignificant findings are also of note in this study. While we find a difference in blatant stereotyping across the two modes of administration, there are no differences generally, nor with the other two types of socially undesirable responses, across mode of administration. Thus, when responding to a contentious topic, photograph, or other stimulus likely to elicit socially undesirable responses, respondents are not more likely to respond a specific way depending on the mode. In other words, concerns about respondents being freer with socially inappropriate language on web-based surveys may be unfounded.

Our study has consequences for survey administration as well. We find that online SAQs may impact response content in open-ended questions by encouraging more detailed responses that avoid socially undesirable linguistic shortcuts. Thus, concern over online surveys inspiring malicious responses may be unnecessary. While survey administrators still need to carefully consider their topic in conjunction with socially (un)desirable responses, comparing open-ended questions across SAQ mode may not
pose problems, so long as differences in the word counts of responses across modes are examined.

There were limitations to our study worth noting. First, the response rates for the paper and online SAQs were quite different (86% vs. 67%). This signals that there may be an unobserved confounder related to both our findings and willingness to take the survey. This may not invalidate the study but should signal some caution about generalizability. Second, while we endeavored to increase internal validity with our propensity adjustments, our sample lacks external validity. However, the prevalence of substantive variations in responses to the stimulus should be minimal in such a constrained sample. Yet the variability that we saw across responses suggests that this variation, and any differences by survey mode found here, would likely be replicated in the general population. Work of this kind should be replicated using more representative samples. Next, it may be that our propensity model failed to adequately balance the online and paper SAQ responses (the variance explained by the predictors is only 6%) and that those who were likely to take online courses (older, white, and female respondents) were simply more polite than, or as polite as, the paper respondents. Another limitation of the current study was its reliance on a sample of criminal justice students to interpret neighborhood disorder. A small number of responses included direct references to concepts such as "broken windows" or "gang affiliated." However, given that only a few subjects responded in such a way, we remain relatively unconcerned about this influence on responses.

In conclusion, this study demonstrates that the mode of administration of SAQs, whether online or on paper, matters for socially undesirable responses, particularly responses that imply stereotyping of individuals and neighborhoods. Future research would do well to focus on how perceptions of Internet anonymity shape respondents' willingness to offer truthful, sensitive, or socially undesirable responses. Understanding differences in responses and social desirability by mode is vital to planning future surveys, particularly those assessing delicate topics.
Acknowledgments

The authors thank Andrea Borrego and Taylor Bianchino for their excellent work with data gathering, entry, and assistance with thematic coding, as well as Alyssa Chamberlain for her helpful comments. Additionally, we would like to thank the three reviewers of this piece for their thoughtful and excellent suggestions.
Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by the College of Public Service and Community Solutions at Arizona State University.
Notes

1. The project involved two experiments combined in one instrument. The first, a factorial vignette design, randomly assigned several measures of neighborhood cultural questions into vignettes. The second experiment involved testing perceptions and interpretations of disorder and how best to measure them. The first experiment in the project was not tested in this study.
2. There was extensive discussion among the coding team over the inclusion of the word ghetto in the inappropriate descriptors category. Given both the historical and the contemporary cultural context of the term ghetto (Wacquant 1993), as well as the enduring effects of popular perceptions of ghettoness (Anderson 2012), we argue that the term ghetto is not a socially appropriate way of describing a neighborhood.
References

Allison, L. D., M. A. Okun, and K. S. Dutridge. 2002. Assessing volunteer motives: A comparison of an open-ended probe and Likert rating scales. Journal of Community & Applied Social Psychology 12:243–55.
Anderson, E. 2012. The iconic ghetto. The ANNALS of the American Academy of Political and Social Science 642:8–24.
Bang, H., and J. M. Robins. 2005. Doubly robust estimation in missing data and causal inference models. Biometrics 61:962–73.
Blevins, K. R., and T. J. Holt. 2009. Examining the virtual subculture of Johns. Journal of Contemporary Ethnography 38:619–48.
Branigan, E. 2002. But how can you prove it? Issues of rigour in action research. Stronger Families Learning Exchange Bulletin 2:12–13.
Cassidy, W., M. Jackson, and K. N. Brown. 2009. Sticks and stones can break my bones, but how can pixels hurt me? Students' experiences with cyber-bullying. School Psychology International 30:383–402.
Christopherson, K. M. 2007. The positive and negative implications of anonymity in Internet social interactions: "On the Internet, nobody knows you're a dog." Computers in Human Behavior 23:3038–56.
Coffey, B., and S. Woolworth. 2004. "Destroy the scum, and then neuter their families": The web forum as a vehicle for community discourse? The Social Science Journal 41:1–14.
Couper, M. P. 2013. Is the sky falling? New technology, changing media, and the future of surveys. Survey Research Methods 7:145–56.
Creswell, J. W. 2013. Research design: Qualitative, quantitative, and mixed methods approaches. Thousand Oaks, CA: Sage.
Crowne, D. P., and D. Marlowe. 1960. A new scale of social desirability independent of psychopathology. Journal of Consulting Psychology 24:349.
de Leeuw, E. D. 2012. Counting and measuring online: The quality of Internet surveys. Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique 114:68–78.
Edwards, A. L. 1957. The social desirability variable in personality assessment and research. Academic Medicine 33:610–11.
Fogel, J., and E. Nehmad. 2009. Internet social network communities: Risk taking, trust, and privacy concerns. Computers in Human Behavior 25:153–60.
Hardaker, C. 2010. Trolling in asynchronous computer-mediated communication: From user discussions to academic definitions. Journal of Politeness Research: Language, Behaviour, Culture 6:215–42.
Herring, S., K. Job-Sluder, R. Scheckler, and S. Barab. 2002. Searching for safety online: Managing "trolling" in a feminist forum. The Information Society 18:371–84.
Holbrook, A. L., M. C. Green, and J. A. Krosnick. 2003. Telephone versus face-to-face interviewing of national probability samples with long questionnaires: Comparisons of respondent satisficing and social desirability response bias. Public Opinion Quarterly 67:79–125.
Ivis, F. J., S. J. Bondy, and E. M. Adlaf. 1997. The effect of question structure on self-reports of heavy drinking: Closed-ended versus open-ended questions. Journal of Studies on Alcohol 58:622–24.
Kiesler, S., and L. S. Sproull. 1986. Response effects in the electronic survey. Public Opinion Quarterly 50:402–13.
Klausch, T., B. Schouten, and J. J. Hox. 2015. Evaluating bias of sequential mixed-mode designs against benchmark surveys. Sociological Methods & Research 46:456–89.
Kreuter, F., S. Presser, and R. Tourangeau. 2008. Social desirability bias in CATI, IVR, and web surveys: The effects of mode and question sensitivity. Public Opinion Quarterly 72:847–65.
Kuran, T. 1995. Private truths, public lies: The social consequences of preference falsification. Cambridge, MA: Harvard University Press.
Kuran, T., and E. J. McCaffery. 2004. Expanding discrimination research: Beyond ethnicity and to the web. Social Science Quarterly 85:713–30.
Maxwell, J. 2005. Qualitative research design: An interactive approach. Thousand Oaks, CA: Sage.
Mayoh, J., and A. J. Onwuegbuzie. 2013. Toward a conceptualization of mixed methods phenomenological research. Journal of Mixed Methods Research 9:91–107.
Miller, T. I., M. M. Kobayashi, E. Caldwell, S. Thurston, and B. Collett. 2002. Citizen surveys on the web: General population surveys of community opinion. Social Science Computer Review 20:124–36.
Phillips, W. 2011. Meet the trolls. Index on Censorship 40:68–76.
Presser, S., and L. Stinson. 1998. Data collection mode and social desirability bias in self-reported religious attendance. American Sociological Review 63:137–45.
Reisig, M. D., and T. C. Pratt. 2011. Low self-control and imprudent behavior revisited. Deviant Behavior 32:589–625.
Richman, W. L., S. Kiesler, S. Weisband, and F. Drasgow. 1999. A meta-analytic study of social desirability distortion in computer-administered questionnaires, traditional questionnaires, and interviews. Journal of Applied Psychology 84:754.
Rosenbaum, P. R., and D. B. Rubin. 1985. Constructing a control group using multivariate matched sampling methods that incorporate the propensity score. The American Statistician 39:33–38.
Saldaña, J. 2009. The coding manual for qualitative researchers. Thousand Oaks, CA: Sage.
Sampson, R. J., and S. W. Raudenbush. 2004. Seeing disorder: Neighborhood stigma and the social construction of "broken windows." Social Psychology Quarterly 67:319–42.
Skogan, W. G. 1990. Disorder and decline: Crime and the spiral of decay in American neighborhoods. Berkeley: University of California Press.
Stewart, M., E. Makwarimba, A. Barnfather, N. Letourneau, and A. Neufeld. 2008. Researching reducing health disparities: Mixed-methods approaches. Social Science & Medicine 66:1406–17.
Tourangeau, R., and T. W. Smith. 1996. Asking sensitive questions: The impact of data collection mode, question format, and question context. Public Opinion Quarterly 60:275–304.
Tourangeau, R., and T. Yan. 2007. Sensitive questions in surveys. Psychological Bulletin 133:859.
Vannieuwenhuyze, J. T. A., and G. Loosveldt. 2013. Evaluating relative mode effects in mixed-mode surveys: Three methods to disentangle selection and measurement effects. Sociological Methods & Research 42:82–104.
Wacquant, L. J. D. 1993. Urban outcasts: Stigma and division in the black American ghetto and the French urban periphery. International Journal of Urban and Regional Research 17:366–83.
Wallace, D. 2015. A test of routine activities and neighborhood attachment explanations for bias in disorder perceptions. Crime & Delinquency 61:587–609.
Wallace, D., B. Louton, and R. Fornango. 2014. Do you see what I see? Perceptual variation in registering the presence of disorder cues. Social Science Research 51:247–61.
Werthman, C., and I. Piliavin. 1967. Gang members and the police. In The police: Six sociological essays, ed. D. Bordua, 56–98. New York: Wiley.
Wright, K. B. 2005. Researching Internet-based populations: Advantages and disadvantages of online survey research, online questionnaire authoring software packages, and web survey services. Journal of Computer-Mediated Communication 10:1–26.
Young, A. L., and A. Quan-Haase. 2009. Information revelation and Internet privacy concerns on social network sites: A case study of Facebook. In Proceedings of the fourth international conference on communities and technologies, ed. J. Carroll, 265–74. New York: ACM.