Psychological Bulletin 1980, Vol. 87, No. 3, 525-530
Additive Constants Model: A Randomized Response Technique for Eliminating Evasiveness to Quantitative Response Questions

Samuel Himmelfarb and Stephen E. Edgell
University of Louisville

The randomized response technique (RRT) was designed to eliminate distortions of answers to and refusals to respond to questions of a sensitive nature. This article presents a general additive constants RRT model for application to questions that require a numerical response. Three special cases of the additive constants model are described and shown to be more statistically efficient than the most efficient but idealized case of the unrelated question quantitative response model RRT. A few suggestions for implementing the additive constants model are given and discussed.

The authors would like to thank Carl Lickteig for his comments on an earlier draft of this article. Requests for reprints should be sent to Samuel Himmelfarb, Department of Psychology, University of Louisville, Louisville, Kentucky 40208.

Copyright 1980 by the American Psychological Association, Inc. 0033-2909/80/8703-0525$00.75

Warner's (1965) randomized response technique (RRT) was designed to eliminate evasiveness in response to questions of a sensitive, possibly embarrassing, or stigmatizing nature that require a dichotomous response. We are indebted to Levy (1976, 1977) for calling our and other psychologists' attention to this technique in the Psychological Bulletin. The purpose of this article is to describe a new model of the randomized response technique, the additive constants RRT, that may be applied to questions that require a numerical response. Before describing our model, we briefly review Warner's original proposal.

Warner's RRT

Warner (1965) was concerned that directly questioning a respondent about a sensitive or socially undesirable matter would lead the respondent to refuse to answer the question or to distort the true response. Warner ingeniously suggested that we not confront the respondent with the sensitive question by itself but with two questions, the sensitive one and its logical complement. The respondent is given a randomizing device, for example, a die, and told to roll it in a way so that its outcome is concealed from the interviewer. The respondent may be further instructed to answer Question 1 (the sensitive one) if the roll of the die is 3 or higher and to answer Question 2 if a 1 or 2 is obtained. Warner suggested that the RRT, if understood by the respondent, would eliminate response refusals and evasiveness in answering because the technique guarantees the respondent complete privacy: No one can know for certain to which question the answer pertained.

Despite the uncertainty, Warner showed that sample estimates of the population parameters could be obtained through the application of elementary probability theory (cf. Levy, 1976, 1977; Warner, 1965, 1971). However, the technique has one serious drawback: The randomization process introduces an additional source of statistical error that makes estimation of population parameters less efficient than in the standard case. Depending on the probability $P$ that the respondent is directed by the device to answer the sensitive question, Warner's RRT might require as much as 25 or more times the number of respondents as in the standard binomial case. Of course, the efficiency of the RRT can be improved by setting higher values
for $P$, but as $P$ approaches unity, the RRT approaches direct questioning. Warner's technique, then, is elegant but costly. Consequently, much of the research on the RRT since Warner's presentation has concerned the development of methodological variants that would improve its statistical efficiency. Horvitz, Greenberg, and Abernathy (1975) have reviewed these variants, and Himmelfarb and Lickteig (Note 1) recently compiled a bibliography of most of the theoretical and empirical research on the RRT.

Additive Constants Quantitative Response Model

Warner's RRT and more efficient variants thereof disguise the true response to the sensitive question by creating uncertainty as to which question is being answered. For example, in the unrelated question variant (Greenberg, Abul-Ela, Simmons, & Horvitz, 1969; Horvitz, Shah, & Simmons, 1967), respondents either answer the sensitive question or a nonsensitive, unrelated question. The trade-off for the uncertainty created is the inefficiency of having a proportion $1 - P$ of the respondents answer a totally irrelevant question.

The central insight behind the additive constants RRT is that it is possible to create uncertainty yet still gain some information about the sensitive question. In our model we do this by having the respondents add or subtract one or more constants from the true answer a certain proportion of the time. Since the constants are of known values, the respondent really provides information about the true answer to the question. We first consider the general case and then a few subcases that seem to be most usable.
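The efficiency cost of Warner's original design, mentioned above, can be made concrete with a short sketch. It assumes the standard textbook form of Warner's estimator and its variance, which is not reproduced in this article. The function names and the choice of a true proportion of 0.5 are illustrative assumptions only.

```python
def warner_estimate(yes_proportion, p):
    """Estimate the sensitive proportion from the observed share of 'yes' answers.

    p is the probability that the device selects the sensitive question;
    with probability 1 - p the respondent answers its logical complement.
    """
    if p == 0.5:
        raise ValueError("p = 0.5 makes the answers uninformative")
    return (yes_proportion - (1 - p)) / (2 * p - 1)


def warner_variance(pi, p, n):
    """Variance of Warner's estimator for true proportion pi and n respondents."""
    sampling_part = pi * (1 - pi) / n  # same as direct questioning
    randomization_part = p * (1 - p) / (n * (2 * p - 1) ** 2)  # cost of the device
    return sampling_part + randomization_part


def sample_size_ratio(pi, p):
    """Respondents needed under Warner's RRT per respondent under direct questioning."""
    return warner_variance(pi, p, 1) / (pi * (1 - pi))


if __name__ == "__main__":
    # Assumed true proportion of 0.5; P values chosen only for illustration.
    for p in (0.6, 2 / 3, 0.8, 0.9):
        print(f"P = {p:.3f}: ratio = {sample_size_ratio(0.5, p):.2f}")
```

Under these assumptions, a true proportion of 0.5 with P = 0.6 gives a ratio of 25, and the die rule described earlier (P = 2/3) gives a ratio of 9. The ratio falls toward 1 as P approaches unity.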
General Case

For illustrative purposes we assume that the respondent is handed a deck of 100 cards. On each card is printed the following: "Take the amount in dollars that the head of this household earned last year, add to it an amount $K_i$, and tell me the answer to your addition." The value of $K_i$ varies through the deck, and there are $c$ different constants, with the probability of occurrence of constant $K_i$ being $P_i$ ($i = 1, \ldots, c$). The respondent is instructed to shuffle the cards thoroughly, take the top card from the deck, read what it says to do, and do it without telling the interviewer what the card said.

Let $X$ represent the true value of the head of the household's earnings last year (or whatever the sensitive question is), and let $Z$ represent the respondent's answer. Thus, the respondent's numerical response is

$$Z = X + K_i, \tag{1}$$

with probability $P_i$ ($i = 1, \ldots, c$). Certainly one of the $K_i$'s could be equal to zero, in which case the respondent answers directly the question about income (or whatever the sensitive question is). However, that is not necessary, and it is possible to arrange the situation so that the respondent never directly states how much is actually earned by the head of the household.

We now turn to the derivation of a sample estimate of the mean and the variance of the sampling distribution of the mean on the sensitive question. Let $g$ and $f$ be the density functions of the respondents' numerical responses and the true value of the variable to be measured, respectively. Then, by Equation 1, $g(Z) = \sum_{i=1}^{c} P_i f(X + K_i)$. Therefore, $E(Z) = \sum_{i=1}^{c} P_i E(X + K_i) = E(X) + \sum_{i=1}^{c} P_i K_i$. An unbiased estimate of the population mean $\mu_X$ is given by

$$\hat{\mu}_X = \bar{Z} - \sum_{i=1}^{c} P_i K_i, \tag{2}$$

where $\bar{Z}$ is the sample mean of the numerical responses. The variance of $\hat{\mu}_X$ is then

$$\operatorname{var}(\hat{\mu}_X) = \operatorname{var}(\bar{Z}) = \operatorname{var}(Z)/N, \tag{3}$$

where $N$ is the number of respondents and

$$\operatorname{var}(Z) = E(Z^2) - E(Z)^2 \tag{4}$$
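The general case can be checked numerically. Assuming the card drawn is independent of the true value $X$, expanding Equation 4 with Equation 1 gives $\operatorname{var}(Z) = \sigma_X^2 + \sum P_i K_i^2 - (\sum P_i K_i)^2$. This is ordinary algebra and is offered here as a worked step, not as a quotation from the article. The sketch below simulates a deck and applies Equation 2. The function names, the deck of constants, and the normally distributed earnings are all hypothetical choices made for illustration.

```python
import random


def expected_shift(constants, probs):
    """Sum of P_i * K_i, the average amount added by the cards."""
    return sum(p * k for k, p in zip(constants, probs))


def masking_variance(constants, probs):
    """Variance contributed by the cards: sum P_i K_i^2 - (sum P_i K_i)^2."""
    shift = expected_shift(constants, probs)
    return sum(p * k * k for k, p in zip(constants, probs)) - shift ** 2


def simulate_responses(true_values, constants, probs, rng):
    """Each respondent draws one card and reports Z = X + K_i (Equation 1)."""
    draws = rng.choices(constants, weights=probs, k=len(true_values))
    return [x + k for x, k in zip(true_values, draws)]


def estimate_mean(responses, constants, probs):
    """Equation 2: sample mean of Z minus the expected shift."""
    return sum(responses) / len(responses) - expected_shift(constants, probs)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical deck and earnings distribution, chosen only for illustration.
    constants = [500, 1500, 3000]
    probs = [0.5, 0.3, 0.2]
    incomes = [rng.gauss(20000, 5000) for _ in range(20000)]
    answers = simulate_responses(incomes, constants, probs, rng)
    print("true sample mean:", sum(incomes) / len(incomes))
    print("RRT estimate:    ", estimate_mean(answers, constants, probs))
    print("added variance:  ", masking_variance(constants, probs))
```

In this sketch no respondent reports the true value, since no constant is zero, yet the estimate should land close to the true sample mean. For a symmetric deck with constants 0, ±K, and ±2K, `expected_shift` is zero and `masking_variance` reduces to 2.5(1 − P)K² when the four nonzero constants each have probability (1 − P)/4.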
Special Case 1

We now consider the first special case of the general model described previously. Let $K$ be a constant. The respondents are required to respond directly to the sensitive question with probability $P$. If the randomization device does not direct them to answer the question directly, then, with equal probability [i.e., $\tfrac{1}{4}(1 - P)$], they are directed either to add $K$ or $2K$ or to subtract $K$ or $2K$ from the true value and give the resultant as their answer. For this case,

$$\hat{\mu}_X = \bar{Z}, \tag{5}$$

and

$$\operatorname{var}(\hat{\mu}_X) = [\sigma_X^2 + 2.5(1 - P)K^2]/N. \tag{6}$$

Special Case 2

This case is similar to Case 1 except that if the respondents are not directed by the randomizing device to answer the question directly, then with equal probability [$\tfrac{1}{2}(1 - P)$] they are directed to either add or

(Abernathy, Greenberg, & Horvitz, 1970) the women respondents were directed either to answer a sensitive question about the earnings of the head of the household in the previous year with probability $P$ or to state their opinions about the average yearly earnings of the head of a household of their same family size with probability $1 - P$.

The central problem in the estimation of the population parameters of the sensitive question and the efficiency of the unrelated question RRT is to unconfound the data by having information about the population parameters of the unrelated question. The most efficient case is when the population parameters of the unrelated question are known in advance, for example, from census data. Otherwise, a second independent sample using a second value of $P$ is required to provide information about the unrelated question.

To show the superiority of our model, we compare it with the case of the unrelated question RRT when the population mean $\mu_Y$ and the population standard deviation