Yesterday’s post discussed approaches for asking survey questions of a sensitive nature in a way that would make individual respondents more inclined to answer them truthfully. Sometimes, however, you don’t care about the individual respondent’s answer to the sensitive question, but would rather get an idea of the incidence of that sensitive issue among all respondents. Knowing that incidence can be what we need in order to conduct further research, understand the market potential for a new product, or decide how to prioritize the allocation of resources. The most effective ways to do this are through Randomized Response Techniques, which are useful for assessing group behavior, as opposed to individual behavior.
Let’s assume that you are marketing a new over-the-counter ointment for athlete’s foot to college males, and you want to understand how large a market you have for your ointment. You decide to survey 100 randomly selected college males. Whether they’ve had athlete’s foot might be something they don’t want to answer, yet you’re not concerned with whether a particular respondent has athlete’s foot. You only want an estimate of how many college-age men suffer from it.
Try a Coin Toss
One indirect way of finding out the incidence of athlete’s foot among college men might be to ask a question like this:
“Flip a coin (in private) and answer ‘yes’ if either the coin came up heads or you’ve suffered from athlete’s foot in the last three months.”
If the respondent answers “yes” to the question, you will not know whether he did so because of the athlete’s foot or because of the coin toss. However, once you’ve compiled all the responses to this question, you can get a good estimate of the incidence of athlete’s foot among college males. You would figure it out as follows:
|Number answering “yes” (out of 100)|65|
|Expected number of heads on flip|50|
|Excess “yes” over expected|15|
|Percent with athlete’s foot (15/50)|30%|
Generally, when you flip a coin, you expect the toss to come up “heads” about 50% of the time, so about 50 of the 100 respondents will answer “yes” because of the coin alone. If 65 respondents answer “yes” to the heads/athlete’s foot question, then you are 15 over the expected value. Those 15 extra “yes” answers must come from the roughly 50 respondents who flipped tails, and 15 out of 50 gives you an estimate that 30% of respondents have athlete’s foot.
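The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_incidence_coin` and its parameters are made up for this example, and it assumes a fair coin and truthful respondents. It rests on the relationship P(yes) = P(heads) + P(tails) × incidence.

```python
def estimate_incidence_coin(yes_count, n_respondents, p_heads=0.5):
    """Estimate incidence from the coin-toss question.

    Assumes P(yes) = p_heads + (1 - p_heads) * incidence,
    i.e. a respondent says "yes" on heads OR if he has the condition.
    """
    if n_respondents <= 0:
        raise ValueError("n_respondents must be positive")
    if not 0 <= p_heads < 1:
        raise ValueError("p_heads must be in [0, 1)")
    yes_rate = yes_count / n_respondents
    # Remove the "yes" answers explained by the coin, then rescale
    # by the share of respondents who flipped tails.
    return (yes_rate - p_heads) / (1 - p_heads)


# The example from the text: 65 "yes" answers out of 100 respondents.
coin_estimate = estimate_incidence_coin(65, 100)
print(f"Estimated incidence: {coin_estimate:.0%}")
```

With a small sample the result can fall below zero (for instance, if fewer than half of the respondents answer “yes”), which simply signals that chance variation in the coin flips has swamped the signal.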
Roll the Dice
Another approach is to ask respondents to roll a die and respond to one statement if the roll comes up anywhere from 1 to 4 and to another if the roll comes up 5 or 6. If the die comes up as 1-4, the respondent answers the statement, “I have had athlete’s foot” with either a “Yes” or a “No.” Respondents whose die roll came up 5 or 6 answer “Yes” or “No” to the statement, “I have never had athlete’s foot.”
What is the probability that a respondent has had athlete’s foot? The probability of a “Yes” is determined as follows:
P(YES) = P(Directed to first question)*P(Answering Yes to first question) + P(Directed to second question)*P(Answering Yes to second question)
Remember that every respondent is directed to one question or the other. Hence the probability of being directed to the second question is 100% minus the probability of being directed to the first. Likewise, the second statement is the exact opposite of the first, so a truthful respondent answers “Yes” to it only if he would answer “No” to the first. Expressing the probabilities in decimal form, we modify the probability equation as follows:
P(YES)= P(Directed to first question)*P(Answering Yes to first question) + (1-P(Directed to first question))*(1-P(Answering Yes to first question))
In the above example, the probability of being assigned the first question (for rolling a 1-4) is .67 (four chances out of six, or two-thirds). Now, if 35 of the 100 respondents answered “Yes” to whichever statement they were assigned, we get the following equation, where “P” denotes the probability that a respondent has had athlete’s foot:
0.35 = 0.67P + 0.33(1-P)
0.35 = 0.67P + 0.33 – 0.33P
0.35-0.33 = 0.67P – 0.33P
0.02 = 0.34P
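P = 0.02 / 0.34 = 0.0588
Hence, an estimated 5.88% of respondents have had athlete’s foot. (The rounded values 0.67 and 0.33 inflate this figure slightly; with the exact probabilities of 2/3 and 1/3, the estimate works out to 5%.)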
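The same algebra can be written as a short Python sketch. The function name `estimate_incidence_dice` and its parameters are hypothetical, chosen only for illustration, and the code assumes truthful answers and a fair die. Passing the assignment probability as a parameter makes it easy to see how much the rounding of 2/3 to 0.67 matters.

```python
from fractions import Fraction


def estimate_incidence_dice(yes_count, n_respondents, p_first=Fraction(2, 3)):
    """Estimate incidence P from the two-statement die design.

    Assumes P(yes) = p_first * P + (1 - p_first) * (1 - P),
    which rearranges to P = (P(yes) - (1 - p_first)) / (2 * p_first - 1).
    """
    if 2 * p_first == 1:
        # With a 50/50 split the "yes" rate carries no information about P.
        raise ValueError("p_first must not equal 1/2")
    yes_rate = Fraction(yes_count, n_respondents)
    return (yes_rate - (1 - p_first)) / (2 * p_first - 1)


# 35 "yes" answers out of 100, with the rounded probability used in the text.
rounded_estimate = estimate_incidence_dice(35, 100, p_first=Fraction(67, 100))
# The same data with the exact probability of rolling 1-4.
exact_estimate = estimate_incidence_dice(35, 100)

print(f"Rounded (0.67): {float(rounded_estimate):.2%}")
print(f"Exact (2/3):    {float(exact_estimate):.2%}")
```

Exact fractions are used here so that the difference between the two results comes only from the rounding of the die probabilities, not from floating-point error.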
There are several other randomized response techniques, but these two are good ones to start with. Note that the dice approach is not a very precise estimator: if 36 respondents rather than 35 had answered “Yes”, the estimate would rise from 5.88% to 8.82%. In other words, a one-point increase in the share of “Yes” responses raises the estimated incidence by about three points, so any chance variation in the responses is magnified in the estimate. Randomized response techniques are good when you don’t care about the individual responses to sensitive questions, but want to know the incidence of such behavior among the respondents. By wording questions in this fashion, you can put respondents at ease and give them the feeling their responses are obscured, all the while gaining estimates of the percentage of the group engaging in said behavior.
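To see where that sensitivity comes from, the sketch below computes how many points the estimate moves for each one-point change in the share of “Yes” answers under each design. The function names are invented for this illustration, and the factors follow directly from the two formulas used earlier, assuming a fair coin and a fair die.

```python
from fractions import Fraction


def coin_amplification(p_heads=Fraction(1, 2)):
    # Coin design: P(yes) = p_heads + (1 - p_heads) * incidence,
    # so incidence changes by 1 / (1 - p_heads) per unit change in P(yes).
    return 1 / (1 - p_heads)


def dice_amplification(p_first=Fraction(2, 3)):
    # Die design: P(yes) = p_first * incidence + (1 - p_first) * (1 - incidence),
    # so incidence changes by 1 / (2 * p_first - 1) per unit change in P(yes).
    # Raises ZeroDivisionError if p_first is exactly 1/2.
    return 1 / (2 * p_first - 1)


coin_factor = coin_amplification()
dice_factor = dice_amplification()

print(f"Coin design: estimate moves {coin_factor} points per point of 'yes'")
print(f"Die design:  estimate moves {dice_factor} points per point of 'yes'")
```

Under these assumptions the coin design doubles any noise in the “Yes” share while the die design triples it. The closer the die split gets to 50/50, the larger the factor becomes, which is the price paid for giving respondents more cover.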