# Surveying embarrassing questions

## Background

When you want to collect information on individuals, with the objective of deducing something about the population, there are several issues to contend with:

- Are you talking to the right people (have you done a good job of **sampling**)?
- Do you have enough people to make good predictions (**sample size, power**)?
- Have you posed the questions carefully (**experimental design**)?
- Will the respondents answer truthfully?
- Have you provided anonymity, so that individuals have faith that their answers won't be discovered and shared?
- Will the respondents take your questions seriously, and answer them honestly?

These are just some of the issues we face when we want to collect information in order to make deductions about populations of interest. Polls are one such situation, where we hope that a small sample of would-be voters will inform us about the tendency of the whole.

## Providing cover for respondents

There are many methods to help provide anonymity to respondents:

- anonymous surveys
- using fill-in dots, rather than hand-written information (so that hand-writing won't be recognized)
- asking respondents to use the internet and anonymous computers to provide information

We propose the following mathematical relationship between anonymity and truth (a **direct** relationship):

The more assured the respondent is that there is anonymity, the more truthful the answer.

One way of providing anonymity to respondents in a survey is to provide them "deniability": their answer may or may not reflect their actual state or feelings. We've already looked at one way of doing this: the one-coin-toss sampling technique. The basic idea is that if a coin lands heads, then a standing answer is given (the "embarrassing" answer); whereas if the coin comes up tails, the honest answer is given (which may be the embarrassing answer or not, depending on the actual state of the respondent).

This method provides cover to respondents so that they may answer embarrassing questions, but it does so at a cost: we lose about half of our information (we lose "power"), since we expect that half of the answers are junk.

For example, if we asked a room full of 100 people to use this method to determine the rate of space aliens in the population, about 50 would admit to being space aliens, simply because about 50 would toss heads. If exactly 50 tossed heads, then we would see exactly 50 admissions and correctly deduce that 0% of people are space aliens. If more or fewer heads came up, then we would deduce either a small positive or a small negative rate of alienness -- but we know that, in any event, this is merely an estimate (I'm assuming a zero rate of alienness, but I may be wrong!).
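The arithmetic behind the space-alien example can be sketched in a few lines of Python. This is only an illustration, assuming a fair coin and respondents who follow the rules: the chance of a "yes" is then 1/2 + p/2, where p is the true rate, so p can be estimated as 2 × (fraction of "yes" answers) − 1. The function names `simulate_one_coin_survey` and `estimate_rate` are made up for this sketch.

```python
import random


def simulate_one_coin_survey(n_respondents, true_rate, rng):
    """Count the 'yes' answers from one run of the one-coin-toss technique.

    Assumes a fair coin and that every respondent follows the rules.
    """
    yes_count = 0
    for _ in range(n_respondents):
        tossed_heads = rng.random() < 0.5
        if tossed_heads:
            # Heads: give the standing ("embarrassing") answer.
            yes_count += 1
        elif rng.random() < true_rate:
            # Tails: answer honestly; this respondent's honest answer is "yes".
            yes_count += 1
    return yes_count


def estimate_rate(yes_count, n_respondents):
    """Estimate the true rate p from the number of 'yes' answers.

    Under the assumptions above, P(yes) = 1/2 + p/2, so p = 2 * P(yes) - 1.
    The estimate can come out negative when fewer than half answer 'yes'.
    """
    return 2 * yes_count / n_respondents - 1


if __name__ == "__main__":
    rng = random.Random()  # unseeded, so each run gives a different room
    yes = simulate_one_coin_survey(100, true_rate=0.0, rng=rng)
    print("yes answers:", yes)
    print("estimated rate of alienness:", estimate_rate(yes, 100))
```

Running this several times with `true_rate=0.0` shows the estimate bouncing around zero, sometimes below it, which is the small positive or negative rate described above.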

To reduce the information lost through the "junk answers", we can use a two-coin-toss sampling technique, which is related to the one-coin-toss sampling technique.

## Try it!

To see if you really understand the two techniques described above, you need to do some experiments. Think of some good examples of questions that you'd like to see the answer to, and that might involve embarrassing material. Then use one of these techniques to answer them. You might permit your respondents a little more anonymity by allowing them to give their answers on 3x5 cards, for example -- that might encourage them to play along exactly as you would like. Always be on the alert for the cheater, and the kidder, of course! These are the banes of pollsters everywhere.

- Do you have the embarrassing medical problem of multiple siblings?
- Have you flipped off someone in traffic in the last month?