## Posts Tagged ‘sample bias’

### Radio Commercial Statistic: Another Example of Lies, Damn Lies, and then Statistics

May 10, 2010

Each morning, I awake to my favorite radio station, and for the last few days, I’ve awakened to a commercial about a partnership between Feeding America and the reality show The Biggest Loser to support food banks.  While I think that’s a laudable joint venture, I have been somewhat puzzled by, if not leery of, a claim made in the commercial: that “49 million Americans struggled to put food on the table.”  Forty-nine million?  That’s one out of every six Americans!

Lots of questions popped into my head: Where did this number come from?  How was it determined?  How did the study define “struggling”?  Why were the respondents struggling?  How did the researchers define the implied “enough food”?  For how long did these 49 million people go “struggling” for enough food?  And most importantly, what was the motive behind the study?

The Biggest Loser/Feeding America commercial is a good reminder of why we should never take numbers or statistics at face value.  Several things are fishy here.  Does “enough food” mean the standard daily calorie intake (which, incidentally, is another statistic)?  Or, given that two-thirds of Americans are either overweight or obese (another statistic I have trouble believing), is “enough food” defined as the average number of calories a person actually eats each day?

I also want to know how the people who conducted the study came up with 49 million people.  Surely they could not have surveyed so many people.  Most likely, they needed to survey a sample of people, and then make statistical estimations – extrapolations – based on the size of the population.  In order to do that, the sample needed to be selected randomly: that is, every American had to have an equal chance of being selected for the survey.  That’s the only way we could be sure the results are representative of the entire population.
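The kind of extrapolation described above can be sketched in a few lines of Python.  All of the numbers here – the population size, the one-in-six rate, and the sample size – are illustrative, not figures from any actual hunger study:

```python
import random

# Illustrative sketch only: these figures are hypothetical, not taken
# from any real survey.
random.seed(42)

POPULATION_SIZE = 300_000_000   # rough U.S. population
TRUE_RATE = 1 / 6               # the rate implied by "49 million"

# Simple random sample: every (simulated) person has an equal chance
# of being selected for the survey.
sample = [random.random() < TRUE_RATE for _ in range(1_000)]

sample_proportion = sum(sample) / len(sample)
extrapolated_count = sample_proportion * POPULATION_SIZE

print(f"Sample proportion:  {sample_proportion:.3f}")
print(f"Extrapolated count: {extrapolated_count:,.0f}")
```

The key point is the last multiplication: the headline number is only as trustworthy as the sample proportion feeding it, which in turn depends entirely on the sample having been drawn randomly.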

Next, who completed the survey, and how many?  The issue of hunger is political in nature, and hence likely to be very polarizing.  Generally, people who respond to surveys on such political issues have a vested interest in the subject matter.  This introduces sample bias.  Also, having an adequate sample size (neither too small nor too large) is important.  There’s no way to know whether the study that came up with the “49 million” statistic accounted for these issues.

We also don’t know how long these 49 million had to struggle in order to be counted.  Was it just any one time during a certain year, or did the struggle have to last at least two consecutive weeks before it could be counted?  We’re not told.

As you can see, the commercial’s claim of 49 million “struggling to put food on the table” just doesn’t jibe with me.  Whenever you rely on statistics, remember to:

1. Consider the source of the statistic and its purpose in conducting the research;
2. Ask how the sample was selected and the study executed, and how many responded;
3. Understand the researcher’s definition of the variables being measured;
4. Not look at just the survey’s margin of error, but also at the confidence level and the diversity within the population being sampled.
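As a rough illustration of point 4, here is a hypothetical sketch of how a survey’s margin of error depends on both the confidence level (via the z value) and the sample size.  The 1-in-6 figure is used purely as an example:

```python
import math

def margin_of_error(p, n, z=1.96):
    """Margin of error for a sample proportion p at sample size n.
    z = 1.96 corresponds to a 95% confidence level."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical: a survey estimating that 1 in 6 people "struggle".
p_hat = 1 / 6
for n in (100, 1_000, 10_000):
    print(f"n = {n:>6}: {p_hat:.1%} +/- {margin_of_error(p_hat, n):.1%}")
```

Notice that a headline statistic quoted without its margin of error, confidence level, and sample size tells you very little about how precise it actually is.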

The Feeding America/Biggest Loser team-up is great, but that radio claim is a sobering example of how statistics can mislead as well as inform.

### Paid Surveys & Bad Respondents Redux

May 27, 2009

A while back, I wrote about how paid surveys are a good source of bad respondents and how just a few bad respondents can increase your likelihood of drawing incorrect conclusions from survey findings.  Yet, I still stumble upon blogs promoting and encouraging people to make money by taking paid surveys.

Remember when blood banks used to pay people to donate blood?  And how it led to alcoholics and junkies donating blood to get money to support their addictions?  Needless to say, their blood was worthless.  The same thing happens with paid surveys – companies engaging in paid surveys often end up with respondents whose objective is the money rather than the topic of the survey; and, just as needless to say, their responses are worthless.

There is nothing wrong with providing an incentive for a survey.  Often, an incentive is necessary to get more respondents to participate.  However, the respondent base and the incentives should be controlled.  As I described in my last post, How Much Damage do Bad Respondents do to Survey Results?, I suggest controls such as asking your sample vendor how it screens panelists, how it prevents respondents from getting the same survey multiple times through different panels, how it tracks survey-taking behavior, and how it prevents the same panelist from using multiple e-mail addresses to register as several separate panelists.

By the looks of this blog post, Paid Market Research – Is It Really a Way to Make Money?, which I’ve “tracked back,” it seems that this problem isn’t going away anytime soon.

### How Much Damage do Bad Respondents do to Survey Results?

May 11, 2009

Minimizing both the number of bad respondents who take a survey and their impact on the survey results can seem as futile as Sisyphus pushing the rock up the mountain.  Bad respondents come in all flavors: professional respondents, speeders, retakers (people who take the same survey multiple times),  and outright frauds (people who aren’t who or what they claim to be).

Researchers have tried different approaches to these problems, including increasing the sample size, eliminating the one or two biggest types of bad respondents, or simply ignoring the problem altogether.  Unfortunately, the first two approaches can actually cause more damage than doing nothing at all.  Let’s look at these three approaches more closely.

Approach 1: Increase the sample size

When concerned about accuracy, the common prescription among researchers is to increase the size of their sample.  Indeed, this approach reduces sampling error and margin of error, lessens the impact of multicollinearity, and narrows the confidence interval around the results.  However, larger sample sizes are a double-edged sword.  Because a larger sample size reduces the standard error in the data, it also increases the t-value.  As a result, a small difference between two or more respondent groups can greatly increase the chance of committing a Type I error (rejecting a true null hypothesis).

Similarly, if a sample has bad respondents, a larger sample size can actually exacerbate their impact on survey results.  After all, bad respondents are likely to answer survey questions differently than legitimate respondents.  A larger sample size (even if every additional respondent is good) will simply reduce the size of the difference needed for statistical significance, inflating the chance of drawing an erroneous conclusion from the survey findings.
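A quick back-of-the-envelope sketch shows the mechanism.  The means and common standard deviation below are made up; the point is that the same small, practically meaningless gap between two groups drifts toward “statistical significance” as the per-group sample size grows:

```python
import math

def t_statistic(mean_a, mean_b, sd, n):
    """Two-sample t statistic, assuming equal group sizes n and a
    common standard deviation sd in both groups."""
    standard_error = sd * math.sqrt(2 / n)
    return (mean_a - mean_b) / standard_error

# A tiny difference between two respondent groups (hypothetical numbers)...
mean_a, mean_b, sd = 3.55, 3.50, 1.0

# ...crosses the usual |t| > 1.96 threshold once n gets large enough.
for n in (50, 500, 5_000):
    print(f"n per group = {n:>5}: t = {t_statistic(mean_a, mean_b, sd, n):.2f}")
```

The difference itself never changes – only the standard error shrinks as n grows, so the t-value climbs and a trivial gap starts to look “significant.”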

Approach 2: Tackle the biggest offender

When faced with multiple problems, it is human nature to focus on eradicating the one or two worst ones.  While that might work in most situations, eliminating only one type of bad respondent can actually cause more problems.

Assume that a survey’s results include responses from both professional respondents and speeders.  Assume also that the survey has some ratings questions.  What if – compared to legitimate respondents – the former rates an item higher than average, and the latter lower than average?

By having both types of bad respondents in the survey, their overall impact on the mean may be negligible.  However, if you take out only one of them, the mean will become biased in favor of the type that was left alone, again exacerbating the impact of bad respondents.

Approach 3: Do Nothing

While doing nothing is preferable to the other two approaches, it has its own problems.  Return to the example of the two types of bad respondents.  While leaving both of them alone will keep the mean close to what it would be in the absence of both types, it will also inflate the variance of the data, resulting in an untrustworthy estimate of the mean.  Hence, removing one type of bad respondent causes biased results, while doing nothing causes inefficient results – neither of which is a pleasant outcome.
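Both effects – the mean staying put while the variance balloons, and the bias that appears once only one group is removed – can be illustrated with hypothetical ratings data.  The numbers below are invented so that the two bad-respondent groups pull the mean in opposite directions by equal amounts:

```python
import statistics

# Hypothetical 5-point ratings (illustrative values only).
legit = [3.0, 3.5, 4.0, 3.5, 3.0, 4.0, 3.5, 3.5]   # legitimate respondents
professionals = [5.0, 4.5, 5.0]                     # rate higher than average
speeders = [2.0, 2.5, 2.0]                          # rate lower than average

everyone = legit + professionals + speeders         # Approach 3: do nothing
speeders_removed = legit + professionals            # Approach 2: remove one type

for label, data in [("Legit only      ", legit),
                    ("Both types kept ", everyone),
                    ("Speeders removed", speeders_removed)]:
    print(f"{label}: mean = {statistics.mean(data):.2f}, "
          f"variance = {statistics.variance(data):.2f}")
```

Keeping both groups leaves the mean untouched but several times the variance; removing only the speeders biases the mean upward toward the professionals.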

The better approach is to keep bad respondents out of your sample in the first place:

1. Ask how your sample vendor screens people wishing to join its panel;
2. Find out how your vendor ensures that panelists who are on other panels are precluded from being sent the same survey;
3. Determine how your vendor tracks the survey-taking behavior of its panelists, assesses the legitimacy of each, and purges itself of suspected bad respondents; and
4. Determine how your vendor prevents a person with multiple e-mail addresses – if you’re doing online surveys – from trying to register each one as a separate panelist.

### Beware of “Professional” Survey Respondents!

April 3, 2009

Thanks to the Internet, conducting surveys has never been easier.  Being able to use the Web to conduct marketing research has greatly reduced the cost and time involved and has democratized the process for many companies.

While online surveys have increased simplicity and cost-savings, they have also given rise to a dangerous breed of respondents – “Professional” survey-takers.

A “professional” respondent is one who actively seeks out online surveys offering paid incentives – cash, rewards, or some other benefit – for completing the survey.  In fact, many blogs and online articles tell of different sites people can go to find paid online surveys.

If your company conducts online surveys, “professionals” can render your findings useless.  In order for your survey to provide accurate and useful results, the people surveyed must be representative of the population you are measuring and selected randomly (that is, everyone from the population has an equal chance of selection).

“Professionals” subvert the sampling principles of representativeness and randomness simply because they self-select to take the survey.  The survey tool does not know whether they are part of the population being measured, nor what their probability of selection was.  What’s more, online surveys exclude people without Internet access from the population.  The result is a survey-bias double-whammy.
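A small simulation can illustrate the self-selection problem.  The participation rates below are invented purely to show the mechanism: when people with the trait being measured are far more likely to opt in, the observed rate can be several times the true rate:

```python
import random

# Hypothetical: 10% of the population has the trait being measured,
# but people with the trait are five times as likely to opt in to an
# open online survey (25% vs. 5% participation).
random.seed(0)

population = [random.random() < 0.10 for _ in range(100_000)]

opted_in = [has_trait for has_trait in population
            if random.random() < (0.25 if has_trait else 0.05)]

true_rate = sum(population) / len(population)
observed_rate = sum(opted_in) / len(opted_in)

print(f"True rate in population:              {true_rate:.1%}")
print(f"Rate among self-selected respondents: {observed_rate:.1%}")
```

No amount of extra sample size fixes this – the bias is baked into who shows up, which is exactly why self-selected panels of “professionals” are so dangerous.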

In addition, “professionals” may simply go through a survey for the sake of the incentive.  Hence they may speed through it, paying little or no attention to the questions, or they may give untruthful answers.  Now your survey results are both biased and wrong.

Minimizing the impact of “Professionals”

There are some steps you can take to protect your survey from “professionals,” including:

• Maintain complete control of your survey distribution.  If possible, use a professional online survey panel company, such as e-Rewards, Greenfield Online, or Harris Interactive.  There are lots of others, and all maintain tight screening processes for their survey participants and tight controls for distribution of your survey;
• If an online survey panel is out of your budget, perhaps you can build your own controlled e-mail list (following CAN-SPAM laws, of course).  E-mailing your survey is less prone to bias than posting it on a Web site for anyone to join;
• Have adequate screening criteria in your survey.  If you can get respondents to sign in using a passcode and/or ask questions at the beginning, which terminate the survey for people whose responses indicate they are not representative of the population, you can reduce the number of “professionals”;
• Put “speed bumps” into your survey.  An example would be a dummy question that simply says: “Select the 3rd radio button from the top.”  Put two or three bumps in your survey.  A respondent who answers two or more of these bump questions incorrectly is likely a speeder, and the survey can be instructed to terminate;
• Ask validation questions.  That is, ask a question one way and then later in the survey ask it in another form, and see if the responses are consistent.  If they’re not, then the respondent may be a “professional” or a speeder.
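The “speed bump” and termination logic above can be sketched as a simple flagging rule.  The question IDs and correct answers here are made up for illustration:

```python
# Hypothetical trap ("speed bump") questions and their correct answers.
TRAP_ANSWERS = {"bump_1": "C", "bump_2": "A", "bump_3": "D"}

def is_likely_speeder(responses, max_misses=1):
    """Flag a respondent who misses two or more trap questions."""
    misses = sum(1 for q, correct in TRAP_ANSWERS.items()
                 if responses.get(q) != correct)
    return misses > max_misses

careful = {"bump_1": "C", "bump_2": "A", "bump_3": "D"}
rusher = {"bump_1": "B", "bump_2": "B", "bump_3": "B"}

print(is_likely_speeder(careful))  # keep this response
print(is_likely_speeder(rusher))   # terminate the survey
```

Allowing one miss (rather than terminating on the first) gives honest but momentarily careless respondents the benefit of the doubt, as suggested above.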

The Internet may have made marketing research easier, but it has also made it more susceptible to bias.  The tools to conduct marketing research have become much easier and more user-friendly, but that doesn’t change the principles of statistics and marketing research.  Online surveys, no matter how easily, quickly, or cheaply they can be implemented, will waste time and money if those principles are violated.